## ASU Electronic Theses and Dissertations

This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, and supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


## Recent Submissions

Understanding customer preference is crucial for new product planning and marketing decisions. This thesis explores how historical data can be leveraged to understand and predict customer preference. It presents a decision support framework that provides a holistic view of customer preference by following a two-phase procedure. Phase-1 uses cluster analysis to create product profiles based on which customer profiles are derived. Phase-2 then delves deep into each of the customer profiles and investigates causality behind their preference using Bayesian networks. This thesis illustrates the working of the framework using the case of Intel Corporation, the world’s largest semiconductor manufacturing company. …

Contributors
Ram, Sudarshan Venkat, Kempf, Karl G, Wu, Teresa, et al.
Created Date
2017

Real-world environments are characterized by non-stationary and continuously evolving data. Learning a classification model on this data would require a framework that is able to adapt itself to newer circumstances. Under such circumstances, transfer learning has come to be a dependable methodology for improving classification performance with reduced training costs and without the need for explicit relearning from scratch. In this thesis, a novel instance transfer technique that adapts a "Cost-sensitive" variation of AdaBoost is presented. The method capitalizes on the theoretical and functional properties of AdaBoost to selectively reuse outdated training instances obtained from a "source" domain to effectively …

Contributors
Venkatesan, Ashok, Panchanathan, Sethuraman, Li, Baoxin, et al.
Created Date
2011

Autonomic closure is a new general methodology for subgrid closures in large eddy simulations that circumvents the need to specify fixed closure models and instead allows a fully-adaptive self-optimizing closure. The closure is autonomic in the sense that the simulation itself determines the optimal relation at each point and time between any subgrid term and the variables in the simulation, through the solution of a local system identification problem. It is based on highly generalized representations of subgrid terms having degrees of freedom that are determined dynamically at each point and time in the simulation. This can be regarded …

Contributors
Kshitij, Abhinav, Dahm, Werner J.A., Herrmann, Marcus, et al.
Created Date
2019

Longitudinal recursive partitioning (LRP) is a tree-based method for longitudinal data. It takes a sample of individuals that were each measured repeatedly across time, and it splits them based on a set of covariates such that individuals with similar trajectories become grouped together into nodes. LRP does this by fitting a mixed-effects model to each node every time that it becomes partitioned and extracting the deviance, which is the measure of node purity. LRP is implemented using the classification and regression tree algorithm, which suffers from a variable selection bias and does not guarantee reaching a global optimum. Additionally, fitting …

Contributors
Stegmann, Gabriela, Grimm, Kevin, Edwards, Michael, et al.
Created Date
2019

The rapid advancements of technology have greatly extended the ubiquitous nature of smartphones acting as a gateway to numerous social media applications. This brings an immense convenience to the users of these applications wishing to stay connected to other individuals through sharing their statuses and posting their opinions, experiences, suggestions, etc., on online social networks (OSNs). Exploring and analyzing this data has great potential to enable deep and fine-grained insights into the behavior, emotions, and language of individuals in a society. This proposed dissertation focuses on utilizing these online social footprints to research two main threads – 1) Analysis: to …

Contributors
Manikonda, Lydia, Kambhampati, Subbarao, Liu, Huan, et al.
Created Date
2019

The students of Arizona State University, under the mentorship of Dr George Karady, have been collaborating with Salt River Project (SRP), a major power utility in the state of Arizona, trying to study and optimize a battery-supported grid-tied rooftop Photovoltaic (PV) system, sold by a commercial vendor. SRP believes this system has the potential to satisfy the needs of its customers, who opt for utilizing solar power to partially satisfy their power needs. An important part of this elaborate project is the development of a new load forecasting algorithm and a better control strategy for the optimized utilization of the …

Contributors
Hariharan, Aashiek, Karady, George G, Heydt, Gerald Thomas, et al.
Created Date
2018

Live streaming has risen to significant popularity in the recent past and largely this live streaming is a feature of existing social networks like Facebook, Instagram, and Snapchat. However, there does exist at least one social network entirely devoted to live streaming, and specifically the live streaming of video games, Twitch. This social network is unique for a number of reasons, not least because of its hyper-focus on live content and this uniqueness has challenges for social media researchers. Despite this uniqueness, almost no scientific work has been performed on this public social network. Thus, it is unclear what user …

Contributors
Jones, Isaac, Liu, Huan, Maciejewski, Ross, et al.
Created Date
2019

With the advent of the Internet, the amount of data being added online is increasing at an enormous rate. Though search engines use IR techniques to serve users' search requests, the results are not effective for the user's search query. The search engine user has to go through several webpages before arriving at the webpage he/she wanted. This problem of Information Overload can be solved using Automatic Text Summarization. Summarization is the process of obtaining an abridged version of documents so that a user can quickly understand what exactly the document is about. Email threads from …

Contributors
Nadella, Sravan, Davulcu, Hasan, Li, Baoxin, et al.
Created Date
2015

Reynolds-averaged Navier-Stokes (RANS) simulation is the industry standard for computing practical turbulent flows -- since large eddy simulation (LES) and direct numerical simulation (DNS) require comparatively massive computational power to simulate even relatively simple flows. RANS, like LES, requires that a user specify a “closure model” for the underlying turbulence physics. However, despite more than 60 years of research into turbulence modeling, current models remain largely unable to accurately predict key aspects of the complex turbulent flows frequently encountered in practical engineering applications. Recently a new approach, termed “autonomic closure”, has been developed for LES that avoids the need to …

Contributors
Ahlf, Rick, Dahm, Werner J.A., Wells, Valana, et al.
Created Date
2017

Identifying chemical compounds that inhibit bacterial infection has recently gained a considerable amount of attention given the increased number of highly resistant bacteria and the serious health threat they pose around the world. With the development of automated microscopy and image analysis systems, the process of identifying novel therapeutic drugs can generate an immense amount of data - easily reaching terabytes worth of information. Despite the vast and growing amount of data that is currently generated, traditional analytical methods have not increased the overall success rate of identifying active chemical compounds that eventually become novel therapeutic drugs. Moreover, multispectral imaging has …

Contributors
Trevino, Robert, Liu, Huan, Lamkin, Thomas J, et al.
Created Date
2016

Unstructured texts containing biomedical information from sources such as electronic health records, scientific literature, discussion forums, and social media offer an opportunity to extract information for a wide range of applications in biomedical informatics. Building scalable and efficient pipelines for natural language processing and extraction of biomedical information plays an important role in the implementation and adoption of applications in areas such as public health. Advancements in machine learning and deep learning techniques have enabled rapid development of such pipelines. This dissertation presents entity extraction pipelines for two public health applications: virus phylogeography and pharmacovigilance. For virus phylogeography, geographical locations …

Contributors
Magge, Arjun, Scotch, Matthew, Gonzalez-Hernandez, Graciela, et al.
Created Date
2019

The significance of real-world knowledge for Natural Language Understanding (NLU) has been well known for decades. With advancements in technology, challenging tasks like question-answering, text-summarizing, and machine translation have been made possible through continuous efforts in the field of Natural Language Processing (NLP). Yet, knowledge integration to answer common sense questions is still a daunting task. Logical reasoning has been a resort for many of the problems in NLP and has achieved considerable results in the field, but it is difficult to resolve the ambiguities in a natural language. Co-reference resolution is one of the problems where ambiguity arises due to the semantics of the sentence. …

Contributors
Prakash, Ashok, Baral, Chitta, Devarakonda, Murthy, et al.
Created Date
2019

High Voltage Direct Current (HVDC) Technology has several features that make it particularly attractive for specific transmission applications. Recent years have witnessed an unprecedented growth in the number of HVDC projects, which demonstrates a heightened interest in HVDC technology. In parallel, the use of renewable energy sources has dramatically increased. For instance, Kuwait has recently announced a renewable project to be completed in 2035; this project aims to produce 15% of the country's energy consumption from renewable sources. However, facilities that use renewable sources, such as solar and wind, to provide clean energy, are mostly placed in remote …

Contributors
Albannai, Bassam Ahmad, Weng, Yang, Wu, Meng, et al.
Created Date
2019

The dawn of the Internet of Things (IoT) has opened the opportunity for mainstream adoption of machine learning analytics. However, most research in machine learning has focused on discovery of new algorithms or fine-tuning the performance of existing algorithms. Little exists on the process of taking an algorithm from the lab-environment into the real-world, culminating in sustained value. Real-world applications are typically characterized by dynamic non-stationary systems with requirements around feasibility, stability and maintainability. Not much has been done to establish standards around the unique analytics demands of real-world scenarios. This research explores the problem of why so few of …

Contributors
Shahapurkar, Som, Liu, Huan, Davulcu, Hasan, et al.
Created Date
2016

Cyber-systems and networks are the target of different types of cyber-threats and attacks, which are becoming more common, sophisticated, and damaging. Those attacks can vary in the way they are performed. However, similar strategies and tactics are often used because they are time-proven to be effective. The motivations behind cyber-attacks play an important role in determining how attackers plan and proceed to achieve their goals. Generally, there are three categories of motivation: political, economic, and socio-cultural. These indicate that to defend against possible attacks in an enterprise environment, it is necessary to consider what makes such an …

Contributors
Created Date
2018

Endowing machines with the ability to understand digital images is a critical task for a host of high-impact applications, including pathology detection in radiographic imaging, autonomous vehicles, and assistive technology for the visually impaired. Computer vision systems rely on large corpora of annotated data in order to train task-specific visual recognition models. Despite significant advances made over the past decade, the fact remains that collecting and annotating the data needed to successfully train a model is a prohibitively expensive endeavor. Moreover, these models are prone to rapid performance degradation when applied to data sampled from a different domain. Recent works in …

Contributors
Dudley, Andrew, Panchanathan, Sethuraman, Venkateswara, Hemanth, et al.
Created Date
2019

The healthcare system in this country is currently unacceptable. New technologies may contribute to reducing cost and improving outcomes. Early diagnosis and treatment represents the least risky option for addressing this issue. Such a technology needs to be inexpensive, highly sensitive, highly specific, and amenable to adoption in a clinic. This thesis explores an immunodiagnostic technology based on highly scalable, non-natural sequence peptide microarrays designed to profile the humoral immune response and address the healthcare problem. The primary aim of this thesis is to explore the ability of these arrays to map continuous (linear) epitopes. I discovered that using a …

Contributors
Richer, Joshua A., Johnston, Stephen A, Woodbury, Neal, et al.
Created Date
2014

In healthcare facilities, health information systems (HISs) are used to serve different purposes. The radiology department adopts multiple HISs in managing their operations and patient care. In general, the HISs that touch radiology fall into two categories: tracking HISs and archive HISs. Electronic Health Records (EHR) is a typical tracking HIS, which tracks the care each patient receives at multiple encounters and facilities. Archive HISs are typically specialized databases to store large-size data collected as part of the patient care. A typical example of an archive HIS is the Picture Archive and Communication System (PACS), which provides economical storage and …

Contributors
Wang, Kun, Li, Jing, Wu, Teresa, et al.
Created Date
2018

The subliminal impact of framing of social, political and environmental issues such as climate change has been studied for decades in political science and communications research. Media framing offers an “interpretative package” for average citizens on how to make sense of climate change and its consequences to their livelihoods, how to deal with its negative impacts, and which mitigation or adaptation policies to support. A line of related work has used bag of words and word-level features to detect frames automatically in text. Such works face limitations since standard keyword based features may not generalize well to accommodate surface variations …

Contributors
Alashri, Saud, Davulcu, Hasan, Desouza, Kevin C., et al.
Created Date
2018

A medical control system, a real-time controller, uses a predictive model of human physiology for estimating and controlling drug concentration in the human body. The Artificial Pancreas (AP) is an example of such a control system, which regulates blood glucose in T1D patients. The predictive model in the control system, such as the Bergman Minimal Model (BMM), is based on a physiological modeling technique which separates the body into a number of anatomical compartments, where each compartment's effect on the body system is determined by its physiological parameters. These models are less accurate due to unaccounted physiological factors affecting target values. Estimation of a …

Contributors
Agrawal, Anurag, Gupta, Sandeep K. S., Banerjee, Ayan, et al.
Created Date
2017

Feature representation for raw data is one of the most important components in a machine learning system. Traditionally, features are hand crafted by domain experts, which can often be a time consuming process. Furthermore, they do not generalize well to unseen data and novel tasks. Recently, there have been many efforts to generate data-driven representations using clustering and sparse models. This dissertation focuses on building data-driven unsupervised models for analyzing raw data and developing efficient feature representations. Simultaneous segmentation and feature extraction approaches for silicon-pores sensor data are considered. Aggregating data into a matrix and performing low rank and sparse …

Contributors
Sattigeri, Prasanna, Spanias, Andreas, Thornton, Trevor, et al.
Created Date
2014

For the past three decades, the design of an effective strategy for generating poetry that matches human creative capabilities and complexities has been an elusive goal in artificial intelligence (AI) and natural language generation (NLG) research, and among linguistic creativity researchers in particular. This thesis presents a novel approach to fixed verse poetry generation using neural word embeddings. During the course of generation, a two-layered poetry classifier is developed. The first layer uses a lexicon-based method to classify poems into types based on form and structure, and the second layer uses a supervised classification method …

Contributors
Magge Ranganatha, Arjun, Syrotiuk, Violet R, Baral, Chitta, et al.
Created Date
2016

Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks such as pharmacovigilance via the use of Natural Language Processing (NLP) techniques. One of the critical steps in information extraction pipelines is Named Entity Recognition (NER), where the mentions of entities such as diseases are located in text and their entity types are identified. However, the language in social media is highly informal, and user-expressed health-related concepts are often non-technical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and advanced …

Contributors
Nikfarjam, Azadeh, Gonzalez, Graciela, Greenes, Robert, et al.
Created Date
2016

This dissertation presents the development of structural health monitoring and prognostic health management methodologies for complex structures and systems in the field of mechanical engineering. To overcome various challenges historically associated with complex structures and systems such as complicated sensing mechanisms, noisy information, and large-size datasets, a hybrid monitoring framework comprising solid mechanics concepts and data mining technologies is developed. In such a framework, the solid mechanics simulations provide additional intuitions to data mining techniques reducing the dependence of accuracy on the training set, while the data mining approaches fuse and interpret information from the targeted system enabling the …

Contributors
Created Date
2019

This research investigates the fine scale structure in Earth's mantle, especially for the lowermost mantle, where strong heterogeneity exists. Recent seismic tomography models have resolved large-scale features in the lower mantle, such as the large low shear velocity provinces (LLSVPs). However, differences are present between different models, especially at shorter length scales. Fine scale structures both within and outside LLSVPs are still poorly constrained. The drastic growth of global seismic networks presents densely sampled seismic data in unprecedented quality and quantity. In this work, the Empirical Wavelet construction method has been developed to document seismic travel time and waveform information …

Contributors
Lai, Hongyu, Garnero, Edward J, Till, Christy B, et al.
Created Date
2019

Wearable robotics has gained huge popularity in recent years due to its wide applications in rehabilitation, military, and industrial fields. The weakness of the skeletal muscles in the aging population and neurological injuries such as stroke and spinal cord injuries seriously limit the abilities of these individuals to perform daily activities. Therefore, there is increasing attention to the development of wearable robots to assist the elderly and patients with disabilities with motion assistance and rehabilitation. In military and industrial sectors, wearable robots can increase the productivity of workers and soldiers. It is important for the wearable robots to maintain …

Contributors
Chinimilli, Prudhvi Tej, Redkar, Sangram, Zhang, Wenlong, et al.
Created Date
2018

The field of cyber threats is evolving rapidly, and every day a multitude of new information about malware and Advanced Persistent Threats (APTs) is generated in the form of malware reports, blog articles, forum posts, etc. However, current Threat Intelligence (TI) systems have several limitations. First, most of the TI systems examine and interpret data manually with the help of analysts. Second, some of them generate Indicators of Compromise (IOCs) directly using regular expressions without understanding the contextual meaning of those IOCs from the data sources, which allows the tools to include a lot of false positives. Third, a lot of TI systems consider …

Contributors
Panwar, Anupam, Ahn, Gail-Joon, Doupé, Adam, et al.
Created Date
2017

Surgery as a profession requires significant training to improve both clinical decision making and psychomotor proficiency. In the medical knowledge domain, tools have been developed, validated, and accepted for evaluation of surgeons' competencies. However, assessment of the psychomotor skills still relies on the Halstedian model of apprenticeship, wherein surgeons are observed during residency for judgment of their skills. Although the value of this method of skills assessment cannot be ignored, novel methodologies of objective skills assessment need to be designed, developed, and evaluated that augment the traditional approach. Several sensor-based systems have been developed to measure a user's skill quantitatively, …

Contributors
Islam, Gazi, Li, Baoxin, Liang, Jianming, et al.
Created Date
2013

As the size and scope of valuable datasets has exploded across many industries and fields of research in recent years, an increasingly diverse audience has sought out effective tools for their large-scale data analytics needs. Over this period, machine learning researchers have also been very prolific in designing improved algorithms which are capable of finding the hidden structure within these datasets. As consumers of popular Big Data frameworks have sought to apply and benefit from these improved learning algorithms, the problems encountered with the frameworks have motivated a new generation of Big Data tools to address the shortcomings of the …

Contributors
Krouse, Brian Richard, Ye, Jieping, Liu, Huan, et al.
Created Date
2014

Machine learning models convert raw data in the form of video, images, audio, text, etc. into feature representations that are convenient for computational processing. Deep neural networks have proven to be very efficient feature extractors for a variety of machine learning tasks. Generative models based on deep neural networks introduce constraints on the feature space to learn transferable and disentangled representations. Transferable feature representations help in training machine learning models that are robust across different distributions of data. For example, with the application of transferable features in domain adaptation, models trained on a source distribution can be applied …

Contributors
Eusebio, Jose Miguel Ang, Panchanathan, Sethuraman, Davulcu, Hasan, et al.
Created Date
2018

Alzheimer's Disease (AD) is the most common form of dementia observed in elderly patients and has significant social-economic impact. There are many initiatives which aim to capture leading causes of AD. Several genetic, imaging, and biochemical markers are being explored to monitor progression of AD and explore treatment and detection options. The primary focus of this thesis is to identify key biomarkers to understand the pathogenesis and prognosis of Alzheimer's Disease. Feature selection is the process of finding a subset of relevant features to develop efficient and robust learning models. It is an active research topic in diverse areas such …

Contributors
Dubey, Rashmi, Ye, Jieping, Wang, Yalin, et al.
Created Date
2012

Learning from high dimensional biomedical data has attracted a lot of attention recently. High dimensional biomedical data often suffer from the curse of dimensionality and have imbalanced class distributions. Both of these features of biomedical data, high dimensionality and imbalanced class distributions, are challenging for traditional machine learning methods and may affect the model performance. In this thesis, I focus on developing learning methods for high-dimensional imbalanced biomedical data. In the first part, a sparse canonical correlation analysis (CCA) method is presented. The penalty terms are used to control the sparsity of the projection matrices of CCA. The sparse CCA method …

Contributors
Yang, Tao, Ye, Jieping, Wang, Yalin, et al.
Created Date
2013

Recent technological advances enable the collection of various complex, heterogeneous and high-dimensional data in biomedical domains. The increasing availability of high-dimensional biomedical data creates a need for new machine learning models for effective data analysis and knowledge discovery. This dissertation introduces several unsupervised and supervised methods to help understand the data, discover the patterns and improve the decision making. All the proposed methods can generalize to other industrial fields. The first topic of this dissertation focuses on data clustering. Data clustering is often the first step for analyzing a dataset without the label information. Clustering high-dimensional data …

Contributors
Lin, Sangdi, Runger, George C, Kocher, Jean-Pierre A, et al.
Created Date
2018

Users often join an online social networking (OSN) site, like Facebook, to remain social, by either staying connected with friends or expanding social networks. On an OSN site, users generally share a variety of personal information which is often expected to be visible to their friends, but is sometimes vulnerable to unwarranted access from others. A recent study suggests that many personal attributes, including religious and political affiliations, sexual orientation, relationship status, age, and gender, are predictable using users' personal data from an OSN site. The majority of users want to remain socially active, and protect their personal data at the same …

Contributors
Gundecha, Pritam Sureshlal, Liu, Huan, Ahn, Gail-Joon, et al.
Created Date
2015

Machine learning methodologies are widely used in almost all aspects of software engineering. An effective machine learning model requires large amounts of data to achieve high accuracy. The data used for classification is mostly labeled, and labeled data is difficult to obtain. Accurately labeling a dataset into different classes requires both high cost and effort. With an abundance of data, it becomes necessary that all the data be labeled for its proper utilization, and this work focuses on reducing the labeling effort for large datasets. The thesis presents a comparison of different classifiers' performance to test if small set …

Contributors
Batra, Salil, Femiani, John, Amresh, Ashish, et al.
Created Date
2017

The ubiquity of single camera systems in society has made improving monocular depth estimation a topic of increasing interest in the broader computer vision community. Inspired by recent work in sparse-to-dense depth estimation, this thesis focuses on sparse patterns generated from feature detection based algorithms as opposed to regular grid sparse patterns used by previous work. This work focuses on using these feature-based sparse patterns to generate additional depth information by interpolating regions between clusters of samples that are in close proximity to each other. These interpolated sparse depths are used to enforce additional constraints on the network’s predictions. In …

Contributors
Rai, Anshul, Yang, Yezhou, Zhang, Wenlong, et al.
Created Date
2019

Parents fulfill a pivotal role in early childhood development of social and communication skills. In children with autism, the development of these skills can be delayed. Applied behavioral analysis (ABA) techniques have been created to aid in skill acquisition. Among these, pivotal response treatment (PRT) has been empirically shown to foster improvements. Research into PRT implementation has also shown that parents can be trained to be effective interventionists for their children. The current difficulty in PRT training is how to disseminate training to parents who need it, and how to support and motivate practitioners after training. Evaluation of the parents’ …

Contributors
Copenhaver Heath, Corey D, Panchanathan, Sethuraman, McDaniel, Troy, et al.
Created Date
2019

Multi-task learning (MTL) aims to improve the generalization performance (of the resulting classifiers) by learning multiple related tasks simultaneously. Specifically, MTL exploits the intrinsic task relatedness, based on which the informative domain knowledge from each task can be shared across multiple tasks and thus facilitate the individual task learning. It is particularly desirable to share the domain knowledge (among the tasks) when there are a number of related tasks but only limited training data is available for each task. Modeling the relationship of multiple tasks is critical to the generalization performance of the MTL algorithms. In this dissertation, I propose …

Contributors
Chen, Jianhui, Ye, Jieping, Kumar, Sudhir, et al.
Created Date
2011

Ensemble learning methods like bagging, boosting, adaptive boosting, stacking have traditionally shown promising results in improving the predictive accuracy in classification. These techniques have recently been widely used in various domains and applications owing to the improvements in computational efficiency and distributed computing advances. However, with the advent of wide variety of applications of machine learning techniques to class imbalance problems, further focus is needed to evaluate, improve and optimize other performance measures such as sensitivity (true positive rate) and specificity (true negative rate) in classification. This thesis demonstrates a novel approach to evaluate and optimize the performance measures (specifically …

Contributors
Bahl, Neeraj Dharampal, Bansal, Ajay, Amresh, Ashish, et al.
Created Date
2017

Reinforcement learning (RL) is a powerful methodology for teaching autonomous agents complex behaviors and skills. A critical component in most RL algorithms is the reward function -- a mathematical function that provides numerical estimates for desirable and undesirable states. Typically, the reward function must be hand-designed by a human expert and, as a result, the scope of a robot's autonomy and ability to safely explore and learn in new and unforeseen environments is constrained by the specifics of the designed reward function. In this thesis, I design and implement a stateful collision anticipation model with powerful predictive capability based upon …

Contributors
Richardson, Trevor W, Ben Amor, Heni, Yang, Yezhou, et al.
Created Date
2018

Online health forums provide a convenient channel for patients, caregivers, and medical professionals to share their experience, support and encourage each other, and form health communities. The fast-growing content in health forums provides a large repository for people to seek valuable information. A forum user can issue a keyword query to search health forums regarding specific questions, e.g., what treatments are effective for a disease symptom? A medical researcher can discover medical knowledge in a timely and large-scale fashion by automatically aggregating the latest evidence emerging in health forums. This dissertation studies how to effectively discover information …

Contributors
Liu, Yunzhong, Chen, Yi, Liu, Huan, et al.
Created Date
2016

This research starts by utilizing an efficient sparse inverse covariance matrix (precision matrix) estimation technique to identify a set of highly correlated discriminative perspectives between radical and counter-radical groups. A ranking system has been developed that utilizes ranked perspectives to map Islamic organizations on a set of socio-cultural, political and behavioral scales based on their web site corpus. Simultaneously, a gold standard ranking of these organizations was created by domain experts; expert-to-expert agreements are computed, and experimental results are presented comparing the performance of the QUIC-based scaling system to a baseline method for organizations. The QUIC-based algorithm not only outperforms …

Contributors
Kim, Nyunsu, Davulcu, Hasan, Sen, Arunabha, et al.
Created Date
2018

Due to the large data resources generated by online educational applications, Educational Data Mining (EDM) has improved learning outcomes in different ways: student visualization, recommendations for students, student modeling, grouping students, etc. Many programming assignments have features such as automated submission and test-case checking to verify correctness, but few studies have compared different statistical techniques with the latest frameworks and interpreted the models in a unified approach. In this thesis, several data mining algorithms have been applied to analyze students’ code assignment submission data from a real classroom study. The goal of this work is to explore and predict students’ …

Contributors
Tian, Wenbo, Hsiao, Ihan, Bazzi, Rida, et al.
Created Date
2019

Reasoning about the activities of cyber threat actors is critical to defending against cyber attacks. However, this task is difficult for a variety of reasons. In simple terms, it is difficult to determine who the attacker is, what the attacker’s desired goals are, and how the attack will be carried out. These three questions essentially entail understanding the attacker’s use of deception, the capabilities available, and the intent of launching the attack. These three issues are highly inter-related. If an adversary can hide their intent, they can better deceive a defender. If an adversary’s capabilities are not well …

Contributors
Nunes, Eric, Shakarian, Paulo, Ahn, Gail-Joon, et al.
Created Date
2018

As a promising solution to the problem of acquiring and storing large amounts of image and video data, spatial-multiplexing camera architectures have received a lot of attention in the recent past. Such architectures have the attractive feature of combining the two-step process of acquisition and compression of pixel measurements in a conventional camera into a single step. A popular variant is the single-pixel camera that obtains measurements of the scene using a pseudo-random measurement matrix. Advances in compressive sensing (CS) theory in the past decade have supplied the tools that, in theory, allow near-perfect reconstruction of an image from these measurements …

Contributors
Lohit, Suhas Anand, Turaga, Pavan, Spanias, Andreas, et al.
Created Date
2015

The goal of reinforcement learning is to enable systems to autonomously solve tasks in the real world, even in the absence of prior data. To succeed in such situations, reinforcement learning algorithms collect new experience through interactions with the environment to further the learning process. The behaviour is optimized by maximizing a reward function, which assigns high numerical values to desired behaviours. Especially in robotics, such interactions with the environment are expensive in terms of the required execution time, human involvement, and mechanical degradation of the system itself. Therefore, this thesis aims to introduce sample-efficient reinforcement learning methods which are …

Contributors
Luck, Kevin Sebastian, Ben Amor, Heni, Aukes, Daniel, et al.
Created Date
2019

Large-scale $\ell_1$-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. In many applications, it remains challenging to apply the sparse learning model to large-scale problems that have massive data samples with high-dimensional features. One popular and promising strategy is to scale up the optimization problem in parallel. Parallel solvers run on multiple cores in a shared memory system or a distributed environment to speed up the computation, but their practical usage is limited by the huge dimension of the feature space and by synchronization problems. In this dissertation, I carry …

Contributors
Li, Qingyang, Ye, Jieping, Xue, Guoliang, et al.
Created Date
2017

A story is defined as "an actor(s) taking action(s) that culminates in a resolution(s)". I present novel sets of features to facilitate story detection in text via supervised classification and further reveal different forms within stories via unsupervised clustering. First, I investigate the utility of a new set of semantic features compared to standard keyword features combined with statistical features, such as density of part-of-speech (POS) tags and named entities, to develop a story classifier. The proposed semantic features are based on <Subject, Verb, Object> triplets that can be extracted using a shallow parser. Experimental results show that a model …

Contributors
Ceran, Saadet Betul, Davulcu, Hasan, Corman, Steven R, et al.
Created Date
2016

Machine learning (ML) has played an important role in several modern technological innovations and has become an important tool for researchers in various fields of interest. Besides engineering, ML techniques have started to spread across various fields of study, such as health care, medicine, diagnostics, social science, finance, economics, etc. These techniques require data to train the algorithms, model a complex system, and make predictions based on that model. Due to the development of sophisticated sensors, it has become easier to collect large volumes of data, which are used to form the necessary hypotheses using ML. The promising results obtained using ML have …

Contributors
Dutta, Arindam, Bliss, Daniel W, Berisha, Visar, et al.
Created Date
2018

Sparse learning is a technique in machine learning for feature selection and dimensionality reduction that finds a sparse set of the most relevant features. In any machine learning problem, there is a considerable amount of irrelevant information, and separating relevant information from irrelevant information has been a topic of focus. In supervised learning such as regression, the data consists of many features, and only a subset of the features may be responsible for the result. Also, the features might have special structural requirements, which introduce additional complexity for feature selection. The sparse learning package provides a set of algorithms for …

Contributors
Thulasiram, Ramesh L., Ye, Jieping, Xue, Guoliang, et al.
Created Date
2011