ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, and supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.




Learning from high-dimensional biomedical data has attracted considerable attention recently. High-dimensional biomedical data often suffer from the curse of dimensionality and have imbalanced class distributions. Both of these features of biomedical data, high dimensionality and imbalanced class distributions, are challenging for traditional machine learning methods and may affect model performance. In this thesis, I focus on developing learning methods for high-dimensional imbalanced biomedical data. In the first part, a sparse canonical correlation analysis (CCA) method is presented. Penalty terms are used to control the sparsity of the projection matrices of CCA. The sparse CCA method ...

Contributors
Yang, Tao, Ye, Jieping, Wang, Yalin, et al.
Created Date
2013

Data-driven applications are becoming increasingly complex with support for processing events and data streams in a loosely-coupled distributed environment, providing integrated access to heterogeneous data sources such as relational databases and XML documents. This dissertation explores the use of materialized views over structured heterogeneous data sources to support multiple query optimization in a distributed event stream processing framework that supports such applications involving various query expressions for detecting events, monitoring conditions, handling data streams, and querying data. Materialized views store the results of the computed view so that subsequent access to the view retrieves the materialized results, avoiding the cost ...

Contributors
Chaudhari, Mahesh Balkrishna, Dietrich, Suzanne W, Urban, Susan D, et al.
Created Date
2011

In this research, I address a multi-class, multi-label classification problem, where the goal is to automatically assign one or more labels (tags) to discussion topics seen on the deep web. I observed a natural hierarchy in our dataset, and I used different techniques to ensure a hierarchical integrity constraint on the predicted tag list. To solve the 'class imbalance' and 'scarcity of labeled data' problems, I developed a semi-supervised model based on the Elasticsearch (ES) document relevance score. I evaluate our models using the standard K-fold cross-validation method. Ensuring hierarchical integrity constraints improved F1 score by 11.9% over standard supervised learning, while our ES based semi-supervised learning model ...

Contributors
Patil, Revanth, Shakarian, Paulo, Doupe, Adam, et al.
Created Date
2018

Multi-tenancy architecture (MTA) is often used in Software-as-a-Service (SaaS), and the central idea is that multiple tenant applications can be developed using components stored in the SaaS infrastructure. Recently, MTA has been extended so that a tenant application can have its own sub-tenants, with the tenant application acting like a SaaS infrastructure. In other words, MTA is extended to STA (Sub-Tenancy Architecture). In STA, each tenant application not only needs to develop its own functionalities, but also needs to prepare an infrastructure to allow its sub-tenants to develop customized applications. This dissertation formulates eight models for STA, and proposes ...

Contributors
Zhong, Peide, Davulcu, Hasan, Sarjoughian, Hessam, et al.
Created Date
2017

The amount of time series data generated is increasing due to the integration of sensor technologies with everyday applications, such as gesture recognition, energy optimization, health care, and video surveillance. The simultaneous use of multiple sensors to capture different aspects of real-world attributes has also led to an increase in dimensionality from uni-variate to multi-variate time series. This has facilitated richer data representation but has also necessitated algorithms for determining similarity between two multi-variate time series for search and analysis. Various algorithms have been extended from the uni-variate to the multi-variate case, such as multi-variate versions of Euclidean distance, edit distance, dynamic ...

Contributors
Garg, Yash, Candan, Kasim Selcuk, Chowell-Punete, Gerardo, et al.
Created Date
2015

The games held by the National Basketball Association (NBA) are the most popular basketball events on earth. Each year, a large volume of statistical data is generated by this industry. Meanwhile, managing teams, sports media, and scientists are digging deep into the data ocean. Recent research literature is reviewed with respect to whether NBA teams could be analyzed as connected networks. However, it becomes very time-consuming, if not impossible, for humans to manually capture every detail of the large number of on-court game events. In this study, an alternative method is proposed to parse public resources from NBA-related websites to build degenerated ...

Contributors
Zhang, Xiaoyu, Tong, Hanghang, He, Jingrui, et al.
Created Date
2017

Computer Vision as a field has gone through significant changes in the last decade. The field has seen tremendous success in designing learning systems with hand-crafted features and in using representation learning to extract better features. In this dissertation, some novel approaches to representation learning and task learning are studied. Multiple-instance learning, which is a generalization of supervised learning, is one example of task learning that is discussed. In particular, a novel non-parametric k-NN-based multiple-instance learning is proposed, which is shown to outperform other existing approaches. This solution is applied to a diabetic retinopathy pathology detection problem effectively. In cases ...

Contributors
Venkatesan, Ragav, Li, Baoxin, Turaga, Pavan, et al.
Created Date
2017

Navigating within non-linear structures is a challenge for all users when the space is large, but the problem is most pronounced when the users are blind or visually impaired. Such users access digital content through screen readers like JAWS, which read out the text on the screen. However, presenting non-linear narratives in such a manner, without visual cues and information about spatial dependencies, is very inefficient for such users. The NSDL Science Literacy StrandMaps are visual layouts to help students and teachers browse educational resources. A Strandmap shows relationships between concepts and how they build upon one another across ...

Contributors
Gaur, Shruti, Candan, Kasim Selçuk, Sundaram, Hari, et al.
Created Date
2011

Micro-blogging platforms like Twitter have become some of the most popular sites for people to share and express their views and opinions about public events like debates, sports events, or other news articles. These social updates complement the written news articles or transcripts of events by conveying popular public opinion about these events. It would therefore be useful to annotate the transcript with tweets. The technical challenge is to align the tweets with the correct segment of the transcript. ET-LDA by Hu et al. [9] addresses this issue by modeling the whole process with an LDA-based graphical ...

Contributors
Acharya, Anirudh, Kambhampati, Subbarao, Davulcu, Hasan, et al.
Created Date
2015

Online health forums provide a convenient channel for patients, caregivers, and medical professionals to share their experience, support and encourage each other, and form health communities. The fast-growing content in health forums provides a large repository for people to seek valuable information. A forum user can issue a keyword query to search health forums regarding specific questions, e.g., what treatments are effective for a disease symptom? A medical researcher can discover medical knowledge in a timely and large-scale fashion by automatically aggregating the latest evidence emerging in health forums. This dissertation studies how to effectively discover information ...

Contributors
Liu, Yunzhong, Chen, Yi, Liu, Huan, et al.
Created Date
2016