ASU Electronic Theses and Dissertations



In supervised learning, a model is learned from a small set of labeled documents and then used to classify a larger set of unlabeled documents. Machine learning techniques can be used to analyze a political scenario in a given society. Considerable research in this field aims to understand how people in a society interact in response to actions taken by their organizations. This paper examines Russian influence on people in Latvia. This is done by building an effective model learned on an initial set ...

Contributors
Bollapragada, Lakshmi Gayatri Niharika, Davulcu, Hasan, Sen, Arunabha, et al.
Created Date
2016

Online health forums provide a convenient channel for patients, caregivers, and medical professionals to share their experience, support and encourage each other, and form health communities. The fast-growing content in health forums provides a large repository for people to seek valuable information. A forum user can issue a keyword query to search health forums regarding specific questions, e.g., what treatments are effective for a disease symptom? A medical researcher can discover medical knowledge in a timely and large-scale fashion by automatically aggregating the latest evidence emerging in health forums. This dissertation studies how to effectively discover information ...

Contributors
Liu, Yunzhong, Chen, Yi, Liu, Huan, et al.
Created Date
2016

While discrete emotions like joy, anger, disgust etc. are quite popular, continuous emotion dimensions like arousal and valence are gaining popularity within the research community due to an increase in the availability of datasets annotated with these emotions. Unlike the discrete emotions, continuous emotions allow modeling of subtle and complex affect dimensions but are difficult to predict. Dimension reduction techniques form the core of emotion recognition systems and help create a new feature space that is more helpful in predicting emotions. But these techniques do not necessarily guarantee a better predictive capability as most of them are unsupervised, especially in ...

Contributors
Lade, Prasanth, Panchanathan, Sethuraman, Davulcu, Hasan, et al.
Created Date
2015

Proliferation of social media websites and discussion forums in the last decade has resulted in social media mining emerging as an effective mechanism to extract consumer patterns. Most research on social media and pharmacovigilance has concentrated on Adverse Drug Reaction (ADR) identification. Such methods employ a step of drug search followed by classification of the associated text as containing an ADR or not. Although this method works efficiently for ADR classification, if ADR evidence is present in users' posts over time, drug mentions fail to capture such ADRs. It also fails to record additional user information which may provide an ...

Contributors
Chandrashekar, Pramod Bharadwaj Chandrashekar, Davulcu, Hasan, Gonzalez, Graciela, et al.
Created Date
2016

Cyber systems, including IoT (Internet of Things), are increasingly used to improve the efficiency and reduce the cost of critical application areas, such as finance, transportation, defense, and healthcare. Over the past two decades, computing efficiency and hardware cost have improved dramatically. These improvements have made cyber systems omnipotent, and they now control many aspects of human lives. Emerging trends in successful cyber system breaches show increasing sophistication in attacks and that attackers are no longer limited by resources, including human and computing power. Most existing cyber defense systems for IoT systems have two major issues: (1) ...

Contributors
Buduru, Arun Balaji, Yau, Sik-Sang, Ahn, Gail-Joon, et al.
Created Date
2016

The amount of time series data generated is increasing due to the integration of sensor technologies with everyday applications, such as gesture recognition, energy optimization, health care, and video surveillance. The simultaneous use of multiple sensors for capturing different aspects of real-world attributes has also led to an increase in dimensionality from uni-variate to multi-variate time series. This has facilitated richer data representation but has also necessitated algorithms for determining similarity between two multi-variate time series for search and analysis. Various algorithms have been extended from the uni-variate to the multi-variate case, such as multi-variate versions of Euclidean distance, edit distance, dynamic ...

Contributors
Garg, Yash, Candan, Kasim Selcuk, Chowell-Puente, Gerardo, et al.
Created Date
2015

A new algebraic system, Test Algebra (TA), is proposed for identifying faults in combinatorial testing for SaaS (Software-as-a-Service) applications. In the context of cloud computing, SaaS is a new software delivery model, in which mission-critical applications are composed, deployed, and executed on cloud platforms. Testing SaaS applications is challenging because new applications need to be tested once they are composed, and prior to their deployment. A composition of components providing services yields a configuration providing a SaaS application. While individual components in the configuration may have been thoroughly tested, faults still arise due to interactions among the components composed, making ...

Contributors
Qi, Guanqiu, Tsai, Wei-Tek, Davulcu, Hasan, et al.
Created Date
2014

Learning from high-dimensional biomedical data has attracted much attention recently. High-dimensional biomedical data often suffer from the curse of dimensionality and have imbalanced class distributions. Both of these features of biomedical data, high dimensionality and imbalanced class distributions, are challenging for traditional machine learning methods and may affect model performance. In this thesis, I focus on developing learning methods for high-dimensional imbalanced biomedical data. In the first part, a sparse canonical correlation analysis (CCA) method is presented. Penalty terms are used to control the sparsity of the projection matrices of CCA. The sparse CCA method ...

Contributors
Yang, Tao, Ye, Jieping, Wang, Yalin, et al.
Created Date
2013

In recent years, an increasing number of applications have used multi-variate time series data, where multiple uni-variate time series coexist. However, there is a lack of systematic analysis of multi-variate time series. This thesis focuses on (a) defining a simplified inter-related multi-variate time series (IMTS) model and (b) developing a robust multi-variate temporal (RMT) feature extraction algorithm that can be used for locating, filtering, and describing salient features in multi-variate time series data sets. The proposed RMT feature can also be used for supporting multiple analysis tasks, such as visualization, segmentation, and searching / retrieving based on multi-variate time series similarities. ...

Contributors
Wang, Xiaolan, Candan, Kasim Selcuk, Sapino, Maria Luisa, et al.
Created Date
2013

The purpose of this research is to efficiently analyze certain data provided and to see if a useful trend can be observed as a result. This trend can be used to analyze certain probabilities. There are three main pieces of data which are being analyzed in this research: The value for δ of the call and put option, the %B value of the stock, and the amount of time until expiration of the stock option. The %B value is the most important. The purpose of analyzing the data is to see the relationship between the variables and, given certain values, ...

Contributors
Reeves, Michael Thomas, Richa, Andrea, McCarville, Daniel, et al.
Created Date
2015
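
The %B value mentioned in the abstract above is not defined in this listing. Assuming it refers to the Bollinger Bands %B indicator, it can be sketched as the position of the latest price between a lower and an upper band placed a fixed number of standard deviations around a moving average. The function name, window length, and band width below are illustrative assumptions, not values taken from the thesis.

```python
from statistics import mean, pstdev


def percent_b(prices, window=20, num_std=2.0):
    """Return Bollinger %B for the latest price in `prices`.

    Assumption: %B = (price - lower) / (upper - lower), where the bands
    sit `num_std` population standard deviations around the simple
    moving average of the last `window` prices.
    """
    if len(prices) < window:
        raise ValueError("need at least `window` prices")
    recent = prices[-window:]
    middle = mean(recent)
    spread = num_std * pstdev(recent)
    upper = middle + spread
    lower = middle - spread
    if upper == lower:
        # Flat prices: the bands collapse, so report the midpoint.
        return 0.5
    return (prices[-1] - lower) / (upper - lower)
```

For example, `percent_b([10, 12, 10, 12], window=4)` gives 0.75: the average is 11, the bands are 9 and 13, and the last price of 12 sits three quarters of the way up.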

In visualizing information hierarchies, icicle plots are efficient diagrams in that they provide the user a straightforward layout for different levels of data in a hierarchy and enable the user to compare items based on the item width. However, as the size of the hierarchy grows large, the items in an icicle plot end up being small and indistinguishable. In this thesis, by maintaining the positive characteristics of traditional icicle plots and incorporating new features such as dynamic diagram and active layer, we developed an interactive visualization that allows the user to selectively drill down or roll up to review ...

Contributors
Wu, Bi, Maciejewski, Ross, Runger, George, et al.
Created Date
2014

With the advent of the Internet, the data being added online is increasing at an enormous rate. Though search engines use IR techniques to serve search requests from users, the results often do not match the user's query well. The search engine user has to go through several webpages before reaching the one they want. This problem of Information Overload can be addressed using Automatic Text Summarization. Summarization is the process of obtaining an abridged version of documents so that the user can quickly understand what exactly the document is about. Email threads from ...

Contributors
Nadella, Sravan, Davulcu, Hasan, Li, Baoxin, et al.
Created Date
2015

In this thesis, multiple approaches are explored to enhance sentiment analysis of tweets. A standard sentiment analysis model with customized features is first trained and tested to establish a baseline. This is compared to an existing topic-based mixture model and a newly proposed topic-based vector model, both of which use Latent Dirichlet Allocation (LDA) for topic modeling. The proposed topic-based vector model achieves higher accuracy, in terms of averaged F scores, than the other two models.

Contributors
Baskaran, Swetha, Davulcu, Hasan, Sen, Arunabha, et al.
Created Date
2016

Multi-tenancy architecture (MTA) is often used in Software-as-a-Service (SaaS), and the central idea is that multiple tenant applications can be developed using components stored in the SaaS infrastructure. Recently, MTA has been extended so that a tenant application can have its own sub-tenants, with the tenant application acting like a SaaS infrastructure. In other words, MTA is extended to STA (Sub-Tenancy Architecture). In STA, each tenant application not only needs to develop its own functionalities, but also needs to prepare an infrastructure to allow its sub-tenants to develop customized applications. This dissertation formulates eight models for STA, and proposes ...

Contributors
Zhong, Peide, Davulcu, Hasan, Sarjoughian, Hessam, et al.
Created Date
2017

Similarity search in high-dimensional spaces is popular for applications like image processing, time series, and genome data. In higher dimensions, the curse of dimensionality undermines the effectiveness of most index structures, giving way to approximate methods like Locality Sensitive Hashing (LSH) to answer similarity searches. In addition to range searches and k-nearest neighbor searches, there is a need to answer negative queries formed by excluded regions in high-dimensional data. Though there have been a slew of LSH variants to improve efficiency, reduce storage, and provide better accuracy, none of the techniques are capable of answering ...

Contributors
Bhat, Aneesha, Candan, Kasim Selcuk, Davulcu, Hasan, et al.
Created Date
2016
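
As background for the abstract above, the basic idea of Locality Sensitive Hashing can be sketched with random hyperplanes: each vector is reduced to a short bit signature, and only vectors sharing a signature are compared. This is a generic textbook variant chosen for illustration, not the method proposed in the thesis; all function names and parameters are hypothetical.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw `n_bits` random Gaussian hyperplane normals of length `dim`."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """One bit per hyperplane: 1 if `vec` lies on its non-negative side."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(vectors, planes):
    """Group vector indices into buckets keyed by signature."""
    buckets = {}
    for i, vec in enumerate(vectors):
        buckets.setdefault(signature(vec, planes), []).append(i)
    return buckets


def candidates(query, planes, buckets):
    """Return indices of stored vectors that share the query's bucket."""
    return buckets.get(signature(query, planes), [])
```

Vectors separated by a small angle agree on most bits and tend to land in the same bucket, so a query inspects only a few candidates instead of the whole data set. Real systems usually build several such tables to reduce missed neighbors.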

Most data cleaning systems aim to go from a given deterministic dirty database to another deterministic but clean database. Such an enterprise presupposes that it is in fact possible for the cleaning process to uniquely recover the clean versions of each dirty data tuple. This is not possible in many cases, where the most a cleaning system can do is to generate a (hopefully small) set of clean candidates for each dirty tuple. When the cleaning system is required to output a deterministic database, it is forced to pick one clean candidate (say the "most likely" candidate) per tuple. Such ...

Contributors
Rihan, Preet Inder Singh, Kambhampati, Subbarao, Liu, Huan, et al.
Created Date
2013

Measuring node centrality is a critical common denominator behind many important graph mining tasks. While the existing literature offers a wealth of different node centrality measures, it remains a daunting task to change node centrality in a desired way. In this thesis, we study the problem of minimizing the centrality of one or more target nodes through edge operations. The heart of the proposed method is an accurate and efficient algorithm to estimate the impact of edge deletion on the spectrum of the underlying network, based on the observation that edge deletion is essentially a local, ...

Contributors
Peng, Ruiyue, Tong, Hanghang, He, Jingrui, et al.
Created Date
2016

With the advent of social media (such as Twitter and Facebook), people are sharing their opinions and sentiments, and pressing their ideologies on others, like never before. Even people who are otherwise socially inactive like to share their thoughts on current affairs by tweeting and sharing news feeds with their friends and acquaintances. In this thesis study, we chose Twitter as our main data platform to analyze shifts and movements of 27 political organizations in Indonesia. So far, we have collected over 30 million tweets and 150,000 news articles from RSS feeds of the corresponding organizations for our analysis. For ...

Contributors
Poornachandran, Sathishkumar, Davulcu, Hasan, Sen, Arunabha, et al.
Created Date
2013

Crises or large-scale emergencies such as earthquakes and hurricanes cause massive damage to lives and property. Crisis response is an essential task to mitigate the impact of a crisis. An effective response to a crisis necessitates information gathering and analysis. Traditionally, this process has been restricted to the information collected by first responders on the ground in the affected region or by official agencies such as local governments involved in the response. However, the ubiquity of mobile devices has empowered people to publish information during a crisis through social media, such as the damage reports from a hurricane. Social media ...

Contributors
Kumar, Shamanth, Liu, Huan, Davulcu, Hasan, et al.
Created Date
2015

A story is defined as "an actor(s) taking action(s) that culminates in a resolution(s)". I present novel sets of features to facilitate story detection among text via supervised classification and further reveal different forms within stories via unsupervised clustering. First, I investigate the utility of a new set of semantic features compared to standard keyword features combined with statistical features, such as density of part-of-speech (POS) tags and named entities, to develop a story classifier. The proposed semantic features are based on <Subject, Verb, Object> triplets that can be extracted using a shallow parser. Experimental results show that a model ...

Contributors
Ceran, Saadet Betul, Davulcu, Hasan, Corman, Steven R, et al.
Created Date
2016

This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries.

For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.