ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, and supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.




Online social networks are the hubs of social activity in cyberspace, and using them to exchange knowledge, experiences, and opinions is common. In this work, an advanced topic modeling framework is designed to analyze complex longitudinal health information from social media with minimal human annotation, and Adverse Drug Events and Reaction (ADR) information is extracted and automatically processed by using a biased topic modeling method. This framework improves and extends existing topic modeling algorithms that incorporate background knowledge. Using this approach, background knowledge such as ADR terms and other biomedical knowledge can be incorporated during the text mining process, with …

Contributors
Yang, Jian, Gonzalez, Graciela, Davulcu, Hasan, et al.
Created Date
2017

Automating aspects of biocuration through biomedical information extraction could significantly impact biomedical research by enabling greater biocuration throughput and improving the feasibility of a wider scope. An important step in biomedical information extraction systems is named entity recognition (NER), where mentions of entities such as proteins and diseases are located within natural-language text and their semantic type is determined. This step is critical for later tasks in an information extraction pipeline, including normalization and relationship extraction. BANNER is a benchmark biomedical NER system using linear-chain conditional random fields and the rich feature set approach. A case study with BANNER locating …

Contributors
Leaman, James Robert, Gonzalez, Graciela, Baral, Chitta, et al.
Created Date
2013

Internet sites that support user-generated content, so-called Web 2.0, have become part of the fabric of everyday life in technologically advanced nations. Users collectively spend billions of hours consuming and creating content on social networking sites, weblogs (blogs), and various other types of sites in the United States and around the world. Given the fundamentally emotional nature of humans and the amount of emotional content that appears in Web 2.0 content, it is important to understand how such websites can affect the emotions of users. This work attempts to determine whether emotion spreads through an online social network (OSN). To …

Contributors
Cole, William David, Liu, Huan, Sarjoughian, Hessam, et al.
Created Date
2011

As the information available to lay users through autonomous data sources continues to increase, mediators become important to ensure that the wealth of information available is tapped effectively. A key challenge that these information mediators need to handle is the varying levels of incompleteness in the underlying databases in terms of missing attribute values. Existing approaches such as Query Processing over Incomplete Autonomous Databases (QPIAD) aim to mine and use Approximate Functional Dependencies (AFDs) to predict and retrieve relevant incomplete tuples. These approaches make independence assumptions about missing values--which critically hobbles their performance when there are tuples containing missing values …

Contributors
Raghunathan, Rohit, Kambhampati, Subbarao, Liu, Huan, et al.
Created Date
2011

This thesis deals with the analysis of interpersonal communication dynamics in online social networks and social media. Our central hypothesis is that communication dynamics between individuals manifest themselves via three key aspects: the information that is the content of communication, the social engagement i.e. the sociological framework emergent of the communication process, and the channel i.e. the media via which communication takes place. Communication dynamics have been of interest to researchers from multi-faceted domains over the past several decades. However, today we are faced with several modern capabilities encompassing a host of social media websites. These sites feature variegated interactional …

Contributors
De Choudhury, Munmun, Sundaram, Hari, Candan, K. Selcuk, et al.
Created Date
2011

Identifying chemical compounds that inhibit bacterial infection has recently gained a considerable amount of attention given the increased number of highly resistant bacteria and the serious health threat they pose around the world. With the development of automated microscopy and image analysis systems, the process of identifying novel therapeutic drugs can generate an immense amount of data, easily reaching terabytes' worth of information. Despite the vast and growing amount of data that is currently generated, traditional analytical methods have not increased the overall success rate of identifying active chemical compounds that eventually become novel therapeutic drugs. Moreover, multispectral imaging has …

Contributors
Trevino, Robert, Liu, Huan, Lamkin, Thomas J, et al.
Created Date
2016

Phishing is a form of online fraud where a spoofed website tries to gain access to a user's sensitive information by tricking the user into believing that it is a benign website. There are several solutions to detect phishing attacks, such as educating users, using blacklists, or extracting phishing characteristics found to exist in phishing attacks. In this thesis, we analyze approaches that extract features from phishing websites and train classification models with the extracted feature set to classify phishing websites. We create an exhaustive list of all features used in these approaches and categorize them into 6 broader categories and 33 …

Contributors
Namasivayam, Bhuvana Lalitha, Bazzi, Rida, Zhao, Ziming, et al.
Created Date
2017

A myriad of social media services have emerged in recent years that allow people to communicate and express themselves conveniently and easily. The pervasive use of social media generates massive data at an unprecedented rate. It becomes increasingly difficult for online users to find relevant information; in other words, the information overload problem is exacerbated. Meanwhile, users in social media can be both passive content consumers and active content producers, causing the quality of user-generated content to vary dramatically from excellence to abuse or spam, which results in a problem of information credibility. Trust, providing evidence about with whom users …

Contributors
Tang, Jiliang, Liu, Huan, Xue, Guoliang, et al.
Created Date
2015

In most social networking websites, users are allowed to perform interactive activities. One of the fundamental features that these sites provide is connecting with users of their kind. On one hand, this activity makes online connections visible and tangible; on the other hand, it makes the exploration of our connections and the expansion of our social networks easier. The aggregation of people who share common interests forms social groups, which are fundamental parts of our social lives. Social behavioral analysis at a group level is an active research area and attracts much interest from industry. Challenges of my …

Contributors
Wang, Xufei, Liu, Huan, Kambhampati, Subbarao, et al.
Created Date
2013

Recommender systems are a type of information filtering system that suggests items that may be of interest to a user. Most information retrieval systems have an overwhelmingly large number of entries. Most users would experience information overload if they were forced to explore the full set of results. The goal of recommender systems is to overcome this limitation by predicting how users will value certain items and returning the items that should be of the highest interest to the user. Most recommender systems collect explicit user feedback, such as a rating, and attempt to optimize their model to this rating …

Contributors
Ackerman, Brian, Chen, Yi, Candan, Kasim, et al.
Created Date
2012

The US Senate is the venue of political debates where federal bills are formed and voted on. Senators show their support for or opposition to the bills with their votes. This information makes it possible to extract the polarity of the senators. Similarly, the blogosphere plays an increasingly important role as a forum for public debate. Authors display sentiment toward issues, organizations or people using natural language. In this research, given a mixed set of senators/blogs debating on a set of political issues from opposing camps, I use signed bipartite graphs for modeling debates, and I propose an algorithm for partitioning both the opinion …

Contributors
Gokalp, Sedat, Davulcu, Hasan, Sen, Arunabha, et al.
Created Date
2015

The dawn of the Internet of Things (IoT) has opened the opportunity for mainstream adoption of machine learning analytics. However, most research in machine learning has focused on discovery of new algorithms or fine-tuning the performance of existing algorithms. Little exists on the process of taking an algorithm from the lab environment into the real world, culminating in sustained value. Real-world applications are typically characterized by dynamic non-stationary systems with requirements around feasibility, stability and maintainability. Not much has been done to establish standards around the unique analytics demands of real-world scenarios. This research explores the problem of why so few of …

Contributors
Shahapurkar, Som, Liu, Huan, Davulcu, Hasan, et al.
Created Date
2016

Prognostics and health management (PHM) is a method that permits the reliability of a system to be evaluated in its actual application conditions. This work involved developing a robust system to determine the advent of failure. Using the data from the PHM experiment, a model was developed to estimate the prognostic features and build a condition-based system based on measured prognostics. To enable prognostics, a framework was developed to extract load parameters required for damage assessment from irregular time-load data. As a part of the methodology, a database engine was built to maintain and monitor the experimental data. This …

Contributors
Varadarajan, Gayathri, Liu, Huan, Ye, Jieping, et al.
Created Date
2010

Social media has become everyone's norm for communication. The usage of social media has increased exponentially in the last decade. The myriad social media services, such as Facebook, Twitter, Snapchat, and Instagram, allow people to connect freely with their friends and followers. The number of attackers who try to take advantage of this situation has also increased at an exponential rate. Every social media service has its own recommender systems and user profiling algorithms. These algorithms use users' current information to make different recommendations. Often the data that is formed from social media services is Linked data as …

Contributors
Magham, Venkatesh, Liu, Huan, Wu, Liang, et al.
Created Date
2019

Exabytes of data are created online every day. This deluge of data is nowhere more apparent than on social media. Naturally, finding ways to leverage this unprecedented source of human information is an active area of research. Social media platforms have become laboratories for conducting experiments about people at scales thought unimaginable only a few years ago. Researchers and practitioners use social media to extract actionable patterns such as where aid should be distributed in a crisis. However, the validity of these patterns relies on having a representative dataset. As this dissertation shows, the data collected from social …

Contributors
Morstatter, Fred, Liu, Huan, Kambhampati, Subbarao, et al.
Created Date
2017

Social media platforms such as Twitter, Facebook, and blogs have emerged as valuable - in fact, the de facto - virtual town halls for people to discover, report, share and communicate with others about various types of events. These events range from widely known events such as the U.S. presidential debate to smaller-scale, local events such as a local Halloween block party. During these events, we often witness a large amount of commentary contributed by crowds on social media. This burst of social media responses surges with the "second-screen" behavior and greatly enriches the user experience when interacting with the …

Contributors
Hu, Yuheng, Kambhampati, Subbarao, Horvitz, Eric, et al.
Created Date
2014

The field of Data Mining is widely recognized and accepted for its applications in many business problems to guide decision-making processes based on data. However, in recent times, the scope of these problems has swollen and the methods are under scrutiny for applicability and relevance to real-world circumstances. At the crossroads of innovation and standards, it is important to examine and understand whether the current theoretical methods for industrial applications (which include KDD, SEMMA and CRISP-DM) encompass all possible scenarios that could arise in practical situations. Do the methods require changes or enhancements? As part of the thesis I study …

Contributors
Anand, Aneeth, Liu, Huan, Kempf, Karl G, et al.
Created Date
2012

Online social media is popular due to its real-time nature, extensive connectivity and a large user base. This motivates users to employ social media for seeking information by reaching out to their large number of social connections. Information seeking can manifest in the form of requests for personal and time-critical information or gathering perspectives on important issues. Social media platforms are not designed for resource seeking and experience large volumes of messages, leading to requests not being fulfilled satisfactorily. Designing frameworks to facilitate efficient information seeking in social media will help users to obtain appropriate assistance for their needs and …

Contributors
Ranganath, Suhas, Liu, Huan, Lai, Ying-Cheng, et al.
Created Date
2017

A statement appearing in social media presents a significant challenge for determining its provenance. Provenance describes the origin, custody, and ownership of something. Most statements appearing in social media are not published with corresponding provenance data. However, the same characteristics that make the social media environment challenging, including the massive amounts of data available, large numbers of users, and a highly dynamic environment, provide unique and untapped opportunities for solving the provenance problem for social media. Current approaches for tracking provenance data do not scale for online social media and consequently there is a gap in …

Contributors
Barbier, Geoffrey, Liu, Huan, Bell, Herbert, et al.
Created Date
2011

While techniques for reading DNA in some capacity have been available for decades, the ability to accurately edit genomes at scale has remained elusive. Novel techniques have been introduced recently to aid in the writing of DNA sequences. While writing DNA is more accessible, it still remains expensive, justifying the increased interest in in silico predictions of cell behavior. In order to accurately predict the behavior of cells it is necessary to extensively model the cell environment, including gene-to-gene interactions as completely as possible. Significant algorithmic advances have been made for identifying these interactions, but despite these improvements current techniques …

Contributors
Faucon, Philippe Christophe, Liu, Huan, Wang, Xiao, et al.
Created Date
2017