
ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.




Continuous advancements in biomedical research have resulted in the production of vast amounts of scientific data and literature discussing them. The ultimate goal of computational biology is to translate these large amounts of data into actual knowledge of the complex biological processes and accurate life science models. The ability to rapidly and effectively survey the literature is necessary for the creation of large scale models of the relationships among biomedical entities as well as hypothesis generation to guide biomedical research. To reduce the effort and time spent in performing these activities, an intelligent search system is required. Even though many …

Contributors
Kanwar, Pradeep, Davulcu, Hasan, Dinu, Valentin, et al.
Created Date
2010

Text search is a very useful way of retrieving document information from a particular website. The public generally use internet search engines over the local enterprise search engines, because the enterprise content is not cross linked and does not follow a page rank algorithm. On the other hand, the enterprise search engine uses metadata information, which allows the user to specify the conditions that any retrieved document should meet. Therefore, using metadata information for searching will also be very useful. My thesis aims at developing an enterprise search engine using metadata information by providing advanced features like faceted navigation. The …

Contributors
Sanaka, Srinivasa Raviteja, Davulcu, Hasan, Sen, Arunabha, et al.
Created Date
2010

Embedded Networked Systems (ENS) consist of various devices, which are embedded into physical objects (e.g., home appliances, vehicles, buildings, people). With rapid advances in processing and networking technologies, these devices can be fully connected and pervasive in the environment. The devices can interact with the physical world, collaborate to share resources, and provide context-aware services. This dissertation focuses on collaboration in ENS to provide smart services. However, there are several challenges because the system must be scalable to a huge number of devices; robust against noise, loss and failure; and secure despite communicating with strangers. To address these challenges, …

Contributors
Kim, Su Jin, Gupta, Sandeep K. S., Dasgupta, Partha, et al.
Created Date
2010

Data-driven applications are becoming increasingly complex with support for processing events and data streams in a loosely-coupled distributed environment, providing integrated access to heterogeneous data sources such as relational databases and XML documents. This dissertation explores the use of materialized views over structured heterogeneous data sources to support multiple query optimization in a distributed event stream processing framework that supports such applications involving various query expressions for detecting events, monitoring conditions, handling data streams, and querying data. Materialized views store the results of the computed view so that subsequent access to the view retrieves the materialized results, avoiding the cost …

Contributors
Chaudhari, Mahesh Balkrishna, Dietrich, Suzanne W, Urban, Susan D, et al.
Created Date
2011

Genes have widely different pertinences to the etiology and pathology of diseases. Thus, they can be ranked according to their disease-significance on a genomic scale, which is the subject of gene prioritization. Given a set of genes known to be related to a disease, it is reasonable to use them as a basis to determine the significance of other candidate genes, which will then be ranked based on the association they exhibit with respect to the given set of known genes. Experimental and computational data of various kinds have different reliability and relevance to a disease under study. This work …

Contributors
Lee, Jang, Gonzalez, Graciela, Ye, Jieping, et al.
Created Date
2011

Most existing approaches to complex event processing over streaming data rely on the assumption that the matches to the queries are rare and that the goal of the system is to identify these few matches within the incoming deluge of data. In many applications, such as stock market analysis and user credit card purchase pattern monitoring, however, the matches to the user queries are in fact plentiful and the system has to efficiently sift through these many matches to locate only the few most preferable matches. In this work, we propose a complex pattern ranking (CPR) framework for specifying top-k …

Contributors
Wang, Xinxin, Candan, K. Selcuk, Chen, Yi, et al.
Created Date
2011

Templates are widely used in Web site development. Finding the template for a given set of Web pages could be very important and useful for many applications like Web page classification and monitoring content and structure changes of Web pages. In this thesis, two novel sequence-based Web page template detection algorithms are presented. Different from tree mapping algorithms which are based on tree edit distance, sequence-based template detection algorithms operate on the Prüfer/Consolidated Prüfer sequences of trees. Since there are one-to-one correspondences between Prüfer/Consolidated Prüfer sequences and trees, sequence-based template detection algorithms identify the template by finding a common subsequence …

Contributors
Huang, Wei, Candan, Kasim Selçuk, Sundaram, Hari, et al.
Created Date
2011
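
The one-to-one correspondence between Prüfer sequences and trees that the abstract above relies on can be illustrated with a minimal sketch. It covers only the classical Prüfer sequence of a labeled tree (nodes numbered 0..n-1, n >= 2), not the Consolidated Prüfer variant or the thesis's template detection algorithms; the function names `prufer_encode` and `prufer_decode` are hypothetical.

```python
import heapq
from collections import defaultdict


def prufer_encode(edges, n):
    """Return the classical Pruefer sequence of a labeled tree on nodes 0..n-1."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Min-heap of current leaves, so the smallest-labeled leaf is removed first.
    leaves = [v for v in range(n) if len(adj[v]) == 1]
    heapq.heapify(leaves)
    seq = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        parent = next(iter(adj[leaf]))  # a leaf has exactly one neighbor
        seq.append(parent)
        adj[parent].discard(leaf)
        adj[leaf].clear()
        if len(adj[parent]) == 1:  # the neighbor has just become a leaf
            heapq.heappush(leaves, parent)
    return seq


def prufer_decode(seq):
    """Rebuild the edge list of the unique tree with the given Pruefer sequence."""
    n = len(seq) + 2
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges
```

Because encoding and decoding are inverse to each other, two trees can be compared by comparing their sequences, which is what makes sequence-based matching possible in principle.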

Java is currently making its way into embedded systems and mobile devices such as Android devices. The programs written in Java are compiled into machine independent binary class byte codes. A Java Virtual Machine (JVM) executes these classes. The Java platform additionally specifies the Java Native Interface (JNI). JNI allows Java code that runs within a JVM to interoperate with applications or libraries that are written in other languages and compiled to the host CPU ISA. JNI plays an important role in embedded systems as it provides a mechanism to interact with libraries specific to the platform. This thesis addresses the …

Contributors
Chandrian, Preetham, Lee, Yann-Hang, Davulcu, Hasan, et al.
Created Date
2011

As pointed out in the keynote speech by H. V. Jagadish in SIGMOD'07, and also commonly agreed in the database community, the usability of structured data by casual users is as important as the data management systems' functionalities. A major difficulty in using structured data is the problem of easily retrieving information from them given a user's information needs. Learning and using a structured query language (e.g., SQL and XQuery) is overwhelmingly burdensome for most users, as not only are these languages sophisticated, but the users need to know the data schema. Keyword search provides us with opportunities to conveniently …

Contributors
Liu, Ziyang, Chen, Yi, Candan, Kasim S, et al.
Created Date
2011

Navigating within non-linear structures is a challenge for all users when the space is large but the problem is most pronounced when the users are blind or visually impaired. Such users access digital content through screen readers like JAWS which read out the text on the screen. However presentation of non-linear narratives in such a manner without visual cues and information about spatial dependencies is very inefficient for such users. The NSDL Science Literacy StrandMaps are visual layouts to help students and teachers browse educational resources. A Strandmap shows relationships between concepts and how they build upon one another across …

Contributors
Gaur, Shruti, Candan, Kasim Selçuk, Sundaram, Hari, et al.
Created Date
2011

This thesis research attempts to observe, measure and visualize the communication patterns among developers of an open source community and analyze how this can be inferred in terms of progress of that open source project. Here I attempted to analyze the Ubuntu open source project's email data (9 subproject log archives over a period of five years) and focused on drawing more precise metrics from different perspectives of the communication data. Also, I attempted to overcome the scalability issue by using Apache Pig libraries, which run on a MapReduce framework based Hadoop Cluster. I described four metrics based on which …

Contributors
Motamarri, Lakshminarayana, Santanam, Raghu, Ye, Jieping, et al.
Created Date
2011

This thesis addresses the problem of online schema updates where the goal is to be able to update relational database schemas without reducing the database system's availability. Unlike some other work in this area, this thesis presents an approach which is completely client-driven and does not require specialized database management systems (DBMS). Also, unlike other client-driven work, this approach provides support for a richer set of schema updates including vertical split (normalization), horizontal split, vertical and horizontal merge (union), difference and intersection. The update process automatically generates a runtime update client from a mapping between the old and the new schemas. …

Contributors
Tyagi, Preetika, Bazzi, Rida, Candan, Kasim S, et al.
Created Date
2011

Source selection is one of the foremost challenges for searching the deep web. For a user query, source selection involves selecting a subset of deep-web sources expected to provide relevant answers to the user query. Existing source selection models employ query-similarity based local measures for assessing source quality. These local measures are necessary but not sufficient as they are agnostic to source trustworthiness and result importance, which, given the autonomous and uncurated nature of the deep web, have become indispensable for searching the deep web. SourceRank provides a global measure for assessing source quality based on source trustworthiness and result importance. SourceRank's effectiveness has been evaluated …

Contributors
Jha, Manishkumar, Kambhampati, Subbarao, Liu, Huan, et al.
Created Date
2011

Situations of sensory overload are steadily becoming more frequent as the ubiquity of technology approaches reality--particularly with the advent of socio-communicative smartphone applications, and pervasive, high speed wireless networks. Although the ease of accessing information has improved our communication effectiveness and efficiency, our visual and auditory modalities--those modalities that today's computerized devices and displays largely engage--have become overloaded, creating possibilities for distractions, delays and high cognitive load, which in turn can lead to a loss of situational awareness, increasing chances for life threatening situations such as texting while driving. Surprisingly, alternative modalities for information delivery have seen little exploration. Touch, …

Contributors
Mcdaniel, Troy Lee, Panchanathan, Sethuraman, Davulcu, Hasan, et al.
Created Date
2012

Muslim radicalism is recognized as one of the greatest security threats for the United States and the rest of the world. Use of force to eliminate specific radical entities is ineffective in containing radicalism as a whole. There is a need to understand the origin, ideologies and behavior of Radical and Counter-Radical organizations and how they shape up over a period of time. Recognizing and supporting counter-radical organizations is one of the most important steps towards impeding radical organizations. A lot of research has already been done to categorize and recognize organizations, to understand their behavior, their interactions with other …

Contributors
Nair, Shreejay, Davulcu, Hasan, Dasgupta, Partha, et al.
Created Date
2012

Analysis of political texts, which contain a huge amount of personal political opinions, sentiments, and emotions towards powerful individuals, leaders, organizations, and a large number of people, is an interesting task, which can lead to the discovery of interesting interactions between the political parties and people. Recently, the political blogosphere has played an increasingly important role in politics, as a forum for debating political issues. Most of the political weblogs are biased towards their political parties, and they generally express their sentiments towards their issues (i.e., leaders, topics, etc.) and also towards issues of the opposing parties. In this thesis, I have modeled the …

Contributors
Thirumalai, Dananjayan, Davulcu, Hasan, Sarjoughian, Hessam, et al.
Created Date
2012

The overall contribution of the Minerva Initiative at ASU is to map social organizations in a multidimensional space that provides a measure of their radical or counter radical influence over the demographics of a nation. This tool serves as a simple content management system to store and track project resources like documents, images, videos and web links. It provides centralized and secure access to email conversations among project team members. Conversations are categorized into one of the seven pre-defined categories. Each category is associated with a certain set of keywords and we follow a frequency based approach for matching email …

Contributors
Nair, Apurva, Davulcu, Hasan, Sen, Arunabha, et al.
Created Date
2012

The pay-as-you-go economic model of cloud computing increases the visibility, traceability, and verifiability of software costs. Application developers must understand how their software uses resources when running in the cloud in order to stay within budgeted costs and/or produce expected profits. Cloud computing's unique economic model also leads naturally to an earn-as-you-go profit model for many cloud based applications. These applications can benefit from low level analyses for cost optimization and verification. Testing cloud applications to ensure they meet monetary cost objectives has not been well explored in the current literature. When considering revenues and costs for cloud applications, the …

Contributors
Buell, Kevin, Collofello, James, Davulcu, Hasan, et al.
Created Date
2012

This dissertation presents the Temporal Event Query Language (TEQL), a new language for querying event streams. Event Stream Processing enables online querying of streams of events to extract relevant data in a timely manner. TEQL enables querying of interval-based event streams using temporal database operators. Temporal databases and temporal query languages have been a subject of research for more than 30 years and are a natural fit for expressing queries that involve a temporal dimension. However, operators developed in this context cannot be directly applied to event streams. The research extends a preexisting relational framework for event stream processing to …

Contributors
Shiva, Foruhar Ali, Urban, Susan D, Chen, Yi, et al.
Created Date
2012

A semiconductor supply chain modeling and simulation platform using Linear Program (LP) optimization and parallel Discrete Event System Specification (DEVS) process models has been developed in a joint effort by ASU and Intel Corporation. A Knowledge Interchange Broker (KIBDEVS/LP) was developed to broker information synchronously between the DEVS and LP models. Recently a single-echelon heuristic Inventory Strategy Module (ISM) was added to correct for forecast bias in customer demand data using different smoothing techniques. The optimization model could then use information provided by the forecast model to make better decisions for the process model. The composition of ISM with LP …

Contributors
Smith, James Melkon, Sarjoughian, Hessam S, Davulcu, Hasan, et al.
Created Date
2012

Process migration is a heavily studied research area and has a number of applications in distributed systems. Process migration means transferring a process running on one machine to another such that it resumes execution from the point at which it was suspended. The conventional approach to implement process migration is to move the entire state information of the process (including hardware context, virtual memory, files etc.) from one machine to another. Copying all the state information is costly. This thesis proposes and demonstrates a new approach of migrating a process between two cores of Intel Single Chip Cloud (SCC), an …

Contributors
Jain, Vaibhav, Dasgupta, Partha, Shrivastava, Aviral, et al.
Created Date
2013

Text classification, in the artificial intelligence domain, is an activity in which text documents are automatically classified into predefined categories using machine learning techniques. An example of this is classifying uncategorized news articles into different predefined categories such as "Business", "Politics", "Education", "Technology", etc. In this thesis, a supervised machine learning approach is followed, in which a module is first trained with pre-classified training data and then the class of test data is predicted. Good feature extraction is an important step in the machine learning approach and hence the main component of this text classifier is semantic triplet based features in …

Contributors
Karad, Ravi Chandravadan, Davulcu, Hasan, Corman, Steven, et al.
Created Date
2013

The increasing popularity of Twitter renders improved trustworthiness and relevance assessment of tweets much more important for search. However, given the limitations on the size of tweets, it is hard to extract measures for ranking from the tweet's content alone. I propose a method of ranking tweets by generating a reputation score for each tweet that is based not just on content, but also additional information from the Twitter ecosystem that consists of users, tweets, and the web pages that tweets link to. This information is obtained by modeling the Twitter ecosystem as a three-layer graph. The reputation score is …

Contributors
Ravikumar, Srijith, Kambhampati, Subbarao, Davulcu, Hasan, et al.
Created Date
2013

The complexity of the systems that software engineers build has continuously grown since the inception of the field. What has not changed is the engineers' mental capacity to operate on about seven distinct pieces of information at a time. The widespread use of UML has led to more abstract software design activities, however the same cannot be said for reverse engineering activities. The introduction of abstraction to reverse engineering will allow the engineer to move farther away from the details of the system, increasing his ability to see the role that domain level concepts play in the system. In this …

Contributors
Carey, Maurice, Colbourn, Charles, Collofello, James, et al.
Created Date
2013

Our research focuses on finding answers through decentralized search, for complex, imprecise queries (such as "Which is the best hair salon nearby?") in situations where there is a spatiotemporal constraint (say, the answer needs to be found within 15 minutes) associated with the query. In general, human networks are good at answering imprecise queries. We try to use the social network of a person to answer his query. Our research aims at designing a framework that exploits the user's social network in order to maximize the answers for a given query. Exploiting a user's social network has several challenges. The major …

Contributors
Swaminathan, Neelakantan, Sundaram, Hari, Davulcu, Hasan, et al.
Created Date
2013

With the advent of social media (like Twitter, Facebook, etc.), people are easily sharing their opinions, sentiments and enforcing their ideologies on others like never before. Even people who are otherwise socially inactive would like to share their thoughts on current affairs by tweeting and sharing news feeds with their friends and acquaintances. In this thesis study, we chose Twitter as our main data platform to analyze shifts and movements of 27 political organizations in Indonesia. So far, we have collected over 30 million tweets and 150,000 news articles from RSS feeds of the corresponding organizations for our analysis. For …

Contributors
Poornachandran, Sathishkumar, Davulcu, Hasan, Sen, Arunabha, et al.
Created Date
2013

In recent years, an increasing number of applications use multi-variate time series data where multiple uni-variate time series coexist. However, there is a lack of systematic analysis of multi-variate time series. This thesis focuses on (a) defining a simplified inter-related multi-variate time series (IMTS) model and (b) developing a robust multi-variate temporal (RMT) feature extraction algorithm that can be used for locating, filtering, and describing salient features in multi-variate time series data sets. The proposed RMT feature can also be used for supporting multiple analysis tasks, such as visualization, segmentation, and searching / retrieving based on multi-variate time series similarities. …

Contributors
Wang, Xiaolan, Candan, Kasim Selcuk, Sapino, Maria Luisa, et al.
Created Date
2013

Learning from high dimensional biomedical data has attracted much attention recently. High dimensional biomedical data often suffer from the curse of dimensionality and have imbalanced class distributions. Both of these features of biomedical data, high dimensionality and imbalanced class distributions, are challenging for traditional machine learning methods and may affect the model performance. In this thesis, I focus on developing learning methods for high-dimensional imbalanced biomedical data. In the first part, a sparse canonical correlation analysis (CCA) method is presented. Penalty terms are used to control the sparsity of the projection matrices of CCA. The sparse CCA method …

Contributors
Yang, Tao, Ye, Jieping, Wang, Yalin, et al.
Created Date
2013

Most data cleaning systems aim to go from a given deterministic dirty database to another deterministic but clean database. Such an enterprise presupposes that it is in fact possible for the cleaning process to uniquely recover the clean versions of each dirty data tuple. This is not possible in many cases, where the most a cleaning system can do is to generate a (hopefully small) set of clean candidates for each dirty tuple. When the cleaning system is required to output a deterministic database, it is forced to pick one clean candidate (say the "most likely" candidate) per tuple. Such …

Contributors
Rihan, Preet Inder Singh, Kambhampati, Subbarao, Liu, Huan, et al.
Created Date
2013

There has been a lot of research in the field of artificial intelligence about thinking machines. Alan Turing proposed a test to observe a machine's intelligent behaviour with respect to natural language conversation. The Winograd schema challenge is suggested as an alternative to the Turing test. It needs inferencing capabilities, reasoning abilities and background knowledge to get the answer right. It involves a coreference resolution task in which a machine is given a sentence containing a situation which involves two entities, one pronoun and some more information about the situation and the machine has to come up with the right …

Contributors
Budukh, Tejas Ulhas, Baral, Chitta, Vanlehn, Kurt, et al.
Created Date
2013

Continuous Delivery, as one of the youngest and most popular members of the agile model family, has recently become a popular concept and method in the software development industry. Unlike the traditional software development method, in which requirements and solutions must be fixed before development starts, it promotes adaptive planning, evolutionary development and delivery, and encourages rapid and flexible response to change. However, several problems prevent Continuous Delivery from being introduced into education. Taking these barriers into consideration, we propose a new Cloud based Continuous Delivery Software Developing System. This system is designed to fully utilize the whole …

Contributors
Deng, Yuli, Huang, Dijiang, Davulcu, Hasan, et al.
Created Date
2013

The wide adoption and continued advancement of information and communications technologies (ICT) have made it easier than ever for individuals and groups to stay connected over long distances. These advances have contributed to dramatically changing the dynamics of the modern day workplace to the point where it is now commonplace to see large, distributed multidisciplinary teams working together on a daily basis. However, in this environment, motivating, understanding, and valuing the diverse contributions of individual workers in collaborative enterprises becomes challenging. To address these issues, this thesis presents the goals, design, and implementation of Taskville, a distributed workplace game …

Contributors
Nikkila, Shawn, Sundaram, Hari, Byrne, Daragh, et al.
Created Date
2013

In contemporary society, sustainability and public well-being have been pressing challenges. Some of the important questions are: How can sustainable practices, such as reducing carbon emission, be encouraged? How can a healthy lifestyle be maintained? Even though individuals are interested, they are unable to adopt these behaviors due to resource constraints. Developing a framework to enable cooperative behavior adoption and to sustain it for a long period of time is a major challenge. As a part of developing this framework, I am focusing on methods to understand behavior diffusion over time. Facilitating behavior diffusion with resource constraints in a large population …

Contributors
Dey, Anindita, Sundaram, Hari, Turaga, Pavan, et al.
Created Date
2013

Biological organisms are made up of cells containing numerous interconnected biochemical processes. Diseases occur when normal functionality of these processes is disrupted, manifesting as disease symptoms. Thus, understanding these biochemical processes and their interrelationships is a primary task in biomedical research and a prerequisite for activities including diagnosing diseases and drug development. Scientists studying these interconnected processes have identified various pathways involved in drug metabolism, diseases, and signal transduction, etc. High-throughput technologies, new algorithms and speed improvements over the last decade have resulted in deeper knowledge about biological systems, leading to more refined pathways. Such pathways tend to be large …

Contributors
Anwar, Saadat, Baral, Chitta, Inoue, Katsumi, et al.
Created Date
2014

As the size and scope of valuable datasets has exploded across many industries and fields of research in recent years, an increasingly diverse audience has sought out effective tools for their large-scale data analytics needs. Over this period, machine learning researchers have also been very prolific in designing improved algorithms which are capable of finding the hidden structure within these datasets. As consumers of popular Big Data frameworks have sought to apply and benefit from these improved learning algorithms, the problems encountered with the frameworks have motivated a new generation of Big Data tools to address the shortcomings of the …

Contributors
Krouse, Brian Richard, Ye, Jieping, Liu, Huan, et al.
Created Date
2014

Contemporary online social platforms present individuals with social signals in the form of news feeds on their peers' activities. On networks such as Facebook and Quora, the network operator decides how that information is shown to an individual. Then the user, with her own interests and resource constraints, selectively acts on a subset of items presented to her. The network operator, again, shows that activity to a selection of peers, thus creating a behavioral loop. That mechanism of interaction and information flow raises some very interesting questions such as: can the network operator design social signals to promote a particular activity like …

Contributors
Le, Tien Dinh, Sundaram, Hari, Davulcu, Hasan, et al.
Created Date
2014

Twitter is a micro-blogging platform where the users can be social, informational or both. In certain cases, users generate tweets that have no "hashtags" or "@mentions"; we call such a tweet an orphaned tweet. The user will be more interested to find more "context" of an orphaned tweet, presumably to engage with his/her friend on that topic. Finding context for an orphaned tweet manually is challenging because of the large social graph of a user, the enormous volume of tweets generated per second, topic diversity, and the limited information from the tweet length of 140 characters. To help the user to get the context …

Contributors
Vijayakumar, Manikandan, Kambhampati, Subbarao, Liu, Huan, et al.
Created Date
2014

Multidimensional data have various representations. Thanks to their simplicity in modeling multidimensional data and the availability of various mathematical tools (such as tensor decompositions) that support multi-aspect analysis of such data, tensors are increasingly being used in many application domains including scientific data management, sensor data management, and social network data analysis. The relational model, on the other hand, enables semantic manipulation of data using relational operators, such as projection, selection, Cartesian-product, and set operators. For many multidimensional data applications, tensor operations as well as relational operations need to be supported throughout the data life cycle. In this thesis, we introduce …

Contributors
Kim, Mijung, Candan, K. Selcuk, Davulcu, Hasan, et al.
Created Date
2014

In visualizing information hierarchies, icicle plots are efficient diagrams in that they provide the user a straightforward layout for different levels of data in a hierarchy and enable the user to compare items based on the item width. However, as the size of the hierarchy grows large, the items in an icicle plot end up being small and indistinguishable. In this thesis, by maintaining the positive characteristics of traditional icicle plots and incorporating new features such as dynamic diagram and active layer, we developed an interactive visualization that allows the user to selectively drill down or roll up to review …

Contributors
Wu, Bi, Maciejewski, Ross, Runger, George, et al.
Created Date
2014

Advances in data collection technologies have made it cost-effective to obtain heterogeneous data from multiple data sources. Very often, the data are of very high dimension and feature selection is preferred in order to reduce noise, save computational cost and learn interpretable models. Due to the multi-modal nature of heterogeneous data, it is interesting to design efficient machine learning models that are capable of performing variable selection and feature group (data source) selection simultaneously (a.k.a. bi-level selection). In this thesis, I carry out research along this direction with a particular focus on designing efficient optimization algorithms. I start with a …

Contributors
Xiang, Shuo, Ye, Jieping, Mittelmann, Hans D, et al.
Created Date
2014

A new algebraic system, Test Algebra (TA), is proposed for identifying faults in combinatorial testing for SaaS (Software-as-a-Service) applications. In the context of cloud computing, SaaS is a new software delivery model, in which mission-critical applications are composed, deployed, and executed on cloud platforms. Testing SaaS applications is challenging because new applications need to be tested once they are composed, and prior to their deployment. A composition of components providing services yields a configuration providing a SaaS application. While individual components in the configuration may have been thoroughly tested, faults still arise due to interactions among the components composed, making …

Contributors
Qi, Guanqiu, Tsai, Wei-Tek, Davulcu, Hasan, et al.
Created Date
2014

Interactive remote e-learning is one of the newest and most popular teaching methods in use today. WebRTC, meanwhile, has become a popular approach to real-time communication. Unlike the older Adobe Flash, it lets users communicate directly with each other rather than going through a server as the middleman. The world is moving from plug-ins to the web browser. However, WebRTC has not been widely used in school education. Taking into consideration the WebRTC solution for data transfer, we propose new cloud-based interactive multimedia that enables a virtual lab learning environment. …

Contributors
Li, Qingyun, Huang, Dijiang, Davulcu, Hasan, et al.
Created Date
2014

Skyline queries extract interesting points that are non-dominated and help paint the bigger picture of the data in question. They are valuable in many multi-criteria decision applications and are becoming a staple of decision support systems. An assumption commonly made by many skyline algorithms is that a skyline query is applied to a single static data source or data stream. Unfortunately, this assumption does not hold in many applications in which a skyline query may involve attributes belonging to multiple data sources and requires a join operation to be performed before the skyline can be produced. Recently, various skyline-join algorithms …

Contributors
Nagendra, Mithila, Candan, Kasim Selcuk, Chen, Yi, et al.
Created Date
2014
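
The notion of non-dominated points in the skyline abstract above can be illustrated with a minimal Python sketch. This is a naive quadratic-time illustration over a single in-memory list, not one of the skyline-join algorithms the thesis discusses. The function names, the sample data, and the convention that smaller values are better in every attribute are assumptions made for this example.

```python
def dominates(p, q):
    """True if p is at least as good as q in every attribute and strictly
    better in at least one (assumption: smaller values are better)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


# Hypothetical hotels described as (price, distance to the beach).
hotels = [(50, 8), (80, 2), (60, 5), (90, 6), (70, 9)]
result = skyline(hotels)
# (90, 6) is dominated by (60, 5); (70, 9) is dominated by (50, 8).
```

When the attributes live in different data sources, as the abstract notes, the tuples would first have to be joined before a scan like this could run, which is what makes the skyline-join setting more expensive.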

With the rise of social media, hundreds of millions of people all over the globe spend countless hours on social media to connect, interact, share, and create user-generated data. This rich environment provides tremendous opportunities for many different players to easily and effectively reach out to people, interact with them, influence them, or get their opinions. Two pieces of information attract the most attention on social media sites: user preferences and interactions. Businesses and organizations use this information to better understand social media users and therefore provide them with customized services. This data can be used for different …

Contributors
Abbasi, Mohammad Ali, Liu, Huan, Davulcu, Hasan, et al.
Created Date
2014

Corporations invest considerable resources to create, preserve, and analyze their data; yet while organizations are interested in protecting against unauthorized data transfer, there is no comprehensive metric for determining which data are at risk of leaking. This thesis motivates the need for a quantitative leakage risk metric, and provides a risk assessment system, called Whispers, for computing it. Using unsupervised machine learning techniques, Whispers uncovers themes in an organization's document corpus, including previously unknown or unclassified data. Then, by correlating each document with its authors, Whispers can identify which data are easier to contain, and conversely which are at risk. …

Contributors
Wright, Jeremy Lee, Syrotiuk, Violet, Davulcu, Hasan, et al.
Created Date
2014

Skyline queries are a well-established technique used in multi-criteria decision applications. The research community has recently shown interest in computing skylines efficiently, but the problem of presenting a skyline that takes the user's preferences into account is still open. Each user has varying interests in each attribute, and hence a "one size fits all" methodology might not satisfy all users. True user satisfaction can be obtained only when the skyline is tailored specifically to each user based on their preferences. This research investigates the problem of preference-aware skyline processing, which consists of inferring the …

Contributors
Rathinavelu, Sriram, Candan, Kasim Selcuk, Davulcu, Hasan, et al.
Created Date
2014

While discrete emotions like joy, anger, disgust etc. are quite popular, continuous emotion dimensions like arousal and valence are gaining popularity within the research community due to an increase in the availability of datasets annotated with these emotions. Unlike the discrete emotions, continuous emotions allow modeling of subtle and complex affect dimensions but are difficult to predict. Dimension reduction techniques form the core of emotion recognition systems and help create a new feature space that is more helpful in predicting emotions. But these techniques do not necessarily guarantee a better predictive capability as most of them are unsupervised, especially in …

Contributors
Lade, Prasanth, Panchanathan, Sethuraman, Davulcu, Hasan, et al.
Created Date
2015

Crises or large-scale emergencies such as earthquakes and hurricanes cause massive damage to lives and property. Crisis response is an essential task to mitigate the impact of a crisis. An effective response to a crisis necessitates information gathering and analysis. Traditionally, this process has been restricted to the information collected by first responders on the ground in the affected region or by official agencies such as local governments involved in the response. However, the ubiquity of mobile devices has empowered people to publish information during a crisis through social media, such as the damage reports from a hurricane. Social media …

Contributors
Kumar, Shamanth, Liu, Huan, Davulcu, Hasan, et al.
Created Date
2015

The US Senate is the venue of political debates where federal bills are formed and voted on. Senators show their support for or opposition to the bills with their votes. This information makes it possible to extract the polarity of the senators. Similarly, the blogosphere plays an increasingly important role as a forum for public debate. Authors display sentiment toward issues, organizations, or people using natural language. In this research, given a mixed set of senators/blogs debating on a set of political issues from opposing camps, I use signed bipartite graphs for modeling debates, and I propose an algorithm for partitioning both the opinion …

Contributors
Gokalp, Sedat, Davulcu, Hasan, Sen, Arunabha, et al.
Created Date
2015

The purpose of this research is to efficiently analyze the data provided and to see whether a useful trend can be observed as a result. This trend can be used to analyze certain probabilities. Three main pieces of data are analyzed in this research: the value of δ for the call and put options, the %B value of the stock, and the amount of time until expiration of the stock option. The %B value is the most important. The purpose of analyzing the data is to see the relationship between the variables and, given certain values, …

Contributors
Reeves, Michael Thomas, Richa, Andrea, McCarville, Daniel, et al.
Created Date
2015
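
The abstract above does not define the %B value. Assuming it refers to the Bollinger %B indicator, which locates a price relative to its Bollinger Bands, a minimal Python sketch of the calculation might look as follows. The function name, the 20-period window, and the band multiplier of 2 are conventional defaults chosen for this example, not values taken from the thesis.

```python
from statistics import mean, pstdev


def percent_b(prices, window=20, k=2.0):
    """Bollinger %B of the latest price (assumed meaning of "%B").

    The bands are the simple moving average of the last `window` prices
    plus and minus `k` population standard deviations:
        %B = (price - lower band) / (upper band - lower band)
    A result of 0.5 means the price sits on the moving average; values
    above 1 or below 0 mean it is outside the bands.
    """
    if len(prices) < window:
        raise ValueError("need at least `window` prices")
    recent = prices[-window:]
    middle = mean(recent)
    spread = k * pstdev(recent)
    if spread == 0:
        raise ValueError("prices are flat; the bands have zero width")
    lower = middle - spread
    upper = middle + spread
    return (prices[-1] - lower) / (upper - lower)


# Hypothetical closing prices, using a short window for illustration.
closes = [1.0, 2.0, 3.0, 4.0, 5.0]
value = percent_b(closes, window=5)
```

In this example the latest price is above the moving average but inside the upper band, so the result falls between 0.5 and 1.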