Skip to main content

ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


Contributor
Date Range
2010 2019


Despite the fact that machine learning supports the development of computer vision applications by shortening the development cycle, finding a general learning algorithm that solves a wide range of applications is still bounded by the ”no free lunch theorem”. The search for the right algorithm to solve a specific problem is driven by the problem itself, the data availability and many other requirements. Automated visual inspection (AVI) systems represent a major part of these challenging computer vision applications. They are gaining growing interest in the manufacturing industry to detect defective products and keep these from reaching customers. The process of …

Contributors
Haddad, Bashar Muneer, Karam, Lina, Li, Baoxin, et al.
Created Date
2019

Facial Expressions Recognition using the Convolution Neural Network has been actively researched upon in the last decade due to its high number of applications in the human-computer interaction domain. As Convolution Neural Networks have the exceptional ability to learn, they outperform the methods using handcrafted features. Though the state-of-the-art models achieve high accuracy on the lab-controlled images, they still struggle for the wild expressions. Wild expressions are captured in a real-world setting and have natural expressions. Wild databases have many challenges such as occlusion, variations in lighting conditions and head poses. In this work, I address these challenges and propose …

Contributors
Chhabra, Sachin, Li, Baoxin, Venkateswara, Hemanth, et al.
Created Date
2019

In this thesis, a new approach to learning-based planning is presented where critical regions of an environment with low probability measure are learned from a given set of motion plans. Critical regions are learned using convolutional neural networks (CNN) to improve sampling processes for motion planning (MP). In addition to an identification network, a new sampling-based motion planner, Learn and Link, is introduced. This planner leverages critical regions to overcome the limitations of uniform sampling while still maintaining guarantees of correctness inherent to sampling-based algorithms. Learn and Link is evaluated against planners from the Open Motion Planning Library (OMPL) on …

Contributors
Molina, Daniel Antonio, Srivastava, Siddharth, Li, Baoxin, et al.
Created Date
2019

Information forensics and security have come a long way in just a few years thanks to the recent advances in biometric recognition. The main challenge remains a proper design of a biometric modality that can be resilient to unconstrained conditions, such as quality distortions. This work presents a solution to face and ear recognition under unconstrained visual variations, with a main focus on recognition in the presence of blur, occlusion and additive noise distortions. First, the dissertation addresses the problem of scene variations in the presence of blur, occlusion and additive noise distortions resulting from capture, processing and transmission. Despite …

Contributors
Mounsef, Jinane, Karam, Lina, Papandreou-Suppapola, Antonia, et al.
Created Date
2018

Visual processing in social media platforms is a key step in gathering and understanding information in the era of Internet and big data. Online data is rich in content, but its processing faces many challenges including: varying scales for objects of interest, unreliable and/or missing labels, the inadequacy of single modal data and difficulty in analyzing high dimensional data. Towards facilitating the processing and understanding of online data, this dissertation primarily focuses on three challenges that I feel are of great practical importance: handling scale differences in computer vision tasks, such as facial component detection and face retrieval, developing efficient …

Contributors
Zhou, Xu, Li, Baoxin, Hsiao, Sharon, et al.
Created Date
2018

Computer vision technology automatically extracts high level, meaningful information from visual data such as images or videos, and the object recognition and detection algorithms are essential in most computer vision applications. In this dissertation, we focus on developing algorithms used for real life computer vision applications, presenting innovative algorithms for object segmentation and feature extraction for objects and actions recognition in video data, and sparse feature selection algorithms for medical image analysis, as well as automated feature extraction using convolutional neural network for blood cancer grading. To detect and classify objects in video, the objects have to be separated from …

Contributors
Cao, Jun, Li, Baoxin, Liu, Huan, et al.
Created Date
2018

Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to difficulty in training diverse experts and high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training, and incorporate parameter sharing among experts to reduce computational requirements. First, this work presents an application of mixture of experts models for …

Contributors
Dodge, Samuel Fuller, Karam, Lina, Jayasuriya, Suren, et al.
Created Date
2018

Social media refers computer-based technology that allows the sharing of information and building the virtual networks and communities. With the development of internet based services and applications, user can engage with social media via computer and smart mobile devices. In recent years, social media has taken the form of different activities such as social network, business network, text sharing, photo sharing, blogging, etc. With the increasing popularity of social media, it has accumulated a large amount of data which enables understanding the human behavior possible. Compared with traditional survey based methods, the analysis of social media provides us a golden …

Contributors
Wang, Yilin, Li, Baoxin, Liu, Huan, et al.
Created Date
2018

Image Understanding is a long-established discipline in computer vision, which encompasses a body of advanced image processing techniques, that are used to locate (“where”), characterize and recognize (“what”) objects, regions, and their attributes in the image. However, the notion of “understanding” (and the goal of artificial intelligent machines) goes beyond factual recall of the recognized components and includes reasoning and thinking beyond what can be seen (or perceived). Understanding is often evaluated by asking questions of increasing difficulty. Thus, the expected functionalities of an intelligent Image Understanding system can be expressed in terms of the functionalities that are required to …

Contributors
Aditya, Somak, Baral, Chitta, Yang, Yezhou, et al.
Created Date
2018

With the emergence of edge computing paradigm, many applications such as image recognition and augmented reality require to perform machine learning (ML) and artificial intelligence (AI) tasks on edge devices. Most AI and ML models are large and computational heavy, whereas edge devices are usually equipped with limited computational and storage resources. Such models can be compressed and reduced in order to be placed on edge devices, but they may loose their capability and may not generalize and perform well compared to large models. Recent works used knowledge transfer techniques to transfer information from a large network (termed teacher) to …

Contributors
Sistla, Ragini, Zhao, Ming, Zhao, Ming, et al.
Created Date
2018

Digital imaging and image processing technologies have revolutionized the way in which we capture, store, receive, view, utilize, and share images. In image-based applications, through different processing stages (e.g., acquisition, compression, and transmission), images are subjected to different types of distortions which degrade their visual quality. Image Quality Assessment (IQA) attempts to use computational models to automatically evaluate and estimate the image quality in accordance with subjective evaluations. Moreover, with the fast development of computer vision techniques, it is important in practice to extract and understand the information contained in blurred images or regions. The work in this dissertation focuses …

Contributors
Golestaneh, Seyedalireza, Karam, Lina, Bliss, Daniel W., et al.
Created Date
2018

Social Computing is an area of computer science concerned with dynamics of communities and cultures, created through computer-mediated social interaction. Various social media platforms, such as social network services and microblogging, enable users to come together and create social movements expressing their opinions on diverse sets of issues, events, complaints, grievances, and goals. Methods for monitoring and summarizing these types of sociopolitical trends, its leaders and followers, messages, and dynamics are needed. In this dissertation, a framework comprising of community and content-based computational methods is presented to provide insights for multilingual and noisy political social media content. First, a model …

Contributors
Alzahrani, Sultan, Davulcu, Hasan, Corman, Steve R., et al.
Created Date
2018

Deep learning architectures have been widely explored in computer vision and have depicted commendable performance in a variety of applications. A fundamental challenge in training deep networks is the requirement of large amounts of labeled training data. While gathering large quantities of unlabeled data is cheap and easy, annotating the data is an expensive process in terms of time, labor and human expertise. Thus, developing algorithms that minimize the human effort in training deep models is of immense practical importance. Active learning algorithms automatically identify salient and exemplar samples from large amounts of unlabeled data and can augment maximal information …

Contributors
Ranganathan, Hiranmayi, Sethuraman, Panchanathan, Papandreou-Suppappola, Antonia, et al.
Created Date
2018

The performance of most of the visual computing tasks depends on the quality of the features extracted from the raw data. Insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle anomalies in the data such as missing samples and noisy input caused by the undesired, external factors of variation. It should also reduce the data redundancy. Over the years, many feature extraction processes have been invented to produce good representations of raw images and videos. The feature extraction processes can …

Contributors
Chandakkar, Parag Shridhar, Li, Baoxin, Yang, Yezhou, et al.
Created Date
2017

Computer Vision as a eld has gone through signicant changes in the last decade. The eld has seen tremendous success in designing learning systems with hand-crafted features and in using representation learning to extract better features. In this dissertation some novel approaches to representation learning and task learning are studied. Multiple-instance learning which is generalization of supervised learning, is one example of task learning that is discussed. In particular, a novel non-parametric k- NN-based multiple-instance learning is proposed, which is shown to outperform other existing approaches. This solution is applied to a diabetic retinopathy pathology detection problem eectively. In cases …

Contributors
Venkatesan, Ragav, Li, Baoxin, Turaga, Pavan, et al.
Created Date
2017

Compressive sensing theory allows to sense and reconstruct signals/images with lower sampling rate than Nyquist rate. Applications in resource constrained environment stand to benefit from this theory, opening up many possibilities for new applications at the same time. The traditional inference pipeline for computer vision sequence reconstructing the image from compressive measurements. However,the reconstruction process is a computationally expensive step that also provides poor results at high compression rate. There have been several successful attempts to perform inference tasks directly on compressive measurements such as activity recognition. In this thesis, I am interested to tackle a more challenging vision problem …

Contributors
Huang, Li-chi, Turaga, Pavan, Yang, Yezhou, et al.
Created Date
2017

Bangladesh is a secular democracy with almost 90% of its population constituting of Muslims and the rest 10% constituting of the minority groups that includes Hindus, Christians, Buddhists, Ahmadi Muslims, Shia, Sufi, LGBT groups and Atheists. In recent years, Bangladesh has experienced an increase in attacks by religious extremist groups, such as IS and AQIS affiliates, hate-groups and politically motivated violence. Attacks have also become indiscriminate, with assailants targeting a wide variety of individuals, including religious minorities and foreigners. According to the telecoms regulator, the number of internet users in Bangladesh now stands at over 66.8 million reaching 41% penetration. …

Contributors
Chhabra, Pankaj, Davulcu, Hasan, Li, Baoxin, et al.
Created Date
2017

Visual Question Answering (VQA) is a new research area involving technologies ranging from computer vision, natural language processing, to other sub-fields of artificial intelligence such as knowledge representation. The fundamental task is to take as input one image and one question (in text) related to the given image, and to generate a textual answer to the input question. There are two key research problems in VQA: image understanding and the question answering. My research mainly focuses on developing solutions to support solving these two problems. In image understanding, one important research area is semantic segmentation, which takes images as input …

Contributors
Tian, Qiongjie, Li, Baoxin, Tong, Hanghang, et al.
Created Date
2017

Improving accessibility to public buildings by people with special needs has been an important societal commitment that is mandated by federal laws. In the information age, accessibility can mean more than simply providing physical accommodations like ramps for wheel-chairs. Better yet, accessibility will be fundamentally improved, if a user can be made aware of important location-specific information like functions of offices near the user within a building. A smart environment may help a new person quickly get acquainted about the environment. Such features can be more critical for cases of making an indoor environment more accessible to people with visual …

Contributors
Lagisetty, Jashmi, Li, Baoxin, Hedgpeth, Terri, et al.
Created Date
2017

Light field imaging is limited in its computational processing demands of high sampling for both spatial and angular dimensions. Single-shot light field cameras sacrifice spatial resolution to sample angular viewpoints, typically by multiplexing incoming rays onto a 2D sensor array. While this resolution can be recovered using compressive sensing, these iterative solutions are slow in processing a light field. We present a deep learning approach using a new, two branch network architecture, consisting jointly of an autoencoder and a 4D CNN, to recover a high resolution 4D light field from a single coded 2D image. This network decreases reconstruction time …

Contributors
Gupta, Mayank, Turaga, Pavan, Yang, Yezhou, et al.
Created Date
2017