Skip to main content

ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


Contributor
Date Range
2011 2019


Despite the fact that machine learning supports the development of computer vision applications by shortening the development cycle, finding a general learning algorithm that solves a wide range of applications is still bounded by the ”no free lunch theorem”. The search for the right algorithm to solve a specific problem is driven by the problem itself, the data availability and many other requirements. Automated visual inspection (AVI) systems represent a major part of these challenging computer vision applications. They are gaining growing interest in the manufacturing industry to detect defective products and keep these from reaching customers. The process of …

Contributors
Haddad, Bashar Muneer, Karam, Lina, Li, Baoxin, et al.
Created Date
2019

Non-line-of-sight (NLOS) imaging of objects not visible to either the camera or illumina- tion source is a challenging task with vital applications including surveillance and robotics. Recent NLOS reconstruction advances have been achieved using time-resolved measure- ments. Acquiring these time-resolved measurements requires expensive and specialized detectors and laser sources. In work proposes a data-driven approach for NLOS 3D local- ization requiring only a conventional camera and projector. The localisation is performed using a voxelisation and a regression problem. Accuracy of greater than 90% is achieved in localizing a NLOS object to a 5cm × 5cm × 5cm volume in real …

Contributors
Chandran, Sreenithy, Jayasuriya, Suren, Turaga, Pavan, et al.
Created Date
2019

Speech is generated by articulators acting on a phonatory source. Identification of this phonatory source and articulatory geometry are individually challenging and ill-posed problems, called speech separation and articulatory inversion, respectively. There exists a trade-off between decomposition and recovered articulatory geometry due to multiple possible mappings between an articulatory configuration and the speech produced. However, if measurements are obtained only from a microphone sensor, they lack any invasive insight and add additional challenge to an already difficult problem. A joint non-invasive estimation strategy that couples articulatory and phonatory knowledge would lead to better articulatory speech synthesis. In this thesis, a …

Contributors
Venkataramani, Adarsh Akkshai, Papandreou-Suppappola, Antonia, Bliss, Daniel W, et al.
Created Date
2018

Motion estimation is a core task in computer vision and many applications utilize optical flow methods as fundamental tools to analyze motion in images and videos. Optical flow is the apparent motion of objects in image sequences that results from relative motion between the objects and the imaging perspective. Today, optical flow fields are utilized to solve problems in various areas such as object detection and tracking, interpolation, visual odometry, etc. In this dissertation, three problems from different areas of computer vision and the solutions that make use of modified optical flow methods are explained. The contributions of this dissertation …

Contributors
Kanberoglu, Berkay, Frakes, David, Turaga, Pavan, et al.
Created Date
2018

Human movement is a complex process influenced by physiological and psychological factors. The execution of movement is varied from person to person, and the number of possible strategies for completing a specific movement task is almost infinite. Different choices of strategies can be perceived by humans as having different degrees of quality, and the quality can be defined with regard to aesthetic, athletic, or health-related ratings. It is useful to measure and track the quality of a person's movements, for various applications, especially with the prevalence of low-cost and portable cameras and sensors today. Furthermore, based on such measurements, feedback …

Contributors
Wang, Qiao, Turaga, Pavan, Spanias, Andreas, et al.
Created Date
2018

Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to difficulty in training diverse experts and high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training, and incorporate parameter sharing among experts to reduce computational requirements. First, this work presents an application of mixture of experts models for …

Contributors
Dodge, Samuel Fuller, Karam, Lina, Jayasuriya, Suren, et al.
Created Date
2018

Generating real-world content for VR is challenging in terms of capturing and processing at high resolution and high frame-rates. The content needs to represent a truly immersive experience, where the user can look around in 360-degree view and perceive the depth of the scene. The existing solutions only capture and offload the compute load to the server. But offloading large amounts of raw camera feeds takes longer latencies and poses difficulties for real-time applications. By capturing and computing on the edge, we can closely integrate the systems and optimize for low latency. However, moving the traditional stitching algorithms to battery …

Contributors
Gunnam, Sridhar, LiKamWa, Robert, Turaga, Pavan, et al.
Created Date
2018

The performance of most of the visual computing tasks depends on the quality of the features extracted from the raw data. Insightful feature representation increases the performance of many learning algorithms by exposing the underlying explanatory factors of the output for the unobserved input. A good representation should also handle anomalies in the data such as missing samples and noisy input caused by the undesired, external factors of variation. It should also reduce the data redundancy. Over the years, many feature extraction processes have been invented to produce good representations of raw images and videos. The feature extraction processes can …

Contributors
Chandakkar, Parag Shridhar, Li, Baoxin, Yang, Yezhou, et al.
Created Date
2017

Computer Vision as a eld has gone through signicant changes in the last decade. The eld has seen tremendous success in designing learning systems with hand-crafted features and in using representation learning to extract better features. In this dissertation some novel approaches to representation learning and task learning are studied. Multiple-instance learning which is generalization of supervised learning, is one example of task learning that is discussed. In particular, a novel non-parametric k- NN-based multiple-instance learning is proposed, which is shown to outperform other existing approaches. This solution is applied to a diabetic retinopathy pathology detection problem eectively. In cases …

Contributors
Venkatesan, Ragav, Li, Baoxin, Turaga, Pavan, et al.
Created Date
2017

Compressive sensing theory allows to sense and reconstruct signals/images with lower sampling rate than Nyquist rate. Applications in resource constrained environment stand to benefit from this theory, opening up many possibilities for new applications at the same time. The traditional inference pipeline for computer vision sequence reconstructing the image from compressive measurements. However,the reconstruction process is a computationally expensive step that also provides poor results at high compression rate. There have been several successful attempts to perform inference tasks directly on compressive measurements such as activity recognition. In this thesis, I am interested to tackle a more challenging vision problem …

Contributors
Huang, Li-chi, Turaga, Pavan, Yang, Yezhou, et al.
Created Date
2017

Light field imaging is limited in its computational processing demands of high sampling for both spatial and angular dimensions. Single-shot light field cameras sacrifice spatial resolution to sample angular viewpoints, typically by multiplexing incoming rays onto a 2D sensor array. While this resolution can be recovered using compressive sensing, these iterative solutions are slow in processing a light field. We present a deep learning approach using a new, two branch network architecture, consisting jointly of an autoencoder and a 4D CNN, to recover a high resolution 4D light field from a single coded 2D image. This network decreases reconstruction time …

Contributors
Gupta, Mayank, Turaga, Pavan, Yang, Yezhou, et al.
Created Date
2017

In UAVs and parking lots, it is typical to first collect an enormous number of pixels using conventional imagers. This is followed by employment of expensive methods to compress by throwing away redundant data. Subsequently, the compressed data is transmitted to a ground station. The past decade has seen the emergence of novel imagers called spatial-multiplexing cameras, which offer compression at the sensing level itself by providing an arbitrary linear measurements of the scene instead of pixel-based sampling. In this dissertation, I discuss various approaches for effective information extraction from spatial-multiplexing measurements and present the trade-offs between reliability of the …

Contributors
Kulkarni, Kuldeep Sharad, Turaga, Pavan, Li, Baoxin, et al.
Created Date
2017

When dancers are granted agency over music, as in interactive dance systems, the actors are most often concerned with the problem of creating a staged performance for an audience. However, as is reflected by the above quote, the practice of Argentine tango social dance is most concerned with participants internal experience and their relationship to the broader tango community. In this dissertation I explore creative approaches to enrich the sense of connection, that is, the experience of oneness with a partner and complete immersion in music and dance for Argentine tango dancers by providing agency over musical activities through the …

Contributors
Brown, Courtney Douglass, Paine, Garth, Feisst, Sabine, et al.
Created Date
2017

Computational thinking, the fundamental way of thinking in computer science, including information sourcing and problem solving behind programming, is considered vital to children who live in a digital era. Most of current educational games designed to teach children about coding either rely on external curricular materials or are too complicated to work well with young children. In this thesis project, Guardy, an iOS tower defense game, was developed to help children over 8 years old learn about and practice using basic concepts in programming. The game is built with the SpriteKit, a graphics rendering and animation infrastructure in Apple’s integrated …

Contributors
Wang, Xiaoxiao, Nelson, Brian C., Turaga, Pavan, et al.
Created Date
2017

High-level inference tasks in video applications such as recognition, video retrieval, and zero-shot classification have become an active research area in recent years. One fundamental requirement for such applications is to extract high-quality features that maintain high-level information in the videos. Many video feature extraction algorithms have been purposed, such as STIP, HOG3D, and Dense Trajectories. These algorithms are often referred to as “handcrafted” features as they were deliberately designed based on some reasonable considerations. However, these algorithms may fail when dealing with high-level tasks or complex scene videos. Due to the success of using deep convolution neural networks (CNNs) …

Contributors
Hu, Sheng-Hung, Li, Baoxin, Turaga, Pavan, et al.
Created Date
2016

Multi-sensor fusion is a fundamental problem in Robot Perception. For a robot to operate in a real world environment, multiple sensors are often needed. Thus, fusing data from various sensors accurately is vital for robot perception. In the first part of this thesis, the problem of fusing information from a LIDAR, a color camera and a thermal camera to build RGB-Depth-Thermal (RGBDT) maps is investigated. An algorithm that solves a non-linear optimization problem to compute the relative pose between the cameras and the LIDAR is presented. The relative pose estimate is then used to find the color and thermal texture …

Contributors
Krishnan, Aravindhan K., Saripalli, Srikanth, Klesh, Andrew, et al.
Created Date
2016

The tradition of building musical robots and automata is thousands of years old. Despite this rich history, even today musical robots do not play with as much nuance and subtlety as human musicians. In particular, most instruments allow the player to manipulate timbre while playing; if a violinist is told to sustain an E, they will select which string to play it on, how much bow pressure and velocity to use, whether to use the entire bow or only the portion near the tip or the frog, how close to the bridge or fingerboard to contact the string, whether or …

Contributors
Krzyzaniak, Michael Joseph, Coleman, Grisha, Turaga, Pavan, et al.
Created Date
2016

The human motion is defined as an amalgamation of several physical traits such as bipedal locomotion, posture and manual dexterity, and mental expectation. In addition to the “positive” body form defined by these traits, casting light on the body produces a “negative” of the body: its shadow. We often interchangeably use with silhouettes in the place of shadow to emphasize indifference to interior features. In a manner of speaking, the shadow is an alter ego that imitates the individual. The principal value of shadow is its non-invasive behaviour of reflecting precisely the actions of the individual it is attached to. …

Contributors
Seshasayee, Sudarshan Prashanth, Sha, Xin Wei, Turaga, Pavan, et al.
Created Date
2016

Several music players have evolved in multi-dimensional and surround sound systems. The audio players are implemented as software applications for different audio hardware systems. Digital formats and wireless networks allow for audio content to be readily accessible on smart networked devices. Therefore, different audio output platforms ranging from multispeaker high-end surround systems to single unit Bluetooth speakers have been developed. A large body of research has been carried out in audio processing, beamforming, sound fields etc. and new formats are developed to create realistic audio experiences. An emerging trend is seen towards high definition AV systems, virtual reality gears as …

Contributors
Dharmadhikari, Chinmay Nrusinha, Spanias, Andreas, Turaga, Pavan, et al.
Created Date
2016

This thesis aims to explore the language of different bodies in the field of dance by analyzing the habitual patterns of dancers from different backgrounds and vernaculars. Contextually, the term habitual patterns is defined as the postures or poses that tend to re-appear, often unintentionally, as the dancer performs improvisational dance. The focus lies in exposing the movement vocabulary of a dancer to reveal his/her unique fingerprint. The proposed approach for uncovering these movement patterns is to use a clustering technique; mainly k-means. In addition to a static method of analysis, this paper uses an online method of clustering using …

Contributors
Iyengar, Varsha, Xin Wei, Sha, Turaga, Pavan, et al.
Created Date
2016