Skip to main content

ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


Resource Type
  • Doctoral Dissertation
Date Range
2014 2019


Much evidence has shown that first language (L1) plays an important role in the formation of L2 phonological system during second language (L2) learning process. This combines with the fact that different L1s have distinct phonological patterns to indicate the diverse L2 speech learning outcomes for speakers from different L1 backgrounds. This dissertation hypothesizes that phonological distances between accented speech and speakers' L1 speech are also correlated with perceived accentedness, and the correlations are negative for some phonological properties. Moreover, contrastive phonological distinctions between L1s and L2 will manifest themselves in the accented speech produced by speaker from these L1s. …

Contributors
Tu, Ming, Berisha, Visar, Liss, Julie M, et al.
Created Date
2018

Speech intelligibility measures how much a speaker can be understood by a listener. Traditional measures of intelligibility, such as word accuracy, are not sufficient to reveal the reasons of intelligibility degradation. This dissertation investigates the underlying sources of intelligibility degradations from both perspectives of the speaker and the listener. Segmental phoneme errors and suprasegmental lexical boundary errors are developed to reveal the perceptual strategies of the listener. A comprehensive set of automated acoustic measures are developed to quantify variations in the acoustic signal from three perceptual aspects, including articulation, prosody, and vocal quality. The developed measures have been validated on …

Contributors
Jiao, Yishan, Berisha, Visar, Liss, Julie, et al.
Created Date
2019

Motion estimation is a core task in computer vision and many applications utilize optical flow methods as fundamental tools to analyze motion in images and videos. Optical flow is the apparent motion of objects in image sequences that results from relative motion between the objects and the imaging perspective. Today, optical flow fields are utilized to solve problems in various areas such as object detection and tracking, interpolation, visual odometry, etc. In this dissertation, three problems from different areas of computer vision and the solutions that make use of modified optical flow methods are explained. The contributions of this dissertation …

Contributors
Kanberoglu, Berkay, Frakes, David, Turaga, Pavan, et al.
Created Date
2018

The present study describes audiovisual sentence recognition in normal hearing listeners, bimodal cochlear implant (CI) listeners and bilateral CI listeners. This study explores a new set of sentences (the AzAV sentences) that were created to have equal auditory intelligibility and equal gain from visual information. The aims of Experiment I were to (i) compare the lip reading difficulty of the AzAV sentences to that of other sentence materials, (ii) compare the speech-reading ability of CI listeners to that of normal-hearing listeners and (iii) assess the gain in speech understanding when listeners have both auditory and visual information from easy-to-lip-read and …

Contributors
Wang, Shuai, Dorman, Michael, Berisha, Visar, et al.
Created Date
2015

Audio signals, such as speech and ambient sounds convey rich information pertaining to a user’s activity, mood or intent. Enabling machines to understand this contextual information is necessary to bridge the gap in human-machine interaction. This is challenging due to its subjective nature, hence, requiring sophisticated techniques. This dissertation presents a set of computational methods, that generalize well across different conditions, for speech-based applications involving emotion recognition and keyword detection, and ambient sounds-based applications such as lifelogging. The expression and perception of emotions varies across speakers and cultures, thus, determining features and classification methods that generalize well to different conditions …

Contributors
Shah, Mohit, Spanias, Andreas, Chakrabarti, Chaitali, et al.
Created Date
2015

Modern machine learning systems leverage data and features from multiple modalities to gain more predictive power. In most scenarios, the modalities are vastly different and the acquired data are heterogeneous in nature. Consequently, building highly effective fusion algorithms is at the core to achieve improved model robustness and inferencing performance. This dissertation focuses on the representation learning approaches as the fusion strategy. Specifically, the objective is to learn the shared latent representation which jointly exploit the structural information encoded in all modalities, such that a straightforward learning model can be adopted to obtain the prediction. We first consider sensor fusion, …

Contributors
Song, Huan, Spanias, Andreas, Thiagarajan, Jayaraman, et al.
Created Date
2018

Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however estimating them can be challenge. Most often, parametric assumptions are made about the two distributions to estimate the divergence of interest. In cases where no parametric model fits the data, non-parametric density estimation is used. In statistical signal processing applications, Gaussianity is usually assumed since closed-form expressions for common divergence measures have been derived for this family of distributions. Parametric assumptions are preferred when it is known that the data follows the model, however this is …

Contributors
Wisler, Alan, Berisha, Visar, Spanias, Andreas, et al.
Created Date
2017

Compressed sensing (CS) is a novel approach to collecting and analyzing data of all types. By exploiting prior knowledge of the compressibility of many naturally-occurring signals, specially designed sensors can dramatically undersample the data of interest and still achieve high performance. However, the generated data are pseudorandomly mixed and must be processed before use. In this work, a model of a single-pixel compressive video camera is used to explore the problems of performing inference based on these undersampled measurements. Three broad types of inference from CS measurements are considered: recovery of video frames, target tracking, and object classification/detection. Potential applications …

Contributors
Braun, Henry Carlton, Turaga, Pavan K, Spanias, Andreas S, et al.
Created Date
2016

In many applications, measured sensor data is meaningful only when the location of sensors is accurately known. Therefore, the localization accuracy is crucial. In this dissertation, both location estimation and location detection problems are considered. In location estimation problems, sensor nodes at known locations, called anchors, transmit signals to sensor nodes at unknown locations, called nodes, and use these transmissions to estimate the location of the nodes. Specifically, the location estimation in the presence of fading channels using time of arrival (TOA) measurements with narrowband communication signals is considered. Meanwhile, the Cramer-Rao lower bound (CRLB) for localization error under different …

Contributors
Zhang, Xue, Tepedelenlioglu, Cihan, Spanias, Andreas, et al.
Created Date
2016

This work considers the problem of multiple detection and tracking in two complex time-varying environments, urban terrain and underwater. Tracking multiple radar targets in urban environments is rst investigated by exploiting multipath signal returns, wideband underwater acoustic (UWA) communications channels are estimated using adaptive learning methods, and multiple UWA communications users are detected by designing the transmit signal to match the environment. For the urban environment, a multi-target tracking algorithm is proposed that integrates multipath-to-measurement association and the probability hypothesis density method implemented using particle filtering. The algorithm is designed to track an unknown time-varying number of targets by extracting …

Contributors
Zhou, Meng, Papandreou-Suppappola, Antonia, Tepedelenlioglu, Cihan, et al.
Created Date
2014