
ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, and supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.




Much evidence has shown that the first language (L1) plays an important role in the formation of the L2 phonological system during the second language (L2) learning process. Combined with the fact that different L1s have distinct phonological patterns, this points to diverse L2 speech learning outcomes for speakers from different L1 backgrounds. This dissertation hypothesizes that phonological distances between accented speech and speakers' L1 speech are also correlated with perceived accentedness, and the correlations are negative for some phonological properties. Moreover, contrastive phonological distinctions between L1s and L2 will manifest themselves in the accented speech produced by speakers from these L1s. …

Contributors
Tu, Ming, Berisha, Visar, Liss, Julie M, et al.
Created Date
2018

Speech intelligibility measures how well a speaker can be understood by a listener. Traditional measures of intelligibility, such as word accuracy, are not sufficient to reveal the reasons for intelligibility degradation. This dissertation investigates the underlying sources of intelligibility degradation from the perspectives of both the speaker and the listener. Segmental phoneme errors and suprasegmental lexical boundary errors are developed to reveal the perceptual strategies of the listener. A comprehensive set of automated acoustic measures is developed to quantify variations in the acoustic signal from three perceptual aspects: articulation, prosody, and vocal quality. The developed measures have been validated on …

Contributors
Jiao, Yishan, Berisha, Visar, Liss, Julie, et al.
Created Date
2019

Dementia is a syndrome resulting from an acquired brain disease that impairs many domains of cognition. The progressive disorder generally affects memory, attention, executive functions, communication, and other cognitive domains that significantly alter everyday function (Quinn, 2014). The purpose of this research was to conduct a systematic review of cognitive-communication assessments and screeners used in assessing dementia to assist in early prognosis. From this review, there is potential in developing a new test to address the areas in which people with dementia often have deficits: 1) Memory, 2) Attention, 3) Executive Functions, 4) Language, and 5) Visuospatial Skills. In …

Contributors
Miller, Marissa, Liss, Julie M, Berisha, Visar, et al.
Created Date
2019

Motion estimation is a core task in computer vision and many applications utilize optical flow methods as fundamental tools to analyze motion in images and videos. Optical flow is the apparent motion of objects in image sequences that results from relative motion between the objects and the imaging perspective. Today, optical flow fields are utilized to solve problems in various areas such as object detection and tracking, interpolation, visual odometry, etc. In this dissertation, three problems from different areas of computer vision and the solutions that make use of modified optical flow methods are explained. The contributions of this dissertation …

Contributors
Kanberoglu, Berkay, Frakes, David, Turaga, Pavan, et al.
Created Date
2018

The processing power and storage capacity of portable devices have improved considerably over the past decade. This has motivated the implementation of sophisticated audio and other signal processing algorithms on such mobile devices. Of particular interest in this thesis is audio/speech processing based on perceptual criteria. Specifically, estimation of parameters from human auditory models, such as auditory patterns and loudness, involves computationally intensive operations which can strain device resources. Hence, strategies for implementing computationally efficient human auditory models for loudness estimation have been studied in this thesis. Existing algorithms for reducing computations in auditory pattern and loudness estimation have been …

Contributors
Kalyanasundaram, Girish, Spanias, Andreas S, Tepedelenlioglu, Cihan, et al.
Created Date
2013

Everyday speech communication typically takes place face-to-face. Accordingly, the task of perceiving speech is a multisensory phenomenon involving both auditory and visual information. The current investigation examines how visual information influences recognition of dysarthric speech. It also explores whether the influence of visual information is dependent upon age. Forty adults participated in the study that measured intelligibility (percent words correct) of dysarthric speech in auditory versus audiovisual conditions. Participants were then separated into two groups: older adults (age range 47 to 68) and young adults (age range 19 to 36) to examine the influence of age. Findings revealed that all …

Contributors
Fall, Elizabeth, Liss, Julie, Berisha, Visar, et al.
Created Date
2014

The present study describes audiovisual sentence recognition in normal hearing listeners, bimodal cochlear implant (CI) listeners and bilateral CI listeners. This study explores a new set of sentences (the AzAV sentences) that were created to have equal auditory intelligibility and equal gain from visual information. The aims of Experiment I were to (i) compare the lip reading difficulty of the AzAV sentences to that of other sentence materials, (ii) compare the speech-reading ability of CI listeners to that of normal-hearing listeners and (iii) assess the gain in speech understanding when listeners have both auditory and visual information from easy-to-lip-read and …

Contributors
Wang, Shuai, Dorman, Michael, Berisha, Visar, et al.
Created Date
2015

Many neurological disorders, especially those that result in dementia, impact speech and language production. A number of studies have shown that there exist subtle changes in linguistic complexity in these individuals that precede disease onset. However, these studies are conducted on controlled speech samples from a specific task. This thesis explores the possibility of using natural language processing in order to detect declining linguistic complexity from more natural discourse. We use existing data from public figures suspected (or at risk) of suffering from cognitive-linguistic decline, downloaded from the Internet, to detect changes in linguistic complexity. In particular, we focus on …

Contributors
Wang, Shuai, Berisha, Visar, LaCross, Amy, et al.
Created Date
2016

Audio signals, such as speech and ambient sounds, convey rich information pertaining to a user’s activity, mood, or intent. Enabling machines to understand this contextual information is necessary to bridge the gap in human-machine interaction. This is challenging because the information is subjective in nature and therefore requires sophisticated techniques. This dissertation presents a set of computational methods that generalize well across different conditions, for speech-based applications involving emotion recognition and keyword detection, and ambient sounds-based applications such as lifelogging. The expression and perception of emotions vary across speakers and cultures; thus, determining features and classification methods that generalize well to different conditions …

Contributors
Shah, Mohit, Spanias, Andreas, Chakrabarti, Chaitali, et al.
Created Date
2015

Modern machine learning systems leverage data and features from multiple modalities to gain more predictive power. In most scenarios, the modalities are vastly different and the acquired data are heterogeneous in nature. Consequently, building highly effective fusion algorithms is central to achieving improved model robustness and inference performance. This dissertation focuses on representation learning approaches as the fusion strategy. Specifically, the objective is to learn a shared latent representation that jointly exploits the structural information encoded in all modalities, such that a straightforward learning model can be adopted to obtain the prediction. We first consider sensor fusion, …

Contributors
Song, Huan, Spanias, Andreas, Thiagarajan, Jayaraman, et al.
Created Date
2018

The purpose of this study was to identify acoustic markers that correlate with accurate and inaccurate /r/ production in children ages 5-8 using signal processing. In addition, the researcher aimed to identify predictive acoustic markers that relate to changes in /r/ accuracy. A total of 35 children (23 accurate, 12 inaccurate, 8 longitudinal) were recorded. Computerized stimuli were presented on a PC laptop computer and the children were asked to do five tasks to elicit spontaneous and imitated /r/ production in all positions. Files were edited and analyzed using a filter bank approach centered at 40 frequencies based on the …

Contributors
Becvar, Brittany Patricia, Azuma, Tamiko, Weinhold, Juliet, et al.
Created Date
2017

Through decades of clinical progress, cochlear implants have brought the world of speech and language to thousands of profoundly deaf patients. However, the technology has many possible areas for improvement, including providing information about non-linguistic cues, also called indexical properties of speech. The field of sensory substitution, providing information relating one sense to another, offers a potential avenue to further assist those with cochlear implants, in addition to the promise it holds for those without existing aids. A user study with a vibrotactile device is evaluated to exhibit the effectiveness of this approach in an auditory gender discrimination task. Additionally, …

Contributors
Butts, Austin McRae, Helms Tillery, Stephen, Berisha, Visar, et al.
Created Date
2015

The ability to identify unoccupied resources in the radio spectrum is a key capability for opportunistic users in a cognitive radio environment. This paper draws upon and extends geometrically based ideas in statistical signal processing to develop estimators for the rank and the occupied subspace in a multi-user environment from multiple temporal samples of the signal received at a single antenna. These estimators enable identification of resources, such as the orthogonal complement of the occupied subspace, that may be exploitable by an opportunistic user. This concept is supported by simulations showing the estimation of the number of users in a …

Contributors
Beaudet, Kaitlyn, Cochran, Douglas, Turaga, Pavan, et al.
Created Date
2014

Glottal fry is a vocal register characterized by low frequency and increased signal perturbation, and is perceptually identified by its popping, creaky quality. Recently, the use of the glottal fry vocal register has received growing awareness and attention in popular culture and media in the United States. The creaky quality that was originally associated with vocal pathologies is indeed becoming “trendy,” particularly among young women across the United States. But while existing studies have defined, quantified, and attempted to explain the use of glottal fry in conversational speech, there is currently no explanation for the increasing prevalence of the use …

Contributors
Delfino, Christine R., Liss, Julie M, Borrie, Stephanie A, et al.
Created Date
2015

Information divergence functions, such as the Kullback-Leibler divergence or the Hellinger distance, play a critical role in statistical signal processing and information theory; however, estimating them can be a challenge. Most often, parametric assumptions are made about the two distributions to estimate the divergence of interest. In cases where no parametric model fits the data, non-parametric density estimation is used. In statistical signal processing applications, Gaussianity is usually assumed since closed-form expressions for common divergence measures have been derived for this family of distributions. Parametric assumptions are preferred when it is known that the data follows the model; however, this is …

Contributors
Wisler, Alan, Berisha, Visar, Spanias, Andreas, et al.
Created Date
2017
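
As an illustration of the closed-form Gaussian expressions mentioned in the abstract above, here is a minimal sketch of the standard textbook formulas for the Kullback-Leibler divergence and the Hellinger distance between two univariate Gaussians. It is not taken from the dissertation; the function names `kl_gaussian` and `hellinger_gaussian` are hypothetical and chosen only for this example.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2)."""
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    # Standard closed form: log ratio of scales plus a variance/mean term.
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


def hellinger_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """Hellinger distance (a value in [0, 1]) between two univariate Gaussians."""
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    var_sum = sigma_p ** 2 + sigma_q ** 2
    # Bhattacharyya coefficient of the two densities.
    bc = math.sqrt(2 * sigma_p * sigma_q / var_sum) * math.exp(
        -((mu_p - mu_q) ** 2) / (4 * var_sum)
    )
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - bc))
```

Both functions return 0 when the two distributions are identical. The KL divergence is asymmetric in its arguments, while the Hellinger distance is symmetric.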

Head movement is known to improve the accuracy of sound localization in humans and animals. The marmoset is a small-bodied New World monkey species that has become an emerging model for studying auditory functions. This thesis aims to detect the horizontal and vertical rotation of head movement in marmoset monkeys. Experiments were conducted in a sound-attenuated acoustic chamber. Head movement of marmoset monkeys was studied under various auditory and visual stimulation conditions. With increasing complexity, these conditions are (1) idle, (2) sound-alone, (3) sound and visual signals, and (4) alert signal by opening and …

Contributors
Simhadri, Sravanthi, Zhou, Yi, Turaga, Pavan, et al.
Created Date
2014

Compressed sensing (CS) is a novel approach to collecting and analyzing data of all types. By exploiting prior knowledge of the compressibility of many naturally-occurring signals, specially designed sensors can dramatically undersample the data of interest and still achieve high performance. However, the generated data are pseudorandomly mixed and must be processed before use. In this work, a model of a single-pixel compressive video camera is used to explore the problems of performing inference based on these undersampled measurements. Three broad types of inference from CS measurements are considered: recovery of video frames, target tracking, and object classification/detection. Potential applications …

Contributors
Braun, Henry Carlton, Turaga, Pavan K, Spanias, Andreas S, et al.
Created Date
2016

A human communications research project at Arizona State University aurally recorded the daily interactions of aware and consenting employees and their visiting clients at the Software Factory, a software engineering consulting team, over a three-year period. The resulting dataset contains valuable insights on the communication networks that the participants formed; however, it is far too vast to be processed manually by researchers. In this work, digital signal processing techniques are employed to develop a software toolkit that can aid in estimating the observable networks contained in the Software Factory recordings. A four-step process is employed that starts with parsing …

Contributors
Pressler, Daniel, Bliss, Daniel W, Berisha, Visar, et al.
Created Date
2018

Deep neural networks (DNN) have shown tremendous success in various cognitive tasks, such as image classification, speech recognition, etc. However, their usage on resource-constrained edge devices has been limited due to high computation and large memory requirement. To overcome these challenges, recent works have extensively investigated model compression techniques such as element-wise sparsity, structured sparsity and quantization. While most of these works have applied these compression techniques in isolation, there have been very few studies on application of quantization and structured sparsity together on a DNN model. This thesis co-optimizes structured sparsity and quantization constraints on DNN models during training. …

Contributors
Srivastava, Gaurav, Seo, Jae-Sun, Chakrabarti, Chaitali, et al.
Created Date
2018

The problem of cooperative radar and communications signaling is investigated. Each system typically considers the other system a source of interference. Consequently, the tradition is to have them operate in orthogonal frequency bands. By considering the radar and communications operations to be a single joint system, performance bounds on a receiver that observes communications and radar return in the same frequency allocation are derived. Performance of the joint system is measured in terms of data information rate for communications and radar estimation information rate for the radar. Inner bounds on performance are constructed.

Contributors
Chiriyath, Alex Rajan, Bliss, Daniel W, Kosut, Oliver, et al.
Created Date
2014

This dissertation is focused on developing an algorithm to provide current state estimation and future state predictions for biomechanical human walking features. The goal is to develop a system which is capable of evaluating the current action a subject is taking while walking and then use this to predict the future states of biomechanical features. This work focuses on the exploration and analysis of Interaction Primitives (Amor et al., 2014) and their relevance to biomechanical prediction for human walking. Built on the framework of Probabilistic Movement Primitives, Interaction Primitives utilize an EKF SLAM algorithm to localize and map a distribution …

Contributors
Clark, Geoffrey Mitchell, Ben Amor, Heni, Si, Jennie, et al.
Created Date
2018

In many applications, measured sensor data is meaningful only when the location of sensors is accurately known. Therefore, the localization accuracy is crucial. In this dissertation, both location estimation and location detection problems are considered. In location estimation problems, sensor nodes at known locations, called anchors, transmit signals to sensor nodes at unknown locations, called nodes, and use these transmissions to estimate the location of the nodes. Specifically, the location estimation in the presence of fading channels using time of arrival (TOA) measurements with narrowband communication signals is considered. Meanwhile, the Cramer-Rao lower bound (CRLB) for localization error under different …

Contributors
Zhang, Xue, Tepedelenlioglu, Cihan, Spanias, Andreas, et al.
Created Date
2016
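
To make the anchor-based setup described in the abstract above concrete, here is a minimal sketch of time-of-arrival (TOA) localization in two dimensions. It assumes ideal, noise-free measurements from exactly three anchors and ignores the fading channels and narrowband effects that the dissertation studies. The function name `locate_from_toa` and its parameters are hypothetical, introduced only for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # propagation speed in m/s


def locate_from_toa(anchors, toas, speed=SPEED_OF_LIGHT):
    """Estimate a 2-D node position from TOAs measured against three anchors.

    anchors: three (x, y) anchor positions in metres
    toas:    three one-way propagation times in seconds
    """
    if len(anchors) != 3 or len(toas) != 3:
        raise ValueError("this sketch expects exactly three anchors and three TOAs")
    # Convert each time of arrival to a range (distance) estimate.
    d1, d2, d3 = (speed * t for t in toas)
    (x1, y1), (x2, y2), (x3, y3) = anchors
    # Subtracting the first range equation from the other two cancels the
    # quadratic terms and leaves a 2x2 linear system A [x, y]^T = b.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1 ** 2 - d2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = d1 ** 2 - d3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; the position is not unique")
    # Solve the 2x2 system with Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With noisy measurements or more than three anchors, a least-squares or maximum-likelihood estimator would typically replace the exact solve used here.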

This work considers the problem of multiple detection and tracking in two complex time-varying environments, urban terrain and underwater. Tracking multiple radar targets in urban environments is first investigated by exploiting multipath signal returns, wideband underwater acoustic (UWA) communications channels are estimated using adaptive learning methods, and multiple UWA communications users are detected by designing the transmit signal to match the environment. For the urban environment, a multi-target tracking algorithm is proposed that integrates multipath-to-measurement association and the probability hypothesis density method implemented using particle filtering. The algorithm is designed to track an unknown time-varying number of targets by extracting …

Contributors
Zhou, Meng, Papandreou-Suppappola, Antonia, Tepedelenlioglu, Cihan, et al.
Created Date
2014

Our ability to understand networks is important to many applications, from the analysis and modeling of biological networks to analyzing social networks. Unveiling network dynamics allows us to make predictions and decisions. Moreover, network dynamics models have inspired new ideas for computational methods involving multi-agent cooperation, offering effective solutions for optimization tasks. This dissertation presents new theoretical results on network inference and multi-agent optimization, split into two parts. The first part deals with modeling and identification of network dynamics. I study two types of network dynamics arising from social and gene networks. Based on the network dynamics, the proposed …

Contributors
Wai, Hoi To, Scaglione, Anna, Berisha, Visar, et al.
Created Date
2017

With advances in automatic speech recognition, spoken dialogue systems are assuming increasingly social roles. There is a growing need for these systems to be socially responsive, capable of building rapport with users. In human-human interactions, rapport is critical to patient-doctor communication, conflict resolution, educational interactions, and social engagement. Rapport between people promotes successful collaboration, motivation, and task success. Dialogue systems which can build rapport with their user may produce similar effects, personalizing interactions to create better outcomes. This dissertation focuses on how dialogue systems can build rapport utilizing acoustic-prosodic entrainment. Acoustic-prosodic entrainment occurs when individuals adapt their acoustic-prosodic features of …

Contributors
Lubold, Nichola Anne, Walker, Erin, Pon-Barry, Heather, et al.
Created Date
2018

As the demand for spectrum sharing between radar and communications systems is steadily increasing, the coexistence between the two systems is a growing and very challenging problem. Radar tracking in the presence of strong communications interference can result in low probability of detection even when sequential Monte Carlo tracking methods such as the particle filter (PF) are used that better match the target kinematic model. In particular, the tracking performance can fluctuate as the power level of the communications interference can vary dynamically and unpredictably. This work proposes to integrate the interacting multiple model (IMM) selection approach with the PF …

Contributors
Zhou, Jian, Papandreou-Suppappola, Antonia, Kovvali, Narayan, et al.
Created Date
2015

RF convergence of radar and communications users is rapidly becoming an issue for a multitude of stakeholders. To hedge against growing spectral congestion, research into cooperative radar and communications systems has been identified as a critical necessity for the United States and other countries. Further, the joint sensing-communicating paradigm appears imminent in several technological domains. In the pursuit of co-designing radar and communications systems that work cooperatively and benefit from each other's existence, joint radar-communications metrics are defined and bounded as a measure of performance. Estimation rate is introduced, a novel measure of radar estimation information as a function of …

Contributors
Paul, Bryan, Bliss, Daniel W., Berisha, Visar, et al.
Created Date
2017

Machine learning (ML) has played an important role in several modern technological innovations and has become an important tool for researchers in various fields of interest. Besides engineering, ML techniques have started to spread across various departments of study, like health-care, medicine, diagnostics, social science, finance, economics etc. These techniques require data to train the algorithms and model a complex system and make predictions based on that model. Due to development of sophisticated sensors it has become easier to collect large volumes of data which is used to make necessary hypotheses using ML. The promising results obtained using ML have …

Contributors
Dutta, Arindam, Bliss, Daniel W, Berisha, Visar, et al.
Created Date
2018

As the number of devices with wireless capabilities and the proximity of these devices to each other increase, better ways to handle the interference they cause need to be explored. It is also important for these devices to keep up with the demand for data rates without compromising industry-established expectations of power consumption and mobility. Current methods of distributing the spectrum among all participants are not expected to cope with the demand in the very near future. In this thesis, the effect of employing sophisticated multiple-input, multiple-output (MIMO) systems in this regard is explored. The efficacy of systems …

Contributors
Thontadarya, Niranjan, Bliss, Daniel W, Berisha, Visar, et al.
Created Date
2014

The activation of the primary motor cortex (M1) is common in speech perception tasks that involve difficult listening conditions. Although the challenge of recognizing and discriminating non-native speech sounds appears to be an instantiation of listening under difficult circumstances, it is still unknown if M1 recruitment is facilitatory of second language speech perception. The purpose of this study was to investigate the role of M1 associated with speech motor centers in processing acoustic inputs in the native (L1) and second language (L2), using repetitive Transcranial Magnetic Stimulation (rTMS) to selectively alter neural activity in M1. Thirty-six healthy English/Spanish bilingual subjects …

Contributors
Barragan, Beatriz, Liss, Julie, Berisha, Visar, et al.
Created Date
2018

In recent years, there has been an increased interest in sharing available bandwidth to avoid spectrum congestion. With an ever-increasing number of wireless users, it is critical to develop signal processing-based spectrum sharing algorithms to achieve cooperative use of the allocated spectrum among multiple systems in order to reduce interference between systems. This work studies the radar and communications systems coexistence problem using two main approaches. The first approach develops methodologies to increase radar target tracking performance under low signal-to-interference-plus-noise ratio (SINR) conditions due to the coexistence of strong communications interference. The second approach jointly optimizes the performance of both …

Contributors
Kota, John Stephen, Papandreou-Suppappola, Antonia, Berisha, Visar, et al.
Created Date
2016

The recent spotlight on concussion has illuminated deficits in the current standard of care with regard to addressing acute and persistent cognitive signs and symptoms of mild brain injury. This stems, in part, from the diffuse nature of the injury, which tends not to produce focal cognitive or behavioral deficits that are easily identified or tracked. Indeed it has been shown that patients with enduring symptoms have difficulty describing their problems; therefore, there is an urgent need for a sensitive measure of brain activity that corresponds with higher order cognitive processing. The development of a neurophysiological metric that maps to …

Contributors
Utianski, Rene Lynn, Liss, Julie M, Berisha, Visar, et al.
Created Date
2014

In recent years, conventional convolutional neural networks (CNNs) have achieved outstanding performance in image and speech processing applications. Unfortunately, the pooling operation in CNNs ignores spatial information that is an important attribute in many applications. The recently proposed capsule network retains spatial information and improves on the capabilities of traditional CNNs. It uses capsules to describe features in multiple dimensions and dynamic routing to increase the statistical stability of the network. In this work, we first use a capsule network for the overlapping digit recognition problem. We evaluate the performance of the network with respect to recognition accuracy, convergence and training time …

Contributors
Xiong, Yan, Chakrabarti, Chaitali, Berisha, Visar, et al.
Created Date
2018