Skip to main content

ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


Contributor
Status
  • Public
Date Range
2010 2019


Visual Question Answering (VQA) is a new research area involving technologies ranging from computer vision, natural language processing, to other sub-fields of artificial intelligence such as knowledge representation. The fundamental task is to take as input one image and one question (in text) related to the given image, and to generate a textual answer to the input question. There are two key research problems in VQA: image understanding and the question answering. My research mainly focuses on developing solutions to support solving these two problems. In image understanding, one important research area is semantic segmentation, which takes images as input …

Contributors
Tian, Qiongjie, Li, Baoxin, Tong, Hanghang, et al.
Created Date
2017

Present day Internet Protocol (IP) based video transport and dissemination systems are heterogeneous in that they differ in network bandwidth, display resolutions and processing capabilities. One important objective in such an environment is the flexible adaptation of once-encoded content and to achieve this, one popular method is the scalable video coding (SVC) technique. The SVC extension of the H.264/AVC standard has higher compression efficiency when compared to the previous scalable video standards. The network transport of 3D video, which is obtained by superimposing two views of a video scene, poses significant challenges due to the increased video data compared to …

Contributors
Pulipaka, Venkata Sai Akshay, Reisslein, Martin, Karam, Lina, et al.
Created Date
2012

Mixture of experts is a machine learning ensemble approach that consists of individual models that are trained to be ``experts'' on subsets of the data, and a gating network that provides weights to output a combination of the expert predictions. Mixture of experts models do not currently see wide use due to difficulty in training diverse experts and high computational requirements. This work presents modifications of the mixture of experts formulation that use domain knowledge to improve training, and incorporate parameter sharing among experts to reduce computational requirements. First, this work presents an application of mixture of experts models for …

Contributors
Dodge, Samuel Fuller, Karam, Lina, Jayasuriya, Suren, et al.
Created Date
2018

Video deinterlacing is a key technique in digital video processing, particularly with the widespread usage of LCD and plasma TVs. This thesis proposes a novel spatio-temporal, non-linear video deinterlacing technique that adaptively chooses between the results from one dimensional control grid interpolation (1DCGI), vertical temporal filter (VTF) and temporal line averaging (LA). The proposed method performs better than several popular benchmarking methods in terms of both visual quality and peak signal to noise ratio (PSNR). The algorithm performs better than existing approaches like edge-based line averaging (ELA) and spatio-temporal edge-based median filtering (STELA) on fine moving edges and semi-static regions …

Contributors
Venkatesan, Ragav, Frakes, David H, Li, Baoxin, et al.
Created Date
2012

High-level inference tasks in video applications such as recognition, video retrieval, and zero-shot classification have become an active research area in recent years. One fundamental requirement for such applications is to extract high-quality features that maintain high-level information in the videos. Many video feature extraction algorithms have been purposed, such as STIP, HOG3D, and Dense Trajectories. These algorithms are often referred to as “handcrafted” features as they were deliberately designed based on some reasonable considerations. However, these algorithms may fail when dealing with high-level tasks or complex scene videos. Due to the success of using deep convolution neural networks (CNNs) …

Contributors
Hu, Sheng-Hung, Li, Baoxin, Turaga, Pavan, et al.
Created Date
2016

Digital imaging and image processing technologies have revolutionized the way in which we capture, store, receive, view, utilize, and share images. In image-based applications, through different processing stages (e.g., acquisition, compression, and transmission), images are subjected to different types of distortions which degrade their visual quality. Image Quality Assessment (IQA) attempts to use computational models to automatically evaluate and estimate the image quality in accordance with subjective evaluations. Moreover, with the fast development of computer vision techniques, it is important in practice to extract and understand the information contained in blurred images or regions. The work in this dissertation focuses …

Contributors
Golestaneh, Seyedalireza, Karam, Lina, Bliss, Daniel W., et al.
Created Date
2018

Blur is an important attribute in the study and modeling of the human visual system. In this work, 3D blur discrimination experiments are conducted to measure the just noticeable additional blur required to differentiate a target blur from the reference blur level. The past studies on blur discrimination have measured the sensitivity of the human visual system to blur using 2D test patterns. In this dissertation, subjective tests are performed to measure blur discrimination thresholds using stereoscopic 3D test patterns. The results of this study indicate that, in the symmetric stereo viewing case, binocular disparity does not affect the blur …

Contributors
Subedar, Mahesh, Karam, Lina, Abousleman, Glen, et al.
Created Date
2015

In recent years, several methods have been proposed to encode sentences into fixed length continuous vectors called sentence representation or sentence embedding. With the recent advancements in various deep learning methods applied in Natural Language Processing (NLP), these representations play a crucial role in tasks such as named entity recognition, question answering and sentence classification. Traditionally, sentence vector representations are learnt from its constituent word representations, also known as word embeddings. Various methods to learn the distributed representation (embedding) of words have been proposed using the notion of Distributional Semantics, i.e. “meaning of a word is characterized by the company …

Contributors
Rath, Trideep, Baral, Chitta, Li, Baoxin, et al.
Created Date
2017