
ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, and supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.




This dissertation investigates the classification of systemic lupus erythematosus (SLE) in the presence of non-SLE alternatives, while developing novel curve classification methodologies with wide-ranging applications. Functional data representations of plasma thermogram measurements and the corresponding derivative curves provide predictors yet to be investigated for SLE identification. Functional nonparametric classifiers form a methodological basis, which is used herein to develop a) the family of ESFuNC segment-wise curve classification algorithms and b) per-pixel ensembles based on logistic regression and fused-LASSO. The proposed methods achieve test set accuracy rates as high as 94.3%, while returning information about regions of the temperature domain …

Contributors
Buscaglia, Robert, Kamarianakis, Yiannis, Armbruster, Dieter, et al.
Created Date
2018

Modern, advanced statistical tools from data mining and machine learning have become commonplace in molecular biology in large part because of the “big data” demands of various kinds of “-omics” (e.g., genomics, transcriptomics, metabolomics). However, in other fields of biology where empirical data sets are conventionally smaller, more traditional statistical methods of inference are still very effective and widely used. Nevertheless, with the decrease in cost of high-performance computing, these fields are starting to employ simulation models to generate insights into questions that have been elusive in the laboratory and field. Although these computational models allow for exquisite control …

Contributors
Seto, Christian, Pavlic, Theodore, Li, Jing, et al.
Created Date
2018

Random forest (RF) is a popular and powerful technique. It can be used for classification, regression, and unsupervised clustering. In its original form, introduced by Leo Breiman, RF is used as a predictive model to generate predictions for new observations. Recent research has proposed several methods based on RF for feature selection and for generating prediction intervals. However, they are limited in their applicability and accuracy. In this dissertation, RF is applied to build a predictive model for a complex dataset, and used as the basis for two novel methods for biomarker discovery and generating prediction intervals. Firstly, a …

Contributors
Guan, Xin, Liu, Li, Runger, George, et al.
Created Date
2017

The proliferation of social media websites and discussion forums in the last decade has resulted in social media mining emerging as an effective mechanism to extract consumer patterns. Most research on social media and pharmacovigilance has concentrated on Adverse Drug Reaction (ADR) identification. Such methods employ a step of drug search followed by classification of the associated text as containing an ADR or not. Although this method works efficiently for ADR classification, if ADR evidence is present in users' posts over time, drug mentions fail to capture such ADRs. It also fails to record additional user information which may provide an …

Contributors
Chandrashekar, Pramod Bharadwaj Chandrashekar, Davulcu, Hasan, Gonzalez, Graciela, et al.
Created Date
2016

Statistical methods have been widely used to understand factors in clinical and public health data. Statistical hypothesis tests are procedures for testing pre-stated hypotheses. The development and properties of these procedures, as well as their performance, are based upon certain assumptions. Desirable properties of statistical tests are to maintain validity and to perform well even if these assumptions are not met. A statistical test that maintains such desirable properties is called robust. Mathematical models are typically mechanistic frameworks, used to study dynamic interactions between components (mechanisms) of a system, and how these interactions give rise to the changes in behavior (patterns) …

Contributors
Gonzalez, Beverly, Castillo-Chavez, Carlos, Mubayi, Anuj, et al.
Created Date
2015

In anthropological models of social organization, kinship is perceived to be fundamental to social structure. This project aimed to understand how individuals buried in neighborhoods or patio groups were affiliated, by considering multiple possibilities of fictive and biological kinship, short or long-term co-residence, and long-distance kin affiliation. The social organization of the ancient Maya urban center of Copan, Honduras during the Late Classic (AD 600-822) period was evaluated through analysis of the human skeletal remains drawn from the largest collection yet recovered in Mesoamerica (n=1200). The research question was: What are the roles that kinship (biological or fictive) and co-residence …

Contributors
Miller, Katherine Anne, Buikstra, Jane E, Bell, Ellen E, et al.
Created Date
2015

The advent of new high-throughput technology allows for increasingly detailed characterization of the immune system in healthy, diseased, and aged states. The immune system is composed of two main branches: the innate and adaptive immune systems, though the border between these two branches is appearing less distinct. The adaptive immune system is further split into two main categories: humoral and cellular immunity. The humoral immune response produces antibodies against specific targets, and these antibodies can be used to learn about disease and normal states. In this document, I use antibodies to characterize the immune system in two ways: 1. …

Contributors
Whittemore, Kurt, Sykes, Kathryn, Johnston, Stephen A, et al.
Created Date
2014

In blindness research, the corpus callosum (CC) is the most frequently studied sub-cortical structure, due to its important involvement in visual processing. While most callosal analyses from brain structural magnetic resonance images (MRI) are limited to the 2D mid-sagittal slice, we propose a novel framework to capture a complete set of 3D morphological differences in the corpus callosum between two groups of subjects. The CCs are segmented from whole brain T1-weighted MRI and modeled as 3D tetrahedral meshes. The callosal surface is divided into superior and inferior patches on which we compute a volumetric harmonic field by solving the Laplace's …

Contributors
Xu, Liang, Wang, Yalin, Maciejewski, Ross, et al.
Created Date
2013

Immunosignaturing is a technology that allows the humoral immune response to be observed through the binding of antibodies to random sequence peptides. The immunosignaturing microarray is based on complex mixtures of antibodies binding to arrays of random sequence peptides in a multiplexed fashion. There are computational and statistical challenges to the analysis of immunosignaturing data. The overall aim of my dissertation is to develop novel computational and statistical methods for immunosignaturing data to assess its potential for diagnostics and drug discovery. Firstly, I discovered that the Naive Bayes classification algorithm, which leverages the biological independence of the probes on our …

Contributors
Kukreja, Muskan, Johnston, Stephen Albert, Stafford, Phillip, et al.
Created Date
2012

The living world we inhabit and observe is extraordinarily complex. From the perspective of a person analyzing data about the living world, complexity is most commonly encountered in two forms: 1) in the sheer size of the datasets that must be analyzed and the number of mathematical computations necessary to obtain an answer, and 2) in the underlying structure of the data, which does not conform to classical normal-theory statistical assumptions and includes clustering and unobserved latent constructs. Until recently, the methods and tools necessary to effectively address the complexity of biomedical data were not ordinarily available. The …

Contributors
Brown, Justin Reed, Dinu, Valentin, Johnson, William, et al.
Created Date
2012