Skip to main content

ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


Date Range
2013 2019


Bayesian Additive Regression Trees (BART) is a non-parametric Bayesian model that often outperforms other popular predictive models in terms of out-of-sample error. This thesis studies a modified version of BART called Accelerated Bayesian Additive Regression Trees (XBART). The study consists of simulation and real data experiments comparing XBART to other leading algorithms, including BART. The results show that XBART maintains BART’s predictive power while reducing its computation time. The thesis also describes the development of a Python package implementing XBART. Dissertation/Thesis

Contributors
Yalov, Saar, Hahn, P. Richard, McCulloch, Robert, et al.
Created Date
2019

Due to large data resources generated by online educational applications, Educational Data Mining (EDM) has improved learning effects in different ways: Students Visualization, Recommendations for students, Students Modeling, Grouping Students, etc. A lot of programming assignments have the features like automating submissions, examining the test cases to verify the correctness, but limited studies compared different statistical techniques with latest frameworks, and interpreted models in a unified approach. In this thesis, several data mining algorithms have been applied to analyze students’ code assignment submission data from a real classroom study. The goal of this work is to explore and predict students’ …

Contributors
Tian, Wenbo, Hsiao, Ihan, Bazzi, Rida, et al.
Created Date
2019

The recent technological advances enable the collection of various complex, heterogeneous and high-dimensional data in biomedical domains. The increasing availability of the high-dimensional biomedical data creates the needs of new machine learning models for effective data analysis and knowledge discovery. This dissertation introduces several unsupervised and supervised methods to help understand the data, discover the patterns and improve the decision making. All the proposed methods can generalize to other industrial fields. The first topic of this dissertation focuses on the data clustering. Data clustering is often the first step for analyzing a dataset without the label information. Clustering high-dimensional data …

Contributors
Lin, Sangdi, Runger, George C, Kocher, Jean-Pierre A, et al.
Created Date
2018

A major challenge in health-related policy and program evaluation research is attributing underlying causal relationships where complicated processes may exist in natural or quasi-experimental settings. Spatial interaction and heterogeneity between units at individual or group levels can violate both components of the Stable-Unit-Treatment-Value-Assumption (SUTVA) that are core to the counterfactual framework, making treatment effects difficult to assess. New approaches are needed in health studies to develop spatially dynamic causal modeling methods to both derive insights from data that are sensitive to spatial differences and dependencies, and also be able to rely on a more robust, dynamic technical infrastructure needed for …

Contributors
Kolak, Marynia Aniela, Anselin, Luc, Rey, Sergio, et al.
Created Date
2017

This article proposes a new information-based subdata selection (IBOSS) algorithm, Squared Scaled Distance Algorithm (SSDA). It is based on the invariance of the determinant of the information matrix under orthogonal transformations, especially rotations. Extensive simulation results show that the new IBOSS algorithm retains nice asymptotic properties of IBOSS and gives a larger determinant of the subdata information matrix. It has the same order of time complexity as the D-optimal IBOSS algorithm. However, it exploits the advantages of vectorized calculation avoiding for loops and is approximately 6 times as fast as the D-optimal IBOSS algorithm in R. The robustness of SSDA …

Contributors
Zheng, Yi, Stufken, John, Reiser, Mark, et al.
Created Date
2017

The dawn of Internet of Things (IoT) has opened the opportunity for mainstream adoption of machine learning analytics. However, most research in machine learning has focused on discovery of new algorithms or fine-tuning the performance of existing algorithms. Little exists on the process of taking an algorithm from the lab-environment into the real-world, culminating in sustained value. Real-world applications are typically characterized by dynamic non-stationary systems with requirements around feasibility, stability and maintainability. Not much has been done to establish standards around the unique analytics demands of real-world scenarios. This research explores the problem of the why so few of …

Contributors
Shahapurkar, Som, Liu, Huan, Davulcu, Hasan, et al.
Created Date
2016

Tracking targets in the presence of clutter is inevitable, and presents many challenges. Additionally, rapid, drastic changes in clutter density between different environments or scenarios can make it even more difficult for tracking algorithms to adapt. A novel approach to target tracking in such dynamic clutter environments is proposed using a particle filter (PF) integrated with Interacting Multiple Models (IMMs) to compensate and adapt to the transition between different clutter densities. This model was implemented for the case of a monostatic sensor tracking a single target moving with constant velocity along a two-dimensional trajectory, which crossed between regions of drastically …

Contributors
Dutson, Karl J, Papandreou-Suppappola, Antonia, Kovvali, Narayan, et al.
Created Date
2015

Complex systems are pervasive in science and engineering. Some examples include complex engineered networks such as the internet, the power grid, and transportation networks. The complexity of such systems arises not just from their size, but also from their structure, operation (including control and management), evolution over time, and that people are involved in their design and operation. Our understanding of such systems is limited because their behaviour cannot be characterized using traditional techniques of modelling and analysis. As a step in model development, statistically designed screening experiments may be used to identify the main effects and interactions most significant …

Contributors
Aldaco-Gastelum, Abraham Netzahualcoyotl, Syrotiuk, Violet R., Colbourn, Charles J., et al.
Created Date
2015

Statistics is taught at every level of education, yet teachers often have to assume their students have no knowledge of statistics and start from scratch each time they set out to teach statistics. The motivation for this experimental study comes from interest in exploring educational applications of augmented reality (AR) delivered via mobile technology that could potentially provide rich, contextualized learning for understanding concepts related to statistics education. This study examined the effects of AR experiences for learning basic statistical concepts. Using a 3 x 2 research design, this study compared learning gains of 252 undergraduate and graduate students from …

Contributors
Conley, Quincy, Atkinson, Robert K, Nguyen, Frank, et al.
Created Date
2013

Parallel Monte Carlo applications require the pseudorandom numbers used on each processor to be independent in a probabilistic sense. The TestU01 software package is the standard testing suite for detecting stream dependence and other properties that make certain pseudorandom generators ineffective in parallel (as well as serial) settings. TestU01 employs two basic schemes for testing parallel generated streams. The first applies serial tests to the individual streams and then tests the resulting P-values for uniformity. The second turns all the parallel generated streams into one long vector and then applies serial tests to the resulting concatenated stream. Various forms of …

Contributors
Ismay, Chester Ivan, Eubank, Randall, Young, Dennis, et al.
Created Date
2013