Skip to main content

ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


Contributor
Resource Type
  • Masters Thesis
  • 1 Text
Subject
Date Range
2011 2019


Bayesian Additive Regression Trees (BART) is a non-parametric Bayesian model that often outperforms other popular predictive models in terms of out-of-sample error. This thesis studies a modified version of BART called Accelerated Bayesian Additive Regression Trees (XBART). The study consists of simulation and real data experiments comparing XBART to other leading algorithms, including BART. The results show that XBART maintains BART’s predictive power while reducing its computation time. The thesis also describes the development of a Python package implementing XBART. Dissertation/Thesis

Contributors
Yalov, Saar, Hahn, P. Richard, McCulloch, Robert, et al.
Created Date
2019

This thesis presents a family of adaptive curvature methods for gradient-based stochastic optimization. In particular, a general algorithmic framework is introduced along with a practical implementation that yields an efficient, adaptive curvature gradient descent algorithm. To this end, a theoretical and practical link between curvature matrix estimation and shrinkage methods for covariance matrices is established. The use of shrinkage improves estimation accuracy of the curvature matrix when data samples are scarce. This thesis also introduce several insights that result in data- and computation-efficient update equations. Empirical results suggest that the proposed method compares favorably with existing second-order techniques based on …

Contributors
Barron, Trevor Paul, Ben Amor, Heni, He, Jingrui, et al.
Created Date
2019

In this work, I present a Bayesian inference computational framework for the analysis of widefield microscopy data that addresses three challenges: (1) counting and localizing stationary fluorescent molecules; (2) inferring a spatially-dependent effective fluorescence profile that describes the spatially-varying rate at which fluorescent molecules emit subsequently-detected photons (due to different illumination intensities or different local environments); and (3) inferring the camera gain. My general theoretical framework utilizes the Bayesian nonparametric Gaussian and beta-Bernoulli processes with a Markov chain Monte Carlo sampling scheme, which I further specify and implement for Total Internal Reflection Fluorescence (TIRF) microscopy data, benchmarking the method on …

Contributors
Wallgren, Ross Tod, Presse, Steve, Armbruster, Hans, et al.
Created Date
2019

Due to large data resources generated by online educational applications, Educational Data Mining (EDM) has improved learning effects in different ways: Students Visualization, Recommendations for students, Students Modeling, Grouping Students, etc. A lot of programming assignments have the features like automating submissions, examining the test cases to verify the correctness, but limited studies compared different statistical techniques with latest frameworks, and interpreted models in a unified approach. In this thesis, several data mining algorithms have been applied to analyze students’ code assignment submission data from a real classroom study. The goal of this work is to explore and predict students’ …

Contributors
Tian, Wenbo, Hsiao, Ihan, Bazzi, Rida, et al.
Created Date
2019

Statistical model selection using the Akaike Information Criterion (AIC) and similar criteria is a useful tool for comparing multiple and non-nested models without the specification of a null model, which has made it increasingly popular in the natural and social sciences. De- spite their common usage, model selection methods are not driven by a notion of statistical confidence, so their results entail an unknown de- gree of uncertainty. This paper introduces a general framework which extends notions of Type-I and Type-II error to model selection. A theo- retical method for controlling Type-I error using Difference of Goodness of Fit (DGOF) …

Contributors
Cullan, Michael, Sterner, Beckett, Fricks, John, et al.
Created Date
2018

Understanding customer preference is crucial for new product planning and marketing decisions. This thesis explores how historical data can be leveraged to understand and predict customer preference. This thesis presents a decision support framework that provides a holistic view on customer preference by following a two-phase procedure. Phase-1 uses cluster analysis to create product profiles based on which customer profiles are derived. Phase-2 then delves deep into each of the customer profiles and investigates causality behind their preference using Bayesian networks. This thesis illustrates the working of the framework using the case of Intel Corporation, world’s largest semiconductor manufacturing company. …

Contributors
Ram, Sudarshan Venkat, Kempf, Karl G, Wu, Teresa, et al.
Created Date
2017

This article proposes a new information-based subdata selection (IBOSS) algorithm, Squared Scaled Distance Algorithm (SSDA). It is based on the invariance of the determinant of the information matrix under orthogonal transformations, especially rotations. Extensive simulation results show that the new IBOSS algorithm retains nice asymptotic properties of IBOSS and gives a larger determinant of the subdata information matrix. It has the same order of time complexity as the D-optimal IBOSS algorithm. However, it exploits the advantages of vectorized calculation avoiding for loops and is approximately 6 times as fast as the D-optimal IBOSS algorithm in R. The robustness of SSDA …

Contributors
Zheng, Yi, Stufken, John, Reiser, Mark, et al.
Created Date
2017

Distributed Renewable energy generators are now contributing a significant amount of energy into the energy grid. Consequently, reliability adequacy of such energy generators will depend on making accurate forecasts of energy produced by them. Power outputs of Solar PV systems depend on the stochastic variation of environmental factors (solar irradiance, ambient temperature & wind speed) and random mechanical failures/repairs. Monte Carlo Simulation which is typically used to model such problems becomes too computationally intensive leading to simplifying state-space assumptions. Multi-state models for power system reliability offer a higher flexibility in providing a description of system state evolution and an accurate …

Contributors
Kadloor, Nikhil, Kuitche, Joseph, Pan, Rong, et al.
Created Date
2017

Anomaly is a deviation from the normal behavior of the system and anomaly detection techniques try to identify unusual instances based on deviation from the normal data. In this work, I propose a machine-learning algorithm, referred to as Artificial Contrasts, for anomaly detection in categorical data in which neither the dimension, the specific attributes involved, nor the form of the pattern is known a priori. I use RandomForest (RF) technique as an effective learner for artificial contrast. RF is a powerful algorithm that can handle relations of attributes in high dimensional data and detect anomalies while providing probability estimates for …

Contributors
Mousavi, Seyyedehnasim, Runger, George, Wu, Teresa, et al.
Created Date
2016

The operating temperature of photovoltaic (PV) modules is affected by external factors such as irradiance, wind speed and ambient temperature as well as internal factors like material properties and design properties. These factors can make a difference in the operating temperatures between cells within a module and between modules within a plant. This is a three-part thesis. Part 1 investigates the behavior of temperature distribution of PV cells within a module through outdoor temperature monitoring under various operating conditions (Pmax, Voc and Isc) and examines deviation in the temperature coefficient values pertaining to this temperature variation. ANOVA, a statistical tool, …

Contributors
PAVGI, ASHWINI, Tamizhmani, Govindasamy, Phelan, Patrick, et al.
Created Date
2016