Skip to main content

ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


Language
  • English
Subject
Date Range
2010 2019


Dimensionality assessment is an important component of evaluating item response data. Existing approaches to evaluating common assumptions of unidimensionality, such as DIMTEST (Nandakumar & Stout, 1993; Stout, 1987; Stout, Froelich, & Gao, 2001), have been shown to work well under large-scale assessment conditions (e.g., large sample sizes and item pools; see e.g., Froelich & Habing, 2007). It remains to be seen how such procedures perform in the context of small-scale assessments characterized by relatively small sample sizes and/or short tests. The fact that some procedures come with minimum allowable values for characteristics of the data, such as the number of …

Contributors
Reichenberg, Ray E., Levy, Roy, Thompson, Marilyn S., et al.
Created Date
2013

Many longitudinal studies, especially in clinical trials, suffer from missing data issues. Most estimation procedures assume that the missing values are ignorable or missing at random (MAR). However, this assumption leads to unrealistic simplification and is implausible for many cases. For example, an investigator is examining the effect of treatment on depression. Subjects are scheduled with doctors on a regular basis and asked questions about recent emotional situations. Patients who are experiencing severe depression are more likely to miss an appointment and leave the data missing for that particular visit. Data that are not missing at random may produce bias …

Contributors
Zhang, Jun, Reiser, Mark, Barber, Jarrett, et al.
Created Date
2013

Value-added models (VAMs) are used by many states to assess contributions of individual teachers and schools to students' academic growth. The generalized persistence VAM, one of the most flexible in the literature, estimates the ``value added'' by individual teachers to their students' current and future test scores by employing a mixed model with a longitudinal database of test scores. There is concern, however, that missing values that are common in the longitudinal student scores can bias value-added assessments, especially when the models serve as a basis for personnel decisions -- such as promoting or dismissing teachers -- as they are …

Contributors
Karl, Andrew Thomas, Lohr, Sharon L, Yang, Yan, et al.
Created Date
2012

Understanding customer preference is crucial for new product planning and marketing decisions. This thesis explores how historical data can be leveraged to understand and predict customer preference. This thesis presents a decision support framework that provides a holistic view on customer preference by following a two-phase procedure. Phase-1 uses cluster analysis to create product profiles based on which customer profiles are derived. Phase-2 then delves deep into each of the customer profiles and investigates causality behind their preference using Bayesian networks. This thesis illustrates the working of the framework using the case of Intel Corporation, world’s largest semiconductor manufacturing company. …

Contributors
Ram, Sudarshan Venkat, Kempf, Karl G, Wu, Teresa, et al.
Created Date
2017

Complex systems are pervasive in science and engineering. Some examples include complex engineered networks such as the internet, the power grid, and transportation networks. The complexity of such systems arises not just from their size, but also from their structure, operation (including control and management), evolution over time, and that people are involved in their design and operation. Our understanding of such systems is limited because their behaviour cannot be characterized using traditional techniques of modelling and analysis. As a step in model development, statistically designed screening experiments may be used to identify the main effects and interactions most significant …

Contributors
Aldaco-Gastelum, Abraham Netzahualcoyotl, Syrotiuk, Violet R., Colbourn, Charles J., et al.
Created Date
2015

The Pearson and likelihood ratio statistics are commonly used to test goodness-of-fit for models applied to data from a multinomial distribution. When data are from a table formed by cross-classification of a large number of variables, the common statistics may have low power and inaccurate Type I error level due to sparseness in the cells of the table. The GFfit statistic can be used to examine model fit in subtables. It is proposed to assess model fit by using a new version of GFfit statistic based on orthogonal components of Pearson chi-square as a diagnostic to examine the fit on …

Contributors
Zhu, Junfei, Reiser, Mark, Stufken, John, et al.
Created Date
2017

Urban growth, from regional sprawl to global urbanization, is the most rapid, drastic, and irreversible form of human modification to the natural environment. Extensive land cover modifications during urban growth have altered the local energy balance, causing the city warmer than its surrounding rural environment, a phenomenon known as an urban heat island (UHI). How are the seasonal and diurnal surface temperatures related to the land surface characteristics, and what land cover types and/or patterns are desirable for ameliorating climate in a fast growing desert city? This dissertation scrutinizes these questions and seeks to address them using a combination of …

Contributors
Fan, Chao, Myint, Soe W, Li, Wenwen, et al.
Created Date
2016

The main objective of this research is to develop an approach to PV module lifetime prediction. In doing so, the aim is to move from empirical generalizations to a formal predictive science based on data-driven case studies of the crystalline silicon PV systems. The evaluation of PV systems aged 5 to 30 years old that results in systematic predictive capability that is absent today. The warranty period provided by the manufacturers typically range from 20 to 25 years for crystalline silicon modules. The end of lifetime (for example, the time-to-degrade by 20% from rated power) of PV modules is usually …

Contributors
Kuitche, Joseph Mathurin, Pan, Rong, TamizhMani, Govindasamy, et al.
Created Date
2014

Statistical process control (SPC) and predictive analytics have been used in industrial manufacturing and design, but up until now have not been applied to threshold data of vital sign monitoring in remote care settings. In this study of 20 elders with COPD and/or CHF, extended months of peak flow monitoring (FEV1) using telemedicine are examined to determine when an earlier or later clinical intervention may have been advised. This study demonstrated that SPC may bring less than a 2.0% increase in clinician workload while providing more robust statistically-derived thresholds than clinician-derived thresholds. Using a random K-fold model, FEV1 output was …

Contributors
Fralick, Celeste Rachelle, Muthuswamy, Jitendran, O'Shea, Terrance, et al.
Created Date
2013

Bayesian Additive Regression Trees (BART) is a non-parametric Bayesian model that often outperforms other popular predictive models in terms of out-of-sample error. This thesis studies a modified version of BART called Accelerated Bayesian Additive Regression Trees (XBART). The study consists of simulation and real data experiments comparing XBART to other leading algorithms, including BART. The results show that XBART maintains BART’s predictive power while reducing its computation time. The thesis also describes the development of a Python package implementing XBART. Dissertation/Thesis

Contributors
Yalov, Saar, Hahn, P. Richard, McCulloch, Robert, et al.
Created Date
2019

Real-world environments are characterized by non-stationary and continuously evolving data. Learning a classification model on this data would require a framework that is able to adapt itself to newer circumstances. Under such circumstances, transfer learning has come to be a dependable methodology for improving classification performance with reduced training costs and without the need for explicit relearning from scratch. In this thesis, a novel instance transfer technique that adapts a "Cost-sensitive" variation of AdaBoost is presented. The method capitalizes on the theoretical and functional properties of AdaBoost to selectively reuse outdated training instances obtained from a "source" domain to effectively …

Contributors
Venkatesan, Ashok, Panchanathan, Sethuraman, Li, Baoxin, et al.
Created Date
2011

The Pearson and likelihood ratio statistics are well-known in goodness-of-fit testing and are commonly used for models applied to multinomial count data. When data are from a table formed by the cross-classification of a large number of variables, these goodness-of-fit statistics may have lower power and inaccurate Type I error rate due to sparseness. Pearson's statistic can be decomposed into orthogonal components associated with the marginal distributions of observed variables, and an omnibus fit statistic can be obtained as a sum of these components. When the statistic is a sum of components for lower-order marginals, it has good performance for …

Contributors
Dassanayake, Mudiyanselage Maduranga Kasun, Reiser, Mark, Kao, Ming-Hung, et al.
Created Date
2018

Quadratic growth curves of 2nd degree polynomial are widely used in longitudinal studies. For a 2nd degree polynomial, the vertex represents the location of the curve in the XY plane. For a quadratic growth curve, we propose an approximate confidence region as well as the confidence interval for x and y-coordinates of the vertex using two methods, the gradient method and the delta method. Under some models, an indirect test on the location of the curve can be based on the intercept and slope parameters, but in other models, a direct test on the vertex is required. We present a …

Contributors
Yu, Wanchunzi, Reiser, Mark, Barber, Jarrett, et al.
Created Date
2015

This dissertation presents methods for addressing research problems that currently can only adequately be solved using Quality Reliability Engineering (QRE) approaches especially accelerated life testing (ALT) of electronic printed wiring boards with applications to avionics circuit boards. The methods presented in this research are generally applicable to circuit boards, but the data generated and their analysis is for high performance avionics. Avionics equipment typically requires 20 years expected life by aircraft equipment manufacturers and therefore ALT is the only practical way of performing life test estimates. Both thermal and vibration ALT induced failure are performed and analyzed to resolve industry …

Contributors
Juarez, Joseph Moses, Montgomery, Douglas C., Borror, Connie M., et al.
Created Date
2012

This thesis presents a family of adaptive curvature methods for gradient-based stochastic optimization. In particular, a general algorithmic framework is introduced along with a practical implementation that yields an efficient, adaptive curvature gradient descent algorithm. To this end, a theoretical and practical link between curvature matrix estimation and shrinkage methods for covariance matrices is established. The use of shrinkage improves estimation accuracy of the curvature matrix when data samples are scarce. This thesis also introduce several insights that result in data- and computation-efficient update equations. Empirical results suggest that the proposed method compares favorably with existing second-order techniques based on …

Contributors
Barron, Trevor Paul, Ben Amor, Heni, He, Jingrui, et al.
Created Date
2019

This work presents two complementary studies that propose heuristic methods to capture characteristics of data using the ensemble learning method of random forest. The first study is motivated by the problem in education of determining teacher effectiveness in student achievement. Value-added models (VAMs), constructed as linear mixed models, use students’ test scores as outcome variables and teachers’ contributions as random effects to ascribe changes in student performance to the teachers who have taught them. The VAMs teacher score is the empirical best linear unbiased predictor (EBLUP). This approach is limited by the adequacy of the assumed model specification with respect …

Contributors
Valdivia, Arturo, Eubank, Randall, Young, Dennis, et al.
Created Date
2013

This article proposes a new information-based subdata selection (IBOSS) algorithm, Squared Scaled Distance Algorithm (SSDA). It is based on the invariance of the determinant of the information matrix under orthogonal transformations, especially rotations. Extensive simulation results show that the new IBOSS algorithm retains nice asymptotic properties of IBOSS and gives a larger determinant of the subdata information matrix. It has the same order of time complexity as the D-optimal IBOSS algorithm. However, it exploits the advantages of vectorized calculation avoiding for loops and is approximately 6 times as fast as the D-optimal IBOSS algorithm in R. The robustness of SSDA …

Contributors
Zheng, Yi, Stufken, John, Reiser, Mark, et al.
Created Date
2017

Designing studies that use latent growth modeling to investigate change over time calls for optimal approaches for conducting power analysis for a priori determination of required sample size. This investigation (1) studied the impacts of variations in specified parameters, design features, and model misspecification in simulation-based power analyses and (2) compared power estimates across three common power analysis techniques: the Monte Carlo method; the Satorra-Saris method; and the method developed by MacCallum, Browne, and Cai (MBC). Choice of sample size, effect size, and slope variance parameters markedly influenced power estimates; however, level-1 error variance and number of repeated measures (3 …

Contributors
Van Vleet, Bethany L., Thompson, Marilyn S., Green, Samuel B., et al.
Created Date
2011

This dissertation proposes a new set of analytical methods for high dimensional physiological sensors. The methodologies developed in this work were motivated by problems in learning science, but also apply to numerous disciplines where high dimensional signals are present. In the education field, more data is now available from traditional sources and there is an important need for analytical methods to translate this data into improved learning. Affecting Computing which is the study of new techniques that develop systems to recognize and model human emotions is integrating different physiological signals such as electroencephalogram (EEG) and electromyogram (EMG) to detect and …

Contributors
Lujan Moreno, Gustavo A., Runger, George C, Atkinson, Robert K, et al.
Created Date
2017

Anomaly is a deviation from the normal behavior of the system and anomaly detection techniques try to identify unusual instances based on deviation from the normal data. In this work, I propose a machine-learning algorithm, referred to as Artificial Contrasts, for anomaly detection in categorical data in which neither the dimension, the specific attributes involved, nor the form of the pattern is known a priori. I use RandomForest (RF) technique as an effective learner for artificial contrast. RF is a powerful algorithm that can handle relations of attributes in high dimensional data and detect anomalies while providing probability estimates for …

Contributors
Mousavi, Seyyedehnasim, Runger, George, Wu, Teresa, et al.
Created Date
2016