ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, and supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection, visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


Dimensionality assessment is an important component of evaluating item response data. Existing approaches to evaluating common assumptions of unidimensionality, such as DIMTEST (Nandakumar & Stout, 1993; Stout, 1987; Stout, Froelich, & Gao, 2001), have been shown to work well under large-scale assessment conditions (e.g., large sample sizes and item pools; see e.g., Froelich & Habing, 2007). It remains to be seen how such procedures perform in the context of small-scale assessments characterized by relatively small sample sizes and/or short tests. The fact that some procedures come with minimum allowable values for characteristics of the data, such as the number of …

Contributors
Reichenberg, Ray E., Levy, Roy, Thompson, Marilyn S., et al.
Created Date
2013

Understanding customer preference is crucial for new product planning and marketing decisions. This thesis explores how historical data can be leveraged to understand and predict customer preference. This thesis presents a decision support framework that provides a holistic view of customer preference by following a two-phase procedure. Phase-1 uses cluster analysis to create product profiles, from which customer profiles are derived. Phase-2 then delves deep into each of the customer profiles and investigates the causality behind their preference using Bayesian networks. This thesis illustrates the working of the framework using the case of Intel Corporation, the world’s largest semiconductor manufacturing company. …
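
A minimal Python sketch of the Phase-1 clustering step, assuming a small table of numeric product attributes; the attribute names and the cluster count are illustrative placeholders, not the thesis's Intel data.

    # Phase-1 sketch: cluster products into profiles with k-means (scikit-learn).
    # Attribute names and n_clusters are illustrative assumptions.
    import pandas as pd
    from sklearn.preprocessing import StandardScaler
    from sklearn.cluster import KMeans

    products = pd.DataFrame({
        "price":      [250, 320, 180, 400, 290, 210],
        "core_count": [4, 8, 2, 16, 8, 4],
        "tdp_watts":  [65, 95, 35, 125, 95, 65],
    })

    X = StandardScaler().fit_transform(products)      # put attributes on a common scale
    products["product_profile"] = KMeans(n_clusters=2, n_init=10,
                                         random_state=0).fit_predict(X)
    print(products)                                   # Phase-2 would model preference within each profile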

Contributors
Ram, Sudarshan Venkat, Kempf, Karl G, Wu, Teresa, et al.
Created Date
2017

Bayesian Additive Regression Trees (BART) is a non-parametric Bayesian model that often outperforms other popular predictive models in terms of out-of-sample error. This thesis studies a modified version of BART called Accelerated Bayesian Additive Regression Trees (XBART). The study consists of simulation and real data experiments comparing XBART to other leading algorithms, including BART. The results show that XBART maintains BART’s predictive power while reducing its computation time. The thesis also describes the development of a Python package implementing XBART.
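
The abstract describes out-of-sample comparisons between learners. The Python sketch below shows the shape of such a comparison; the scikit-learn ensembles are stand-ins only, since the actual BART/XBART packages and their APIs are not shown here.

    # Sketch of an out-of-sample comparison workflow; the listed estimators are
    # stand-ins for the BART/XBART implementations discussed above.
    from sklearn.datasets import make_friedman1
    from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error

    X, y = make_friedman1(n_samples=500, noise=1.0, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    candidates = {
        "gradient_boosting": GradientBoostingRegressor(random_state=0),
        "random_forest": RandomForestRegressor(random_state=0),
    }
    for name, model in candidates.items():
        pred = model.fit(X_tr, y_tr).predict(X_te)
        rmse = mean_squared_error(y_te, pred) ** 0.5   # out-of-sample error
        print(f"{name}: test RMSE = {rmse:.3f}")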

Contributors
Yalov, Saar, Hahn, P. Richard, McCulloch, Robert, et al.
Created Date
2019

Real-world environments are characterized by non-stationary and continuously evolving data. Learning a classification model on this data would require a framework that is able to adapt itself to newer circumstances. Under such circumstances, transfer learning has come to be a dependable methodology for improving classification performance with reduced training costs and without the need for explicit relearning from scratch. In this thesis, a novel instance transfer technique that adapts a "Cost-sensitive" variation of AdaBoost is presented. The method capitalizes on the theoretical and functional properties of AdaBoost to selectively reuse outdated training instances obtained from a "source" domain to effectively …
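
A hedged Python sketch of boosting-based instance transfer in the spirit described above: source instances that the current weak learner misclassifies are gradually down-weighted, so training concentrates on source data still consistent with the target domain. This follows the general TrAdaBoost idea, not the thesis's exact cost-sensitive variant.

    # Instance-transfer sketch: reweight "outdated" source instances during boosting.
    # The weight-update factors are illustrative, not the thesis's cost-sensitive rule.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X_src, y_src = rng.normal(0, 1, (200, 2)), rng.integers(0, 2, 200)   # source domain
    X_tgt, y_tgt = rng.normal(0.5, 1, (50, 2)), rng.integers(0, 2, 50)   # small target sample

    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt])
    n_src = len(y_src)
    w = np.ones(len(y)) / len(y)                      # uniform initial instance weights

    for _ in range(10):                               # boosting rounds
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        wrong = stump.predict(X) != y
        w[:n_src][wrong[:n_src]] *= 0.5               # shrink misclassified source instances
        w[n_src:][wrong[n_src:]] *= 2.0               # emphasize misclassified target instances
        w /= w.sum()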

Contributors
Venkatesan, Ashok, Panchanathan, Sethuraman, Li, Baoxin, et al.
Created Date
2011

This thesis presents a family of adaptive curvature methods for gradient-based stochastic optimization. In particular, a general algorithmic framework is introduced along with a practical implementation that yields an efficient, adaptive curvature gradient descent algorithm. To this end, a theoretical and practical link between curvature matrix estimation and shrinkage methods for covariance matrices is established. The use of shrinkage improves estimation accuracy of the curvature matrix when data samples are scarce. This thesis also introduces several insights that result in data- and computation-efficient update equations. Empirical results suggest that the proposed method compares favorably with existing second-order techniques based on …
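
A minimal NumPy sketch of the shrinkage idea: pull a noisy curvature (covariance-like) estimate toward a scaled identity when samples are scarce, then use it to precondition a gradient step. The fixed shrinkage intensity is an illustrative assumption, not the thesis's adaptive choice.

    # Shrinkage sketch: regularize a rank-deficient curvature estimate toward
    # a scaled identity, then compute a preconditioned descent direction.
    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 20, 10                              # dimension larger than sample count
    G = rng.normal(size=(n, d))                # per-sample gradients (illustrative)
    S = G.T @ G / n                            # raw curvature estimate (rank-deficient)

    lam = 0.3                                  # illustrative shrinkage intensity
    target = np.trace(S) / d * np.eye(d)       # scaled-identity shrinkage target
    C = (1 - lam) * S + lam * target           # shrunk, well-conditioned estimate

    g = G.mean(axis=0)                         # average gradient
    step = np.linalg.solve(C, g)               # curvature-preconditioned step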

Contributors
Barron, Trevor Paul, Ben Amor, Heni, He, Jingrui, et al.
Created Date
2019

This article proposes a new information-based subdata selection (IBOSS) algorithm, the Squared Scaled Distance Algorithm (SSDA). It is based on the invariance of the determinant of the information matrix under orthogonal transformations, especially rotations. Extensive simulation results show that the new IBOSS algorithm retains the nice asymptotic properties of IBOSS and gives a larger determinant of the subdata information matrix. It has the same order of time complexity as the D-optimal IBOSS algorithm. However, it exploits vectorized calculation, avoiding for loops, and is approximately 6 times as fast as the D-optimal IBOSS algorithm in R. The robustness of SSDA …
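
The excerpt does not spell out the SSDA selection rule, so the Python sketch below is only a hedged illustration of distance-based subdata selection: it keeps the k rows with the largest squared standardized distance from the variable means, using vectorized NumPy operations rather than loops, and then evaluates the determinant of the subdata information matrix.

    # Illustrative subdata selection only; not the paper's exact SSDA rule.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100_000, 5))                  # full data, n >> k
    k = 1_000                                          # subdata size

    Z = (X - X.mean(axis=0)) / X.std(axis=0)           # scale each column
    score = (Z ** 2).sum(axis=1)                       # squared scaled distance per row
    subdata = X[np.argsort(score)[-k:]]                # keep the k most extreme rows

    info_det = np.linalg.det(subdata.T @ subdata)      # determinant of X'X for the subdata
    print(f"determinant of subdata information matrix: {info_det:.3e}")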

Contributors
Zheng, Yi, Stufken, John, Reiser, Mark, et al.
Created Date
2017

An anomaly is a deviation from the normal behavior of a system, and anomaly detection techniques try to identify unusual instances based on their deviation from the normal data. In this work, I propose a machine-learning algorithm, referred to as Artificial Contrasts, for anomaly detection in categorical data in which neither the dimension, the specific attributes involved, nor the form of the pattern is known a priori. I use the Random Forest (RF) technique as an effective learner for artificial contrasts. RF is a powerful algorithm that can handle relations of attributes in high dimensional data and detect anomalies while providing probability estimates for …
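
A minimal Python sketch of the artificial-contrast idea, under the assumption that contrasts are built by independently permuting each column (which destroys the joint structure); the forest size and scoring are illustrative choices, not the thesis's exact method.

    # Artificial-contrast sketch: label real rows 0 and permuted rows 1, train a
    # random forest to separate them, and score real rows that look "artificial".
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    real = rng.integers(0, 3, size=(500, 6))                 # categorical data coded as integers

    artificial = real.copy()
    for j in range(artificial.shape[1]):                     # permute each column independently
        artificial[:, j] = rng.permutation(artificial[:, j])

    X = np.vstack([real, artificial])
    y = np.r_[np.zeros(len(real)), np.ones(len(artificial))]

    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    anomaly_score = rf.predict_proba(real)[:, 1]             # P(looks artificial) per real row
    top_outliers = np.argsort(anomaly_score)[-10:]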

Contributors
Mousavi, Seyyedehnasim, Runger, George, Wu, Teresa, et al.
Created Date
2016

This thesis presents a meta-analysis of lead-free solder reliability. The qualitative analyses of the failure modes of lead-free solder under different stress tests, including drop, bend, thermal, and vibration tests, are discussed. The main cause of failure of lead-free solder is fatigue cracking, and the speed of propagation of the initial crack can differ across test conditions and solder materials. A quantitative analysis of the fatigue behavior of SAC lead-free solder under a thermal preconditioning process is conducted. This thesis presents a method for predicting the failure life of solder alloy by building …

Contributors
Xu, Xinyue, Pan, Rong, Montgomery, Douglas, et al.
Created Date
2014

In this work, I present a Bayesian inference computational framework for the analysis of widefield microscopy data that addresses three challenges: (1) counting and localizing stationary fluorescent molecules; (2) inferring a spatially-dependent effective fluorescence profile that describes the spatially-varying rate at which fluorescent molecules emit subsequently-detected photons (due to different illumination intensities or different local environments); and (3) inferring the camera gain. My general theoretical framework utilizes the Bayesian nonparametric Gaussian and beta-Bernoulli processes with a Markov chain Monte Carlo sampling scheme, which I further specify and implement for Total Internal Reflection Fluorescence (TIRF) microscopy data, benchmarking the method on …

Contributors
Wallgren, Ross Tod, Presse, Steve, Armbruster, Hans, et al.
Created Date
2019

This thesis examines the application of statistical signal processing approaches to data arising from surveys intended to measure psychological and sociological phenomena underpinning human social dynamics. The use of signal processing methods for analysis of signals arising from measurement of social, biological, and other non-traditional phenomena has been an important and growing area of signal processing research over the past decade. Here, we explore the application of statistical modeling and signal processing concepts to data obtained from the Global Group Relations Project, specifically to understand and quantify the effects and interactions of social psychological factors related to intergroup conflicts. We …

Contributors
Liu, Hui, Taylor, Thomas, Cochran, Douglas, et al.
Created Date
2012

Statistical model selection using the Akaike Information Criterion (AIC) and similar criteria is a useful tool for comparing multiple and non-nested models without the specification of a null model, which has made it increasingly popular in the natural and social sciences. Despite their common usage, model selection methods are not driven by a notion of statistical confidence, so their results entail an unknown degree of uncertainty. This paper introduces a general framework which extends notions of Type-I and Type-II error to model selection. A theoretical method for controlling Type-I error using Difference of Goodness of Fit (DGOF) …
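
A hedged Python sketch of the general idea, not the paper's exact DGOF procedure: compare AIC values, then use a parametric bootstrap under the simpler model to see how large a fit improvement the richer model achieves by chance, which is the ingredient needed to control a Type-I-error-like rate for selection.

    # AIC comparison plus a parametric bootstrap of the fit gain under the simple model.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = 1.0 + 0.5 * x + rng.normal(size=200)              # data follow the simple model

    def fit_loglik(y, X):
        """Least-squares fit; return Gaussian log-likelihood and parameter count."""
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return stats.norm.logpdf(resid, scale=resid.std()).sum(), X.shape[1] + 1

    X1 = np.column_stack([np.ones_like(x), x])            # simple model
    X2 = np.column_stack([np.ones_like(x), x, x**2])      # richer competitor
    ll1, k1 = fit_loglik(y, X1)
    ll2, k2 = fit_loglik(y, X2)
    print("AIC simple:", 2*k1 - 2*ll1, " AIC richer:", 2*k2 - 2*ll2)

    # How large a log-likelihood gain occurs by chance when the simple model is true?
    beta1, *_ = np.linalg.lstsq(X1, y, rcond=None)
    sigma1 = (y - X1 @ beta1).std()
    null_gains = [
        fit_loglik(yb, X2)[0] - fit_loglik(yb, X1)[0]
        for yb in (X1 @ beta1 + rng.normal(scale=sigma1, size=len(y)) for _ in range(500))
    ]
    print("observed gain:", ll2 - ll1, " 95% null gain:", np.quantile(null_gains, 0.95))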

Contributors
Cullan, Michael, Sterner, Beckett, Fricks, John, et al.
Created Date
2018

When analyzing longitudinal data it is essential to account both for the correlation inherent in the repeated measures of the responses and for the correlation arising from the feedback created between the responses at a particular time and the predictors at other times. A generalized method of moments (GMM) approach for estimating the coefficients in longitudinal data is presented. The appropriate and valid estimating equations associated with the time-dependent covariates are identified, thus providing substantial gains in efficiency over generalized estimating equations (GEE) with the independent working correlation. Identifying the estimating equations for computation is of utmost importance. …
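
For reference, a minimal Python sketch of the GEE baseline mentioned above, fit with statsmodels using the independent working correlation; the simulated variables and formula are illustrative, and the thesis's GMM estimator is not reproduced here.

    # GEE with an independence working correlation on simulated longitudinal data.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n_subjects, n_times = 100, 4
    df = pd.DataFrame({
        "subject": np.repeat(np.arange(n_subjects), n_times),
        "time": np.tile(np.arange(n_times), n_subjects),
    })
    df["x"] = rng.normal(size=len(df))                       # time-dependent covariate
    subj_effect = np.repeat(rng.normal(size=n_subjects), n_times)
    df["y"] = 1.0 + 0.8 * df["x"] + subj_effect + rng.normal(size=len(df))

    model = sm.GEE.from_formula(
        "y ~ x + time",
        groups="subject",
        data=df,
        cov_struct=sm.cov_struct.Independence(),
        family=sm.families.Gaussian(),
    )
    print(model.fit().summary())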

Contributors
Yin, Jianqiong, Wilson, Jeffrey Wilson, Reiser, Mark, et al.
Created Date
2012

This is a two-part thesis: Part 1 of this thesis determines the most dominant failure modes of field-aged photovoltaic (PV) modules using experimental data and statistical analysis, FMECA (Failure Mode, Effect, and Criticality Analysis). The failure and degradation modes of about 5900 crystalline-Si glass/polymer modules fielded for 6 to 16 years in three different photovoltaic (PV) power plants with different mounting systems under the hot-dry desert climate of Arizona are evaluated. A statistical reliability tool, FMECA, which uses the Risk Priority Number (RPN), is applied to each PV power plant to determine the dominant failure modes in the modules …
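
A minimal Python sketch of the RPN ranking used in FMECA, where RPN is the product of severity, occurrence, and detectability ratings; the failure modes and ratings below are made-up placeholders, not the thesis's field data.

    # RPN = severity x occurrence x detectability; rank failure modes by RPN.
    failure_modes = [
        # (name, severity, occurrence, detectability), each rated 1-10 (placeholders)
        ("encapsulant discoloration", 4, 7, 3),
        ("solder bond fatigue",       7, 5, 6),
        ("glass breakage",            8, 2, 2),
    ]

    ranked = sorted(
        ((name, s * o * d) for name, s, o, d in failure_modes),
        key=lambda item: item[1],
        reverse=True,
    )
    for name, rpn in ranked:
        print(f"{name}: RPN = {rpn}")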

Contributors
Shrestha, Sanjay Mohan, Tamizhmani, Govindsamy, Srinivasan, Devrajan, et al.
Created Date
2014

The objective of this thesis is to investigate the various types of energy end-uses to be expected in future high-efficiency single-family residences. For this purpose, this study analyzes monitored data from 14 houses in the 2013 Solar Decathlon competition and segregates the energy consumption patterns into various residential end-uses (such as lights, refrigerators, washing machines, ...). The analysis was not straightforward since these homes were operated according to schedules previously determined by the contest rules. The analysis approach allowed the isolation of the comfort energy use by the Heating, Ventilation and Air Conditioning (HVAC) systems. HVAC are the …

Contributors
Garkhail, Rahul, Reddy, T Agami, Bryan, Harvey, et al.
Created Date
2014

Given the importance of buildings as major consumers of resources worldwide, several organizations are working avidly to ensure the negative impacts of buildings are minimized. The U.S. Green Building Council's (USGBC) Leadership in Energy and Environmental Design (LEED) rating system is one such effort to recognize buildings that are designed to achieve a superior performance in several areas including energy consumption and indoor environmental quality (IEQ). The primary objectives of this study are to investigate the performance of LEED certified facilities in terms of energy consumption and occupant satisfaction with IEQ, and introduce a framework to assess the performance of …

Contributors
Chokor, Abbas, El Asmar, Mounir, Chong, Oswald, et al.
Created Date
2015

The inherent intermittency in solar energy resources poses challenges to scheduling generation, transmission, and distribution systems. Energy storage devices are often used to mitigate variability in renewable asset generation and provide a mechanism to shift renewable power between periods of the day. In the absence of storage, however, time series forecasting techniques can be used to estimate future solar resource availability to improve the accuracy of solar generator scheduling. Knowledge of future solar availability helps schedule solar generation at high penetration levels and assists with the selection and scheduling of spinning reserves. This study employs statistical techniques to improve the …
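
A small Python sketch of one common statistical forecasting approach, an ARIMA model from statsmodels, applied to a synthetic hourly irradiance series; the series and the model order are illustrative, not necessarily the study's chosen technique.

    # Fit an ARIMA model to a synthetic hourly irradiance series and forecast a day ahead.
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(0)
    hours = pd.date_range("2016-06-01", periods=24 * 14, freq="h")
    daily_cycle = np.clip(np.sin((hours.hour - 6) / 12 * np.pi), 0, None)   # zero at night
    irradiance = 900 * daily_cycle + rng.normal(scale=40, size=len(hours))

    series = pd.Series(irradiance, index=hours)
    fit = ARIMA(series, order=(2, 0, 1)).fit()
    forecast = fit.forecast(steps=24)            # next-day hourly forecast for scheduling
    print(forecast.head())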

Contributors
Soundiah Regunathan Rajasekaran, Dhiwaakar Purusothaman, Johnson, Nathan G, Karady, George G, et al.
Created Date
2016

Researchers are often interested in estimating interactions in multilevel models, but many researchers assume that the same procedures and interpretations for interactions in single-level models apply to multilevel models. However, estimating interactions in multilevel models is much more complex than in single-level models. Because uncentered (RAS) or grand mean centered (CGM) level-1 predictors in two-level models contain two sources of variability (i.e., within-cluster variability and between-cluster variability), interactions involving RAS or CGM level-1 predictors also contain more than one source of variability. In this Master’s thesis, I use simulations to demonstrate that ignoring the four sources of variability in a …
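
A short Python sketch of the centering choices at issue: grand-mean centering (CGM) leaves within- and between-cluster variability mixed in one predictor, whereas centering within cluster (CWC) plus the cluster means separates the two sources. Variable names and data are illustrative.

    # Contrast CGM with CWC for a level-1 predictor x nested in clusters.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "cluster": np.repeat(np.arange(30), 20),
        "x": rng.normal(size=600),
    })
    df["x"] += np.repeat(rng.normal(scale=2.0, size=30), 20)   # add between-cluster differences

    df["x_cgm"] = df["x"] - df["x"].mean()                     # grand-mean centered (mixed sources)
    cluster_means = df.groupby("cluster")["x"].transform("mean")
    df["x_cwc"] = df["x"] - cluster_means                      # within-cluster part only
    df["x_between"] = cluster_means - df["x"].mean()           # between-cluster part only

    # An interaction built from x_cgm mixes within- and between-cluster effects;
    # building it from x_cwc and x_between keeps the two sources distinct.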

Contributors
Mazza, Gina Lynn, Enders, Craig K., Aiken, Leona S., et al.
Created Date
2015

The Partition of Variance (POV) method is a straightforward way to identify large sources of variation in manufacturing systems. This method identifies the variance sources by estimating the variance of the means (between variance) and the mean of the variances (within variance). The project shows that the method correctly identifies the variance source when compared to the ANOVA method. The variance estimators deteriorate when varying degrees of non-normality are introduced through simulation; however, the POV method is shown to be a more stable measure of variance in the aggregate. The POV method also provides non-negative, stable estimates for interaction when …
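
A minimal Python sketch of the POV decomposition described above, on simulated data: the between component is the variance of the subgroup means and the within component is the mean of the subgroup variances. The grouping variable and data are placeholders.

    # Between variance = variance of subgroup means; within variance = mean of subgroup variances.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    loc = np.repeat([0.0, 0.4, 0.1, 0.6], 50)              # shifted means per tool
    df = pd.DataFrame({
        "tool": np.repeat(["A", "B", "C", "D"], 50),
        "measurement": rng.normal(loc=loc, scale=1.0),
    })

    group = df.groupby("tool")["measurement"]
    between_var = group.mean().var(ddof=1)                 # variance of the subgroup means
    within_var = group.var(ddof=1).mean()                  # mean of the subgroup variances
    print(f"between: {between_var:.3f}  within: {within_var:.3f}")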

Contributors
Little, David John, Borror, Connie, Montgomery, Douglas, et al.
Created Date
2015

The present thesis explores how statistical methods are conceptualized, used, and interpreted in quantitative Hispanic sociolinguistics in light of the group of statistical methods espoused by Kline (2013) and named by Cumming (2012) as the “new statistics.” The new statistics, as a conceptual framework, repudiates null hypothesis statistical testing (NHST) and replaces it with the ESCI method, or Effect Sizes and Confidence Intervals, as well as meta-analytic thinking. In this thesis, a descriptive review of 44 studies found in three academic journals over the last decade (2005–2015) showed that NHST retains a tight grip on most researchers. …

Contributors
Kidhardt, Paul Adrian, Cerron-Palomino, Alvaro, Gonzalez-Lopez, Veronica, et al.
Created Date
2015

This is a two-part thesis: Part 1 characterizes soiling losses using various techniques to understand the effect of soiling on photovoltaic modules. The higher the angle of incidence (AOI), the lower the photovoltaic (PV) module performance. Our research group has already reported the AOI investigation for cleaned modules of five different technologies with an air/glass interface. However, modules installed in the field invariably develop a soil layer of varying thickness depending on the site conditions, rainfall and tilt angle. The soiled module will have an air/soil/glass interface rather than an air/glass interface. This study investigates the …

Contributors
Boppana, Sravanthi, Tamizhmani, Govindasamy, Srinivasan, Devarajan, et al.
Created Date
2015

Distributed renewable energy generators now contribute a significant amount of energy to the grid. Consequently, the reliability adequacy of such generators depends on making accurate forecasts of the energy they produce. Power outputs of solar PV systems depend on the stochastic variation of environmental factors (solar irradiance, ambient temperature and wind speed) and random mechanical failures/repairs. Monte Carlo simulation, which is typically used to model such problems, becomes too computationally intensive, leading to simplifying state-space assumptions. Multi-state models for power system reliability offer higher flexibility in providing a description of system state evolution and an accurate …

Contributors
Kadloor, Nikhil, Kuitche, Joseph, Pan, Rong, et al.
Created Date
2017

Photovoltaic (PV) modules are typically rated at three test conditions: STC (standard test conditions), NOCT (nominal operating cell temperature) and Low E (low irradiance). The current thesis deals with the power rating of PV modules at twenty-three test conditions as per the recent International Electrotechnical Commission (IEC) standard IEC 61853-1. In the current research, an automation software tool developed by a previous researcher of ASU-PRL (ASU Photovoltaic Reliability Laboratory) is validated at various stages. Also in the current research, the power rating of PV modules for four different manufacturers is carried out according to IEC …

Contributors
Vemula, Meena Gupta, Tamizhmani, Govindasamy, Macia, Narcio F., et al.
Created Date
2012

Due to the large data resources generated by online educational applications, Educational Data Mining (EDM) has improved learning in different ways: student visualization, recommendations for students, student modeling, grouping students, etc. Many programming assignments have features such as automated submission and test cases that verify correctness, but few studies have compared different statistical techniques with the latest frameworks or interpreted models in a unified approach. In this thesis, several data mining algorithms have been applied to analyze students’ code assignment submission data from a real classroom study. The goal of this work is to explore and predict students’ …

Contributors
Tian, Wenbo, Hsiao, Ihan, Bazzi, Rida, et al.
Created Date
2019

Obtaining high-quality experimental designs to optimize statistical efficiency and data quality is quite challenging for functional magnetic resonance imaging (fMRI). The primary fMRI design issue is the selection of the best sequence of stimuli based on a statistically meaningful optimality criterion. Some previous studies have provided some guidance and powerful computational tools for obtaining good fMRI designs. However, these results are mainly for basic experimental settings with simple statistical models. In this work, a type of modern fMRI experiments is considered, in which the design matrix of the statistical model depends not only on the selected design, but also …

Contributors
Zhou, Lin, Kao, Ming-hung, Reiser, Mark, et al.
Created Date
2014

In many classification problems data samples cannot be collected easily, for example in drug trials, biological experiments, and studies on cancer patients. In many situations the data set size is small and there are many outliers. When classifying such data, for example cancer versus normal patients, the consequences of misclassification are probably more important than for any other data type, because the data point could be a cancer patient, or the classification decision could help determine what gene might be over-expressed and perhaps a cause of cancer. These misclassifications are typically higher in the presence of outlier data points. The aim of …

Contributors
Gupta, Sidharth, Kim, Seungchan, Welfert, Bruno, et al.
Created Date
2011

Sparse learning is a technique in machine learning for feature selection and dimensionality reduction, used to find a sparse set of the most relevant features. In any machine learning problem there is a considerable amount of irrelevant information, and separating relevant information from irrelevant information has been a topic of focus. In supervised learning such as regression, the data consist of many features, and only a subset of the features may be responsible for the result. Also, the features might have special structural requirements, which introduces additional complexity for feature selection. The sparse learning package provides a set of algorithms for …
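
A short Python sketch of sparse feature selection with an L1-penalized regression; scikit-learn's Lasso is used as a stand-in, since the thesis's own sparse learning package and its structured-sparsity algorithms are not shown here.

    # L1-penalized regression drives most coefficients to exactly zero,
    # selecting a sparse subset of relevant features.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))
    beta = np.zeros(50)
    beta[[3, 17, 42]] = [2.0, -1.5, 1.0]             # only 3 of 50 features matter
    y = X @ beta + rng.normal(scale=0.5, size=200)

    model = Lasso(alpha=0.1).fit(X, y)
    selected = np.flatnonzero(model.coef_)           # indices of nonzero coefficients
    print("selected features:", selected)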

Contributors
Thulasiram, Ramesh L., Ye, Jieping, Xue, Guoliang, et al.
Created Date
2011

Tracking targets in the presence of clutter is inevitable, and presents many challenges. Additionally, rapid, drastic changes in clutter density between different environments or scenarios can make it even more difficult for tracking algorithms to adapt. A novel approach to target tracking in such dynamic clutter environments is proposed using a particle filter (PF) integrated with Interacting Multiple Models (IMMs) to compensate and adapt to the transition between different clutter densities. This model was implemented for the case of a monostatic sensor tracking a single target moving with constant velocity along a two-dimensional trajectory, which crossed between regions of drastically …
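
A compact Python sketch of a bootstrap particle filter for a single constant-velocity target in two dimensions, included only to illustrate the underlying PF machinery; the IMM layer and the clutter model described above are omitted.

    # Bootstrap particle filter for a 2-D constant-velocity target (no clutter, no IMM).
    import numpy as np

    rng = np.random.default_rng(0)
    dt, steps, n_particles = 1.0, 30, 500
    F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                  [0, 0, 1, 0], [0, 0, 0, 1]])                 # state: [x, y, vx, vy]
    meas_std, proc_std = 0.5, 0.05

    truth = np.array([0.0, 0.0, 1.0, 0.5])
    particles = rng.normal(truth, 1.0, size=(n_particles, 4))
    weights = np.ones(n_particles) / n_particles

    for _ in range(steps):
        truth = F @ truth
        z = truth[:2] + rng.normal(scale=meas_std, size=2)     # noisy position measurement

        particles = particles @ F.T + rng.normal(scale=proc_std, size=particles.shape)
        d2 = ((particles[:, :2] - z) ** 2).sum(axis=1)
        weights *= np.exp(-0.5 * d2 / meas_std**2)             # Gaussian likelihood update
        weights /= weights.sum()

        idx = rng.choice(n_particles, size=n_particles, p=weights)   # resample each step
        particles, weights = particles[idx], np.ones(n_particles) / n_particles
        estimate = particles[:, :2].mean(axis=0)

    print("final position estimate:", estimate, "truth:", truth[:2])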

Contributors
Dutson, Karl J, Papandreou-Suppappola, Antonia, Kovvali, Narayan, et al.
Created Date
2015

The operating temperature of photovoltaic (PV) modules is affected by external factors such as irradiance, wind speed and ambient temperature as well as internal factors like material properties and design properties. These factors can make a difference in the operating temperatures between cells within a module and between modules within a plant. This is a three-part thesis. Part 1 investigates the behavior of temperature distribution of PV cells within a module through outdoor temperature monitoring under various operating conditions (Pmax, Voc and Isc) and examines deviation in the temperature coefficient values pertaining to this temperature variation. ANOVA, a statistical tool, …
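
A minimal Python sketch of the kind of ANOVA comparison implied above, using simulated cell temperatures for three module locations; the readings are placeholders, not the thesis's monitored data.

    # One-way ANOVA: do mean operating temperatures differ by cell location?
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    center_cells = rng.normal(loc=58.0, scale=1.5, size=40)   # deg C (simulated)
    edge_cells   = rng.normal(loc=55.5, scale=1.5, size=40)
    corner_cells = rng.normal(loc=54.0, scale=1.5, size=40)

    f_stat, p_value = stats.f_oneway(center_cells, edge_cells, corner_cells)
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}")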

Contributors
Pavgi, Ashwini, Tamizhmani, Govindasamy, Phelan, Patrick, et al.
Created Date
2016

The purpose of this study was to examine under which conditions "good" data characteristics can compensate for "poor" characteristics in Latent Class Analysis (LCA), as well as to set forth guidelines regarding the minimum sample size and ideal number and quality of indicators. In particular, we studied the extent to which including a larger number of high-quality indicators can compensate for a small sample size in LCA. The results suggest that in general, larger sample size, more indicators, higher quality of indicators, and a larger covariate effect correspond to more converged and proper replications, as well as fewer boundary estimates …

Contributors
Wurpts, Ingrid Carlson, Geiser, Christian, Aiken, Leona, et al.
Created Date
2012

Methods to test hypotheses of mediated effects in the pretest-posttest control group design are understudied in the behavioral sciences (MacKinnon, 2008). Because many studies aim to answer questions about mediating processes in the pretest-posttest control group design, there is a need to determine which model is most appropriate to test hypotheses about mediating processes and what happens to estimates of the mediated effect when model assumptions are violated in this design. The goal of this project was to outline estimator characteristics of four longitudinal mediation models and the cross-sectional mediation model. Models were compared on type 1 error rates, statistical …
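
A minimal Python sketch of the product-of-coefficients mediated effect (a*b) in a simple treatment-mediator-outcome setup; the four longitudinal models compared in the thesis are not reproduced here, and the simulated data are placeholders.

    # Mediated effect via product of coefficients: a (treatment -> mediator)
    # times b (mediator -> outcome, controlling for treatment).
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 300
    treat = rng.integers(0, 2, n)                                  # randomized condition
    mediator = 0.5 * treat + rng.normal(size=n)                    # a-path
    outcome = 0.4 * mediator + 0.1 * treat + rng.normal(size=n)    # b-path plus direct effect

    a = sm.OLS(mediator, sm.add_constant(treat)).fit().params[1]
    Xb = sm.add_constant(np.column_stack([mediator, treat]))
    b = sm.OLS(outcome, Xb).fit().params[1]
    print("mediated effect a*b =", a * b)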

Contributors
Valente, Matthew, MacKinnon, David, West, Stephen, et al.
Created Date
2015

A simulation study was conducted to explore the influence of partial loading invariance and partial intercept invariance on the latent mean comparison of the second-order factor within a higher-order confirmatory factor analysis (CFA) model. Noninvariant loadings or intercepts were generated to be at one of the two levels or both levels for a second-order CFA model. The numbers and directions of differences in noninvariant loadings or intercepts were also manipulated, along with total sample size and effect size of the second-order factor mean difference. Data were analyzed using correct and incorrect specifications of noninvariant loadings and intercepts. Results summarized across …

Contributors
Liu, Yixing, Thompson, Marilyn, Green, Samuel, et al.
Created Date
2016

Currently, there is a clear gap in the missing data literature for three-level models. To date, the literature has only focused on the theoretical and algorithmic work required to implement three-level imputation using the joint model (JM) method of imputation, leaving relatively little work done on the fully conditional specification (FCS) method. Moreover, the literature lacks any methodological evaluation of three-level imputation. Thus, this thesis serves two purposes: (1) to develop an algorithm in order to implement FCS in the context of a three-level model and (2) to evaluate both imputation methods. The simulation investigated a random intercept model under both …
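
A single-level Python illustration of the FCS (chained equations) idea, in which each incomplete variable is imputed in turn from the others and the cycle repeats; scikit-learn's IterativeImputer is used as a stand-in, and the three-level FCS algorithm developed in the thesis is not reproduced.

    # Single-level chained-equations imputation on simulated data with 20% missingness.
    import numpy as np
    from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates the estimator)
    from sklearn.impute import IterativeImputer

    rng = np.random.default_rng(0)
    cov = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]]
    X = rng.multivariate_normal([0, 0, 0], cov, size=500)
    mask = rng.random(X.shape) < 0.2                 # values missing completely at random
    X_miss = np.where(mask, np.nan, X)

    imputed = IterativeImputer(max_iter=10, random_state=0).fit_transform(X_miss)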

Contributors
Keller, Brian Tinnell, Enders, Craig K, Grimm, Kevin J, et al.
Created Date
2015

Threshold regression is used to model regime switching dynamics where the effects of the explanatory variables in predicting the response variable depend on whether a certain threshold has been crossed. When regime-switching dynamics are present, new estimation problems arise related to estimating the value of the threshold. Conventional methods utilize an iterative search procedure, seeking to minimize the sum of squares criterion. However, when unnecessary variables are included in the model or certain variables drop out of the model depending on the regime, this method may have high variability. This paper proposes Lasso-type methods as an alternative to ordinary least …
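
A short Python sketch of threshold regression estimated by grid search, with an L1-penalized fit in each regime; the data and the use of plain scikit-learn Lasso are illustrative, not the thesis's specific Lasso-type estimators.

    # Grid search over candidate thresholds; fit a Lasso in each regime and keep
    # the threshold with the smallest total sum of squared errors.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n = 400
    X = rng.normal(size=(n, 5))
    q = rng.normal(size=n)                           # threshold (switching) variable
    true_tau = 0.3
    y = np.where(q <= true_tau,
                 X @ np.array([2.0, 0, 0, 0, 0]),
                 X @ np.array([0, 0, -1.5, 0, 0])) + rng.normal(scale=0.3, size=n)

    best = (np.inf, None)
    for tau in np.quantile(q, np.linspace(0.15, 0.85, 40)):   # candidate thresholds
        sse = 0.0
        for regime in (q <= tau, q > tau):
            model = Lasso(alpha=0.05).fit(X[regime], y[regime])
            sse += ((y[regime] - model.predict(X[regime])) ** 2).sum()
        if sse < best[0]:
            best = (sse, tau)
    print("estimated threshold:", best[1])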

Contributors
van Schaijik, Maria, Kamarianakis, Yiannis, et al.
Created Date
2015