Skip to main content

ASU Electronic Theses and Dissertations


This collection includes most of the ASU Theses and Dissertations from 2011 to present. ASU Theses and Dissertations are available in downloadable PDF format; however, a small percentage of items are under embargo. Information about the dissertations/theses includes degree information, committee members, an abstract, supporting data or media.

In addition to the electronic theses found in the ASU Digital Repository, ASU Theses and Dissertations can be found in the ASU Library Catalog.

Dissertations and Theses granted by Arizona State University are archived and made available through a joint effort of the ASU Graduate College and the ASU Libraries. For more information or questions about this collection contact or visit the Digital Repository ETD Library Guide or contact the ASU Graduate College at gradformat@asu.edu.


Subject
Date Range
2010 2019


In mixture-process variable experiments, it is common that the number of runs is greater than in mixture-only or process-variable experiments. These experiments have to estimate the parameters from the mixture components, process variables, and interactions of both variables. In some of these experiments there are variables that are hard to change or cannot be controlled under normal operating conditions. These situations often prohibit a complete randomization for the experimental runs due to practical and economical considerations. Furthermore, the process variables can be categorized into two types: variables that are controllable and directly affect the response, and variables that are uncontrollable …

Contributors
Cho, Tae-Yeon, Montgomery, Douglas C, Borror, Connie M, et al.
Created Date
2010

Public health surveillance is a special case of the general problem where counts (or rates) of events are monitored for changes. Modern data complements event counts with many additional measurements (such as geographic, demographic, and others) that comprise high-dimensional covariates. This leads to an important challenge to detect a change that only occurs within a region, initially unspecified, defined by these covariates. Current methods are typically limited to spatial and/or temporal covariate information and often fail to use all the information available in modern data that can be paramount in unveiling these subtle changes. Additional complexities associated with modern health …

Contributors
Davila, Saylisse, Runger, George C, Montgomery, Douglas C, et al.
Created Date
2010

Although the issue of factorial invariance has received increasing attention in the literature, the focus is typically on differences in factor structure across groups that are directly observed, such as those denoted by sex or ethnicity. While establishing factorial invariance across observed groups is a requisite step in making meaningful cross-group comparisons, failure to attend to possible sources of latent class heterogeneity in the form of class-based differences in factor structure has the potential to compromise conclusions with respect to observed groups and may result in misguided attempts at instrument development and theory refinement. The present studies examined the sensitivity …

Contributors
Blackwell, Kimberly Carol, Millsap, Roger E, Aiken, Leona S, et al.
Created Date
2011

By the von Neumann min-max theorem, a two person zero sum game with finitely many pure strategies has a unique value for each player (summing to zero) and each player has a non-empty set of optimal mixed strategies. If the payoffs are independent, identically distributed (iid) uniform (0,1) random variables, then with probability one, both players have unique optimal mixed strategies utilizing the same number of pure strategies with positive probability (Jonasson 2004). The pure strategies with positive probability in the unique optimal mixed strategies are called saddle squares. In 1957, Goldman evaluated the probability of a saddle point (a …

Contributors
Manley, Michael, Kadell, Kevin W. J., Kao, Ming-Hung, et al.
Created Date
2011

In many classication problems data samples cannot be collected easily, example in drug trials, biological experiments and study on cancer patients. In many situations the data set size is small and there are many outliers. When classifying such data, example cancer vs normal patients the consequences of mis-classication are probably more important than any other data type, because the data point could be a cancer patient or the classication decision could help determine what gene might be over expressed and perhaps a cause of cancer. These mis-classications are typically higher in the presence of outlier data points. The aim of …

Contributors
Gupta, Sidharth, Kim, Seungchan, Welfert, Bruno, et al.
Created Date
2011

Mostly, manufacturing tolerance charts are used these days for manufacturing tolerance transfer but these have the limitation of being one dimensional only. Some research has been undertaken for the three dimensional geometric tolerances but it is too theoretical and yet to be ready for operator level usage. In this research, a new three dimensional model for tolerance transfer in manufacturing process planning is presented that is user friendly in the sense that it is built upon the Coordinate Measuring Machine (CMM) readings that are readily available in any decent manufacturing facility. This model can take care of datum reference change …

Contributors
Khan, M Nadeem Shafi, Phelan, Patrick E, Montgomery, Douglas, et al.
Created Date
2011

Yield is a key process performance characteristic in the capital-intensive semiconductor fabrication process. In an industry where machines cost millions of dollars and cycle times are a number of months, predicting and optimizing yield are critical to process improvement, customer satisfaction, and financial success. Semiconductor yield modeling is essential to identifying processing issues, improving quality, and meeting customer demand in the industry. However, the complicated fabrication process, the massive amount of data collected, and the number of models available make yield modeling a complex and challenging task. This work presents modeling strategies to forecast yield using generalized linear models (GLMs) …

Contributors
Krueger, Dana Cheree, Montgomery, Douglas C., Fowler, John, et al.
Created Date
2011

Sparse learning is a technique in machine learning for feature selection and dimensionality reduction, to find a sparse set of the most relevant features. In any machine learning problem, there is a considerable amount of irrelevant information, and separating relevant information from the irrelevant information has been a topic of focus. In supervised learning like regression, the data consists of many features and only a subset of the features may be responsible for the result. Also, the features might require special structural requirements, which introduces additional complexity for feature selection. The sparse learning package, provides a set of algorithms for …

Contributors
Thulasiram, Ramesh L., Ye, Jieping, Xue, Guoliang, et al.
Created Date
2011

Real-world environments are characterized by non-stationary and continuously evolving data. Learning a classification model on this data would require a framework that is able to adapt itself to newer circumstances. Under such circumstances, transfer learning has come to be a dependable methodology for improving classification performance with reduced training costs and without the need for explicit relearning from scratch. In this thesis, a novel instance transfer technique that adapts a "Cost-sensitive" variation of AdaBoost is presented. The method capitalizes on the theoretical and functional properties of AdaBoost to selectively reuse outdated training instances obtained from a "source" domain to effectively …

Contributors
Venkatesan, Ashok, Panchanathan, Sethuraman, Li, Baoxin, et al.
Created Date
2011

It is common in the analysis of data to provide a goodness-of-fit test to assess the performance of a model. In the analysis of contingency tables, goodness-of-fit statistics are frequently employed when modeling social science, educational or psychological data where the interest is often directed at investigating the association among multi-categorical variables. Pearson's chi-squared statistic is well-known in goodness-of-fit testing, but it is sometimes considered to produce an omnibus test as it gives little guidance to the source of poor fit once the null hypothesis is rejected. However, its components can provide powerful directional tests. In this dissertation, orthogonal components …

Contributors
Milovanovic, Jelena, Young, Dennis, Reiser, Mark, et al.
Created Date
2011

Designing studies that use latent growth modeling to investigate change over time calls for optimal approaches for conducting power analysis for a priori determination of required sample size. This investigation (1) studied the impacts of variations in specified parameters, design features, and model misspecification in simulation-based power analyses and (2) compared power estimates across three common power analysis techniques: the Monte Carlo method; the Satorra-Saris method; and the method developed by MacCallum, Browne, and Cai (MBC). Choice of sample size, effect size, and slope variance parameters markedly influenced power estimates; however, level-1 error variance and number of repeated measures (3 …

Contributors
Van Vleet, Bethany L., Thompson, Marilyn S., Green, Samuel B., et al.
Created Date
2011

Estimating cointegrating relationships requires specific techniques. Canonical correlations are used to determine the rank and space of the cointegrating matrix. The vectors used to transform the data into canonical variables have an eigenvector representation, and the associated canonical correlations have an eigenvalue representation. The number of cointegrating relations is chosen based upon a theoretical difference in the convergence rates of the eignevalues. The number of cointegrating relations is consistently estimated using a threshold function which places a lower bound on the eigenvalues associated with cointegrating relations and an upper bound on the eigenvalues on the eigenvalues not associated with cointegrating …

Contributors
Nowak, Adam Daniel, Ahn, Seung C, Liu, Crocker, et al.
Created Date
2012

Temporal data are increasingly prevalent and important in analytics. Time series (TS) data are chronological sequences of observations and an important class of temporal data. Fields such as medicine, finance, learning science and multimedia naturally generate TS data. Each series provide a high-dimensional data vector that challenges the learning of the relevant patterns This dissertation proposes TS representations and methods for supervised TS analysis. The approaches combine new representations that handle translations and dilations of patterns with bag-of-features strategies and tree-based ensemble learning. This provides flexibility in handling time-warped patterns in a computationally efficient way. The ensemble learners provide a …

Contributors
Baydogan, Mustafa Gokce, Runger, George C, Atkinson, Robert, et al.
Created Date
2012

This dissertation involves three problems that are all related by the use of the singular value decomposition (SVD) or generalized singular value decomposition (GSVD). The specific problems are (i) derivation of a generalized singular value expansion (GSVE), (ii) analysis of the properties of the chi-squared method for regularization parameter selection in the case of nonnormal data and (iii) formulation of a partial canonical correlation concept for continuous time stochastic processes. The finite dimensional SVD has an infinite dimensional generalization to compact operators. However, the form of the finite dimensional GSVD developed in, e.g., Van Loan does not extend directly to …

Contributors
Huang, Qing, Eubank, Randall, Renaut, Rosemary, et al.
Created Date
2012

When analyzing longitudinal data it is essential to account both for the correlation inherent from the repeated measures of the responses as well as the correlation realized on account of the feedback created between the responses at a particular time and the predictors at other times. A generalized method of moments (GMM) for estimating the coefficients in longitudinal data is presented. The appropriate and valid estimating equations associated with the time-dependent covariates are identified, thus providing substantial gains in efficiency over generalized estimating equations (GEE) with the independent working correlation. Identifying the estimating equations for computation is of utmost importance. …

Contributors
Yin, Jianqiong, Wilson, Jeffrey Wilson, Reiser, Mark, et al.
Created Date
2012

A least total area of triangle method was proposed by Teissier (1948) for fitting a straight line to data from a pair of variables without treating either variable as the dependent variable while allowing each of the variables to have measurement errors. This method is commonly called Reduced Major Axis (RMA) regression and is often used instead of Ordinary Least Squares (OLS) regression. Results for confidence intervals, hypothesis testing and asymptotic distributions of coefficient estimates in the bivariate case are reviewed. A generalization of RMA to more than two variables for fitting a plane to data is obtained by minimizing …

Contributors
Li, Jingjin, Young, Dennis, Eubank, Randall, et al.
Created Date
2012

This thesis examines the application of statistical signal processing approaches to data arising from surveys intended to measure psychological and sociological phenomena underpinning human social dynamics. The use of signal processing methods for analysis of signals arising from measurement of social, biological, and other non-traditional phenomena has been an important and growing area of signal processing research over the past decade. Here, we explore the application of statistical modeling and signal processing concepts to data obtained from the Global Group Relations Project, specifically to understand and quantify the effects and interactions of social psychological factors related to intergroup conflicts. We …

Contributors
Liu, Hui, Taylor, Thomas, Cochran, Douglas, et al.
Created Date
2012

Photovoltaic (PV) modules are typically rated at three test conditions: STC (standard test conditions), NOCT (nominal operating cell temperature) and Low E (low irradiance). The current thesis deals with the power rating of PV modules at twenty-three test conditions as per the recent International Electrotechnical Commission (IEC) standard of IEC 61853 – 1. In the current research, an automation software tool developed by a previous researcher of ASU – PRL (ASU Photovoltaic Reliability Laboratory) is validated at various stages. Also in the current research, the power rating of PV modules for four different manufacturers is carried out according to IEC …

Contributors
Vemula, Meena Gupta, Tamizhmani, Govindasamy, Macia, Narcio F., et al.
Created Date
2012

The living world we inhabit and observe is extraordinarily complex. From the perspective of a person analyzing data about the living world, complexity is most commonly encountered in two forms: 1) in the sheer size of the datasets that must be analyzed and the physical number of mathematical computations necessary to obtain an answer and 2) in the underlying structure of the data, which does not conform to classical normal theory statistical assumptions and includes clustering and unobserved latent constructs. Until recently, the methods and tools necessary to effectively address the complexity of biomedical data were not ordinarily available. The …

Contributors
Brown, Justin Reed, Dinu, Valentin, Johnson, William, et al.
Created Date
2012

The purpose of this study was to examine under which conditions "good" data characteristics can compensate for "poor" characteristics in Latent Class Analysis (LCA), as well as to set forth guidelines regarding the minimum sample size and ideal number and quality of indicators. In particular, we studied to which extent including a larger number of high quality indicators can compensate for a small sample size in LCA. The results suggest that in general, larger sample size, more indicators, higher quality of indicators, and a larger covariate effect correspond to more converged and proper replications, as well as fewer boundary estimates …

Contributors
Wurpts, Ingrid Carlson, Geiser, Christian, Aiken, Leona, et al.
Created Date
2012