ASU Electronic Theses and Dissertations
- 1 English
- 1 Public
Recent efforts in data cleaning have focused mostly on problems like data deduplication, record matching, and data standardization; few of these focus on fixing incorrect attribute values in tuples. Correcting values in tuples is typically performed by a minimum cost repair of tuples that violate static constraints like CFDs (which have to be provided by domain experts, or learned from a clean sample of the database). In this thesis, I provide a method for correcting individual attribute values in a structured database using a Bayesian generative model and a statistical error model learned from the noisy database directly. I thus …
- De, Sushovan, Kambhampati, Subbarao, Chen, Yi, et al.
- Created Date