Data Cleansing In Genomics

Goal. Typical data cleaning tasks include record matching, deduplication, and column segmentation, which often need logic that goes beyond traditional relational queries.
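As a rough illustration of two of the tasks named above, the sketch below deduplicates records by a normalized key and segments a combined column into parts. The record layout, the field names `sample` and `location`, and the helper names are all hypothetical, chosen only for this example; real record matching usually needs fuzzier comparison than the exact normalized match used here.

```python
import re


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def segment_location(value):
    """Split a combined 'chr7:55019017' column into (chromosome, position)."""
    chrom, pos = value.split(":")
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    return chrom, int(pos)


def deduplicate(records):
    """Keep the first record seen for each normalized sample name."""
    seen = {}
    for rec in records:
        seen.setdefault(normalize(rec["sample"]), rec)
    return list(seen.values())


# Hypothetical input: the first two rows name the same sample differently.
records = [
    {"sample": "Patient-001", "location": "chr7:55019017"},
    {"sample": "patient 001", "location": "chr7:55019017"},
    {"sample": "Patient-002", "location": "chrX:1200"},
]

unique = deduplicate(records)
segmented = [segment_location(r["location"]) for r in unique]
```

Normalizing before comparison is what lets the two spellings of the first sample collapse into one record; this is the kind of logic that is awkward to express as a plain relational equality join.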

On the Importance of Data Cleansing and Pre-processing for Genomic Studies. Raymond Ng, Computer Science, UBC (ICapture and BC Cancer Research). My Key Genomic Projects.

The objectives of this paper are to understand the relationship between genotype files and annotation files and to prepare marker data for analysis.
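One way to picture the relationship between the two kinds of files is as a join on marker identifiers. The sketch below assumes a hypothetical layout that the source does not specify: a genotype file with one row per sample and one column per marker, and an annotation file with one row per marker. The marker IDs, column names, and values are invented for illustration.

```python
import csv
import io

# Hypothetical genotype file: one row per sample, one column per marker.
GENOTYPES = """sample,rs0001,rs0002,rs0003
S1,AA,AG,GG
S2,AG,GG,NA
"""

# Hypothetical annotation file: one row per marker.
ANNOTATION = """marker,chromosome,position
rs0001,1,1050
rs0002,1,20480
"""


def load_annotation(text):
    """Index annotation rows by marker ID."""
    return {row["marker"]: row for row in csv.DictReader(io.StringIO(text))}


def split_markers(genotype_text, annotation):
    """Return genotype markers with and without an annotation entry."""
    reader = csv.DictReader(io.StringIO(genotype_text))
    markers = [col for col in reader.fieldnames if col != "sample"]
    annotated = [m for m in markers if m in annotation]
    missing = [m for m in markers if m not in annotation]
    return annotated, missing


annotation = load_annotation(ANNOTATION)
annotated, missing = split_markers(GENOTYPES, annotation)
```

Here `missing` lists the markers that appear in the genotype data but have no annotation, which would typically be reviewed or dropped before analysis.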

