CSC411: Machine Learning and Data Mining. Tutorial 10, March 23rd, 2007. University of Toronto (Mississauga Campus)
Data Mining and Machine Learning Applications. Case 1: To improve its business, a national supermarket chain starts a project to keep track of its customers. Regular customers can collect points or receive discounts by using their store card on each purchase. Temporary customers, who are not members of the store, all share a single temporary store card. The supermarket is now hiring a data mining analyst to help with this project. Question: If you are the data mining analyst, how will you design the project, and what data will you need?
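One possible starting point for Case 1, offered only as a minimal sketch: log every purchase against the card that was swiped, then separate analyses that need an individual customer from those that do not. The record layout, the card ID `TEMP`, and the function names below are all hypothetical and are not part of the tutorial. The point the sketch illustrates is that purchases on the shared temporary card can still feed basket-level analysis, but must be excluded from per-customer histories because that card mixes many people.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical ID of the single card shared by all temporary customers.
TEMP_CARD_ID = "TEMP"

# Hypothetical transaction log: (card_id, basket_id, items bought).
transactions = [
    ("C001", 1, ["milk", "bread", "eggs"]),
    ("C002", 2, ["milk", "bread"]),
    ("TEMP", 3, ["bread", "eggs"]),
    ("C001", 4, ["milk", "eggs"]),
    ("TEMP", 5, ["milk", "bread"]),
]

def pair_counts(records):
    """Count how many baskets contain each unordered pair of items.

    Uses every basket, including those on the temporary card, because
    co-occurrence within one basket does not depend on who the buyer is.
    """
    counts = Counter()
    for _card, _basket, items in records:
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts

def baskets_per_member(records):
    """Map each member card to its list of basket IDs.

    Skips the temporary card, since it does not identify one customer.
    """
    history = defaultdict(list)
    for card, basket, _items in records:
        if card != TEMP_CARD_ID:
            history[card].append(basket)
    return dict(history)

print(pair_counts(transactions).most_common(3))
print(baskets_per_member(transactions))
```

In a real project the same split would apply to any per-customer question, such as points balances or purchase frequency.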
Data Mining and Machine Learning Applications. Case 2: Researchers have found that individuals respond differently to the same drug treatment. For example, two smokers may have the same smoking history, yet one is diagnosed with lung cancer and the other is not. Single Nucleotide Polymorphisms (SNPs) are an important resource for explaining these phenomena. One possible project is to study the association between SNPs and DNA sequences. Question: If you are the researcher, how will you design this project?
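As one hedged illustration for Case 2, assume the project reduces to asking whether carrying a particular SNP variant is associated with an outcome such as disease status. That framing, the counts, and the function names below are assumptions made for this sketch and do not come from the tutorial. A simple first check is a 2x2 contingency table with a chi-square statistic and an odds ratio.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table

                   cases   controls
        carriers     a        b
        non-carr.    c        d
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / denom

def odds_ratio(a, b, c, d):
    """Odds of being a case among carriers relative to non-carriers."""
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)

# Hypothetical counts: 40 carriers (30 cases), 60 non-carriers (20 cases).
a, b, c, d = 30, 10, 20, 40
stat = chi_square_2x2(a, b, c, d)

# 3.841 is the chi-square critical value for 1 degree of freedom at the 0.05 level.
print(f"chi-square = {stat:.2f}, significant at 0.05: {stat > 3.841}")
print(f"odds ratio = {odds_ratio(a, b, c, d):.1f}")
```

A real study would test many SNPs at once, so it would also need a correction for multiple comparisons and attention to confounders such as smoking history.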
Cancer: Different Fates. This slide and the following slides on SNPs are copied from the National Cancer Institute, Understanding Cancer series: Genetic Variation (SNPs).
SNPs May Be the Solution. Figure labels: SNPs A, SNPs B, SNPs C, SNPs D.
What Is Variation in the Genome? Common sequence variations: polymorphism, deletions, translocations, insertions. Figure label: Chromosome.
SNPs Are the Most Common Type of Variation. Figure labels: common sequence (most of the population); variant sequence (at least 1 percent of the population); SNP site: G to C.
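The 1 percent threshold on this slide can be made concrete with a small sketch. It assumes the sequences are already aligned to the same length and uses a made-up toy population; the function name `find_snp_sites` and its parameters are hypothetical. At each position it takes the most frequent base as the common one and reports any other base carried by at least the given fraction of sequences.

```python
from collections import Counter

def find_snp_sites(sequences, min_variant_freq=0.01):
    """Return {position: (common_base, {variant_base: frequency})}.

    A position is reported only if some base other than the most common
    one appears in at least `min_variant_freq` of the sequences.
    """
    if not sequences:
        return {}
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    n = len(sequences)
    sites = {}
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        common_base, _ = counts.most_common(1)[0]
        variants = {
            base: count / n
            for base, count in counts.items()
            if base != common_base and count / n >= min_variant_freq
        }
        if variants:
            sites[pos] = (common_base, variants)
    return sites

# Toy population: 97 people carry the common sequence, 3 carry C instead of G.
population = ["ATGCA"] * 97 + ["ATCCA"] * 3
print(find_snp_sites(population))
```

A change seen in only one person out of a thousand would fall below the threshold and would not be reported as a SNP by this sketch.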
The Genome Contains Genes. Figure labels: Gene 1 (coding region), Protein 1; noncoding region; Gene 2 (coding region), Protein 2.
Variation in the Human Genome. Figure labels: Person 1, Person 2. Legend: (marker) = variations in DNA.
Variations Causing No Changes. Legend: (marker) = variations in DNA that cause no changes.
Variations Causing Harmless Changes. Legend: (marker) = variations in DNA that cause harmless changes.
Variations Causing Harmful Changes. Legend: (marker) = variation in DNA that causes harmful change. Figure labels: No Disease, Hemophilia.
Variations Causing Latent Changes. Legend: (marker) = variations in DNA that cause latent effects. Figure label: Many years later.