1 :NEUROPSYCHIATRIC GENETICS [BIOSTATISTICS|BIOINFORMATICS] CORE
  BIOSTATISTIC/BIOINFORMATIC TOOLS FOR GENETICS DATA: DATA MANAGEMENT AND ANALYSIS
  RICHARD ANNEY, NEUROPSYCHIATRIC GENETICS RESEARCH GROUP
  WORKSHEET, TUTORIALS AND SLIDES AVAILABLE ON P:\Personal Folders\anneyr\stata9\talk
  http://www.medicine.tcd.ie/psychiatry/research/neuropsychiatry/

2 Overview

3 STATA9
  A STATISTICAL SOFTWARE PACKAGE
  LESS PRETTY THAN SPSS GUI
  POWERFUL AND “SCRIPT” FRIENDLY
  LESS CLICKING AND DROP-DOWN …MORE SCRIPTING

4 STATA9: SET UP FOLDER STRUCTURE
  SET UP FOLDERS TO STORE YOUR:
  DO-FILES (CR FILE, AN FILE)
  DTA-FILES
  LOG-FILES
  INPUT-FILES (TXT)
  OUTPUT-FILES

5 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  HOW DO I GET FILES INTO STATA?
  HOW DO I MERGE MY DATA WITH ANOTHER FILE?
  CAN I GENERATE A FEW BASIC STATISTICS ON MY MARKERS?
  CAN I PERFORM A CASE-CONTROL STUDY?
  IS MY QUANTITATIVE VARIABLE ASSOCIATED WITH A GENOTYPE?

6 STATA9: LOOK AT ME!! MAIN WINDOW

7 STATA9: LOOK AT ME!! DO-WINDOW

8 STATA9: LOOK AT ME!! MAIN WINDOW

9 STATA9: LOOK AT ME!! DTA-EDITOR WINDOW

10 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

11 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  cr00 genotype_qtlsnp.do
  1. ADD TAB-DELIMITED TEXT FILES TO STATA USING THE INSHEET COMMAND, SORT ON THE KEY VARIABLE USING THE SORT COMMAND, AND SAVE AS *.DTA FILES USING THE SAVE COMMAND
  2. CONVERT “STRINGS” TO NUMERIC VARIABLES USING THE GENERATE AND REPLACE COMMANDS
  3. MERGE ON THE KEY VARIABLE USING THE MERGE COMMAND
  4. TABULATE THE MERGE USING THE TABULATE COMMAND AND ORDER VARIABLES USING THE ORDER COMMAND

15 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  THE COMBINED *.DTA FILE
  THE TABULATE FUNCTION
  1 = ONLY IN 1st FILE
  2 = ONLY IN 2nd FILE
  3 = IN BOTH 1st & 2nd FILE
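
The merge codes listed on this slide (1, 2, 3) can be illustrated outside Stata. The block below is a minimal Python sketch, not the do-file from the talk: the function names, the `_merge` key, and the sample IDs are all assumptions made for illustration. It joins two keyed tables and tags every record with the same three codes, then counts them the way a tabulation of the merge variable would.

```python
def merge_with_indicator(first, second):
    """Outer-join two {key: record} dicts and tag each key with a merge code.

    Codes follow the slide: 1 = only in first file, 2 = only in second file,
    3 = in both files.
    """
    merged = {}
    for key in sorted(set(first) | set(second)):
        in_first, in_second = key in first, key in second
        if in_first and in_second:
            code = 3
        elif in_first:
            code = 1
        else:
            code = 2
        row = {}
        row.update(first.get(key, {}))
        row.update(second.get(key, {}))
        row["_merge"] = code  # hypothetical column name for the indicator
        merged[key] = row
    return merged


def tabulate_merge(merged):
    """Count how many records fall under each merge code."""
    counts = {1: 0, 2: 0, 3: 0}
    for row in merged.values():
        counts[row["_merge"]] += 1
    return counts


# Made-up example data: a genotype table and a phenotype table keyed by ID.
genotypes = {"id01": {"snp1": "AG"}, "id02": {"snp1": "GG"}, "id03": {"snp1": "AA"}}
phenotypes = {"id02": {"affected": 1}, "id03": {"affected": 0}, "id04": {"affected": 1}}

merged = merge_with_indicator(genotypes, phenotypes)
print(tabulate_merge(merged))  # {1: 1, 2: 1, 3: 2}
```

Records with code 1 or 2 are the ones worth inspecting before analysis, since they have genotypes without phenotypes or the reverse.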

17 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  an00 genotype_qtlsnp.do
  CREATING THE LOG FILE USING THE LOG COMMAND
  OPENING THE *.DTA FILE USING THE USE COMMAND
  CREATING GENOTYPE VARIABLES FROM ALLELE VARIABLES USING GTYPE PROTOCOL
  TABULATE THE GENOTYPE VARIABLES USING THE TABULATE COMMAND

18 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  1. TEST HWE USING GTAB COMMAND
  2. TEST HWE USING GENHW COMMAND
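
For readers who want to see what a Hardy-Weinberg equilibrium (HWE) test computes, here is a minimal Python sketch of the standard Pearson chi-square version for one biallelic marker. It is not the GTAB or GENHW implementation, which may report additional or different statistics; the function name and example counts are assumptions.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test of HWE from the three genotype counts.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("marker is monomorphic; HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: 30 AA, 40 AB, 30 BB (fewer heterozygotes than expected).
chi2, p_value = hwe_chi_square(30, 40, 30)
print(chi2, p_value)
```

With small samples or rare alleles an exact test is usually preferred over this chi-square approximation.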

19 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  1. TEST PAIR-WISE LINKAGE DISEQUILIBRIUM USING PWLD COMMAND
  2. TEST ASSOCIATION WITH BINARY TRAIT USING GENCC COMMAND
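
The second item, association with a binary trait, can be sketched as a 2x2 allelic chi-square test. The Python below is an illustration only and makes no claim about what GENCC actually outputs; the function names and genotype counts are assumptions. It covers the case-control test and leaves pairwise linkage disequilibrium aside.

```python
import math


def allele_counts(n_aa, n_ab, n_bb):
    """Convert genotype counts to (A allele count, B allele count)."""
    return 2 * n_aa + n_ab, 2 * n_bb + n_ab


def allelic_association(case_genotypes, control_genotypes):
    """2x2 chi-square test on allele counts in cases versus controls.

    Each argument is a (n_AA, n_AB, n_BB) tuple.
    Returns (chi2, odds_ratio, p_value); the odds ratio is for allele A.
    """
    a, b = allele_counts(*case_genotypes)     # cases: A, B
    c, d = allele_counts(*control_genotypes)  # controls: A, B
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square, 1 df
    return chi2, odds_ratio, p_value


# Made-up genotype counts for 100 cases and 100 controls.
chi2, odds_ratio, p_value = allelic_association((30, 50, 20), (20, 50, 30))
print(chi2, odds_ratio, p_value)
```

The allelic test treats the two alleles of each person as independent observations, which is only reasonable when the marker is in HWE.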

20 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  QTLSNP COMMAND MODELS
  CODOMINANT (THREE MODELS)
  DOMINANT
  RECESSIVE

21 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  1. TEST WHETHER A QUANTITATIVE VARIABLE IS ASSOCIATED WITH DIFFERENT INHERITANCE MODELS USING QTLSNP COMMAND - CODOMINANT

22 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  1. TEST WHETHER A QUANTITATIVE VARIABLE IS ASSOCIATED WITH DIFFERENT INHERITANCE MODELS USING QTLSNP COMMAND - DOMINANT
  2. NOT ASSOCIATED SO MINIMAL OUTPUT

23 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY
  1. TEST WHETHER A QUANTITATIVE VARIABLE IS ASSOCIATED WITH DIFFERENT INHERITANCE MODELS USING QTLSNP COMMAND - RECESSIVE
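
The three inheritance models differ only in how genotypes are grouped before the quantitative trait is compared. The Python sketch below shows one common way to code them and summarises the trait mean per group. It is an assumption-laden illustration, not the QTLSNP command: the function names, the choice of G as the allele of interest, and the sample values are all hypothetical, and the "three models" QTLSNP fits under codominance are not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def code_genotype(genotype, allele, model):
    """Code a two-letter genotype under a given inheritance model.

    codominant: number of copies of `allele` (0, 1 or 2), classes kept apart
    dominant:   1 if at least one copy, else 0
    recessive:  1 if two copies, else 0
    """
    copies = genotype.count(allele)
    if model == "codominant":
        return copies
    if model == "dominant":
        return 1 if copies >= 1 else 0
    if model == "recessive":
        return 1 if copies == 2 else 0
    raise ValueError(f"unknown model: {model}")


def trait_means_by_code(samples, allele, model):
    """Mean trait value within each genotype group for the chosen model."""
    groups = defaultdict(list)
    for genotype, trait in samples:
        groups[code_genotype(genotype, allele, model)].append(trait)
    return {code: mean(values) for code, values in sorted(groups.items())}


# Made-up (genotype, quantitative trait) pairs.
samples = [("AA", 10.0), ("AG", 12.0), ("GA", 14.0), ("GG", 20.0)]

for model in ("codominant", "dominant", "recessive"):
    print(model, trait_means_by_code(samples, "G", model))
```

A formal test would then compare these groups, for example by regression of the trait on the coded genotype.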

24 PROBLEM 1: BASIC CASE-CONTROL ASSOCIATION STUDY

25 BC|SNPmax© DATABASE AND ANALYSIS PLATFORM
  MASTER DATABASE FOR STORING ALL OUR “MASTER” GENETIC AND PHENOTYPE DATASETS
  ONGOING PROCESS TO UPLOAD AND MANAGE DATA

26 BC|SNPmax: Structure
  FIVE DOMAINS:
  1. GENOTYPES/SNPS
  2. MAPS
  3. PEDIGREES
  4. AFFECTION
  5. PHENOTYPES

32 FROM OUTPUT TO GEN-FILE (VIA STATA)
  TWO EXAMPLES
  1. BASIC EXCEL FILE
  2. TAQ-MAN FILE

33 FROM OUTPUT TO GEN-FILE (VIA STATA): BASIC EXCEL FILE

34 FROM OUTPUT TO GEN PED AFF-FILE (VIA STATA): BASIC EXCEL FILE

38 FROM OUTPUT TO GEN-FILE (VIA STATA): TAQ-MAN FILE

41 BC|SNPmax

42 BC|SNPmax: Types of Analysis
  QUALITY: PED-CHECK, MERLIN, BASIC MEASURES (MAF, HWE, CALL)
  FAMILY-BASED: MENDEL, MERLIN, GENEHUNTER, SIMWALK, FBAT/PBAT, TRANSMIT, QTDT, PLINK, HAPLOVIEW, R-PACKAGE
  CASE-CONTROL: ALLELE ASSOCIATION, MENDEL, PHASE, SNPHAP, PLINK, R-PACKAGE

43 BC|SNPmax: Types of Analysis
  FOR MOST ANALYSES YOU NEED TO SELECT MATCHED:
  GEN
  PED
  MAP – b128 NOW UPLOADED
  AFF

44 BC|SNPmax

53 PLINK… GETTING STARTED

54 PLINK… RUNNING PLINK FROM YOUR OWN COMPUTER
  WHY?
  1. MULTIPLE ANALYSES
  2. KEEP A RECORD OF YOUR WORK IN BAT AND SCRPT
  3. EASE OF USE
  4. EASE OF REPEATING TASK
  5. SCRIPTS NOT DROP DOWN MENUS
  6. RUNNING >1 CHROMOSOME (BC|SNPmax ADDRESSED)
  7. POST-ANALYSIS INTEGRATION USING PERL AND STATA

55 PLINK… FOLDER STRUCTURE
  ANALYSIS
  DATASET
  OUTPUT

56 PLINK… DATASETS
  PED & MAP
  BINARY FILES
  BINARY PED (BED)
  BINARY MAP (BIM)
  FAMILY FILES (FAM)
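
To make the text PED and MAP layouts concrete, the Python sketch below parses one line of each. It assumes the standard whitespace-delimited PLINK layout: six leading PED columns (family ID, individual ID, father ID, mother ID, sex, phenotype) followed by two alleles per marker, and four MAP columns (chromosome, marker ID, genetic distance, base-pair position). The function names, dictionary keys, and sample lines are hypothetical.

```python
def parse_map_line(line):
    """Parse one MAP line: chromosome, marker ID, genetic distance, bp position."""
    chrom, snp_id, genetic_distance, bp_position = line.split()
    return {
        "chrom": chrom,
        "snp": snp_id,
        "genetic_distance": float(genetic_distance),
        "bp": int(bp_position),
    }


def parse_ped_line(line):
    """Parse one PED line into its six ID columns plus per-marker allele pairs."""
    fields = line.split()
    family_id, individual_id, father_id, mother_id, sex, phenotype = fields[:6]
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("PED line must carry two alleles per marker")
    genotypes = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return {
        "family": family_id,
        "individual": individual_id,
        "father": father_id,
        "mother": mother_id,
        "sex": sex,
        "phenotype": phenotype,
        "genotypes": genotypes,
    }


# Made-up example lines: one founder with two markers, and one marker record.
ped_record = parse_ped_line("FAM1 IND1 0 0 1 2 A G C C")
map_record = parse_map_line("1 rs0001 0 12345")
print(ped_record["genotypes"])  # [('A', 'G'), ('C', 'C')]
print(map_record["bp"])         # 12345
```

The number of allele pairs on every PED line should equal the number of lines in the matching MAP file, which is a cheap consistency check before running any analysis.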

61 EXAMPLE ANALYSES IN PLINK…
  DATA TRANSFORMATION
  DATA FILTERING AND PRUNING
  DATA MERGING
  SUMMARY STATS
  MISSINGNESS
  HWE
  MAF
  MENDEL ERRORS
  INCLUSION THRESHOLDS
  POPULATION STRATIFICATION
  ASSOCIATION
  CASE/CONTROL
  QTL
  GxE
  NEW: MULTIPLE TESTING CORRECTION (--adjust)
  FAMILY-BASED
  TDT
  POO
  PERMUTATION
  EPISTASIS
  HAPLOTYPE ANALYSIS
  NEW: PROXY-ASSOCIATION (FROM SNP TO HAPLOTYPE)
  R-PACKAGE
  NEW: MODIFY OUTPUT
  PLOG10
  P<x
  GENOMIC CONTROL
  QQ-PLOT
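
As an illustration of what multiple testing correction does to a list of p-values, here is a minimal Python sketch of two widely used adjustments, Bonferroni and Benjamini-Hochberg false discovery rate. It is a stand-alone example with hypothetical function names and made-up p-values, not PLINK's own code, and it covers only two of the adjustments a tool may report.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four markers.
p_values = [0.01, 0.04, 0.03, 0.20]
print(bonferroni(p_values))
print(benjamini_hochberg(p_values))
```

Bonferroni controls the chance of any false positive and is conservative for correlated markers; the false discovery rate approach tolerates a stated fraction of false positives among the hits.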

62 PLINK…: RUNNING TDT IN PLINK
  CAN RUN FROM COMMAND LINE AND USING gPLINK (GUI)
  RECOMMEND BAT AND SCRPT FILES
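
The statistic behind the transmission disequilibrium test (TDT) is short enough to show directly. The Python sketch below computes the classic McNemar-style TDT chi-square from counts of transmitted and untransmitted alleles in heterozygous parents. It illustrates the test itself under assumed inputs and says nothing about how PLINK derives those counts from pedigree data; the function name and counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """TDT chi-square (1 df) from allele transmissions by heterozygous parents.

    transmitted:   times the allele of interest was passed to an affected child
    untransmitted: times it was not passed on
    Returns (chi2, odds_ratio, p_value).
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    odds_ratio = b / c if c else float("inf")
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square, 1 df
    return chi2, odds_ratio, p_value


# Made-up counts: allele transmitted 60 times, not transmitted 40 times.
chi2, odds_ratio, p_value = tdt(60, 40)
print(chi2, odds_ratio, p_value)
```

Only heterozygous parents contribute, which is why the test is robust to population stratification.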

63 PLINK…: SUMMARY TABLES IN STATA
  INSHEET THE TDT.CLEAN FILE
  ADD GENE NAMES
  ADD CHROMOSOME POSITION
  ADJUST OR TO RISK
  GENERATE GRAPHS OF DATA
  GENERATE TABLES BY GENE
  GENERATE TABLES BY POSITION
  GENERATE TABLES BY P-VALUE
  SELECT COLUMNS FOR OTHER ANALYSES (GENMAPP)

64 THE END!

