
DAY 2. GETTING FAMILIAR WITH NGS SANGREA SHIM. INDEX  Day 2  Get familiar with NGS  Understanding of NGS raw read file  Quality issue  Alignment/Mapping.


1 DAY 2. GETTING FAMILIAR WITH NGS SANGREA SHIM

2 INDEX
  - Day 2
    - Get familiar with NGS
    - Understanding the NGS raw read file
    - Quality issues
    - Alignment/mapping against a reference sequence
    - Understanding the alignment
    - Calling variants from the alignment result
    - Understanding the variant call format (VCF)

3 FLOW CHART
  [Diagram: DNA/RNA → NGS platform → raw read sequences → quality trimming (SolexaQA) → alignment (bwa, bowtie2) → SAM → BAM → sorted BAM (samtools) → pileup (samtools, bcftools) → VCF → selection → map construction (JoinMap4)]
  This is what we are going to do in this course.

4 RAW READS - FASTQ FORMAT
  - Read ID: machine ID, flow cell number
  - Read sequence
  - +
  - Quality sequence
  - Phred score: Q = -10 log10 P, where P is the probability of an incorrect base call

  Phred score   Probability of incorrect base call   Accuracy
  10            1/10                                 90%
  20            1/100                                99%
  30            1/1000                               99.9%
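The Phred formula on this slide can be checked with a few lines of Python. This is a minimal sketch; the function names are made up for illustration and are not part of any NGS tool.

```python
import math

def phred_to_error_prob(q):
    """Probability P of an incorrect base call for Phred score q (Q = -10 log10 P)."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred score Q for an error probability p."""
    return -10 * math.log10(p)

# Reproduce the table on the slide: score, error probability, accuracy.
for q in (10, 20, 30):
    p = phred_to_error_prob(q)
    print(q, p, "{:.1f}%".format((1 - p) * 100))
```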

5 ASCII CODE
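Each character of the FASTQ quality line encodes one Phred score through its ASCII code. The sketch below assumes the common Phred+33 encoding (ASCII code minus 33); the slide does not state which offset the course data uses, so check the platform before relying on it. The function name is hypothetical.

```python
def decode_quality(quality_string, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores.

    offset=33 is an assumption (Phred+33); some older data uses 64.
    """
    return [ord(ch) - offset for ch in quality_string]

print(decode_quality("I5+!"))
```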

6 QUALITY TRIMMED FASTQ (before and after)
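To give a feel for what quality trimming does, here is a simplified sketch that keeps the longest contiguous stretch of bases whose Phred score meets a threshold. It is not SolexaQA's actual implementation; the threshold of 20, the Phred+33 offset, and the function name are all assumptions for illustration.

```python
def trim_read(sequence, quality_string, threshold=20, offset=33):
    """Return the longest contiguous part of a read with Phred score >= threshold."""
    scores = [ord(ch) - offset for ch in quality_string]
    best_start, best_len = 0, 0
    start = None
    # A sentinel below any real score closes a run that reaches the read end.
    for i, q in enumerate(scores + [float("-inf")]):
        if q >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    end = best_start + best_len
    return sequence[best_start:end], quality_string[best_start:end]

print(trim_read("ACGTACGT", "!!IIII!I"))
```

A read with no base above the threshold comes back as two empty strings, which a caller would normally discard.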

7 ALIGNMENT (BOWTIE2)
  - Uses an FM index, which is based on the Burrows-Wheeler transform
  - Reduces turnaround time in sequence alignment
  - Faster than bwa
  - Small insertions/deletions can be detected
  - It is free!

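8 BURROWS-WHEELER TRANSFORM
  - Performed by bowtie2-build
  - Reference sequences must be transformed (indexed) before alignment
  - Command
    - $ bowtie2-build [reference.fa] [index_base]
    - Usually the same name is used for input and output
    - $ bowtie2-build Gmax_189.fa Gmax_189.fa
  - Index files [index_base].?.bt2 and [index_base].rev.?.bt2 will be created (e.g. Vradi_ver6.fa.?.bt2 and Vradi_ver6.fa.rev.?.bt2 for an index named Vradi_ver6.fa)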
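The transform itself can be illustrated on a tiny string. The sketch below is the naive textbook construction (sort all rotations, take the last column); it is only meant to show the idea, and it is not how bowtie2-build works internally, since real indexers use far more memory-efficient methods. The function name and the "$" terminator are illustrative choices.

```python
def bwt(text, terminator="$"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    if terminator in text:
        raise ValueError("text must not contain the terminator character")
    text += terminator
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)

print(bwt("banana"))
```

Identical characters tend to cluster in the output, which is what makes the transformed reference compact to store and quick to search.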

9 CREATING SAM FILE
  - Command
    - $ bowtie2 -x [index_base] -U [reads.fastq] -S [output.sam]
  - It will take some time
  - A SAM file will be created

10 SAM FILE
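Every alignment line in a SAM file has 11 mandatory tab-separated fields, followed by optional tags. The sketch below splits one line into those fields; the example record is invented for illustration (it is not from the course data), and the helper name is hypothetical.

```python
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}

def parse_sam_line(line):
    """Split one SAM alignment line into a dictionary of named fields."""
    columns = line.rstrip("\n").split("\t")
    record = dict(zip(SAM_FIELDS, columns[:11]))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    record["TAGS"] = columns[11:]                       # optional fields
    record["is_unmapped"] = bool(record["FLAG"] & 0x4)  # flag bit 4
    record["is_reverse"] = bool(record["FLAG"] & 0x10)  # flag bit 16
    return record

# Made-up example record, for illustration only.
example = "read1\t16\tGm05\t1024\t42\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\tAS:i:0"
print(parse_sam_line(example))
```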

11 SAM TO BAM
  - SAM
    - is a human-readable text format
  - BAM
    - is a binary file that is not human-readable
    - is computer-readable
    - is much more compact in file size
  - samtools
    - $ samtools view -bS [input.sam] > [output.bam]

12 BAM FILE  Looks like this  Can you read this?

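13 SORTING ALIGNMENT
  - BAM sort
    - samtools
    - $ samtools sort [input.bam] [output_prefix]
    - E.g.) $ samtools sort cheongja3.bam cheongja3.bam.sorted
  - Alignments will be sorted by position and written to [output_prefix].bam
  - This is the older samtools syntax; newer samtools releases use $ samtools sort -o [output.bam] [input.bam]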

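14 CALLING VARIATION
  - Reference FASTA file should be indexed
    - $ samtools faidx [reference.fa]
  - Using samtools mpileup and bcftools
    - $ samtools mpileup -DSuf [reference.fa] [input.bam] | bcftools view -vcg - > [output.vcf]
  - These options belong to the older samtools/bcftools releases; in later releases, bcftools view -vcg was replaced by bcftools call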

15 VCF FORMAT

16 FILTERING OUT SNP
  - Remove INDEL records so that only SNPs remain
    - $ grep -v 'INDEL' [input.vcf] > [output.vcf]
  - vcfutils.pl varFilter -d [integer] -D [integer] -Q [integer]
    - -Q INT minimum RMS mapping quality for SNPs [10]
    - -d INT minimum read depth [2]
    - -D INT maximum read depth [10000000]
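The same kind of filtering can be sketched in Python to show what is happening to each VCF line. This is only a rough analogue under stated assumptions: it drops records flagged INDEL in the INFO column and checks the DP value there against depth bounds, whereas vcfutils.pl varFilter applies additional filters. Unlike grep -v, it also keeps header lines that mention INDEL. The function name and the example records are hypothetical.

```python
def keep_snp(vcf_line, min_depth=2, max_depth=10000000):
    """Return True if a VCF line is a header or a SNP within the depth bounds."""
    if vcf_line.startswith("#"):
        return True                      # keep all header lines
    info = vcf_line.rstrip("\n").split("\t")[7]   # 8th column is INFO
    entries = info.split(";")
    if "INDEL" in entries:
        return False                     # insertion/deletion record
    for entry in entries:
        if entry.startswith("DP="):
            depth = int(entry[3:])
            return min_depth <= depth <= max_depth
    return True                          # assumption: keep records without DP

# Made-up records, for illustration only.
snp = "Gm05\t100\t.\tA\tG\t50\t.\tDP=12;MQ=40\tGT\t1/1"
indel = "Gm05\t200\t.\tAT\tA\t50\t.\tINDEL;DP=12\tGT\t1/1"
shallow = "Gm05\t300\t.\tC\tT\t50\t.\tDP=1;MQ=40\tGT\t1/1"
print([keep_snp(line) for line in (snp, indel, shallow)])
```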

17 TODAY’S PRACTICE  Real data analysis  Basic python class

18 THANK YOU  Q & A

19 DAY 2. PRACTICE- BASIC PYTHON LANGUAGE CLASS TAEYOUNG LEE

20 VARIABLE

21

22 VARIABLE TYPE
  - String type
    - All characters are string type
    - 'a', 'b', 'c', 'd', '0', '1', '2', '3', '0.1', ...
    - You have to use '' or "" for string type
    - A : variable A
    - 'A' : string value A
    - Special characters (the backslash \ is shown as ₩ on Korean keyboards)
      - '\n' : newline character
      - '\t' : tab
      - '\'' : '
      - '\"' : "
      - '\\' : \
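The escape sequences listed above can be tried out directly. The slides use Python 2 syntax; this short sketch is written for Python 3 (print with parentheses), and the variable names are only for illustration.

```python
text = 'Crop\tGenomics\n'        # tab between the words, newline at the end
quoted = 'It\'s a \"read\"'      # escaped single and double quotes
backslash = '\\'                 # one literal backslash

print(text, end='')              # end='' avoids adding a second newline
print(quoted)
print(len(backslash))            # the escape counts as a single character
```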

23 VARIABLE TYPE
  - Integer type
    - All whole numbers are integer type
    - 1, 2, 3, 4, 100, 72038, 900223
  - Float type
    - Represents decimal or fractional numbers
    - 1.0/3, 0.23, 1.8, 3.141592

24 CHARACTERISTICS OF VARIABLE
  - You cannot add a str and an int value
    - 'Crop' + ' Genomics' = 'Crop Genomics'
    - 'Crop' + '4555' = 'Crop4555'
    - '880' + '4555' = '8804555'
    - 880 + 4555 = 5435
    - '880' + 4555 = error (TypeError)
    - 'Crop' + 4555 = error (TypeError)
  - The same applies between str and float
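When a string and a number do need to be combined, one of them has to be converted first. A small sketch of both directions, with hypothetical helper names:

```python
def concatenate(label, number):
    """Join a string and a number by converting the number to str."""
    return label + str(number)

def add_numeric_strings(a, b):
    """Add two strings that contain integers by converting them to int."""
    return int(a) + int(b)

try:
    'Crop' + 4555                # mixing str and int is not allowed
except TypeError as error:
    print('error:', error)

print(concatenate('Crop', 4555))
print(add_numeric_strings('880', '4555'))
```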

25 CHARACTERISTICS OF VARIABLE
  - If a float is used at least once in an expression, the result will be a float
    - 5/2 = 2 (integer division in Python 2, the version used in this class)
    - 1+2 = 3
    - 5.0/2 = 2.5
    - 5/2.0 = 2.5
    - 1.0+2 = 3.0
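The result 5/2 = 2 is specific to Python 2. For readers on Python 3, the sketch below (written for Python 3) shows the difference: / always gives a float, and // gives the integer result shown on the slide.

```python
int_result = 5 // 2      # floor division: same result as 5/2 in Python 2
true_result = 5 / 2      # in Python 3, / returns a float even for two ints
mixed = 1.0 + 2          # one float operand makes the result a float

print(int_result, true_result, mixed)
```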

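26 CHARACTERISTICS OF VARIABLE
  - You can multiply a string by an integer
    - 2*3 = 6
    - '2'*3 = '222'
    - 'hello'*3 = 'hellohellohello'
    - Hello*3 vs. 'Hello'*3 : the first multiplies the value of the variable Hello, the second repeats the string 'Hello'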

27 CHARACTERISTICS OF VARIABLE
  - You can use these operators with integer and float values
    - +, -, *, /
    - // (floor division), % (remainder)
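The // and % operators are handy together, for example to find which codon a base belongs to. A minimal sketch; the function name and the 0-based indexing convention are assumptions made for illustration.

```python
def codon_position(base_index):
    """For a 0-based base index, return (codon number, position within codon)."""
    return base_index // 3, base_index % 3

print(codon_position(7))
```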

28 OTHER VARIABLES
  - List
  - Dictionary

29 OTHER VARIABLES
  - List
    - Is created with []
    - Holds a sequence of values or variables
    - List_a = [1, 2, 'a', 'b', [a, b]]
    - A list can also be an element of another list
    - Can be empty
    - List_b = []
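A short sketch of the list behaviour described above. The values assigned to a and b are placeholders, since the slide does not define them.

```python
a, b = 'x', 'y'                        # placeholder values for the variables
list_a = [1, 2, 'a', 'b', [a, b]]      # mixed values, including a nested list
list_b = []                            # empty list

list_b.append('ATG')                   # lists can grow after creation
nested = list_a[4]                     # the fifth element is itself a list

print(len(list_a), nested, list_b)
```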

30 OTHER VARIABLES
  - Dictionary
    - Is created with {}
    - Like a real dictionary, it has keys and values
    - Dic_a = {'English': '영어'} → Dic_a['English'] = '영어' ('영어' is the Korean word for "English")
    - Each key has exactly one value, which can be a list, string, integer, or dictionary
    - Usage)
      - Dic_amino_acid = {'ATG': 'Met', 'TGA': '*'}
      - Dic_amino_acid = {}
        Dic_amino_acid['ATG'] = 'Met'
      - key = ['ATG', 'TGA']
        value = ['Met', '*']
        Dic_amino_acid = dict(zip(key, value))
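The dict(zip(...)) pattern can be turned into a tiny codon lookup. This sketch uses only the two codons from the slide; returning '?' for an unknown codon is an assumption made here, and the function name is hypothetical.

```python
codons = ['ATG', 'TGA']
amino_acids = ['Met', '*']
codon_table = dict(zip(codons, amino_acids))   # pair each key with its value

def translate_codon(codon, table=codon_table):
    """Look up a codon; unknown codons give '?' (an illustrative choice)."""
    return table.get(codon.upper(), '?')

print(translate_codon('atg'), translate_codon('TGA'), translate_codon('AAA'))
```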

31 START PYTHON CODING
  - vi filename.py
  - Python code files have .py as their extension

32 BASIC FUNCTIONS
  - What is a function?
    - Functionality already defined by other programmers
    - Ex) print, if, for, open, etc.
  - print (standard output function)
    - Function for printing something
    - Usage)
      - print a
      - print 'a'
      - print 'a'*3
      - print 3*4
    - print ends its output with a newline character
      - print 'a\n'

33 BASIC FUNCTIONS
  - Standard input functions
    - input
      - For integers
    - raw_input
      - For strings
    - Usage)
      - A = input("enter some integers")
      - B = raw_input("enter some words")

34 BASIC FUNCTIONS
  - if
    - For making decisions
    - If the condition is satisfied, one set of commands is executed
    - If not, the other set of commands is executed

  Meaning                 Math symbol   Python symbol
  Less than               <             <
  Greater than            >             >
  Less than or equal      ≤             <=
  Greater than or equal   ≥             >=
  Equals                  =             ==
  Not equal               ≠             !=
  Contain                 ∈             in
  Not contain             ∉             not in

35 BASIC FUNCTIONS
  - if
  [Diagram: if … is True → Status A; otherwise elif … is True → Status B; otherwise else → Status C]
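The if / elif / else chain and the comparison operators can be sketched with a small classifier. The cut-offs of 30 and 20 and the function names are hypothetical choices for illustration, not values from the course.

```python
def classify_base_quality(q):
    """Sort a Phred score into three classes using if / elif / else."""
    if q >= 30:
        return 'high'        # Status A
    elif q >= 20:
        return 'medium'      # Status B
    else:
        return 'low'         # Status C

def has_unknown_base(sequence):
    """The 'in' operator tests whether a string contains a character."""
    return 'N' in sequence

print(classify_base_quality(35), classify_base_quality(20), classify_base_quality(5))
print(has_unknown_base('ACGNT'))
```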

36 BASIC FUNCTIONS
  - Functions for loops
    - for
      - Useful for loops with a limited number of repetitions
      - Usage) for variable_name in list_name:
    - range()
      - Makes a list of integers
      - Ex) range(2) = [0,1]
            range(1,5) = [1,2,3,4]
            range(1,5,2) = [1,3]
    - len()
      - Calculates length
      - Ex) len('ABC') = 3
            len([1,2]) = 2
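for, range() and len() are often used together to walk over a sequence by position. The sketch below computes GC content that way; it is written for Python 3, where range() gives a range object rather than a list, so list(range(...)) is used to display the values. The function name is hypothetical.

```python
def gc_content(sequence):
    """Fraction of G and C bases, counted with a for loop over positions."""
    if not sequence:
        return 0.0
    gc = 0
    for i in range(len(sequence)):
        if sequence[i] in 'GC':
            gc += 1
    return gc / len(sequence)

print(list(range(1, 5, 2)))
print(gc_content('ATGC'))
```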

37 BASIC FUNCTIONS
  - Functions for loops
    - while
      - Useful for loops without a fixed number of repetitions, including infinite loops
      - Usage) while conditional_sentence:
      - While the condition is true, the loop keeps running
      - while 1 is always true, so it is an infinite loop

38 BASIC FUNCTIONS
  - break & continue
    - They are used inside loops, usually together with if
    - break
      - If the condition is true, the loop is terminated
    - continue
      - If the condition is true, the rest of the current iteration is skipped and the loop moves on to the next one
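A sketch that combines while, break and continue: it scans a sequence codon by codon until it meets a stop codon. The function name and the choice of -1 for "not found" are assumptions for illustration.

```python
def first_stop_codon(sequence):
    """Return the 0-based position of the first in-frame stop codon, or -1."""
    position = 0
    while True:                              # loop until a break is reached
        codon = sequence[position:position + 3]
        if len(codon) < 3:
            position = -1                    # ran out of sequence
            break
        if codon not in ('TAA', 'TAG', 'TGA'):
            position += 3
            continue                         # skip to the next codon
        break                                # stop codon found
    return position

print(first_stop_codon('ATGAAATGA'))
```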

39 PRACTICE
  1. Make a file which contains Gm05's gene information using /data2/python_study/Gmax_109_gene_exons.gff3
  2. Write a Python script that prints "This is sequence file"
  3. Write a Python script that prints whatever you enter through standard input
  4. Read two integers from standard input, save them in variables A and B, and save their sum in C. Print variable C.
  5. Read two integers from standard input and print the bigger one
  6. Make a dictionary of codon and amino acid pairs. Print the amino acid that matches a codon entered through standard input
  7. Same as 2, but repeat infinitely using a loop

40 INDEXING
  - For choosing elements of a list or string

41

42 SLICING
  - For choosing a range of a list or string

43

44

45 EXPANDED SLICING
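The figures for these slides did not survive in the transcript, so here is a minimal sketch of indexing, slicing, and slicing with a step (expanded slicing). The example sequence is made up for illustration.

```python
sequence = 'ATGCGTTGA'

first_base = sequence[0]            # indexing starts at 0
last_base = sequence[-1]            # negative indices count from the end
first_codon = sequence[0:3]         # slice: start included, end excluded
every_third = sequence[::3]         # expanded slicing: step of 3
reversed_sequence = sequence[::-1]  # step of -1 walks backwards

print(first_base, last_base, first_codon, every_third, reversed_sequence)
```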

46 SLICING LIST
  - List1 = List2 vs. List1 = List2[:]
  - List1 = List2 makes both names refer to the same list; List1 = List2[:] makes a copy
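The difference between the two assignments shows up as soon as the original list changes. A short sketch with illustrative variable names; note that [:] makes a shallow copy, so nested lists would still be shared.

```python
list2 = [1, 2, 3]
alias = list2            # same list object under a second name
copied = list2[:]        # new list with the same elements (shallow copy)

list2.append(4)          # change the original list

print(alias)             # follows the change
print(copied)            # unaffected
```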

47 PRACTICE
  1. Read a number from standard input and print the multiplication table of that number. For example, if you enter 3, you should get this result: 3 * 1 = 3, 3 * 2 = 6, 3 * 3 = 9, 3 * 4 = 12, 3 * 5 = 15, 3 * 6 = 18, 3 * 7 = 21, 3 * 8 = 24, 3 * 9 = 27. Hint) you need to use a loop
  2. Read a string from standard input and print its length
  3. Read a string and an integer from standard input and print the string repeated that number of times
  4. Read two strings from standard input and save them in s1 and s2. If the two strings are different, concatenate them and print the result; otherwise, print 'same'
  5. Read two strings from standard input and save them in s1 and s2. If s2 is longer than s1 and the length of s1 is an odd number, concatenate s1 and s2 and print the result; otherwise, concatenate s2 and s1 and print the result
  6. Read a string from standard input and print it reversed

48 THAT’S IT FOR TODAY  Q & A

