1
Matrices
A set of elements organized in a table (along rows and columns).
[Image: example matrix, from Wikipedia]
2
Matrices
Python has no built-in matrix type. For Bio/CS 251, matrices are provided through support.py:
    makeMatrix(rows, cols)      # creates a matrix with the given rows and cols
    randomMatrix(rows, cols)    # creates a matrix with the given rows and cols,
                                # with all cells set to random values
    getRows(M)                  # returns the number of rows of the given matrix
    getCols(M)                  # returns the number of cols of the given matrix
    M[r][c] = 5                 # puts 5 in cell (r, c)
    score = M[r][c]             # puts value of cell (r, c) in score
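The source of support.py is not shown on the slides, so the following is only a minimal sketch of how such helpers could be written with nested lists. The initial cell value of 0, the random range 0 to 9, and the `low`/`high` parameters are assumptions made for illustration. It uses Python 3 syntax.

```python
import random

def makeMatrix(rows, cols):
    """Return a rows x cols matrix (a list of row lists); cells start at 0 (assumed)."""
    return [[0 for _ in range(cols)] for _ in range(rows)]

def randomMatrix(rows, cols, low=0, high=9):
    """Return a rows x cols matrix of random integers in [low, high] (range is assumed)."""
    return [[random.randint(low, high) for _ in range(cols)] for _ in range(rows)]

def getRows(M):
    """Number of rows is the length of the outer list."""
    return len(M)

def getCols(M):
    """Number of columns is the length of the first row (0 for an empty matrix)."""
    return len(M[0]) if M else 0

M = makeMatrix(3, 5)
M[1][2] = 5          # puts 5 in cell (1, 2)
score = M[1][2]      # reads the value of cell (1, 2)
```

Building each row with its own list comprehension matters: writing `[[0] * cols] * rows` would make every row the same list object, so an assignment to one cell would appear in all rows.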
3
Matrices
Indexing of rows and columns starts at 0.
[Figure: a 3x5 matrix with row indices 0-2 and column indices 0-4, holding 7 in cell (0, 0), 4 in cell (1, 2), and 9 in cell (2, 4)]
    >>> M = makeMatrix(3, 5)    # creates 3x5 matrix
    >>> rows = getRows(M)
    >>> print rows
    3
    >>> cols = getCols(M)
    >>> print cols
    5
    >>> M[0][0] = 7
    >>> M[2][4] = 9
    >>> M[1][2] = 4
    >>> total = M[0][0] + M[2][4] + M[1][2]
    >>> print total
    20
4
Matrix Processing Fill all cells of a matrix with the number 9
To FILL each cell of a given matrix with the value 9:
    1. for each row index in the matrix:
    2.     for each column index in the matrix:
    3.         set cell of current row, col to 9

    def fillMatrix(M):
        for r in range(0, getRows(M)):
            for c in range(0, getCols(M)):
                M[r][c] = 9

    >>> D = makeMatrix(3, 5)
    >>> fillMatrix(D)
    >>> print D
    [Output: the 3x5 matrix with 9 in every cell]
5
Matrix Processing Add all the values in a matrix
To ADD all cells of a given matrix:
    1. set current total to 0
    2. for each row index in the matrix:
    3.     for each column index in the matrix:
    4.         update total with current cell value
    5. return total

    def addElements(M):
        total = 0
        for r in range(0, getRows(M)):
            for c in range(0, getCols(M)):
                total = total + M[r][c]
        return total

    >>> D = randomMatrix(3, 5)
    >>> print D
    [Output: the 3x5 matrix of random values]
    >>> total = addElements(D)
    >>> print total
    32
6
Sequence Similarity
Provides insight about the sequence under investigation: gene-coding regions (DNA), function (proteins).
Typically assessed via the process of “sequence alignment”.
Standard sequence alignment algorithms:
    Dot Plots
    Global Alignment
    Semiglobal Alignment
    Local Alignment
Standard software:
    BLAST, FASTA: find high-scoring local alignments between a query and a target database
7
Dot Plots
The simplest method for identifying similarities between two sequences.
Uses a 2-dimensional table:
    one of the sequences labels the rows
    the other sequence labels the columns
    place a ● in each cell that has matching (row, column) labels
Example: dot plot for “GATTACA” and “TACACATTG”
8
Dot Plots
[Figure: dot plot table being filled in, with a ● in each cell whose row and column labels match]
9
Dot Plots
[Figure: completed dot plot with diagonal runs labeled ACA, ACATT, TACA, TAC, ATT]
10
Dot Plots
The simplest method for identifying similarities between two sequences.
Diagonal lines indicate regions of similarity:
    SE slope: similarity along the direction of the sequences
    SW slope: similarity along one sequence in reverse
Susceptible to noise, especially with DNA: since there are only 4 possible symbols, there will be a lot of “random hits”.
Noise can be addressed using a sliding window:
    consider fragments of length W in the two sequences
    place a ● in each cell that is the “origin” of a matching sliding window
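The sliding-window idea can be sketched as follows. This is an illustrative version, not the course's code: the function name `windowed_dot_plot`, the use of "*" and " " as cell values, and the choice to anchor the window at its first symbol are assumptions. It uses plain nested lists and Python 3 syntax.

```python
def windowed_dot_plot(seq1, seq2, w):
    """Mark cell (r, c) when the length-w fragments starting at
    seq1[r] and seq2[c] are identical (window anchored at its origin)."""
    rows, cols = len(seq1), len(seq2)
    M = [[" " for _ in range(cols)] for _ in range(rows)]
    # Only positions where a full window fits can be an origin.
    for r in range(rows - w + 1):
        for c in range(cols - w + 1):
            if seq1[r:r + w] == seq2[c:c + w]:
                M[r][c] = "*"
    return M

def count_dots(M):
    """Count marked cells; handy for comparing noise at different W."""
    return sum(cell == "*" for row in M for cell in row)

plain = windowed_dot_plot("GATTACA", "TACACATTG", 1)
windowed = windowed_dot_plot("GATTACA", "TACACATTG", 2)
```

With W = 1 this reduces to the ordinary dot plot; raising W removes isolated single-symbol hits because they no longer extend to a full matching window.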
11
Dot Plots (W = 2)
[Figure: dot plot being filled in using a sliding window of length 2]
12
Dot Plots (W = 2)
[Figure: completed dot plot with W = 2]
Compare with the next slide, which uses W = 1:
    the noise has disappeared
    there is one fewer dot per matching region
    in general, if a region has N consecutive matches, #dots = N - (W - 1)
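The relation #dots = N - (W - 1) can be checked on one matching region. The sketch below counts window origins along the diagonal where “TACA” in “GATTACA” (starting at row 3) lines up with “TACA” in “TACACATTG” (starting at column 0), so N = 4. The helper name `diagonal_dots` and its parameters are made up for this illustration.

```python
def diagonal_dots(seq1, seq2, r0, c0, length, w):
    """Count cells along the diagonal starting at (r0, c0) that are the
    origin of a matching window of length w."""
    dots = 0
    for k in range(length):
        a = seq1[r0 + k:r0 + k + w]
        b = seq2[c0 + k:c0 + k + w]
        # A window that runs off the end of a sequence cannot match.
        if len(a) == w and a == b:
            dots += 1
    return dots

N = 4  # "TACA" is a run of 4 consecutive matches
counts = {w: diagonal_dots("GATTACA", "TACACATTG", 3, 0, N, w) for w in (1, 2, 3)}
```

Each increase of W by one removes the last dot of the run, since the window starting there no longer fits inside the matching region.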
13
Dot Plots (W = 1)
[Figure: completed dot plot with W = 1]
Compare with the previous slide, which uses W = 2.
14
Self Alignment (W = 1)
[Figure: dot plot of a sequence against itself]
In self alignment:
    the main diagonal is filled in completely
    the matrix is symmetric about the main diagonal
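Both self-alignment properties are easy to verify in code. A small sketch, assuming the dot plot is stored as nested lists of booleans (the `dot_plot` helper is defined here only for this check):

```python
def dot_plot(seq1, seq2):
    """Boolean dot plot: cell (r, c) is True when seq1[r] == seq2[c]."""
    return [[a == b for b in seq2] for a in seq1]

S = dot_plot("GATTACA", "GATTACA")
n = len(S)

# Every symbol matches itself, so the main diagonal is all True.
main_diagonal_full = all(S[i][i] for i in range(n))

# seq[r] == seq[c] implies seq[c] == seq[r], so the matrix is symmetric.
symmetric = all(S[r][c] == S[c][r] for r in range(n) for c in range(n))
```

Because of the symmetry, only the cells above (or below) the main diagonal carry information in a self alignment.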
15
Dot Plots
Original paper:
    Maizel JV and Lenk RP: Enhanced graphic matrix analysis of nucleic acid and protein sequences. Proc Natl Acad Sci USA 78:7665, 1981.
The paper used a sliding window of odd length centered at the base.
Our examples used a sliding window anchored at the base.
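To illustrate the difference between a centered and an anchored window, here is a hedged sketch of a centered variant. It is not taken from the paper: it simply requires an exact match of the whole window, and the name `centered_dot_plot` and the "*" marker are assumptions.

```python
def centered_dot_plot(seq1, seq2, w):
    """Mark cell (r, c) when the length-w windows centered on
    seq1[r] and seq2[c] are identical. w must be odd."""
    if w % 2 == 0:
        raise ValueError("window length must be odd")
    h = w // 2  # number of symbols on each side of the center
    M = [[" " for _ in range(len(seq2))] for _ in range(len(seq1))]
    # Centers must leave room for h symbols on both sides.
    for r in range(h, len(seq1) - h):
        for c in range(h, len(seq2) - h):
            if seq1[r - h:r + h + 1] == seq2[c - h:c + h + 1]:
                M[r][c] = "*"
    return M

C = centered_dot_plot("GATTACA", "TACACATTG", 3)
```

Compared with an anchored window of the same length, the same matches are found, but each dot sits h cells further down the diagonal, at the middle of the matching fragment rather than at its start.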
16
Dot Plots in Python Compute the dot plot matrix given two sequences
To MAKE a DOT PLOT given two sequences:
    1. create a matrix with rows and columns equal to the lengths of the first and second sequence, respectively
    2. for each row index in the matrix:
    3.     for each column index in the matrix:
    4.         if the symbol at the row index in the first sequence equals the symbol at the column index in the second sequence:
    5.             place a dot at the current cell
    6. return the matrix

    >>> M = makeDotPlot("GATTACA", "TACACATTG")
    >>> print M
    [Output: the dot plot matrix, one row per line between | delimiters, with * in each matching cell]
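The steps above can be sketched in Python as follows. This is one possible implementation, not necessarily the course's: it builds the matrix with nested lists instead of support.py, and `formatMatrix` is a hypothetical helper for printing rows between | delimiters. It uses Python 3 syntax.

```python
def makeDotPlot(seq1, seq2):
    """Return a len(seq1) x len(seq2) matrix with "*" wherever
    seq1[r] == seq2[c] and " " elsewhere."""
    # Step 1: rows follow the first sequence, columns the second.
    M = [[" " for _ in range(len(seq2))] for _ in range(len(seq1))]
    # Steps 2-5: compare every (row, column) pair of symbols.
    for r in range(len(seq1)):
        for c in range(len(seq2)):
            if seq1[r] == seq2[c]:
                M[r][c] = "*"
    # Step 6
    return M

def formatMatrix(M):
    """Hypothetical helper: one text line per row, wrapped in | delimiters."""
    return "\n".join("|" + "".join(row) + "|" for row in M)

M = makeDotPlot("GATTACA", "TACACATTG")
print(formatMatrix(M))
```

The running time is proportional to the product of the two sequence lengths, since every cell of the table is visited once.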