Probabilistic Software Workshop September 29, 2014

Slides:



Advertisements
Similar presentations
DNA Identification: Mixture Weight & Inference
Advertisements

Overcoming DNA Stochastic Effects 2010 NEAFS & NEDIAI Meeting November, 2010 Manchester, VT Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics.
DNA Mixture Interpretations and Statistics – To Include or Exclude Cybergenetics © Prescription for Criminal Justice Forensics ABA Criminal Justice.
Finding Truth in DNA Mixture Evidence Innocence Network Conference April, 2013 Charlotte, NC Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA.
TrueAllele ® Challenges in Court and Culture Cybergenetics © th Annual DNA Technology Educational Seminar Centre of Forensic Sciences and the.
How Inclusion Interpretation of DNA Mixture Evidence Reduces Identification Information American Academy of Forensic Sciences February, 2013 Washington,
Creating informative DNA libraries using computer reinterpretation of existing data Northeastern Association of Forensic Scientists November, 2011 Newport,
Preventing rape in the military through effective DNA computing Forensics Europe Expo Forensics Seminar Theatre April, 2014 London, UK Mark W Perlin, PhD,
Coding a Safer Society through Computer Interpretation of DNA Evidence Cybergenetics © MATLAB Virtual Conference March, 2014 Mark W Perlin, PhD,
Probabilistic Software Workshop September 29, 2014 TrueAllele® Casework Mark W. Perlin, PhD, MD, PhD.
TrueAllele ® Interpretation of DNA Mixture Evidence 9 th International Conference on Forensic Inference and Statistics August, 2014 Leiden University,
How TrueAllele® Works (Part 1)
TrueAllele ® Casework Validation on PowerPlex ® 21 Mixture Data Australian and New Zealand Forensic Science Society September, 2014 Adelaide, South Australia.
Using TrueAllele ® Casework to Separate DNA Mixtures of Relatives California Association of Criminalists October, 2014 San Francisco, CA Jennifer Hornyak,
Kern Regional Crime Laboratory Laboratory Director: Dr. Kevin W. P. Miller TRUEALLELE® WORK AND WORKFLOW: KERN COUNTY’S FIRST CASES APRIL 23, 2014.
Revolutionising DNA analysis in major crime investigations The Investigator Conferences Green Park Conference Centre May, 2014 Aylesbury, Buckinghamshire.
No DNA Left Behind: When "inconclusive" really means "informative" Schenectady County District Attorney’s Office January, 2014 Mark W Perlin, PhD, MD,
Challenging DNA Evidence The Fifth Judicial District of Pennsylvania Criminal Division February, 2015 Pittsburgh, PA Mark W Perlin, PhD, MD, PhD Cybergenetics,
Revolutionising DNA analysis in major crime investigations The Investigator Conferences Green Park Conference Centre May, 2014 Aylesbury, Buckinghamshire.
Solving Cold Cases by TrueAllele ® Analysis of DNA Evidence Finding Closure: The Science, Law and Politics of Cold Case Investigations October, 2014 Pittsburgh,
TrueAllele ® Mixture Interpretation Cybergenetics © th Annual DNA Technology Educational Seminar Centre of Forensic Sciences and the Promega.
DNA Mixture Statistics Cybergenetics © Spring Institute Commonwealth's Attorney's Services Council Richmond, Virginia March, 2013 Mark W.
Computer Interpretation of Uncertain DNA Evidence National Institute of Justice Computer v. Human June, 2011 Arlington, VA Mark W Perlin, PhD, MD, PhD.
TrueAllele ® Modeling of DNA Mixture Genotypes California Association of Crime Laboratory Directors October, 2014 San Francisco, CA Mark W Perlin, PhD,
How TrueAllele ® Works (Part 2) Degraded DNA and Allele Dropout Cybergenetics Webinar November, 2014 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh,
TrueAllele ® Genetic Calculator: Implementation in the NYSP Crime Laboratory NYS DNA Subcommittee May 19, 2010 Barry Duceman, Ph.D New York State Police.
Reanimating Zombie™ DNA Penn State Dickinson Law School September, 2012 State College, Pennsylvania Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh,
Computer interpretation of touch DNA mixtures Seminar for Chiefs of Police in Western Pennsylvania May, 2014 Allegheny County, PA Mark W Perlin, PhD, MD,
Separating Familial Mixtures, One Genotype at a Time Northeastern Association of Forensic Scientists November, 2014 Hershey, PA Ria David, PhD, Martin.
Cybergenetics Webinar January, 2015 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics © How TrueAllele ® Works (Part 4)
Cracking the DNA mixture code – computer analysis of UK crime cases Forensics Europe Expo Forensic Innovation Conference April, 2014 London, UK Mark W.
Unleashing Forensic DNA through Computer Intelligence Forensics Europe Expo Forensic Innovation Conference April, 2013 London, UK Mark W Perlin, PhD, MD,
Rapid DNA Response: On the Wings of TrueAllele Mid-Atlantic Association of Forensic Scientists May, 2015 Cambridge, Maryland Martin Bowkley, Matthew Legler,
Getting Past First Bayes with DNA Mixtures American Academy of Forensic Sciences February, 2014 Seattle, WA Mark W Perlin, PhD, MD, PhD Cybergenetics,
Compute first, ask questions later: an efficient TrueAllele ® workflow Midwestern Association of Forensic Scientists October, 2014 St. Paul, MN Martin.
TrueAllele ® interpretation of Allegheny County DNA mixtures Cybergenetics © Continuing Legal Education Allegheny County Courthouse February,
Virginia TrueAllele ® Validation Study: Casework Comparison Presented at AAFS, February, 2013 Published in PLOS ONE, March, 2014 Mark W Perlin, PhD, MD,
TrueAllele ® Computing: All the DNA, all the time Continuing Professional Development Sydney, Australia March, 2014 Mark W Perlin, PhD, MD, PhD Cybergenetics,
Murder in McKeesport October 25, 2008 Tamir Thomas.
When Good DNA Goes Bad International Conference on Forensic Research & Technology October, 2012 Chicago, Illinois Mark W Perlin, PhD, MD, PhD Cybergenetics,
DNA Mapping the Crime Scene: Do Computers Dream of Electric Peaks? 23rd International Symposium on Human Identification October, 2012 Nashville, TN Mark.
Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics © Duquesne University October, 2015 Pittsburgh, PA What’s in a Match?
Open Access DNA Database Duquesne University March, 2013 Pittsburgh, PA Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics ©
Jack Ballantyne and Mark Perlin International Conference on Inference and Statistics, July , Seattle, WA.
Objective DNA Mixture Information in the Courtroom: Relevance, Reliability & Acceptance NIST International Symposium on Forensic Science Error Management:
Simple Reporting of Complex DNA Evidence: Automated Computer Interpretation Promega 14th International Symposium on Human Identification Pointe Hilton.
DNA Identification: Quantitative Data Modeling Cybergenetics © Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.
DNA-led investigation through computer interpretation of evidence Pennsylvania State Police Training Seminar Hershey, PA April, 2014 Mark W Perlin, PhD,
Data summary – “alleles” Threshold Over threshold, peaks are labeled as allele events All-or-none allele peaks, each given equal status Allele Pair 8,
Issues with DNA Evidence, Past and Future Washington County Bar Association March, 2016 Washington, PA Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh,
Separating DNA Mixtures by Computer to Identify and Convict a Serial Rapist Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Garett Sugimoto,
Understanding DNA Evidence Beaver County Courthouse March, 2016 Beaver, PA Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA Cybergenetics ©
DNA: TrueAllele® Statistical Analysis, Probabilistic Genotyping
Validating TrueAllele® genotyping on ten contributor DNA mixtures
Shedding Light on Inconclusive DNA: TrueAllele® Computer Analysis
How to Defend Yourself Against DNA Mixtures
Explaining the Likelihood Ratio in DNA Mixture Interpretation
On the threshold of injustice: manipulating DNA evidence
Machines can work it out: Automated TrueAllele® workflow
Virginia TrueAllele® Validation Study: Casework Comparison
Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA, USA
American Academy of Forensic Sciences Criminalistics Section
Solving Crimes using MCMC to Analyze Previously Unusable DNA Evidence
Investigative DNA Databases that Preserve Identification Information
DNA Identification: Inclusion Genotype and LR
Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA, USA
DNA Identification: Mixture Interpretation
Exonerating the Innocent with Probabilistic Genotyping
David W. Bauer1, PhD Nasir Butt2, PhD Jeffrey Oblock2
Presentation transcript:

Probabilistic Software Workshop September 29, 2014 TrueAllele® Casework Mark W. Perlin, PhD, MD, PhD

Cybergenetics TrueAllele® Casework ViewStation User Client Database Server Interpret/Match Expansion Visual User Interface VUIer™ Software Parallel Processing Computers

Design Philosophy Use all the data (peak heights, replicates) Objective, no examination bias (no suspect) One architecture: evidentiary & investigative Model STR parameters & variation Infer genotypes, then match them Likelihood ratio (LR) match statistic

Basic Features Visual user interface Fast and easy to use Flexible workflow (batch, case, confirm) Greater productivity with fewer samples Consistent and accurate answers LR number can include or exclude Easy to explain results

Basic Capabilities Client (many users) Server (central database, parallel computing) Solves many problems at once (10 to 100) Fast on most mixtures (1-2 hours) Thorough on challenging mixtures Preserves identification information

Intended Application Low-template & degraded DNA Mixtures (any number of contributors) Kinship & paternity Investigative database Non-suspect CODIS search Familial search Disaster victim identification

Input Files Data Original sequencer data file (.fsa, .hid) GeneMapper® ID Peak Table Export (.txt) Genotype Reference profile (.txt) Kinship pedigree file (.txt) Population Many populations included (FBI, NIST, country, state, ...) Customizable allele frequency database (.txt)

Visual User Interaction

Output Files Probabilistic genotypes (.xls) Likelihood ratio match statistics (.xls) Contributor mixture weights (.xls) Specificity analysis results (.xls) CODIS-uploadable profiles (.cmf, .xml) Export mobile case results (.zip)

Likelihood Ratio Likelihood ratio (LR) requires genotype probability O( H | data) LR = O( H ) Bayes theorem + probability + algebra … Σx P{dX|X=x,…} P{dY|Y=x,…} P{X=x} = ΣΣx,y P{dX|X=x,…} P{dY|Y=y,…} P{X=x, Y=y} genotype probability: posterior, likelihood & prior

Genotype Inference Mixture weight variables Hierarchical Bayesian model induces a set of forces in a high-dimensional parameter space Genotype variables Hierarchical mixture weight locus variables • small DNA amounts • degraded contributions • K = 1, 2, 3, 4, 5, 6, ... unknown contributors • joint likelihood function

Markov chain Monte Carlo Sample from the posterior probability distribution Next state? Current state P{Next state} Transition probability = P{Current state}

Modeling STR Data Variation genotype Variance parameters Hierarchical (e.g., customized for DNA template or locus) Differential degradation Mixture weight Relative amplification PCR stutter PCR peak height Background noise Hierarchy of successive pattern transformations data

Types of DNA Profiles Simple mixtures (e.g., 2-3 contributors) Low-template DNA mixtures Low minor contributors (e.g., 5%-15%) Differentially degraded mixtures Multiple amplifications, jointly analyzed Many contributors (e.g., 4, 5, 6, ...)

STR Kits PowerPlex® 16 PowerPlex® 21 PowerPlex® Fusion Profiler®, COfiler® & Profiler Plus® Identifiler® & Identifiler® Plus GlobalFiler™ SGM plus® MiniFiler™ IDplex

Genetic Analyzers ABI 310 ABI 3100 ABI 3100-Avant ABI 3130 ABI 3130xl

Published Validation Studies Perlin MW, Sinelnikov A. An information gap in DNA evidence interpretation. PLoS ONE. 2009;4(12):e8327. Perlin MW, Legler MM, Spencer CE, Smith JL, Allan WP, Belrose JL, Duceman BW. Validating TrueAllele® DNA mixture interpretation. Journal of Forensic Sciences. 2011;56(6):1430-47. Ballantyne J, Hanson EK, Perlin MW. DNA mixture genotyping by probabilistic computer interpretation of binomially-sampled laser captured cell populations: Combining quantitative data for greater identification information. Science & Justice. 2013;53(2):103-14. Perlin MW, Belrose JL, Duceman BW. New York State TrueAllele® Casework validation study. Journal of Forensic Sciences. 2013;58(6):1458-66. Perlin MW, Dormer K, Hornyak J, Schiermeier-Wood L, Greenspoon S. TrueAllele® Casework on Virginia DNA mixture evidence: computer and manual interpretation in 72 reported criminal cases. PLOS ONE. 2014;(9)3:e92837. Perlin MW, Hornyak J, Sugimoto G, Miller K. TrueAllele® genotype identification on DNA mixtures containing up to five unknown contributors. Journal of Forensic Sciences. 2015;in press.

TrueAllele Casework on Virginia DNA mixture evidence: computer and manual interpretation in 72 reported criminal cases. Perlin MW, Dormer K, Hornyak J, Schiermeier-Wood L, Greenspoon S PLoS ONE (2014) 9(3): e92837 Sensitive The extent to which interpretation identifies the correct person True DNA mixture inclusions 101 reported genotype matches 82 with DNA statistic over a million

TrueAllele Sensitivity log(LR) match distribution 11.05 (5.42) 113 billion TrueAllele

Specific The extent to which interpretation does not misidentify the wrong person True exclusions, without false inclusions 101 matching genotypes x 10,000 random references x 3 ethnic populations, for over 1,000,000 nonmatching comparisons

TrueAllele Specificity log(LR) mismatch distribution – 19.47

Reproducible The extent to which interpretation gives the same answer to the same question MCMC computing has sampling variation duplicate computer runs on 101 matching genotypes measure log(LR) variation

TrueAllele Reproducibility Concordance in two independent computer runs standard deviation (within-group) 0.305

Manual Inclusion Method Over threshold, peaks become binary allele events Allele pairs 7, 7 7, 10 7, 12 7, 14 10, 10 10, 12 10, 14 12, 12 12, 14 14, 14 All-or-none allele peaks, disregard quantitative data Analytical threshold

CPI Information 6.83 (2.22) 6.68 million CPI Combined probability of inclusion Simplify data, easy procedure, apply simple formula PI = (p1 + p2 + ... + pk)2

Modified Inclusion Method Higher threshold for human review Apply two thresholds, doubly disregard the data Stochastic threshold in 2010 Analytical threshold in 2000

Modified CPI Information 6.83 (2.22) 6.68 million CPI 2.15 (1.68) 140 mCPI

Method Comparison 6.83 (2.22) 6.68 million CPI 2.15 (1.68) mCPI 140 11.05 (5.42) 113 billion TrueAllele

Method Accuracy Kolmogorov Smirnov test K-S p-value 0.106 0.215 0.106 0.215 0.561 1e-22 0.735 1e-25

Invariant Behavior no significant difference in regression line slope Perlin MW, Hornyak J, Sugimoto G, Miller K. TrueAllele® genotype identification on DNA mixtures containing up to five unknown contributors. Journal of Forensic Sciences. 2015;in press. Invariant Behavior no significant difference in regression line slope (p > 0.05)

Sufficient Contributors small negative slope values statistically different from zero (p < 0.01)

Training Requirements Science & Software Lectures and reading Three day hands-on instruction Operator Training Problem solving on challenging data Grading, examinations & certification Reporting & Testifying (Optional)

User Support User education and training Software manuals and procedures Validation assistance and service Admissibility, reporting & testifying Cybergenetics website YouTube TrueAllele channel Software, hardware & networking On-site and remote (by phone or Internet)

Computer Hardware System Requirements (For VUIer™ Client Software)   Windows Mac Operating System Windows XP Windows 7 Mac OS X v10.6 Snow Leopard Mac OS X v10.7 Lion Mac OS X v10.8 Mountain Lion Mac OS X v10.9 Mavericks Processor At least 1 GHz At least 1 GHz Intel (Core 2 duo) Memory At least 256 MB At least 1 GB

Interpret and identify Future Updates 1999. Version 1 2004. Model refinement 2009. Version 25, deployment 2014. Cloud computing Interpret and identify anywhere, anytime Your cloud, or ours

Cybergenetics Experience Invented math & algorithms 20 years Developed computer systems 15 years Support users and workflow 10 laboratories Used routinely in casework 3 labs Validate system reliability 20 studies Educate the community 50 talks Train & certify analysts 200 students Go to court for admissibility 5 hearings Testify about LR results 20 trials Educate lawyers and laymen 1,000 people Make the ideas understandable 200 reports

Admissibility Hearings • California • Pennsylvania • Virginia • United Kingdom • Australia Appellate precedent in Pennsylvania

TrueAllele in Criminal Trials About 200 case reports filed on DNA evidence Court testimony: • state • federal • military • international Crimes: • armed robbery • child abduction • child molestation • murder • rape • terrorism • weapons

Investigative DNA Database Infer genotypes, and then match with LR World Trade Center disaster

TrueAllele YouTube channel Further Information http://www.cybgen.com/information • Courses • Newsletters • Newsroom • Patents • Presentations • Publications • Webinars http://www.youtube.com/user/TrueAllele TrueAllele YouTube channel perlin@cybgen.com