Flow Cytometry and Reproducible Analysis Cliburn Chan Department of Biostatistics and Bioinformatics, DUMC.


Reproducible Analysis
Can someone in a different lab replicate your results?
Can someone else in your lab replicate your results?
Can you replicate your own results
  – 6 months later?
  – When FlowJo goes from version 10.0 to 11.0?
  – When your lab catches fire and all your computers melt into toxic waste?

Complexity of flow analysis
Experimental design
Running the experiment
Raw data (FCS files)
Compensation
Transformation
Gating strategy
Gates → MFI and relative frequencies
Statistical analysis
  – e.g. outcome correlation

Experimental design
Is randomization done correctly?
Is the sample size sufficient?
Is there an SOP for annotating the experiment?
  – MIATA
  – MIFlowCyt
What is the informatics strategy to ensure that data is recorded accurately and backed up safely?

Running the experiment
Stuff I know little about …
Janet and Jennifer will teach in this workshop
  – Instrument calibration
  – Bridging studies
  – Reagent qualification
  – Use of appropriate biological controls
  – Use of appropriate technical controls

Raw data (FCS files)
Is there a file naming SOP that is followed?
Is there an SOP for recording FCS metadata?
  – Channel labels: fluorochrome, antibody, FMO

Inconsistent annotation example

Compensation, transformation and gating strategy
Compensation is $\mathrm{Real} = \mathrm{Spillover}^{-1} \times \mathrm{Observed}$
Transformation is complicated: think of it as linear for low values and logarithmic for high values
Gating strategy is hard to replicate, but can be stored as a template and "re-used" with tweaking
Compensation, transformation and gating should be done on a per-batch and not a per-file basis
Recommendation: store the workspace containing this information in both .jo and .xml formats
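
The compensation formula and the "linear at low values, log at high values" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 2x2 spillover matrix and the event values are made up, the matrix is assumed to have detectors as rows and fluorochromes as columns, and the inverse hyperbolic sine with a cofactor of 150 is just one common transform with the described shape, not necessarily what FlowJo applies.

```python
import math


def invert_2x2(m):
    """Return the inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("spillover matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def compensate(spillover, observed):
    """Apply Real = Spillover^-1 x Observed to one event (column vector)."""
    inv = invert_2x2(spillover)
    return [sum(inv[i][j] * observed[j] for j in range(2)) for i in range(2)]


def asinh_transform(value, cofactor=150.0):
    """Roughly linear near zero and logarithmic for large values.

    The cofactor is an assumption chosen for illustration.
    """
    return math.asinh(value / cofactor)


# Hypothetical spillover matrix: rows are detectors, columns are fluorochromes.
SPILLOVER = [[1.0, 0.1],
             [0.2, 1.0]]

# Hypothetical observed signal for one event in the two detectors.
observed_event = [105.0, 70.0]

real_event = compensate(SPILLOVER, observed_event)
print([round(x, 6) for x in real_event])
print([round(asinh_transform(x), 4) for x in real_event])
```

Because the same matrix and transform parameters are applied to every event, keeping them in one place per batch makes it easier to rerun the analysis later.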

Working with statisticians
At some point, a statistician is likely to be asked to analyze your data. This can lead to much unhappiness.
Statisticians do not like Excel
  – The first thing they will try to do is export to a CSV or delimited file, for import into SAS or R
  – If this is difficult to do, they will not like you

Excel rules for happy statisticians
1 worksheet = 1 table
1 cell = 1 value
Data/metadata = comprehensive & consistent
Formatting = None
Validation = Yes

1 worksheet = 1 table
A table has column headers and a number of rows and nothing else: it is RECTANGULAR
Do not put more than 1 table in a worksheet
Do not use non-rectangular tables
Example of good worksheet
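
One way to test the "rectangular" rule is to export the worksheet to CSV and check that every row has as many fields as the header. The sketch below uses only the Python standard library; the column names and values are invented for illustration, and `check_rectangular` is a hypothetical helper, not part of any existing tool.

```python
import csv
import io


def check_rectangular(csv_text):
    """Return a list of problems; an empty list means the table is rectangular."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]
    problems = []
    if any(name.strip() == "" for name in header):
        problems.append("header has an empty column name")
    if len(set(header)) != len(header):
        problems.append("header has duplicate column names")
    # Data rows start on line 2 of the exported file.
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {line_no} has {len(row)} fields, expected {len(header)}"
            )
    return problems


# Hypothetical export of a well-formed worksheet.
GOOD = "subject,tube,count\nS01,1,1200\nS02,1,980\n"

# Hypothetical export where a second, smaller table was placed below the first.
BAD = "subject,tube,count\nS01,1,1200\n\nsummary\nmean,1090\n"

print(check_rectangular(GOOD))
for problem in check_rectangular(BAD):
    print(problem)
```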

1 worksheet = 1 table

1 cell = 1 value
Easy to filter by tube, sample or subject
Easy to write validation rules or a lookup table

1 cell = 1 value
ID column has 3 different values
Need to do text parsing to recover information: very error prone
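
To see why text parsing is error prone, consider a sketch that splits a combined ID back into separate columns. The ID layout used here (subject, sample type and tube joined by underscores) is purely hypothetical, since the slide does not show the real format; any ID that deviates slightly from the assumed pattern fails to parse and has to be fixed by hand.

```python
import re

# Hypothetical layout: subject, sample type and tube joined by underscores.
ID_PATTERN = re.compile(
    r"^(?P<subject>S\d+)_(?P<sample>[A-Za-z]+)_(?P<tube>T\d+)$"
)


def split_id(combined):
    """Split a combined ID into its parts, or return None if it does not parse."""
    match = ID_PATTERN.match(combined.strip())
    if match is None:
        return None
    return match.groupdict()


# The second and third IDs deviate from the assumed layout in small ways.
ids = ["S01_PBMC_T3", "S02-PBMC-T1", "S03_PBMC"]
parsed = [split_id(value) for value in ids]

for raw, fields in zip(ids, parsed):
    print(raw, "->", fields if fields is not None else "COULD NOT PARSE")
```

Storing subject, sample and tube in three separate columns from the start avoids this step entirely.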

Data: column names
Use consistent column names across worksheets; avoid mixing variants such as
  – Singlets/Lymphocytes
  – Singlet/Lymphs
  – Singlets / Lymphocytes
  – Singlets/Lymphoctyes
Use the full gating path for the column name
  – Singlets/Lymphocytes/Viable/CD4+/CM/IFN+
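
A statistician receiving several worksheets might check column names against an agreed list before merging them. The following is a minimal sketch, assuming a hypothetical canonical list of gating paths; it uses the standard-library `difflib` module to suggest the nearest canonical name for anything unrecognized.

```python
import difflib

# Hypothetical agreed list of gating paths used as column names.
CANONICAL = ["Singlets/Lymphocytes", "Singlets/Lymphocytes/Viable"]


def normalize(name):
    """Strip whitespace around each gate in a slash-separated gating path."""
    return "/".join(part.strip() for part in name.split("/"))


def check_columns(columns, canonical=CANONICAL):
    """Map each column name to ('ok', name) or ('unknown', nearest suggestion)."""
    report = {}
    for col in columns:
        cleaned = normalize(col)
        if cleaned in canonical:
            report[col] = ("ok", cleaned)
        else:
            guess = difflib.get_close_matches(cleaned, canonical, n=1, cutoff=0.6)
            report[col] = ("unknown", guess[0] if guess else None)
    return report


# The four variants listed on the slide.
variants = [
    "Singlets/Lymphocytes",
    "Singlet/Lymphs",
    "Singlets / Lymphocytes",
    "Singlets/Lymphoctyes",
]
report = check_columns(variants)
for name, result in report.items():
    print(repr(name), result)
```

Only whitespace differences are repaired automatically here; abbreviations and typos are flagged for a human to resolve, since guessing silently would undermine reproducibility.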

Data: What to record
Better to have more data than less data
  – Sample type (PBMC, whole blood)
  – Recovery
  – Viability
Better to have basic than derived data
  – Counts are better than relative frequencies
Keep a link to the raw data for reproducibility
  – Path to FCS and workspace files on the server
Use a special indicator for missing data (e.g. NAN), not zero
Use as many columns as you need and name them sensibly and consistently
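
The two points about counts and missing values can be illustrated together. In this sketch the table, its column names and its numbers are all invented; it derives relative frequencies from raw counts and shows how the mean changes if a missing value is wrongly treated as zero.

```python
import csv
import io
import math

# Hypothetical export: S02 has a missing count, S03 has a true zero.
CSV_TEXT = """subject,parent_count,cd4_count
S01,10000,2500
S02,8000,NAN
S03,12000,0
"""


def to_number(cell):
    """Convert a cell to float; float() reads 'NAN' as not-a-number."""
    return float(cell)


def relative_frequency(count, parent):
    """Derive a frequency from counts, propagating missing values."""
    if math.isnan(count) or math.isnan(parent) or parent == 0:
        return math.nan
    return count / parent


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
freqs = [
    relative_frequency(to_number(r["cd4_count"]), to_number(r["parent_count"]))
    for r in rows
]

# Mean over the values that were actually measured.
observed = [f for f in freqs if not math.isnan(f)]
mean_observed = sum(observed) / len(observed)

# Mean if the missing value had been recorded as zero instead.
mean_if_zero = sum(0.0 if math.isnan(f) else f for f in freqs) / len(freqs)

print(freqs)
print(mean_observed, mean_if_zero)
```

With counts recorded, the frequencies can always be recomputed; the reverse is not possible, which is why the basic data is the safer thing to store.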

Data: Versioning
Do not change the data in the worksheet once it has been handed to the statistician.
If there are errors that must be corrected, make a new copy, label the filename with date and version, and send that to the statistician
  – ArcticRatExperiment_07May2013_Version01.xlsx
  – ArcticRatExperiment_17May2013_Version02.xlsx
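
The naming scheme above is easy to generate programmatically, which reduces the chance of typing mistakes. The sketch below is one possible approach, not a prescribed tool: `versioned_filename` and `content_digest` are hypothetical helpers, and the checksum is an added assumption, offered as a way to confirm that a file has not changed since it was handed over.

```python
import hashlib
from datetime import date

# Fixed English abbreviations so the result does not depend on the locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def versioned_filename(stem, when, version, extension="xlsx"):
    """Build a name such as Stem_07May2013_Version01.xlsx."""
    stamp = f"{when.day:02d}{MONTHS[when.month - 1]}{when.year}"
    return f"{stem}_{stamp}_Version{version:02d}.{extension}"


def content_digest(data):
    """Return a SHA-256 hex digest of the file contents (bytes)."""
    return hashlib.sha256(data).hexdigest()


first = versioned_filename("ArcticRatExperiment", date(2013, 5, 7), 1)
second = versioned_filename("ArcticRatExperiment", date(2013, 5, 17), 2)
print(first)
print(second)

# Any change to the contents, however small, gives a different digest.
print(content_digest(b"subject,count\nS01,1200\n"))
print(content_digest(b"subject,count\nS01,1201\n"))
```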

Formatting
Don't do it. Avoid conveying information via:
  – Highlighting
  – Fancy spacing
  – Different fonts and font effects
  – Merging cells
  – Comments
Will it survive a round-trip from Excel to CSV and back again?

Formatting - Before

Formatting - After
Comments are lost
Highlighting is lost
Bad cell formatting is lost
Merged cells become missing information

Summary of Reproducible Analysis
Know what you are doing from PBMC to Excel
SOPs are important
Annotation is important
Excel is OK if you use NONE of its features
Keep all necessary data in the same place
Keep a remote backup
Talk with your statistician

Biologist talks to Statistician