Overview  Introduction  Biological network data  Text mining  Gene Ontology  Expression data basics  Expression, text mining, and GO  Modules and complexes  Domains and conclusion

Scenario  Ran a set of expression experiments to study a given disease state.  Need to put the results into a functional context.

Atherosclerosis  Most common fatal disease in the U.S., and not well understood.

Microarray analysis  Analyzed 51 artery segments from the hearts of 22 heart transplant patients.  Classified segments by their disease pathology.  Will assess the differences between Type I (moderate) and Type V (severe) atherosclerosis.  Performed microarray analysis of each segment.  Agilent expression array with probesets for 13,000 human genes.

SAM microarray statistic  For each gene i, contrasted expression in Type I and Type V lesions with SAM (Proc Natl Acad Sci USA 98: 5116-21, 2001).  Large positive SAM score: gene expressed more highly in Type V lesions.  Large negative SAM score: gene expressed more highly in Type I lesions.
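As a rough illustration of the score described above, the sketch below computes a SAM-style d-score for one gene: the difference in group means divided by a pooled standard error plus a small constant s0. The function name, the example values, and the fixed `s0` are assumptions made for illustration; the published method chooses s0 from the data and assesses significance by permutation, neither of which is shown here.

```python
from math import sqrt

def sam_d_score(type_i, type_v, s0=0.1):
    """SAM-style relative difference for one gene (illustrative sketch).

    type_i, type_v: expression values for the gene in Type I and Type V segments.
    s0: small positive constant added to the denominator (assumed value here).
    A positive result means higher expression in Type V lesions.
    """
    n1, n2 = len(type_i), len(type_v)
    mean1 = sum(type_i) / n1
    mean2 = sum(type_v) / n2
    # Pooled sum of squared deviations across both groups
    ss = sum((x - mean1) ** 2 for x in type_i) + sum((x - mean2) ** 2 for x in type_v)
    # Gene-specific scatter (pooled standard error of the mean difference)
    s = sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (mean2 - mean1) / (s + s0)

# Made-up expression values, not data from the study
d = sam_d_score([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], s0=1.0)
print(round(d, 3))
```

Swapping the two groups flips the sign, which matches the convention on the slide: positive for Type V, negative for Type I.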

Analysis pipeline 1. Biomarker identification  For formal studies, use machine learning methods  For exploratory work, select several genes with extreme SAM scores.
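For the exploratory route, picking genes with extreme SAM scores can be sketched as a simple sort. The function name, the cutoff `k`, and the gene labels below are hypothetical placeholders, not values from the study.

```python
def extreme_genes(scores, k=5):
    """Return (up_in_type_v, up_in_type_i): the k highest and k lowest scoring genes.

    scores: dict mapping gene name to SAM d-score.
    If k exceeds half the number of genes, the two lists will overlap.
    """
    ranked = sorted(scores, key=scores.get)  # ascending by score
    k = max(0, min(k, len(ranked)))
    up_in_type_i = ranked[:k]                       # most negative first
    up_in_type_v = ranked[len(ranked) - k:][::-1]   # most positive first
    return up_in_type_v, up_in_type_i

# Made-up scores for illustration only
example = {"g1": 4.2, "g2": -3.9, "g3": 0.1, "g4": 2.5, "g5": -0.4}
up_v, up_i = extreme_genes(example, k=2)
print(up_v, up_i)
```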

Analysis pipeline, continued 2. Biomarker association  Basic question: for this context, what is common among the biomarker genes?  Approaches  Exhaustive reading  GO analysis  Literature searching

Pros and cons of this approach  Pro: associations are specific to this disease context  Pro: identifies relevant literature  Con: might not find associations for all of your biomarkers  Con: might find associations for other genes

Iterative literature searching  Perform an initial search  Color the network by SAM d-score  Identify any new “responsive” genes  Add to biomarker list  Repeat
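The loop above can be sketched in code, assuming the literature search is available as a function that takes the current biomarker set and returns the genes in the resulting network. That `search` callable, the `threshold` on the absolute d-score, and the toy network are all hypothetical stand-ins; in practice the "responsive" genes are picked by eye from the colored network.

```python
def iterate_biomarkers(seed, scores, search, threshold, max_rounds=5):
    """Grow a biomarker list by repeated literature searching (illustrative sketch).

    seed: initial biomarker genes.
    scores: dict mapping gene name to SAM d-score.
    search: hypothetical callable, set of genes -> set of genes in the network.
    threshold: minimum |d-score| for a network gene to count as responsive.
    """
    biomarkers = set(seed)
    for _ in range(max_rounds):
        network_genes = search(biomarkers)
        # New responsive genes: in the network, not yet listed, extreme score
        new = {
            g for g in network_genes
            if g not in biomarkers and abs(scores.get(g, 0.0)) >= threshold
        }
        if not new:
            break  # nothing new turned up, so stop iterating
        biomarkers |= new
    return biomarkers

# Toy co-mention network and made-up scores
neighbors = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
toy_scores = {"A": 5.0, "B": -4.0, "C": 0.2, "D": 3.5}

def toy_search(genes):
    return {n for g in genes for n in neighbors.get(g, [])}

print(sorted(iterate_biomarkers({"A"}, toy_scores, toy_search, threshold=3.0)))
```

In the toy run, gene D is only reachable after B has been added, which is one reason to iterate rather than search once.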

Discussion topic Why not use all genes with extreme SAM scores as biomarkers? Why iterate?

Once you have a good network: 1. Use BiNGO to identify the enriched GO terms 2. Look at the genes corresponding to selected enriched terms 3. Check the literature search sentences for those genes 4. Choose one or two sentences and read the corresponding abstracts 5. Iterate if desired (or go to lunch)
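To give a feel for what "enriched" means in step 1, here is a minimal sketch of a hypergeometric over-representation test, the kind of test GO enrichment tools commonly apply. It is not BiNGO's implementation, it omits multiple-testing correction, and the function name and the numbers in the example are assumptions for illustration.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, annotated, selected, hits):
    """P(X >= hits) under the hypergeometric distribution (illustrative sketch).

    total_genes: genes in the reference set (e.g. all genes on the array).
    annotated: reference genes carrying the GO term.
    selected: genes in the network being tested.
    hits: selected genes carrying the GO term.
    """
    upper = min(annotated, selected)
    denom = comb(total_genes, selected)
    # Sum the upper tail: hits or more annotated genes among the selected ones
    tail = sum(
        comb(annotated, i) * comb(total_genes - annotated, selected - i)
        for i in range(hits, upper + 1)
    )
    return tail / denom

# Made-up counts: 10 reference genes, 5 annotated, 5 selected, all 5 are hits
print(hypergeom_enrichment_p(10, 5, 5, 5))
```

A small p-value says the term shows up among the network genes more often than chance would suggest; whether that is biologically interesting is what steps 2 to 4 are for.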

Final points  No right or wrong answers, only plausible or novel hypotheses.  You can take any approach you wish.  “If it was easy, everyone would be doing it”.
