Midterm project Course: Statistics in Bioinformatics Date: Advisor: 陳光琦 Student: 吳昱賢
GEO at the National Center for Biotechnology Information (NCBI)
Figure 4 Schematic overview of the query workflow, and how the various features and tools are interlinked.
GEO Database The Gene Expression Omnibus (GEO) is a public repository (database) that archives and freely distributes high-throughput gene expression data submitted by the scientific community.
Figure 1 Schematic diagram of the relations between GEO Platform, Sample, DataSet, and Profiles. For each gene on a Platform, multiple Sample measurement values are generated. Related Samples constitute a DataSet, from which multiple gene expression profile entities are generated.
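The relations in Figure 1 can be sketched as a small data model. This is only an illustration: the class names, fields, accessions, and values below are hypothetical and do not reflect GEO's actual schema or any NCBI API.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One hybridization: a measurement value for each gene on the Platform."""
    accession: str                 # hypothetical identifier, e.g. "sample_1"
    values: dict[str, float]       # gene identifier -> measured value


@dataclass
class DataSet:
    """A collection of related Samples measured on the same Platform."""
    accession: str
    platform_genes: list[str]      # genes defined by the Platform
    samples: list[Sample] = field(default_factory=list)

    def profile(self, gene: str) -> list[float]:
        """Expression profile: one gene's values across all Samples."""
        return [s.values[gene] for s in self.samples]


# Toy, made-up numbers purely for illustration.
toy = DataSet(
    accession="toy_dataset",
    platform_genes=["geneX", "geneY"],
    samples=[
        Sample("sample_1", {"geneX": 1.5, "geneY": 8.0}),
        Sample("sample_2", {"geneX": 2.5, "geneY": 7.0}),
    ],
)
toy_profiles = {g: toy.profile(g) for g in toy.platform_genes}
```

Here each gene on the Platform yields exactly one profile per DataSet, which mirrors the caption's statement that multiple profile entities are generated from one DataSet.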
Figure 2 Screenshot of a typical DataSet record GDS877 (Gonzalez et al., 2005). The record includes a summary of the experiment, links to related records and publications, subset designations and classifications, download options, and access to mining features such as cluster heat maps and the ‘Query group A vs B’ tool.
Figure 3 Screenshot of Entrez GEO Profiles retrieval results; each entity includes sequence identifier and DataSet information, and a thumbnail profile image. Links to other Entrez databases or related profiles are provided above the thumbnail image. The expanded profile chart depicts values (bars) and rank (squares) information for the crystallin gene across each Sample in GEO DataSet GDS877 (Gonzalez et al., 2005). Experimental subset groupings are reflected in labels at the foot of the chart.
Who can use GEO data? Anybody can access and download public GEO data. There are no login requirements. All data are in the public domain, but please read the GEO data disclaimer.
How can I query and analyze GEO data? Several features are provided to assist with the exploration, visualization, and analysis of GEO data. These include individual gene expression profile charts, DataSet hierarchical and K-means/median clusters, DataSet value distribution charts, a 'Query mean group A vs B' tool, and profile and sequence neighbor searches. Alternatively, the full text, tab-delimited value data tables provided with DataSet downloads (available on the DataSet record, or via FTP) may be suitable for upload into your preferred microarray analysis software package. Please see the GEO overview or recent publications for more information.
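As a rough illustration of working with a downloaded tab-delimited value table, the sketch below compares the mean of two sample groups per gene, in the spirit of a 'group A vs B' comparison. It is not GEO's implementation of that tool. The column names (`ID_REF`, `IDENTIFIER`), the sample labels, and all values are assumptions made up for this toy example; a real table should be inspected before adapting the code.

```python
import csv
import io
from statistics import mean

# Toy tab-delimited table with made-up values (not real GEO data).
TOY_TABLE = (
    "ID_REF\tIDENTIFIER\tS1\tS2\tS3\tS4\n"
    "probe_1\tgeneX\t1.0\t3.0\t5.0\t7.0\n"
    "probe_2\tgeneY\t10.0\t10.0\t4.0\t6.0\n"
)


def group_means(table_text, group_a, group_b):
    """Return {gene: (mean_A, mean_B, mean_A - mean_B)} for each table row.

    group_a and group_b are lists of sample column names.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    results = {}
    for row in reader:
        a = mean(float(row[s]) for s in group_a)
        b = mean(float(row[s]) for s in group_b)
        results[row["IDENTIFIER"]] = (a, b, a - b)
    return results


comparison = group_means(TOY_TABLE, group_a=["S1", "S2"], group_b=["S3", "S4"])
for gene, (mean_a, mean_b, diff) in comparison.items():
    print(f"{gene}: mean A = {mean_a}, mean B = {mean_b}, difference = {diff}")
```

A plain difference of means is used here only for simplicity; a real analysis would normally add a statistical test and handle missing values.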