Published by Ross Warner. Modified over 9 years ago.
1
Web Technologies for Bioinformatics
Ken Baclawski
2
Data Formats
• Flat files
• Spreadsheets
• Relational databases
• Web sites
3
XML Documents
• Flexible, very popular text format
• Self-describing records
4
XML Documents (continued)
• Hierarchical structure
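As a minimal sketch of what "self-describing" and "hierarchical" mean in practice, the Python snippet below parses a small XML record with the standard library's `xml.etree.ElementTree` and prints its nesting. The element and attribute names (`experiment`, `sample`, `concentration`, `unit`) are hypothetical and chosen only for illustration; they are not taken from the presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element and attribute names are illustrative only.
DOC = """
<experiment id="E1">
  <sample id="S1">
    <concentration unit="nm">43</concentration>
  </sample>
  <sample id="S2">
    <concentration unit="um">0.53</concentration>
  </sample>
</experiment>
"""

def describe(element, depth=0):
    """Return one indented line per element, showing the hierarchy."""
    attrs = " ".join(f'{k}="{v}"' for k, v in element.attrib.items())
    text = (element.text or "").strip()
    line = "  " * depth + element.tag
    if attrs:
        line += f" [{attrs}]"
    if text:
        line += f": {text}"
    lines = [line]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines

root = ET.fromstring(DOC)
outline = describe(root)
print("\n".join(outline))
```

Each value carries its own label (the tag) and its own metadata (the attributes), so a reader or a program can interpret the record without a separate column layout.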
5
Purpose of Data
• Data is collected and stored for a purpose.
• The format serves that purpose.
• Using data for another purpose is common.
• Data presentation (such as on a Web site) is one example of such a use.
• It is important to anticipate that data will be used for many purposes.
• Data is reused by transforming it.
6
Statistical Analysis as a Transformation Process
• Transformation consists of a series of steps.
• Specialized equipment and software are used for each step.
• Separation into steps reduces the overall effort.
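The idea of a transformation built from separate steps can be sketched in Python as a chain of small functions. This is only an illustration under stated assumptions: the step names (`parse`, `drop_outliers`, `summarize`), the plausibility limit, and the input readings are all hypothetical.

```python
from functools import reduce
from statistics import mean

def parse(lines):
    """Step 1: turn raw text lines into floats, skipping blank lines."""
    return [float(s) for s in lines if s.strip()]

def drop_outliers(values, limit=100.0):
    """Step 2: discard readings above a (hypothetical) plausibility limit."""
    return [v for v in values if v <= limit]

def summarize(values):
    """Step 3: reduce the cleaned values to summary statistics."""
    return {"n": len(values), "mean": mean(values)}

def pipeline(*steps):
    """Compose the steps left to right into a single transformation."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)

analyze = pipeline(parse, drop_outliers, summarize)
result = analyze(["4.0", "", "6.0", "250.0"])
print(result)
```

Because each step only depends on the output of the previous one, a step can be replaced or reused in another pipeline without touching the rest.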
7
Web Site Construction
• Web sites can be constructed using a Web site authoring tool (e.g., FrontPage).
• Alternatively, one could use a transformation process to separate concerns.
8
Advantages of Transformation
• Reduces the overall effort.
• Presentation style is independent of the source content.
• Presentation style can be changed with immediate effect.
• Uniform enforcement of presentation style.
• Updates to content are immediate.
• Content can be used for many other purposes:
  – Many reports in many formats
  – Proposals
  – Data sharing with other institutions
  – Data mining
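One way to picture the separation of content from presentation is a single data source with several independent renderers. The sketch below is an assumption-laden illustration: the records and the function names `to_html` and `to_csv` are hypothetical, and only Python's standard library is used.

```python
import csv
import io
from html import escape

# Hypothetical source content, kept free of any presentation detail.
RECORDS = [
    {"name": "sample-1", "value": 1.5},
    {"name": "sample-2", "value": 2.5},
]

def to_html(records):
    """Presentation 1: an HTML table for a Web page."""
    rows = "".join(
        f"<tr><td>{escape(r['name'])}</td><td>{r['value']}</td></tr>"
        for r in records
    )
    return f"<table>{rows}</table>"

def to_csv(records):
    """Presentation 2: a CSV report for sharing or data mining."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["name", "value"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()

html_page = to_html(RECORDS)
csv_report = to_csv(RECORDS)
print(html_page)
print(csv_report)
```

Changing the table markup in `to_html` restyles every record at once, and adding a record to `RECORDS` updates both outputs the next time they are generated.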
9
Transformation Languages
• Traditional programming languages such as Perl, Java, etc.
• Rule-based (declarative) languages such as XSL Transformations (XSLT), the XML transformation language.
  – Rule-based rather than procedural
  – Transform each kind of element with a template
  – Matching and processing of elements is analogous to the digestion of polymers with enzymes.
10
Transformation as Digestion
• The blue enzyme attacks the polymer at two locations.
• The resulting three polymers are then attacked by the green enzyme.
11
XSLT “Digestion”
• An XSLT program consists of templates.
• Each template processes a set of matching elements.
• A template can break up the element to be processed by other templates.
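The template idea on this slide can be imitated in Python to show the control flow. This is an analogy, not XSLT itself and not the presenter's code: the helper names (`template`, `apply_templates`, `process_children`) and the element names (`pathway`, `protein`, `concentration`) are hypothetical.

```python
import xml.etree.ElementTree as ET

TEMPLATES = {}

def template(tag):
    """Register a function as the rule for elements with this tag."""
    def register(func):
        TEMPLATES[tag] = func
        return func
    return register

def apply_templates(element):
    """Find the rule matching the element; by default, recurse into children."""
    rule = TEMPLATES.get(element.tag, process_children)
    return rule(element)

def process_children(element):
    """Break the element up and hand each piece to its matching rule."""
    return "".join(apply_templates(child) for child in element)

@template("pathway")
def pathway(element):
    return f"<ul>{process_children(element)}</ul>"

@template("protein")
def protein(element):
    return f"<li>{element.get('id')}: {process_children(element)}</li>"

@template("concentration")
def concentration(element):
    return f"{element.text} {element.get('unit')}"

DOC = (
    '<pathway><protein id="P1">'
    '<concentration unit="nm">43</concentration>'
    '</protein></pathway>'
)
result = apply_templates(ET.fromstring(DOC))
print(result)
```

Each rule handles one kind of element and delegates the pieces it does not understand, which roughly mirrors how one enzyme cuts a polymer and leaves the fragments for the next.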
12
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
13
5.67 <Concentration unit="nm">43 <interaction substrate="Sub89033"> 4.37 75 <P id="Mtr245"> 0.65 <Concentration unit="um">0.53 <interaction substrate="Sub80933"> 8.87 8.4 <S id="Sub89032"/>
14
5.67 43 4.37 75 0.65 0.53 8.87 8.4
15
Ontologies
• The structure of data is its ontology.
  – Database schema
  – XML Document Type Definition (DTD)
• An ontology defines the concepts in a domain and the relationships between them.
• Transformations are fundamental:
  – Queries
  – Organizing data (views)
  – Transformation for new purposes
16
Research Areas
• Ontologies for bioinformatics
• Ontology development in general
  – Constructing ontologies
  – Validation and testing of ontologies
• New ontology languages to capture more meaning
• Transformation languages
17
Research Areas
• Inference and deduction
  – Logical inference
  – Probabilistic inference
  – Scientific inference
  – Other forms of inference
• Integrating inference with
  – Data mining
  – Experimental processes