S. Herget, R. Ranzinger, K. Maass and C.-W. v. d. Lieth. Presented by Yingxin Guo. GlycoCT—a unifying sequence format for carbohydrates.



An overview of the sequence formats used in glycobioinformatics

Special structural features

Uniqueness: a central requirement for encoding carbohydrate sequences
Why:
- A unique sequence serves as a primary key in a database.
- It benefits the implementation of exact structure search.
How:
- Apply strict sorting rules.
- Define a controlled vocabulary.
- Support encoding of uncertain linkages and unspecified monosaccharides.

General idea of GlycoCT

Basic monosaccharide namespace

Basic residue (RES) entities in GlycoCT. Substituents and other entities.

Modeling the topology
- Residue entities are modeled in the RES section.
- Linkages are modeled in the LIN section.
- Atom replacement schema.
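The split into a RES section and a LIN section can be pictured as two flat lists, one of residues and one of linkages that refer to residues by index. The sketch below is only an illustration: the class names, field names, single-letter codes and the exact text layout are assumptions modelled on the condensed GlycoCT notation, not details taken from these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    index: int  # position of the entry in the RES section
    kind: str   # assumed codes: "b" = basetype, "s" = substituent
    name: str   # e.g. "b-dglc-HEX-1:5" or "n-acetyl"


@dataclass(frozen=True)
class Linkage:
    index: int        # position of the entry in the LIN section
    parent: int       # RES index of the parent residue
    child: int        # RES index of the child residue
    parent_pos: int   # linkage position at the parent
    child_pos: int    # linkage position at the child
    parent_type: str  # assumed atom replacement code at the parent
    child_type: str   # assumed atom replacement code at the child


def to_text(residues, linkages):
    """Write residues and linkages as a RES block followed by a LIN block."""
    lines = ["RES"]
    lines += [f"{r.index}{r.kind}:{r.name}" for r in residues]
    lines.append("LIN")
    lines += [
        f"{l.index}:{l.parent}{l.parent_type}"
        f"({l.parent_pos}+{l.child_pos}){l.child}{l.child_type}"
        for l in linkages
    ]
    return "\n".join(lines)


# Hypothetical example: a glucose basetype carrying an N-acetyl substituent.
residues = [
    Residue(1, "b", "b-dglc-HEX-1:5"),
    Residue(2, "s", "n-acetyl"),
]
linkages = [Linkage(1, parent=1, child=2, parent_pos=2, child_pos=1,
                    parent_type="d", child_type="n")]
text = to_text(residues, linkages)
```

Keeping the substituent as its own RES entry, attached through a LIN entry, is what lets a small basetype vocabulary cover many named monosaccharides.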

Encoding linkage

Encoding repeating units

Encoding alternative units

Encoding underdetermined units

Sorting
Why:
- A central requirement is to generate a unique representation for every carbohydrate.
- Sorting determines the order in which elements appear.
How: a set of hierarchical rules defines the ordering of residues, linkages and special structural features in GlycoCT.
- Residue comparison algorithm
- Linkage comparison algorithm
- Underdetermined subtree comparison algorithm
- Alternative subtree comparison algorithm
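Why sorting yields uniqueness can be seen in a toy example. The sketch below is not the GlycoCT rule set: it replaces the hierarchical rules with a plain lexical sort of child subtrees, and the tuple-based tree and the `canonical` function are hypothetical names chosen for illustration.

```python
def canonical(node):
    """Serialize a (name, children) tree so that child input order is irrelevant.

    Assumption: children are ordered lexically by their own serialization,
    a stand-in for the hierarchical comparison rules.
    """
    name, children = node
    parts = sorted(canonical(child) for child in children)
    return name + ("(" + ",".join(parts) + ")" if parts else "")


# The same branched structure entered with its branches in two different orders.
a = ("Man", [("Gal", []), ("Glc", [("Fuc", [])])])
b = ("Man", [("Glc", [("Fuc", [])]), ("Gal", [])])

key_a = canonical(a)
key_b = canonical(b)

# Because both inputs map to one string, the string can act as a database key.
index = {key_a: "entry-1"}
found = index.get(key_b)
```

Exact structure search then reduces to a key lookup, which is the benefit the uniqueness requirement is after.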

Residue comparison
Applied when multiple starting points exist.
Rules:
1. Number of child residues.
2. Length of the longest branch.
3. Number of terminal residues.
4. Number of branching points.
5. Lexical order.
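The five rules can be read as a tuple that is compared position by position, so a later rule only matters when all earlier ones tie. The sketch below assumes that larger counts sort first, which the slides do not state, and uses a hypothetical adjacency dictionary `tree` (parent index to child indices) with a `names` lookup.

```python
def longest_branch(tree, node):
    """Number of linkages on the longest path from node down to a leaf."""
    kids = tree.get(node, [])
    return 0 if not kids else 1 + max(longest_branch(tree, k) for k in kids)


def terminals(tree, node):
    """Number of leaf residues in the subtree rooted at node."""
    kids = tree.get(node, [])
    return 1 if not kids else sum(terminals(tree, k) for k in kids)


def branch_points(tree, node):
    """Number of residues in the subtree that have more than one child."""
    kids = tree.get(node, [])
    here = 1 if len(kids) > 1 else 0
    return here + sum(branch_points(tree, k) for k in kids)


def residue_key(tree, names, node):
    """Sort key following the rule order; negation puts larger values first (assumed)."""
    return (
        -len(tree.get(node, [])),       # 1. number of child residues
        -longest_branch(tree, node),    # 2. length of the longest branch
        -terminals(tree, node),         # 3. number of terminal residues
        -branch_points(tree, node),     # 4. number of branching points
        names[node],                    # 5. lexical order
    )


# Hypothetical input with two possible starting points, residues 1 and 5.
tree = {1: [2, 3], 2: [4], 5: [6]}
names = {1: "man", 2: "glc", 3: "gal", 4: "fuc", 5: "man", 6: "glc"}
ordered = sorted([5, 1], key=lambda n: residue_key(tree, names, n))
```

Here residue 1 is placed first because it has two children and residue 5 has one, so the remaining rules are never consulted.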

Linkage comparison
Rules:
1. Number of bonds between parent and child residues.
2. Atom linkage position at the parent residue.
3. Atom linkage position at the child residue.
4. Linkage type at the parent residue.
5. Comparison of the child residues with the residue comparison algorithm.
These comparisons decide the internal order of the RES and LIN sections.
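A comparator makes the fallback in the last rule explicit: the four linkage properties are checked in order, and only a complete tie hands over to residue comparison. In this sketch the `Link` record, the ascending sort direction and the lexical `compare_residues` placeholder are all assumptions made for illustration.

```python
from functools import cmp_to_key
from typing import NamedTuple


class Link(NamedTuple):
    bonds: int        # rule 1: number of bonds between parent and child
    parent_pos: int   # rule 2: linkage position at the parent residue
    child_pos: int    # rule 3: linkage position at the child residue
    parent_type: str  # rule 4: linkage type at the parent residue
    child: str        # child residue name, used by the placeholder below


def compare_residues(a, b):
    """Placeholder for the residue comparison algorithm: plain lexical order."""
    return (a > b) - (a < b)


def compare_linkages(x, y):
    """Return -1, 0 or 1 by walking rules 1 to 4, then deferring to rule 5."""
    for fx, fy in zip(x[:4], y[:4]):
        if fx != fy:
            return -1 if fx < fy else 1
    return compare_residues(x.child, y.child)


# Hypothetical linkages leaving one parent residue.
links = [
    Link(1, 4, 1, "o", "gal"),
    Link(1, 2, 1, "d", "n-acetyl"),
    Link(1, 4, 1, "o", "fuc"),
]
ordered = sorted(links, key=cmp_to_key(compare_linkages))
child_order = [link.child for link in ordered]
```

The two linkages at position 4 tie on the first four rules, so their order is settled by the child residues alone.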

Underdetermined subtree and alternative subtree comparison
- The encoding of UND and ALT is handled separately from the description of the other topological features.
- The rules of the residue and linkage comparison algorithms are applied to each UND and ALT to determine its internal order.
- The reducing residues of UNDs and ALTs are compared with the residue comparison algorithm.
- If two compared UNDs are identical, their parent residues and linkages (the linkages between the UND and the main graph) are compared.

First application and results
All monosaccharides from CarbBank were translated to the naming defined by GlycoCT. The different names in CarbBank resulted in 474 different basetypes and 29 different substituents, reducing the number of distinct residues by 65%.
Two main reasons for the reduction:
- The separation of monosaccharides into basetype and substituents.
- The unique encoding for monosaccharides.

Conclusion
- GlycoCT offers a superset of the capabilities of all known sequence formats in glycobioinformatics.
- It supports structurally undetermined sequences.
- The consistent naming scheme for monosaccharides can be easily maintained.