Semi-Structured Data and Agile Application Development


1 Semi-Structured Data and Agile Application Development
CS411 – Project 2 Spring 2017 Richard Weeks

2 The Problem - Software Project Outcomes

3 Choosing Agile Software Development
Deliver a partially implemented product as soon as possible
Iterate!
Deliver frequently (every 2-8 weeks)
Get and incorporate feedback
Don't get hung up on long-term planning

4 Databases Impact Methodology Selection
A classic RDBMS cannot support multiple data schemas
The RDBMS schema must be designed before application development
Application requirements must therefore be gathered before development begins (the Waterfall methodology)

5 Semi-Structured Data Is Different
Semi-structured data defines its own schema [1]
Many different data schemas can be stored simultaneously [1]
Commonly represented as XML [1]; many agile frameworks have good support for XML (de)serialization
Information is accessed by path [1]
Physical storage strategy depends on implementation
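As a rough illustration of path-based access over documents that do not share a schema, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The two sample documents and the values_at helper are hypothetical and only serve the example.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents with different shapes, kept in one collection.
docs = [
    "<book><title>Database Systems</title><author>Widom</author></book>",
    "<article><title>Lore</title><journal>SIGMOD Record</journal></article>",
]
collection = [ET.fromstring(d) for d in docs]


def values_at(path):
    """Return the text of every element reachable by `path` from any root."""
    return [el.text for root in collection for el in root.findall(path)]


titles = values_at("title")      # present in both documents
journals = values_at("journal")  # present only in the article
print(titles)
print(journals)
```

No schema is declared anywhere: each document carries its own structure, and a path that a document lacks simply matches nothing.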

6 Shredded Storage
Lore introduced a strategy of storing data at leaf vertices and labeling the edges; paths start from the root node [5]
Monet, XRel, XLight, and others create a relational model of the nodes and relationships (edges) in the XML Document Object Model (DOM) [7, 8, 9]
This leverages relational querying and query optimization techniques
A recursive query is required to rebuild the full XML document that was originally added to the database
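The shredding idea can be sketched with a single node table in SQLite. This is a generic parent-pointer layout chosen for illustration, not the actual schema of Monet, XRel, or XLight; the table name, column names, and sample document are assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

SOURCE = (
    "<book><author>Ullman</author><author>Widom</author>"
    "<content><chapter>Intro</chapter></content></book>"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
)


def shred(elem, parent=None):
    """Insert one element as a row, then recurse into its children."""
    cur = conn.execute(
        "INSERT INTO node (parent, tag, text) VALUES (?, ?, ?)",
        (parent, elem.tag, (elem.text or "").strip()),
    )
    node_id = cur.lastrowid
    for child in elem:
        shred(child, node_id)
    return node_id


root_id = shred(ET.fromstring(SOURCE))

# An ordinary relational query over the shredded form.
authors = [r[0] for r in conn.execute(
    "SELECT text FROM node WHERE tag = 'author' ORDER BY id")]

# Recursive query that walks down from the root to collect the whole document.
rows = conn.execute(
    """
    WITH RECURSIVE subtree(id, parent, tag, text) AS (
        SELECT id, parent, tag, text FROM node WHERE id = ?
        UNION ALL
        SELECT n.id, n.parent, n.tag, n.text
        FROM node n JOIN subtree s ON n.parent = s.id
    )
    SELECT id, parent, tag, text FROM subtree ORDER BY id
    """,
    (root_id,),
).fetchall()


def rebuild(rows):
    """Reassemble an element tree from (id, parent, tag, text) rows."""
    elems, root = {}, None
    for node_id, parent, tag, text in rows:
        el = ET.Element(tag)
        el.text = text or None
        elems[node_id] = el
        if parent in elems:
            elems[parent].append(el)  # ids follow document order
        else:
            root = el
    return root


xml_text = ET.tostring(rebuild(rows), encoding="unicode")
print(authors)
print(xml_text)
```

Selecting the authors is a plain indexed lookup, while getting the original document back takes the recursive query plus reassembly, which is the trade-off the slide describes.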

7 XML Native Storage
IBM DB2's Native XML and Natix split the XML document across storage blocks at boundaries that are natural to XML: entire subtrees [3, 6]
Only the blocks holding nodes relevant to a query must be loaded
Storing nodes from the same document together improves I/O performance
[Figure: example XML document tree with nodes book, author, author, publisher, content, chapter, chapter]

8 LOB (Flat) Storage
Store XML as text or as compressed binary XML [3]
All query optimization is via indexes, which are updated when documents are inserted, updated, or deleted
Very good for whole-document storage and retrieval, when the whole document is the only unit the application works with
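A minimal in-memory sketch of the flat approach follows, assuming zlib compression for the stored text and a simple (path, value) index. The LobStore class and its methods are hypothetical names, not the API of any product mentioned here.

```python
import zlib
import xml.etree.ElementTree as ET
from collections import defaultdict


class LobStore:
    """Keeps each document as one compressed blob plus a (path, value) index."""

    def __init__(self):
        self.blobs = {}                # doc_id -> compressed XML text
        self.index = defaultdict(set)  # (path, value) -> set of doc_ids

    def _entries(self, xml_text):
        """Yield a (path, value) pair for every element that has text."""
        def walk(elem, prefix):
            path = f"{prefix}/{elem.tag}"
            if elem.text and elem.text.strip():
                yield (path, elem.text.strip())
            for child in elem:
                yield from walk(child, path)
        yield from walk(ET.fromstring(xml_text), "")

    def put(self, doc_id, xml_text):
        """Insert or update: drop stale index entries, then re-index."""
        self.delete(doc_id)
        self.blobs[doc_id] = zlib.compress(xml_text.encode("utf-8"))
        for key in self._entries(xml_text):
            self.index[key].add(doc_id)

    def get(self, doc_id):
        """Whole-document retrieval is a single decompress."""
        return zlib.decompress(self.blobs[doc_id]).decode("utf-8")

    def delete(self, doc_id):
        if doc_id not in self.blobs:
            return
        for key in self._entries(self.get(doc_id)):
            self.index[key].discard(doc_id)
        del self.blobs[doc_id]

    def find(self, path, value):
        return sorted(self.index.get((path, value), set()))


store = LobStore()
store.put(1, "<book><author>Widom</author></book>")
store.put(2, "<book><author>Ullman</author></book>")
store.put(2, "<book><author>Widom</author></book>")  # update document 2
hits = store.find("/book/author", "Widom")
stale = store.find("/book/author", "Ullman")
print(hits, stale)
```

Every write pays the cost of keeping the index current, but a query never has to open a blob until it knows which documents match.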

9 Query Language
Lore uses a custom query language called Lorel, based loosely on OQL [5]
Lorel queries are optimized with indexes that go from the values (bottom) "up," traversing toward the root
XPath and XQuery, languages designed for selecting information from XML documents, have become popular with databases
XPath involves 13 different axes, 5 of which are major axes and can be optimized with an R-Tree on the pre- and post-order position of each node [2]
XPath queries can also be optimized by indexes for flat storage, including partial indexing on prefixes
UnQL was described but never truly implemented
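The pre- and post-order idea can be sketched in a few lines of Python. Each node gets a (pre, post) pair from one depth-first walk, and an axis then becomes a rectangular region of the pre/post plane. This sketch filters a list linearly; a real system would put an index such as an R-Tree over the same two columns. The sample document and function names are assumptions.

```python
import xml.etree.ElementTree as ET


def number_nodes(root):
    """Assign (pre, post) ranks to every element in one depth-first walk."""
    table = []
    counters = {"pre": 0, "post": 0}

    def visit(elem):
        pre = counters["pre"]
        counters["pre"] += 1
        for child in elem:
            visit(child)
        table.append((pre, counters["post"], elem.tag))
        counters["post"] += 1

    visit(root)
    return sorted(table)  # rows of (pre, post, tag) in document order


def descendants(table, ctx):
    return [t for t in table if t[0] > ctx[0] and t[1] < ctx[1]]


def ancestors(table, ctx):
    return [t for t in table if t[0] < ctx[0] and t[1] > ctx[1]]


def preceding(table, ctx):
    return [t for t in table if t[0] < ctx[0] and t[1] < ctx[1]]


def following(table, ctx):
    return [t for t in table if t[0] > ctx[0] and t[1] > ctx[1]]


doc = ET.fromstring(
    "<book><author/><author/><publisher/>"
    "<content><chapter/><chapter/></content></book>"
)
table = number_nodes(doc)
content = next(t for t in table if t[2] == "content")

desc_tags = [t[2] for t in descendants(table, content)]
anc_tags = [t[2] for t in ancestors(table, content)]
prec_tags = [t[2] for t in preceding(table, content)]
foll_tags = [t[2] for t in following(table, content)]
print(desc_tags, anc_tags, prec_tags, foll_tags)
```

Because each axis is a pair of range comparisons on two numeric columns, the same predicates translate directly into SQL WHERE clauses over a node table.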

10 References
[1] Garcia-Molina, Hector, Jeffrey D. Ullman, and Jennifer Widom. Database Systems. Upper Saddle River, NJ: Pearson Education, , 483, 484, Print.
[2] Grust, Torsten, Maurice van Keulen, and Jens Teubner. "Accelerating XPath Evaluation in Any RDBMS". ACM Transactions on Database Systems 29.1 (2004): Web. 15 Apr
[3] Kanne, Carl-Christian, and Guido Moerkotte. "Efficient Storage of XML Data". Technical reports 99 (2008).
[4] Lynch, Jennifer. "Standish Group 2015 Chaos Report - Q&A with Jennifer Lynch". InfoQ. N.p., Web. 8 Apr
[5] McHugh, Jason et al. "Lore". ACM SIGMOD Record (1997): Web. 9 Apr
[6] Nicola, Matthias, and Bert van der Linden. "Native XML Support in DB2 Universal Database". Proceedings of the 31st International Conference on Very Large Data Bases (VLDB '05). VLDB Endowment
[7] Schmidt, Albrecht et al. "Efficient Relational Storage and Retrieval of XML Documents". The World Wide Web and Databases: Third International Workshop WebDB Dallas, TX, USA, May 18–19, 2000 Selected Papers. G. Goos et al. 1st ed. Berlin: Springer, Web. 11 Apr
[8] Yoshikawa, M. et al. "XRel: A Path-Based Approach to Storage and Retrieval of XML Documents Using Relational Databases". ACM Transactions on Internet Technology 1, 1 (2001),
[9] Zafari, H. et al. "XLight, An Efficient Relational Schema to Store and Query XML Data". International Conference on Data Storage and Data Engineering (2010).

