The Sisyphus Database Retrieval Software Performance Antipattern Robert F. Dugan Jr. Dept. of Computer Science Stonehill College Easton, MA 02357 USA


The Sisyphus Database Retrieval Software Performance Antipattern
Robert F. Dugan Jr., Dept. of Computer Science, Stonehill College, Easton, MA, USA
Ali Shokoufandeh, Dept. of Math/Computer Science, Drexel University, Philadelphia, PA, USA
Ephraim P. Glinert, Dept. of Computer Science, Rensselaer Polytechnic Inst., Troy, NY, USA
Third International Workshop on Software and Performance, July 24-26, 2002, Rome, Italy

Overview
- Software Performance Antipatterns
- Sisyphus Database Retrieval Antipattern
- Solutions
- Experiments
- Real World Challenges
- Future Work

Software Performance Antipatterns
- Software Design Patterns: an effective solution to a common software design problem, e.g. singleton, proxy, iterator, observer/listener [Gamma et al. 1995]
- Software Design Antipatterns: “A commonly occurring solution to a problem that generates decidedly negative consequences.” [Brown et al. 1998], e.g. “god” class, dead code, class proliferation

Software Performance Antipatterns
- Definition: a commonly occurring solution to a software design problem that generates decidedly negative performance consequences
- Examples from “Software Performance Antipatterns”, Smith and Williams, WOSP 2000:
  - “God” Class
  - Circuitous Treasure Hunt
  - Excessive Dynamic Allocation
  - One Lane Bridge

Sisyphus Database Retrieval Antipattern
1) Issue request to display list subset
2) Issue database query to retrieve entire list
3) Return query results
4) Determine number of items to display
5) Iterate through result set, discarding all items until first item to display is reached
6) Continue through result set, rendering items for display until last item to display is reached
7) Discard remaining result set
8) Display subset
Examples: address book, search results
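The steps above can be sketched in a few lines. This is a minimal illustration only, assuming an in-memory SQLite database; the `contacts` schema, the helper names `make_contacts` and `sisyphus_page`, and the generated names are hypothetical and do not come from the slides.

```python
import sqlite3

def make_contacts(n=200, userid=45):
    """Build a throwaway in-memory table (hypothetical schema and data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE contacts (userid INTEGER, lname TEXT, fname TEXT)")
    db.executemany(
        "INSERT INTO contacts VALUES (?, ?, ?)",
        [(userid, f"Last{i:04d}", f"First{i:04d}") for i in range(n)],
    )
    return db

def sisyphus_page(db, userid, start, size):
    """Query the entire list, then keep only rows start .. start+size-1."""
    # Step 2: the query asks for the whole list, whatever the subset is.
    cur = db.execute(
        "SELECT lname, fname FROM contacts WHERE userid = ? ORDER BY lname",
        (userid,),
    )
    page, discarded = [], 0
    for pos, row in enumerate(cur):
        if pos < start:
            discarded += 1      # step 5: discard rows before the subset
        elif pos < start + size:
            page.append(row)    # step 6: keep rows that will be displayed
        else:
            break               # step 7: abandon the rest of the result set
    return page, discarded

db = make_contacts()
page, discarded = sisyphus_page(db, 45, start=100, size=10)
print(len(page), "rows kept,", discarded, "rows discarded")
```

Every later request for another subset repeats the full query and the discard loop, which is the repeated work the antipattern is named for.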

Sisyphus Database Retrieval Antipattern
- Key to this antipattern: the processing needed to retrieve the entire list, from which a subset is extracted, must be repeated.
- Recalls the Greek myth of Sisyphus, damned for all eternity to push a stone up a hill only to watch it roll back down again.
- Image: Sisyphus by Franz von Stuck

Sisyphus Database Retrieval Antipattern
Three-tier system selected. SPE (software performance engineering) techniques used to model and analyze the antipattern.

Resource: Disk
- Database Server: number of disk I/O operations rises linearly with the size of the total list. I/O reduction is possible with database caching, but memory contention grows as the system scales to more users.

Resource: CPU
- Browser: linear dependence on list subset
- Web Server: linear dependence on start position of subset within result set; linear dependence on list subset
- Database Server: log-linear dependence on the size of the total list; linear dependence on start position of subset within result set; linear dependence on list subset

Resource: Network
- Browser-Web Server: linear dependence on list subset
- Web-Database Server: linear dependence on start position of subset within result set
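The dependencies listed above can be written down as a toy cost function to make the scaling visible. This is only a sketch under loud assumptions: the function name `antipattern_costs`, the dictionary keys, and the unit costs (one operation per row, base-2 logarithm for the sort) are invented for illustration and are not the SPE model or measurements from the talk.

```python
import math

def antipattern_costs(total, start, size):
    """Rough per-request operation counts for each tier (illustrative units)."""
    return {
        # Disk, database server: linear in the size of the total list.
        "db_disk_reads": total,
        # CPU, database server: log-linear in the size of the total list.
        "db_cpu_sort": total * math.log2(max(total, 2)),
        # CPU, web server: rows skipped before the subset starts.
        "web_rows_discarded": start,
        # Network, web-database: rows pulled until the subset is complete.
        "web_db_rows_transferred": start + size,
        # CPU, browser and network, browser-web: only the subset itself.
        "browser_rows_rendered": size,
    }

near_front = antipattern_costs(total=10_000, start=0, size=50)
near_back = antipattern_costs(total=10_000, start=9_000, size=50)
print(near_front["web_rows_discarded"], near_back["web_rows_discarded"])
```

Under these assumptions the browser cost is identical for both requests, while the web-tier and web-database costs grow with the start position and the database cost is fixed by the total list size.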

Solutions: Index and Rownum
Multi-attribute index and rownum:
    select lname, fname, phone, address
    from contacts
    where userid=45 and rownum <= 50
Advantages:
- processing beyond subset eliminated
- sorting result set eliminated
Disadvantages:
- linear dependence on subset start position
- multi-attribute index prevents dynamic sorting
- no total list size
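A runnable approximation of this solution, with one substitution stated plainly: SQLite has no `rownum`, so `LIMIT` stands in for the `rownum <= N` cap on the slide. The index name `contacts_user_lname`, the function `index_rownum_page`, and the sample data are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE contacts "
    "(userid INTEGER, lname TEXT, fname TEXT, phone TEXT, address TEXT)"
)
# Multi-attribute index: rows for one user can be read already ordered by lname.
db.execute("CREATE INDEX contacts_user_lname ON contacts (userid, lname)")
db.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?, ?)",
    [(45, f"Last{i:04d}", "First", "555-0100", "1 Main St") for i in range(200)],
)

def index_rownum_page(db, userid, start, size):
    """Fetch only the first start+size rows, then drop the first `start`."""
    rows = db.execute(
        "SELECT lname, fname, phone, address FROM contacts "
        "WHERE userid = ? ORDER BY lname LIMIT ?",  # LIMIT replaces rownum <= N
        (userid, start + size),
    ).fetchall()
    # Rows before the subset are still fetched and discarded, so the cost
    # stays linear in the subset start position.
    return rows[start:]

page = index_rownum_page(db, 45, start=50, size=10)
print(page[0][0], page[-1][0])
```

Nothing past the end of the subset is processed, but the function cannot report the total list size, matching the disadvantages listed on the slide.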

Solutions: Upper/Lower Bound
Multi-attribute index, lower bound attribute value, rownum:
    select lname, fname, phone, address
    from contacts
    where userid=45 and rownum <= SUBSETSIZE and lname > ENDSUBSETLASTNAME
Advantages:
- linear dependence on list subset size
Disadvantages:
- lower bound attribute must be unique
- multi-attribute index prevents dynamic sorting
- no total list size
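The bound-based query can be tried out as follows. As a stand-in, `LIMIT` replaces `rownum <= SUBSETSIZE` because the sketch uses SQLite; the function name `next_page`, the empty-string starting bound, and the generated names are assumptions made for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE contacts (userid INTEGER, lname TEXT, fname TEXT)")
db.execute("CREATE INDEX contacts_user_lname ON contacts (userid, lname)")
db.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [(45, f"Last{i:04d}", "First") for i in range(200)],
)

def next_page(db, userid, last_lname, size):
    """Return the `size` rows whose lname sorts after `last_lname`."""
    return db.execute(
        "SELECT lname, fname FROM contacts "
        "WHERE userid = ? AND lname > ? "   # lower bound: last name already shown
        "ORDER BY lname LIMIT ?",           # LIMIT replaces rownum <= SUBSETSIZE
        (userid, last_lname, size),
    ).fetchall()

# An empty string sorts before every non-empty name, so it starts the list.
first = next_page(db, 45, "", 10)
second = next_page(db, 45, first[-1][0], 10)
print(second[0][0], second[-1][0])
```

Because the bound comparison is strict, two contacts sharing the boundary last name would be skipped, which is why the slide requires the lower bound attribute to be unique.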

Solutions: Sequence Numbers
- Each list element assigned unique sequence number
- Combination of user and sequence number is unique
    select lname, fname, phone, address
    from contacts
    where userid=45 and lnameSeq >= subListStart and lnameSeq <= subListEnd
Advantages:
- linear dependence on list subset size
- no restriction on duplicate list elements
- trivial to compute list size
- multiple sorting criteria possible
Disadvantages:
- cost of maintaining sequence number
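A small sketch of the sequence-number idea, again on SQLite. The column name `lnameSeq` comes from the slide; the helpers `renumber`, `page_by_seq`, and `list_size`, and the choice to renumber the whole list in one pass, are assumptions for illustration and not necessarily how the authors maintained the numbers.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE contacts (userid INTEGER, lname TEXT, lnameSeq INTEGER)")
db.execute("CREATE INDEX contacts_user_seq ON contacts (userid, lnameSeq)")
db.executemany(
    "INSERT INTO contacts (userid, lname) VALUES (?, ?)",
    [(45, name) for name in ["Smith", "Adams", "Smith", "Baker", "Chen"]],
)

def renumber(db, userid):
    """Maintenance cost: assign 1..n in lname order after the list changes."""
    ids = db.execute(
        "SELECT rowid FROM contacts WHERE userid = ? ORDER BY lname, rowid",
        (userid,),
    ).fetchall()
    db.executemany(
        "UPDATE contacts SET lnameSeq = ? WHERE rowid = ?",
        [(seq, rowid) for seq, (rowid,) in enumerate(ids, start=1)],
    )

def page_by_seq(db, userid, sub_list_start, sub_list_end):
    """Range query on the sequence number; no rows are fetched and discarded."""
    return db.execute(
        "SELECT lname, lnameSeq FROM contacts "
        "WHERE userid = ? AND lnameSeq >= ? AND lnameSeq <= ? ORDER BY lnameSeq",
        (userid, sub_list_start, sub_list_end),
    ).fetchall()

def list_size(db, userid):
    """With a dense 1..n numbering the list size is the largest number."""
    return db.execute(
        "SELECT MAX(lnameSeq) FROM contacts WHERE userid = ?", (userid,)
    ).fetchone()[0]

renumber(db, 45)
page = page_by_seq(db, 45, 4, 5)
print(page, list_size(db, 45))
```

The two "Smith" rows receive different sequence numbers, so duplicate list elements cause no trouble here, and a second sequence column could support another sort order in the same way.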

Solutions: Caching
- Amortize cost of full list retrieval across subset views
- List resides outside database after first subset retrieval
Advantages:
- useful when listSize/subSetViews <= subListSize, e.g. list shared across multiple users
- resources eliminated completely after first retrieval
- linear dependence on list subset size
- compute total list size once
Disadvantages:
- potentially significant response time for first retrieval
- cache state maintained between requests, complicating scaling
- cache consistency
- tier memory required for cache
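One way to picture the caching solution is a per-user list held in the middle tier. The class below is a hedged sketch: the name `ListCache`, its methods, and the explicit `invalidate` call are hypothetical, and a real deployment would also need eviction and a consistency policy, which the slide lists as open costs.

```python
import sqlite3

class ListCache:
    """Hold each user's full sorted list in memory after the first retrieval."""

    def __init__(self, db):
        self.db = db
        self.lists = {}      # userid -> full sorted list (tier memory cost)
        self.queries = 0     # number of full-list database retrievals

    def page(self, userid, start, size):
        if userid not in self.lists:
            # First view pays for the whole list; later views are amortized.
            self.queries += 1
            self.lists[userid] = self.db.execute(
                "SELECT lname, fname FROM contacts WHERE userid = ? ORDER BY lname",
                (userid,),
            ).fetchall()
        full = self.lists[userid]
        return full[start:start + size], len(full)

    def invalidate(self, userid):
        """Drop a cached list when the underlying data changes."""
        self.lists.pop(userid, None)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE contacts (userid INTEGER, lname TEXT, fname TEXT)")
db.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [(45, f"Last{i:04d}", "First") for i in range(200)],
)

cache = ListCache(db)
first_page, total = cache.page(45, 0, 10)
later_page, _ = cache.page(45, 150, 10)
print(total, later_page[0][0], cache.queries)
```

Both views are served by a single database retrieval, and the total list size is known as soon as the list has been cached.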

Experiments
Table columns: Subset Start | Antipattern get/discard tuple (ms) | Antipattern get/render tuple (ms) | Seq. Number get/discard tuple (ms) | Seq. Number get/render tuple (ms)

Real World Challenges
- eCal provides a web-based calendar/address book system
- Antipattern uncovered by performance engineering
- Resistance to design change from database and application development teams because of schedules
- Experimental evidence reinforced the antipattern as a problem for lists above 100 elements
- Debate over average list size per user
- List subset handling logic encapsulated in stored procedures, isolating application logic
- Monitor average list sizes in production; when the average exceeds 100, switch to the sequence number solution

Future Work
- Software Performance Antipattern Workshop
  - Great opportunity for veteran performance engineers from industry to contribute
  - Compendium of antipatterns, much like Addison-Wesley’s Design Patterns book
  - Coming soon (WOSP 2003?, SIGMETRICS?)
- Caching Techniques