Exploiting Application Semantics: Harvest, Yield CS 444A Fall 99 Software for Critical Systems Armando Fox & David Dill © 1999 Armando Fox.

Harvest and Yield
- Yield: probability of completing a query
- Harvest: (application-specific) fidelity of the answer
  - Fraction of data represented?
  - Precision?
  - Semantic proximity?
- Harvest/yield questions:
  - When can we trade harvest for yield to improve availability?
  - How do we measure the harvest “threshold” below which a response is not useful?
- Application decomposition to improve “degradation tolerance” (and therefore availability)
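The two definitions can be made concrete with a small measurement sketch. It assumes yield is estimated as completed queries over offered queries, and that each answered query carries an application-defined harvest value in [0, 1]. The class and function names are illustrative, not part of the original slides.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    """Counters for one measurement window (hypothetical helper)."""
    offered: int = 0          # queries received
    completed: int = 0        # queries that got any answer
    harvest_sum: float = 0.0  # sum of per-query harvest values

    def record(self, completed: bool, harvest: float = 0.0) -> None:
        """Log one query; harvest is ignored for queries that were dropped."""
        self.offered += 1
        if completed:
            self.completed += 1
            self.harvest_sum += harvest

    def query_yield(self) -> float:
        """Empirical estimate of the probability of completing a query."""
        return self.completed / self.offered if self.offered else 0.0

    def mean_harvest(self) -> float:
        """Average fidelity over the queries that were answered."""
        return self.harvest_sum / self.completed if self.completed else 0.0


def is_useful(harvest: float, threshold: float) -> bool:
    """Application-specific cut-off below which a response is not worth sending."""
    return harvest >= threshold
```

Keeping the two numbers separate is the point: a service can raise yield by answering more queries with partial data, and the threshold marks where that trade stops being worthwhile.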

Example 1: Search Engine
- Stripe database randomly across all nodes, replicate high-priority data
  - Random striping: worst case == average case
  - Replication: high-priority data unlikely to be lost
  - Harvest: fraction of nodes reporting
- Questions…
  - Why not just wait for all nodes to report back?
  - Should harvest be reported to end user?
  - What is the “useful” harvest threshold?
  - Is nondeterminism a problem?
- Trade harvest for yield/throughput
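One way to picture "harvest == fraction of nodes reporting" is a scatter-gather query with a deadline: the front end answers with whatever the shards returned in time and reports the fraction that responded. The sketch below is an assumption-laden illustration, not the search engine's actual design. `make_shard` simulates a node, and `scatter_gather` and `deadline_s` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def make_shard(docs, delay_s=0.0):
    """Build a simulated shard: a callable that searches its own stripe of documents."""
    def search(query):
        time.sleep(delay_s)  # stands in for network and disk latency
        return [d for d in docs if query in d]
    return search


def scatter_gather(shards, query, deadline_s):
    """Query every shard, but only wait until the deadline.

    Returns (hits, harvest), where harvest is the fraction of shards that
    reported successfully before the deadline.
    """
    if not shards:
        return [], 0.0
    pool = ThreadPoolExecutor(max_workers=len(shards))
    futures = [pool.submit(shard, query) for shard in shards]
    done, _late = wait(futures, timeout=deadline_s)
    hits, reporting = [], 0
    for future in done:
        if future.exception() is None:  # a failed shard counts as not reporting
            hits.extend(future.result())
            reporting += 1
    pool.shutdown(wait=False)  # do not block the response on slow shards
    return hits, reporting / len(shards)
```

Waiting for every node would tie the response time, and therefore yield, to the slowest or deadest node. With random striping, losing one node removes an arbitrary slice of the data rather than a specific topic.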

Example 2: TranSend
- Lossy on-the-fly Web image compression, extensively parameterized (per user, device, etc.)
- Harvest: “semantic fidelity” of what you get
  - Worst case: the original image
  - Intermediate case: “close” image that has been previously computed and cached
  - Metrics for semantic fidelity?
- Trade harvest for yield/throughput
[Figure: original, desired, and delivered images]
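The TranSend cases can be read as a fallback chain: serve the requested variant if possible, otherwise a previously cached "close" variant, otherwise the original. The following is a minimal sketch under simplifying assumptions. Compression parameters are reduced to a single integer `quality`, closeness is the absolute difference in quality, and `serve_image` and its arguments are hypothetical names rather than TranSend's API. It returns a label instead of a numeric harvest, since the slide leaves the fidelity metric open.

```python
from typing import Callable, Dict, Optional, Tuple

Cache = Dict[Tuple[str, int], bytes]  # (url, quality) -> compressed image


def serve_image(
    url: str,
    quality: int,
    original: bytes,
    cache: Cache,
    distill: Optional[Callable[[bytes, int], bytes]],
) -> Tuple[bytes, str]:
    """Return (image, fidelity label), degrading instead of failing."""
    key = (url, quality)
    if key in cache:
        return cache[key], "requested"

    # Try to compute the exact variant; distill is None when no worker is available.
    if distill is not None:
        try:
            image = distill(original, quality)
        except Exception:
            image = None  # worker failed or overloaded: fall through to degraded modes
        if image is not None:
            cache[key] = image
            return image, "requested"

    # Intermediate case: a "close" variant that was computed earlier.
    cached_qualities = [q for (u, q) in cache if u == url]
    if cached_qualities:
        nearest = min(cached_qualities, key=lambda q: abs(q - quality))
        return cache[(url, nearest)], "approximate"

    # Worst case: pass the original image through untouched.
    return original, "original"
```

Every branch produces an answer, so yield stays high while harvest drops from "requested" to "approximate" to "original" as compute capacity disappears.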

Synergies With Cluster Computing
- Common thread: trade harvest for yield or throughput
  - Search engine: harvest == fraction of nodes reporting
  - TranSend: harvest == semantic similarity between delivered & requested results
  - Both cases: a direct function of availability of computing power
- Harvest vs. yield neat tricks
  - Migrating a cluster with zero downtime
  - Managing user classes

Application Decomposition
- Decomposing the canonical e-commerce site
[Figure: site architecture; labels: Internet, FE, W, T, A, $, Billing, Profiles, Catalog]

Component State Management
- User profiles
  - Read often, write infrequently
  - Must persist across sessions
- Online merchandise catalog
  - Read-only database
  - Presentation should depend on user preferences
- Shopping cart
  - Read and write frequently
  - Need not persist across sessions
- Billing (transactional DB)

Degradable State: Shopping Cart
- Shopping cart is more “state-intensive” than billing
  - Shopping cart subsystem manipulates more incremental state per user than billing
  - But, shopping cart is an “optimization”
  - We can reflect this into provisioning/growth
- One idea: Keep cart contents in a cache
  - Not a database!
  - Non-ACID
  - Soft state by definition

Per-Component Failure Modes
- Shopping cart can be periodically checkpointed to user profile
- Transformation & aggregation modules can present catalog based on user input
[Figure: site architecture; labels: Internet, FE, W, T, A, $, Billing, Profiles, Catalog]

Degradable State, cont’d.
- What’s the probability of losing the shopping cart?
  - HW or SW failure in cache (e.g. transient node failure, write corruption)
  - Eviction: rate depends on cache size and working set size; can grow cache incrementally to fix problem
- What happens to users when cache is thrashing?
  - Turn off shopping cart for everyone?
  - “Use at own risk” shopping cart?
  - Rent some machines at high cost, until new ones arrive?
  - Probably cheaper to deploy a Web cache node than a new DB node!
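The cart-in-a-cache idea, together with periodic checkpointing to the user profile, can be sketched as a bounded LRU cache over a persistent store. This is an illustrative model only. `CartCache` is a hypothetical name, a plain dict stands in for the profile database, and LRU is an assumed eviction policy that the slides do not specify.

```python
from collections import OrderedDict


class CartCache:
    """Soft-state shopping carts: fast, bounded, and allowed to lose data."""

    def __init__(self, capacity, profiles):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.profiles = profiles    # stand-in for the persistent profile store
        self.carts = OrderedDict()  # user -> list of items, in LRU order
        self.evictions = 0          # watch this to detect thrashing

    def _load(self, user):
        if user in self.carts:
            self.carts.move_to_end(user)  # mark as most recently used
        else:
            # Miss: start from the last checkpoint, or an empty cart.
            self.carts[user] = list(self.profiles.get(user, []))
            if len(self.carts) > self.capacity:
                self.carts.popitem(last=False)  # evict least recently used cart
                self.evictions += 1
        return self.carts[user]

    def add_item(self, user, item):
        self._load(user).append(item)

    def get_cart(self, user):
        return list(self._load(user))

    def checkpoint(self):
        """Periodically copy cached carts into the persistent profiles."""
        for user, cart in self.carts.items():
            self.profiles[user] = list(cart)
```

An evicted cart loses only the items added since the last checkpoint, so the checkpoint interval and cache size together bound how much harvest a user can lose. A rising `evictions` count is the signal that the working set has outgrown the cache.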

State vs. Interactivity Characterization
- Idea: to decompose applications, characterize the pieces according to state and interactivity requirements
- Rationale
  - State management is easy, if performance is not a concern
  - High performance is easy, if there is no state to manage
  - Proper decomposition may allow harvest-intolerant pieces to fail independently while the overall service remains available in degraded form
- Goal: Combine harvest-intolerant pieces into a harvest-tolerant service, then quantify harvest degradation

Interactivity Characterization
- R: read interactivity
  - Example: static content server
- W: write and read interactivity
  - Example: DB with high update load
- N: non-interactive
  - Example: user requests are deferred or processed offline, with confirmation sent later
- More like a spectrum than three distinct points
  - Treads the line of “performance vs. correctness”

State Characterization
- P: persistent
  - Failure to store persistently == incorrect behavior
- S: soft/regeneratable
  - Stored state is an optimization, loss affects “only” performance
  - Regeneration is possible but expensive, so expected to be rare (if both the state and the ability to generate it are lost, incorrect application behavior results)
- D: degradable
  - Trading harvest for some other property, e.g. yield or interactive performance

The Resulting Space

Example: Aggregated hit counter
[Figure labels: WS, Hits, (W,P), (N,P), (W,S)]
- Naïve implementation: shared hit counter is (W,P)
- Reimplementation:
  - Each node keeps in-memory counter: (W,S)
  - Main counter is periodically updated: (N,P)
  - Main counter must be fast to read: (R,P)
  - If staleness/imprecision is allowed, main counter can be (R,D) using caching
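The reimplementation can be sketched in a few lines. Class and method names here are hypothetical, and an ordinary attribute stands in for the persistent store. The sketch shows where each (interactivity, state) label from the slide applies.

```python
class NodeCounter:
    """Per-node in-memory counter: (W,S). Fast to write, lost if the node fails."""

    def __init__(self):
        self.pending = 0

    def hit(self):
        self.pending += 1

    def drain(self):
        """Hand over the hits accumulated since the last flush and reset."""
        count, self.pending = self.pending, 0
        return count


class MainCounter:
    """Aggregate counter: updated offline (N,P), read through a cache (R,D)."""

    def __init__(self):
        self.total = 0         # stand-in for the persistently stored total
        self.cached_total = 0  # possibly stale copy served to readers

    def flush_from(self, nodes):
        """Periodic, non-interactive update from every node's local counter."""
        for node in nodes:
            self.total += node.drain()

    def refresh_cache(self):
        self.cached_total = self.total

    def read(self):
        """Fast read that may lag behind the true total."""
        return self.cached_total
```

The write path never touches the persistent store, so it stays fast. The cost is bounded imprecision: hits not yet flushed are lost if a node fails, and readers see the total as of the last cache refresh.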

Open Problems
- Quantifying state degradation
- Quantifying probabilistic availability
- Abstractions for manipulating reduced-fidelity results
- Might this apply to reactive systems?