2/14/01, RightOrder: Telegraph & Java. Telegraph Java Experiences. Sam Madden, UC Berkeley.

Presentation transcript:

Telegraph Overview: 100% Java; in-memory database; query engine for alternative sources (Web, sensors); testbed for adaptive query processing.

Telegraph & WWW: FFF (Federated Facts and Figures). Collects data on the election. Based on the Eddies work of Avnur and Hellerstein (SIGMOD '00): route tuples dynamically based on source loads and selectivities.
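A rough sketch of the routing idea, not the Telegraph implementation: the eddy in the Avnur and Hellerstein paper uses a more sophisticated routing policy, whereas this simplified version just sends each tuple to the module with the lowest observed pass rate first. The class `MiniEddy`, its methods, and the use of plain integer tuples are all hypothetical, and source load is ignored.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical, simplified eddy: reorders filter modules per tuple by observed pass rate. */
final class MiniEddy {
    private static final class Module {
        final String name;
        final IntPredicate filter;
        long seen;
        long passed;

        Module(String name, IntPredicate filter) {
            this.name = name;
            this.filter = filter;
        }

        // Fraction of tuples this module has let through so far (1.0 before any data).
        double passRate() {
            return seen == 0 ? 1.0 : (double) passed / seen;
        }
    }

    private final List<Module> modules = new ArrayList<>();

    void addModule(String name, IntPredicate filter) {
        modules.add(new Module(name, filter));
    }

    // Current routing order: lowest pass rate (most selective) first; ties keep insertion order.
    private List<Module> routingOrder() {
        List<Module> order = new ArrayList<>(modules);
        order.sort(Comparator.comparingDouble(Module::passRate));
        return order;
    }

    /** Routes one tuple through the modules; returns true if every module accepts it. */
    boolean route(int tuple) {
        for (Module m : routingOrder()) {
            m.seen++;
            if (!m.filter.test(tuple)) {
                return false; // dropped early, so later modules never see it
            }
            m.passed++;
        }
        return true;
    }

    /** Name of the module a new tuple would visit first, or null if there are none. */
    String firstChoice() {
        List<Module> order = routingOrder();
        return order.isEmpty() ? null : order.get(0).name;
    }
}
```

The point of the sketch is that the order is recomputed for every tuple from running statistics, so a module that turns out to be highly selective migrates to the front without any up-front optimizer decision.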

fff.cs.berkeley.edu

Architecture Overview: query parser (JLex & CUP); preoptimizer (chooses access paths); eddy (routes tuples to modules).

Modules: doubly-pipelined hash joins; index joins (for probing into web pages); aggregates and group-bys; scans; Telegraph Screen Scraper (views web pages as relations).
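For readers unfamiliar with the term, a doubly-pipelined (symmetric) hash join keeps a hash table per input and probes the opposite table as each tuple arrives, so results stream out without waiting for either input to finish. A minimal sketch follows; the class name `SymmetricHashJoin`, the integer keys, and the string payloads are assumptions for illustration, not Telegraph's module interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a doubly-pipelined (symmetric) hash join on integer keys. */
final class SymmetricHashJoin {
    private final Map<Integer, List<String>> leftTable = new HashMap<>();
    private final Map<Integer, List<String>> rightTable = new HashMap<>();

    /** Stores a left tuple, then probes the right table; returns the new join results. */
    List<String> insertLeft(int key, String value) {
        leftTable.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        List<String> results = new ArrayList<>();
        for (String right : rightTable.getOrDefault(key, Collections.emptyList())) {
            results.add(value + "|" + right);
        }
        return results;
    }

    /** Stores a right tuple, then probes the left table; returns the new join results. */
    List<String> insertRight(int key, String value) {
        rightTable.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        List<String> results = new ArrayList<>();
        for (String left : leftTable.getOrDefault(key, Collections.emptyList())) {
            results.add(left + "|" + value);
        }
        return results;
    }
}
```

Because either side can arrive first, this shape suits slow or bursty web sources: a stalled source delays only its own matches.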

Execution Framework: one thread per query, with an iterator model for queries. Experimented with a thread per module, but Linux threads are expensive. Two memory management models: Java objects, and home-rolled byte arrays.

Tuples as Java Objects: tuple data stored as a Java object, each in a separate byte array; tuples copied on joins and aggregates. Issues: memory management between modules and queries; garbage collector control; allocation overhead. Performance: 30, byte tuples / sec -> 5.9 MB / sec

Tuples as Byte Array: all tuples for a query stored in the same byte array; surrogate Java objects hold (offset, size). [Diagram: surrogate objects, byte array, directory.]

Byte Array (cont.): allows explicit control over memory per query (or module); compaction eliminates garbage collection randomness. Lower throughput: 15,000 t/sec (no surrogate object reuse; synchronization costs).
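The byte-array model can be sketched roughly as follows. This is an illustration under assumptions, not Telegraph's code: the names `TupleArena` and `Surrogate`, the fixed capacity, and the compact-on-full policy are all hypothetical, and the synchronization the slides mention is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical per-query tuple store: one shared byte array plus surrogate objects. */
final class TupleArena {
    /** Surrogate: records where a tuple lives inside the shared array. */
    static final class Surrogate {
        int offset;
        final int size;
        boolean live = true;

        Surrogate(int offset, int size) {
            this.offset = offset;
            this.size = size;
        }
    }

    private final byte[] data;
    private final List<Surrogate> directory = new ArrayList<>();
    private int used;

    TupleArena(int capacity) {
        data = new byte[capacity];
    }

    /** Appends a tuple, compacting first if the array is otherwise too full. */
    Surrogate add(byte[] tuple) {
        if (used + tuple.length > data.length) {
            compact();
        }
        if (used + tuple.length > data.length) {
            throw new IllegalStateException("arena full");
        }
        System.arraycopy(tuple, 0, data, used, tuple.length);
        Surrogate s = new Surrogate(used, tuple.length);
        directory.add(s);
        used += tuple.length;
        return s;
    }

    byte[] read(Surrogate s) {
        if (!s.live) {
            throw new IllegalStateException("tuple was freed");
        }
        return Arrays.copyOfRange(data, s.offset, s.offset + s.size);
    }

    void free(Surrogate s) {
        s.live = false;
    }

    /** Slides live tuples to the front of the array and fixes up their surrogates. */
    void compact() {
        List<Surrogate> survivors = new ArrayList<>();
        int write = 0;
        for (Surrogate s : directory) { // directory is kept in offset order
            if (!s.live) {
                continue;
            }
            System.arraycopy(data, s.offset, data, write, s.size);
            s.offset = write;
            write += s.size;
            survivors.add(s);
        }
        directory.clear();
        directory.addAll(survivors);
        used = write;
    }

    int used() {
        return used;
    }
}
```

Memory use is bounded by the array capacity and reclaimed at a moment the query chooses, which is the explicit control the slides describe; the cost is one surrogate allocation per tuple, matching the "no surrogate object reuse" caveat.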

Other System Pieces: XML-based catalog (Java introspection helps); applet-based front end; JDBC interface; fault tolerance / multiple servers (via simple UNIX tools).

RightOrder Questions: performance vs. C; JNI issues; garbage collection issues; serialization costs; lots of Java objects; JDBC vs ODI.

Performance vs. C: JVM + JIT performance encouraging. IBM JIT == 60% of Intel C compiler, faster than MSC for low-level benchmarks; IBM JIT 2x faster than HotSpot for Telegraph scans. Stability issues.

JIT Performance vs C: [Chart comparing IBM JIT, Optimized Intel, and Optimized MS.] Source:

Performance Gotchas. Synchronization: ~2x function call overhead in HotSpot; used in libraries (Vector, StringBuffer); String allocation the single most intensive operation in Telegraph; Mercator: 20% initial CPU cost. Garbage collection: Java dumb about reuse; Mercator: 15% cost; OceanStore: 30 ms avg latency, 1 s peak.
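One way to sidestep both gotchas in present-day Java is to reuse a single unsynchronized buffer rather than building fresh strings through a synchronized `StringBuffer`. Note that `StringBuilder` was added in Java 5, after this talk, so the sketch reflects a later idiom, not what Telegraph did; the class `RowFormatter` and its comma-separated output are hypothetical.

```java
/** Hypothetical row formatter that reuses one unsynchronized buffer across calls. */
final class RowFormatter {
    // StringBuilder has the StringBuffer API without per-call locking.
    // An instance of this class is therefore not safe to share between threads.
    private final StringBuilder buffer = new StringBuilder(64);

    String format(int id, String name) {
        buffer.setLength(0); // reset contents, keep the backing array
        buffer.append(id).append(',').append(name);
        return buffer.toString(); // the only String allocated per row
    }
}
```

With one thread per query, as in the execution framework described earlier, a per-query formatter needs no locking at all.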

More Gotchas. Final methods: declaring methods final allows inlining. Serialization: RMI, JNI use serialization; Philippsen & Haumacher show performance slowness.

Performance Tools: tools to address some issues. JAX, Jopt: make bytecode smaller, faster. Bytecode optimizer. Good profiler, memory allocation and garbage collection monitor.

JNI Issues: not a part of Telegraph. JNI overhead quite large (JDK 1.1.8, PII 300 MHz). Source: Matt Welsh. A System Supporting High Performance Communication and I/O in Java. Master's Thesis, UC Berkeley, 1999.

More JNI: but this is being worked on. IBM JDK: 100,000 B copy in 5 ms, vs 23 ms for (500 MHz PIII). JNI allows synchronization (pin / unpin), thread management. See GCJ + CNI: access Java objects via C++ classes.

Garbage Collection Performance. Big problem: 1 s or longer to GC lots of objects. Most Java GCs are blocking (not concurrent or multi-threaded). Unexpected latencies: OceanStore (network file server) sees 30 ms avg. latencies for network updates, 1000 ms peak due to GC. In high-concurrency apps, such delays are disastrous.

Garbage Collection (cont.). Limited control: Runtime.gc() only a hint; Runtime.freeMemory() unreliable; no way to disable. No object reuse: lots of unnecessary memory allocations.
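The calls in question look like this in use. The helper class `HeapProbe` is hypothetical; the `Runtime` methods are standard. Because `gc()` is only a request and the JVM may grow or shrink the heap at any time, the two readings are approximate and nothing guarantees the second is smaller than the first.

```java
/** Hypothetical helper that samples heap usage around a garbage collection request. */
final class HeapProbe {
    /** Approximate bytes in use: current heap size minus its free portion. */
    static long usedBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /** Returns {usedBefore, usedAfter} around a gc() request. */
    static long[] sampleAroundGc() {
        long before = usedBytes();
        Runtime.getRuntime().gc(); // a hint only: the JVM decides whether and when to collect
        long after = usedBytes();
        return new long[] {before, after};
    }
}
```

There is no portable counterpart for pausing or disabling collection, which is the gap the slide points at.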

Serialization: not in Telegraph. Philippsen and Haumacher, “More Efficient Object Serialization,” International Workshop on Java for Parallel and Distributed Computing, San Juan, April: serialization costs for RMI are 50% of total RMI time; discard longevity for 7x speed up. Sun serialization provides versioning: complete class description stored with each serialized object; most standard classes forward compatible (JDK docs note special cases). See

Lots of Objects. GC issues serious. Memory management: GC makes programmers allocate willy-nilly; hard to partition memory space; Telegraph byte-array ugliness due to inability to limit usage of concurrent modules, queries.

Storage Overheads. Java Object class is big: Integer requires 23 bytes in JDK 1.3; int requires 4.3 bytes. No way to circumvent object fields. Use primitives or hand-written serialization whenever possible.
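As an illustration of the last recommendation, here is a sketch of hand-written serialization for an `int[]` next to default Java serialization of the boxed equivalent. The class `IntCodec` and its length-prefixed format are assumptions made for this example. The hand-written form costs exactly 4 bytes for the length plus 4 bytes per value; the default form also writes a stream header and class descriptions, so it is larger, by an amount that depends on the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

/** Hypothetical codec: length-prefixed primitive ints versus default object serialization. */
final class IntCodec {
    /** Hand-written format: a 4-byte count followed by 4 bytes per value. */
    static byte[] encode(int[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (int v : values) {
                out.writeInt(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Inverse of encode. */
    static int[] decode(byte[] encoded) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
            int[] values = new int[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readInt();
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Default Java serialization of boxed Integers, for size comparison only. */
    static byte[] javaSerialize(Integer[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(values);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

The trade-off is the one the serialization slide names: the compact format carries no class description, so it gives up the versioning that Sun serialization provides.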

JDBC vs ODI: no experience with Oracle. JDBC overheads are high, but we don't have specific performance numbers.

Bottom Line. Java great for many reasons: GC, standard libraries, type safety, introspection, etc.; significant reductions in development and debugging time. Java performance isn't bad, especially with some tuning. Memory management an issue. Lack of control over JVMs bad: when to garbage collect, how to serialize, etc.