Dr. N. Mamoulis, Advanced Database Technologies

Slide 1 - Topics 9-10: Database optimization for modern machines

Computer architecture has been changing over the years, so we have to change our programs, too! Most database operators, indexes, buffering techniques, and storage schemes were developed and optimized over 20 years ago, so we need to re-tune them for the modern reality.

Slide 2 - Moore's Law (Dictionary Term)

Moore's Law /morz law/ prov. The observation that the logic density of silicon integrated circuits has closely followed the curve (bits per square inch) = 2^(t - 1962), where t is time in years; that is, the amount of information storable on a given amount of silicon has roughly doubled every year since the technology was invented. This relation, first uttered in 1964 by semiconductor engineer Gordon Moore (who co-founded Intel four years later), held until the late 1970s, at which point the doubling period slowed to 18 months. The doubling period remained at that value through the time of writing (late 1999). Moore's Law is apparently self-fulfilling. The implication is that somebody, somewhere is going to be able to build a better chip than you if you rest on your laurels, so you'd better start pushing hard on the problem.

Slide 3 - Features of Modern Machines and Future Trends

CPU speed, memory bandwidth, and disk bandwidth follow Moore's Law. On the other hand, memory latency improves by only 1% per year. Memories are becoming larger, but slower relative to memory bandwidth and CPU speed. Most of the time the processor of your PC is idle, waiting for data to be fetched from memory. The role of main-memory caches is becoming very important for overall performance.

Slide 4 - Features of Modern Machines and Future Trends (cont'd)

Disk bandwidth also follows Moore's Law. On the other hand, disk seek time improves by only 10% per year. Thus a random disk access today may cost as much as many sequential accesses.

Slide 5 - Features of Modern Machines and Future Trends (cont'd)

Another characteristic of modern machines is that their processors (AMD Athlon, Intel P4) have parallel execution units, able to process multiple (e.g., 5-9) independent instructions at the same time.
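This pays off only when instructions are actually independent. As a hedged illustration (not from the slides), the following C sketch splits a summation into four independent accumulators, giving the CPU four separate dependency chains to overlap, whereas the single-accumulator version chains every addition onto the previous one:

  #include <stdio.h>

  #define N (1 << 20)

  /* One accumulator: each addition depends on the previous one,
     so the parallel units cannot overlap them. */
  long sum_dependent(const int *a) {
      long s = 0;
      for (long i = 0; i < N; i++)
          s += a[i];
      return s;
  }

  /* Four independent accumulators: four separate dependency chains
     that a superscalar CPU can execute in parallel. */
  long sum_independent(const int *a) {
      long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
      for (long i = 0; i < N; i += 4) {
          s0 += a[i];     s1 += a[i + 1];
          s2 += a[i + 2]; s3 += a[i + 3];
      }
      return s0 + s1 + s2 + s3;
  }

  int main(void) {
      static int a[N];
      for (long i = 0; i < N; i++) a[i] = (int)(i & 0xFF);
      printf("%ld %ld\n", sum_dependent(a), sum_independent(a));
      return 0;
  }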

Slide 6 - The new bottleneck: Memory Access

Memory is now hierarchical, with two levels of caching between the CPU and main memory: the CPU and the L1 cache sit on the CPU die, followed by the L2 cache and then main memory; the transfer units between the levels are the L1 cache-line, the L2 cache-line, and the memory page.

Memory latency: the time needed to transfer 1 byte from main memory to the L2 cache.
Cache (L2) miss: the requested data is not in the cache and needs to be fetched from main memory.
Cache-line: the transfer unit from main memory to the cache (e.g., an L2 cache-line is 128 bytes).

Slide 7 - Example of Memory Latency Effects in Databases

Employee(id: 4 bytes, name: 20 bytes, age: 4 bytes, gender: 1 byte)

Disk/memory representation: the 29 bytes of each tuple are stored back to back, e.g.

  | 12305 | Mike Chan | 30 | M | id | George Best | 27 | M | ...
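As a sketch, this NSM record layout could be declared as a packed C struct (assuming a GCC/Clang-style packed attribute; field sizes as on the slide):

  #include <stdint.h>

  /* One Employee tuple as laid out above: 4 + 20 + 4 + 1 = 29 bytes.
     "packed" stops the compiler from inserting alignment padding. */
  struct __attribute__((packed)) Employee {
      int32_t id;        /* 4 bytes  */
      char    name[20];  /* 20 bytes */
      int32_t age;       /* 4 bytes  */
      char    gender;    /* 1 byte: 'M' or 'F' */
  };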

Slide 8 - Example of Memory Latency Effects in Databases (cont'd)

Example query: how many girls work for the company?

  SELECT COUNT(*) FROM EMPLOYEE WHERE gender = 'F';

What is the cost of this query, assuming that the relation is in main memory? We have to access the gender values, compare each of them to 'F', and add 1 (or 0) to the count.
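A minimal sketch of this scan in C, over an in-memory array of the hypothetical Employee structs from above:

  /* Count the employees with gender 'F' by scanning the NSM table.
     Every tuple touched drags its whole 29-byte record through the
     cache, even though only 1 byte of it is needed. */
  long count_female(const struct Employee *emp, long n) {
      long count = 0;
      for (long i = 0; i < n; i++)
          count += (emp[i].gender == 'F');
      return count;
  }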

Slide 9 - Example of Memory Latency Effects in Databases (cont'd)

Accessing a specific address in memory also loads the information around it into the cache-line (transferred in parallel over a wide bus).

Example: accessing the gender of the first tuple brings one whole cache-line (128 bytes) from main memory into the cache; besides the requested 'M' byte, the line also contains the surrounding bytes of the neighbouring tuples.

Slide 10 - Example of Memory Latency Effects in Databases (cont'd)

Thus main memory is accessed one cache-line at a time. If the requested information is already in the cache (e.g., the gender of the second tuple), main memory is not accessed. Otherwise, a cache miss occurs.

Slide 11 - Effect of cache misses

The stride.c program (CSIS7101/stride.c) demonstrates how cache misses may dominate the query processing cost. It sums every stride-th element of a large random buffer; on the slide stride = 5, and the buffer size was left tentative (the value below is assumed):

  /* stride.c: sum every stride-th element of a random buffer */
  #include <stdio.h>
  #include <stdlib.h>

  int main(void) {
      long ilimit = 16 * 1024 * 1024;  /* tentative on the slide; assumed here */
      int stride = 5;
      int *buffer = malloc(ilimit * sizeof(int));
      for (long i = 0; i < ilimit; i++)     /* initialize a random memory buffer */
          buffer[i] = rand();
      long sum = 0;
      for (long i = 0; i < ilimit; i += stride)
          sum += buffer[i];
      printf("sum=%ld\n", sum);
      free(buffer);
      return 0;
  }
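Running the program with increasing stride values typically shows the cost per element growing until the stride exceeds the cache-line size, after which essentially every access is a cache miss and the curve flattens; this is the behaviour plotted on the next slide.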

Slide 12 - Effect of cache misses (cont'd)

(Figure: measured effect of the stride on the running time; chart not reproduced in the transcript.)

Slide 13 - Re-designing the DBMS

We need to redesign the DBMS in order to face the new bottleneck, memory access:
- New storage schemes are proposed to minimize memory and disk accesses.
- Query processing techniques are redesigned to take the memory-latency effects into consideration.
- Algorithms are changed to take advantage of the intra-parallelism of instruction execution.
- New instruction types are used (e.g., the SIMD instructions sketched below).
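As a hedged example of that last point (SSE2 intrinsics plus a GCC/Clang builtin, not from the slides): a single SIMD instruction can compare 16 gender bytes against 'F' at once, so the counting query from slide 8 processes a contiguous gender column 16 tuples per iteration:

  #include <emmintrin.h>  /* SSE2 intrinsics */

  /* Count the 'F' bytes in a contiguous gender column, 16 at a time.
     For brevity, n is assumed to be a multiple of 16. */
  long count_female_simd(const char *gender, long n) {
      const __m128i f = _mm_set1_epi8('F');
      long count = 0;
      for (long i = 0; i < n; i += 16) {
          __m128i v  = _mm_loadu_si128((const __m128i *)(gender + i));
          __m128i eq = _mm_cmpeq_epi8(v, f);            /* 0xFF where equal */
          count += __builtin_popcount(_mm_movemask_epi8(eq));
      }
      return count;
  }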

Slide 14 - Problems in optimizing the DBMS

Programming languages do not have control over the replacement policy of the memory caches; this is determined by the hardware. As a result, we cannot apply buffer management techniques to memory cache management. Also, the simple instructions generated by programming languages cannot control the parallel instruction execution capabilities of the machine. In many cases we re-write the programs, trying to "fool" the cache replacement policy and the instruction execution, in order to make them more efficient.

Slide 15 - The N-ary storage model (NSM)

The N-ary storage model (NSM) stores the information for each tuple as a contiguous sequence of bytes.

Disk/memory representation (as on slide 7):

  | 12305 | Mike Chan | 30 | M | id | George Best | 27 | M | ...

Slide 16 - A Decomposition Storage Model (DSM)

The table is vertically decomposed, and the information for each attribute is stored sequentially in its own binary table:

  a1 (id, name):   (12305, Mike Chan), (..., George Best), ...
  a2 (id, age):    (12305, 30), (..., 27), ...
  a3 (id, gender): (12305, M), (..., M), ...
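A minimal sketch of the decomposed relation as parallel column arrays in C (struct and field names are illustrative):

  #include <stdint.h>

  /* DSM: one (surrogate, value) binary table per attribute, so a scan
     of one attribute touches only that attribute's bytes. */
  struct NameEntry   { int32_t id; char    value[20]; };  /* a1 */
  struct AgeEntry    { int32_t id; int32_t value;     };  /* a2 */
  struct GenderEntry { int32_t id; char    value;     };  /* a3 */

  struct EmployeeDSM {
      struct NameEntry   *a1;
      struct AgeEntry    *a2;
      struct GenderEntry *a3;
      long n;  /* number of tuples in each binary table */
  };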

Slide 17 - Properties of DSM

The relation is decomposed into many binary tables, one for each attribute. The attributes of a binary table are a surrogate id and the attribute from the original relation. The surrogate id is necessary to bring the information for a tuple back together, by joining the binary tables.

Example: print the information for the tuple with id=12305.

NSM:
  SELECT * FROM EMPLOYEE WHERE id=12305

DSM:
  SELECT a1.id, name, age, gender
  FROM a1, a2, a3
  WHERE a1.id=12305 AND a2.id = a1.id AND a3.id = a1.id

Slide 18 - Properties of DSM (cont'd)

Advantages of DSM: if the relation has many attributes but queries involve only a few of them, then:
- much less information is read from disk than in NSM;
- cache misses are fewer than in NSM, because the stride is smaller;
- projection queries are very fast.

Disadvantages of DSM:
- If a tuple needs to be reconstructed, this requires many joins.
- The decomposed relation is larger than the original, due to the replication of the surrogate key.

Slide 19 - A Decomposition Model with Unary Tables (Monet)

Monet stores each attribute as a unary table holding only the attribute values (e.g., M, F, ...); the surrogate ids are not materialized. Each table records the value of the first surrogate and the size of the attribute type in bytes.

Given the surrogate sid of a tuple, we can compute its attribute value v by:

  v = *(table_address + (sid - first_surrogate) * size)
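A sketch of this positional lookup in C (struct and field names assumed):

  /* A Monet-style unary column: surrogate ids are virtual, not stored. */
  struct UnaryColumn {
      char *table_address;    /* start of the contiguous value array */
      long  first_surrogate;  /* surrogate id of the first value */
      int   size;             /* size of the attribute type in bytes */
  };

  /* Address of the value for surrogate sid: one subtraction, one
     multiplication, one addition; no stored ids and no join. */
  static inline void *value_of(const struct UnaryColumn *c, long sid) {
      return c->table_address + (sid - c->first_surrogate) * (long)c->size;
  }

For example, char g = *(char *)value_of(&gender_column, 12305); fetches the gender of tuple 12305 directly.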

Slide 20 - Partition Attributes Across (PAX)

PAX attempts to combine the advantages of both NSM and DSM, while avoiding their disadvantages. The data are split into pages, as in NSM, but the organization within each page is like DSM. Thus:
- Cache misses are minimized, because the information for a specific attribute is stored compactly within the page.
- Record reconstruction cost is low: as in NSM, all attributes of a tuple are stored in the same page on disk, only vertically partitioned within it, so reconstruction never leaves the page and its cost is minimal.
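A hedged sketch of a PAX-style page in C (the real PAX layout also keeps per-minipage offsets and presence bits; this simplification assumes fixed-size attributes and the Employee schema from slide 7):

  #include <stdint.h>

  #define MAX_TUPLES 250  /* 250 tuples * 29 bytes = 7250 bytes, fits an 8 KB page */

  /* PAX page: all values of one attribute are stored contiguously in a
     "minipage", yet every attribute of a tuple lives in the same page. */
  struct PaxPage {
      int32_t n_tuples;                /* page header (simplified) */
      int32_t id[MAX_TUPLES];          /* minipage for id     */
      char    name[MAX_TUPLES][20];    /* minipage for name   */
      int32_t age[MAX_TUPLES];         /* minipage for age    */
      char    gender[MAX_TUPLES];      /* minipage for gender */
  };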

Slide 21 - How do pages look in each scheme?

NSM: each page stores whole tuples, one after another:
  PAGE HEADER | 12305, Mike Chan, 30, M | ..., George Best, 27, M | ...
  PAGE HEADER | ..., John Kit, 33, M | ..., Flora Ho, 27, F | ...

DSM: many tables, each stored in its own pages of (surrogate, value) pairs:
  name pages:   PAGE HEADER | 12305, Mike Chan | ..., George Best | ...
  age pages:    PAGE HEADER | 12305, 30 | ..., 27 | ...
  gender pages: PAGE HEADER | 12305, M | ..., M | ...

PAX: each page holds the same tuples as the corresponding NSM page, but grouped by attribute:
  PAGE HEADER | 12305, ... | Mike Chan, George Best, ... | 30, 27, ... | M, M, ...

Slide 22 - Comparison between PAX and NSM

SELECT AVG(age) FROM EMPLOYEE WHERE id<20000

NSM: whole tuples share each cache-line, so scanning the ids and ages also drags names and genders into the cache: irrelevant data are fetched to the cache.

PAX: the ids and the ages are stored compactly in their own minipages, so only relevant data are fetched to the cache.

Slide 23 - Comparison between PAX and DSM

SELECT AVG(age) FROM EMPLOYEE WHERE name = 'M*'

DSM: scanning the name pages yields the qualifying ids, which then have to be joined with the age pages to find the corresponding ages: a join is needed.

PAX: the names and the ages of the same tuples sit in one page, so the positions qualifying in the name minipage directly locate the ages in the age minipage: the join is avoided.

Slide 24 - Is PAX always better than DSM?

No. If the relation has many attributes (e.g., 10) and the query involves only a few of them (e.g., 2), then the join may be more beneficial than reading whole tuples into memory.

Example: R(a1, a2, a3, ..., a10), where each attribute is 4 bytes long.
DSM tables: D1(id, a1), D2(id, a2), ..., D10(id, a10).
Query: SELECT avg(a2) FROM R WHERE a1=x;
The query is very selective: only 1% of the tuples qualify, so the qualifying ids can fit in memory.

PAX has to read the whole table: 40 bytes × |R|.
DSM has to read D1, apply the selection, and create an intermediate table X with the qualifying ids (in memory). Then it has to read D2 to get the qualifying a2 values that join with X. So in total DSM reads 16 bytes × |R| from disk (8 bytes × |R| for each of D1 and D2).
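A toy check of this arithmetic in C (the cardinality is chosen arbitrarily; attribute and id widths as assumed above):

  #include <stdio.h>

  int main(void) {
      long R = 1000000;                     /* |R|: illustrative cardinality */
      long attr = 4, id = 4, attrs = 10;

      long pax_bytes = attrs * attr * R;    /* whole 40-byte tuples */
      long dsm_bytes = 2 * (id + attr) * R; /* D1 and D2 only       */
      printf("PAX reads %ld bytes, DSM reads %ld bytes\n",
             pax_bytes, dsm_bytes);
      return 0;
  }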

Slide 25 - Summary

In modern computer architectures the bottleneck is memory access; in many cases the processor is idle, waiting for data to be fetched from memory. Memory access is no longer "random": when we access a location in memory, the data around it are also loaded into a fast memory chip (the cache), so access locality is very important. Database operators, storage and buffering schemes, and indexes are optimized for the reduction of cache misses.

Slide 26 - References

George P. Copeland, Setrag Khoshafian: A Decomposition Storage Model. SIGMOD 1985.
Peter A. Boncz, Stefan Manegold, Martin L. Kersten: Database Architecture Optimized for the New Bottleneck: Memory Access. VLDB 1999.
Anastassia Ailamaki, David J. DeWitt, Mark D. Hill, Marios Skounakis: Weaving Relations for Cache Performance. VLDB 2001.
P. A. Boncz: Monet: A Next-Generation DBMS Kernel For Query-Intensive Applications. PhD dissertation, Universiteit van Amsterdam, May 2002.