VectorWise
The world’s fastest database
GIUA, 13 September 2011
© 2011 Ingres Corporation
DBT-3 Database Schema
DBT-3 Data
What is VectorWise?
Started as an academic project
– Centrum Wiskunde & Informatica (CWI)
Python MonetDB X100 VectorWise
Adopted as an Ingres community project
Joint venture company set up between CWI and Ingres Corp.
Now wholly owned by Ingres Corp.

What is VectorWise for?
Data warehousing
Data marts
Data mining
Online Analytical Processing (OLAP)
Business Intelligence

VectorWise Technology
On Chip Computing
[Figure: Time / Cycles to Process against Data Processed for DISK, RAM and CHIP; labels: 10GB, 2-3GB, MB, Millions]
Vector Processing
Breakthrough technology

On Chip Computing
Processing in Chip Cache
CPU cache access is more efficient than RAM access
[Figure: Time / Cycles to Process against Data Processed for DISK, RAM and CHIP; labels: 10GB, 2-3GB, 40-100MB, Millions]

Vector Processing
SISD: Single Instruction, Single Data processed
versus
SIMD: Single Instruction, Multiple Data processed
1 x 1 = 1
2 x 2 = 4
3 x 3 = 9
4 x 4 = 16
5 x 5 = 25
6 x 6 = 36
7 x 7 = 49
8 x 8 = 64
n x n = n^2

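As a rough illustration of the contrast on this slide, the sketch below squares a column one value per call (tuple-at-a-time) and one batch per call (vector-at-a-time). Pure Python does not issue SIMD instructions, so this only shows the difference in control flow, not the hardware speed-up. The function names and the batch size of 4 are assumptions made for the illustration, not VectorWise internals.

```python
from typing import Iterator, List


def square_one(value: int) -> int:
    """One call handles one value (tuple-at-a-time)."""
    return value * value


def square_vector(values: List[int]) -> List[int]:
    """One call handles a whole batch of values (vector-at-a-time)."""
    return [v * v for v in values]


def batches(column: List[int], size: int) -> Iterator[List[int]]:
    """Split a column into consecutive batches of at most `size` values."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


column = [1, 2, 3, 4, 5, 6, 7, 8]

# Tuple-at-a-time: eight calls, one per value.
scalar_result = [square_one(v) for v in column]

# Vector-at-a-time: two calls, each over a batch of four values.
VECTOR_SIZE = 4  # hypothetical batch size, chosen only for this example
vector_result: List[int] = []
for batch in batches(column, VECTOR_SIZE):
    vector_result.extend(square_vector(batch))
```

Both paths produce the same squares; the vector path simply amortises the per-call overhead over a batch, which is the property a SIMD-capable engine exploits.
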
VectorWise Technology
Automatic Compression
Updateable Column Store
Automatic Storage Indexes
Minimize IO
Innovations on industry-proven techniques

Updateable Column Store
Only access relevant data
Efficient incremental update enabled
– Traditionally a weakness of column stores

Cust_Num | Cust_surname | Cust_first_name | Cust_DOB | Cust_Sex | Cust_Add_1 | Cust_Addr_2 | Cust_City | Cust_State
--- | --- | --- | --- | --- | --- | --- | --- | ---
 | Jones | Steven | 17-JAN-1971 | M | 333 St Kilda Rd | | Melbourne | Vic
 | Smith | Leonard | 04-APR-1964 | M | 147 Trafalgar Road | | Birmingham | England
 | Rogers | Cindy | 11-MAR-1980 | F | Belmont Rail Service | 421 Station St | Belmont | CA
 | Andrews | Jenny | 14-SEP-1977 | F | 117 West 42nd St | | New York | NY
 | Cooper | Sheldon | 30-JUN-1980 | M | Ingres Corporation | Level 2, 426 Argello St | Redwood City | CA
 | Kollwitz | Rolf | 22-DEC-1975 | M | IBM Headquarters | 123 Mount View Crs | Atlantic City | PN
 | Wong | Penny | 13-NOV-1981 | F | Ming On Tower 1 | 177 Moa Tzu Tung Rd | Beijing | China

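To make "only access relevant data" concrete, here is a minimal sketch that pivots a few fields of the sample customer records into per-column lists, so a query on one attribute reads one list and nothing else. The helper name `to_columns` and the in-memory dict-of-lists layout are assumptions for the illustration; the slide does not describe how VectorWise lays out or updates its columns.

```python
from collections import Counter
from typing import Dict, List

# Row layout: every record keeps all of its fields together.
rows = [
    {"Cust_surname": "Jones", "Cust_Sex": "M", "Cust_City": "Melbourne"},
    {"Cust_surname": "Smith", "Cust_Sex": "M", "Cust_City": "Birmingham"},
    {"Cust_surname": "Rogers", "Cust_Sex": "F", "Cust_City": "Belmont"},
    {"Cust_surname": "Andrews", "Cust_Sex": "F", "Cust_City": "New York"},
    {"Cust_surname": "Cooper", "Cust_Sex": "M", "Cust_City": "Redwood City"},
    {"Cust_surname": "Kollwitz", "Cust_Sex": "M", "Cust_City": "Atlantic City"},
    {"Cust_surname": "Wong", "Cust_Sex": "F", "Cust_City": "Beijing"},
]


def to_columns(records: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Pivot row records into one list per column (hypothetical helper)."""
    columns: Dict[str, List[str]] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Counting customers by sex reads the Cust_Sex list only;
# surnames and cities are never touched.
sex_counts = Counter(columns["Cust_Sex"])
```

In a row layout the same count would have to walk every record and skip past the fields it does not need.
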
Automatic Compression
Vectorized compression
– Compressed on disk
– Decompression for data processing in CPU cache
– Compressed in RAM
Column based compression with multiple algorithms
– Automatically determined by VectorWise

Compression Methods
Run Length Encoding
– Efficient if many duplicate adjacent tuple values are present
– Such as in ordered columns with few unique values
Patched Frame Of Reference (PFOR)
– Encodes values as a small difference from a page-wide base value
– PFOR is effective on any data distribution with some value distribution locality
Delta encoding on top of PFOR
– Integers are made smaller by considering the differences between subsequent values
– Highly effective on ordered data
PDICT dictionary encoding
– Efficient in case the value distribution is dominated by a limited number of very frequent values
– Is currently the only one that applies to character data types

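The four ideas on this slide can be sketched in a few lines each. These are simplified textbook versions, not the VectorWise implementations: the function names are hypothetical, the base value is taken as the minimum of the input, the `bits` parameter and the exception dictionary stand in for the "patching" of outliers, and no bit-packing is performed.

```python
from typing import Dict, List, Tuple


def rle_encode(values: List[str]) -> List[Tuple[str, int]]:
    """Run length encoding: collapse adjacent duplicates into (value, count)."""
    runs: List[Tuple[str, int]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def pfor_encode(values: List[int], bits: int) -> Tuple[int, List[int], Dict[int, int]]:
    """Frame of reference with patches: small offsets from a base value,
    with outliers that do not fit in `bits` bits kept as exceptions."""
    base = min(values)
    limit = (1 << bits) - 1
    offsets: List[int] = []
    exceptions: Dict[int, int] = {}
    for position, v in enumerate(values):
        offset = v - base
        if offset <= limit:
            offsets.append(offset)
        else:
            offsets.append(0)          # placeholder slot for the outlier
            exceptions[position] = v   # outlier stored uncompressed
    return base, offsets, exceptions


def pfor_decode(base: int, offsets: List[int], exceptions: Dict[int, int]) -> List[int]:
    """Rebuild the values, patching in the exceptions by position."""
    return [exceptions.get(i, base + off) for i, off in enumerate(offsets)]


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, then the difference to each previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the values as a running sum of the deltas."""
    out: List[int] = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def dict_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Dictionary encoding: replace each value by its index in a dictionary."""
    dictionary: List[str] = []
    index: Dict[str, int] = {}
    codes: List[int] = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes
```

On ordered data the deltas are small, so feeding the output of `delta_encode` into a frame-of-reference step is what "delta encoding on top of PFOR" refers to.
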
Automatic Storage Indexes
Stores min/max value per data block
Automatically created
Automatically maintained
Enables efficient identification of candidate data blocks

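A minimal sketch of how a min/max index narrows a range query to candidate blocks. The block size of 4, the sample date values and all function names are assumptions for the illustration; the slide does not state the real block size or index format.

```python
from typing import List, Tuple

BLOCK_SIZE = 4  # hypothetical block size, far smaller than a real data block


def build_blocks(values: List[int], block_size: int) -> List[List[int]]:
    """Cut a column into consecutive blocks."""
    return [values[i:i + block_size] for i in range(0, len(values), block_size)]


def build_minmax_index(blocks: List[List[int]]) -> List[Tuple[int, int]]:
    """Record the (min, max) value of every block."""
    return [(min(block), max(block)) for block in blocks]


def candidate_blocks(index: List[Tuple[int, int]], low: int, high: int) -> List[int]:
    """Block numbers whose [min, max] range overlaps the query range [low, high]."""
    return [n for n, (lo, hi) in enumerate(index) if hi >= low and lo <= high]


def range_scan(blocks: List[List[int]], index: List[Tuple[int, int]],
               low: int, high: int) -> List[int]:
    """Read only the candidate blocks and filter the values inside them."""
    hits: List[int] = []
    for n in candidate_blocks(index, low, high):
        hits.extend(v for v in blocks[n] if low <= v <= high)
    return hits


# Made-up order dates encoded as YYYYMMDD integers.
order_dates = [20110101, 20110105, 20110110, 20110120,
               20110201, 20110210, 20110215, 20110228,
               20110301, 20110305, 20110320, 20110331]

blocks = build_blocks(order_dates, BLOCK_SIZE)
index = build_minmax_index(blocks)
matches = range_scan(blocks, index, 20110205, 20110220)
```

For this query only the middle block overlaps the requested range, so the other two blocks are never read. The benefit depends on the data being at least roughly clustered on the queried column.
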
VectorWise Instance Architecture
[Diagram components:
User Interface (SQL, ABF, OpenROAD, JAVA, etc.); User Interface (SQL, ABF, OpenROAD, etc.);
Name Server (iigcn); Communications Server (iigcc); DAS Server (iigcd);
DBMS Server (iidbms); Locks; Log Buffers; iix100 Server;
IVW Memory; IVW LOCK; IVW LOG; authpass;
Archiver (iiacp); Recovery Server (iircp); Ingres Transaction Log File; Journals;
Databases; VectorWise Data Store]

Operating System
Currently available on 64-bit Linux and Windows
Runs on
– RedHat
– Fedora
– CentOS
– SuSE 11
– Ubuntu
– Works on other Linux flavours
– Windows 2008
– Windows 7

Hardware Requirements
Fast multi-core CPUs
Memory
– 2 Gbytes for OS + IVW requirements + other apps
– Minimum 8 Gbytes
Disk
– Lots

And now live