Evaluation of the C++ binding to the Oracle Database System Performance Issues and Benchmarks Dirk Geppert and Krzysztof Nienartowicz, IT/DB CERN IT Fellow.

Evaluation of the C++ binding to the Oracle Database System: Performance Issues and Benchmarks
Dirk Geppert and Krzysztof Nienartowicz, IT/DB
CERN IT Fellow Seminar, November 20, 2002

Outline - Part II
  Motivation
  To test - what and why?
  Issues
  Results
  OCCI at CERN
  Summary & Conclusions

Motivation
  To understand the possibilities and perils of dealing with a C++ binding to an RDBMS
  Relational solution to deal with bulk data
    - Still tables, tablespaces, extents, indices, views, triggers somewhere down there…
  Server side processing
    - Indices, complex queries, stored procedures
  Impedance mismatch
    - Is OCCI the natural way to go fast?
    - Are SQL99 types efficient enough to deal with objects in Oracle?
  Objects mapped onto tables: limitations, advantages → try to take the best of two worlds
    - Stay pure ((pseudo)pure C++, Java, SQL, PL/SQL, XML)?
    - Or mix (C++, Java, SQL, PL/SQL, XML)?

What to test?
  Relational
    - OCCI relational access pattern copied from DAO/ODBC/JDBC
    - We call some form of SQL and get resultsets as output
    - No mapping of objects on the client; some objects may still partake in the queries on the server
  vs. Associative: objects returned from queries as resultsets
    - Iteration over results from queries or stored procedures
    - Logic identical to Relational…
  vs. Navigational mode
    - Similar to OODB: access to a “roots” table or one root object, then by Refs
  In the real world we should consider mixing Associative with Navigational mode to get optimal performance!
    - (application dependent, rethink 100x first!)

Navigational mode: architectural understanding, aims
  To optimize the client side, limit the number of roundtrips by:
    - Cache usage
    - Prefetching
    - Implicit pinning/unpinning of objects
    - Conscious commit/flush policy
  To optimize server side performance
    - Low level layout of objects: tablespaces, datafiles (manual, automatic, size of extents, striping, index separation)
    - Scope of Refs (local index addressing), REF constraints
  Associative access vs. navigational
    - Preselection on the server with optimised SQL queries vs. navigation optimisation
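The effect of cache usage and prefetching on the number of roundtrips can be illustrated with a small stand-alone model. The sketch below does not use OCCI: `ClientCache`, `countRoundtrips`, the LRU eviction policy, and the rule that one cache miss returns the missing node plus the next `prefetchDepth` nodes are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>

// Hypothetical model, not the OCCI API: a client cache holding up to
// `capacity` objects and evicting the least recently used one when full.
class ClientCache {
public:
    explicit ClientCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    bool contains(std::size_t id) const { return index_.count(id) != 0; }

    // Mark `id` as most recently used, loading it first if it is absent.
    void touch(std::size_t id) {
        auto it = index_.find(id);
        if (it != index_.end()) {
            lru_.erase(it->second);          // already cached: re-queue at front
        } else if (lru_.size() == capacity_) {
            index_.erase(lru_.back());       // full: drop least recently used
            lru_.pop_back();
        }
        lru_.push_front(id);
        index_[id] = lru_.begin();
    }

private:
    std::size_t capacity_;
    std::list<std::size_t> lru_;  // front = most recently used
    std::unordered_map<std::size_t, std::list<std::size_t>::iterator> index_;
};

// Walk a ring of `ringSize` nodes `laps` times. Every cache miss costs one
// roundtrip, which returns the missing node plus the next `prefetchDepth`
// nodes along the ring (an assumed, simplified prefetch rule).
std::size_t countRoundtrips(std::size_t ringSize, std::size_t prefetchDepth,
                            std::size_t cacheCapacity, std::size_t laps) {
    ClientCache cache(cacheCapacity);
    std::size_t roundtrips = 0;
    for (std::size_t step = 0; step < ringSize * laps; ++step) {
        const std::size_t id = step % ringSize;
        if (!cache.contains(id)) {
            ++roundtrips;
            for (std::size_t k = 1; k <= prefetchDepth; ++k)
                cache.touch((id + k) % ringSize);
        }
        cache.touch(id);
    }
    return roundtrips;
}
```

Under these assumptions, walking a ring of 100 nodes twice costs 100 roundtrips when the cache holds all 100 nodes and 200 when it holds only 50, because a cyclic scan larger than an LRU cache misses on every access. A prefetch depth of 9 cuts a single lap to 10 roundtrips.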

Test case
  Navigational access using a ring-of-nodes structure: InnerNode objects with an embedded varray of numbers, in a cyclic linked list
  Easy to model worst case scenario
  Cache size influence
  Embedded dynamic array access
  Non-native Oracle Number type performance, casting impact
  Prefetching effect
  Small server side cache
  Aim: to test raw IO performance within a HEP-like model in OCCI
    - Model simplified during initial tests with 9.0 due to numerous bugs…
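The ring-of-nodes structure can be pictured with a plain in-memory analogue. This is only a sketch: the `InnerNode` layout, the `makeRing` and `traverseSum` helpers, the use of `double` in place of the Oracle Number type, and the use of a vector index in place of a Ref are all assumptions, not the schema used in the tests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical in-memory stand-in for the persistent InnerNode type.
struct InnerNode {
    std::vector<double> values;  // stands in for the embedded varray of numbers
    std::size_t next = 0;        // stands in for the Ref to the next node
};

// Build a cyclic list of `nodeCount` nodes, each with `arraySize` numbers.
// Node i stores the value i in every array slot and refers to node i + 1;
// the last node refers back to node 0, closing the ring.
std::vector<InnerNode> makeRing(std::size_t nodeCount, std::size_t arraySize) {
    assert(nodeCount > 0);
    std::vector<InnerNode> ring(nodeCount);
    for (std::size_t i = 0; i < nodeCount; ++i) {
        ring[i].values.assign(arraySize, static_cast<double>(i));
        ring[i].next = (i + 1) % nodeCount;
    }
    return ring;
}

// Navigate once around the ring starting at `root`, following `next` and
// reading every element of each embedded array.
double traverseSum(const std::vector<InnerNode>& ring, std::size_t root) {
    double sum = 0.0;
    std::size_t id = root;
    do {
        for (double v : ring[id].values) sum += v;
        id = ring[id].next;
    } while (id != root);
    return sum;
}
```

In this model the two test parameters map directly onto `nodeCount` (size of ring) and `arraySize` (size of the embedded number array). The traversal touches every node exactly once per lap regardless of the starting node.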

Cache impact
  Cache size: impacts how many objects can stay in memory without a roundtrip to the server
  Issue: embedded objects do not count!
    - Problem with memory usage if embedded objects tend to be big: the application may occupy much more memory than wanted
    - Calculate the cache size for “root” objects or use REFs instead
  Cache size calculation: max cache size, optimal cache size (8MB)
    - Max size: how many objects may stay resident until they are flushed to the server
    - Optimal size: how much memory the objects should occupy after the max limit has been hit
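The interplay of maximum size, optimal size, and uncounted embedded data can be sketched with a toy cache. Everything here is an assumption for illustration: the `ObjectCache` and `CachedObject` names, the oldest-first ageing order, and the simplification that embedded data is released together with its owner. It is not the OCCI cache implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical object record: only `rootBytes` counts against the cache
// limits, while `embeddedBytes` is extra memory hanging off the object.
struct CachedObject {
    std::size_t rootBytes;
    std::size_t embeddedBytes;
};

class ObjectCache {
public:
    ObjectCache(std::size_t optBytes, std::size_t maxBytes)
        : opt_(optBytes), max_(maxBytes) {
        assert(opt_ <= max_);
    }

    // Add an object. Once the counted size exceeds the maximum, age out the
    // oldest objects until the counted size is back at or below the optimum.
    void add(const CachedObject& obj) {
        objects_.push_back(obj);
        counted_ += obj.rootBytes;
        actual_ += obj.rootBytes + obj.embeddedBytes;
        if (counted_ > max_) {
            while (counted_ > opt_ && !objects_.empty()) {
                const CachedObject& oldest = objects_.front();
                counted_ -= oldest.rootBytes;
                // Assumes the embedded data is freed with its owner.
                actual_ -= oldest.rootBytes + oldest.embeddedBytes;
                objects_.pop_front();
            }
        }
    }

    std::size_t countedBytes() const { return counted_; }  // what the limits see
    std::size_t actualBytes() const { return actual_; }    // what the process holds
    std::size_t objectCount() const { return objects_.size(); }

private:
    std::size_t opt_;
    std::size_t max_;
    std::size_t counted_ = 0;
    std::size_t actual_ = 0;
    std::deque<CachedObject> objects_;  // front = oldest
};
```

With an optimal size of 800 bytes, a maximum of 1000 bytes, and objects of 100 counted bytes each carrying 900 bytes of embedded data, the eleventh insert triggers ageing back to 8 objects. The cache then accounts for 800 bytes while the process holds 8000, which is the gap the slide warns about.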

Lost memories…
  Since only cache objects are garbage collected, deallocation of embedded objects is the user's responsibility!
  Lack of custom deallocation code may cause huge leaks!
  Always bear in mind that cache_size != memory occupied
  One of the memory deallocation problems in version 9.0, fixed in 9.2…

Results, approach
  Environment
  Server
    - Solaris 2.8 with Oracle 9.2.0.0.0 Beta
    - Sun 280R with 2x750MHz CPUs, 1GB RAM
    - Storage: EMC CLARiiON FC4500 system, RAID5, with 2 storage processors, 512 MB of cache each, 14 disks; 4 disks used for the striped tablespace
    - Raw OS throughput is roughly 50MB/s for writing and 33MB/s for reading
  1 to 9 clients running on 1 to 3 sibling Suns

Results, approach, numbers
  Test methodology
  Tests conducted for various settings of
    - Maximum cache size: 5%, 15%, 30%, 50%, 75%, over 100% of the size of all resident objects
    - Size of ring: between 10 and 100K
    - Size of embedded number array: between 10 and 5k
  Maximal size per test case ~25 million objects, repeated 5-10 times per test case on average
  All in all hundreds of billions of object creations/traversals/deletions, some days of test runs
  Results gathered in the database, tens of queries, tens of multidimensional plots
    - http://knienart.home.cern.ch/knienart/OCCIResults/OCCITests9.2.mdb

Versions speed comparison (figure)

Server side vs. client side processing (figure)

Cache speed (figure)

Write speed (figure)

Read speed (figure)

Read to write ratio (figure)

Cache size consideration…

Worlds in collision…
  Very often one must know the database configuration/limitations to avoid bottlenecks while trying to optimise
  Example: short vs. long objects
    - If the size of the embedded object < max_small: save in the same block/extent
    - Else: save in a separate overflow area
  The optimisation turns out to be a decelerator (for write; an accelerator for read… see the Read to write ratio fig.)
  Hint: always differentiate between the access patterns of REFs and embedded objects…
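The short vs. long rule amounts to a single threshold test. The sketch below only restates that rule: `Placement`, `placeEmbedded`, and `countOverflow` are hypothetical names, and `max_small` is passed in as a parameter because its actual value depends on the database configuration and is not given in the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Placement { SameBlock, OverflowArea };

// Apply the rule from the slide: embedded objects smaller than the
// threshold stay in the same block/extent as their owner, the rest go to
// a separate overflow area. `maxSmallBytes` is an assumed parameter.
Placement placeEmbedded(std::size_t embeddedBytes, std::size_t maxSmallBytes) {
    assert(maxSmallBytes > 0);
    return embeddedBytes < maxSmallBytes ? Placement::SameBlock
                                         : Placement::OverflowArea;
}

// Count how many embedded objects of the given sizes would be placed in
// the overflow area for a given threshold.
std::size_t countOverflow(const std::vector<std::size_t>& sizes,
                          std::size_t maxSmallBytes) {
    std::size_t overflowed = 0;
    for (std::size_t size : sizes)
        if (placeEmbedded(size, maxSmallBytes) == Placement::OverflowArea)
            ++overflowed;
    return overflowed;
}
```

A helper like `countOverflow` shows how a planned range of embedded array sizes splits across the two storage paths for a candidate threshold. That split is what moves a test case from one side of the read/write trade-off to the other.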

CERN Applied OCCI
  Compass migration
    - Bulk inserts using OCCI relational mode
    - Framework of dozens of distributed OCCI clients managed by a single manager: hot statistics, framework state in DB, MPI used for management/coordination
    - OCCI pros: exception handling, easily embedded SQL, server side processing, easy persistency layer separation for experiments
    - 5 billion rows…
  Conditions DB
    - Transparent OCCI access without changes to users’ API
  Linux RAC

Some Linux RAC results…
  Interesting study of how distributed application performance depends on the underlying hardware:
    - No network separation for RAC messages and data flow
    - Single shared drive pool for multiple RAC nodes
    - File block level coherency thanks to the Sistina distributed FS
  20-150 clients running from 10 nodes, using MPI synchronization for controlled DB stressing
  Stress put on ingesting: inserted up to 200*10^6 records with BLOBs of a few KB each
  Results: usually 5-8 times slower than a single instance (stable though: no significant degradation when the number of client processes increased from e.g. 20 to 50)

Performance related work
  OCCI could be used as a convenient front end for NTUPLE analysis
  NTUPLE analysis using SQL
  Multidimensional query optimisation
    - Using bitmap indices, function indices, materialized views, etc.; comparison with Root
    - Automatic creation of schema for Oracle (evolution of software written for Alpha++)
    - Automatic import of Root NTuples into Oracle
    - Root speed tests
    - Database tests, refer to:
    - http://knienart.home.cern.ch/R2O/

Summary
  OCCI is quite effective if general rules are obeyed:
    - Use cache carefully
    - Be conscious of what happens with memory
    - Understand server side clues
    - Preselect on the server if you can
  Server side speed can be achieved if
    - We do not kill performance with roundtrips
    - Physical DB design is taken into account during app analysis/design
  Bound to Oracle only, no source code available whatsoever…
  Is C++ the future to keep objects anyway? B&V say Java…
  Benchmarks
    - Were good for learning Oracle internals
    - And for optimisation of performance
    - And for finding pitfalls while designing an OCCI application
  Learnt:
    - There is almost always a way to optimise by tuning the server without changes in the OCCI code
    - Raw performance is worse than, but comparable to, OODB; huge speedups possible using old nifty RDB techniques (indices etc.)
    - Speed can change dramatically even with a minor version change!
    - Oracle types are not C++ types: OCCI conversion is pretty expensive - 10i?

Tools:
  mpatrol
  strace
  Indispensable for the first releases of OCCI…
