1 Maximizing Performance With Informix TimeSeries
Jeffrey McMahon Session E03 IBM Mon/Apr 23/1:05 4/17/20184/15/12 Session Z99

2 Agenda
- Create basic time series data in your database.
- Use container pooling to spread container usage for optimal I/O.
- Best practices when loading time series data.
- Best practices when purging time series data.
- Monitoring loads and purges to ensure maximum throughput.

3 TimeSeries Table
What does a TimeSeries table look like?
    Meter_id  Series
    1         [( :00, value 1, value 2, …, value N), ( :15, value 1, value 2, …, value N), …]
    2
    3
    4
Table grows

4 Building a table with TimeSeries
Calendartable
Create a calendar of 15-minute intervals:
    insert into calendartable (c_name, c_calendar)
    values (
        'interval_15min_gmt',  -- calendar name
        'startdate( :00: ), pattern({1 on, 14 off }, minute)'
    );
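As an illustration of how the `{1 on, 14 off}, minute` pattern yields one valid slot every 15 minutes, here is a minimal Python sketch. It only models the pattern arithmetic and is not how the server evaluates calendars. The function name `expand_pattern` and the start date are assumptions made for the example (the start date in the transcript is elided).

```python
from datetime import datetime, timedelta
from itertools import cycle


def expand_pattern(start, pattern, unit, count):
    """Return the first `count` "on" timestamps of a repeating on/off pattern.

    `pattern` is a list of (length, is_on) pairs, for example
    [(1, True), (14, False)] for "{1 on, 14 off}".
    """
    steps = [is_on for length, is_on in pattern for _ in range(length)]
    if not any(steps):
        raise ValueError("pattern has no 'on' interval")
    stamps = []
    current = start
    # Walk the flattened pattern forever, one `unit` per step.
    for is_on in cycle(steps):
        if len(stamps) == count:
            break
        if is_on:
            stamps.append(current)
        current += unit
    return stamps


start = datetime(2011, 1, 1)  # hypothetical start date, not from the slides
slots = expand_pattern(start, [(1, True), (14, False)], timedelta(minutes=1), 4)
print([s.strftime("%H:%M") for s in slots])
# ['00:00', '00:15', '00:30', '00:45']
```

One "on" minute followed by fourteen "off" minutes repeats every 15 minutes, which is why the calendar is named `interval_15min_gmt`.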

5 Building a table with TimeSeries
Create Rowtype
Create a row type to hold your interval data:
    create row type vee_interval_type (
        reading_dt     datetime year to fraction(5),  -- must be fraction(5)!
        reading_flag   smallint,
        reading_value  decimal(14,3),
        indicator      smallint,
        code           char(40)
    );

6 Building a table with TimeSeries
Create Table
Create a table with a TimeSeries column:
    create table vee_interval_table (
        meter_id         bigint,
        reading_type     char(10),
        measuring_unit   char(10),
        vee_interval_ts  TimeSeries(vee_interval_type),
        primary key (meter_id)
    );

7 Building a table with TimeSeries
TSContainerCreate
Create a "container" to hold the TimeSeries interval data:
    execute procedure tscontainercreate (
        'container1',         -- container name
        'rootdbs',            -- dbspace (rootdbs usually isn't used!)
        'vee_interval_type',  -- TimeSeries rowtype
        10000,                -- first, next extent (KB)
    );

8 Building a table with TimeSeries
TSCreate
Add a row to the table:
    insert into vee_interval_table values (
        1,      -- meter id (primary key)
        'dmd',  -- reading type
        'kwh',  -- unit of measurement (kwh)
        TSCreate(
            'interval_15min_gmt',  -- cal_name
            ' :00: ',              -- date for the earliest element possible
            0,                     -- threshold (0 = all data in container)
            0,                     -- zero
            0,                     -- nelems (in-row space to preallocate)
            'container1' )         -- container name to hold this timeseries data
    );

9 Building a table with TimeSeries
TSContainerSetPool
Create a pool of containers called meter_pool. The pool is used to automatically assign a container to a TimeSeries (the individual containers must already exist):
    execute procedure tscontainersetpool( 'container1', 'meter_pool' );
    execute procedure tscontainersetpool( 'container2', 'meter_pool' );
    execute procedure tscontainersetpool( 'container3', 'meter_pool' );

10 Building a table with TimeSeries
TSContainerSetPool
Create a pool of containers called meter_pool:
    execute procedure tscontainersetpool( 'container1', 'meter_pool' );
    execute procedure tscontainersetpool( 'container2', 'meter_pool' );
    execute procedure tscontainersetpool( 'container3', 'meter_pool' );
Pools can be used to uniformly distribute TimeSeries across multiple containers.

11 Building a table with TimeSeries
TSContainerPoolRoundRobin
Automatically assign containers to TimeSeries in round-robin order using a pool:
    insert into vee_interval_table values (
        2, 'dmd', 'kwh',
        TSCreate(
            'interval_15min_gmt',
            ' :00: ',
            0, 0, 0,
            tscontainerpoolroundrobin(
                'vee_interval_table',
                'vee_interval_ts',
                'vee_interval_type',
                0,
                'meter_pool') )
    );
- This works well only if each time series grows at about the same rate.
- Stock market data would not work well with pools.
- For these cases it is better to assign time series to containers using custom logic.
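The caveat about growth rates can be illustrated with a small Python model. This is only a sketch of the round-robin idea, not the server routine. The helper `assign_round_robin` and the element counts are hypothetical.

```python
from collections import Counter
from itertools import cycle


def assign_round_robin(meter_ids, pool):
    """Map each meter id to a container name, cycling through the pool."""
    containers = cycle(pool)
    return {meter_id: next(containers) for meter_id in meter_ids}


pool = ["container1", "container2", "container3"]
assignment = assign_round_robin(range(1, 10), pool)

# The number of series per container comes out even.
series_per_container = Counter(assignment.values())

# The stored volume is only even when the series grow at similar rates.
# Hypothetical growth: meter 1 produces 100 times the elements of the others.
elements = {meter_id: 100 for meter_id in assignment}
elements[1] = 10_000
volume_per_container = Counter()
for meter_id, container in assignment.items():
    volume_per_container[container] += elements[meter_id]

print(series_per_container)
print(volume_per_container)
```

In this model every container holds three series, yet `container1` ends up with 10,200 elements against 300 for each of the other two. That imbalance is the reason for the custom-assignment advice when growth rates differ widely.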

12 Building a table with TimeSeries
Spreading the TimeSeries
    SELECT meter_id, vee_interval_ts FROM vee_interval_table;

    meter_id
    vee_interval_ts  origin( :00: ), calendar(interval_15min_gmt), container(container1), threshold(0), regular, []

    meter_id
    vee_interval_ts  origin( :00: ), calendar(interval_15min_gmt), container(container2),

    meter_id
    vee_interval_ts  origin( :00: ), calendar(interval_15min_gmt), container(container3),

13 Result: vee_interval_table Table
Each container should be placed on a separate disk.
[Figure: vee_interval_table with columns meter_id (int) and vee_interval_ts (timeseries(mtr_data)); rows for meters 1 to 8, with their series stored across Container1, Container2, and Container3.]

14 What a Container Looks Like
- Each time series has a unique ID generated internally.
- For regular TimeSeries, this ID plus the offset is used to search the btree.
- For irregular TimeSeries, this ID plus the timestamp is used to search the btree.
- Each data page holds sorted data for exactly one time series.
[Figure: btree with keys MTR1 Jan 1, MTR1 Mar 3, MTR4 Jan 1, MTR7 Jan 1, MTR10 Jan 1, MTR13 Jan 1, each pointing to a data page.]
Data pages:
- Data for MTR1 on Jan 1, 2, 3, 4, 5, …
- Data for MTR1 on Mar 3, 4, 5, 6, 7, …
- Data for MTR4 on Jan 1, 2, 3, 4, 5, …
- Data for MTR7 on Jan 1, 2, 3, 4, 5, …
- Data for MTR10 on Jan 1, 2, 3, 4, 5, …
- Data for MTR13 on Jan 1, 2, 3, 4, 5, …
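To make the "ID plus offset" lookup for regular series concrete, the following Python sketch computes an offset from the origin and finds the data page through a sorted list of first keys. The sorted list merely stands in for the btree, and the internal layout is not documented in these slides. The names `regular_offset` and `find_page`, the origin date, and the page boundaries are assumptions.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def regular_offset(origin, timestamp, interval):
    """Offset of `timestamp` in a regular series: whole intervals since origin."""
    return (timestamp - origin) // interval


def find_page(page_first_keys, key):
    """Index of the page whose first key is the greatest one <= `key`.

    `page_first_keys` is a sorted list of (series_id, offset) tuples,
    one per data page.
    """
    index = bisect_right(page_first_keys, key) - 1
    # A page holds data for exactly one series, so the ids must match.
    if index < 0 or page_first_keys[index][0] != key[0]:
        raise KeyError(key)
    return index


origin = datetime(2011, 1, 1)        # hypothetical origin
interval = timedelta(minutes=15)

# Pages mirroring the figure: MTR1 Jan 1, MTR1 Mar 3, MTR4 Jan 1, MTR7 Jan 1.
pages = [
    (1, regular_offset(origin, datetime(2011, 1, 1), interval)),
    (1, regular_offset(origin, datetime(2011, 3, 3), interval)),
    (4, 0),
    (7, 0),
]

key = (1, regular_offset(origin, datetime(2011, 3, 10, 12, 0), interval))
print(key, "-> page", find_page(pages, key))
```

Because a regular series has exactly one slot per calendar interval, the timestamp does not need to be part of the key. An irregular series has no such fixed spacing, which is why its key uses the timestamp instead.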

15 Loading TimeSeries
The key to good load performance is I/O parallelism:
- Use multiple containers for your TimeSeries data.
- Run multiple loaders in parallel.
- Don't allow two loaders to load the same container.
- Balance the number of loaders against the number of CPUs and (virtual) disks.
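A minimal Python sketch of the "no two loaders on the same container" rule: containers are split into disjoint groups and each group is handed to one worker. `plan_loaders` and `load_group` are hypothetical helpers, and `load_group` is a placeholder for whatever loader process is actually used.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_loaders(containers, loader_count):
    """Split containers into at most `loader_count` disjoint groups."""
    loader_count = max(1, min(loader_count, len(containers)))
    groups = [[] for _ in range(loader_count)]
    for index, container in enumerate(containers):
        groups[index % loader_count].append(container)
    return groups


def load_group(group):
    """Placeholder for one loader; returns the containers it handled."""
    # A real loader would open its own connection and load each container's data.
    return list(group)


containers = ["container1", "container2", "container3", "container4"]
groups = plan_loaders(containers, loader_count=2)

# One worker per group, so a container is never touched by two loaders.
with ThreadPoolExecutor(max_workers=len(groups)) as executor:
    handled = list(executor.map(load_group, groups))

print(handled)
# [['container1', 'container3'], ['container2', 'container4']]
```

The loader count is the tuning knob: raising it beyond the number of CPUs or disks adds contention rather than throughput, and it can never usefully exceed the number of containers.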

16 Loading TimeSeries
Other considerations:
- You will most likely have to preprocess the input data for best performance:
  - Pre-sort the data by ascending time and group it by primary key.
  - Create N input files, one for each loader process to load.
  - Ensure all the data for a particular container is assigned to the same loader process.
- Create TimeSeries with threshold(0) to ensure all time series data is loaded into containers and not into the home row.
Note: Inserting data into a TimeSeries virtual table is slow!
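The preprocessing steps above can be sketched in Python as follows. This is one possible approach under stated assumptions: records are `(meter_id, timestamp, value)` tuples, and the two lookup dictionaries (meter to container, container to loader) are already known. The function name `prepare_input` and all sample values are hypothetical.

```python
from collections import defaultdict
from operator import itemgetter


def prepare_input(records, container_of, loader_of):
    """Split records per loader, grouped by meter id and sorted by time.

    records      -- iterable of (meter_id, timestamp, value) tuples
    container_of -- dict: meter_id -> container name
    loader_of    -- dict: container name -> loader number
    """
    per_loader = defaultdict(list)
    for record in records:
        # All data for one container goes to the same loader.
        loader = loader_of[container_of[record[0]]]
        per_loader[loader].append(record)
    # Sorting on (meter_id, timestamp) groups by key and orders by ascending time.
    return {loader: sorted(rows, key=itemgetter(0, 1))
            for loader, rows in per_loader.items()}


# Hypothetical unsorted input; ISO-style timestamps sort correctly as strings.
records = [
    (2, "2011-01-01 00:15", 1.2),
    (1, "2011-01-01 00:15", 0.7),
    (2, "2011-01-01 00:00", 1.1),
    (1, "2011-01-01 00:00", 0.5),
    (3, "2011-01-01 00:00", 2.0),
]
container_of = {1: "container1", 2: "container2", 3: "container1"}
loader_of = {"container1": 0, "container2": 1}

batches = prepare_input(records, container_of, loader_of)
for loader, rows in sorted(batches.items()):
    print(loader, rows)
```

Each value in `batches` would then be written to that loader's input file, giving one file per loader with every container's data confined to a single file.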

17 Loading TimeSeries
Ensure one loader per container.
- If the base table is fragmented, you can create containers per table fragment.
- Possibly store the container id in the home row and use it in the loads and queries: "where container_id == X".
- Otherwise, you can use the GetContainerName() UDR to determine which container a TimeSeries is assigned to.

18 Loading TimeSeries
The TimeSeries Loader API is the fastest way to load TimeSeries data (these names are about to change!):
- Init
- Put: builds a 32k input buffer
- Flush: flushes the current buffer
- Close
- Shutdown
Requires that you pre-sort and pre-group the data into files, as mentioned previously.

19 The End Result
[Figure: unsorted, ungrouped data is turned into data sorted by time and grouped by primary key, then fed to four loaders; each loader writes to exactly one of Container1 to Container4 for vee_interval_table.]

20 How it Works
TSBinLoad_Flush()
32K buffer: … | (Id A, Time X+7, values) | (Id A, Time X+8, values) | (Id B, Time X, values) | …
The 32K buffer holds consecutive records grouped by TimeSeries ID and sorted by ascending time or offset.
[Figure: the buffer is flushed through the btree for Container1 into the data page for Id A and the data page for Id B, each holding elements (Time X, Values), (Time X+1, Values), (Time X+2, Values), …]
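The put/flush behaviour can be modelled in a few lines of Python. This is a sketch of the buffering idea only. The class `BufferedLoader` and its `sink` callback are hypothetical and are not the TimeSeries Loader API, whose names the slides say are about to change.

```python
class BufferedLoader:
    """Collects encoded records and hands them to `sink` in batches."""

    def __init__(self, sink, capacity=32 * 1024):
        self.sink = sink            # callable that receives a list of bytes records
        self.capacity = capacity    # buffer size in bytes (32 KB on the slide)
        self.records = []
        self.used = 0

    def put(self, record):
        # Flush first if this record would overflow the buffer.
        if self.records and self.used + len(record) > self.capacity:
            self.flush()
        self.records.append(record)
        self.used += len(record)

    def flush(self):
        # Hand over whatever is buffered, then start an empty buffer.
        if self.records:
            self.sink(self.records)
            self.records = []
            self.used = 0

    def close(self):
        self.flush()


# Tiny capacity so the batching is visible; records are 20 bytes each.
batches = []
loader = BufferedLoader(batches.append, capacity=64)
for _ in range(10):
    loader.put(b"x" * 20)
loader.close()
print([len(batch) for batch in batches])
# [3, 3, 3, 1]
```

Because the input is grouped by series and sorted by time, each flushed batch holds consecutive elements. That ordering is presumably what lets a flush write whole data pages instead of revisiting the same page repeatedly.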

21 IBM AMT-Sybex Benchmark
263

22 IBM AMT-Sybex Benchmark
263

23 Purging TimeSeries
- DelClip(): deletes elements but leaves the page allocated.
- DelTrim(): deletes elements and reclaims space only if deleting data at the end of the TimeSeries.
- DelRange(): deletes elements and reclaims space at any location in the TimeSeries.

24 Purge Details
- Purge is very similar to load.
- The key to success is parallelism:
  - Run multiple purge operations.
  - Never run two purges on the same container at the same time.
Future work
- Attach/detach container partitions

25 Container Usage
Calculate the speed of container usage during loads and purges:
- TSContainerUsage()
- TSContainerTotalUsed()
- TSContainerTotalPages()
    execute function tscontainerusage('container1');
    pages  slots  total
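One way to turn two usage readings into a rate is sketched below in Python. The sample values are hypothetical, and the helper names `usage_rate` and `seconds_until_full` are assumptions. The page counts would come from sampling the container usage routines listed above at two points in time.

```python
def usage_rate(first, second):
    """Pages consumed per second between two (seconds, pages_used) samples."""
    (t0, pages0), (t1, pages1) = first, second
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (pages1 - pages0) / (t1 - t0)


def seconds_until_full(pages_used, total_pages, rate):
    """Seconds until the allocated pages are used up at the given rate."""
    if rate <= 0:
        # Usage is flat or shrinking (for example during a purge).
        return float("inf")
    return (total_pages - pages_used) / rate


# Hypothetical readings taken 60 seconds apart during a load.
rate = usage_rate((0, 1200), (60, 1800))
print(rate)                                   # 10.0 pages per second
print(seconds_until_full(1800, 5000, rate))   # 320.0 seconds
```

During a purge the same calculation gives a negative rate, which reads as pages freed per second.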

26 UDR Cache
Try to achieve one UDR per list: PC_HASHSIZE / PC_POOLSIZE
    onstat -g cac prc
    list#  id  ref_cnt  dropped?  heap_ptr  udr name
    eb4f838 d ed02038 c467c38 dd47038 d52a838 eb50838

27 Preload Shared Libraries
PRELOAD_DLL_FILE onconfig parameter:
    PRELOAD_DLL_FILE $INFORMIXDIR/extend/TimeSeries.5.00.FC3/TimeSeries.bld
    PRELOAD_DLL_FILE $INFORMIXDIR/extend/TimeSeries.5.00.FC3/tsbloader.bld

28 Questions?!?

29 Maximizing Performance with Informix TimeSeries
Jeffrey McMahon

