1
SciDB Array Storage
Mijung Kim
2/15/13
2
ArrayStore [Soroush et al. 2011]
ArrayStore: A Storage Manager for Complex Parallel Array Processing, Emad Soroush, Magdalena Balazinska, and Daniel Wang. SIGMOD'11, June 12-16, 2011, Athens, Greece
3
Array slicing and dicing using a large chunk
4
Array slicing and dicing using small chunks
5
Array slicing and dicing using two-level chunks
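The trade-off between one large chunk, many small chunks, and two-level chunks can be sketched roughly as follows. This is a minimal illustration that assumes regular (fixed-size) square chunks and tiles in a 2-D array and that a storage unit is always read whole. The function names and sizes are hypothetical and are not taken from ArrayStore or SciDB.

```python
def cells_read(query, unit):
    """Cells fetched for a slice when storage units are unit x unit squares
    and every unit touched by the slice is read in full.
    query = (x_lo, x_hi, y_lo, y_hi), inclusive bounds."""
    x_lo, x_hi, y_lo, y_hi = query
    units_x = x_hi // unit - x_lo // unit + 1
    units_y = y_hi // unit - y_lo // unit + 1
    return units_x * units_y * unit * unit


def locate(x, y, chunk, tile):
    """Two-level address of a cell: (chunk index, tile index inside the chunk)."""
    chunk_idx = (x // chunk, y // chunk)
    tile_idx = ((x % chunk) // tile, (y % chunk) // tile)
    return chunk_idx, tile_idx


query = (10, 13, 20, 23)          # a 4x4 slice
large = cells_read(query, 64)     # one large 64x64 chunk is read in full
small = cells_read(query, 4)      # only the 4x4 units touching the slice
address = locate(10, 20, chunk=64, tile=4)
```

In this toy setting the large chunk reads 4096 cells to answer a 16-cell slice, while 4x4 units read 32. A two-level layout aims to keep large chunks on disk while letting a query touch only the inner tiles it needs.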
6
Join on Misaligned Chunks
7
Join on Re-partitioned Chunks
8
Join on Misaligned Chunks
9
kNN
10
kNN with Overlap Chunk on Two-Level Storage
Reduce I/O overhead!!
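A rough sketch of why overlap can reduce I/O for neighbourhood queries such as kNN. It assumes a single dimension with regular chunks and approximates the kNN search window as a fixed radius around the query cell. All names are hypothetical, and this is not the ArrayStore implementation.

```python
def chunk_extent(idx, chunk_size, overlap, start, end):
    """Inclusive [lo, hi] range stored with chunk `idx` along one dimension,
    including `overlap` cells replicated from each neighbouring chunk."""
    core_lo = start + idx * chunk_size
    core_hi = min(core_lo + chunk_size - 1, end)
    return max(core_lo - overlap, start), min(core_hi + overlap, end)


def chunks_needed(center, radius, chunk_size, overlap, start, end):
    """Chunk indices to read for the window [center - radius, center + radius]."""
    home = (center - start) // chunk_size
    lo, hi = chunk_extent(home, chunk_size, overlap, start, end)
    q_lo, q_hi = max(center - radius, start), min(center + radius, end)
    if lo <= q_lo and q_hi <= hi:
        return [home]                       # the overlap covers the whole window
    first = (q_lo - start) // chunk_size
    last = (q_hi - start) // chunk_size
    return list(range(first, last + 1))     # neighbouring chunks must be read too


# Query cell 19 sits on the edge of chunk 1 (cells 10..19) in a 0..99 dimension.
without_overlap = chunks_needed(19, 2, chunk_size=10, overlap=0, start=0, end=99)
with_overlap = chunks_needed(19, 2, chunk_size=10, overlap=2, start=0, end=99)
```

Without overlap the window spills into the next chunk, so two chunks are read. With an overlap of two cells, the home chunk alone answers the query.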
11
SciDB Storage
System catalog DB (Postgres): Array Meta Data (ID, name, dimensions, attributes, etc.)
Header Files: Storage Header (Version, Segment usage, # Chunks, etc.)
Transaction Log Files
Data Files (Segments)
…
12
SciDB Pipelined Array Processing
A chunk is materialized into memory (e.g., C11).
One chunk at a time is streamed into and out of an operation: C11, C12, C23, … → Operation
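Chunk-at-a-time pipelining can be illustrated with Python generators. This is only an analogy under the assumption that each operator pulls one chunk from its input, processes it, and passes it on. The operator names and chunk contents are made up, and SciDB's actual operator interface is not shown in the slides.

```python
def scan(stored_chunks):
    """Source operator: yields one (chunk_id, cells) pair at a time."""
    for chunk_id, cells in stored_chunks:
        yield chunk_id, cells


def apply_op(fn, upstream):
    """Applies fn to every cell of each incoming chunk."""
    for chunk_id, cells in upstream:
        yield chunk_id, [fn(v) for v in cells]


def filter_cells(pred, upstream):
    """Keeps only the cells of each incoming chunk that satisfy pred."""
    for chunk_id, cells in upstream:
        yield chunk_id, [v for v in cells if pred(v)]


stored = [("C11", [1, 2, 3]), ("C12", [4, 5, 6]), ("C23", [7, 8, 9])]

# Nothing is computed until the pipeline is consumed; each chunk then flows
# through both operators before the next chunk is scanned.
pipeline = filter_cells(lambda v: v > 5, apply_op(lambda v: v * 2, scan(stored)))
result = list(pipeline)
```

Only one chunk per operator needs to be materialized at any moment, which is the property the slide highlights.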
13
Load Chunk
Query: Lookup DB chunk
Chunk Map: Array ID, Chunk, Storage address, …
Chunk Header: DiskPos (SegmentNo, Offset, …)
Read chunk: Segments
Add chunk to cache: LRU-based in-memory cache (Chunk, …)
Load in-memory chunk
Swap chunk: Temp file
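The lookup, read, and cache steps of the load path might be sketched like this. It assumes a chunk map from (array ID, chunk) to a (segment number, offset) address and a plain LRU eviction policy. The class, its fields, and the sample addresses are hypothetical, and eviction to a temp file is only simulated with a list.

```python
from collections import OrderedDict


class ChunkCache:
    """Tiny LRU cache keyed by storage address (segment_no, offset)."""

    def __init__(self, capacity, read_chunk):
        self.capacity = capacity
        self.read_chunk = read_chunk      # callable: address -> chunk data
        self.cache = OrderedDict()        # least recently used entry first
        self.evicted = []                 # stand-in for swapping to a temp file

    def load(self, address):
        if address in self.cache:
            self.cache.move_to_end(address)       # mark as most recently used
            return self.cache[address]
        chunk = self.read_chunk(address)          # cache miss: read from segment
        self.cache[address] = chunk
        if len(self.cache) > self.capacity:
            old_address, _ = self.cache.popitem(last=False)
            self.evicted.append(old_address)
        return chunk


# Hypothetical chunk map: (array ID, chunk) -> (segment number, offset)
chunk_map = {("A", "C11"): (0, 0), ("A", "C12"): (0, 4096), ("A", "C23"): (1, 0)}

reads = []                                # records every simulated disk read


def read_chunk(address):
    reads.append(address)
    return f"data@{address}"


cache = ChunkCache(capacity=2, read_chunk=read_chunk)
for chunk_name in ["C11", "C12", "C11", "C23", "C12"]:
    cache.load(chunk_map[("A", chunk_name)])
```

The second request for C11 is served from memory, while C12 has to be read again because it was the least recently used entry when C23 arrived.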
14
Store Array
Source Array: C11, C12, C23, … (SourceChunkIterator)
Copy chunk
Write chunk: Segments (C11, C12, C23, …)
RLEChunkIterator
Empty Array (e.g., RLEBitmap)
15
Create Array
CREATE ARRAY array_name <attributes> [dimensions]
Dimension: [name=start:end,chunk_size,chunk_overlap]
E.g., CREATE ARRAY m4x4 <val1:double,val2:int32> [x=0:3,4,0,y=0:3,4,0];
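To see how the dimension specification determines chunking, the sketch below parses a `name=start:end,chunk_size,chunk_overlap` string and counts the resulting chunks. It is an illustrative helper written in Python, not part of SciDB, and it assumes integer bounds (an unbounded end is not handled).

```python
def parse_dimension(spec):
    """Parse 'name=start:end,chunk_size,chunk_overlap' with integer bounds."""
    name, rest = spec.split("=")
    bounds, chunk_size, overlap = rest.split(",")
    start, end = bounds.split(":")
    return name.strip(), int(start), int(end), int(chunk_size), int(overlap)


def chunk_count(start, end, chunk_size):
    """Number of chunks along one dimension (the last chunk may be partial)."""
    length = end - start + 1
    return (length + chunk_size - 1) // chunk_size


def total_chunks(dim_specs):
    """Total chunks of an array: product of the per-dimension chunk counts."""
    total = 1
    for spec in dim_specs:
        _, start, end, chunk_size, _ = parse_dimension(spec)
        total *= chunk_count(start, end, chunk_size)
    return total


m4x4 = total_chunks(["x=0:3,4,0", "y=0:3,4,0"])    # the example array
finer = total_chunks(["x=0:3,2,0", "y=0:3,2,0"])   # same array, chunk size 2
```

With a chunk size of 4 in both dimensions, the 4x4 example array fits in a single chunk. Halving the chunk size to 2 would split it into four.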
16
CP ALS
17
CP ALS [figure: matrices A, B, C]
18
CP ALS [figure: matrices A, B, C]
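The slides do not spell out the update rules, so the following is the common textbook formulation of CP ALS for a three-way tensor, given here as an assumption about what the figures depict. Here $\mathcal{X}$ is the input tensor, $X_{(n)}$ its mode-$n$ unfolding, $R$ the rank, $\odot$ the Khatri-Rao product, $\ast$ the element-wise product, and $\dagger$ the pseudo-inverse.

```latex
\mathcal{X} \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r
\qquad\text{with}\qquad
A = [a_1 \cdots a_R],\; B = [b_1 \cdots b_R],\; C = [c_1 \cdots c_R]

\begin{aligned}
A &\leftarrow X_{(1)} \,(C \odot B)\,\bigl(C^{\top}C \ast B^{\top}B\bigr)^{\dagger} \\
B &\leftarrow X_{(2)} \,(C \odot A)\,\bigl(C^{\top}C \ast A^{\top}A\bigr)^{\dagger} \\
C &\leftarrow X_{(3)} \,(B \odot A)\,\bigl(B^{\top}B \ast A^{\top}A\bigr)^{\dagger}
\end{aligned}
```

Each update holds two factor matrices fixed and solves a least-squares problem for the third, so every step needs the other two matrices as input. The three updates are repeated until the fit stops improving.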
19
Distributed CP ALS [figure: matrices A, B, C; Worker 1, Worker 2, Worker 3]