Data Clustering Research in CMS
Koen Holtman, CERN/CMS, Eindhoven University of Technology
CHEP 2000, Feb 7-11, 2000
Introduction
– 3-year Ph.D. project on 'prototyping of CMS storage management'
  – Focus on disk/tape-based physics analysis, objects >= 1 KB
  – Focus on scalability and (re)clustering
– Clustering: placement of object data on physical storage media (disk, tape)
– Reclustering: rearranging the clustering
  – Clustering and reclustering are not specific to Objectivity or object databases
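The clustering/reclustering distinction can be made concrete with a small sketch. The function names and the seek-counting cost model below are illustrative, not part of the CMS prototype: clustering is modeled as the on-media order of objects, and reclustering rearranges that order so objects that are read together become adjacent.

```python
def cluster(objects):
    """Initial clustering: objects laid out in creation order."""
    return list(objects)

def recluster(layout, access_pattern):
    """Rearrange the layout so objects read together become adjacent:
    objects in the access pattern come first, in access order."""
    hot = [o for o in access_pattern if o in layout]
    cold = [o for o in layout if o not in access_pattern]
    return hot + cold

def seeks_needed(layout, access_pattern):
    """Count non-sequential jumps the medium must make to serve the
    access pattern (a crude proxy for I/O cost)."""
    positions = {obj: i for i, obj in enumerate(layout)}
    idx = [positions[o] for o in access_pattern]
    return sum(1 for a, b in zip(idx, idx[1:]) if b != a + 1)

layout = cluster(["ev1", "ev2", "ev3", "ev4", "ev5", "ev6"])
pattern = ["ev2", "ev5", "ev6"]        # analysis reads a sparse subset
before = seeks_needed(layout, pattern) # one jump: ev2 -> ev5
after = seeks_needed(recluster(layout, pattern), pattern)  # hot objects adjacent: 0
```

After reclustering, the selected objects can be read in one sequential sweep, which is the whole point of matching the clustering to the access pattern.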
I/O Risks for LHC 1
– Risk of insufficient scalability
  – I/O scalability issues have been studied (240 clients, 172 MB/s)
  – Structure data into chunks (runs); use chunk-level subjobs
  – 'Private' DBs
  – Need a read-ahead optimization: 'bursty sequential reading'
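The two scalability tactics named above can be sketched as follows. This is a hedged illustration with invented names, not the prototype's API: a run is partitioned into chunks so each subjob works on one chunk independently, and 'bursty sequential reading' replaces per-object read requests with a small number of large sequential bursts.

```python
def make_chunks(event_ids, chunk_size):
    """Partition a run into fixed-size chunks; each chunk feeds one
    chunk-level subjob, so subjobs never contend for the same data."""
    return [event_ids[i:i + chunk_size]
            for i in range(0, len(event_ids), chunk_size)]

def read_requests(chunk, burst_size):
    """I/O requests a subjob issues for its chunk: naive per-object
    reads vs. bursty sequential read-ahead."""
    per_object = len(chunk)
    bursty = -(-len(chunk) // burst_size)  # ceiling division
    return per_object, bursty

run = list(range(1000))                       # 1000 events in one run
chunks = make_chunks(run, 250)                # 4 chunk-level subjobs
naive, bursty = read_requests(chunks[0], 50)  # 250 requests vs. 5 bursts
```

Fewer, larger requests keep the storage system in its sequential regime, which is where the quoted aggregate rates (172 MB/s with 240 clients) are achievable.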
I/O Risks for LHC 2
– Risk of insufficient I/O performance
  – MB/s needs for interactive physics analysis: unknown
  – MB/s in 2005: GB/s sequential I/O on the CERN CMS disk farm
  – Random I/O can be a large factor slower! (well understood)
– Clustering is important
  – Subdetector clustering
HEP problem
– Main HEP problem: increasing selectivity over time degrades performance to that of random reading
  – Well understood by now
– Solution: recluster 'by hand' (DSTs)
  – Is 'by hand' good enough? Issues: consistency, number of users, space, effort, on-demand reconstruction
– Research on automatic reclustering
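Why increasing selectivity degrades sequential reading toward random reading can be shown with a back-of-the-envelope model. All numbers here (100 MB/s sequential rate, 10 ms seek, 10 KB objects) are illustrative assumptions, not measurements from the project: once a job selects only a sparse fraction of the objects in a sequentially clustered store, each useful object costs roughly one seek plus one transfer instead of transfer time alone.

```python
def effective_rate(selectivity, seq_mb_s=100.0, seek_ms=10.0, object_kb=10.0):
    """Useful MB/s delivered when a job reads a fraction `selectivity`
    of uniformly scattered objects. Simplistic: below 100% selectivity,
    every selected object is assumed to pay one full seek."""
    object_mb = object_kb / 1024.0
    if selectivity >= 1.0:
        return seq_mb_s                      # pure sequential scan
    per_object_s = seek_ms / 1000.0 + object_mb / seq_mb_s  # seek-dominated
    return object_mb / per_object_s

full = effective_rate(1.0)     # sequential scan: 100 MB/s
sparse = effective_rate(0.01)  # 1% selectivity: seek-dominated, under 1 MB/s
```

At 1% selectivity the seek term dominates and the useful rate collapses by roughly two orders of magnitude, which is exactly the degradation that hand-made DSTs, and automatic reclustering, are meant to avoid.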
Disk reclustering
– Developed on-the-fly + batch reclustering
  – Dynamically recluster data based on observed new access patterns
  – Implemented as an 'object store' class
– Keeps I/O efficiency on disk good enough, automatically
– Supports on-demand reconstruction
– Scaling...
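The on-the-fly scheme can be sketched as a toy object store. This class and its thresholds are invented for illustration and are not the thesis implementation: the store records which objects are read, and after enough observations a batch step rewrites the layout so the recently co-accessed objects sit together in access order.

```python
class ReclusteringStore:
    """Toy 'object store' that observes reads and reclusters on the fly."""

    def __init__(self, objects, recluster_every=3):
        self.layout = list(objects)      # current on-disk order
        self.recent = []                 # observed access pattern
        self.recluster_every = recluster_every

    def read(self, obj):
        self.recent.append(obj)
        if len(self.recent) >= self.recluster_every:
            self._recluster()
        return obj

    def _recluster(self):
        """Batch step: move recently co-accessed objects to the front,
        preserving their access order."""
        hot = list(dict.fromkeys(self.recent))       # dedupe, keep order
        cold = [o for o in self.layout if o not in hot]
        self.layout = hot + cold
        self.recent = []

store = ReclusteringStore(["a", "b", "c", "d", "e"])
for obj in ["d", "b", "e"]:
    store.read(obj)
# layout is now ["d", "b", "e", "a", "c"]
```

Because the store itself triggers the rewrite, I/O efficiency stays good without user intervention; a real implementation would also bound how often the batch step runs, since reclustering is itself an I/O cost.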
Tape (re)clustering 1
– Clustering on tape: HENP GC
– Cache filtering and chunk reclustering in a multi-user analysis system with disk and tape
Tape (re)clustering 2
– Cache filtering yields a factor 1-50 performance gain, depending on workload parameters
  – Compensates to some extent for the low clustering efficiency on tape
– Chunk reclustering does not seem attractive: performance gains only for very small disk farm sizes
– So risks remain large
– Extension path...
(Figure: performance with vs. without cache filtering)
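The cache-filtering idea can be illustrated with a deliberately simplified model (all parameters invented; the real gain depends on workload parameters, hence the factor 1-50 range above): when a tape file is staged, only the objects the workload actually touches are kept in the disk cache, so later analysis passes hit disk instead of re-staging from tape.

```python
def tape_reads(n_passes, objects_per_file, cache_filtering):
    """Objects fetched from tape over repeated analysis passes on one
    tape file. Without filtering, the cache is assumed too small to hold
    the whole file, so every pass re-stages it; with filtering, only the
    used subset is cached and the first pass alone goes to tape."""
    if cache_filtering:
        return objects_per_file            # staged once, then served from disk
    return n_passes * objects_per_file     # every pass re-stages from tape

no_filter = tape_reads(10, 1000, cache_filtering=False)  # 10000 tape reads
filtered = tape_reads(10, 1000, cache_filtering=True)    # 1000 tape reads
gain = no_filter / filtered                              # factor 10
```

In this toy model the gain simply equals the number of passes; in the real system it also depends on cache size, object selectivity, and how many users share the cache, which is why the measured range is so wide.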
Conclusions
– Existing practice:
  – Chunks/runs, subjobs, sequential access, subdetector-based clustering
– In this project:
  – Validated existing practice; detailed investigation of disk performance, scalability, read-ahead, disk reclustering, and a disk+tape system with cache filtering
– Remaining risks:
  – We don't know how much I/O is needed; clustering efficiency on tape; WAN issues
  – To investigate systems with large caching effects, access patterns are needed
– Design for a large parameter space through simulation
Access patterns
– We never know enough about access patterns
  – Known: object sizes, increasing selectivity, full reconstruction
  – Not well known: user-level physics analysis
– In systems with large caching effects, these parameters have a large effect on performance
– Performance of a tape+disk based analysis system for various workload parameters
  – Strategy: design over a large parameter space (simulation)
  – Strategy: investigate the parameters and their importance