Challenges and Solutions Will Schroeder, co-Founder, President VAC Big Data Consortium Meeting July 31, 2012.

1 Challenges and Solutions Will Schroeder, co-Founder, President VAC Big Data Consortium Meeting July 31, 2012

2 Thanks

3 Big Data: Architecture, Platform, Collaboration

4 Kitware, Inc. Open Source Scientific Computing Software Software Services

5 Kitware CMake CDash ParaView

6

7 Other Kitware Big Data Projects
HPC - Simulation
–Turbulent Flow: Kitware ParaView, 160,000 computing cores, Argonne Intrepid
BioMedical
–Electron Scanning Microscopy / Connectome: resolution towards 100,000² x 10,000
–Whole Slide Imaging / Digital Pathology: resolution at 100,000² x hundreds
Point Clouds
–LIDAR: acquisition rates > 200,000 pts/sec (Kitware VTK / PCL / VES)
Text & Documents
–Web: >8 billion indexed pages (Kitware / VTK / Titan)
Image sources: 3deling.com, nimh.nih.gov

8 Columbus Large Image Format (CLIF) 2007 & 2006 315 204
–8k x 8k tiled image (64 MP)
–Six cameras with 4k x 2.6k images
–8-bit grayscale raw format
–Frame rate ~ 1.6 Hz
–15-30 cm GSD
–Duration ~ 2.8 hrs (16117 frames) in 2007; ~1 hr in 2006
–Metadata
–Camera configuration
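As a back-of-the-envelope check on the CLIF figures, the sketch below estimates the collection duration, sustained data rate, and raw volume of the 2007 data set. It assumes uncompressed storage at one byte per pixel (8-bit grayscale), reads "64 MP" as 64 million pixels per tiled frame, and uses decimal units. The function name `clif_estimate` is illustrative and does not come from the talk.

```python
# Back-of-the-envelope estimate of the CLIF 2007 raw data volume.
# Assumptions (not stated on the slide): uncompressed storage, 1 byte per
# pixel, "64 MP" = 64e6 pixels per tiled frame, decimal units (1 TB = 1e12 B).

PIXELS_PER_FRAME = 64e6   # 8k x 8k tiled image (64 MP)
BYTES_PER_PIXEL = 1       # 8-bit grayscale
FRAME_RATE_HZ = 1.6       # approximate frame rate from the slide
FRAMES_2007 = 16117       # frames in the 2007 collection


def clif_estimate(frames, pixels=PIXELS_PER_FRAME,
                  bytes_per_pixel=BYTES_PER_PIXEL, rate_hz=FRAME_RATE_HZ):
    """Return (duration in hours, sustained MB/s, total TB) for a collection."""
    frame_bytes = pixels * bytes_per_pixel
    duration_hr = frames / rate_hz / 3600
    rate_mb_s = frame_bytes * rate_hz / 1e6
    total_tb = frame_bytes * frames / 1e12
    return duration_hr, rate_mb_s, total_tb


hours, mb_per_s, total_tb = clif_estimate(FRAMES_2007)
print(f"duration ~ {hours:.1f} h, rate ~ {mb_per_s:.0f} MB/s, total ~ {total_tb:.2f} TB")
```

Under these assumptions, 16117 frames at 1.6 Hz works out to about 2.8 hours, which agrees with the duration quoted on the slide, and the raw 2007 collection comes to roughly 1 TB.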

9 SCALABLE ARCHITECTURES: Data-Centric Computing; Client-Server; Co-Processing; Mobile to Supercomputer

10

11 The Traditional Visualization Workflow is Breaking Down (diagram labels: Solver, Full Mesh, Disk Storage, Visualization; image from Rob Ross, Argonne National Laboratory)

12 Small Example
Simulation
–40 million finite element simulation
–File size: 3.2 GB per time step
–1000 time steps
–100 time steps written to disk
Visualization
–ParaView
–Quad-core Mac Pro with 12 GB memory
–IO: 240 secs
–Contour: 25 secs
–Slice: 7 secs
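The numbers on this slide can be combined into a short worked calculation. The sketch below is only illustrative: it assumes the IO, contour, and slice timings are all per time step (the slide does not say), and the function name `summarize` is hypothetical.

```python
# Rough arithmetic for the "Small Example" slide.
# Assumption (not stated on the slide): the IO, contour, and slice timings
# are all measured per time step. Units are decimal (1 TB = 1000 GB).

FILE_GB_PER_STEP = 3.2
STEPS_SIMULATED = 1000
STEPS_WRITTEN = 100
IO_S, CONTOUR_S, SLICE_S = 240, 25, 7


def summarize():
    """Combine the slide's figures into data-volume and timing ratios."""
    analysis_s = CONTOUR_S + SLICE_S
    return {
        # Data actually written to disk versus what the solver produced.
        "written_gb": FILE_GB_PER_STEP * STEPS_WRITTEN,
        "full_tb": FILE_GB_PER_STEP * STEPS_SIMULATED / 1000,
        "fraction_of_steps_kept": STEPS_WRITTEN / STEPS_SIMULATED,
        # Time spent reading data versus time spent analyzing it.
        "io_to_analysis_ratio": IO_S / analysis_s,
        "io_fraction": IO_S / (IO_S + analysis_s),
    }


s = summarize()
print(f"{s['written_gb']:.0f} GB written of {s['full_tb']:.1f} TB generated "
      f"({s['fraction_of_steps_kept']:.0%} of time steps kept)")
print(f"IO takes {s['io_to_analysis_ratio']:.1f}x the analysis time "
      f"({s['io_fraction']:.0%} of the total)")
```

Read this way, only one time step in ten reaches disk (320 GB of 3.2 TB), and reading the data takes 7.5 times longer than the contour and slice operations combined.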

13 Issues
–IO vs. analysis time
–Reduced time accuracy in post-processing
–Data movement
ORNL Jaguar: 2.33 petaflops, 224,256 compute cores

14 Data-Centric Computing

15 ParaViewWeb

16 Co-Processing

17 Mobile to Supercomputer ParaView Kiwi / VES

18 PLATFORM: Toolkits & Modularization; Integration; Software Licenses

19 Toolkits & Modularization

20 Integration (diagram labels: Module 1, Module 2, Module 3, Module 2 (Python), Integration Glue)

21 Software Licenses
Early: Reciprocal Licenses
–Require release of software combined with open-source software
–Generally discourage commercial collaboration
–E.g., GPL
Now: Permissive Licenses
–Few strings attached
–Suitable for commercial collaboration
–E.g., BSD, Apache, MIT

22 COLLABORATION: Multi-view, Multi-control; Test-Driven Development / Software processes

23 Multi-View, Multi-Control Collaboration ParaViewWeb

24 Software Repository Build, Test & Package Community Review Developers & Users

25

26 Summary: Scalable architectures; Agile, open platforms; Robust, test-driven collaboration

27

28 Scientists Publisher Journals Evolution Papers Peer-Review

29 If it’s not reproducible, it’s not Science. Nullius in Verba: “take nobody's word for it” (Royal Society, 1660)

30 Failure of Reproducibility
Nature (March 2012)
–Glenn Begley, former head of cancer research at pharma giant Amgen
–Lee M. Ellis, cancer researcher at the University of Texas
Found that about 89% (47 of 53) of papers published in science journals describing "landmark" breakthroughs in preclinical cancer research could not be reproduced, and are thus just plain wrong.

