1
ISDA + OpenStack
Rob Kooper
2
BrownDog
3
NSF ACI Data Program
Geoffrey Fox, $5,000,000, 2014-2019: Middleware and High Performance Analytics Libraries for Scalable Data Science
Ken Koedinger, $4,830,819: Building a Scalable Infrastructure for Data-Driven Discovery and Innovation in Education
Kenton McHenry, $10,519,716
Alex Szalay, $7,603,723: Long Term Access to Large Scientific Data Sets: The SkyServer and Beyond
Michael Levine, $4,902,601: The Data Exacell
Xiaohui Carol Song, $3,409,029: Integrating Geospatial Capabilities into HUBzero
Reagan Moore, $8,300,992
Steven Ruggles, $7,993,266
Margaret Hedstrom, $8,000,000
Bill Michener, $21,194,548
Golam Choudhury, $10,085,120
4
“Big Data”
At least two big components:
  Large quantities of data
  Large varieties of data
“Long-Tail” (chart axes: Number of grants, Dollars)
5
The Problem Addressed by Brown Dog
Large collections of un-curated and/or unstructured digital data (“long-tail” data):
  Many file formats
  No metadata
  No useful filenames
  No useful directory structure
  No textual contents
6
What Is Needed
Means of deciphering the bytes that make up digital data so that one can retrieve its contents
  Data structures (e.g. images, 3D points, sound waves, strings, fields, matrices, etc.)
Means of indexing data contents so that large collections of data can be searched and desired data found
An ability to compare data
7
What Is Typically Needed To Do This
The file format specifications describing how contents are represented within the file’s bytes, the software used to create and view the data, software to convert to a format that is accessible, and the execution environment (platform, operating system, libraries, other software, etc.).
The existence of metadata describing the data (possibly as simple as useful file/directory names), in order to search/index data.
8
Clowder
9
Manage Raw Files and Derived Metadata
Image taken from camera
10
File Uploaded to DTS
11
Extracted Metadata in Web Interface
12
Extracted Metadata from Service API
13
RabbitMQ
[Diagram: RabbitMQ vhost (clowder). Legend: Clowder instances, Exchanges, Bindings, Queues, Extractors.]
Clowder instances and their exchanges: Clowder (clowder.ncsa), DTS (dts.ncsa), IMLCZO (imlczo.ncsa)
Bindings: *.file.image.#, *.file.composed.zip, *.file.image.tiff
Queue: ncsa.image.preview
Extractors: Image Extractor, Face Extractor, GeoExtractor
Other labels: Faces, Shape File, GeoTiff File
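The binding keys in the diagram follow the AMQP topic-exchange convention, where `*` stands for exactly one dot-separated word and `#` for zero or more words. The sketch below illustrates that matching rule in plain Python. It is not Clowder's or RabbitMQ's code, and the routing keys used in the example (such as `clowder.file.image.png`) are hypothetical values chosen only to exercise the patterns shown on the slide.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic binding pattern matches a routing key.

    '*' matches exactly one dot-separated word; '#' matches zero or more words.
    """
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: succeed only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        return (p[i] == "*" or p[i] == k[j]) and match(i + 1, j + 1)

    return match(0, 0)


# Binding patterns taken from the slide; the routing keys are hypothetical.
print(topic_matches("*.file.image.#", "clowder.file.image.png"))
print(topic_matches("*.file.image.tiff", "imlczo.file.image.png"))
```

With these rules, a broad binding such as `*.file.image.#` receives every image message, while `*.file.image.tiff` receives only TIFF images.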
14
OpenStack
15
Projects
5 projects + 1 generalized project
ISDA project for our group
  Allows members to start/stop test instances
  General instances
BrownDog
  Compute nodes + data nodes
Other projects
  1 compute node + 1 data node split into smaller pieces
16
Servers
No more ISDA VM servers
Use OpenStack to host servers
Volumes store server information
Use Puppet to manage servers
Easy to create instances
Command-line access to create servers
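As an illustration of creating a server from the command line, the sketch below builds an argument list for the unified `openstack` client's `server create` command. The slides do not say which client or options the group uses, so the choice of client and the server, image, flavor, and key names are all assumptions. The function only builds the command; it does not run it.

```python
import shlex
from typing import List, Optional


def server_create_command(
    name: str,
    image: str,
    flavor: str,
    key_name: Optional[str] = None,
    user_data: Optional[str] = None,
) -> List[str]:
    """Build (but do not execute) an `openstack server create` command line."""
    cmd = ["openstack", "server", "create", "--image", image, "--flavor", flavor]
    if key_name:
        cmd += ["--key-name", key_name]
    if user_data:
        # Path to a cloud-init / user-data file passed to the new instance.
        cmd += ["--user-data", user_data]
    cmd.append(name)
    return cmd


# Hypothetical names, for illustration only.
cmd = server_create_command("isda-test-1", "CoreOS", "m1.small", key_name="isda-key")
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run` on a machine where the client is installed and credentials are configured.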
17
Servers @ ISDA
Currently using 134 monitored machines
12 physical machines
102 VMs on Xen + ESXi
6 VMs on OpenStack (will only increase)
18
Elasticity and BrownDog
RabbitMQ used for messages
  Every operation is a message
  A message queue for each operation
Elasticity code monitors RabbitMQ
  Based on the number of messages, start new instances
  If the load drops below a threshold number of messages, stop instances
Elasticity code can start:
  VM images (multiple instances of code in a VM image)
  Docker images
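The scaling rule on this slide can be sketched as a small decision function: given the number of queued messages and the number of running instances, decide whether to start or stop instances. This is a minimal sketch, not the BrownDog elasticity code. The parameters `per_instance`, `min_instances`, and `max_instances` and their default values are hypothetical.

```python
import math
from typing import Tuple


def plan(
    queued: int,
    running: int,
    per_instance: int = 100,   # hypothetical: messages one instance should handle
    min_instances: int = 0,    # hypothetical lower bound
    max_instances: int = 10,   # hypothetical upper bound
) -> Tuple[str, int]:
    """Return ("start", n), ("stop", n), or ("hold", 0) for one operation's queue."""
    needed = math.ceil(queued / per_instance)
    target = max(min_instances, min(max_instances, needed))
    if target > running:
        return ("start", target - running)
    if target < running:
        return ("stop", running - target)
    return ("hold", 0)


print(plan(450, 2))   # queue has grown: more instances needed
print(plan(0, 3))     # queue is empty: instances can be stopped
```

A real monitor would call such a function periodically for each operation's queue and would likely add a delay before stopping instances, so that short dips in load do not cause instances to be stopped and restarted repeatedly.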
19
Throw Away Instances
Many of the same instances
Used for running the same software many times
  Clowder extractors
  Clowder tool instances
Use CoreOS with Docker
  Pass in cloud-init to initialize the instance
  At boot time, download the Docker container and start it
All instances can be turned off and restarted
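The boot-time step can be illustrated with a small generator for the user-data passed to an instance. The sketch assumes the legacy CoreOS cloud-config format, in which a `coreos.units` entry defines a systemd unit that is started at boot. The image name `example/clowder-extractor` and the unit name are hypothetical, and the slides do not show the actual configuration used.

```python
def coreos_cloud_config(image: str, unit_name: str = "extractor.service") -> str:
    """Render a cloud-config that pulls and runs one Docker image at boot."""
    lines = [
        "#cloud-config",
        "coreos:",
        "  units:",
        f"    - name: {unit_name}",
        "      command: start",
        "      content: |",
        "        [Unit]",
        "        Description=Run container at boot",
        "        After=docker.service",
        "        Requires=docker.service",
        "",
        "        [Service]",
        # Download the container image, then run it; restart if it exits.
        f"        ExecStartPre=/usr/bin/docker pull {image}",
        f"        ExecStart=/usr/bin/docker run --rm {image}",
        "        Restart=always",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical image name, for illustration only.
print(coreos_cloud_config("example/clowder-extractor:latest"))
```

Because the instance keeps no state beyond what the container image provides, it can be deleted and recreated from the same user-data at any time, which is what makes such instances disposable.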