I590 Data Science Curriculum August

Big Data Open Source Software and Projects
ABDS in Summary IX: Level 11C
I590 Data Science Curriculum, August 15 2014
Geoffrey Fox
gcf@indiana.edu
http://www.infomall.org
School of Informatics and Computing, Digital Science Center, Indiana University Bloomington

HPC-ABDS Layers
Here are 17 functionalities. Technologies are presented in this order: the 4 cross-cutting functionalities at the top, then the 13 others in the order of the layered diagram, starting at the bottom.
Cross-cutting:
  Message Protocols
  Distributed Coordination
  Security & Privacy
  Monitoring
Layers, from the bottom:
  IaaS Management from HPC to hypervisors
  DevOps
  Interoperability
  File systems
  Cluster Resource Management
  Data Transport
  SQL / NoSQL / File management
  In-memory databases & caches / Object-relational mapping / Extraction Tools
  Inter-process communication: collectives, point-to-point, publish-subscribe
  Basic Programming model and runtime: SPMD, Streaming, MapReduce, MPI
  High-level Programming
  Application and Analytics
  Workflow-Orchestration
Covered in this unit: SQL Technologies

Apache Derby
http://db.apache.org/derby/
Apache Derby is a relational database management system written in Java and based on the SQL and JDBC standards.
Derby offers a small footprint (~2.6 megabytes) and an embedded JDBC driver, and is easy to deploy and use.
Derby originated in 1996 at Cloudscape Inc., a startup in Oakland, CA. Cloudscape was acquired by Informix and later by IBM.
IBM donated the code to Apache in 2004, creating the Derby incubator project.
Derby is a subproject of Apache DB.
Derby was bundled with the JDK, rebranded as “Java DB”, beginning with the Java 6 release.
Typically used as an embedded database: the engine runs inside the application's own process rather than as a separate server.
Performance is not competitive as a standalone system.
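
The embedded usage pattern described for Derby can be sketched in a few lines. Derby itself is used from Java through JDBC; the sketch below swaps in Python's standard-library sqlite3 module, a different embedded SQL engine, purely to illustrate the pattern of a database that runs inside the application process with no server to start. The table name, rows, and function name are hypothetical.

```python
import sqlite3


def run_embedded_demo():
    """Create, fill, and query an in-process SQL database.

    sqlite3 is a stand-in for Derby here (an assumption for illustration):
    both run the engine inside the application process.
    """
    # ":memory:" keeps the whole database in this process's memory;
    # no server process or network connection is involved.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE technologies (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )
        # Hypothetical sample rows.
        conn.executemany(
            "INSERT INTO technologies (id, name) VALUES (?, ?)",
            [(1, "Apache Derby"), (2, "Google Cloud SQL"), (3, "Amazon RDS")],
        )
        conn.commit()
        rows = conn.execute("SELECT name FROM technologies ORDER BY id").fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_embedded_demo())
```

In Java, the equivalent step would be opening a JDBC connection with a Derby embedded URL such as jdbc:derby:memory:demo;create=true, which requires the Derby jar on the classpath.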

Public Cloud SQL as a Service
Provides traditional databases as a service on clouds:
  Azure SQL Service http://msdn.microsoft.com/en-us/library/azure/dn133151.aspx
  Google Cloud SQL https://developers.google.com/cloud-sql/
  Amazon Relational Database Service (Amazon RDS) http://aws.amazon.com/rds/ with MySQL, PostgreSQL, Oracle and SQL Server
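
From the client's side, a database offered as a service is reached like any other networked database: a hostname, a port, and the engine's usual protocol. The sketch below is a minimal illustration of that idea, not any provider's official connection-string format. The class name, the host db.example.com, and the generic engine://host:port/database URL shape are all assumptions; only the default port numbers are the engines' well-known defaults.

```python
from dataclasses import dataclass
from typing import Optional

# Well-known default TCP ports for the engines named on the slide.
DEFAULT_PORTS = {
    "mysql": 3306,
    "postgresql": 5432,
    "oracle": 1521,
    "sqlserver": 1433,
}


@dataclass(frozen=True)
class ManagedSqlEndpoint:
    """Hypothetical description of a hosted SQL database endpoint."""

    engine: str
    host: str
    database: str
    port: Optional[int] = None  # None means "use the engine's default port"

    def __post_init__(self):
        if self.engine not in DEFAULT_PORTS:
            raise ValueError(f"unknown engine: {self.engine}")

    def resolved_port(self) -> int:
        # An explicit port wins; otherwise fall back to the engine default.
        if self.port is not None:
            return self.port
        return DEFAULT_PORTS[self.engine]

    def url(self) -> str:
        # Generic URL shape for illustration only; real drivers each
        # define their own connection-string syntax.
        return f"{self.engine}://{self.host}:{self.resolved_port()}/{self.database}"


if __name__ == "__main__":
    endpoint = ManagedSqlEndpoint("postgresql", "db.example.com", "orders")
    print(endpoint.url())
```

The point of the sketch is that only the host changes when a database moves from a local server to a cloud service; the engine, port, and SQL dialect stay the same.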