Digital Science Center, June 25, 2010, IIT. Geoffrey Fox and Judy Qiu, School of Informatics and Computing and Community Grids Laboratory, Digital Science Center, Pervasive Technology Institute, Indiana University

PTI Activities in Digital Science Center
Community Grids Laboratory, led by Fox
– Gregor von Laszewski: FutureGrid architect, GreenIT
– Marlon Pierce: Grids, Services, Portals, including Earthquake Science, Chemistry, and Polar Science applications
– Judy Qiu: Multicore and Data-Intensive Computing (Cyberinfrastructure), including Biology and Cheminformatics applications
Open Software Laboratory, led by Andrew Lumsdaine
– Software such as MPI and Scientific Computing Environments
– Parallel Graph Algorithms
Complex Networks and Systems, led by Alex Vespignani
– Very successful H1N1 spread simulations run on Big Red
– Can be extended to other epidemics and to “critical infrastructure” simulations such as transportation

FutureGrid Concepts
Support development of new applications and new middleware using Cloud, Grid, and Parallel computing (Nimbus, Eucalyptus, Hadoop, Globus, Unicore, MPI, OpenMP, Linux, Windows, …), looking at functionality, interoperability, and performance
Put the “science” back in the computer science of grid computing by enabling replicable experiments
Open-source software built around Moab/xCAT to support dynamic provisioning from Cloud to HPC environments and from Linux to Windows, with monitoring, benchmarks, and support of important existing middleware
Timeline: June 2010, initial users; September 2010, all hardware (except the IU shared-memory system) accepted and significant use starts; October 2011, FutureGrid allocatable via the TeraGrid process
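The slide names Moab/xCAT but does not show the provisioning interface; the toy Python sketch below only illustrates the dynamic-provisioning idea (reimage nodes whenever a queued job requests a different software stack than the one currently installed). Every name in it (JOB_QUEUE, provision_and_run, the stack labels) is hypothetical, not the actual Moab/xCAT API.

```python
# Hypothetical sketch of dynamic provisioning: a queued job declares the
# environment it needs, and the testbed reimages nodes to match before
# the job runs. Names are illustrative, not the real Moab/xCAT interface.

JOB_QUEUE = [
    {"name": "hadoop-bio", "stack": "linux-hadoop", "nodes": 16},
    {"name": "mpi-bench",  "stack": "linux-mpi",    "nodes": 32},
    {"name": "dryad-swg",  "stack": "windows-hpcs", "nodes": 32},
]

def provision_and_run(job, current_stack):
    """Reimage nodes only when the requested stack differs from the current one."""
    if job["stack"] != current_stack:
        print(f"reimaging {job['nodes']} nodes: {current_stack} -> {job['stack']}")
        current_stack = job["stack"]  # stands in for a Moab/xCAT reimage cycle
    print(f"running {job['name']} on {job['nodes']} x {current_stack} nodes")
    return current_stack

stack = "linux-mpi"
for job in JOB_QUEUE:
    stack = provision_and_run(job, stack)
```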

FutureGrid: a Grid/Cloud Testbed
IU Cray operational; IU IBM (iDataPlex) completed stability test May 6
UCSD IBM operational; UF IBM stability test completed June 12
Network, NID, and PU HTC system operational
UC IBM stability test completed June 7; TACC Dell awaiting completion of installation
NID: Network Impairment Device (the slide figure shows the private and public FG networks)

FutureGrid Partners
Indiana University (architecture, core software, support) – collaboration between research and infrastructure groups
Purdue University (HTC hardware)
San Diego Supercomputer Center at University of California San Diego (INCA, monitoring)
University of Chicago / Argonne National Laboratory (Nimbus)
University of Florida (ViNe, education and outreach)
University of Southern California Information Sciences Institute (Pegasus to manage experiments)
University of Tennessee Knoxville (benchmarking)
University of Texas at Austin / Texas Advanced Computing Center (portal)
University of Virginia (OGF, advisory board and allocation)
Center for Information Services and GWT-TUD from Technische Universität Dresden (VAMPIR)
Institutions shown in red on the slide have FutureGrid hardware

Biology MDS and Clustering Results
Alu Families: this visualizes Alu repeats from the Chimpanzee and Human genomes. Young families (green, yellow) are seen as tight clusters. The plot is an MDS dimension reduction to 3D of the repeats, each about 400 base pairs long.
Metagenomics: this visualizes a dimension reduction to 3D of gene sequences from an environmental sample. The many different genes are classified by a clustering algorithm and visualized by MDS dimension reduction.
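The deck shows only the resulting plots, not the MDS code. For reference, here is a minimal serial SMACOF-style MDS sketch in Python/NumPy; the SALSA implementation is parallel and considerably more elaborate, so treat this purely as an illustration of the algorithm behind the plots.

```python
import numpy as np

def smacof_mds(D, dim=3, iters=300, seed=0):
    """Minimal SMACOF MDS: embed n points in `dim`-D space so that their
    pairwise Euclidean distances approximate the input distance matrix D."""
    n = D.shape[0]
    X = np.random.default_rng(seed).standard_normal((n, dim))
    for _ in range(iters):
        # Current pairwise distances (diagonal padded to avoid divide-by-zero).
        diff = X[:, None, :] - X[None, :, :]
        dist = np.linalg.norm(diff, axis=2)
        np.fill_diagonal(dist, 1.0)
        B = -D / dist
        np.fill_diagonal(B, 0.0)
        np.fill_diagonal(B, -B.sum(axis=1))
        X = B @ X / n  # Guttman transform: monotonically decreases stress
    return X

# Toy usage: four points with known pairwise distances, embedded in 3D.
D = np.array([[0, 1, 2, 1],
              [1, 0, 1, 2],
              [2, 1, 0, 1],
              [1, 2, 1, 0]], dtype=float)
coords = smacof_mds(D)
```

Each Guttman iteration is O(n²) in the number of points, which is why the runs described here parallelize it.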

High Performance Data Visualization
Developed parallel MDS and GTM algorithms to visualize large, high-dimensional data
Processed 100k PubChem data points, each with 166 dimensions
Parallel interpolation can process up to 60M PubChem points
MDS for 100k PubChem data: 100k PubChem points with 166 dimensions are visualized in 3D space; colors represent two clusters separated by their structural proximity.
GTM for 930k genes and diseases: genes (green) and diseases (other colors) are plotted in 3D space, aiming at finding cause-and-effect relationships.
GTM with interpolation for 2M PubChem data: 2M PubChem points are plotted in 3D with the GTM interpolation approach; red points are the 100k sampled data and blue points are the interpolated points.
(PubChem project)
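The interpolation step is what makes tens of millions of points tractable: only a sample is embedded with the full O(n²) algorithm, and each remaining point is then placed independently against its nearest sampled neighbors, which is embarrassingly parallel. Below is a minimal sketch of placing one new point, assuming its target distances to k already-embedded anchor points are known; this illustrates the out-of-sample idea, not the actual SALSA interpolation code.

```python
import numpy as np

def interpolate_point(deltas, anchors, iters=50, lr=0.1):
    """Place one new point into an existing embedding.
    anchors: (k, 3) coordinates of its k nearest sampled points;
    deltas:  (k,) target distances from the new point to those anchors.
    Uses plain gradient descent on sum((|x - a_i| - delta_i)^2)."""
    x = anchors.mean(axis=0)  # start at the centroid of the anchors
    for _ in range(iters):
        diff = x - anchors                      # (k, 3)
        dist = np.maximum(np.linalg.norm(diff, axis=1), 1e-9)
        grad = (2 * (dist - deltas) / dist)[:, None] * diff
        x = x - lr * grad.sum(axis=0)
    return x

# Toy usage: place a point at unit distance from three fixed anchors.
anchors = np.array([[0.0, 0, 0], [1.0, 0, 0], [0.0, 1, 0]])
deltas = np.array([1.0, 1.0, 1.0])
print(interpolate_point(deltas, anchors))
```

Because each new point depends only on the fixed sample embedding, millions of interpolations can be farmed out with no communication, which is what the 60M-point figure above relies on.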

Hadoop/Dryad Comparison: Inhomogeneous Data I
Inhomogeneity of the data does not have a significant effect when the sequence lengths are randomly distributed
Dryad with Windows HPCS compared to Hadoop with Linux RHEL on an iDataPlex (32 nodes)
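A one-sentence reason the random distribution matters: pairwise alignment cost grows roughly quadratically with sequence length, so dealing out sorted sequences in contiguous blocks concentrates the long ones on a few workers, while a random shuffle evens out the expected cost per worker. The self-contained sketch below illustrates that effect; the quadratic cost model is an assumption for illustration.

```python
import random

def partition_cost(lengths, workers):
    """Estimated per-worker cost when sequences are dealt out in contiguous
    blocks; alignment cost is modeled as quadratic in sequence length."""
    block = len(lengths) // workers
    return [sum(l * l for l in lengths[i * block:(i + 1) * block])
            for i in range(workers)]

random.seed(0)
lengths = sorted(random.randint(100, 800) for _ in range(4000))

skewed = partition_cost(lengths, 8)    # sorted input: heavy load imbalance
random.shuffle(lengths)
balanced = partition_cost(lengths, 8)  # randomized input: near-even load

print("max/min cost ratio, sorted:  ", max(skewed) / min(skewed))
print("max/min cost ratio, shuffled:", max(balanced) / min(balanced))
```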

AzureMapReduce (figure-only slide)

Early Results with AzureMapReduce
Compare: Hadoop … ms, Hadoop VM … ms, DryadLINQ … ms, Windows MPI … ms
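The surrounding slides compare these runtimes on pairwise sequence analysis, whose natural MapReduce decomposition is block-wise: map tasks each compute one block of the all-pairs distance matrix, the shuffle/sort stage groups emitted pairs by row, and reduce tasks assemble the rows. Below is a single-process Python model of that decomposition; the block size and the toy mismatch-count distance (standing in for Smith-Waterman scoring) are illustrative assumptions.

```python
from itertools import groupby

seqs = ["ACTG", "ACCA", "TTGA", "ACTA", "GGTC", "ACGT"]
BLOCK = 2  # sequences per matrix block (illustrative)

def distance(a, b):
    """Toy distance: count of mismatched positions (stands in for SWG)."""
    return sum(x != y for x, y in zip(a, b))

def map_block(bi, bj):
    """Map task: emit (row, (col, distance)) pairs for one matrix block."""
    for i in range(bi * BLOCK, (bi + 1) * BLOCK):
        for j in range(bj * BLOCK, (bj + 1) * BLOCK):
            yield i, (j, distance(seqs[i], seqs[j]))

nblocks = len(seqs) // BLOCK
pairs = [kv for bi in range(nblocks) for bj in range(nblocks)
         for kv in map_block(bi, bj)]
pairs.sort(key=lambda kv: kv[0])  # models the framework's shuffle/sort stage
rows = {row: sorted(v for _, v in group)  # reduce: assemble each matrix row
        for row, group in groupby(pairs, key=lambda kv: kv[0])}
print(rows[0])  # row 0 of the distance matrix as (col, distance) pairs
```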