1 Challenges and New Trends in Data Intensive Science Panel at Data-aware Distributed Computing (DADC) Workshop HPDC Boston June 24 2008 Geoffrey Fox Community.

Slides:

Advertisements

Similar presentations

Supporting Research on Campus - Using Cyberinfrastructure (CI) Public research use of ICT has rapidly increased in the past decade, requiring high performance.

Advertisements

O AK R IDGE N ATIONAL L ABORATORY U. S. D EPARTMENT OF E NERGY Center for Computational Sciences Cray X1 and Black Widow at ORNL Center for Computational.

The Time Horizons of the R&D Activities of the Business Groups and of Corporate Technology are Different A seamless transition from R&D in Corporate Technology.

U.S. Department of Energy’s Office of Science Basic Energy Sciences Advisory Committee Dr. Daniel A. Hitchcock October 21, 2003

Presentation at WebEx Meeting June 15,  Context  Challenge  Anticipated Outcomes  Framework  Timeline & Guidance  Comment and Questions.

Funding Opportunities at NSF Jane Silverthorne International Arabidopsis Consortium Workshop January 15, 2011.

1 Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21) NSF-wide Cyberinfrastructure Vision People, Sustainability, Innovation,

IMPECS Workshop on Social Computing Niloy Ganguly & Krishna Gummadi.

1 Cyberinfrastructure Framework for 21st Century Science & Engineering (CF21) IRNC Kick-Off Workshop July 13,

Clouds from FutureGrid’s Perspective April Geoffrey Fox Director, Digital Science Center, Pervasive.

Riding the Wave: a Perspective for Today and the Future APA Conference, November 2011 Monica Marinucci EMEA Director for Research, Oracle.

NSF and Environmental Cyberinfrastructure Margaret Leinen Environmental Cyberinfrastructure Workshop, NCAR 2002.

14 July 2000TWIST George Brett NLANR Distributed Applications Support Team (NCSA/UIUC)

Commodity Architectures and Army Research Challenges Workshop on Edge Computing Using New Commodity Architectures (EDGE) 24 May 2006 J. Michael Coyle Program.

1 Supplemental line if need be (example: Supported by the National Science Foundation) Delete if not needed. Supporting Polar Research with National Cyberinfrastructure.

Knowledge Environments for Science and Engineering: Current Technical Developments James French, Information and Intelligent Systems Division, Computer.

Knowledge Environments for Science and Engineering: Overview of Past, Present and Future Michael Pazzani, Information and Intelligent Systems Division,

Computing MS Degrees Masters Degrees in Computing at GMU Jeff Offutt Professor of Software Engineering Chair, Graduate Studies Committee Coordinator, MS-SWE.

Cyberinfrastructure Supporting Social Science Cyberinfrastructure Workshop October Chicago Geoffrey Fox

1 Building National Cyberinfrastructure Alan Blatecky Office of Cyberinfrastructure EPSCoR Meeting May 21,

Welcome to HTCondor Week #14 (year #29 for our project)

3DAPAS/ECMLS panel Dynamic Distributed Data Intensive Analysis Environments for Life Sciences: June San Jose Geoffrey Fox, Shantenu Jha, Dan Katz,

HPC and e-Infrastructure Development in China’s High- tech R&D Program Danfeng Zhu Sino-German Joint Software Institute (JSI), Beihang University Dec.

1 Challenges Facing Modeling and Simulation in HPC Environments Panel remarks ECMS Multiconference HPCS 2008 Nicosia Cyprus June Geoffrey Fox Community.

Big Data and Clouds: Challenges and Opportunities NIST January Geoffrey Fox

Transforming Data-Driven Publications and Decision Support Joan L. Aron, Ph.D. Consultant Federal Big Data Working Group COM.BigData 2014.

X-Informatics Introduction: What is Big Data, Data Analytics and X-Informatics? January Geoffrey Fox

CI Days: Planning Your Campus Cyberinfrastructure Strategy Russ Hobby, Internet2 Internet2 Member Meeting 9 October 2007.

PolarGrid Geoffrey Fox (PI) Indiana University Associate Dean for Graduate Studies and Research, School of Informatics and Computing, Indiana University.

Science Clouds and FutureGrid’s Perspective June Science Clouds Workshop HPDC 2012 Delft Geoffrey Fox

OpenQuake Infomall ACES Meeting Maui May Geoffrey Fox

Applying Grid Computing Research to Commercial IR Applications Presented by Carl Sylvia, SBIR Project Manager Deep Web Technologies, LLC GGF-14 – June.

What is Cyberinfrastructure? Russ Hobby, Internet2 Clemson University CI Days 20 May 2008.

Computer Laboratory Practicing at the Faculty of Natural Science and Mathematics Vesna Veličković Marko Milošević Workshop on Lab Practicing in Computer.

Service - Oriented Middleware for Distributed Data Mining on the Grid ，劉妘鑏 Antonio C., Domenico T., and Paolo T. Journal of Parallel and Distributed.

Pascucci-1 Valerio Pascucci Director, CEDMAV Professor, SCI Institute & School of Computing Laboratory Fellow, PNNL Massive Data Management, Analysis,

Hassan A. Karimi Geoinformatics Laboratory School of Information Sciences University of Pittsburgh 3/27/20121.

On the Need for a Data Decadal Survey Stanley C. Ahalt, Ph.D. Director, RENCI Professor, UNC-Chapel Hill Dept. of Computer Science Director, Biomedical.

08/05/06 Slide # -1 CCI Workshop Snowmass, CO CCI Roadmap Discussion Jim Bottum and Patrick Dreher Building the Campus Cyberinfrastructure Roadmap Campus.

The Role of Academic Libraries in the Digital Data Universe Break-Out Session: New Partnership Models Bob Hanisch and Brian Schottlaender Co-Leaders ARL.

Cyberinfrastructure What is it? Russ Hobby Internet2 Joint Techs, 18 July 2007.

SALSASALSASALSASALSA Cloud Panel Session CloudCom 2009 Beijing Jiaotong University Beijing December Geoffrey Fox

November Geoffrey Fox Community Grids Lab Indiana University Net-Centric Sensor Grids.

Computational Science & Engineering meeting national needs Steven F. Ashby SIAG-CSE Chair March 24, 2003.

Community Grids Lab at Pervasive Technology Labs Geoffrey Fox

Applications and Requirements for Scientific Workflow May NSF Geoffrey Fox Indiana University.

Cyberinfrastructure Overview Russ Hobby, Internet2 ECSU CI Days 4 January 2008.

Cyberinfrastructure: Many Things to Many People Russ Hobby Program Manager Internet2.

SALSASALSASALSASALSA Digital Science Center February 12, 2010, Bloomington Geoffrey Fox Judy Qiu

1 Why is Digital Curation Important for Workforce and Economic Development? Alan Blatecky Office of Cyberinfrastructure Symposium on Digital Curation in.

Big Data to Knowledge Panel SKG 2014 Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China August Geoffrey Fox

HPC in the Cloud – Clearing the Mist or Lost in the Fog Panel at SC11 Seattle November Geoffrey Fox

Applications and Requirements for Scientific Workflow May NSF Geoffrey Fox Indiana University.

1 Cloud Systems Panel at HPDC Boston June Geoffrey Fox Community Grids Laboratory, School of informatics Indiana University

Directions in eScience Interoperability and Science Clouds June Interoperability in Action – Standards Implementation.

Challenge: LIKES 1.CISE Pathways to Revitalized Undergraduate Education (CPATH) “to transform undergraduate computing education on a national scale, to.

1 Open Science Grid: Project Statement & Vision Transform compute and data intensive science through a cross- domain self-managed national distributed.

Scientific Computing at Fermilab Lothar Bauerdick, Deputy Head Scientific Computing Division 1 of 7 10k slot tape robots.

Hall D Computing Facilities Ian Bird 16 March 2001.

Geoffrey Fox Panel Talk: February

Panel: Beyond Exascale Computing

Status and Challenges: January 2017

NSF : CIF21 DIBBs: Middleware and High Performance Analytics Libraries for Scalable Data Science PI: Geoffrey C. Fox Software: MIDAS HPC-ABDS.

Biology MDS and Clustering Results

Hilton Hotel Honolulu Tapa Ballroom 2 June 26, 2017 Geoffrey Fox

Department of Intelligent Systems Engineering

PolarGrid and FutureGrid

Panel on Research Challenges in Big Data

Cloud Futures Panel -- Future cloud-related research

Convergence of Big Data and Extreme Computing

Presentation transcript:

1 Challenges and New Trends in Data Intensive Science Panel at Data-aware Distributed Computing (DADC) Workshop HPDC Boston June Geoffrey Fox Community Grids Laboratory, School of informatics Indiana University

HPDC DADC Panel Questions I What do you think is the most challenging research problem in data-intensive computing today? What will it be five years from now? – Same problem as for any distributed computing arena Should we call it clouds What is a sustainable architecture and associated software What are special requirements of scientific data processing How do they relate across sensors, “classic” astronomy/particle physics and “scholarly studies” How do these relate to commercial systems that perhaps look most like “scholarly studies” Is there any chance today’s 100 idiosyncratic data and metadata systems will survive 5 year projection depends a lot on evolution of compute, data and network systems/hardware The demand for Data-intensive computing has also attracted the attention of the federal funding agencies. NSF, DOE and other agencies are introducing new programs focusing on data-intensive computing such as DataNet, INTEROP, CLuE, SCiDAC Data Centers etc. Can we say Data Intensive Computing will be the next big/hot thing? How long do you think this trend will continue? – Note 30 years ago I was solving LHC problem on a few hundred 1600 BPI tapes (around 10 gigabyte of data ). I used identical analysis software (as did those from 15 years before) but volume of data a factor of million lower – HPCC Initiative shifted emphasis to simulation in 1990 – Datanet: Note raw data and software thrown away (without asking me; both “saved” on data repository chipstore) 12 years ago but I do have results of analysis – So problem is not so new; scale is new – I want to stress that agencies should keep a good balance between research and deployment; I am not so sure community has done very well on deployment

HPDC DADC Panel Questions II What How do you think the new Petaflop scale systems and multi- core machines will affect data-intensive science? – Multi-core will be very important as the high performance distributed data analysis system for plethora of distributed data – Not clear what computational complexity of important data mining algorithms will be – Unclear if Petaflops very relevant; many smaller machines better and not allowed to use Petaflops in this fashion Could you comment on the vision of your institution on the "next generation data-intensive science & discovery"? – Very important in usual sciences – Biology, Chemistry, Physics – Indiana University has no engineering school so social science or “scholarly knowledge” especially interesting – Education and Training; clearly identified as a serious problem – Propose (as chair of informatics department) certificates/minors in X- Informatics (a wrinkle on Computational Science) X = Geography, Library, Health, Business, Media, Sports Already have full set of offerings in Bio, Chem, Security, Music, Social, HCI ….