Cyberinfrastructure Supporting Social Science
Cyberinfrastructure Workshop, October 16, 2012, Chicago
Geoffrey Fox
https://portal.futuregrid.org


Cyberinfrastructure Supporting Social Science
Cyberinfrastructure Workshop, October 16, 2012, Chicago
Geoffrey Fox
Informatics, Computing and Physics, Indiana University Bloomington

Goal of Day
- Come up with a few (3-5) projects that advance Social Sciences Cyberinfrastructure
- Choose so that together they cover spectrum of characteristics

Characteristics   A B C … Z
Project 1         X X X
Project 2         X X X
…
Project N         X X
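The "together they cover spectrum" goal can be checked mechanically. A minimal sketch, assuming each candidate project is tagged with the set of characteristics it exercises; the project names, the letters A to E, and the helper `uncovered` are hypothetical placeholders, not items from the workshop.

```python
def uncovered(projects, characteristics):
    """Return the characteristics that no chosen project exercises."""
    # Union of everything the chosen projects touch.
    covered = set().union(*projects.values())
    return sorted(set(characteristics) - covered)


# Hypothetical tagging of two candidate projects (illustration only).
chosen = {
    "Project 1": {"A", "B", "C"},
    "Project 2": {"B", "C", "D"},
}
gaps = uncovered(chosen, ["A", "B", "C", "D", "E"])
print("Characteristics still uncovered:", gaps)
```

An empty result would mean the chosen projects jointly touch every listed characteristic; anything left over suggests adding or swapping a project.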

Data Type
- What is large? #Collections v. Collection Size v. #Users
- “Big (Social) Science” v. Long Tail
- # rows v. # columns v. time dependence
- Structured (defined) v. unstructured (inferred/discovered) metadata
- Granularity of metadata
- Data modality: Streaming, video, image, text, “binary” – vector space or not (genomics, network)
- Distributed v. centralized data (production/storage/processing)
- Complex objects v. tables
- Observed v. simulation or modeling

Data Nature (“ilities”)
- Open data
- Sharable Data
- Publication model / Data citation models? – DOI or Handle
- Reproducibility
- Sustainability
- Standards
- Management
- Integration
- Dramatic change in next 10 years
- Data availability as in Public Windy Grid

Mining/Analyzing data
- Access: role of Community comments, crowd sourcing
- Processing: “Simple” statistics, Linkage software, data visualization, GIS, analytics (SVM, LDA, Clustering...); (new) management tools
- Data Mining (discovering the unexpected) v. Data Analysis (discovering with excellence the ~expected)
- Modeling for data components and regression
- More data v. more/better algorithms (in simulation, algorithm advances ~ as important as machine advances)
- Programming model: Excel, SQL, R, SPSS, Other Scripting, MapReduce, "Fortran/C++/Java", Libraries, workflow, portal/gateway
- Open software & sustainability of it
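To make one of the listed programming models concrete, here is a minimal single-process sketch of the MapReduce pattern (map, shuffle by key, reduce) as a word count. It only illustrates the shape of the model and is not how Hadoop or any particular framework is implemented; the function names and the sample documents are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key (here, by summing)."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    # In a real framework the map calls would run in parallel on many nodes.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))


# Made-up sample input.
counts = word_count(["open data open software", "data citation"])
print(counts)
```

The same three-step structure applies when the map step is something heavier than tokenizing, for example extracting features from survey records.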

Security & Privacy
- Support sharing
- The law
- Risk of identification, harm from disclosure
- Differential Privacy and nifty obfuscation ideas
- IRB
- Federated Identity
- Enclave
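As a concrete instance of the differential privacy item, the following is a minimal sketch of the standard Laplace mechanism applied to a counting query. The records, the `dp_count` helper, and the choice of epsilon are hypothetical and for illustration only; a production system would also need to track the privacy budget across queries.

```python
import random


def dp_count(records, predicate, epsilon, rng=random):
    """Return a noisy count of matching records (Laplace mechanism)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    # A counting query has sensitivity 1: adding or removing one person
    # changes the count by at most 1, so the noise scale is 1 / epsilon.
    # The difference of two independent Exponential(rate=epsilon) draws
    # follows a Laplace distribution with scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Made-up survey records (illustration only).
respondents = [
    {"age": 34, "employed": True},
    {"age": 51, "employed": False},
    {"age": 27, "employed": True},
    {"age": 45, "employed": True},
]
noisy = dp_count(respondents, lambda r: r["employed"], epsilon=0.5)
print("Noisy count of employed respondents:", noisy)
```

Smaller epsilon means more noise and stronger privacy; the released value is deliberately not the exact count, which is the trade-off against analytic accuracy.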

The Infrastructure
- Repository/Archive v. Active (compute + storage) data
- Bring Computing to data
- Commercial Clouds v. XSEDE v. University
- Local v. cloud v. department/university
- Distributed (Federated) clouds as collections distributed
- Dropbox, Google Docs, Skype etc. v. customized
- Generality of DuraCloud, Dataverse, DataUp etc.
- Tool repository/library
- Cloudbursting (public-private hybrid cloud)
- Connectivity to cloud (can be addressed by I2?)
- Backup v. Main Home

Other Characteristics
- Satisfying NSF Data Management requirements
- Breadth of applicability of solutions
- # Organizations collaborating on project
- Interdisciplinary collaborations
- Data (science) Curricula
- Relation to issues in other fields
- Support and Governance
- Industry ahead of Academia