Data Intensive Computing: Graph Algorithms for Irregular, Unstructured Data – John Feo, Pacific Northwest National Laboratory


Data Intensive Computing panel presentations:
- Graph algorithms for irregular, unstructured data – John Feo, Pacific Northwest National Laboratory
- Graph500 and data-intensive computing – Richard Murphy, Sandia National Laboratories
- Large-scale knowledge discovery – Steve Reinhardt, Microsoft
- Data intensive computing at SNL – Andrew Wilson, Sandia National Laboratories
- IBM's InfoSphere Streams – Roger Rea, IBM
- Graph500 and Data Intensive HPC – Richard Murphy, Sandia National Laboratories
- Data analytics – Phillip Morris, Platform Computing
- Data intensive computing – Richard Altmaier, Intel

1. Please provide a definition of "Data Intensive Computing", and explain the difference between "finding a needle in a haystack" and "knowledge discovery".

2. "Data Intensive Computing" generally involves the analysis of non-numeric data, where the number of combinatorial possibilities grows rapidly. The objective of the analysis is to find meaningful relationships in the data. How do we (a) test for convergence when we are not evaluating all possible combinations, and (b) test for statistical significance when the data is non-numeric?
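One standard answer to part (b) is a permutation test, which needs no numeric distributional assumptions: compute an association statistic on the categorical data, then recompute it under random relabelings to see how extreme the observed value is. The sketch below is illustrative only and not drawn from any panelist's talk; the choice of mutual information as the statistic and the 2,000-trial default are the author's assumptions.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (nats) between two categorical sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * math.log(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


def permutation_pvalue(xs, ys, trials=2000, seed=0):
    """Shuffle ys to destroy any real association; count how often the
    shuffled statistic meets or exceeds the observed one."""
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    ys = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(ys)
        if mutual_information(xs, ys) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)
```

A strongly associated pair of label sequences yields a small p-value, while independent labelings yield one near 1; the same scheme works with any statistic (graph motif counts, edit distance, etc.), which is why it suits non-numeric data.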

3. "Data Intensive Computing" often involves the use of incomplete data. How does this affect the analysis process?

4. If you could design an ideal computing architecture for Data Intensive Computing, what would it look like?