Member of the Helmholtz Association 03/10/2015 | RDA Fifth Plenary Meeting | San Diego, USA | Paradise Point Resort Markus Götz Jülich Supercomputing Center.

Presentation transcript:

Member of the Helmholtz Association
03/10/2015 | RDA Fifth Plenary Meeting | San Diego, USA | Paradise Point Resort
Markus Götz, Jülich Supercomputing Center (JSC) // University of Iceland
Morris Riedel, Jülich Supercomputing Center (JSC) // University of Iceland
Big Data Interest Group
Smart Data Analytics

Outline
Introduction
  • Research Group, Research Area
Smart Data Analytics Use Cases and Techniques
  • Classification, Land Cover Type, piSVM
  • Clustering, „Drunken Flies“, HPDBSCAN
  • Deep-Learning, Cortex Layers, pylearn CNN
Conclusion
  • Results and RDA Context
03/10/2015 | Markus Götz | Smart Data Analytics | Forschungszentrum Jülich

Introduction
Research Group
  • Jülich Supercomputing Center (HPC/HTC)
  • High Productivity Data Processing Group
Research Area
  • Smart Data Analytics Methods
  • Evaluation and Development of Scalable Tools
  • Processing Platform Requirements
  • Application in Scientific Use Case
[Diagram labels: Parallel Data Analytics; Data Mining Methods; Machine Learning Algorithms; Scientific Community Application; Data Analysis Tools; Generic Data Methods; Smart Data Analytics]

Classification // Land Cover Type
Land Cover Type Problem
  • Collaboration with University of Iceland
  • Determine Land Cover Type in Satellite Images
  • Different Types - Road, Building, Vegetation, …
Classification
  • Supervised Learning Technique
  • Known Set of Groups or Classes
  • Determine Membership of New Items

Classification // Land Cover Type
Approach
  • Support Vector Machines (SVM)
  • Existing Solution: piSVM (MPI)
  • In-house Optimization of Parallel Code
[Figure panels: Area; Standard deviation; Inertia]
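
The classification approach named on this slide can be illustrated with a minimal sketch. It is not piSVM and not the group's optimized MPI code: it is a serial linear SVM trained by subgradient descent on the hinge loss, which only shows the idea of learning a separating boundary from labelled samples and then assigning new items to a class. The function names, the two-feature toy samples, and the hyperparameters are assumptions made for illustration.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=50):
    """Fit a linear SVM by subgradient descent on the regularised hinge loss.

    points: list of feature tuples; labels: list of +1 / -1 class labels.
    Returns (weights, bias). Serial toy version, hypothetical parameters.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # The L2 regulariser shrinks the weights on every step.
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Samples inside the margin (or misclassified) pull the boundary.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the boundary x falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical two-feature samples for two classes.
samples = [(2, 2), (3, 2), (2, 3), (-2, -2), (-3, -2), (-2, -3)]
classes = [1, 1, 1, -1, -1, -1]
weights, bias = train_linear_svm(samples, classes)
```

Calling `predict(weights, bias, (2.5, 2.5))` then assigns a previously unseen item to one of the two known classes, which is the "determine membership of new items" step in the supervised setting.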

Clustering // „Drunken Flies“
„Drunken Flies“
  • Collaboration with University of Cologne
  • Investigate Influence of Genetics on Alcohol Consumption
  • Literally Make Flies Drunk
Clustering
  • Unsupervised Learning Technique
  • Subdivide Database into Similar Groups
  • Similarity Metrics

Clustering // „Drunken Flies“
Approach
  • Image Processing Pipeline
  • HPDBSCAN
  • In-house Development (MPI+OpenMP)
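
HPDBSCAN is a parallel variant of the density-based DBSCAN clustering algorithm. The sketch below is a serial textbook DBSCAN, not the in-house MPI+OpenMP implementation; the function names, the Euclidean distance, and the toy coordinates are assumptions chosen for illustration only.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Serial DBSCAN: returns one cluster label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:  # core point: keep growing
                queue.extend(j_neighbours)
    return labels


# Hypothetical 2D positions: two dense groups and one isolated point.
positions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
cluster_labels = dbscan(positions, eps=1.5, min_pts=3)
```

The brute-force neighbourhood search here costs quadratic time in the number of points, which is one reason a parallel implementation becomes relevant for large datasets.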

Deep Learning // Cortex Layers
Cortex Layer Problem
  • Institute for Neuro-Medicine (INM) at FZJ
  • Segment the Seven Layers of the Cortex
  • Images of Actual Brain Slices
  • Each Gigabytes in Size (60k square resolution)
Deep Learning
  • Supervised Learning Technique (Classification)
  • More Advanced Mathematical Models
  • Various Flavors of Neural Networks

Deep Learning // Cortex Layers
Approach
  • Convolutional Neural Networks
  • Existing Serial Toolkit
  • Pylearn 2/Theano
  • Scaling Issues
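
As a rough illustration of what a convolutional layer computes, the sketch below runs one forward pass of convolution, ReLU, and 2x2 max pooling in plain Python. It does not use Pylearn 2 or Theano and is not the configuration used for the cortex images; the function names, the kernel, and the tiny input image are assumptions for illustration.

```python
def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation of a 2D image with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; a trailing odd row/column is dropped."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


# Hypothetical 5x5 image with a vertical edge, and a horizontal-gradient kernel.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]
features = max_pool2x2(relu(conv2d_valid(image, kernel)))
```

Every output pixel depends on the whole kernel window, so the work grows with image area; at a resolution of 60k pixels per side, a serial loop of this kind is where scaling issues would be expected to appear.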

Conclusion
Results
  • Big Data Challenge is Real!
  • Gap between Analytics Requirements and Actual Implementations
Interest for RDA
  • Code is on GitHub and Bitbucket
  • Data is Open and Freely Available on B2SHARE
  • Choice of Data Formats
  • Question of Future Processing Platforms

Thank you for your attention…
Fifth Plenary Meeting | 08 – 12 March 2015 | San Diego, USA | Paradise Point Resort
Contact: Big Data IG > Wiki > 5th Plenary