Member of the Helmholtz Association

Big Data Interest Group
Smart Data Analytics

03/10/2015 | RDA Fifth Plenary Meeting | San Diego, USA | Paradise Point Resort

Markus Götz, Jülich Supercomputing Center (JSC) // University of Iceland
Morris Riedel, Jülich Supercomputing Center (JSC) // University of Iceland

Outline

Introduction
    Research Group, Research Area
Smart Data Analytics Use Cases and Techniques
    Classification, Land Cover Type, piSVM
    Clustering, „Drunken Flies“, HPDBSCAN
    Deep Learning, Cortex Layers, pylearn CNN
Conclusion
    Results and RDA Context

Markus Götz | Smart Data Analytics | Forschungszentrum Jülich

Introduction

Research Group
    Jülich Supercomputing Center (HPC/HTC)
    High Productivity Data Processing Group
Research Area
    Smart Data Analytics Methods
    Evaluation and Development of Scalable Tools
    Processing Platform Requirements
    Application in Scientific Use Case

[Diagram labels: Parallel Data Analytics, Data Mining Methods, Machine Learning Algorithms, Scientific Community Application, Data Analysis Tools, Generic Data Methods, Smart Data Analytics]

Classification // Land Cover Type

Land Cover Type Problem
    Collaboration with University of Iceland
    Determine Land Cover Type in Satellite Images
    Different Types - Road, Building, Vegetation, …
Classification
    Supervised Learning Technique
    Known Set of Groups or Classes
    Determine Membership of New Items

Classification // Land Cover Type

Approach
    Support Vector Machines (SVM)
    Existing Solution: piSVM (MPI)
    In-house Optimization of Parallel Code

[Figure panels: Area, Standard deviation, Inertia]
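
The SVM approach can be illustrated at toy scale. What follows is a minimal serial sketch, not piSVM and not the in-house parallel code: it trains a linear SVM by subgradient descent on the hinge loss. The two features per sample, the sample values, the hyperparameters, and the names train_linear_svm and predict are all assumptions made for illustration.

```python
def train_linear_svm(samples, labels, learning_rate=0.1, lam=0.01, epochs=100):
    """Train a linear SVM (weights w, bias b) with subgradient descent.

    labels must be +1 or -1. The objective is the L2-regularized hinge loss.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Regularization step: shrink the weights slightly.
            w = [(1.0 - learning_rate * lam) * wj for wj in w]
            # Hinge-loss step: only samples inside the margin push the boundary.
            if margin < 1.0:
                w = [wj + learning_rate * y * xj for wj, xj in zip(w, x)]
                b += learning_rate * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical, already centered features per image region (for example an
# area attribute and a standard deviation attribute). +1 and -1 stand for two
# made-up land cover classes such as "building" and "vegetation".
samples = [(2.0, 2.0), (-2.0, -2.0), (2.5, 1.8), (-2.5, -1.7), (1.8, 2.4), (-1.9, -2.3)]
labels = [1, -1, 1, -1, 1, -1]

w, b = train_linear_svm(samples, labels)
print(predict(w, b, (3.0, 3.0)))    # new item on the "+1" side
print(predict(w, b, (-3.0, -3.0)))  # new item on the "-1" side
```

A parallel solver such as piSVM distributes this kind of training across processes; the sketch only shows the serial idea of learning a boundary from labeled samples and then determining the membership of new items.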

Clustering // „Drunken Flies“

„Drunken Flies“
    Collaboration with University of Cologne
    Investigate Influence of Genetics on Alcohol Consumption
    Literally Make Flies Drunk
Clustering
    Unsupervised Learning Technique
    Subdivide Database into Similar Groups
    Similarity Metrics

Clustering // „Drunken Flies“

Approach
    Image Processing Pipeline
    HPDBSCAN
    In-house Development (MPI+OpenMP)
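
HPDBSCAN is a parallel implementation (MPI+OpenMP); the clustering idea it builds on can be shown with a serial textbook DBSCAN. The sketch below is not the in-house code. The 2D points (imagined as pixel coordinates coming out of an image processing step), the parameter values, and the function names are assumptions for illustration.

```python
import math

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within distance eps of points[idx], itself included."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Serial DBSCAN. Returns one label per point: 0, 1, ... or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point too: keep growing
    return labels


# Hypothetical coordinates: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
cluster_labels = dbscan(points, eps=1.5, min_pts=3)
print(cluster_labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The region query here is a linear scan, so the whole run is quadratic in the number of points. That cost is one reason a scalable implementation needs spatial indexing and parallelization.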

Deep Learning // Cortex Layers

Cortex Layer Problem
    Institute for Neuro-Medicine (INM) at FZJ
    Segment the Seven Layers of the Cortex
    Images of Actual Brain Slices
    Each Image is Gigabytes in Size (60k square resolution)
Deep Learning
    Supervised Learning Technique (Classification)
    More Advanced Mathematical Models
    Various Flavors of Neural Networks

Deep Learning // Cortex Layers

Approach
    Convolutional Neural Networks
    Existing Serial Toolkit
    Pylearn 2/Theano
    Scaling Issues
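
The core operation of a convolutional neural network can be sketched without any toolkit. The example below is not Pylearn 2 or Theano code. It slides a small filter over an image and applies a ReLU activation, with the filter applied as-is (no kernel flip), which is an assumption of this sketch. The tiny image, the filter values, and the function names are made up for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide kernel over image without padding and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses, set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Hypothetical 4x4 image with a bright vertical band in the middle.
image = [[0, 1, 1, 0] for _ in range(4)]
# Hypothetical 2x2 filter that responds to dark-to-bright transitions.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)
print(feature_map)        # [[2, 0, -2], [2, 0, -2], [2, 0, -2]]
print(relu(feature_map))  # [[2, 0, 0], [2, 0, 0], [2, 0, 0]]
```

Each output value costs one multiply-add per filter element, so the work grows with the pixel count of the input. On images of 60k square resolution a serial toolkit runs into the scaling issues named above.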

Conclusion

Results
    Big Data Challenge is Real!
    Gap between Analytics Requirements and Actual Implementations
Interest for RDA
    Code is on GitHub and Bitbucket
    Data is Open and Freely Available on B2SHARE
    Choice of Data Formats
    Question of Future Processing Platforms

Thank you for the attention…

Fifth Plenary Meeting
08 – 12 March 2015
San Diego, USA | Paradise Point Resort

Contact: Big Data IG > Wiki > 5th Plenary