Scalable Fast Rank-1 Dictionary Learning for fMRI Big Data Analysis

Scalable Fast Rank-1 Dictionary Learning for fMRI Big Data Analysis
Xiang Li1, Milad Makkie1, Binbin Lin2, Mojtaba Sedigh Fazli1, Ian Davidson3, Jieping Ye2, Tianming Liu1, Shannon Quinn1
1 Department of Computer Science and Bio-Imaging Research Center, University of Georgia, Athens, GA
2 Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI
3 Department of Computer Science, University of California, Davis, CA
http://cs.uga.edu/~xiang

Functional Network Discovery by Matrix Decomposition
Data-driven approaches for discovering underlying functional organization patterns;
"Functional networks" are the components obtained from matrix decomposition/factorization analysis: each a collection of voxels;
Temporal and spatial patterns are defined by the matrix decomposition results (i.e., the components and loading coefficients);
Widely applied in fMRI studies: ICA, PCA, dictionary learning;

Dictionary Learning Method
Temporal variation patterns are stored in the dictionary matrix D;
Spatial distribution patterns are stored in the α (mixing) matrix;
m dictionary atoms (functional networks) are learned;
"Sparse Representation of Whole-brain FMRI Signals for Identification of Functional Networks", Medical Image Analysis, 2014
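As a minimal sketch of the dimensions (assuming, as in the group-level example later in this deck, that S is arranged as time points × voxels): $S \approx D\alpha$ with $S \in \mathbb{R}^{t \times n}$, $D \in \mathbb{R}^{t \times m}$ (each column one temporal atom) and $\alpha \in \mathbb{R}^{m \times n}$ (each row the sparse spatial loading of one functional network).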

Functional Network Discovery by Matrix Decomposition
Both task-evoked networks (e.g., M1, M3, M5) and resting-state networks (e.g., RSN2, RSN3) can be identified from the dictionary learning results;
"Sparse Representation of Whole-brain FMRI Signals for Identification of Functional Networks", Medical Image Analysis, 2014

Functional Network Discovery by Matrix Decomposition
"Holistic Atlases of Functional Networks and Interactions Reveal Reciprocal Organizational Architecture of Cortical Function", IEEE Transactions on Biomedical Engineering, 2015

Working with Big Data
fMRI big data pose grand challenges to analysis methods: data size quickly outgrows memory capacity and computational power;
Utilizing population-level data to learn the holistic space of brain functional networks, rather than only the dominant features, helps overcome the bias and false positives of traditional hypothesis-driven studies;
This requires an integrated informatics system and fast, scalable algorithms for high-throughput neuroimaging research;
"Group-PCA for very large fMRI datasets", Smith et al., NeuroImage, 2015

Rank-1 Dictionary Learning Overview
[Figure: the dictionary matrix U (temporal, columns u1...uK) and the loading coefficient matrix V (spatial, rows v1...vK) are learned one rank-1 pair at a time by deflation: R0 = S, Rk = Rk-1 − uk vk^T.]
"Fast and Scalable Rank-1 Dictionary Learning for Inferring Brain Networks from fMRI Data", IEEE Transactions on Medical Imaging, in submission

Rank-1 Dictionary Learning Overview
General formulation of the matrix factorization problem:
$\min_{D,\alpha} \tfrac{1}{2}\|S - D\alpha\|_F^2 + \psi(\alpha)$
Dictionary learning with an $\ell_0$ constraint ($u$ and $v$ are vectors):
$L(u, v) = \|S - u v^T\|_F$, s.t. $\|u\| = 1$, $\|v\|_0 \le r$.
Alternating Least Squares (ALS) updates:
$v = \arg\min_v \|S - u v^T\|_F = (u^T S)^T$, s.t. $\|v\|_0 \le r$,
$u = \arg\min_u \|S - u v^T\|_F = \frac{Sv}{\|Sv\|}$.
Converged at step $j$ if $\|u_{j+1} - u_j\| < \varepsilon$, with $\varepsilon = 0.01$.
Deflation: $R_n = R_{n-1} - u v^T$, $R_0 = S$, $1 \le n \le K$.
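A minimal NumPy sketch of the serial algorithm above (our own illustration, not the authors' released code; top-r hard thresholding is one common way to realize the $\|v\|_0 \le r$ constraint):

```python
import numpy as np

def r1dl(S, K, r, eps=0.01, max_iter=100):
    """Learn K rank-1 atoms (u_k, v_k) from S by ALS with deflation.
    S: (t, n) matrix (time points x voxels); r: max non-zeros per loading vector."""
    t, n = S.shape
    U, V = np.zeros((t, K)), np.zeros((K, n))
    R = S.astype(float)                              # R_0 = S
    for k in range(K):
        u = np.random.randn(t)
        u /= np.linalg.norm(u)
        for _ in range(max_iter):
            v = R.T @ u                              # v = (u^T R)^T
            if r < n:
                v[np.argsort(np.abs(v))[:n - r]] = 0.0   # enforce ||v||_0 <= r
            u_new = R @ v
            u_new /= np.linalg.norm(u_new)           # enforce ||u|| = 1
            converged = np.linalg.norm(u_new - u) < eps
            u = u_new
            if converged:
                break
        U[:, k], V[k, :] = u, v
        R = R - np.outer(u, v)                       # deflation: R_k = R_{k-1} - u v^T
    return U, V
```

The operations inside the inner loop (the products $R^T u$ and $R v$, plus the rank-1 deflation) are exactly the subroutines that the distributed version parallelizes, as described below.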

Distributed r1DL based on Spark
[Figure: a controller node and slave nodes (workers) exchange work through map(function) and reduce(lambda function) operations.]
Subroutines for parallelization:
Vector-matrix multiplication;
Matrix-vector multiplication;
Matrix deflation;

Distributed r1DL based on Spark

Current parallelization implementations
Import S: imported on demand in small portions, maintained on HDFS.
R0 = S / S = Rk: each node reads its corresponding portion of S.
v = uS: each node receives u, uses only its portion of u to compute a partial v; the partial results are summed at the controller node.
v = topR(v): divide-and-conquer partitioning of v, distributed across the nodes.
u = Sv: each node receives v and generates its portion of u, which is then collected at the controller node.
Rk = Rk-1 − uv^T: v and u are broadcast, then the total residual is summed at the controller node.
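A minimal PySpark sketch of these primitives, assuming S is kept as an RDD of (row index, row vector) pairs read from HDFS; the path, names, and partitioning scheme are our illustrative assumptions rather than the released D-r1DL code:

```python
import numpy as np
from pyspark import SparkContext

sc = SparkContext(appName="D-r1DL-sketch")

# Hypothetical input path: each text line holds one row of the (t x n) matrix S.
S = (sc.textFile("hdfs:///data/group_fmri_matrix.txt")
       .zipWithIndex()
       .map(lambda li: (li[1], np.array(li[0].split(), dtype=float))))

def vector_matrix(u_bc, S):
    # v = uS: each partition scales its rows by the matching entries of u;
    # the partial v vectors are summed at the controller.
    return S.map(lambda kv: u_bc.value[kv[0]] * kv[1]).reduce(lambda a, b: a + b)

def matrix_vector(v_bc, S, t):
    # u = Sv: each partition produces its entries of u, collected at the controller.
    u = np.zeros(t)
    for i, ui in S.map(lambda kv: (kv[0], float(kv[1].dot(v_bc.value)))).collect():
        u[i] = ui
    return u

def deflate(u_bc, v_bc, S):
    # R = S - u v^T, applied row by row so no node ever holds the full matrix.
    return S.map(lambda kv: (kv[0], kv[1] - u_bc.value[kv[0]] * v_bc.value))

# Usage inside one ALS iteration (sketch):
#   u_bc = sc.broadcast(u); v = vector_matrix(u_bc, S)
#   v_bc = sc.broadcast(v); u = matrix_vector(v_bc, S, t)
```

The topR(v) sparsification can then be applied to the assembled v on the controller, or partitioned across nodes as the slide describes; it is omitted here for brevity.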

Deployment of r1DL and D-r1DL
r1DL is implemented in C++, MATLAB and Python. It can currently be run locally or through our HELPNI web service: http://bd.hafni.cs.uga.edu/HELPNI/
D-r1DL is implemented in Spark (Python). It has been experimentally deployed on:
Our in-house server (to be linked to HELPNI soon);
The Georgia Advanced Computing Resource Center (GACRC) server (48 cores, 128 GB memory): http://gacrc.uga.edu/;
The Amazon Elastic Compute Cloud (AWS EC2) service;

Results by r1DL on Human Connectome Project data
"Fast and Scalable Rank-1 Dictionary Learning for Inferring Brain Networks from fMRI Data", IEEE Transactions on Medical Imaging, in submission

Atom Cardinality and Decreasing Variation in Residual
"Fast and Scalable Rank-1 Dictionary Learning for Inferring Brain Networks from fMRI Data", IEEE Transactions on Medical Imaging, in submission

Performance comparison
"Fast and Scalable Rank-1 Dictionary Learning for Inferring Brain Networks from fMRI Data", IEEE Transactions on Medical Imaging, in submission

Performance statistics by D-r1DL

Performance statistics by D-r1DL
The memory cost of D-r1DL can remain constant regardless of the input size, thanks to Spark's resilient distributed datasets (RDDs);
Nodes work on data partitions rather than the whole dataset;
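Continuing the illustrative PySpark sketch from the parallelization section (partition count and storage level here are arbitrary example values, not the authors' settings), per-node memory stays bounded because each executor materializes only its own partitions of the row RDD:

```python
from pyspark import StorageLevel

# Illustrative settings only: more partitions means each executor holds a
# smaller slice of S; MEMORY_AND_DISK lets Spark spill partitions that do
# not fit, keeping the per-node footprint roughly constant as S grows.
S = S.repartition(512).persist(StorageLevel.MEMORY_AND_DISK)
print(S.getNumPartitions())
```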

Large-scale fMRI (>20 GB) analysis
Group-wise tfMRI data aggregated from 68 subjects: a 176 × 15,228,260 input matrix;

Large-scale fMRI (>20 GB) analysis with sampling

Conclusion and Discussion
A fast and scalable solution for big neuroimaging data analysis;
Not limited to fMRI data: a general framework for all modalities of biomedical imaging data;
Toward a more comprehensive understanding of brain functional dynamics: precisely mapping individual functional networks onto the population-wise state space;
Try our web service! http://bd.hafni.cs.uga.edu/HELPNI/