

Parallelization in biomedical time series analysis web platform: the MULTISAB project experience
Alan Jovic (1), Kresimir Jozic (2), Davor Kukolja (1), Kresimir Friganovic (1), Mario Cifrek (1)
E-mail: alan.jovic@fer.hr
(1) University of Zagreb, Faculty of Electrical Engineering and Computing, Zagreb, Croatia
(2) INA - industrija nafte, d.d., Zagreb, Croatia

CONTENT
  Motivation & goal
  MULTISAB platform structure
  Parallelization candidate locations
  Parallelization implementation
  Validation on cardiac rhythm records
  Conclusion

Motivation & goal
  Complexity of biomedical time series (BTS) processing
  The need for efficient web-based biomedical software is continuously growing in the healthcare community
  Goal: development of a web platform for automatic classification of human body disorders based on the analysis of biomedical signals
  Achieving efficient calculations through parallelization is an important aspect of the platform!

MULTISAB platform structure
  Three (sub)projects:
    Frontend: browser-based UI
    Backend: requests & session handling, database communication
    Processing: BTS analysis frameworks
  Frameworks:
    Record input handling
    Preprocessing
    Signal visualization
    General time series features
    Specific (domain) time series features
    Feature extraction
    Expert system recommendations
    Data mining
    Reporting

MULTISAB platform structure Extensive use of many contemporary technologies

MULTISAB platform structure

Parallelization candidate locations
  The most computationally intensive parts of the process:
    Preprocessing
      Filtering, power density estimates, time-frequency methods
    Feature extraction
      Focus of this work
    Data mining
      Classification algorithms (ANN, SVM, random forest)

Parallelization candidate locations
  Feature extraction
    An attempt to parallelize the calculation of BTS features directly on the GPU using the Aparapi API (a Java OpenCL library) FAILED
    Main reasons for failure:
      the nature of the algorithms (conditional next-step execution, non-trivial mathematical operations)
      the overhead of run-time generation of the OpenCL kernel due to code disassembly
    Conclusion: Java multi-threading should be used instead

Parallelization implementation
  Assumptions for starting feature extraction:
    One or more patient records have been uploaded and preprocessed earlier
    Records may contain heterogeneous signals, e.g. 10 EEG trails, 2 ECG trails
    The number of feature extraction iterations is equal to the number of different signal types (i.e. ECG, EEG...) in the records

Parallelization implementation
  Assumptions for starting feature extraction:
    All records contain the same types and numbers of signal trails
    Each iteration is performed on one signal type in all records
    All signals in all records are of equal duration; if they are not, extraction is performed only until the end of the shortest signal trail
  Note: the user may select only a portion of the uploaded records and signals for analysis

Parallelization implementation
  Feature extraction parameters need to be set:
    The list of features that need to be extracted for each iteration
    The list of feature parameters with values for each feature
    The starting time in the record from which the feature extraction process starts (the same for all iterations)
    The analyzed segment width (the same for all iterations)
    The final time in the record until which the analysis is performed (the same for all iterations)
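The three time parameters above, together with the rule that extraction stops at the end of the shortest signal trail, fix how many segments each signal is cut into. A minimal sketch of that bookkeeping follows. The class and method names are hypothetical and not taken from MULTISAB, times are assumed to be in seconds, and dropping a trailing partial segment is an assumption the slides do not confirm.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the MULTISAB code base. */
final class SegmentPlanner {

    /**
     * Returns the [start, end) bounds, in seconds, of every full segment
     * between startSec and the earlier of finalSec and shortestSignalSec.
     */
    static List<double[]> plan(double startSec, double widthSec,
                               double finalSec, double shortestSignalSec) {
        if (widthSec <= 0) {
            throw new IllegalArgumentException("segment width must be positive");
        }
        // Extraction never runs past the end of the shortest signal trail.
        double limit = Math.min(finalSec, shortestSignalSec);
        List<double[]> segments = new ArrayList<>();
        // Assumption: a trailing partial segment is dropped, not analyzed.
        for (int i = 0; startSec + (i + 1) * widthSec <= limit; i++) {
            segments.add(new double[] {
                startSec + i * widthSec, startSec + (i + 1) * widthSec
            });
        }
        return segments;
    }

    public static void main(String[] args) {
        // Illustrative values: analysis from 0 s to 1800 s, shortest signal 1700 s.
        System.out.println(plan(0, 240, 1800, 1700).size()); // prints 7
    }
}
```

With these illustrative values, a 240 s width yields 7 full segments and a 20 s width yields 85.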

Parallelization implementation
  Parallelization procedure:
    if (no. segments in signal > 1)
      process segments in parallel until all are analyzed
    else if (no. signals of the same type > 1)
      process signals in parallel until all the signals are analyzed*
    else if (no. records > 1)
      process records in parallel until the records are analyzed
    else  // only one record with one signal type and one segment
      parallelization is not performed and the record is analyzed in the main thread
  * Note: a special case for the implemented bivariate features is that all signal pairs (e.g. for calculation of the mutual information feature) are analyzed in parallel.
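The procedure above amounts to picking the finest level that has more than one unit of work. A minimal sketch of that selection rule in Java follows; the class, enum and parameter names are hypothetical, and the bivariate special case from the note is left out.

```java
/** Hypothetical sketch of the level-selection rule; names are illustrative only. */
final class ParallelLevelChooser {

    enum Level { SEGMENTS, SIGNALS, RECORDS, NONE }

    /** Picks the level at which work is split across threads. */
    static Level choose(int segmentsPerSignal, int signalsOfSameType, int records) {
        if (segmentsPerSignal > 1) {
            return Level.SEGMENTS;   // segments are processed in parallel
        } else if (signalsOfSameType > 1) {
            return Level.SIGNALS;    // signals of one type are processed in parallel
        } else if (records > 1) {
            return Level.RECORDS;    // whole records are processed in parallel
        }
        return Level.NONE;           // one record, one signal, one segment: main thread
    }

    public static void main(String[] args) {
        System.out.println(choose(7, 2, 48)); // SEGMENTS
        System.out.println(choose(1, 2, 48)); // SIGNALS
        System.out.println(choose(1, 1, 48)); // RECORDS
        System.out.println(choose(1, 1, 1));  // NONE
    }
}
```

Because segments are checked first, record-level parallelism is used only when every signal is analyzed as a single segment.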

Parallelization implementation
  Extraction of the list of features in a single segment, signal, or record always proceeds sequentially (for easier synchronization, since the algorithms vary in complexity)
  The degree of parallelization is limited to the number of logical cores minus 1
  Synchronization: all threads need to finish the current feature extraction task before moving on to the next parallelization task (the main thread waits)
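One way to obtain this behaviour with standard Java concurrency utilities is a fixed thread pool sized to the number of logical cores minus one, combined with `ExecutorService.invokeAll`, which blocks the caller until every submitted task has completed. The sketch below is an assumption about how such a scheme could look, not the MULTISAB implementation. `SegmentFeatureRunner` and `extractFeatures` are hypothetical names, and mean and variance stand in for the real feature list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; not the MULTISAB implementation. */
final class SegmentFeatureRunner {

    /** Placeholder feature list (mean, variance), computed sequentially per segment. */
    static double[] extractFeatures(double[] segment) {
        double sum = 0;
        for (double v : segment) sum += v;
        double mean = sum / segment.length;
        double squares = 0;
        for (double v : segment) squares += (v - mean) * (v - mean);
        return new double[] { mean, squares / segment.length };
    }

    /** Runs one task per segment on (logical cores - 1) threads and waits for all of them. */
    static List<double[]> extractAll(List<double[]> segments) {
        int threads = Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<double[]>> tasks = new ArrayList<>();
            for (double[] segment : segments) {
                tasks.add(() -> extractFeatures(segment));
            }
            List<double[]> results = new ArrayList<>();
            // invokeAll returns only after every task has finished (the calling thread waits).
            for (Future<double[]> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for segments", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("feature extraction failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<double[]> segments =
            Arrays.asList(new double[] {1, 2, 3}, new double[] {4, 4});
        for (double[] f : extractAll(segments)) {
            System.out.println("mean=" + f[0] + " variance=" + f[1]);
        }
    }
}
```

`invokeAll` returns its futures in the same order as the submitted tasks, so the results line up with the segment order without extra bookkeeping.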

Parallelization validation
  Cardiac rhythm records from the well-known MIT-BIH Arrhythmia Database, available at the PhysioNet web portal
  A total of 27 time domain, frequency domain and nonlinear heart rate variability features (both standard and experimental features)
  The experiments were run on an Intel Core i7-4790 CPU @ 3.6 GHz with 16 GB RAM and 8 logical cores, of which 7 were used for parallelization

Parallelization validation
  Execution time for 1, 12 and 48 included records

  No. of analyzed record segments,   Parallel /     Number of included records
  with their length                  sequential     1           12            48
  85 segments, each 20 s             parallel       116 ± 30    357 ± 14      1135 ± 79
                                     sequential     103 ± 14    393 ± 15      1304 ± 66
  18 segments, each 90 s             parallel       145 ± 13    447 ± 9       1506 ± 23
                                     sequential     178 ± 8     962 ± 29      3372 ± 13
  7 segments, each 240 s             parallel       200 ± 17    890 ± 43      2519 ± 12
                                     sequential     266 ± 16    1973 ± 20     8002 ± 42
  3 segments, each 560 s             parallel       315 ± 7     1966 ± 23     7858 ± 204
                                     sequential     531 ± 16    4268 ± 117    17672 ± 115
  2 segments, each 840 s             parallel       466 ± 7     3556 ± 49     14781 ± 244
                                     sequential     766 ± 29    6181 ± 55     26395 ± 223

  Experiments were run 5 times; mean ± standard deviation is reported in milliseconds
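The effect of segment count is easier to see as a speedup, i.e. sequential time divided by parallel time. The sketch below does this for the 48-record column. It takes the means as 1135, 1506, 2519, 7858 and 14781 ms (parallel) against 1304, 3372, 8002, 17672 and 26395 ms (sequential), which assumes each figure in the results is a mean followed by its standard deviation. The class name is hypothetical.

```java
/** Hypothetical helper: speedup = sequential time / parallel time. */
final class SpeedupTable {

    static double speedup(double sequentialMs, double parallelMs) {
        return sequentialMs / parallelMs;
    }

    public static void main(String[] args) {
        // 48-record column; values assume a mean-then-standard-deviation reading.
        int[] segments = {85, 18, 7, 3, 2};
        double[] parallelMs = {1135, 1506, 2519, 7858, 14781};
        double[] sequentialMs = {1304, 3372, 8002, 17672, 26395};
        for (int i = 0; i < segments.length; i++) {
            System.out.printf("%2d segments: speedup %.2f%n",
                segments[i], speedup(sequentialMs[i], parallelMs[i]));
        }
    }
}
```

Under that reading, the speedup is largest for 7 segments (about 3.2 on 7 threads) and drops towards both extremes: very short segments are dominated by overhead, and 2 or 3 long segments leave most threads idle.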

Conclusions
  The maximum effect of multithreading for a general BTS analysis may be achieved for:
    long segments
    a number of segments that is a multiple of the number of logical cores used
  Except for the analysis of a few records with very short segments, parallelization is beneficial
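The "multiple of the number of cores" observation can be illustrated with a simple idealized model. It is an assumption made here for illustration, not something stated in the slides: every segment costs the same, at most `threads` segments run at once, and scheduling overhead is ignored. Under that model the work completes in ceil(segments / threads) rounds, and the speedup can be at most segments divided by rounds.

```java
/** Hypothetical idealized model: equal-cost segments, no scheduling overhead. */
final class CoreUtilization {

    /** Number of rounds needed when at most `threads` segments run concurrently. */
    static int rounds(int segments, int threads) {
        return (segments + threads - 1) / threads; // integer ceiling
    }

    /** Upper bound on speedup under the equal-cost assumption. */
    static double idealSpeedup(int segments, int threads) {
        return (double) segments / rounds(segments, threads);
    }

    public static void main(String[] args) {
        int threads = 7; // 8 logical cores minus 1, as in the validation setup
        for (int segments : new int[] {2, 3, 7, 8, 14, 18, 85}) {
            System.out.printf("%2d segments: %2d round(s), ideal speedup %.2f%n",
                segments, rounds(segments, threads), idealSpeedup(segments, threads));
        }
    }
}
```

In this model 7 or 14 segments keep all 7 threads busy in every round (bound 7.0), whereas 8 segments need a second round for a single leftover segment (bound 4.0).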

Conclusions
  Future work:
    Increase the degree of parallelization
    Implement parallelization for specific preprocessing and data mining methods
    Explore efficient parallelization of multivariate features

Thank you! Questions?
  This work has been fully supported by the Croatian Science Foundation under the project number UIP-2014-09-6889: A software system for parallel analysis of multiple heterogeneous time series with application in biomedicine (MULTISAB)