Symbolization and pattern formation strategies for the assessment of cardiovascular control Alberto Porta Department of Biomedical Sciences for Health,

Slides:



Advertisements
Similar presentations
Applied Algorithmics - week7
Advertisements

Principal Component Analysis Based on L1-Norm Maximization Nojun Kwak IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.
Sampling and Pulse Code Modulation
1 University of Southern California Keep the Adversary Guessing: Agent Security by Policy Randomization Praveen Paruchuri University of Southern California.
Fast Algorithms For Hierarchical Range Histogram Constructions
CmpE 104 SOFTWARE STATISTICAL TOOLS & METHODS MEASURING & ESTIMATING SOFTWARE SIZE AND RESOURCE & SCHEDULE ESTIMATING.
Noninvasive monitoring of the cardiovascular regulation based on heart rate variability analysis: do non linear tools provide additional information? Alberto.
Sponsor: Dr. Lockhart Team Members: Khaled Adjerid, Peter Fino, Mohammad Habibi, Ahmad Rezaei Fall Risk Assessment: Postural Stability and Non-linear Measures.
Mining for High Complexity Regions Using Entropy and Box Counting Dimension Quad-Trees Rosanne Vetro, Wei Ding, Dan A. Simovici Computer Science Department.
Signal Processing Institute Swiss Federal Institute of Technology, Lausanne 1 Feature selection for audio-visual speech recognition Mihai Gurban.
Principal Component Analysis
Common Factor Analysis “World View” of PC vs. CF Choosing between PC and CF PAF -- most common kind of CF Communality & Communality Estimation Common Factor.
Common Problems in Writing Statistical Plan of Clinical Trial Protocol Liying XU CCTER CUHK.
Fall 2006 – Fundamentals of Business Statistics 1 Chapter 6 Introduction to Sampling Distributions.
Basic Concepts and Definitions Vector and Function Space. A finite or an infinite dimensional linear vector/function space described with set of non-unique.
Chapter 11 Multiple Regression.
REGRESSION AND CORRELATION
Regression Model Building Setting: Possibly a large set of predictor variables (including interactions). Goal: Fit a parsimonious model that explains variation.
On Error Preserving Encryption Algorithms for Wireless Video Transmission Ali Saman Tosun and Wu-Chi Feng The Ohio State University Department of Computer.
lecture 2, linear imaging systems Linear Imaging Systems Example: The Pinhole camera Outline  General goals, definitions  Linear Imaging Systems.
Quantum One: Lecture 8. Continuously Indexed Basis Sets.
©2003/04 Alessandro Bogliolo Background Information theory Probability theory Algorithms.
Entropy and some applications in image processing Neucimar J. Leite Institute of Computing
Computer Vision – Compression(2) Hanyang University Jong-Il Park.
M. Velucchi, A. Viviani, A. Zeli New York University and European University of Rome Università di Firenze ISTAT Roma, November 21, 2011 DETERMINANTS OF.
Disentangling cardiovascular control mechanisms via multivariate modeling techniques: the “spontaneous” baroreflex Alberto Porta Department of Biomedical.
Population All members of a set which have a given characteristic. Population Data Data associated with a certain population. Population Parameter A measure.
Cs: compressed sensing
Theory of Probability Statistics for Business and Economics.
Prof. of Clinical Chemistry, Mansoura University.
Detecting non linear dynamics in cardiovascular control: the surrogate data approach Alberto Porta Department of Biomedical Sciences for Health Galeazzi.
Time irreversibility analysis: a tool to detect non linear dynamics in short-term heart period variability Alberto Porta Department of Biomedical Sciences.
Digital Image Processing Image Compression
Image Compression – Fundamentals and Lossless Compression Techniques
Image Compression Fasih ur Rehman. Goal of Compression Reduce the amount of data required to represent a given quantity of information Reduce relative.
1 11 Simple Linear Regression and Correlation 11-1 Empirical Models 11-2 Simple Linear Regression 11-3 Properties of the Least Squares Estimators 11-4.
통계적 추론 (Statistical Inference) 삼성생명과학연구소 통계지원팀 김선우 1.
Flow of mechanically incompressible, but thermally expansible viscous fluids A. Mikelic, A. Fasano, A. Farina Montecatini, Sept
Alberto Porta Department of Biomedical Sciences for Health Galeazzi Orthopedic Institute University of Milan Milan, Italy Understanding the effect of nonstationarities.
Paired Sampling in Density-Sensitive Active Learning Pinar Donmez joint work with Jaime G. Carbonell Language Technologies Institute School of Computer.
O PTIMAL SERVICE TASK PARTITION AND DISTRIBUTION IN GRID SYSTEM WITH STAR TOPOLOGY G REGORY L EVITIN, Y UAN -S HUN D AI Adviser: Frank, Yeong-Sung Lin.
ECE-7000: Nonlinear Dynamical Systems Overfitting and model costs Overfitting  The more free parameters a model has, the better it can be adapted.
On Coding for Real-Time Streaming under Packet Erasures Derek Leong *#, Asma Qureshi *, and Tracey Ho * * California Institute of Technology, Pasadena,
Alberto Porta Department of Biomedical Sciences for Health Galeazzi Orthopedic Institute University of Milan Milan, Italy Evaluating complexity of short-term.
1 An Efficient Classification Approach Based on Grid Code Transformation and Mask-Matching Method Presenter: Yo-Ping Huang.
Case Selection and Resampling Lucila Ohno-Machado HST951.
Analysis of Experimental Data; Introduction
Data Mining and Decision Support
Inductance Screening and Inductance Matrix Sparsification 1.
Fundamentals of Multimedia Chapter 6 Basics of Digital Audio Ze-Nian Li and Mark S. Drew 건국대학교 인터넷미디어공학부 임 창 훈.
Biostatistics Regression and Correlation Methods Class #10 April 4, 2000.
1 Chapter 6 SAMPLE SIZE ISSUES Ref: Lachin, Controlled Clinical Trials 2:93-113, 1981.
Entropy vs. Average Code-length Important application of Shannon’s entropy measure is in finding efficient (~ short average length) code words The measure.
Overfitting, Bias/Variance tradeoff. 2 Content of the presentation Bias and variance definitions Parameters that influence bias and variance Bias and.
بسم الله الرحمن الرحيم Digital Signal Processing Lecture 2 Analog to Digital Conversion University of Khartoum Department of Electrical and Electronic.
Stats Methods at IC Lecture 3: Regression.
Keep the Adversary Guessing: Agent Security by Policy Randomization
Data Transformation: Normalization
The Johns Hopkins University
Digital Communications Chapter 13. Source Coding
Statistical Data Analysis
Context-based Data Compression
Roberto Battiti, Mauro Brunato
Discrete Event Simulation - 4
Common Problems in Writing Statistical Plan of Clinical Trial Protocol
Inductance Screening and Inductance Matrix Sparsification
Statistical Data Analysis
Data Transformations targeted at minimizing experimental variance
Introduction Software maintenance:
Presentation transcript:

Symbolization and pattern formation strategies for the assessment of cardiovascular control Alberto Porta Department of Biomedical Sciences for Health, Galeazzi Orthopedic Institute, University of Milan, Milan, Italy

Introduction Cardiovascular variables exhibit, when observed on a beat-to-beat basis, rhythmical changes, referred to as spontaneous variability Spontaneous variability reflects the action of cardiovascular control mechanisms operating to guarantee the functioning of our organism in every condition In healthy condition modifications of the state of the cardiovascular control result in significant changes of spontaneous variability Pathological conditions dramatically alter spontaneous variability

Spontaneous beat-to-beat variability

The state of the cardiovascular control influences spontaneous variability Supine Passive standing

Cardiovascular diseases influence spontaneous variability A.L. Goldberger et al, Proc Natl Acad Sci, 99: , 2002 Healthy subject Atrial fibrillation patient Congestive heart failure patient Cheyne-Stokes breathing patient

Introduction Successful separation between different experimental conditions within the same population and between different populations within the same experimental condition depends on the significance of the features extracted from cardiovascular variability signals Symbolization and pattern formation strategies, indissolubly linked to the concept of partition (and, more generally, to coarse graining), provide features and indexes helpful to distinguish experimental conditions and/or groups

Aim To review symbolization and pattern formation strategies exploited to distinguish experimental conditions and/or groups in cardiovascular control studies

Outline 1) Symbolization strategies 2) Pattern construction via delay embedding procedure 3) Symbolization strategy and pattern construction technique impose a coarse graining of the multidimensional space 4) Techniques to optimize the pattern length 5) Approaches to reduce redundancy of patterns 6) Indexes derived from symbolization and pattern formation strategies 7) Application in clinics

Outline 1) Symbolization strategies 2) Pattern construction via delay embedding procedure 3) Symbolization strategy and pattern construction technique impose a coarse graining of the multidimensional space 4) Techniques to optimize the pattern length 5) Approaches to reduce redundancy of patterns 6) Indexes derived from symbolization and pattern formation strategies 7) Application in clinics

Symbolization Symbolization is a procedure transforming samples of a series into symbols belonging to a finite alphabet Code: all the samples belonging to the same bin are transformed into the same symbol The most utilized binning procedures are: 1) uniform quantization 2) non uniform quantization 3) sample-centered uniform quantization 4) sample-centered non uniform quantization A. Porta et al, Biol Cybern, 78:71-78, 1998 A. Porta et al, IEEE TBME, 54:94-106, 2007 N. Wessel et al, Phys Rev E, 61: , 2000 D. Cysarz et al, AJP, 292:R368-R372, 2007 A. Porta et al, IEEE TBME, 54:94-106, 2007 S.M. Pincus, Chaos, 5: , 1995 J.S. Richman et al, AJP, 278:H2039-H2049, 2000

Uniform quantization The full dynamical range (i.e. max-min) is divided into q bins of equal width ε max(RR)-min(RR) ε = q q = 6 Bins do not overlap and cover the full dynamical range one sample is transformed into one symbol

Non uniform quantization q = 6 The full dynamical range (i.e. max-min) is divided into q bins of different width ε one sample is transformed into one symbol Bins do not overlap and cover the full dynamical range

Sample-centered uniform quantization Bins are constructed around each sample and have the same width ε Bins overlap several symbols can be associated to the same sample ε ε

Sample-centered non uniform quantization Bins are constructed around each sample and have different width ε Bins overlap several symbols can be associated to the same sample ε ε

Outline 1) Symbolization strategies 2) Pattern construction via delay embedding procedure 3) Symbolization strategy and pattern construction technique impose a coarse graining of the multidimensional space 4) Techniques to optimize the pattern length 5) Approaches to reduce redundancy of patterns 6) Indexes derived from symbolization and pattern formation strategies 7) Application in clinics

s L C (i-1) = (s(i-1), …., s(i-L)) Pattern formation via delay embedding reconstruction s L A (i+1) = (s(i+1), …., s(i+L)) causal pattern of s(i) anti-causal pattern of s(i) ii+1i-1i+2i-2 … … i-Li+L original x(i) s(i)s(i+1)s(i-1)s(i+2)s(i-2) … … s(i-L)s(i+L) series symbolic series symbolization procedure s(i) is the forward image of s L C (i-1) s(i) is the backward image of s L A (i+1) present past future forwardbackward x L C (i-1) x L A (i+1)

Causal patterns are linked to predictability s(i) might be predicted using s L C (i-1) Conditional distribution of s(i) given s L C (i-1) Conditional distribution of s(i) given s L C (i-1) Conditional distribution of s(i) given s L C (i-1) s(i) can be fully predicted given L previous symbols s(i) can be largely predicted given L previous symbols s(i) cannot be predicted given L previous symbols 111

Causal patterns are linked to conditional entropy The information carried by s(i) might be reduced given s L C (i-1) Conditional distribution of s(i) given s L C (i-1) Conditional distribution of s(i) given s L C (i-1) Conditional distribution of s(i) given s L C (i-1) No information is carried by s(i) given L previous symbols The information carried by s(i) cannot be reduced given L previous symbols A certain amount of information is carried by s(i) given L previous symbols 111

Anti-causal patterns are linked to reversibility Since reversing time makes anti-causal patterns into causal ones (and vice versa), anti-causal patterns are helpful to test reversibility (i.e. the preservation of the statistical properties after time reversal) Conditional distribution of s(i) given s L C (i-1) Conditional distribution of s(i) given s L A (i+1) Since the conditional distributions are different, statistical properties are not maintained after time reversal 11

Outline 1) Symbolization strategies 2) Pattern construction via delay embedding procedure 3) Symbolization strategy and pattern construction technique impose a coarse graining of the multidimensional space 4) Techniques to optimize the pattern length 5) Approaches to reduce redundancy of patterns 6) Indexes derived from symbolization and pattern formation strategies 7) Application in clinics

Uniform partition L=3, q=6 x(i-1) x(i-2) x(i-3) - Cells are hyper-cubes of side ε - The number of cells are - Cells have the same size - Cells cover the entire embedding space - Cells do not intersect - The number of patterns in each cell might be variable qLqL ε - uniform quantization - delay embedding procedure uniform partition

Non uniform partition L=3, q=6 x(i-1) x(i-2) x(i-3) - Cells are hyper-boxes - The number of cells are - Cells have different size - Cells cover the entire embedding space - Cells do not intersect - The number of patterns in each cell might be made constant qLqL - non uniform quantization - delay embedding procedure non uniform partition

- Cells are hyper-spheres of radius ε - The number of cells are N-L+1 - Cells are built around patterns - Cells have equal size - Cells cover the entire embedding space - Cells intersect - The number of patterns in each cell might be variable Pattern-centered uniform coarse graining L=3 x(i-1) x(i-2) x(i-3) ε - sample-centered uniform quantization - delay embedding procedure pattern-centered uniform coarse graining

Pattern-centered non uniform coarse graining - Cells are hyper-spheres of radius ε - The number of cells are N-L+1 - Cells are built around patterns - Cells have different size - Cells cover the entire embedding space - Cells intersect - The number of patterns in each cell might be made constant L=3 x(i-1) x(i-2) x(i-3) ε - sample-centered non uniform quantization - delay embedding procedure pattern-centered non uniform coarse graining

Outline 1) Symbolization strategies 2) Pattern construction via delay embedding procedure 3) Symbolization strategy and pattern construction technique impose a coarse graining of the multidimensional space 4) Techniques to optimize the pattern length 5) Approaches to reduce redundancy of patterns 6) Indexes derived from symbolization and pattern formation strategies 7) Application in clinics

Techniques for the optimization of the pattern length 1) Fixing the pattern length, L, according to the series length, N, and the number of symbols, q 2) Using a strategy penalizing unreliable in-sample prediction or conditional entropy 3) Exploiting in-sample predictability loss or conditional entropy rise while increasing L using pattern-centered non uniform coarse graining 4) Taking advantage from out-of-sample prediction

Fixing the pattern length N >> q L Several patterns are present in each cell of partition Example A. Porta et al, IEEE Trans Biomed Eng, 48: , 2001 L = pattern length q = number of symbols N = series length In short-term cardiovascular variability analysis q=6 N=300 L = 3

Length-three pattern distribution A. Porta et al, IEEE Trans Biomed Eng, 48: , 2001 RestT90R10R15R20

Using a strategy penalizing unreliable in-sample prediction or conditional entropy Conditional distribution of s(i) given s L C (i-1) Perfect in-sample prediction or false certainty ? If the number of s L C (i-1) is largeReliability of prediction is high If the number of s L C (i-1) is smallReliability of prediction is low If the number of s L C (i-1) is 1 Insufficient knowledge, likely to produce a false certainty 1

Using a strategy penalizing unreliable in-sample prediction or conditional entropy Conditional distribution of s(i) given s L C (i-1) Likely false certainty True uncertainty If the number of s L C (i-1) is 1 Conditional distribution of s(i) given s L C (i-1) Optimal L minimizes mean square prediction error or conditional entropy 1 1

Normalized corrected conditional entropy A. Porta et al, Biol Cybern, 78:71-78, 1998 L min =3

Sample-centered non uniform coarse graining implies in-sample predictability loss or conditional entropy rise at large L Conditional distribution of s(i) given s L=2 C (i-1) Conditional distribution of s(i) given s L=3 C (i-1) Conditional distribution of s(i) given s L=4 C (i-1) Conditional distribution of s(i) given s L=5 C (i-1) At low L, increasing L might be helpful to unfold dynamics, thus improving predictability and reducing conditional entropy At high L, increasing L leads to the increase of cell size, thus decreasing predictability and increasing conditional entropy L Optimal L minimizes mean square prediction error or conditional entropy

Normalized k-nearest-neighbor conditional entropy Porta A et al, Physiol Meas, 34:17-33, 2013 L min =4

Out-of-sample prediction - Construction of the pattern library - Assessment of the conditional distributions - Definition of the predictor First halfSecond half - test for the predictor as a function of L Optimal L minimizes mean square out-of-sample prediction error

Outline 1) Symbolization strategies 2) Pattern construction via delay embedding procedure 3) Symbolization strategy and pattern construction technique impose a coarse graining of the multidimensional space 4) Techniques to optimize the pattern length 5) Approaches to reduce redundancy of patterns 6) Indexes derived from symbolization and pattern formation strategies 7) Application in clinics

Techniques to reduce the redundancy of patterns L=3, q=6 x(i-1) x(i-2) x(i-3) Adjacent and non adjacent cells are merged according to some criteria

Redundancy reduction technique based on the frequency content of the patterns S={0,1} and s L C ={s L C (i), with i=1,…,N-L+1} Most stable patterns {(0,0,…,0,0), (1,1,…,1,1)} … {…, …} Most variable patterns {(0,1,…,0,1), (1,0,…,1,0)} Frequency of the dominant rhythm

A. Porta et al, IEEE Trans Biomed Eng, 48: , V 1V 2LV 2UV Example of redundancy reduction technique based on the frequency content of the patterns S={0,1,2,3,4,5} and s L C ={s L C (i), with i=1,…,N-L+1} with L=3 Pattern number 6 3 =216 4 pattern classes

Symbolic dynamics and autonomic nervous system A. Porta et al, Am J Physiol, 293:H702-H708, 2007

Redundancy reduction technique based of the Shannon entropy (SE) of the patterns S={0,1} and s L C ={s L C (i), with i=1,…,N-L+1} Least informative patterns {(0,0,…,0,0), (1,1,…,1,1)} … {…, …} Most informative patterns {(0,1,...,0,1), (0,…,0,1,…,1), …} … 0 or 1 SE=0 1 0 SE=log2 1 1

Example of redundancy reduction technique based on the entropy of the patterns S={0,1} and s L C ={s L C (i), with i=1,…,N-L+1} with L=8 17 pattern classes Pattern number 2 8 =256 D. Cysarz et al, Comp Biol Med, 42: , 2012

Binary patterns of length L=8 with ApEn=0.0 D. Cysarz et al, Comp Biol Med, 42: , 2012 Symbolic dynamics and autonomic nervous system

Outline 1) Symbolization strategies 2) Pattern construction via delay embedding procedure 3) Symbolization strategy and pattern construction technique impose a coarse graining of the multidimensional space 4) Techniques to optimize the pattern length 5) Approaches to reduce redundancy of patterns 6) Indexes derived from symbolization and pattern formation strategies 7) Application in clinics

Time domain indexes derived from symbolization and pattern formation strategies - Sample frequency of a specific pattern - Sample frequency of a specific pattern class - Forward mean square prediction error between the current value and its best prediction based on causal patterns - Backward mean square prediction error between the current value and its best prediction based on anti-causal patterns A. Porta et al, IEEE Trans Biomed Eng, 48, , 2001 D. Cysarz et al, Comp Biol Med, 42, , 2012 N. Wessel et al, Phys Rev E, 61, , 2000 A. Porta et al, IEEE Trans Biomed Eng, 47, , 2000 A. Porta et al, IEEE Trans Biomed Eng, 54, , 2007 A. Porta et al, Phil Trans R Soc A, 367: , 2009 N. Wessel et al, Med Biol Eng Comput, 44, , 2006

Information domain indexes derived from symbolization and pattern formation strategies - Shannon entropy of the pattern distribution - Entropy rate of the series corrected approximate entropy (CApEn) sample entropy (SampEn) corrected conditional entropy (CCE) - Mutual information S.M. Pincus, Chaos, 5: , 1995 J.S. Richman and J.R. Moorman, Am J Physiol, 278:H2039-H2049, 2000 A. Porta et al, Biol Cybern, 78:71-78, 1998 A. Porta et al, J Appl Physiol, 103: , 2007 N. Wessel et al, Phys Rev E, 61: , 2000 A. Porta et al, IEEE Trans Biomed Eng, 48: , 2001 D. Hoyer et al, IEEE Trans Biomed Eng, 52: , 2005

Outline 1) Symbolization strategies 2) Pattern construction via delay embedding procedure 3) Symbolization strategy and pattern construction technique impose a coarse graining of the multidimensional space 4) Techniques to optimize the pattern length 5) Approaches to reduce redundancy of patterns 6) Indexes derived from symbolization and pattern formation strategies 7) Application in clinics

Application on symbolic dynamics in clinics

R. Maestri et al, J Cardiovasc Electrophysiol 18: , 2007 Best multivariate clinical model for the prediction of the total cardiac death in heart failure population

Linear and non linear indexes derived from heart rate variability R. Maestri et al, J Cardiovasc Electrophysiol 18: , Linear indexes in the time and frequency domains

R. Maestri et al, J Cardiovasc Electrophysiol 18: , 2007 Additive predictive value to the best multivariate clinical model

Conclusions Symbolization procedures and pattern formation techniques impose a coarse graining over the multidimensional space Techniques to optimize the pattern length are necessary to make consistent indexes derived from symbolization and pattern formation strategies Techniques to reduce redundancy of patterns are necessary to focus a limited amount of features Symbolization and pattern formation strategies provided indexes improving the best multivariate model based on traditional clinical parameters for the prediction of total cardiac death in heart failure population