Presentation is loading. Please wait.

Presentation is loading. Please wait.

) Linked2Safety Project (FP7-ICT-2011-7 – 5.3 ) A NEXT-GENERATION, SECURE LINKED DATA MEDICAL INFORMATION SPACE FOR SEMANTICALLY-INTERCONNECTING ELECTRONIC.

Similar presentations


Presentation on theme: ") Linked2Safety Project (FP7-ICT-2011-7 – 5.3 ) A NEXT-GENERATION, SECURE LINKED DATA MEDICAL INFORMATION SPACE FOR SEMANTICALLY-INTERCONNECTING ELECTRONIC."— Presentation transcript:

1 ) Linked2Safety Project (FP7-ICT-2011-7 – 5.3 ) A NEXT-GENERATION, SECURE LINKED DATA MEDICAL INFORMATION SPACE FOR SEMANTICALLY-INTERCONNECTING ELECTRONIC HEALTH RECORDS AND CLINICAL TRIALS SYSTEMS ADVANCING PATIENTS SAFETY IN CLINICAL RESEARCH 12 th International Conference on Bioinformatics and Bioengineering, Larnaka Athos Antoniades

2 FP7, ICT-2011 – 5.3 Page 2  Why Share Data?  What are the current legal and ethical limitations?  How have scientists shared medical data so far?  Key Problems  Perturbation  Cell Suppression

3 FP7, ICT-2011 – 5.3 Page 3 Why share data:  Replication Testing  Statistical Power  Multiple Testing Problem Legal and Ethical Issues  Anonymization vs Pseudoanonimization  Limitations derived from consent form signed by subjects  Other, regional, study, or subject specific issues.

4 FP7, ICT-2011 – 5.3 Page 4 aaaAAA CaseU 00 U 01 U 02 ControlU 10 U 11 U 12

5 FP7, ICT-2011 – 5.3 Page 5 A paper that analyzes data from a specific study reports: Marital Status Age MarriedWidowedSingle 0-160150 18-2410550 25-34407 35~601520

6 FP7, ICT-2011 – 5.3 Page 6 A paper that analyzes data from a specific study reports: Marital Status Age MarriedWidowedSingle 0-160150 18-2410550 25-34407 35~601520

7 FP7, ICT-2011 – 5.3 Page 7 A paper that analyzes data from a specific study reports: Marital Status Age MarriedWidowedSingle 0-160150 18-2410550 25-34407 35~601520

8 FP7, ICT-2011 – 5.3 Page 8 Paper 1 that analyzes data from a specific study reports: Marital Status Age MarriedWidowedSingle 0-16NA 50 18-2410750 25-34407 35~601520 Marital Status Age MarriedWidowedSingle 0-16NA 50 18-2510850 26-3545740 36~551420 Paper 2 that analyzes data from the same study reports:

9 FP7, ICT-2011 – 5.3 Page 9 Original Data Marital Status Age MarriedWidowedSingle 0-160 1 50 18-2410750 25-34407 35~601520 Marital Status Age MarriedWidowedSingle 0-16NA 51 18-249849 25-3440741 35~611421 Perturbation (+-1) and Cell Suppression (<5)

10 FP7, ICT-2011 – 5.3 Page 10 Most common parameters testedMost common parameters tested Perturbation:[0], [-1,1], [-3,3], [-5,5], [-10,10] Cell Supression: <0, <=1, <=3,<=5,<=10 Standard main effect test using Chi SquareStandard main effect test using Chi Square Pearson’s Correlation Coefficient used to evaluate deviation of each parameter combination to original results.Pearson’s Correlation Coefficient used to evaluate deviation of each parameter combination to original results. A-priory defined threshold for Pearson’s correlation coefficient <=0.95.A-priory defined threshold for Pearson’s correlation coefficient <=0.95.

11 FP7, ICT-2011 – 5.3 Page 11

12 FP7, ICT-2011 – 5.3 Page 12 Objectives:  Design and develop the data mining techniques and the scalable infrastructure for the identification of phenotypic and genetic associations related to adverse events.  Develop new and implement existing state of the art analytical approaches for genetic data.  Define and implement the knowledge extraction and filtering mechanisms and the knowledge base  Integrate the knowledge base into a lightweight decision support system (Adverse events early detection mechanism)

13 FP7, ICT-2011 – 5.3 Page 13

14 FP7, ICT-2011 – 5.3 Page 14 Provides the tools for identifying and removing erroneous data or data that do not conform to the quality standards that a user might define. Tools:  Hardy-Weinberg Equilibrium Test  Allele Frequency Test  Missing Data Test

15 FP7, ICT-2011 – 5.3 Page 15 Provides the tools for removing redundant or irrelevant features from a dataset. Tools: Rough Set Feature Selection Information Gain Feature Selection Chi Squared Feature Selection

16 FP7, ICT-2011 – 5.3 Page 16

17 FP7, ICT-2011 – 5.3 Page 17 Provides the tools for performing single hypothesis testing on a dataset and test for associations. Tools: Pearson’s Chi Square Test Fisher’s Exact Test Odds Ratio Binomial Logistic Regression Linkage Disequilibrium Genetic Region Based Association Testing

18 FP7, ICT-2011 – 5.3 Page 18 Provides the tools for performing data mining analyses on a dataset and extract association rules. Tools: Association Rules (apriori) Decision Trees with Percentage Split (C4.5) Decision Trees with Cross Validation (C4.5) Random Forest with Percentage Split Random Forest with Cross Validation

19 FP7, ICT-2011 – 5.3 Page 19

20 FP7, ICT-2011 – 5.3 Page 20

21 FP7, ICT-2011 – 5.3 Page 21   Knowledge Extraction Mechanism This mechanism is responsible for storing statistically significant associations and important association rules in the Linked2Safety knowledge database Has two steps:   Logging system   Storing important knowledge   Filtering mechanism This mechanism allows users to insert or delete associations and association rules

22 FP7, ICT-2011 – 5.3 Page 22   Uses the knowledge in the L2S knowledge base   Runs in the background to identify new associations and association rules   Reruns analyses when updated datasets are available   Creates alerts for patients profiles associated with adverse events

23 FP7, ICT-2011 – 5.3 Page 23

24 FP7, ICT-2011 – 5.3 Page 24

25 FP7, ICT-2011 – 5.3 Page 25 Overlapping non genetic data of at least 2 data providers: Variables AgeWeight gain GenderHeadaches BMIGastrointestinal symptoms Smoking EverOphthalmological problems DyslipidemiaType of ophthalmological condition DiabetesHigh blood pressure Diabetes type IHeart conditions exist Diabetes type IIType of heart condition AnemiaHypertension Depressive personality disorderMyocardial infarction Major depressive disorderStroke Schizotypal personality disorderCoronary heart disease

26 FP7, ICT-2011 – 5.3 Page 26 We were able to identify for a given dataset the maximum noise that can be added to the data without significantly affecting the outcomes. Results presented are only relevant to MASTOS, all other datasets need to repeat the analytical approach described to determine the maximum noise that can be added to the results. Further investigation is necessary to identify the minimum parameter settings to satisfy legal and ethical requirements.

27 FP7, ICT-2011 – 5.3 Page 27 Athos Antoniades University of Cyprus email: athos@cs.ucy.ac.cy


Download ppt ") Linked2Safety Project (FP7-ICT-2011-7 – 5.3 ) A NEXT-GENERATION, SECURE LINKED DATA MEDICAL INFORMATION SPACE FOR SEMANTICALLY-INTERCONNECTING ELECTRONIC."

Similar presentations


Ads by Google