2015/2016. 1.NON-SMALL CELL LUNG CANCER (NSCLC) 1.1 Adenocarcinomas are often found in an outer area of the lung. 1.2 Squamous cell carcinomas are usually.


1 2015/2016

2 LUNG CANCER
1. NON-SMALL CELL LUNG CANCER (NSCLC)
1.1 Adenocarcinomas are often found in an outer area of the lung.
1.2 Squamous cell carcinomas are usually found in the center of the lung next to an air tube (bronchus).
1.3 Large cell carcinomas can occur in any part of the lung. They tend to grow and spread faster than the other two types.
2. SMALL CELL LUNG CANCER (SCLC)
2.1 Small cell carcinoma (oat cell cancer).
2.2 Combined small cell carcinoma.
3. CARCINOIDS (COIDS) form a distinct histologic tumor subtype.

3 EGFR mutations
The tyrosine kinase (TK) domain is a region of the EGFR gene that is prone to mutation in patients with NSCLC. The TK domain spans 7 exons (exons 18-24), of which exons 18-21 carry somatic mutations in patients with NSCLC.

4 Data on microdeletions in the EGFR gene were collected from an online database, and the NCBI nucleotide sequence NG_007726 was used to build the training data sets. The transformed statistical table, given in the Table, was used to generate the sequences of mutated exons. For this purpose, an algorithm called the “approximative predictor” was developed in MATLAB. It generates the required number of mutated exons from a sample of healthy exons, statistical data on the type of mutation, the span of nucleotides affected, and the number of patients with that type of mutation, as shown in the Table.
Table: Transformed statistical table in the MATLAB software package.
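The generation step described above can be illustrated with a minimal Python sketch. It is not the MATLAB “approximative predictor” itself. The function names, the record fields (`start`, `length`, `patients`), and the toy sequence are assumptions made for illustration, and no real EGFR data are used. Each deletion type is sampled in proportion to its patient count and applied to a healthy sequence.

```python
import random

def apply_microdeletion(exon, start, length):
    """Return the exon with `length` nucleotides removed from 0-based `start`."""
    if start < 0 or length < 1 or start + length > len(exon):
        raise ValueError("deletion span falls outside the exon")
    return exon[:start] + exon[start + length:]

def generate_mutated_exons(healthy_exon, stats, n_samples, seed=0):
    """Sample mutated exons, weighting each deletion type by its patient count."""
    rng = random.Random(seed)
    spans = [(row["start"], row["length"]) for row in stats]
    weights = [row["patients"] for row in stats]
    chosen = rng.choices(spans, weights=weights, k=n_samples)
    return [apply_microdeletion(healthy_exon, s, n) for s, n in chosen]

# Hypothetical toy inputs (assumption): a 20-nucleotide "healthy" sequence
# and two deletion types with made-up patient counts.
healthy = "ACGTACGTACGTACGTACGT"
stats = [
    {"start": 4, "length": 3, "patients": 12},
    {"start": 10, "length": 6, "patients": 3},
]
samples = generate_mutated_exons(healthy, stats, n_samples=5)
```

Weighting by patient count means the generated set roughly mirrors how often each deletion type occurs in the statistical table, which is one plausible reading of how the table drives the generator.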

5 Approach based on exact identifier
Due to the poor results (training, validation, and test errors) of the approximative predictor, we developed a model based on generators of “predictive” combinations of all microdeletion mutations in exons 18, 19, and 20 (shown in the Figure).
The aim of this EGFR mutations identifier was to achieve exact identification of mutated exons 18, 19, and 20 in the EGFR gene, and thus to create a basis for the discovery of new treatments for non-small cell lung cancer (NSCLC).
We developed an integrated software suite that uses two levels of ensembling: 1. ensembling of the EGFR gene into exons, and 2. ensembling of global exon combinations into partial exon combinations.

6 EGFR GENE MUTATIONS IDENTIFIER - PREMISES
The mutations considered are microdeletions in mutated exons.
Each exon nucleotide can be in one of two states: deleted or normal.
An exon of length n therefore has 2^n possible mutated forms.
Instead of generating this large number of combinations for each exon, which would take too long in real time, we generate combinations for partial parts of the exons. The mutations identified in these parts are then integrated back into the exon, which gives complete information about its mutations.
Figure: Ensembling of combinations of exon 19 into groups of 10 nucleotides.
This model consists of three modules:

7 EGFR GENE MUTATIONS IDENTIFIER - STRUCTURE
1. The first module preprocesses the data (extraction, encoding, and normalization).
2. The second module contains functions for training a radial basis (radbas) neural network ensemble on the “predictive” mutations training set.
3. The third module is intended for exploitation in two modes:

8 DIAGNOSIS PREDICTIVE SYSTEM: EGFR GENE MUTATIONS IDENTIFIER
On-line mode, which uses sample patients’ data with microdeletion mutations extracted on-line from the EGFR mutation database, or off-line mode (a simulation mode with “predictive” microdeletion mutations from our own database). The steps of operation are: masking, exon combinations ensembling, radbas mutation identification with conversion from binary to decimal format, reensembling of the partial parts of exons, and counting.
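The operating steps listed above can be sketched in Python under some loud assumptions. The input is taken to be already aligned to the healthy exon, with `-` marking a deleted nucleotide. A plain scan over each group stands in for the radbas network. Positions are reported 1-based. All function names are hypothetical and none of this comes from the authors' software. The sketch also compares the number of whole-exon patterns, 2^n, with the number needed when the exon is split into groups of 10.

```python
def mask_exon(aligned):
    """Masking: 0 marks a deleted position ('-'), 1 a normal nucleotide."""
    return [0 if base == "-" else 1 for base in aligned]

def identify_runs(group):
    """Stand-in for the radbas identifier (assumption): return
    (start, length) for each run of zeros, with a 1-based start."""
    runs, start = [], None
    for pos, bit in enumerate(group, start=1):
        if bit == 0 and start is None:
            start = pos
        elif bit == 1 and start is not None:
            runs.append((start, pos - start))
            start = None
    if start is not None:
        runs.append((start, len(group) - start + 1))
    return runs

def identify_exon(aligned, group_size=10):
    """Mask, split into groups, identify per group, reensemble, and count."""
    mask = mask_exon(aligned)
    merged = []
    for offset in range(0, len(mask), group_size):
        for start, length in identify_runs(mask[offset:offset + group_size]):
            start += offset  # convert to exon-wide position
            # Reensembling: join a deletion that continues across a group boundary.
            if merged and merged[-1][0] + merged[-1][1] == start:
                merged[-1] = (merged[-1][0], merged[-1][1] + length)
            else:
                merged.append((start, length))
    return merged, sum(length for _, length in merged)

def pattern_counts(n, group_size=10):
    """Patterns for a whole exon of length n versus the total over its groups."""
    full, rest = divmod(n, group_size)
    grouped = full * 2 ** group_size + (2 ** rest if rest else 0)
    return 2 ** n, grouped

# Hypothetical aligned sequence (assumption): positions 9-13 are deleted,
# so the deletion straddles the boundary between the first two groups.
aligned = "ACGTACGT--" + "---TACGTAC" + "ACGTA"
deletions, total_deleted = identify_exon(aligned)  # [(9, 5)], 5
whole, grouped = pattern_counts(30)                # 1073741824, 3072
```

For a hypothetical exon of 30 nucleotides, whole-exon enumeration would need 2^30 patterns, while three groups of 10 need only 3 × 2^10, which is the saving the premises rely on. The price is the extra reensembling step that stitches together deletions split across group boundaries.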

9 Radial Basis Network
Test of the radbas identifier with 10 input vectors of 10 binary elements each (0 = mutation, 1 = no mutation), and outputs giving the decimal identification of the positions and number of mutations.
Identification of mutations: an odd output from the radbas network indicates the beginning of a mutation sequence, and an even output indicates the number of mutations from that position.
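The output convention above can be illustrated with a small Python sketch, assuming a simplified radial basis lookup rather than the trained network from the slides. It uses the radial basis transfer function exp(-n^2), which equals 1 only when the input matches a stored pattern exactly. The stored patterns, their targets, the `spread` scaling, and all function names are hypothetical.

```python
import math

def radbas(n):
    """Radial basis transfer function exp(-n^2); equals 1 only at n = 0."""
    return math.exp(-n * n)

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(vector, centers, targets, spread=1.0):
    """Return the target of the stored pattern with the strongest response.
    Dividing by `spread` is a simplification chosen for this sketch."""
    responses = [radbas(distance(vector, c) / spread) for c in centers]
    best = max(range(len(centers)), key=responses.__getitem__)
    return targets[best], responses[best]

def decode(output):
    """Pair outputs 1 and 2, 3 and 4, ... as (start position, number of mutations)."""
    return [(output[i], output[i + 1])
            for i in range(0, len(output) - 1, 2)
            if output[i + 1] > 0]

# Hypothetical stored patterns (0 = mutation, 1 = no mutation) and targets.
centers = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1, 0, 0, 1, 1],
]
targets = [
    [0, 0, 0, 0],  # no mutation
    [3, 3, 0, 0],  # 3 mutations starting at position 3
    [2, 1, 7, 2],  # 1 mutation at position 2, then 2 from position 7
]
output, response = identify(centers[2], centers, targets)
pairs = decode(output)  # [(2, 1), (7, 2)]
```

A response of exactly 1 signals an exact match with a stored pattern, which fits the stated goal of exact identification. Any lower response would mean the input is only near a known pattern.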

