BIOL595 Final Project Alexandr Pak, Krittikan Chanpaisaeng, Xin Wen April 29, 2016 Literature Analysis.


1 BIOL595 Final Project: Literature Analysis
Alexandr Pak, Krittikan Chanpaisaeng, Xin Wen
April 29, 2016

2 Project Goal
To create a program that allows users to:
- Acquire information about publications from NCBI
- Search for active scholars
- Visualize the trend of publications containing the keywords
- Find the related topics most often studied together with the search term

3 Steps in Pipeline
- Fetch article info containing the search term from NCBI (Ebot (eFetch), XML)
- Store data in a database for ease of search and data retrieval (XML::LibXML, DBI, MySQL)
- Provide useful statistics in the form of bar charts: # articles published per year, # articles written by each author (GD::Graph)
- Find the top related topics/keywords and visualize the trends of topic popularity (Lingua::EN::Tag, ThemeRiver)

4 XML File Fetch
Search term: BRCA
2935 articles, 13765 authors, 37222 keywords
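The fetch step can be sketched roughly as follows. The slides name Ebot (eFetch) as the tool; this sketch instead calls the NCBI E-utilities `esearch.fcgi` and `efetch.fcgi` endpoints directly from Python, which is a substitution rather than the project's own script. The helper names and the batch size of 500 are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, db="pubmed"):
    """URL that runs the search and keeps the result set on the history server."""
    params = {"db": db, "term": term, "usehistory": "y", "retmax": 0}
    return f"{EUTILS}/esearch.fcgi?" + urlencode(params)


def efetch_url(web_env, query_key, retstart=0, retmax=500, db="pubmed"):
    """URL that downloads one batch of records as XML from a stored result set."""
    params = {
        "db": db,
        "query_key": query_key,
        "WebEnv": web_env,
        "retstart": retstart,
        "retmax": retmax,
        "retmode": "xml",
    }
    return f"{EUTILS}/efetch.fcgi?" + urlencode(params)


def fetch_batches(term, batch_size=500):
    """Yield raw XML batches for every article matching `term` (needs network access)."""
    with urlopen(esearch_url(term)) as resp:
        result = ET.parse(resp).getroot()
    count = int(result.findtext("Count"))
    web_env = result.findtext("WebEnv")
    query_key = result.findtext("QueryKey")
    for start in range(0, count, batch_size):
        with urlopen(efetch_url(web_env, query_key, start, batch_size)) as resp:
            yield resp.read()
```

With this batch size, the 2935 BRCA articles reported on the slide would arrive in six requests. `fetch_batches("BRCA")` is not called here because it requires a live connection to NCBI.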

5 Data Import
- Connection and Database Creation
- Table Creation
- Parsing and Data Insertion

6 Database Structure
Tables: AUTHORS, ARTICLES, Author_Article_Relationship, KEYWORDS
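A minimal sketch of the import step and the four-table layout, under stated assumptions: the project used XML::LibXML, DBI and MySQL, whereas this example uses Python's `xml.etree.ElementTree` and an in-memory SQLite database so that it runs without a server. The table names come from the slide; every column name and the sample XML (made-up records in the PubMed element layout) are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Table names follow the slide; the columns are assumptions for illustration.
SCHEMA = """
CREATE TABLE ARTICLES (pmid INTEGER PRIMARY KEY, title TEXT, year INTEGER);
CREATE TABLE AUTHORS (author_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE Author_Article_Relationship (
    author_id INTEGER REFERENCES AUTHORS(author_id),
    pmid INTEGER REFERENCES ARTICLES(pmid),
    PRIMARY KEY (author_id, pmid)
);
CREATE TABLE KEYWORDS (pmid INTEGER REFERENCES ARTICLES(pmid), keyword TEXT);
"""

# Made-up records, not real articles.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle><MedlineCitation>
    <PMID>1</PMID>
    <Article>
      <Journal><JournalIssue><PubDate><Year>2015</Year></PubDate></JournalIssue></Journal>
      <ArticleTitle>Made-up title A</ArticleTitle>
      <AuthorList>
        <Author><LastName>Doe</LastName><ForeName>Jane</ForeName></Author>
        <Author><LastName>Roe</LastName><ForeName>Rick</ForeName></Author>
      </AuthorList>
    </Article>
    <KeywordList><Keyword>BRCA1</Keyword><Keyword>breast cancer</Keyword></KeywordList>
  </MedlineCitation></PubmedArticle>
  <PubmedArticle><MedlineCitation>
    <PMID>2</PMID>
    <Article>
      <Journal><JournalIssue><PubDate><Year>2016</Year></PubDate></JournalIssue></Journal>
      <ArticleTitle>Made-up title B</ArticleTitle>
      <AuthorList>
        <Author><LastName>Doe</LastName><ForeName>Jane</ForeName></Author>
      </AuthorList>
    </Article>
    <KeywordList><Keyword>BRCA1</Keyword></KeywordList>
  </MedlineCitation></PubmedArticle>
</PubmedArticleSet>"""


def import_articles(conn, xml_text):
    """Parse PubMed-style XML and insert articles, authors, links and keywords."""
    root = ET.fromstring(xml_text)
    for cit in root.iter("MedlineCitation"):
        pmid = int(cit.findtext("PMID"))
        title = cit.findtext("Article/ArticleTitle")
        year = cit.findtext("Article/Journal/JournalIssue/PubDate/Year")
        conn.execute(
            "INSERT INTO ARTICLES VALUES (?, ?, ?)",
            (pmid, title, int(year) if year else None),
        )
        for author in cit.iterfind("Article/AuthorList/Author"):
            last = author.findtext("LastName", "")
            fore = author.findtext("ForeName", "")
            name = f"{last} {fore}".strip()
            # Each distinct author name is stored once and linked to many articles.
            conn.execute("INSERT OR IGNORE INTO AUTHORS (name) VALUES (?)", (name,))
            (author_id,) = conn.execute(
                "SELECT author_id FROM AUTHORS WHERE name = ?", (name,)
            ).fetchone()
            conn.execute(
                "INSERT OR IGNORE INTO Author_Article_Relationship VALUES (?, ?)",
                (author_id, pmid),
            )
        for kw in cit.iterfind("KeywordList/Keyword"):
            conn.execute("INSERT INTO KEYWORDS VALUES (?, ?)", (pmid, kw.text))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
import_articles(conn, SAMPLE_XML)
```

The relationship table is what lets one author map to many articles and one article to many authors. Matching authors by name alone, as done here, would merge different people who share a name.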

7 MySQL: Get Number of Publications by Each Author
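The query itself is not preserved in the transcript. One plausible form, assuming an author table joined to the author-article link table, is shown below, run against SQLite from Python instead of MySQL so the example is self-contained. The column names and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns and made-up rows, just enough to run the query.
conn.executescript("""
CREATE TABLE AUTHORS (author_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Author_Article_Relationship (author_id INTEGER, pmid INTEGER);
INSERT INTO AUTHORS VALUES (1, 'Doe Jane'), (2, 'Roe Rick');
INSERT INTO Author_Article_Relationship VALUES (1, 101), (1, 102), (2, 101);
""")

# One row per author, most prolific first; ties are broken alphabetically.
QUERY = """
SELECT a.name, COUNT(*) AS n_publications
FROM AUTHORS AS a
JOIN Author_Article_Relationship AS r ON r.author_id = a.author_id
GROUP BY a.author_id, a.name
ORDER BY n_publications DESC, a.name
"""

rows = conn.execute(QUERY).fetchall()
for name, n_publications in rows:
    print(name, n_publications)
```

The same `SELECT` statement should also be valid in MySQL, since it only uses a join, `GROUP BY` and `ORDER BY`.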

8 Associated Terms Acquisition From Abstracts (Lingua::EN::Tag)
longest noun phrase = 1:
$VAR1 = 'cancer'; $VAR2 = 'breast'; $VAR3 = '%'; $VAR4 = 'brca'; $VAR5 = 'patients'; $VAR6 = 'mutation'; $VAR7 = 'mutations'; $VAR8 = 'women'; $VAR9 = 'brca1'; $VAR10 = 'risk'; $VAR11 = 'ovarian';
longest noun phrase = 3:
$VAR1 = 'cancer'; $VAR2 = 'breast'; $VAR3 = '%'; $VAR4 = 'breast cancer'; $VAR5 = 'brca'; $VAR6 = 'patients'; $VAR7 = 'mutation'; $VAR8 = 'mutations'; $VAR9 = 'women'; $VAR10 = 'brca1'; $VAR11 = 'risk';
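The effect of the phrase-length limit can be illustrated with a much cruder stand-in. Lingua::EN::Tag is a part-of-speech tagger that extracts noun phrases; the sketch below does no tagging and simply counts word sequences of up to `max_len` words that contain no stopword. The function names, the stopword list and the two sample abstracts are all made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real run would need a fuller one.
STOPWORDS = {"a", "an", "and", "are", "at", "by", "for", "in", "is",
             "of", "on", "the", "to", "was", "were", "with"}


def count_terms(abstracts, max_len=1):
    """Count stopword-free word sequences of 1..max_len words."""
    counts = Counter()
    for text in abstracts:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                if not any(tok in STOPWORDS for tok in gram):
                    counts[" ".join(gram)] += 1
    return counts


def top_terms(abstracts, max_len=1, n=10):
    """Most frequent terms, ties broken alphabetically for a stable order."""
    counts = count_terms(abstracts, max_len)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Made-up abstracts, not taken from the project's data.
ABSTRACTS = [
    "BRCA1 mutations increase breast cancer risk.",
    "Breast cancer risk in women with BRCA1 mutations.",
]

print(top_terms(ABSTRACTS, max_len=1, n=3))
print(count_terms(ABSTRACTS, max_len=2)["breast cancer"])
```

As on the slide, multi-word terms such as "breast cancer" only appear once the length limit is raised above 1. Unlike a tagger, this approach also counts sequences that are not noun phrases, such as "mutations increase".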

9 Associated Terms Acquisition From Keywords Table

10 Data Visualization: 2D Area Graph (Excel)

11 Data Visualization: ThemeRiver
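Both an area graph and a ThemeRiver plot need the same input: one count per term per year. A possible way to build that table is sketched below, assuming the data is available as (year, keyword) pairs. The function name and the sample rows are hypothetical, and the plotting itself (Excel and ThemeRiver in the project) is not reproduced.

```python
from collections import Counter, defaultdict


def trend_table(rows, terms):
    """Return [(year, [count for each term])] for every year in the covered range."""
    counts = defaultdict(Counter)
    for year, keyword in rows:
        if keyword in terms:
            counts[year][keyword] += 1
    if not counts:
        return []
    # Include years with no matching articles so the plotted series has no gaps.
    years = range(min(counts), max(counts) + 1)
    return [(year, [counts[year][term] for term in terms]) for year in years]


# Made-up (year, keyword) pairs for illustration.
ROWS = [
    (2014, "BRCA1"),
    (2014, "breast cancer"),
    (2016, "BRCA1"),
    (2016, "BRCA1"),
    (2016, "ovarian cancer"),
]

table = trend_table(ROWS, ["BRCA1", "breast cancer"])
for year, values in table:
    print(year, values)
```

Each output row can be written out as one line of a spreadsheet, with one column per term, which is the layout a stacked area chart expects.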

13 XML File Fetch

14 Data Import
Search term: BRCA
2935 articles, 13765 authors, 37222 keywords

15 Data Import
Search term: BRCA
2935 articles, 13765 authors, 37222 keywords

16 ThemeRiver
Xu, Panpan, et al. (2013)

