Wrapping up our last topic: You and your (DNA) parasites Events like these, happening over and over again, have led to… Edward Marcotte/Univ. of Texas/BCH391L/Spring.

Slides:



Advertisements
Similar presentations
PubMed and other relevant NLM resources. Hands-on course, June 26, 2003 Astrid Müller University of Oslo Library Library of Medicine and Health Sciences.
Advertisements

PubMed Overview From the main HINARI webpage, we can access PubMed by clicking on Search HINARI journal articles through PubMed (Medline). Note: If you.
PubMed/How to Search, Display, Download & (module 4.1)
PubMed Overview From the HINARI Content page, we can access PubMed by clicking on Search inside HINARI full-text using PubMed. Note: If you do not properly.
MY NCBI To register, add filters and use the MY NCBI options, you should directly access PubMed using the following address:
PubMed/History; Accessing Full-Text Articles (module 4.4)
Managing References : Mendeley
MY NCBI (module 4.5). MODULE 4.5 PubMed/How to Use MY NCBI Instructions - This part of the: course is a PowerPoint demonstration intended to introduce.
The results for this search are displayed in the Summary format with a total of 3808 citations.
PubMed/How to Search, Display, Download & (module 4.1)
NIH Public Access Compliance Cleveland Health Sciences Library Case Western Reserve University Kathleen C. Blazar.
Introduction to the Internet
PubMed for Trainers, Winter 2013 U.S. National Library of Medicine (NLM) and NLM Training Center Full Text.
Zoology 305 Library Databases/Indexes Lab Goals for session: 1) Meet your librarian Kevin Messner 2) Understand.
How to manage my references? Stop Searching, Start Discovering.
PubMed and its search options Jan Emmerich, Sonja Jacobi, Kerstin Müller (5th Semester Library Management)
NCBI/WHO PubMed/Hinari Course NCBI Literature Databases: PubMed Background.
Introduction to PubMed® (pubmed.gov)
Creating NCBI The late Senator Claude Pepper recognized the importance of computerized information processing methods for the conduct of biomedical research.
1.
References: Online Sources APA format Created by Andrea Dottolo, Ph.D., Department of Psychology, University of Massachusetts, Lowell 1.
Encyclopedia of Genetics, Genomics, Proteomics and Bioinformatics USC School of Medicine Library.
NATIONAL LIBRARY OF MEDICINE The PubMed ID and Entrez, PubMed and PubMed Central Edwin Sequeira National Center for Biotechnology Information June 21,
Perl Programming: Developing Key Tools for Bioinformatics An Informative Look Behind the Importance of Programming Skills and Brief Tutorial on Getting.
Project presentation using TWiki Lim Yun Ping National University of Singapore.
Master’s course Bioinformatics Data Analysis and Tools Lecture 6: Internet Basics Centre for Integrative Bioinformatics.
Accessing journals by via PubMed Note the link to find articles through HINARI/PubMed. Using this option will be covered in later in the Short Course.
PubMed/How to Search, Display, Download & (module 4.1)
PubMed/History; Accessing Full-Text Articles (module 4.4)
1 Spidering the Web in Python CSC 161: The Art of Programming Prof. Henry Kautz 11/23/2009.
Moving forward our shared data agenda: a view from the publishing industry ICSTI, March 2012.
Bioinformatics.
Edward Marcotte/Univ. of Texas/BIO337/Spring 2014 Science news of the day: “The game, called EteRNA, allows players to remotely carry out real experiments.
A data retrieval workflow using NCBI E-Utils + Python Part II: Jinja2 / Flask John Pinney Tech talk Tue 19 th Nov.
Internet Browsing the world. Browse Internet Course contents Overview: Browsing the world Lesson 1: Internet Explorer Lesson 2: Save a link for future.
PubMed Overview From the HINARI Content page, we can access PubMed by clicking on Search inside HINARI full-text using PubMed. Note: If you do not properly.
PubMed/How to Search, Display, Download & (module 4.1)
Part 1 – PubMed Interface, Display options, Saving, Printing, and ing results. Instructions This part of the course is a PowerPoint demonstration.
Doug Raiford Lesson 3.  More and more sequence data is being generated every day  Useless if not made available to other researchers.
Dan J. Grauman National Cancer Institute National Institutes of Health Department of Health and Human Services Bethesda, Maryland, USA Interactive Cancer.
Variables and ConstantstMyn1 Variables and Constants PHP stands for: ”PHP: Hypertext Preprocessor”, and it is a server-side programming language. Special.
Accessing journals by via PubMed Note the link to find articles through HINARI/PubMed. Using this option will be covered in later in the Short Course.
Journal Searching Nancy B. Clark, M.Ed. Director of Medical Informatics Education FSU College of Medicine 1 All recourses are available online in Medical.
SPRINGER ONLINE
Class material and homework for February 9 today’s in-class topic: selected examples of contemporary biotechnology –polymerase chain reaction (PCR) –DNA.
PubMed Review MLA 2005 San Antonio, Texas. 15 Million Milestone million citations in PubMed million citations in PubMed 13+ million citations.
Partner Publishers’ Websites From the Partner publisher services dropdown menu, click on the Elsevier Science - Science Direct website. Note that this.
PubMed/How to Search, Display, Download & (module 4.1)
PubMed/How to Search, Display, Download & (module 4.1)
Chapter 16 Web Pages And CGI Scripts Department of Biomedical Informatics University of Pittsburgh School of Medicine
1 2/16/05CS120 The Information Era Chapter 4 Basic Web Page Construction TOPICS: Intro to HTML and Basic Web Page Design.
PubMed Basics Barbara A. Wood, MLIS Calder Library University of Miami Miller School of Medicine.
MEDLINE®/PubMed® PubMed for Trainers, Fall 2015 U.S. National Library of Medicine (NLM) and NLM Training Center An introduction.
PubMed/How to Search, Display, Download & (module 4.1)
Wrapping up our last topic: You and your (DNA) parasites Events like these, happening over and over again, have led to…
The Web Web Design. 3.2 The Web Focus on Reading Main Ideas A URL is an address that identifies a specific Web page. Web browsers have varying capabilities.
You and your (DNA) parasites
Livia Vasas PhD Budapest, September 2011.
PubMed Database Interface (Basic Course Module 4 Part A)
PubMed Database Interface (Basic Course Module 4 Part B)
Bibliography and reference manager programs, Endnote 2018 Attila Skulteti
Lívia Vasas, PhD 2018 The Nation Library of Medicine and its databases Mozilla Firefox or Google Chrome Lívia Vasas, PhD.
Lesson 3 Bioinformatics Laboratory
PubMed Database Interface (Basic Course: Module 4 Part A)
PubMed Database Interface Part A (Basic Course Module 4)
You and your (DNA) parasites
PubMed Database Interface (Basic Course: Module 4)
PubMed/How to Search, Display, Download & (module 4.1)
PubMed/How to Search, Display, Download & (module 4.1)
PubMed/How to Search, Display, Download & (module 4.1)
Presentation transcript:

Wrapping up our last topic: You and your (DNA) parasites Events like these, happening over and over again, have led to… Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

(apologies—missing the citation, now lost) Bottom line: Roughly half of your (and my) genome is the fossil wreckage of genomic parasites. We know this (in part) from sequence alignments. ~45% Wrapping up our last topic: You and your (DNA) parasites Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

So far, we’ve talked about DNA, RNA and protein sequences How to compare sequences to decide if they are related Having databases full of sequences and comparing them rapidly (BLAST) In fact, many such databases exist, so today we’ll start with a brief tour of some of the biological data on the web. Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

Just some of the resources available for bioinformatics Think of these as the raw data for new discoveries Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

Just some of the resources available for bioinformatics Think of these as the raw data for new discoveries >75K protein-protein interactions >1,300 biochemical processes and reactions, described in detail OMIM = the most important resource for human genetic disease Medline has >22 million research articles, many with complete text online GEO has ~900K experiments, each measuring 1000’s of mRNA or protein abundances Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

Live demo OMIM, Reactome, Human Protein Atlas Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

It’s nice to know that all of this exists, but ideally, you’d like to be able to so something constructive with the data. That means getting the data inside your own programs. All of these databases let you download data in big batches, but this isn’t always the case, so…. We saw one way to do this in AppSoma. Here’s another. Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

Let’s empower your Python scripts to grab data from the web. We’ll use Python library/module = an optional, specialized set of Python methods This particular Python module is called urllib2. urllib2 is: A collection of programs/tools to let you to surf the web from inside your programs. Much more powerful than the simple tasks we’ll do with it. More details: Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

The basic idea: We first set up a “request” by opening a connection to the URL. We then save the response in a variable and print it. If it can’t connect to the site, it’ll print out a helpful error message instead of the page. You can more or less use the commands in a cookbook fashion…. Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

import urllib2# include the urllib2 module url = " try:# this 'try' statement tells Python that we might expect an error. request = urllib2.urlopen(url)# setup a request page = request.read()# save the response print page# show the result to the user except urllib2.HTTPError:# handle a page not found error print "Could not find page." For example:  Run this… Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

>>> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" " Home | The University of Texas at Austin <!— …and so on…  We just captured the UT web page and printed it out (minus the images)… Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

That was a static web page. Let’s try one that requires some sort of action, for example by entering a document id or an id code for a sequence. Many web pages pass this information along in the web URL itself… Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

Here’s a complete Python program to retrieve a single entry from Medline: import urllib2 pmid = # Insert the pmid where the {} are in the following URL: url = " try:# there might be an error! request = urllib2.urlopen(url) page = request.read() print page except urllib2.HTTPError:# handle page not found error print "Could not connect to Medline!" Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

If you run that program, you should get back… the Medline entry for the human genome sequence paper >>> PMID OWN - NLM STAT- MEDLINE DA DCOM LR IS (Print) IS (Linking) VI IP DP Feb 15 TI - Initial sequencing and analysis of the human genome. PG AB - The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence. FAU - Lander, E S AU - Lander ES AD - Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. [and so on] Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

>>> PMID OWN - NLM STAT- MEDLINE DA DCOM LR IS (Print) IS (Linking) VI IP DP Feb 15 TI - Initial sequencing and analysis of the human genome. PG AB - The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence. FAU - Lander, E S AU - Lander ES AD - Whitehead Institute for Biomedical Research, Center for Genome Research, Cambridge, Massachusetts 02142, USA. [and so on] If you run that program, you should get back… We just printed it. We could have saved it or extracted data from it. For example… Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

Here’s our Python program again to retrieve a single entry from Medline. How would we modify this to count the authors? import urllib2 pmid = # Insert the pmid where the {} are in the following URL: url = " try:# there might be an error! request = urllib2.urlopen(url) page = request.read() print page except urllib2.HTTPError:# handle page not found error print "Could not connect to Medline!" Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

Here’s our Python program again to retrieve a single entry from Medline. How would we modify this to count the authors? import urllib2 pmid = # Insert the pmid where the {} are in the following URL: url = " try:# there might be an error! request = urllib2.urlopen(url) page = request.read() print page.count("AU - ") except urllib2.HTTPError:# handle page not found error print "Could not connect to Medline!" Medline begins author lines with "AU - ", so…  Run this, & get … >>> 255 So, there were 255 authors on one (of the two) human genome papers Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015

Queries to Medline or any other NCBI database, including GenBank, are described at: You can often figure out the form of the URL just by looking something up in a database, then noting the address of the web page with the data. This very simple approach could easily be the basis for: a home-made web browser a program to consult biological databases in real time a program to map the internet, etc. Of course, with this kind of power available to you, the imagination reels... Edward Marcotte/Univ. of Texas/BCH391L/Spring 2015