Getting Code Near the Data: A Study of Generating Customized Data Intensive Scientific Workflows with Domain Specific Language

Ashwin Manjunatha 1, Ajith Ranabahu 1, Paul Anderson 2 and Amit Sheth 1
1 Ohio Center of Excellence in Knowledge Enabled Computing (Kno.e.sis)
2 Air Force Research Laboratory, Biosciences & Protection Division
1 {ashwin, ajith, / 2

1. Introduction

Metabolomics
The term metabolomics is defined as a comprehensive analysis in which metabolites of a biological system are identified and quantified. Any technique that can quantify metabolites can be used for metabolomics, but there are two primary techniques seen in the literature: nuclear magnetic resonance (NMR) and mass spectrometry with a prior online separation step such as high performance liquid chromatography (HPLC) or gas chromatography (GC). Neither technique is strictly superior; each has its own advantages and disadvantages. Existing applications include the identification of biomarkers associated with responses to toxins and pathophysiologic changes, sample classification based on the type of toxic exposure, large scale human studies, clinical diagnosis, and the study of genetic disorders.

Toxicology
Toxicology is the branch of pharmacology that deals with poisons and their effects on plant, animal and human life. Metabolic profiling, especially of urine or blood plasma samples, can be used to detect the physiological changes caused by the toxic insult of a chemical.

NMR
Nuclear magnetic resonance (NMR) spectroscopy is an experimental technique that exploits the properties of an atom's nucleus. It can be used to obtain information about the concentration and structure of molecules. NMR studies magnetic nuclei by applying a static magnetic field and then a second, oscillating magnetic field. Only nuclei with an odd number of protons or neutrons can be measured using NMR; the two most commonly studied nuclei are 1H and 13C.

[Figure: NMR Spectrometer]

2. What is the problem?

- Metabolomics generates large data sets
    - Hard to move around for processing
    - May not work well for the case of Service Oriented Workflow
- Processing is computationally intensive
    - Cannot be processed on a single machine
- Some institutions have regulations on data
    - E.g. the Air Force Research Lab has strict regulations on data usage

Need a better way to take code near the data, not data near the code!

3. Using a DSL for mini Workflows

- Use a Domain Specific Language (DSL) to create a mini workflow.
- The DSL can be transformed into a Map-Reduce (Hadoop) program.
- It is flexible enough to be compiled into other types of executable programs (e.g. a Matlab program).
- It reduces the number of hops needed to create the workflow.
    - No need to do iterative programming with developers.
    - The biologist can create the program by him/herself.

4. Example Workflow

Load Data -> Filter Data -> Sum normalize -> Store Data

    # load data
    originaldata = load_data_from_csv(rawdatafile)
    # filter out a range
    filtered = range_filter({:min => 20, :max => 50}, originaldata)
    # sum normalize
    normalized = sum_normalize(filtered)
    # write the file
    store(normalized_datafile, normalized)

    A = LOAD '$filename' USING PigStorage(',') AS (colnum:int, value:double);
    B = GROUP A BY colnum;
    C = FOREACH B GENERATE group, SUM(A.value);
    D = COGROUP A BY colnum INNER, C BY $0 INNER;
    F = FOREACH D GENERATE group, FLATTEN(A), FLATTEN(C);
    G = FOREACH F GENERATE $0, ($2/$4)*100;
    STORE G INTO '$filename_processed' USING PigStorage(',');

5. Usage Patterns

The DSL can be used in different ways:
1. As a standalone workflow creator
2. To implement individual services for a service oriented workflow

6. Discussion

Convenience: A custom DSL is more comprehensible to a biologist than a generic solution like Pig Latin.
Tools: Sophisticated tooling is required to enable the biologist to conveniently build the workflows.
Provenance and Metadata: A DSL makes it easier to integrate provenance and other metadata.
Efficiency: Efficiency may be reduced, but this can easily be compensated for by a cheaper improvement in computing power.

7. What is the bottom line for the Biologist?

Faster processing and result generation: The backend services can run much faster than any single computer and hence provide results faster.
Ability to handle very large datasets: Clouds are capable of handling much larger datasets than any single computer.
Works well inside regulated environments: Data can easily be processed in house without violating institutional regulations.
Reduced hops make for quick turnaround: Most tasks can be programmed by the biologist without the need for skilled developers.
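The four steps of the example workflow in section 4 (load, filter, sum normalize, store) can be sketched in plain Python to make the semantics of each step concrete. This is only an illustrative stand-in, not the poster's DSL or its generated code. The function names mirror the DSL verbs, but their Python signatures are assumptions, as is the `run_workflow` helper. The sketch assumes headerless CSV rows of the form `colnum,value`, matching the schema in the Pig Latin script. It also assumes that "filter out a range" means dropping rows whose value lies within the closed interval; the source does not say which field the range applies to or whether the bounds are inclusive.

```python
import csv
from collections import defaultdict


def load_data_from_csv(path):
    """Read (colnum, value) rows from a headerless CSV file (assumed layout)."""
    with open(path, newline="") as handle:
        return [(int(row[0]), float(row[1])) for row in csv.reader(handle) if row]


def range_filter(bounds, rows):
    """Drop rows whose value lies in [min, max].

    Assumption: the range applies to the value field and is inclusive.
    """
    low, high = bounds["min"], bounds["max"]
    return [(col, val) for col, val in rows if not low <= val <= high]


def sum_normalize(rows):
    """Express each value as a percentage of its column total.

    Mirrors the Pig Latin script: group by colnum, sum the values,
    then compute (value / column_sum) * 100 for every row.
    A column whose values sum to zero raises ZeroDivisionError.
    """
    totals = defaultdict(float)
    for col, val in rows:
        totals[col] += val
    return [(col, val / totals[col] * 100) for col, val in rows]


def store(path, rows):
    """Write (colnum, value) rows back out as CSV."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


def run_workflow(raw_path, out_path):
    """Hypothetical driver chaining the four steps in the poster's order."""
    original = load_data_from_csv(raw_path)
    filtered = range_filter({"min": 20, "max": 50}, original)
    normalized = sum_normalize(filtered)
    store(out_path, normalized)
    return normalized
```

Calling `run_workflow("raw.csv", "normalized.csv")` (both file names are placeholders) runs the whole chain on one machine. The point of the poster's approach is that the same four-line description could instead be translated into a Map-Reduce job that runs where the data lives.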