University of Minnesota

Presentation transcript:

Progress Toward Real-Time Data Submission and Analysis for our Breeding Community
Kevin Silverstein, PhD, Operations Manager
GEMS led by: Philip Pardey, Jim Wilgenbusch and Kevin Silverstein
College of Food, Agricultural and Natural Resource Sciences (CFANS)
Minnesota Supercomputing Institute (MSI)
University of Minnesota
Phenome Meeting, February 6, 2019

What is G.E.M.S?
A novel data sharing and big data analytics platform that enables public-private research collaborations for innovation in food and agricultural production and other domain areas, across the Genomics, Environment, Management and Socio-economics (G.E.M.S) subject matter disciplines.
The platform also explicitly accommodates data that spans time and space, linking high-throughput experimental data with remotely sensed, satellite-generated data.
[Diagram: Genomics, Environment, Management, Socio-Economics, spanning Time and Space]

The G.E.M.S Team (more than 20 brains strong!)
Bi-weekly build meetings
Weekly technical meetings
Numerous ad hoc consultations in the Cargill Branary & MSI

Realizing the Big Data Revolution
The GEMS team took a different design tack: we set about designing a platform that could overcome the obstacles of dealing with big data, so that we can realize the benefits of the big data revolution.
Data Transfer: get the data to the tool or get the tool to the data
Data Interoperability: reconcile file formats, units, vocabularies, languages, and ontologies
Data Analysis: provide access to complex software and the ability to replicate analyses
Data Sharing: facilitate complex partnerships while respecting data ownership and privacy

Accomplishments for 2018
Field trial data cleaning
Environmental data cleaning
Import of KDSmart phenotyping data
Prototyping a data collection and analysis web dashboard

Field trial data cleaning
Supported G2F in cleaning multi-state maize field trial data (2016-DOI, 2017-ARK)
Developed new Python modules for cleaning field trial data
Built automated methods to detect errors, missing data and outliers in hybrid phenotypic data
Generated an Error Detection report, a Summary Statistics report and a Pedigree Summary report to detect outliers and provide a data summary
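
The slides do not describe how the G.E.M.S modules detect missing data and outliers. As a rough illustration only, the sketch below applies a common technique (Tukey's interquartile-range fences) to a handful of invented records. The field names `plot_id` and `grain_yield`, the function name and the sample values are all assumptions, not taken from the G2F data or the G.E.M.S code.

```python
from statistics import quantiles

def find_missing_and_outliers(records, trait="grain_yield", k=1.5):
    """Return (missing, outliers) for one trait using Tukey's IQR fences.

    Hypothetical helper: not the method used by the G.E.M.S modules.
    """
    missing = [r for r in records if r.get(trait) is None]
    present = [r for r in records if r.get(trait) is not None]

    # Quartiles of the observed values (inclusive method treats the data as a full population).
    q1, _, q3 = quantiles([r[trait] for r in present], n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr

    # Anything outside the fences is flagged for review, not deleted.
    outliers = [r for r in present if not low <= r[trait] <= high]
    return missing, outliers

# Invented sample data for illustration only.
records = [
    {"plot_id": "P1", "grain_yield": 150.0},
    {"plot_id": "P2", "grain_yield": 155.0},
    {"plot_id": "P3", "grain_yield": 160.0},
    {"plot_id": "P4", "grain_yield": 158.0},
    {"plot_id": "P5", "grain_yield": 152.0},
    {"plot_id": "P6", "grain_yield": 400.0},  # implausibly high
    {"plot_id": "P7", "grain_yield": None},   # not recorded
]

missing, outliers = find_missing_and_outliers(records)
print("missing:", [r["plot_id"] for r in missing])
print("outliers:", [r["plot_id"] for r in outliers])
```

Flagged plots would typically go into a report for a curator to review rather than being dropped automatically.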

Environmental data cleaning
Developed new Python modules for cleaning environmental data
Performed cleaning of G2F environmental data (2016, 2017)
Built an Application Programming Interface (API) to automatically pull weather data from the nearest weather station
Developed tools to convert units from the imperial to the metric system
Developed algorithms to flag errant observations based on guidelines from the World Meteorological Organization (WMO)
Developed a tool to detect local outliers
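
Three of the steps above can be sketched in a few lines: picking the nearest station by great-circle distance, converting imperial units to metric, and flagging values outside a plausible range. This is a minimal sketch under stated assumptions. The station list, coordinates, field names and especially the range limits are placeholders chosen for illustration; they are not the WMO guideline values and not the G.E.M.S implementation.

```python
import math

def fahrenheit_to_celsius(deg_f):
    return (deg_f - 32.0) * 5.0 / 9.0

def inches_to_mm(inches):
    return inches * 25.4

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def nearest_station(lat, lon, stations):
    """Pick the station with the smallest great-circle distance to the field."""
    return min(stations, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))

# Placeholder limits for illustration; NOT the actual WMO guideline values.
PLAUSIBLE_RANGE = {"temp_c": (-60.0, 60.0), "rain_mm": (0.0, 500.0)}

def flag_errant(observation):
    """Return a list of flags for missing or out-of-range values."""
    flags = []
    for key, (low, high) in PLAUSIBLE_RANGE.items():
        value = observation.get(key)
        if value is None:
            flags.append(f"{key}:missing")
        elif not low <= value <= high:
            flags.append(f"{key}:out_of_range")
    return flags

# Hypothetical stations and an approximate field position near Arlington, WI.
stations = [
    {"name": "station_a", "lat": 43.07, "lon": -89.40},
    {"name": "station_b", "lat": 44.98, "lon": -93.27},
]
chosen = nearest_station(43.30, -89.40, stations)

raw = {"temp_f": 167.0, "rain_in": 0.5}  # invented reading with a suspicious temperature
obs = {"temp_c": fahrenheit_to_celsius(raw["temp_f"]), "rain_mm": inches_to_mm(raw["rain_in"])}
print(chosen["name"], obs, flag_errant(obs))
```

Converting to metric before the range check keeps a single set of limits, whichever units a collaborator's logger reports.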

Import of KDSmart phenotyping data
[Screenshots: KDSmart software, KDXplore software]
KDSmart is software from DArT, a company based in Canberra, Australia
It allows phenotypic observations to be recorded on handheld Android devices (phones or tablets)
Building a system to automate the import of G2F KDSmart phenotyping data into the GEMS platform (in progress)

Data collection and analysis web dashboard
[Dashboard screenshots: "Grain Yield across Field Locations" (State: Wisconsin, Field Location: WIH2, City: Arlington); "Histogram of Grain Yield" (Field Location: WIH2)]
Designing a prototype of a web dashboard for G2F on the G.E.M.S platform
Provides a high-level view of multi-state maize field trial data
Prototype will include basic query capabilities and visualization of the data
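
To make "basic query capabilities and visualization" concrete, the sketch below filters trial records by field location and bins grain yield into a text histogram. It is only an illustration of the idea. The record layout, the function names, the second location code `XX01` and all yield values are invented; only the location code `WIH2` appears in the slides.

```python
from collections import Counter

def query(records, **filters):
    """Return the records whose fields match every keyword filter exactly."""
    return [r for r in records if all(r.get(k) == v for k, v in filters.items())]

def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Invented records for illustration only.
records = [
    {"location": "WIH2", "state": "Wisconsin", "grain_yield": 152.0},
    {"location": "WIH2", "state": "Wisconsin", "grain_yield": 158.0},
    {"location": "WIH2", "state": "Wisconsin", "grain_yield": 161.0},
    {"location": "WIH2", "state": "Wisconsin", "grain_yield": 175.0},
    {"location": "XX01", "state": "Elsewhere", "grain_yield": 140.0},
]

wih2 = query(records, location="WIH2")
bins = histogram([r["grain_yield"] for r in wih2], bin_width=10)

# Simple text rendering; a real dashboard would hand the bins to a plotting library.
for lower_edge, count in bins.items():
    print(f"{lower_edge:>4}-{lower_edge + 10:<4} {'#' * count}")
```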

Proposed activities (2019)
Data Cleaning / Systematizing
  Clean 2018 G2F data, plus systematize and clean 2014 & 2015 data (Q1-Q2)
  Easy-access weather data API for use in R, Python (Q3-Q4)
  Real-time data uploads/cleaning for collaborators (Q3-Q4)
APIs to external programs
  Support BrAPI and additional G.E.M.S. API components to share with MaizeGDB, CyVerse, GOBii, EIB, KDSmart (Q3-Q4)
Customized Web Dashboard for G2F
  Prototype will include basic query capabilities and visualization of the data (Q1-Q4)

Acknowledgments
G.E.M.S: Christina Poudyal, Philip Pardey
UW Madison: Naser AlKhalifah, Natalia DeLeon
Iowa Corn: David Ertl
G2F Consortium Members

Thanks
G.E.M.S: https://agroinformatics.org
G2F: https://www.genomes2fields.org