Joaquin Vanschoren Hendrik Blockeel Experiment Databases for Machine Learning MLOSS Workshop - NIPS 2008.

Slides:



Advertisements
Similar presentations
SOMA2 – Drug Design Environment. Drug design environment – SOMA2 The SOMA2 project Tekes (National Technology Agency of Finland) DRUG2000 program.
Advertisements

Language Technologies Reality and Promise in AKT Yorick Wilks and Fabio Ciravegna Department of Computer Science, University of Sheffield.
Margherita Sini, FAO 1/ Multilingual Issues in Ontology Creation: Examples from NeOn and Concept Server Projects Margherita Sini.
INTRODUCTORY MICROSOFT ACCESS Lesson 1 – Access Basics
2015/6/1Course Introduction1 Welcome! MSCIT 521: Knowledge Discovery and Data Mining Qiang Yang Hong Kong University of Science and Technology
Time-Space Trade-Offs for 2D Range Minimum Queries
Large-Scale Machine Learning Program For Energy Prediction CEI Smart Grid Wei Yin.
A World of Thanks.
Sensemaking and Ground Truth Ontology Development Chinua Umoja William M. Pottenger Jason Perry Christopher Janneck.
Chapter 6: Database Evolution Title: AutoAdmin “What-if” Index Analysis Utility Authors: Surajit Chaudhuri, Vivek Narasayya ACM SIGMOD 1998.
Welcome!. Goals XLDB Goals 1.Identify trends, commonalities and major roadblocks related to building extremely large databases 2.Bridge the gap between.
Lab 7: Attribute SQL- Query the database
School of Computing and Mathematics, University of Huddersfield Knowledge Engineering: Issues for the Planning Community Lee McCluskey Department of Computing.
Automated Changes of Problem Representation Eugene Fink LTI Retreat 2007.
Attribute databases. GIS Definition Diagram Output Query Results.
Information systems and databases Database information systems Read the textbook: Chapter 2: Information systems and databases FOR MORE INFO...
1 Juha Mykkänen (ed.) University of Kuopio, HIS R&D Unit Health Kuopio seminar Brussels, 5 November 2004 Health Information Systems and Information Technology.
Copyright © 2012 by Gan Wang, Lori Saleski and BAE Systems. Published and used by INCOSE with permission. The Architecture and Design of a Corporate Engineering.
Experiment Databases: Towards better experimental research in machine learning and data mining Hendrik Blockeel Katholieke Universiteit Leuven.
Conditor Towards a national reference repository for French scientific production Valérie Bonvallot (CNRS-Inist) – Thierry Dautcourt (Inria)
CS598CXZ Course Summary ChengXiang Zhai Department of Computer Science University of Illinois, Urbana-Champaign.
Pasewark & Pasewark Microsoft Office 2003: Introductory 1 INTRODUCTORY MICROSOFT ACCESS Lesson 1 – Access Basics.
NUITS: A Novel User Interface for Efficient Keyword Search over Databases The integration of DB and IR provides users with a wide range of high quality.
By : Garima Indurkhya Jay Parikh Shraddha Herlekar Vikrant Naik.
AstroGrid 的安装,配置,组件介绍 LAMOST 王丹. AstroGrid Releases SNAPSHOT: latest bleeding edge code. Integration Tests results are available, but the code has not.
Future role of DMR in Cyber Infrastructure D. Ceperley NCSA, University of Illinois Urbana-Champaign N.B. All views expressed are my own.
Predicting Product Adoption in Large-Scale Social Networks Offensive: Hao Chen.
Computer Science 101 Database Concepts. Database Collection of related data Models real world “universe” Reflects changes Specific purposes and audience.
Juha Mykkänen University of Kuopio, HIS R&D Unit Health Kuopio seminar Brussels, 5 November 2004 SerAPI project: Service-oriented architecture and Web.
Nan Yang Chinese Terminologist Microsoft Language Excellence Shanghai, August 2008.
ON THE SELECTION OF TAGS FOR TAG CLOUDS (WSDM11) Advisor: Dr. Koh. Jia-Ling Speaker: Chiang, Guang-ting Date:2011/06/20 1.
BUSINESS ANALYTICS AND DATA VISUALIZATION
Database and Data File Management Oct 6/7/8, 2010 Fall 2010 | / Recitation 2.
Systems Biology ___ Toward System-level Understanding of Biological Systems Hou-Haifeng.
What have we learned?. What is a database? An organized collection of related data.
CGW 04, Stripped replication for the grid environment as a web service1 Stripped replication for the Grid environment as a web service Marek Ciglan, Ondrej.
A Practical Approach to Metadata Management Mark Jessop Prof. Jim Austin University of York.
Deriving Ramsey Numbers Interim Project Evaluations Team 7 Albuquerque Academy.
1 Limitations of BLAST Can only search for a single query (e.g. find all genes similar to TTGGACAGGATCGA) What about more complex queries? “Find all genes.
International Federation of Accountants Translation Activities in IFAC – an Update Kelly Ånerud, Senior Technical ManagerKelly Ånerud, Senior Technical.
 Frequent Word Combinations Mining and Indexing on HBase Hemanth Gokavarapu Santhosh Kumar Saminathan.
© 2005 IBM Corporation Discovering the Value of SOA with WebSphere Process Integration SOA on your terms and our expertise Building a Services Oriented.
7. Data Import Export Lingma Acheson Department of Computer and Information Science IUPUI CSCI N207 Data Analysis Using Spreadsheets 1.
WHY DO WE NEED TO LEARN LANGUAGES?.
Geographic Information Systems Using ESRI ArcGIS 9.3 Complex selection by attributes.
NVS New Zealand National Vegetation Survey. What is NVS? NVS (National Vegetation Survey) – New Zealand’s largest archive facility for plot-based vegetation.
Database Overview What is a database? What types of databases are there? How are databases more powerful than spreadsheets?
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Decision trees for hierarchical multi-label classification.
InSilicoLab – Grid Environment for Supporting Numerical Experiments in Chemistry Joanna Kocot, Daniel Harężlak, Klemens Noga, Mariusz Sterzel, Tomasz Szepieniec.
Ning Jin, Wei Wang ICDE 2011 LTS: Discriminative Subgraph Mining by Learning from Search History.
INTRODUCTION TO GENERATING SERVICES
Model Discovery through Metalearning
The Distribution Dilemma
Research Methods Dr. X.
We will start soon. Feel free to ask (chat window) anything you want before we start.
Relational Databases.
Easy eBooks to Support Blended Learning
Introduction C.Eng 714 Spring 2010.
Use Case: The GEO-Wetlands Community Portal
Azure Machine Learning & ML Studio
Lab Automation Software
THE ENTERPRISE ANALYTICAL JOURNEY
Tools of Software Development
Data Warehousing and Data Mining
Introduction to Geoinformatics L-10. Managing GIS
Project Title Team Members EE/CSCI 451: Project Presentation
TRAINING OFAC BIS DUAL-USE YOUR EXPORT SANCTIONS GOAL AGENT RISK
Access Review.
Detecting Fraudulent Returns with Basic Machine Learning
Welcome! Knowledge Discovery and Data Mining
Presentation transcript:

Joaquin Vanschoren Hendrik Blockeel Experiment Databases for Machine Learning MLOSS Workshop - NIPS 2008

Machine learning experiments Summarized in papers Individual experiments and experiment details lost LearnIntegrate Share experiments Common description language All details to ensure repeatability ExpML: a first attempt Store experiments All information organized Ask any question by writing query (SQL) Reuse information Learn from the past Save time/resources

Why share experiments? In machine learning research: Good science Reproducibility Visibility Algorithms pop up in searches Organization ‘Map’ of known approaches Also negative results Save time & energy: Reuse experiments (e.g. benchmarking) New possibilities: Larger, more generalizable studies

ExpML: An Example Definitions: Repeatability + theoretical info Complex experiment setups: Experiment results: evaluations and predictions Algorithms + (meta)parameters Datasets + preprocessors

Accessing the database

Integration in tools Experimentation tools skip known experiments DM/ML toolbenches share experiments DM assistance tools advise e.g. workflows Parallellization ML experiment grids Inductive databases query model properties Molecule databases organize modelling efforts

Thanks Gracias Xie Danke Dank U Merci Efharisto Dhanyavaad Grazie Spasiba Obrigado Tesekkurler Thanks y'all Köszönöm Arigato Hvala Toda