Evaluating state of the art in AI

Presentation transcript:

Hi, I’m Deshraj from CloudCV. CloudCV is an open source organization that aims to simplify the process of AI research. I’ll be talking about our latest project, EvalAI.

Team
This is our team. We are a group of graduate students, engineers, and researchers, supported by more than 30 open source contributors who work with us on the project.

Benchmarking progress in AI is hard
The problem: benchmarking progress in AI is hard. Comparing a new technique with existing approaches is a critical component of research, and in a field as fast-moving as AI, reproducibility is especially important. However, it is becoming increasingly difficult to reproduce numbers from published papers and to reliably compare new algorithms with existing approaches. Common sources of inconsistency are differences in algorithm implementation, different splits of the same dataset, and inconsistent evaluation metrics.

To benchmark progress, hundreds of AI challenges have been proposed. In computer vision, we have challenges like ImageNet and the MS COCO challenges; in data science, we have had the Netflix Challenge and the KDD Cup; and so on. However, a centralized platform to both host and participate in such challenges is still lacking.

What is EvalAI?
EvalAI is an open-source web platform that aims to evaluate the state of the art in AI. Its goal is to help AI researchers and students host, collaborate on, and participate in AI challenges organized around the globe. https://evalai.cloudcv.org

Wait, isn’t this just like Kaggle?
A question we’re often asked is: doesn’t Kaggle already do this? The central difference is this: on Kaggle, an AI researcher has to choose from Kaggle’s predefined set of evaluation schemes, and often that is just not enough. With EvalAI, on the other hand, any kind of challenge can be hosted. (Slide table comparing challenge platforms on: generic challenges, protocols and phases, open source, portable hosting.)

EvalAI: Host and Participant
A challenge on EvalAI is designed around two entities: challenge hosts and challenge participants. Now I will talk about some of the crucial features in EvalAI for both hosts and participants.

Host: Open Source
We have open-sourced EvalAI so that challenge hosts can deploy it on their own servers and run their challenges. Link: https://github.com/Cloud-CV/EvalAI

Host: Custom Evaluation Protocols and Phases
We have designed a versatile backend framework that supports user-defined evaluation metrics and multiple evaluation phases for a single challenge.
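
To make "user-defined metrics and multiple phases" concrete, here is a minimal sketch in Python. It is not EvalAI's actual interface: the names `PHASES`, `evaluate`, `accuracy`, and `error_rate`, and the phase names "dev" and "test", are assumptions chosen for illustration. The idea is simply that a host registers its own metric functions per phase, and the backend runs whichever metrics belong to the phase a submission targets.

```python
from typing import Callable, Dict, List

# Hypothetical metric signature: (ground truth labels, predicted labels) -> score.
Metric = Callable[[List[str], List[str]], float]


def accuracy(truth: List[str], predicted: List[str]) -> float:
    """Fraction of positions where the prediction matches the ground truth."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if not truth:
        return 0.0
    correct = sum(1 for t, p in zip(truth, predicted) if t == p)
    return correct / len(truth)


def error_rate(truth: List[str], predicted: List[str]) -> float:
    """Complement of accuracy, included only to show a second host-defined metric."""
    return 1.0 - accuracy(truth, predicted)


# Hypothetical phase table: each phase lists the metrics the host wants reported.
PHASES: Dict[str, Dict[str, Metric]] = {
    "dev": {"accuracy": accuracy},
    "test": {"accuracy": accuracy, "error_rate": error_rate},
}


def evaluate(phase: str, truth: List[str], predicted: List[str]) -> Dict[str, float]:
    """Run every metric registered for the given phase and return name -> score."""
    if phase not in PHASES:
        raise KeyError(f"unknown phase: {phase}")
    return {name: metric(truth, predicted) for name, metric in PHASES[phase].items()}


if __name__ == "__main__":
    truth = ["cat", "dog", "cat", "bird"]
    predicted = ["cat", "dog", "dog", "bird"]
    print(evaluate("dev", truth, predicted))
    print(evaluate("test", truth, predicted))
```

Keeping metrics as plain functions in a per-phase table means a host could add a phase or swap a metric without touching the code that runs submissions.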

Host: Hosting is simple
Hosting a challenge is streamlined. One can create the challenge on EvalAI using the intuitive UI or using a zip configuration file.
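
The talk does not describe what the zip configuration file contains, so the following is only a sketch of what packaging a challenge could look like. The file name `challenge_config.json`, the configuration fields, and the helper names `build_challenge_zip` and `list_challenge_zip` are all assumptions for illustration, not EvalAI's real format.

```python
import io
import json
import zipfile
from typing import List

# Hypothetical challenge description; the field names are illustrative only.
challenge_config = {
    "title": "Example Challenge",
    "phases": ["dev", "test"],
    "evaluation_script": "evaluate.py",
}


def build_challenge_zip(config: dict, evaluation_source: str) -> bytes:
    """Pack the configuration and the evaluation script into an in-memory zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("challenge_config.json", json.dumps(config, indent=2))
        archive.writestr(config["evaluation_script"], evaluation_source)
    return buffer.getvalue()


def list_challenge_zip(data: bytes) -> List[str]:
    """Return the sorted file names inside a zip produced by build_challenge_zip."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return sorted(archive.namelist())


if __name__ == "__main__":
    data = build_challenge_zip(challenge_config, "def evaluate():\n    pass\n")
    print(list_challenge_zip(data))
```

Bundling the evaluation script with its configuration in one archive keeps a challenge definition self-contained, which is one plausible reason to offer a zip upload alongside a UI.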

Participant: Fast Evaluation
For participants, EvalAI provides a fast and robust backend based on the map-reduce framework that speeds up evaluation. This makes it much faster for researchers to reproduce results from technical papers and to perform reliable and accurate analyses. (Slide figure: MapReduce job execution, logical view: Map, Shuffle, Reduce, Output.)
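
The map-reduce idea behind this can be sketched in a few lines of Python. This is an illustration of the general pattern, not EvalAI's backend: the function names, the chunk size, and the use of a thread pool are assumptions. The submission is split into chunks, each chunk is scored independently (map), and the partial counts are summed (reduce). Since there is only one output key here, the shuffle step is trivial and omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import Dict, List, Tuple


def split_into_chunks(keys: List[str], chunk_size: int) -> List[List[str]]:
    """Split the list of question ids into consecutive chunks."""
    return [keys[i:i + chunk_size] for i in range(0, len(keys), chunk_size)]


def map_chunk(chunk: List[str], truth: Dict[str, str],
              predicted: Dict[str, str]) -> Tuple[int, int]:
    """Map step: return (number correct, number seen) for one chunk."""
    correct = sum(1 for key in chunk if predicted.get(key) == truth[key])
    return correct, len(chunk)


def reduce_counts(a: Tuple[int, int], b: Tuple[int, int]) -> Tuple[int, int]:
    """Reduce step: add two partial (correct, seen) counts."""
    return a[0] + b[0], a[1] + b[1]


def parallel_accuracy(truth: Dict[str, str], predicted: Dict[str, str],
                      chunk_size: int = 2, workers: int = 4) -> float:
    """Score a submission chunk by chunk and combine the partial results."""
    chunks = split_into_chunks(sorted(truth), chunk_size)
    # A thread pool keeps the sketch portable; a real backend would more likely
    # spread chunks over processes or machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: map_chunk(c, truth, predicted), chunks))
    correct, total = reduce(reduce_counts, partials, (0, 0))
    return correct / total if total else 0.0


if __name__ == "__main__":
    truth = {"q1": "yes", "q2": "no", "q3": "2", "q4": "red"}
    predicted = {"q1": "yes", "q2": "yes", "q3": "2"}  # q4 is missing: counted wrong
    print(parallel_accuracy(truth, predicted))
```

Because each chunk is scored without looking at any other chunk, the map step can run in parallel, and the work per worker shrinks as workers are added.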

As its first challenge, EvalAI hosted this year’s VQA 2.0 challenge. Some background: last year, the VQA 1.0 challenge was hosted on Codalab, where evaluation took 410 seconds on average. (Slide chart: evaluation time, hosted on Codalab.)

6x speedup
This year, the dataset for the VQA 2.0 challenge is twice as large. Despite this, we’ve found that our parallelized backend takes only a third of the time. Evaluating twice the data in a third of the time is a 6x speedup in throughput. (Slide chart: evaluation time, hosted on Codalab vs. hosted on EvalAI.)
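
The arithmetic behind the 6x figure can be written out as follows. This assumes that "a third of the time" is measured against last year’s 410-second average; the variable names are illustrative, and the resulting time is derived here rather than reported in the talk.

```python
baseline_seconds = 410.0  # average VQA 1.0 evaluation time on Codalab, from the talk
dataset_growth = 2        # VQA 2.0 is described as twice as large
time_divisor = 3          # the new backend is described as taking a third of the time

# Assumed wall-clock time for one VQA 2.0 evaluation (about 137 seconds).
new_seconds = baseline_seconds / time_divisor

# Twice the data in a third of the time: 2 * 3 = 6 times the throughput.
throughput_speedup = dataset_growth * time_divisor

if __name__ == "__main__":
    print(round(new_seconds, 1), throughput_speedup)
```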

Participant: Centralized leaderboards
On the front end, we display centralized leaderboards that update in real time to help participants and organizers track progress. (Slide image: VQA 2.0 public leaderboard.)
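
As a rough illustration of the data structure behind a leaderboard, the sketch below keeps each team’s best score and re-ranks on demand. The `Leaderboard` class and its methods are hypothetical, and the tie-breaking rule (alphabetical by team name) is an assumption; the talk does not say how EvalAI stores or orders entries.

```python
from typing import Dict, List, Tuple


class Leaderboard:
    """Keeps the best score seen so far for each team (hypothetical sketch)."""

    def __init__(self) -> None:
        self._best: Dict[str, float] = {}

    def record(self, team: str, score: float) -> None:
        """Store the score only if it beats the team's previous best."""
        if team not in self._best or score > self._best[team]:
            self._best[team] = score

    def ranking(self) -> List[Tuple[str, float]]:
        """Return (team, score) pairs, highest score first, ties by team name."""
        return sorted(self._best.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    board = Leaderboard()
    board.record("team-b", 0.61)
    board.record("team-a", 0.58)
    board.record("team-a", 0.66)  # improves on team-a's earlier submission
    print(board.ranking())
```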

Participant: Realtime updates
Finally, we have designed a clean and intuitive user experience. On top of that, EvalAI is completely open source, so the community can easily add new features.

Progress
Overall, we are building a centralized platform for all AI challenges, whether they are hosted on EvalAI, on a fork of EvalAI, or as any other web application. So far: 30+ open source contributors, 3 times with Google Summer of Code, and 1100+ GitHub issues and PRs.

Upcoming features: Challenge Creation on EvalAI
We are working on some really cool features right now, and I would like to give an overview of some of them. The first is challenge creation on EvalAI. One can create a challenge either using a zip configuration file or using EvalAI’s intuitive UI (releasing soon).

Upcoming features: Central leaderboard for challenges hosted as third-party applications
We are also working on support for challenges hosted on third-party web applications. These can be linked with EvalAI through a simple webhook service that publishes results on EvalAI’s leaderboard on the fly: the third-party server sends an HTTP POST to EvalAI for each submission. This way, challenge hosts can keep access to the test annotations to themselves, and their challenge can still be featured on EvalAI along with all the results.
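
From the third-party server’s side, the per-submission HTTP POST could look roughly like the sketch below. The endpoint URL, the payload fields, and the function name `build_result_request` are all hypothetical, since the talk describes the webhook only at a high level. The request is built but not sent, so the example runs without a network.

```python
import json
import urllib.request
from typing import Dict

# Hypothetical endpoint; the real webhook URL is not given in the talk.
WEBHOOK_URL = "https://example.org/hypothetical/leaderboard-webhook"


def build_result_request(challenge: str, team: str, metrics: Dict[str, float],
                         url: str = WEBHOOK_URL) -> urllib.request.Request:
    """Build (but do not send) the POST that reports one evaluated submission."""
    # Only scores leave the third-party server; the test annotations stay private.
    payload = {"challenge": challenge, "team": team, "metrics": metrics}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_result_request("challenge-x", "team-a", {"accuracy": 0.75})
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # Passing the request to urllib.request.urlopen would actually send it.
```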

Upcoming features: EvalAI Python APIs
Another important feature is the EvalAI Python API. With it, participants can submit results to EvalAI from a Python console:
>>> import evalai
>>> evalai.submit(challenge="vqa", data={"foo": "bar"})
Submission Successful

Summary: Centralized platform for all AI challenges
To summarize, our ultimate goal is to build a centralized platform to host, participate in, and collaborate on AI challenges organized around the globe, and we hope to help benchmark progress in AI.

Interested in hosting a Challenge?
If you are interested in hosting a challenge on EvalAI, please reach out to us at team@cloudcv.org.

Thank you!
That’s all from my side. Thanks for listening. :) evalai.cloudcv.org, deshraj@cloudcv.org