Feature Engineering Studio October 7, 2013. Welcome to Bring Me Another Rock.

Presentation transcript:

Feature Engineering Studio October 7, 2013

Welcome to Bring Me Another Rock

In birthdate order
Each person should tell us about their favorite feature they created for Bring Me Another Rock
– Tell us what it was
– How you created it
– Your just-so story
– And was your just-so story correct

Next
Tell us about anything cool you did in Excel or another program to create a feature

Too Hard?
Were there any features that anyone kind of wanted to create, but it was too difficult? (or too much work?)

Better?
Who here got better features (in terms of the goodness metric) for Bring Me Another Rock than for Bring Me a Rock?

Other Interesting Observations?

GoogleRefine (now OpenRefine)

Mostly just an Excel clone, abandoned in favor of the fully-online Google Sheets
But some nice additional functionality

GoogleRefine (now OpenRefine)
Functionality to make it easy to regroup and transform data
– Find similar names
– Connect names
– Bin numerical data
– Mathematical transforms showing resultant graphs
– Text transforms and column creation
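Two of the items above, finding similar names and binning numerical data, can be illustrated outside the tool. The following is a minimal Python sketch, not OpenRefine's own implementation: `fingerprint` is a simplified take on key-collision clustering, and the function names and sample names are hypothetical.

```python
import string
from collections import defaultdict


def fingerprint(name):
    """Simplified key: lowercase, drop punctuation, sort the unique tokens."""
    cleaned = name.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(cleaned.split())))


def group_similar(names):
    """Group names that share a fingerprint; return only groups with 2+ members."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


def bin_value(value, width):
    """Return the lower edge of the fixed-width bin that contains value."""
    return (value // width) * width


# Hypothetical data: three spellings of one student name, plus one other name.
names = ["Smith, Alex", "alex smith", "Alex  Smith.", "Jordan Lee"]
print(group_similar(names))
print(bin_value(37, 10))
```

Names that differ only in case, punctuation, or word order collapse to the same key, which is why the three spellings end up in one group.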

GoogleRefine (now OpenRefine)
Functionality for finding anomalies/outliers
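The slide does not say how the tool finds outliers. As one common illustration of the idea, here is a minimal Python sketch using the interquartile-range rule; the function name, the multiplier `k`, and the sample values are assumptions for illustration only.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical response times in seconds, with one suspicious value.
times = [10, 12, 11, 13, 12, 11, 95]
print(iqr_outliers(times))
```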

GoogleRefine (now OpenRefine)
Functionality for automatically repeating the same process on a new data set
*Really* nice for cases where you complete a complex process and want to repeat it
– Replicates a really good logbook, which most data analysts don’t keep
– Now seen in other tools like iPython Notebook
– Still not in Excel, but Excel has been stagnant for years
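The "logbook" idea can be sketched in a few lines of Python: record each transformation as a named step, then replay the same list on a new data set. This is only an illustration of the concept, not the tool's own format; the column names (`student`, `time_ms`) and the step functions are hypothetical.

```python
def normalize_student(row):
    """Trim and lowercase the student identifier."""
    return {**row, "student": str(row["student"]).strip().lower()}


def add_seconds(row):
    """Derive a seconds column from the milliseconds column."""
    return {**row, "time_sec": float(row["time_ms"]) / 1000.0}


# The "logbook": an ordered list of (description, function) pairs.
STEPS = [
    ("normalize student id", normalize_student),
    ("convert ms to seconds", add_seconds),
]


def replay(rows, steps):
    """Apply every recorded step, in order, to a copy of the rows."""
    out = [dict(row) for row in rows]
    for _description, step in steps:
        out = [step(row) for row in out]
    return out


# The same steps can be rerun unchanged on any new data set.
new_data = [{"student": " S1 ", "time_ms": 1500}]
print(replay(new_data, STEPS))
```

Keeping the descriptions next to the functions means the list doubles as a readable record of what was done and in what order.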

GoogleRefine (now OpenRefine)
Functionality for connecting your data set to web services to get additional relevant info

GoogleRefine (now OpenRefine)
Can load in and export common but hard-to-work-with data types
– JSON and XML

GoogleRefine (now OpenRefine) Some videos you should watch later AWM Ba0 k

Questions? Comments?

Assignment for next Monday

Iterative Feature Refinement
Select three of the features you have created in previous assignments
These features should be “among the best” of the features you have previously created
For each of these three features, create at least five “close variants” of these features
– “time for last 3 actions” and “time for last 4 actions” are close variants
– “time for last 3 actions” and “total time between help requests and next action” are two separate features
Using the Excel Equation Solver is an OK substitute for creating five “close variants”
If you don’t use the Excel Equation Solver
– As you create the close variants for each feature, don’t just make them all at once
– Make a variant
– Test whether it’s better than the previous variant (by goodness metric)
– If it is, keep going in the same direction
– If it isn’t, try doing the opposite or something else

Iterative Feature Refinement
Write a report that discusses your process
– I took feature N
– I changed it from N to N*
– The goodness changed from G to G*
– Then I did…
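The make-a-variant, test, continue-or-reverse loop from the assignment can be sketched in Python. This is a minimal sketch under stated assumptions: the feature is "time for last n actions", the goodness metric is the absolute Pearson correlation with a label, and the data, function names, and parameter bounds are all hypothetical. Use whatever goodness metric you have been using in class.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def time_for_last_n(durations, n):
    """For each action, total time of that action and the n-1 before it."""
    return [sum(durations[max(0, i - n + 1): i + 1])
            for i in range(len(durations))]


def refine(goodness_of, start, lo, hi):
    """Try close variants one at a time; keep going while goodness improves.

    Moves upward from start first; if the first upward variant is not
    better, tries the opposite direction. Returns the best parameter,
    its goodness, and a log of every (parameter, goodness) pair tried.
    """
    best_n, best_g = start, goodness_of(start)
    log = [(best_n, best_g)]
    for direction in (1, -1):
        improved = False
        n = best_n + direction
        while lo <= n <= hi:
            g = goodness_of(n)
            log.append((n, g))
            if g <= best_g:
                break  # not better: stop going this way
            best_n, best_g, improved = n, g, True
            n += direction
        if improved:
            break  # this direction paid off; skip the opposite one
    return best_n, best_g, log


# Hypothetical data: seconds per action and a binary label per action.
durations = [2.0, 5.0, 1.0, 8.0, 3.0, 9.0, 4.0, 7.0]
labels = [0, 1, 0, 1, 0, 1, 0, 1]


def goodness_of(n):
    return abs(pearson(time_for_last_n(durations, n), labels))


best_n, best_g, log = refine(goodness_of, start=3, lo=1, hi=6)
for (n_old, g_old), (n_new, g_new) in zip(log, log[1:]):
    print(f"Changed n from {n_old} to {n_new}: "
          f"goodness went from {g_old:.3f} to {g_new:.3f}")
```

The printed log mirrors the report format: each line records which variant was tried and how the goodness changed, which is the raw material for the write-up.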

Iterative Feature Refinement
You don’t need to prepare a presentation
But be ready to discuss your features in class

Also for Next Monday
Please read Rodrigo, M.M.T., Baker, R.S.J.d., McLaren, B., Jayme, A., Dy, T. (2012) Development of a Workbench to Address the Educational Data Mining Bottleneck. Proceedings of the 5th International Conference on Educational Data Mining,

Next Classes
3/30 Feature Reuse
– IFR assignment due
4/1 Lab Session: Building Predictive Models
– Come to this if you want to learn more about the theory behind building predictive models; how to do it effectively and appropriately (beyond just the how)
– You don’t need to come to this if you’ve taken Core Methods or Big Data and Education

Thank you!