Feedback – Lab 2 9 Sept 2014. Your learning experience in this course.


1 Feedback – Lab 2 9 Sept 2014

2 Your learning experience in this course

3 Lab Sessions: Text Comprehension & Task Interpretation (Always: point out inaccuracies) Use case 1 – I do not understand the text: go back to the video lecture; you probably have not yet built the background context required to complete the task.

4 … Continued Use case 2 – Oh my god, what am I supposed to do here? Read the text several times and identify the key points in the text structure: - Description - The purpose - Tasks - Pre-processing: feature transformation - Identify the best features by applying your knowledge about empirical error - Interpret the results based on your knowledge about empirical error and your common-sense knowledge or historical research.

5 My expectations on your learning experience Students should be able to interpret the text and the tasks (diverse interpretations are allowed and welcome). Students should be able to show critical thinking by working out one or more plausible interpretations and motivating their choice(s).

6 About instructions and time… I am not sure that the instructions were unclear. The core task is the representation of bin0 and bin1 in order to apply the formulae. This was the cognitive effort of this lab. You could work in groups, and groups could exchange information with each other… and for several hours… And you made it!

7 Pre-processing: feature transformation Categorical features → binary features – Each feature should take the value 0 or the value 1, following the instructions under the heading ”Preprocessing” (search & replace; IF formulae; whatever…)
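
The transformation on this slide can be sketched in a few lines of Python. This is only an illustration: the rows, the column order and the encoding rules below are hypothetical, and the actual rules are the ones given under the heading ”Preprocessing” in the lab text.

```python
def to_binary(value, one_values):
    """Return 1 if the categorical value belongs to one_values, else 0."""
    return 1 if value in one_values else 0


# Hypothetical rows: (sex, passenger class, port of embarkation).
rows = [
    ("female", "1st", "Southampton"),
    ("male", "3rd", "Cherbourg"),
]

# Hypothetical rules: for each column, the set of values that is mapped to 1.
rules = [{"female"}, {"1st"}, {"Southampton"}]

binary_rows = [
    [to_binary(value, ones) for value, ones in zip(row, rules)]
    for row in rows
]
print(binary_rows)
```

A spreadsheet IF formula or a search & replace does the same job; the point is only that every cell ends up as 0 or 1.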

8 The task was about empirical error (Lect 6, min 7:44) Empirical error: how well the chosen hypothesis classifies the training data. How do you assess a hypothesis? – Systematically count the correct guesses and the wrong guesses made by the hypothesis with respect to the correct labels – This means that you must compare the predictions of the hypothesis with the actual labels
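
The counting described above can be written as a small function. This is a minimal sketch; the function name and the toy predictions and labels are made up for illustration and do not come from the lab data.

```python
def empirical_error(predictions, labels):
    """Fraction of training examples where the prediction differs from the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)


# Toy data: 1 = survived, 0 = died.
labels = [1, 0, 0, 1, 0]
predictions = [1, 0, 1, 1, 1]

error = empirical_error(predictions, labels)
print(error)  # 2 wrong guesses out of 5
```

Accuracy is simply 1 minus this value.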

9 Lab Task Our hypotheses were the different features. We have to assess each feature with respect to classification (survived vs died)

10 1) For each feature, calculate the empirical error LEARN TO PREDICT THE FIRST COLUMN – (a) For each of the features calculate (and write down) the training error if you used only that feature to classify the data. To do this you will need to do the following for each feature: – Split the data based on that feature. Call bin0 all examples that have 0 for that feature and bin1 all examples that have 1 for that feature. – Calculate the majority count for the label in each bin, i.e. for bin0, majority(bin0) = max(count(bin0 = survive); count(bin0 = notsurvive))
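
The bin0/bin1 procedure can be sketched as follows. It assumes that each bin predicts its majority label, so the examples counted by majority(bin0) and majority(bin1) are the correct guesses and everything else is an error. The function name and the toy data are invented for illustration.

```python
from collections import Counter


def single_feature_training_error(feature_values, labels):
    """Training error when each bin (feature = 0 or 1) predicts its majority label."""
    if len(feature_values) != len(labels) or not labels:
        raise ValueError("need two non-empty lists of equal length")
    bins = {0: Counter(), 1: Counter()}
    for x, y in zip(feature_values, labels):
        bins[x][y] += 1  # count the labels inside bin0 and bin1
    # majority(bin) = max(count(survive), count(notsurvive)); skip empty bins
    correct = sum(max(counts.values()) for counts in bins.values() if counts)
    return (len(labels) - correct) / len(labels)


# Toy data: one binary feature and the label (1 = survived, 0 = died).
feature = [0, 0, 0, 1, 1, 1, 1, 0]
survived = [0, 0, 1, 1, 1, 0, 1, 0]

training_error = single_feature_training_error(feature, survived)
print(training_error)  # bin0 majority 3 of 4, bin1 majority 3 of 4
```

Repeating the call once per feature column gives one training error per feature.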

11 Accuracy/Error A possible representation…. WATCH OUT! AGE FEATURE IS TRICKY HERE!

12 Other representations (etc. etc.)

13 Which feature would be best to use? EMBARKED… if we trust this sample and our calculations… (the error rate on this feature is the lowest) Basically this means that many of those who started their trip from Southampton did not survive. However, the difference between the features was very small!
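
Choosing the feature with the lowest error, and checking how far ahead it really is, could look like the sketch below. The feature names and error rates are made up purely for illustration; they are not the results of this lab.

```python
def rank_features(errors):
    """Sort (feature, error) pairs from the lowest to the highest error."""
    return sorted(errors.items(), key=lambda item: item[1])


# Made-up error rates, for illustration only.
errors = {"feature_a": 0.40, "feature_b": 0.38, "feature_c": 0.39}

ranking = rank_features(errors)
best_name, best_error = ranking[0]
runner_up_name, runner_up_error = ranking[1]
margin = runner_up_error - best_error

print(best_name, best_error)
print("margin over", runner_up_name, "=", round(margin, 4))
```

A very small margin on a small sample is a reason to be cautious about declaring a winner.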

14 Many interesting interpretations! Nobody believed that Embarked was really a good feature: ”this could depend on the small dataset” ”embarked feature gave the lowest error […] Intuitively the first class feature should have the strongest relationship with the chance of surviving” ”If we calculate accuracy with more features […], we get more interesting results” ”The Embarked would be the best to use because it has the lowest error rate. In reality it is very unlikely that the city has any correlation with their chance of survival, unless they received some special training before boarding or shared a rough upbringing in the city” Etc.

15 Missing values Good that you noticed that there were missing values, i.e. cells without any value! – Some of you have removed them – Some of you have converted them to >25 In practice, missing values require ”more investigation”. Missing values are not considered to be ”noise” in the sense that was explained during the video lecture.
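
The two approaches mentioned on this slide can be compared in a short sketch. It assumes that the age feature is binarised as 1 for ”>25” and 0 otherwise, and that an empty cell is represented as None; the rows and function names are hypothetical.

```python
def drop_missing(rows):
    """Approach 1: remove every example whose age cell is empty."""
    return [(age, label) for age, label in rows if age is not None]


def age_to_binary(age, threshold=25, missing_as=1):
    """Approach 2: binarise age, treating an empty cell as '>25' (missing_as=1)."""
    if age is None:
        return missing_as
    return 1 if age > threshold else 0


# Hypothetical rows: (age, survived); None marks a cell without any value.
rows = [(30, 0), (None, 1), (12, 1), (None, 0), (40, 0)]

dropped = drop_missing(rows)
converted = [(age_to_binary(age), label) for age, label in rows]

print(len(rows), len(dropped))  # how many examples are lost by removal
print(converted)
```

The two approaches leave different examples in bin0 and bin1, so they can give different empirical errors for the age feature.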

16 Technical troubles If you experience problems with a computer (configuration problems, weird behaviour, etc.), just change computer and report the trouble (Per?)

17 Next… Those who have miscalculated the empirical error should recalculate it in the correct way, as presented. Those who want to can get some additional training with an optional task that is on the website. It contains the solution. You do not need to submit anything. It is just for you! All those who have submitted the report have completed this lab task. Well done!

