Automated scoring of student trees

Presentation on theme: "Automated scoring of student trees"— Presentation transcript:

2 Automated scoring of student trees
Two models of algorithmic judgement. (how to tell if a tree is any good, without using your brain)

6 Last winter, some two hundred freshman biology students
were asked to perform a simple task of sorting organisms to indicate how they are related to each other. The goal was to find out, first, how well they understood evolutionary relationships prior to taking Biology 101. The more interesting goal was to understand what sorts of mistakes they made, and to try to understand what they believe about phylogenetic relationships when they enter college. Here are some of the trees they produced.

7

8 Some of the trees are pretty good...

11 ... and some are not so good.

16 We’d like to analyze these trees to determine which ones show some
understanding of the relevant classifications, and which do not. Ideally, we’d like to do this without using expensive human brains to do the sorting. The following slides will show two ways to determine whether trees have organisms grouped into vertebrates and invertebrates, one based on graph relationships and the other based on the organisms’ locations on the screen.

17 Automated analysis of student trees
based on graph relations: the shortest-path method.

21 How can we tell if organisms are well-grouped?
In a tree structure, a group is a set of nodes descended from a common ancestor. This method makes use of this fact to determine group relationships based on nodes’ distances from one another on the graph. We begin by examining the distances between members of the same group – vertebrates – in a well-formed tree.
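
The distance used on the following slides can be read as the number of connections on the path between two organisms. A minimal sketch of that count, assuming the student's tree is available as a list of connected node pairs; the function name `path_length`, the internal node labels, and the example tree are illustrative assumptions, not taken from the study:

```python
from collections import deque

def path_length(edges, start, goal):
    """Count the connections on the shortest path from start to goal.

    Returns None when the two nodes are not connected at all.
    """
    # Build an undirected adjacency map from the list of connections.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search reaches every node by a shortest path first.
    steps = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return steps[node]
        for nxt in neighbors.get(node, ()):
            if nxt not in steps:
                steps[nxt] = steps[node] + 1
                queue.append(nxt)
    return None

# Hypothetical well-formed tree: "V" and "I" are the vertebrate and
# invertebrate branches; "M", "A" and "X" are assumed internal nodes.
edges = [
    ("root", "V"), ("root", "I"),
    ("V", "M"), ("V", "A"), ("I", "X"),
    ("M", "rat"), ("M", "human"), ("A", "bird"),
    ("X", "snail"), ("X", "beetle"),
]

rat_to_human = path_length(edges, "rat", "human")
rat_to_bird = path_length(edges, "rat", "bird")
rat_to_snail = path_length(edges, "rat", "snail")
```

In this assumed tree the rat is 2 connections from the human, 4 from the bird, and 6 from the snail.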

24 Distance from rat to human: 2
Distance from rat to bird: 4
Average distance between vertebrates: 3
Correct groupings.

25 Next, we look at the distances between members of the vertebrates and
the invertebrates on the tree.

28 Distance from rat to snail: 6
Distance from rat to beetle: 6
Average distance from vertebrates to invertebrates: 6
Correct groupings.

29 Average distance from vertebrates to invertebrates: 6
Average distance between invertebrates: 3
Correct groupings.

30 Next, we look at the same distances for an incorrectly structured tree.

33 Distance between rat and human: 6
Distance between rat and bird: 4
Average distance between vertebrates: 5.3
One of these trees has the right idea... and the other hasn't.
Incorrect groupings.

36 Distance from rat to beetle: 2
Distance from rat to snail: 6
Average distance from vertebrate to invertebrate: 4.5
One of these trees has the right idea... and the other hasn't.
Incorrect groupings.

42 In a well-formed tree, the members of a group will be
clustered under one branch of the tree. This means that the average distance between members of a group will be smaller than the average distance from members of the group to non-members. If a tree is not well-formed, the in-group and out-of-group distances will be similar. We can derive a grouping score by dividing the average out-of-group distance by the average in-group distance. A score of one indicates random placement; higher scores indicate greater clustering of vertebrates or invertebrates.

43 Incorrect groupings: out-of-group distance: 5.3, in-group distance: 4.5, grouping score = 1.1
Correct groupings: out-of-group distance: 6, in-group distance: 3, grouping score = 2
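
The grouping score might be computed along these lines. This is a sketch under stated assumptions rather than the presenters' implementation: it averages over every pair of organisms, it expects a fully connected tree, and the two small trees (with internal nodes `V`, `I`, `M`, `A`, `X`) are made up for illustration and are not the student trees shown on the slides.

```python
from collections import deque
from itertools import combinations, product

def distances_from(neighbors, start):
    """Breadth-first search: connection counts from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def grouping_score(edges, group, others):
    """Average out-of-group distance divided by average in-group distance.

    Assumes a connected tree and at least two members in `group`.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    dist = {member: distances_from(neighbors, member) for member in group}
    in_group = [dist[a][b] for a, b in combinations(group, 2)]
    out_group = [dist[a][b] for a, b in product(group, others)]
    return (sum(out_group) / len(out_group)) / (sum(in_group) / len(in_group))

# Hypothetical trees sharing the same internal structure.
INTERNAL = [("root", "V"), ("root", "I"), ("V", "M"), ("V", "A"), ("I", "X")]
good = INTERNAL + [("M", "rat"), ("M", "human"), ("A", "bird"),
                   ("X", "snail"), ("X", "beetle")]
# Same tree with the human and the beetle swapped.
mixed = INTERNAL + [("M", "rat"), ("M", "beetle"), ("A", "bird"),
                    ("X", "snail"), ("X", "human")]

vertebrates = ["rat", "human", "bird"]
invertebrates = ["snail", "beetle"]

good_score = grouping_score(good, vertebrates, invertebrates)
mixed_score = grouping_score(mixed, vertebrates, invertebrates)
```

With these assumed trees the well-grouped tree scores 1.8 and the swapped tree about 0.81, so the ratio drops once vertebrates and invertebrates are interleaved.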

45 This works well, if the student is kind enough to connect
all of the nodes together. But what do we do if the student's tree is not connected?

46 Using convex hulls to determine groupings purely from spatial relationships.

47 This is the “correct grouping” tree, with all of its connections removed.

48 Clearly, we can no longer count connections to establish groups.

49 However, the grouping information can be recovered by examining
the convex hulls enclosing the organisms.

61 A convex hull is the smallest convex curve enclosing a set of points.
If we draw the convex hulls surrounding two sets of points, for example vertebrates and invertebrates, we know that the points are separate groups if their hulls do not collide. This gives us a simple test for whether groups are ideally separated or not. But what if the groups are only a little bit mixed? If we can figure out the minimum number of nodes we have to remove to eliminate the collision, we have an indication of how badly the two groups are mixed. Here’s how our collision elimination algorithm does that:
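
The hull test described above can be sketched as follows. The slides do not say which hull or collision routines were used, so this example swaps in two standard techniques: Andrew's monotone chain for the hull and a separating-axis check for the collision. The screen coordinates are invented, and each group is assumed to have at least three points that are not all on one line.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def hulls_collide(hull_a, hull_b):
    """Separating-axis test for two convex polygons (touching counts as a collision)."""
    for poly in (hull_a, hull_b):
        for i in range(len(poly)):
            (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
            nx, ny = y2 - y1, x1 - x2  # a normal to this edge
            proj_a = [nx * x + ny * y for x, y in hull_a]
            proj_b = [nx * x + ny * y for x, y in hull_b]
            if max(proj_a) < min(proj_b) or max(proj_b) < min(proj_a):
                return False  # found an axis that separates the hulls
    return True

# Invented screen positions for illustration.
vertebrates = [(0, 0), (2, 0), (1, 2), (1, 1)]
invertebrates = [(5, 0), (7, 0), (6, 2)]
# Same layout with one invertebrate dragged in among the vertebrates.
stray_invertebrates = [(1, 0.5), (7, 0), (6, 2)]

separated = not hulls_collide(convex_hull(vertebrates), convex_hull(invertebrates))
mixed = hulls_collide(convex_hull(vertebrates), convex_hull(stray_invertebrates))
```

Here `separated` and `mixed` are both true: the first layout keeps the two hulls apart, while the stray invertebrate pulls its group's hull across the vertebrates.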

62 First, the convex hulls are identified. The area of overlap is shaded in red.

63 Next, the organisms are identified for elimination.

64 The algorithm seeks the node closest to the centroid of the collision area, and removes it.

65 First, the convex hulls are identified. The area of overlap is shaded in red.

66 The process is repeated until there is no area of collision.
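
The loop on the preceding slides can be sketched as below. This is one possible reading, not the presenters' code: the overlap region is found by clipping one hull against the other (Sutherland-Hodgman, a swapped-in standard technique), the centroid is the area centroid of that region, and groups reduced to fewer than three hull points are treated as no longer colliding. All names and coordinates are illustrative assumptions.

```python
def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; counter-clockwise hull vertices."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def clip(subject, clipper):
    """Overlap of two convex counter-clockwise polygons (Sutherland-Hodgman)."""
    output = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        inputs, output = output, []
        for j in range(len(inputs)):
            p, q = inputs[j - 1], inputs[j]
            dp, dq = cross(a, b, p), cross(a, b, q)
            if (dp >= 0) != (dq >= 0):  # edge pq crosses the clip line
                t = dp / (dp - dq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            if dq >= 0:  # q lies on the inner side of the clip line
                output.append(q)
        if not output:
            break
    return output

def area_and_centroid(poly):
    """Shoelace area and area centroid; centroid is None for a zero-area polygon."""
    a = cx = cy = 0.0
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i - 1], poly[i]
        w = x1 * y2 - x2 * y1
        a, cx, cy = a + w, cx + (x1 + x2) * w, cy + (y1 + y2) * w
    a /= 2
    if abs(a) < 1e-12:
        return 0.0, None
    return abs(a), (cx / (6 * a), cy / (6 * a))

def eliminate_collisions(group_a, group_b):
    """Remove the node nearest the overlap centroid until the hulls stop overlapping."""
    a, b, removed = list(group_a), list(group_b), []
    while True:
        hull_a, hull_b = convex_hull(a), convex_hull(b)
        if len(hull_a) < 3 or len(hull_b) < 3:
            break  # assumption: a degenerate hull cannot collide
        area, center = area_and_centroid(clip(hull_a, hull_b))
        if center is None:
            break  # no area of collision remains
        nearest = min(a + b, key=lambda p: (p[0] - center[0]) ** 2 + (p[1] - center[1]) ** 2)
        (a if nearest in a else b).remove(nearest)
        removed.append(nearest)
    return removed

# Invented layout: one invertebrate, at (3, 2), sits among the vertebrates.
vertebrates = [(0, 0), (4, 0), (0, 4), (4, 4)]
invertebrates = [(10, 0), (14, 0), (10, 4), (14, 4), (3, 2)]
misplaced = eliminate_collisions(vertebrates, invertebrates)
```

For this layout `misplaced` is `[(3, 2)]`, the single stray node. Because each pass removes whichever node is nearest the centroid, the count it returns is an estimate of how mixed the groups are and is not guaranteed to be the true minimum.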

67 The misplaced nodes are highlighted, identifying the problem areas for
the student. Further analysis can identify patterns of mistakes, showing areas needing attention from teachers of biology.

68

