Cognitive Computer Vision Kingsley Sage and Hilary Buxton Prepared under ECVision Specific Action 8-3

Cognitive Computer Vision Kingsley Sage khs20@sussex.ac.uk and Hilary Buxton hilaryb@sussex.ac.uk Prepared under ECVision Specific Action 8-3 http://www.ecvision.org

Lecture 9 Recognising visual behaviour
– Bottom-up and top-down vision
– HIVIS (case study)
– Dynamic Bayesian Networks
– Dynamic Decision Networks
– Bayes Automated Taxi (case study)
In this lecture we shall see how some of the techniques we have seen thus far can be used to build real Cognitive Vision Systems

Visual behaviour
In computational terms, visual behaviour can be defined as a functional description of the spatial and/or temporal dynamics of a visual object or set of objects in an environment.
The functional description may be characterised by, for example:
– “simple” tabular form
– set of visual prototypes (facial models …)
– statistical models (HMMs, VLMMs …)
Recognising visual behaviour means finding the fit between the model and observation data.
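As one illustration of "finding the fit between the model and observation data", the sketch below scores a discrete observation sequence under two small HMMs with the forward algorithm and picks the better-fitting one. The models, their parameters, the symbol coding (0 = "straight", 1 = "turn") and the names `forward_log_likelihood` and `best_model` are all assumptions made for this example; none of them come from the lecture.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a non-empty discrete observation sequence under an HMM.

    Uses the forward algorithm with per-step rescaling to avoid underflow.
    """
    n = len(start)
    # Initialise with the first observation.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition model, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Hypothetical models as (start, transition, emission) over symbols 0 and 1.
MODELS = {
    "steady": ([0.9, 0.1], [[0.9, 0.1], [0.5, 0.5]], [[0.9, 0.1], [0.2, 0.8]]),
    "erratic": ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]),
}

def best_model(obs, models=MODELS):
    """Name of the model with the highest log-likelihood for `obs`."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))
```

A call such as `best_model([0, 0, 0, 0, 0, 0])` compares the two likelihoods. In a real system the models would be learned from training trajectories rather than written by hand.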

Visual behaviour © Johnson & Hogg Green circles indicate match with a “normal” trajectory. Red circles indicate “unusual” behaviour

Visual behaviour
Toy example due to Frey & Jojic involving changing structure over time
[Figure: diagram annotated with the probabilities 0.75 and 0.25]
PacMan moves forward at each time step with probability 0.8

Bottom-up / top-down vision
A generalised view
[Figure: block diagram with the labels Image Data (d1, d2, …, dN), FEATURE COMBINATION, CONTROL POLICY (WITH STATE MEMORY), Scene Interpretation, Data Driven, Task Based Control]

Case study: traffic surveillance
HIVIS (Buxton and Howarth)
The task is to identify traffic behaviours (such as overtaking)
Imagery taken from a roundabout in Germany
Buxton and Howarth took two different approaches:
– HIVIS MONITOR (bottom-up)
– HIVIS WATCHER (with top-down control)

HIVIS MONITOR
Space is divided into regions (see right) so we can specify relationships on objects
[Figure: timeline t1 to t8 annotated with the spatial primitives on-road-surface, on-entry-road, stationary, significant orientation change, in-right-turn-region, on-roundabout]
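A minimal sketch of how region-based spatial primitives might be computed is given here. It assumes axis-aligned rectangular regions in ground-plane coordinates; the coordinates, the dictionary `REGIONS` and the function `spatial_primitives` are hypothetical and are not taken from HIVIS, whose regions follow the actual road layout.

```python
# Hypothetical axis-aligned regions: (xmin, ymin, xmax, ymax) in ground-plane units.
REGIONS = {
    "on-road-surface": (0.0, 0.0, 100.0, 100.0),
    "on-entry-road": (0.0, 40.0, 30.0, 60.0),
    "on-roundabout": (30.0, 30.0, 70.0, 70.0),
}

def spatial_primitives(x, y, regions=REGIONS):
    """Return the set of region predicates that hold for the point (x, y)."""
    return {
        name
        for name, (xmin, ymin, xmax, ymax) in regions.items()
        if xmin <= x < xmax and ymin <= y < ymax
    }
```

Evaluating this for each tracked object in each frame gives a per-frame set of predicates, which is the kind of input a temporal description can then be built on.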

HIVIS MONITOR These spatial relationship primitives were then used to build a set of activity primitives using a temporal logic e.g.

HIVIS MONITOR
We can use these logics to build up knowledge of individual object properties and object pairs (like overtaking)
This is fine, but there are drawbacks:
– We examine all objects without considering whether they are relevant to the behaviours we want to detect (e.g. vehicles passing in the background)
– We can only recognise the behaviours after they have taken place. This is OK for an off-line analysis system (post-event analysis) but not useful for a “live” system where we want to be able to detect behaviours as they emerge (for control and prediction)
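To make the idea of building activity primitives from spatial primitives concrete, here is a simple "before, then after" check over a recorded sequence of per-frame predicate sets. It is only a sketch and not the temporal logic used in HIVIS MONITOR; the function name `find_transitions` is an assumption for this example. Because it scans a complete recording, it also shows the post-event character noted above.

```python
def find_transitions(frames, before, after):
    """Frame indices t where `before` held at t-1 and `after` first holds at t.

    `frames` is a list of sets of spatial primitives, one set per time step.
    """
    return [
        t
        for t in range(1, len(frames))
        if before in frames[t - 1]
        and after in frames[t]
        and after not in frames[t - 1]
    ]
```

For example, calling it with `before="on-entry-road"` and `after="on-roundabout"` on one vehicle's frame history returns the frames at which that vehicle moved from the entry road onto the roundabout.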

HIVIS WATCHER
In HIVIS WATCHER, a set of (computationally low-cost) Visual Routines is used to generate pre-attentive objects.
A high-level Control Policy is used to select objects that warrant attention (attentive selection), in this case by the mutual proximity of objects. This control policy determines a “watch” parameter (how useful these objects are to watch).
Deictic (“pointing”) markers are assigned to selected objects.
Behaviour primitives (“episodes”) are formed by applying BBN rules to the objects referenced by the markers.
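Selection by mutual proximity could be sketched as follows. The exponential scoring function, the `scale` and `threshold` values, and the names `watch_scores` and `select_for_attention` are assumptions chosen for illustration; the lecture does not say how the "watch" parameter is actually computed.

```python
import math
from itertools import combinations

def watch_scores(positions, scale=10.0):
    """Score every pair of objects in (0, 1]; closer pairs score higher.

    `positions` maps an object id to its (x, y) ground-plane position.
    """
    scores = {}
    for a, b in combinations(sorted(positions), 2):
        distance = math.dist(positions[a], positions[b])
        scores[(a, b)] = math.exp(-distance / scale)
    return scores

def select_for_attention(positions, threshold=0.5, scale=10.0):
    """Pairs whose hypothetical watch score reaches the threshold."""
    return [
        pair
        for pair, score in watch_scores(positions, scale).items()
        if score >= threshold
    ]
```

Only the selected pairs would then be given markers and passed on to the more expensive episode reasoning, which is where the computational saving of top-down control comes from.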

HIVIS WATCHER
Deictic (“pointing”) reference
Pre-attentive selection assigns markers to objects that warrant attention
For each object, we determine information about the relative position, speed and heading of the object
BBNs are then used to combine data into likely episodes
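The relative position, speed and heading of one marked object with respect to another could be computed along these lines. The `Track` fields and the function `relative_state` are hypothetical, and the sketch assumes positions and velocities are already available in a common ground-plane frame.

```python
import math
from dataclasses import dataclass

@dataclass
class Track:
    """Hypothetical tracked object: position (x, y) and velocity (vx, vy)."""
    x: float
    y: float
    vx: float
    vy: float

def relative_state(ref, other):
    """Distance, speed difference and heading difference of `other` relative to `ref`.

    The heading difference is in radians, wrapped to the interval [-pi, pi).
    """
    distance = math.hypot(other.x - ref.x, other.y - ref.y)
    speed_diff = math.hypot(other.vx, other.vy) - math.hypot(ref.vx, ref.vy)
    heading_ref = math.atan2(ref.vy, ref.vx)
    heading_other = math.atan2(other.vy, other.vx)
    heading_diff = (heading_other - heading_ref + math.pi) % (2 * math.pi) - math.pi
    return distance, speed_diff, heading_diff
```

Quantities like these, discretised into a few ranges, are the sort of evidence that a BBN could combine into an episode label.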

HIVIS WATCHER
Combining spatial relationships into episodes
[Figure: diagram over time with the labels pairs, watch, agents, s, o, deictic state, and episode (overtake, follow, queue, unknown)]

Dynamic Bayesian Networks
Recall from previous lectures that BBNs have Conditional Probability Tables (CPTs) for each node.
This static approach can be extended to reflect the fact that external factors can influence the BBN (structure and CPT values) in a manner that may not be convenient to model by adding further nodes to the network. The result is a DBN.
So, using HIVIS as an example, knowledge of the likely episode at time t influences our belief in the likely episode at time t+1 (like a Markov assumption). Such an assumption can help us, for example, to maintain temporal continuity.
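The way belief at time t carries over to time t+1 can be sketched as a single predict-and-update step over the episode labels. The "sticky" transition matrix, its 0.85 self-transition value and the names `sticky_transition` and `filter_step` are assumptions for this example, not values from HIVIS.

```python
EPISODES = ["overtake", "follow", "queue", "unknown"]

def sticky_transition(n, stay=0.85):
    """Hypothetical transition matrix that favours staying in the same episode."""
    move = (1.0 - stay) / (n - 1)
    return [[stay if i == j else move for j in range(n)] for i in range(n)]

def filter_step(belief, transition, likelihood):
    """One step of recursive Bayesian filtering over discrete episodes.

    belief[i]        : P(episode_t = i | evidence up to t)
    transition[i][j] : P(episode_{t+1} = j | episode_t = i)
    likelihood[j]    : P(evidence at t+1 | episode_{t+1} = j)
    """
    n = len(belief)
    # Predict: push the current belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Update: weight by the new evidence and renormalise.
    unnormalised = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]
```

With a belief concentrated on "follow", a single frame whose evidence slightly favours "overtake" does not immediately flip the estimate, because the prediction step still puts most of the mass on "follow". This is one way to read the temporal continuity mentioned above.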

Dynamic Decision Networks
DDNs are similar in concept, but include the notion of utility U of an action resulting from a decision D

Action           p(Rain)=0.3    p(Sunny)=0.7
Walk to work     -100           50
Drive to work    30             10
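The utility idea can be worked through as a small expected-utility calculation. The sketch assumes the slide's table reads p(Rain) = 0.3 and p(Sunny) = 0.7, with utilities -100 and 50 for walking and 30 and 10 for driving; the names `expected_utility` and `best_action` are introduced here for illustration.

```python
# Assumed reading of the slide's table: weather probabilities and utilities.
WEATHER = {"rain": 0.3, "sunny": 0.7}
UTILITY = {
    "walk": {"rain": -100, "sunny": 50},
    "drive": {"rain": 30, "sunny": 10},
}

def expected_utility(action, probs=WEATHER, utility=UTILITY):
    """Expected utility of an action: sum over outcomes of p(outcome) * U(action, outcome)."""
    return sum(probs[w] * utility[action][w] for w in probs)

def best_action(probs=WEATHER, utility=UTILITY):
    """Action with the highest expected utility."""
    return max(utility, key=lambda a: expected_utility(a, probs, utility))
```

Under these numbers walking has an expected utility of about 5 (0.3 × -100 + 0.7 × 50) and driving about 16 (0.3 × 30 + 0.7 × 10), so driving is the preferred decision.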

Graphical representation of DBNs and DDNs
Reproduced from Forbes/Huang/Kanazawa/Russell 1995
Decisions D are made by some agent (cf. top-down control) and inform the state evolution process. When “looking ahead” we use the utility U to evaluate the cost associated with being in a state.
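A one-step "look ahead" in this spirit can be sketched as follows: each candidate decision changes how the state evolves, and the utility of the predicted next state is used to rank the decisions. The two-state lane model, every probability and utility value, and the names `lookahead_utility` and `best_decision` are invented for this example and are not taken from Forbes et al.

```python
# Hypothetical model: P(next_state | state, decision) for a two-state lane world.
TRANSITION = {
    "keep_lane": {
        "clear": {"clear": 0.9, "blocked": 0.1},
        "blocked": {"clear": 0.2, "blocked": 0.8},
    },
    "change_lane": {
        "clear": {"clear": 0.8, "blocked": 0.2},
        "blocked": {"clear": 0.7, "blocked": 0.3},
    },
}
# Hypothetical utility of being in each state.
STATE_UTILITY = {"clear": 10.0, "blocked": -20.0}

def lookahead_utility(belief, decision, transition=TRANSITION, utility=STATE_UTILITY):
    """Expected utility of the next state if `decision` is taken now."""
    return sum(
        belief[s] * p_next * utility[s_next]
        for s in belief
        for s_next, p_next in transition[decision][s].items()
    )

def best_decision(belief, transition=TRANSITION, utility=STATE_UTILITY):
    """Decision with the highest one-step expected utility."""
    return max(transition, key=lambda d: lookahead_utility(belief, d, transition, utility))
```

A full DDN would look more than one step ahead and would update the belief from sensor evidence between steps; this sketch keeps only the core of the decision and utility interaction.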

Bayes Automated Taxi (BAT)
Forbes, Huang, Kanazawa and Russell 1995
– Forbes et al. were concerned with autonomous vehicles driving on a normal (unadapted) highway using vision
– They developed a driving simulator for the BATmobile to test a DDN-based decision-making module
– They use a set of high-level decision tree structures to decide which actions to pursue …
Reproduced from Forbes/Huang/Kanazawa/Russell 1995

Bayes Automated Taxi (BAT)
Reproduced from Forbes/Huang/Kanazawa/Russell 1995
This is part of the probabilistic network for one vehicle. Smaller nodes with thicker outlines are sensor observations.

Bayes Automated Taxi (BAT) Reproduced from Forbes/Huang/Kanazawa/Russell 1995

Further reading
– “Conceptual descriptions from monitoring and watching image sequences”, R. J. Howarth and H. Buxton, Image and Vision Computing 18, 2000
– “The BATmobile: Towards a Bayesian Automated Taxi”, J. Forbes, T. Huang, K. Kanazawa, S. Russell, 1995

Summary
– Visual behaviour recognition involves modelling the spatial and temporal structure of scenes in terms of objects
– The top-down or task-driven approach has many advantages in computational terms
– Dynamic Bayesian Nets and Dynamic Decision Nets are useful formalisms for real-world applications

Next time … Task based visual control
