Human Performance Metrics for ATM Validation
Brian Hilburn
NLR Amsterdam, The Netherlands

Overview
• Why consider Human Performance?
• How / When is HumPerf considered in validation?
• Difficulties in studying HumPerf
• Lessons Learnt
• Toward a comprehensive perspective… (example data)
Traffic Growth in Europe
[Figure: Actual Traffic and Traffic Forecasts (H), (M), (L); axis: Movements (millions)]
Accident Factors
Why consider HUMAN metrics?
• Unexpected human (ab)use of equipment etc.
• New types of errors and failures
• Costs of real world data are high
• New technologies often include new & hidden risks
• Operator error vs Designer error
• Transition(s) and change(s) are demanding
Implementation (and failure) is very expensive!

Famous Human Factors disasters
• Titanic
• Three Mile Island
• Space shuttle
• Bhopal
• Cali B-757
• Paris A-320
• FAA/IBM ATC
When human performance isn't considered...
…...!!!!!!
What is being done to cope?
Near and medium term solutions
• RVSM
• BRNAV
• FRAP
• Civil Military airspace integration
• Link 2000
• Enhanced surveillance
• ATC tools

ATM: The Building Blocks
• Displays (eg CDTI)
• Tools (eg CORA)
• Procedures (eg FF-MAS Transition)
• Operational concepts (eg Free Flight)
Monitoring in Free Flight: Ops Con drives the ATCo's task!
NLR Free flight validation studies
• Human factors design & measurements
• Ops Con + displays + procedures + algorithms
• Retrofit automation & displays
  – TOPAZ: no safety impairment….
  – no pilot workload increase with..
  – 3 times present en-route traffic
  – delay, fuel & emission savings
• ATC controller impact(s)
  – collaborative workload reduction
• Info at NLR website
The aviation system test bed
[Diagram labels: Data links; Two way Radio; Experiment Scenario Manager; scenario 'events'; System data; Human data]
Evaluating ATCo Interaction with New Tools
• Human Factors trials
• ATCos + Pilots
• Real time sim
• Subjective data
• Objective data also
Objective Measures
• Heart Rate
• Respiration
• Scan pattern
• Pupil diameter
• Blink rate
• Scan randomness
Integrated with subjective instruments...
HEART Analysis Toolkit
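One common way to quantify scan randomness is the Shannon entropy of transitions between display areas. The slides do not say how the HEART Analysis Toolkit computes this measure, so the sketch below is an assumption for illustration only; the function name `scan_entropy` and the area labels are hypothetical.

```python
import math
from collections import Counter


def scan_entropy(fixations):
    """Shannon entropy (bits) of transitions between successive fixated areas.

    `fixations` is an ordered list of area-of-interest labels (hypothetical
    labels such as "radar" or "strips"). Higher values mean the scan moves
    between areas in a less predictable way. This is an assumed definition,
    not necessarily the one used in the trials.
    """
    transitions = list(zip(fixations, fixations[1:]))
    if not transitions:
        return 0.0
    counts = Counter(transitions)
    total = len(transitions)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A strictly alternating scan uses only two distinct transitions.
regular = scan_entropy(["radar", "strips", "radar", "strips", "radar"])
# A scan that never repeats a transition is less predictable.
varied = scan_entropy(["radar", "strips", "datalink", "timeline", "radar"])
```

With this definition the alternating scan scores lower than the varied one, which matches the intuition that a stereotyped scan is less "random".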
Correlates of Pupil Diameter
• Emotion
• Age
• Relaxation / Alertness
• Habituation
• Binocular summation
• Incentive (for easy problems)
• Testosterone level
• Political attitude
• Sexual interest
• Information processing load
• Light reflex
• Dark reflex
• Lid closure reflex
• Volitional control
• Accommodation
• Stress
• Impulsiveness
• Taste
• Alcohol level
Pupil Diameter by Traffic Load
Automation: assistance or burden?
• Conflict detection & resolution tools
• Arrival management tool
• Communication tool
[Display labels: RIVER IBE 326 AMC; Time line; Hand-off; Datalink; Traffic; Pre-acceptance]
Visual scan trace, 120 sec: Low Traffic
Visual scan trace, 120 sec: High Traffic
Positive effect of automation on heart rate variability
Positive effect of automation on pupil size
Better detection of unconfirmed ATC data up-links
No (!) positive effect on subjective workload
Objective vs Subjective Measures
Catch 22 of introducing automation: I'll use it if I trust it. But I cannot trust it until I use it!
Automation & Traffic Awareness
Converging data: The VINTHEC approach
• Team Situation Awareness
EXPERIMENTAL: correlate behavioural markers w physio
VS
ANALYTICAL: Game Theory Predictive Model of Teamwork
Free Routing: Implications and challenges (FRAP)
Implications:
• Airspace definition
• Automation tools
• Training
• ATCo working methods
• Ops procedures
Challenges:
• Operational
• Technical
• Political
• Human Factors
Sim 1: Monitoring for FR Conflicts
• ATS Routes
• Direct Routing
  – Airways plus direct routes
• Free Routes
  – Structure across sectors
Sim 1: Conflict Detection Response Time
[Figure; axis: Response time (secs)]
Studying humans in ATM validation
• Decision making biases: ATC = skilled, routine, stereotyped
• Reluctance: organisational / personal (job threat)
• Operational rigidity: unrealistic scenarios
• Transfer problems: skills hinder interacting with system
• Idiosyncratic performance: system is strategy tolerant
• Inability to verbalise skilled performance: automaticity
Moving from CONSTRUCT to CRITERION: Evidence from CTAS Automation Trials
[Figure: Time-of-flight estimation error, by traffic load and automation level.]
Controller Resolution Assistant (CORA)
• EUROCONTROL Bretigny (F)
• POC: Mary Flynn
• Computer-based tools (e.g. MTCD, TP, etc.)
• Near-term operational
• Two phases
  – CORA 1: identify conflicts, controller solves
  – CORA 2: system provides advisories

CORA: The Challenges
• Technical challenges…
• Ops challenges…
• HF challenges
  – Situation Awareness
  – Increased monitoring demands
  – Cognitive overload
  – Mis-calibrated trust
  – Degraded manual skills
  – New selection / training requirements
  – Loss of job satisfaction

CORA: Experiment
• Controller preference for resolution order
• Context specificity
• Time benefits (Response Time) of CORA
Synthesis of results

Construct              Operationalised Definition    Result
SA                     |ATA-ETA|                     Auto x Traf
Workload               PupDiam TX - PupDiam base     Datalink display reduces WL
Dec Making/Strategies  Response bias                 Intent benefits
Vigilance              RT to Alerts                  FF = CF
Attitude               Survey responses              FF OK, but need intent info
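The first two operational definitions in the synthesis (SA as |ATA-ETA|, workload as pupil diameter under treatment minus baseline) can be sketched as follows. This is a minimal illustration, not the analysis code used in the trials. It assumes ATA and ETA are actual and estimated times of arrival in seconds, and it assumes pupil samples are averaged before differencing; the function names and sample values are hypothetical.

```python
from statistics import mean


def sa_error(ata_s, eta_s):
    """SA criterion: absolute error |ATA - ETA|, in seconds.

    Assumes ATA is the actual and ETA the controller-estimated time of
    arrival. A smaller error is read as better traffic prediction.
    """
    return abs(ata_s - eta_s)


def workload_index(treatment_pupil_mm, baseline_pupil_mm):
    """Workload: mean pupil diameter under treatment minus mean baseline (mm).

    Averaging the samples first is an assumption of this sketch.
    """
    return mean(treatment_pupil_mm) - mean(baseline_pupil_mm)


# Hypothetical values, for illustration only.
err = sa_error(ata_s=600.0, eta_s=570.0)
wl = workload_index([4.0, 4.2], [3.5, 3.7])
```

Expressing each construct as a plain number like this is what lets HF results be set beside operational criteria such as prediction accuracy.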
Validation strategy
• Full Mission Simulation
  – Address human behaviour in the working context
• Converging data sources (modelling, sim (FT,RT), etc)
• Comprehensive data (objective and subjective)
• Operationalise terms (SA, WL)
• Assessment of strategies
  – unexpected behaviours, or covert Dec Making strategies
Human Performance Metrics: Potential Difficulties
• Participant reactivity
• Cannot probe infrequent events
• Better links sometimes needed to operational issues
• Limits of some (eg physiological) measures
  – intrusiveness
  – non-monotonicity
  – task dependence wrt reliability, sensitivity
  – time-on-task, motor artefacts
• Partial picture
  – motivational, social, organisational aspects
Using HumPerf Metrics
• Choose correct population
• Battery of measures for converging evidence
• Adequate training / familiarisation
• Recognise that behaviour is NOT inner process
• More use of cog elicitation techniques
• Operator (ie pilot / ATCo) preferences
  – Weak experimentally, but strong organisationally?

Validation metrics: Comprehensive and complementary
• Subj measures easy, cheap, face valid
• Subj measures can tap acceptance (wrt new tech)
• Objective and subjective can dissociate
• Do they tap different aspects (eg of workload)?
  – Eg training needs identified
• Both are necessary, neither sufficient

Operationalise HF validation criteria
• HF world (SA, Workload) vs
• Ops world (Nav accuracy, efficiency)
• Limits dialogue between HF and Ops world
• Moving from construct (SA) to criterion (traffic prediction accuracy)

Summing Up: Lessons Learnt
• Perfect USER versus perfect TEST SUBJECT (experts?)
• Objective vs Subjective Measures
  – both necessary, neither sufficient
• Operationalise terms: pragmatic, bridge worlds
• Part task testing in design; Full mission validation
• Knowledge elicitation: STRATEGIES
Summing Up (2)...
Why consider Human Performance?
  » New ATM tools etc needed to handle demand
  » Humans are essential link in system
How / When is HumPerf considered in validation?
  » Often too little too late…
Lessons Learnt
  » Role of objective versus subjective measures
  » Choosing the correct test population
  » Realising the potential limitations of experts
Toward a comprehensive perspective…
  » Bridging the experimental and operational worlds
Thank You... for further information: Brian Hilburn NLR Amsterdam tel: