Published by Marvin Rogers. Modified over 6 years ago.
1
Institutionalizing a Culture of Statistical Thinking in DoD Testing
Dr. Catherine Warner, Science Advisor, September 2017
2
Outline
Overview of DoD Testing
Improving Operational Testing
Statistical Analysis Methods for Improving Mission Characterization
Continuing the Path Forward
Bayesian Methods for Maximizing Information
Defensible Surveys – Capturing Human Interactions
Improving Modeling and Simulation
Looking to the Future
3
Goal of Operational Test: Evaluate Operational Effectiveness and Suitability
Operational Environment, Representative Users, “Real” Threats, Conducting Missions
4
Tend to be requirements driven
DoD Test Paradigm
Test Timeline: Contractor Testing, Developmental Testing, Operational Testing
5
Requirements documents are often missing important mission considerations
6
OT characterizes mission capability
Test Timeline: Contractor Testing, Developmental Testing, Operational Testing
7
By the early 1980s, Congress’ concerns were growing
8
Congress established DOT&E separate from the Services’ operational testing agencies
Department of Defense
Office of the Secretary of Defense: Director, Operational Test and Evaluation (DOT&E)
9
Congress established DOT&E separate from the Services’ operational testing agencies
Department of Defense
Office of the Secretary of Defense: Director, Operational Test and Evaluation (DOT&E)
Army, Navy & Marines, Air Force: Service Operational Testing Agencies
11
Operational testing provides critical information to warfighters about new systems… Before warfighters’ lives and missions depend on them Photo notes: U.S. Air Force Master Sgt. Lance Cheung photographs himself and a three-ship formation of F-15E Strike Eagle aircraft from Royal Air Force Lakenheath, England, on Aug. 3, The Strike Eagles are attached to the 492nd Fighter Squadron and are practicing basic surface attack techniques. Cheung is a photojournalist for the Air Force News Agency in San Antonio, Texas.
12
Operational testing provides critical information to warfighters about new systems… Before warfighters’ lives and missions depend on them
Time to correct problems
Time to restrict missions
13
Improving Operational Testing
14
Why did we need to improve test methods?
[Two figures from the DOT&E EA-18G BLRIP; axis label: Percent Success]
15
DOT&E Sets Policy and Guidance for Conducting Operational Testing
The goal of the experiment. This should reflect evaluation of end-to-end mission effectiveness in an operationally realistic environment.
Quantitative mission-oriented response variables for effectiveness and suitability. (These could be Key Performance Parameters but most likely there will be others.)
Factors that affect those measures of effectiveness and suitability. Systematically, in a rigorous and structured way, develop a test plan that provides good breadth of coverage of those factors across the applicable levels of the factors, taking into account known information in order to concentrate on the factors of most interest.
A method for strategically varying factors across both developmental and operational testing with respect to responses of interest.
Statistical measures of merit (power and confidence) on the relevant response variables for which it makes sense. These statistical measures are important to understanding "how much testing is enough?" and can be evaluated by decision makers on a quantitative basis so they can trade off test resources for desired confidence in results.
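The trade-off between test resources and power can be illustrated with a small calculation. The sketch below is not taken from the DOT&E guidance; it assumes a simple comparison of two conditions with a two-sided z-test, a known common standard deviation, and equal numbers of runs per condition. The function names and the effect size of 1.0 standard deviations are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size: true difference in means divided by the common standard
                 deviation (assumed known for this sketch).
    n_per_group: number of test runs in each of the two conditions.
    alpha:       1 minus the confidence level.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the test statistic falls in either rejection region.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def runs_needed(effect_size, target_power=0.80, alpha=0.05, max_n=10_000):
    """Smallest number of runs per condition that reaches target_power."""
    for n in range(2, max_n + 1):
        if power_two_sample(effect_size, n, alpha) >= target_power:
            return n
    return None  # target not reachable within max_n runs


if __name__ == "__main__":
    # Hypothetical effect size of one standard deviation.
    for n in (4, 8, 16, 32):
        print(n, round(power_two_sample(1.0, n), 3))
    print("runs per condition for 80% power:", runs_needed(1.0))
```

Relaxing the confidence level (a larger alpha) or accepting a larger detectable effect lowers the number of runs, which is the kind of quantitative trade a decision maker can weigh.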
16
Laying the foundations for statistical methods in T&E
Research Consortium Offsite Meeting Charter Statistical Engineering with NASA AO Training, OTA Training
17
Puzzled??
18
Sharing lessons learned advanced our mutual understanding
19
Without a destination, any path will do
Institutionalizing Statistical Thinking in Test and Evaluation
Timeline: 1998, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017
National Research Council Study: Design of Experiments endorsed as a sound methodology for OT&E
OTA MOA on DOE
DOT&E Initiatives
Guidance on DOE in TEMPs
DOT&E Policy Issued
OTA Test Design Processes Updated
DOT&E Science Advisor Established
“Test Science Roadmap” effort
DOT&E/TRMC funded Science of Test Research Consortium
DOT&E TEMP Guide Published
DASD (DT&E) STAT Implementation Plan
STAT COE
DOT&E Roadmap Report
Two Additional DOT&E Guidance memos on Application of DOE to OT&E
Survey Best Practices Memo
Cybersecurity Procedures
Additional Survey and cyber work
Modeling and simulation validation guidance
Cyber priorities
Updated TEMP Guidance
M&S Guidance
20
Lessons Learned from Implementing DOE
Strong leadership
Communicate, communicate, communicate
Find partners
Compromise
Be open to new ideas
Create quick successes and highlight them
Support the workforce
21
Statistical Analysis Methods for Improving Mission Characterization
22
Statistical analyses maximize information
23
Statistical models capture important interactions – Apache FOT&E
L16 has a bigger effect on low-density battlefields.
Statistical result: L16 targeting data and battlefield density were statistically significant; light was not. The two-factor interaction between battlefield density and L16 targeting data was significant. 80% confidence intervals shown.
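A two-factor interaction of this kind can be illustrated with a balanced two-level factorial. The sketch below is not the Apache FOT&E analysis; it assumes two factors coded as -1 and +1 (for example, L16 targeting data off/on and battlefield density low/high), and the response values are made up purely for illustration.

```python
from statistics import mean


def factorial_effects(cells):
    """Main effects and interaction for a balanced 2x2 factorial.

    cells maps (a, b), each coded -1 or +1, to a list of observed responses.
    Each effect is reported as half the change in the cell means, which
    matches the coefficients of a regression on the -1/+1 coded factors.
    """
    m = {key: mean(values) for key, values in cells.items()}
    main_a = (m[(1, -1)] + m[(1, 1)] - m[(-1, -1)] - m[(-1, 1)]) / 2
    main_b = (m[(-1, 1)] + m[(1, 1)] - m[(-1, -1)] - m[(1, -1)]) / 2
    # Interaction: how much the effect of A differs between the levels of B.
    effect_a_at_b_high = m[(1, 1)] - m[(-1, 1)]
    effect_a_at_b_low = m[(1, -1)] - m[(-1, -1)]
    interaction = (effect_a_at_b_high - effect_a_at_b_low) / 2
    return {"A": main_a, "B": main_b, "AxB": interaction}


if __name__ == "__main__":
    # Hypothetical responses; A = targeting data (off/on), B = density (low/high).
    example = {
        (-1, -1): [10, 12],
        (1, -1): [4, 6],
        (-1, 1): [11, 13],
        (1, 1): [9, 11],
    }
    print(factorial_effects(example))
```

A nonzero interaction term means the effect of one factor depends on the level of the other, which averaging over conditions would hide. Judging significance would additionally require an estimate of run-to-run variability, which this sketch omits.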
24
Continuing the Path Forward
25
Bayesian methods provide flexibility in combining information – Stryker Family of Vehicles Reliability Shows
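One simple way to see how Bayesian methods combine information is a conjugate update of a failure rate. The sketch below is not the model used for the Stryker analysis; it assumes a constant failure rate (exponential distances between failures) with a gamma prior, and the prior parameters and test data are hypothetical.

```python
def update_failure_rate(prior_shape, prior_rate, failures, exposure):
    """Gamma-exponential conjugate update for a constant failure rate.

    prior_shape, prior_rate: gamma prior parameters; they can be read as
        'prior failures' and 'prior exposure' (for example, from earlier testing).
    failures: number of failures observed in the new test.
    exposure: miles or hours accumulated in the new test.
    Returns the posterior (shape, rate).
    """
    return prior_shape + failures, prior_rate + exposure


def posterior_mean_rate(shape, rate):
    """Posterior mean of the failure rate (failures per unit exposure)."""
    return shape / rate


def posterior_mean_mtbf(shape, rate):
    """Posterior mean of exposure between failures; requires shape > 1."""
    if shape <= 1:
        raise ValueError("posterior mean of 1/rate is undefined for shape <= 1")
    return rate / (shape - 1)


if __name__ == "__main__":
    # Hypothetical prior worth 2 failures in 2000 miles, then 3 failures in 4000 miles.
    shape, rate = update_failure_rate(2, 2000, failures=3, exposure=4000)
    print("posterior:", shape, rate)
    print("mean miles between failures:", posterior_mean_mtbf(shape, rate))
```

How much weight the earlier data should receive is a modeling decision; here it enters only through the hypothetical prior parameters.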
26
Sometimes mission outcome is subjective
Survey regarding improved situational awareness: Strongly agree, Agree, Slightly agree, Slightly disagree, Disagree, Strongly disagree
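Responses on a six-point scale like this one can be summarized quantitatively. The sketch below is one possible summary, not a method prescribed by the presentation; the numeric coding (1 for "Strongly disagree" up to 6 for "Strongly agree"), the function name, and the sample responses are assumptions for illustration.

```python
from collections import Counter
from statistics import median_low

# Scale ordered from least to most agreement; codes run from 1 to 6.
SCALE = [
    "Strongly disagree",
    "Disagree",
    "Slightly disagree",
    "Slightly agree",
    "Agree",
    "Strongly agree",
]
CODE = {label: i + 1 for i, label in enumerate(SCALE)}


def summarize(responses):
    """Counts per category, median category, and share on the agree side."""
    codes = [CODE[r] for r in responses]
    counts = Counter(responses)
    return {
        "counts": {label: counts.get(label, 0) for label in SCALE},
        # median_low always returns an observed category for ordinal data.
        "median": SCALE[median_low(codes) - 1],
        "share_agree": sum(c >= 4 for c in codes) / len(codes),
    }


if __name__ == "__main__":
    # Hypothetical operator responses.
    sample = ["Agree", "Strongly agree", "Slightly disagree", "Agree",
              "Slightly agree", "Disagree", "Agree"]
    print(summarize(sample))
```

Reporting the full distribution alongside the median keeps the ordinal nature of the scale visible, rather than treating the codes as interval measurements.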
27
Guidance highlighted key concepts for improving surveys
Surveys are appropriate for quantitatively measuring operator and maintainer thoughts and opinions
Have an administration plan for surveys and only use surveys when appropriate
Use the right survey
Empirically vetted surveys should be used to measure known constructs (e.g., workload, usability, trust)
Custom surveys should be used appropriately
Follow best practices for writing questions
Always pre-test
Avoid asking questions without a clear analysis plan
Use interviews and focus groups for problem identification and general context
Do not develop lengthy exhaustive surveys about every problem that could occur
28
Live-Virtual-Constructive simulations can help us learn more… but they need better validation
Operational Space Range of threat types Threat lethality
29
Live-Virtual-Constructive simulations can help us learn more… but they need better validation
Operational Space Range of threat types Modeling and simulation Threat lethality
30
Live-Virtual-Constructive simulations can help us learn more… but they need better validation
Operational Space Range of threat types Threat lethality Modeling and simulation live testing
31
We continue to increase the statistical defensibility of DoD Test and Evaluation
Timeline: 1998, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017
National Research Council Study: Design of Experiments endorsed as a sound methodology for OT&E
OTA MOA on DOE
DOT&E Initiatives
Guidance on DOE in TEMPs
DOT&E Policy Issued
OTA Test Design Processes Updated
DOT&E Science Advisor Established
“Test Science Roadmap” effort
DOT&E/TRMC funded Science of Test Research Consortium
DOT&E TEMP Guide Published
DASD (DT&E) STAT Implementation Plan
STAT COE
DOT&E Roadmap Report
Two Additional DOT&E Guidance memos on Application of DOE to OT&E
Survey Best Practices Memo
Cybersecurity Procedures
Additional Survey and cyber work
Modeling and simulation validation guidance
Cyber priorities
Updated TEMP Guidance
M&S Guidance
32
Future Test Challenges
33
We need to think carefully about the challenges we face in the future
Space, Cyber, Autonomy, Big data, Workforce