Institutionalizing a Culture of Statistical Thinking in DoD Testing

1 Institutionalizing a Culture of Statistical Thinking in DoD Testing
Dr. Catherine Warner, Science Advisor, September 2017

2 Outline
Overview of DoD Testing
Improving Operational Testing
Statistical Analysis Methods for Improving Mission Characterization
Continuing the Path Forward
Bayesian Methods for Maximizing Information
Defensible Surveys – Capturing Human Interactions
Improving Modeling and Simulation
Looking to the Future

3 Goal of Operational Test: Evaluate Operational Effectiveness and Suitability
Operational Environment; Representative Users; “Real” Threats; Conducting Missions

4 DoD Test Paradigm
Test timeline: Contractor Testing, Developmental Testing, Operational Testing. Tend to be requirements driven.

5 Requirements documents are often missing important mission considerations

6 OT characterizes mission capability
Test timeline: Contractor Testing, Developmental Testing, Operational Testing

7 By the early 1980s, Congress’ concerns were growing

8 Congress established DOT&E separate from the Services’ operational testing agencies
Department of Defense > Office of the Secretary of Defense > Director, Operational Test and Evaluation (DOT&E)

9 Congress established DOT&E separate from the Services’ operational testing agencies
Department of Defense. Office of the Secretary of Defense: Director, Operational Test and Evaluation (DOT&E). Army, Navy & Marines, Air Force: Service Operational Testing Agencies.

11 Operational testing provides critical information to warfighters about new systems… Before warfighters’ lives and missions depend on them Photo notes:  U.S. Air Force Master Sgt. Lance Cheung photographs himself and a three-ship formation of F-15E Strike Eagle aircraft from Royal Air Force Lakenheath, England, on Aug. 3, The Strike Eagles are attached to the 492nd Fighter Squadron and are practicing basic surface attack techniques. Cheung is a photojournalist for the Air Force News Agency in San Antonio, Texas.

12 Operational testing provides critical information to warfighters about new systems… Before warfighters’ lives and missions depend on them: time to correct problems; time to restrict missions.

13 Improving Operational Testing

14 Why did we need to improve test methods?
[Two figures from the DOT&E EA-18G BLRIP; axis label: Percent Success]

15 DOT&E Sets Policy and Guidance for Conducting Operational Testing
The goal of the experiment. This should reflect evaluation of end-to-end mission effectiveness in an operationally realistic environment.
Quantitative mission-oriented response variables for effectiveness and suitability. (These could be Key Performance Parameters but most likely there will be others.)
Factors that affect those measures of effectiveness and suitability. Systematically, in a rigorous and structured way, develop a test plan that provides good breadth of coverage of those factors across the applicable levels of the factors, taking into account known information in order to concentrate on the factors of most interest.
A method for strategically varying factors across both developmental and operational testing with respect to responses of interest.
Statistical measures of merit (power and confidence) on the relevant response variables for which it makes sense. These statistical measures are important to understanding "how much testing is enough?" and can be evaluated by decision makers on a quantitative basis so they can trade off test resources for desired confidence in results.
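The trade-off between test resources and statistical measures of merit can be sketched with a small power calculation. This is only an illustration, not DOT&E's method: it assumes a one-sided two-sample z-test with known run-to-run variability, and the function names, effect size, and confidence level are all hypothetical.

```python
from statistics import NormalDist


def power_one_sided(n_per_group, effect_in_sigmas, confidence):
    """Approximate power of a one-sided two-sample z-test (an assumed test).

    n_per_group      -- test runs in each of two conditions (hypothetical)
    effect_in_sigmas -- true difference in means divided by run-to-run sigma
    confidence       -- 1 - alpha, for example 0.80
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(confidence)
    # Standard error of the difference in means, in units of sigma.
    se = (2.0 / n_per_group) ** 0.5
    return std_normal.cdf(effect_in_sigmas / se - z_crit)


def runs_needed(effect_in_sigmas, confidence, target_power, max_n=1000):
    """Smallest number of runs per condition that reaches the target power."""
    for n in range(2, max_n + 1):
        if power_one_sided(n, effect_in_sigmas, confidence) >= target_power:
            return n
    return None  # target not reachable within max_n runs


if __name__ == "__main__":
    # Hypothetical scenario: detect a one-sigma difference at 80% confidence.
    for n in (4, 8, 16, 32):
        print(n, round(power_one_sided(n, 1.0, 0.80), 3))
    print("runs per condition for 80% power:", runs_needed(1.0, 0.80, 0.80))
```

A table of power against run count like this is one way a decision maker could weigh additional test events against the confidence gained.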

16 Laying the foundations for statistical methods in T&E
Research Consortium Offsite Meeting Charter Statistical Engineering with NASA AO Training, OTA Training

17 Puzzled??

18 Sharing lessons learned advanced our mutual understanding

19 Without a destination, any path will do
Institutionalizing Statistical Thinking in Test and Evaluation
[Timeline figure; year axis: 1998, 2009 to 2017] National Research Council Study (Design of Experiments endorsed as a sound methodology for OT&E); OTA MOA on DOE; DOT&E Initiatives; Guidance on DOE in TEMPs; DOT&E Policy Issued; OTA Test Design Processes Updated; DOT&E Science Advisor Established; “Test Science Roadmap” effort; DOT&E/TRMC funded Science of Test Research Consortium; DOT&E TEMP Guide Published; DASD (DT&E) STAT Implementation Plan; STAT COE; DOT&E Roadmap Report; Two Additional DOT&E Guidance memos on Application of DOE to OT&E; Survey Best Practices Memo; Cybersecurity Procedures; Additional Survey and cyber work; Modeling and simulation validation guidance; Cyber priorities; Updated TEMP Guidance; M&S Guidance

20 Lessons Learned from Implementing DOE
Strong leadership; Communicate, communicate, communicate; Find partners; Compromise; Be open to new ideas; Create quick successes and highlight them; Support the workforce

21 Statistical Analysis Methods for Improving Mission Characterization

22 Statistical analyses maximize information

23 Statistical models capture important interactions – Apache FOT&E
L16 has a bigger effect on low-density battlefields. Statistical result: L16 targeting data and battlefield density were statistically significant; light was not. The two-factor interaction between battlefield density and L16 targeting data was significant. 80% confidence intervals shown.
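To make the idea of a two-factor interaction concrete, here is a minimal sketch that estimates main effects and an interaction from a coded two-level factorial design. The data are synthetic and invented purely for illustration; they are not the Apache FOT&E results, and the factor names, coefficients, and helper functions are assumptions. The sketch estimates effect sizes only and does not compute significance or confidence intervals.

```python
from itertools import product
from statistics import mean

# Hypothetical factor names, each coded -1 (absent/low/night) or +1 (present/high/day).
FACTORS = ("link16", "density", "light")


def synthetic_response(levels):
    """Made-up percent-success response, for illustration only.

    Link 16 data helps, high density hurts, light does nothing, and the
    negative interaction term makes Link 16 help more at low density.
    """
    return (60.0 + 10.0 * levels["link16"] - 5.0 * levels["density"]
            - 4.0 * levels["link16"] * levels["density"])


def build_runs():
    """Full 2^3 factorial: one run per combination of coded levels."""
    runs = []
    for combo in product((-1, 1), repeat=len(FACTORS)):
        levels = dict(zip(FACTORS, combo))
        runs.append((levels, synthetic_response(levels)))
    return runs


def effect(runs, *factors):
    """Contrast estimate: a main effect (one factor) or interaction (several).

    Computed as mean(response | sign = +1) - mean(response | sign = -1),
    where sign is the product of the coded levels of the named factors.
    """
    plus, minus = [], []
    for levels, y in runs:
        sign = 1
        for f in factors:
            sign *= levels[f]
        (plus if sign > 0 else minus).append(y)
    return mean(plus) - mean(minus)


def conditional_effect(runs, factor, given, level):
    """Effect of `factor` using only runs where `given` is held at `level`."""
    subset = [(lv, y) for lv, y in runs if lv[given] == level]
    return effect(subset, factor)


if __name__ == "__main__":
    runs = build_runs()
    print("link16 main effect:", effect(runs, "link16"))
    print("light main effect:", effect(runs, "light"))
    print("link16 x density interaction:", effect(runs, "link16", "density"))
    print("link16 effect at low density:", conditional_effect(runs, "link16", "density", -1))
    print("link16 effect at high density:", conditional_effect(runs, "link16", "density", 1))
```

With these invented coefficients the Link 16 effect is larger when density is low, which is the pattern an interaction term is meant to capture; averaging over density alone would hide it.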

24 Continuing the Path Forward

25 Bayesian methods provide flexibility in combining information – Stryker Family of Vehicles Reliability Shows
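One simple way Bayesian methods combine information from several test phases is a conjugate update of a failure rate. The sketch below assumes a constant failure rate with a gamma prior; the slide does not describe the actual Stryker analysis, so the model, the prior, and every number here are hypothetical.

```python
import random


def gamma_posterior(shape, rate, failures, exposure):
    """Conjugate gamma update for a constant failure rate.

    With failure rate lambda ~ Gamma(shape, rate) and `failures` observed
    over `exposure` (miles or hours), the posterior is
    Gamma(shape + failures, rate + exposure).
    """
    return shape + failures, rate + exposure


def mean_exposure_between_failures(shape, rate):
    """Posterior mean of 1/lambda, which equals rate / (shape - 1) for shape > 1."""
    if shape <= 1:
        raise ValueError("mean of 1/lambda is undefined for shape <= 1")
    return rate / (shape - 1)


def credible_interval(shape, rate, level=0.80, draws=20000, seed=0):
    """Monte Carlo equal-tailed credible interval for 1/lambda."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); scale is 1 / rate.
    samples = sorted(1.0 / rng.gammavariate(shape, 1.0 / rate)
                     for _ in range(draws))
    tail = (1.0 - level) / 2.0
    return samples[int(tail * draws)], samples[int((1.0 - tail) * draws) - 1]


if __name__ == "__main__":
    # Hypothetical prior, roughly "worth" 2 failures in 1000 miles.
    shape, rate = 2.0, 1000.0
    # Hypothetical developmental-test data: 5 failures in 6000 miles.
    shape, rate = gamma_posterior(shape, rate, failures=5, exposure=6000.0)
    # Hypothetical operational-test data: 3 failures in 3000 miles.
    shape, rate = gamma_posterior(shape, rate, failures=3, exposure=3000.0)

    print("posterior mean miles between failures:",
          round(mean_exposure_between_failures(shape, rate), 1))
    print("80% credible interval:", credible_interval(shape, rate))
```

Because the update is sequential, each phase's data tightens the estimate from the previous one. In practice an analyst might down-weight earlier phases that are less operationally realistic, which this sketch does not attempt.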

26 Sometimes mission outcome is subjective
Survey regarding improved situational awareness. Response scale: Strongly agree, Agree, Slightly agree, Slightly disagree, Disagree, Strongly disagree

27 Guidance highlighted key concepts for improving surveys
Surveys are appropriate for quantitatively measuring operator and maintainer thoughts and opinions
Have an administration plan for surveys and only use surveys when appropriate
Use the right survey
Empirically vetted surveys should be used to measure known constructs (e.g., workload, usability, trust)
Custom surveys should be used appropriately
Follow best practices for writing questions
Always pre-test
Avoid asking questions without a clear analysis plan
Use interviews and focus groups for problem identification and general context
Do not develop lengthy exhaustive surveys about every problem that could occur

28 Live-Virtual-Constructive simulations can help us learn more… but they need better validation
Operational space: range of threat types; threat lethality

29 Live-Virtual-Constructive simulations can help us learn more… but they need better validation
Operational space: range of threat types; threat lethality. Coverage shown: modeling and simulation

30 Live-Virtual-Constructive simulations can help us learn more… but they need better validation
Operational space: range of threat types; threat lethality. Coverage shown: modeling and simulation; live testing

31 We continue to increase the statistical defensibility of DoD Test and Evaluation
[Timeline figure; year axis: 1998, 2009 to 2017] National Research Council Study (Design of Experiments endorsed as a sound methodology for OT&E); OTA MOA on DOE; DOT&E Initiatives; Guidance on DOE in TEMPs; DOT&E Policy Issued; OTA Test Design Processes Updated; DOT&E Science Advisor Established; “Test Science Roadmap” effort; DOT&E/TRMC funded Science of Test Research Consortium; DOT&E TEMP Guide Published; DASD (DT&E) STAT Implementation Plan; STAT COE; DOT&E Roadmap Report; Two Additional DOT&E Guidance memos on Application of DOE to OT&E; Survey Best Practices Memo; Cybersecurity Procedures; Additional Survey and cyber work; Modeling and simulation validation guidance; Cyber priorities; Updated TEMP Guidance; M&S Guidance

32 Future Test Challenges

33 We need to think carefully about the challenges we face in the future
Space; Cyber; Autonomy; Big data; Workforce

