1
Ten years with Evidence-Based Software Engineering. What is it? Has it had any impact? What’s next?
Keynote, Thursday, Nov 20th. Magne Jørgensen.
Abstract: An evidence-based software engineer is one who is able to: 1) formulate a question, related to a decision or judgment, so that it can be answered by the use of evidence, 2) collect, critically evaluate and summarise relevant evidence from research, practice and local studies, and 3) apply the evidence, integrated with knowledge about the local context, to guide decisions and judgments. The keynote reflects on what it means in practice to be evidence-based in software engineering contexts, where the number of different contexts is high and the research-based evidence is sparse, and why there is a need for more evidence-based practices. We summarise our experience from ten years of evidence-based software engineering in the context of university courses, training of software engineers and systematic literature reviews of software engineering research. While there are challenges in training people in evidence-based practice, our experience suggests that it is feasible and that the training can make an important difference to the quality of software engineering judgments and decisions. Based on our experience we suggest changes in how evidence-based software engineering should be presented and taught, and how we should ease the transfer of research results into evidence-based practices.
Speaker notes: You see it when you believe it. Rain dance.
2
Initial papers on evidence-based software engineering Barbara Kitchenham Tore Dybå Magne Jørgensen
3
Evidence-based software engineering (EBSE)
The main steps of EBSE are as follows:
1. Convert a relevant problem or need for information into an answerable question.
2. Search the literature and practice-based experience for the best available evidence to answer the question (+ create your own local evidence, if needed).
3. Critically appraise the evidence for its validity, impact, and applicability.
4. Integrate the appraised evidence with practical experience and the client's values and circumstances to make decisions about practice.
5. Evaluate performance in comparison with previous performance and seek ways to improve it.
4
Most (93%) of our communication is non-verbal
In many situations involving myths and over-simplifications there will be no reference to evidence. In this case, however, there seems to be agreement that the main source of evidence, at least for the 93% claim, is a paper from 1967 by Mehrabian and Wiener [4]. The design of this study was as follows: the participants hear a person saying the positive words “honey”, “thanks” and “dear”, the neutral words “maybe”, “really”, “oh” and the negative words “don’t”, “brute” and “terrible”. All these words are spoken in a positive, a neutral and a negative voice, and the participants are asked to assess “… the feelings of a speaker towards the person whom she is addressing”. The authors found, not surprisingly, that if a person said, for example, “thanks” in a negative voice to another person, most observers would not emphasise the positive content of the word, but instead the negative voice, when assessing to what extent the speaker liked the other person. This study, together with a follow-up study in which facial expression was added, also published in 1967, is the source of the claim that 93% of communication is non-verbal. We doubt, however, that anybody who had actually read and understood the design of the original studies would believe that they could be used as evidence in support of the general claim that 93% (or most) of communication is non-verbal.
5
Software Chaos? Huge improvement 1994-1998?
The average cost overrun in IT projects is reported to be 189% (Standish Group, 1994 study, still the most referenced “software crisis” documentation). Quality studies show overruns of typically 30%. The Standish Group asked for problem projects and summarised the results as if they were representative. A mistake: an actual cost of 189% of the estimate presented as a 189% cost overrun! Why did/do we believe this number? Why do we accept the EXCEPTION as typical? We remember exceptions best? It fits the feeling that estimation is difficult? It surely applies to the others (who are worse than me)? It serves a purpose? Selling consultancy services. Even if I am bad at estimating, I am better than average. (Page 13 of their 1994 CHAOS report): “We then called and mailed a number of confidential surveys to a random sample of top IT executives, asking them to share failure stories.”
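As an illustration, the arithmetic behind this mistake can be checked in a few lines of Python; the numbers are purely illustrative:

# Illustrative numbers only: an estimate of 100 and an actual cost of 189.
estimate = 100.0
actual = 189.0

ratio = actual / estimate                   # actual cost as a share of the estimate -> 1.89 (189% of the estimate)
overrun = (actual - estimate) / estimate    # cost overrun -> 0.89 (an 89% overrun)

print(f"Actual cost is {ratio:.0%} of the estimate, i.e. a {overrun:.0%} cost overrun")
# A true 189% overrun would require an actual cost of 289, not 189.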
6
Difficult (but possible) to debunk myths
7
But new ones are arriving all the time … 14% of Waterfall and 42% of Agile projects are successful (source: The Standish Group, The Chaos Manifesto 2012). Example of the representativeness fallacy: at a seminar in Oslo (last week), the providers reported that on average 12% of their projects failed (were cancelled or did not deliver much benefit), but when (immediately after) faced with the common finding that about 10-15% of projects fail, 45% said that they thought this number would be higher, and 0% thought it would be lower. Media focus on failure gives a quite biased picture! Successful = “On cost, on schedule and with specified functionality”. Can you spot a serious error in this comparison?
8
“I see it when I believe it” vs “I believe it when I see it”
Design: Data sets with randomly generated performance data comparing “traditional” and “agile” methods, plus a survey of each developer’s belief in agile methods. Question: How much do you, based on the data set, agree with: “Use of agile methods has caused a better performance when looking at the combination of productivity and user satisfaction.” Result: Previous belief in agile determined what they saw in the random data.
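A minimal sketch, assuming numpy, of how such a random data set could be generated; the distributions and sample size are illustrative assumptions, not the original study material:

import numpy as np

rng = np.random.default_rng(seed=42)
n_projects = 20

# Performance data drawn from the SAME distributions for both methods,
# so any "difference" a reader sees is noise, not a real effect.
data = {
    "traditional": {
        "productivity":      rng.normal(100, 15, n_projects).round(1),
        "user_satisfaction": rng.uniform(1, 7, n_projects).round(1),
    },
    "agile": {
        "productivity":      rng.normal(100, 15, n_projects).round(1),
        "user_satisfaction": rng.uniform(1, 7, n_projects).round(1),
    },
}

for method, measures in data.items():
    print(method,
          "mean productivity:", measures["productivity"].mean().round(1),
          "mean satisfaction:", measures["user_satisfaction"].mean().round(2))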
9
What are the skills needed to practice evidence-based software engineering?
10
Formulate questions meaningful for your context/challenge/problem
The question “Is Agile better than Traditional methods?” is NOT answerable. What is agile? What is traditional? What is better? What is the context?
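One way to turn such a question into an answerable one is the PICOC structure commonly recommended for systematic reviews; a minimal sketch, with hypothetical field values chosen only for illustration:

from dataclasses import dataclass

@dataclass
class AnswerableQuestion:
    population: str    # what kind of projects/teams?
    intervention: str  # which concrete practice, not a vague label?
    comparison: str    # compared with what?
    outcome: str       # which measurable effect counts as "better"?
    context: str       # in which setting does the answer need to hold?

# Hypothetical refinement of "Is Agile better than Traditional methods?"
question = AnswerableQuestion(
    population="web development projects of 5-10 developers",
    intervention="two-week Scrum iterations with daily stand-ups",
    comparison="milestone-based plan-driven development",
    outcome="delivered functionality per person-month and defect density",
    context="in-house development in a mid-sized company",
)
print(question)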
11
Question what statements and claims mean
12
Know how to evaluate argumentation
Data, Claim, Backing, Warrant, Qualifier, Reservation (the elements of Toulmin’s model of argumentation)
13
Know how to use Google Scholar (search for research-based evidence)
14
Know how to collect and evaluate practice-based experience
Methods similar to evaluation of research-based evidence and claims Be aware of “organizational over-learning”
15
Know how to create local evidence
Experimentation is often possible and creates the best evidence: pilot studies, trial-sourcing, and controlled experiments in your own context.
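A minimal sketch of how data from such a local trial might be compared, assuming two pilot teams and task completion times in hours; the data and the choice of a Welch t-test are illustrative assumptions, not from the talk:

from scipy import stats

# Hypothetical task completion times (hours) from a small in-house trial.
method_a = [12.1, 9.8, 14.3, 11.0, 10.5, 13.2, 9.9, 12.8]
method_b = [10.2, 8.9, 11.7, 9.4, 10.1, 9.0, 11.2, 8.7]

# Welch's t-test: does not assume equal variances between the two groups.
t_stat, p_value = stats.ttest_ind(method_a, method_b, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value only says the difference is unlikely to be pure noise;
# whether it matters in your context is still a judgment call.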
16
Any impact from EBSE?
Successes: Trained several hundred software engineers (both students and professionals). Courses in evidence-based software engineering (or closely related courses) created at several universities. Conferences and research projects with this as a core topic. Upcoming journal (within information systems) with this as its topic.
Failures: No systematic evaluation of how well software engineers are able to implement the EBSE steps, and how often they actually use them. No real success story to tell (Company X used this and it led to ……).
Speaker note: Mention Scienta (an EBSE-based company, now with more than 20 employees).
17
What’s next?
Still to be answered: Is it realistic to achieve an evidence-based software engineering profession, or is this only for those with research competence? My opinion: moderately optimistic, based on my experience with teaching this to students and professionals.
Challenges: Not much research (evidence-based medicine is easier). The high number of different contexts makes use of evidence complex. Systematic literature reviews are typically too much “we need more research” plus student experiments, and not very useful for the software industry.
Opportunities: More emphasis on teaching how to collect practice-based evidence (and less on systematic reviews and Google Scholar searches). More emphasis on the generation of local evidence (teaching how to do controlled trials in your own context).
18
The paper clip was invented by a Norwegian
Illustrating mechanisms that make us believe things that aren’t so: we believe it because we want it to be true (and are told that it is true).
19
Google Trends (fashions): worldwide web searches, category: Computers & Electronics, Software
Search terms: software engineering, agile, Kanban, Rational Unified Process. Agile is soon “traditional”. What will be the next fashion? Illustrating that we are very much fashion-driven.
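A rough sketch of how such a trend comparison could be reproduced with the unofficial pytrends package; the exact API, the timeframe, and the category filter shown on the slide are assumptions here and may change, since pytrends is an unofficial client:

from pytrends.request import TrendReq   # unofficial Google Trends client

pytrends = TrendReq()
keywords = ["software engineering", "agile", "kanban", "rational unified process"]

# Timeframe is an assumption; the slide's category filter is not applied here.
pytrends.build_payload(kw_list=keywords, timeframe="all")
trends = pytrends.interest_over_time()
print(trends.tail())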
20
The ease of creating myths:
Are risk-willing or risk-averse developers better?
Study design: research evidence + self-generated argument. Question: “Based on your experience, do you think that risk-willing programmers are better than risk-averse programmers?” Scale: 1 (totally agree) – 5 (no difference) – 10 (totally disagree).
Group A: initially 3.3, at debriefing 3.5, 2 weeks later 3.5.
Group B: initially 5.4, at debriefing 5.0, 2 weeks later 4.9.
Neutral group: average 5.0.