Published by Sheena Jennings. Modified over 8 years ago.
1
Why plagiarism detection software might not catch cheats
Dr Edgar A. Whitley, LSE
2
Background
A three-year HEFCE-funded project on student diversity and academic writing (LSE and Lancaster)
–http://www.lums.lancs.ac.uk/Departments/owt/Research/sdaw/
Lessons learned about international students apply equally to many home students
3
Research assumptions
Some plagiarism is a deliberate attempt to deceive
–Copy someone else’s essay
–Buy or ‘commission’ essays
Much “plagiarism” might be the result of students learning
–To become members of a new academic community
–To do lengthy academic writing
–To do academic writing in an additional language
4
‘Plagiarism detection’ software
5
Turnitin
Used in over 80 countries and by 5,000 institutions (12 million students and educators) worldwide
40 million student papers in its database, growing by 50,000 papers per day
The Turnitin crawler has downloaded over 12 billion Internet pages and updates itself at a rate of 60 million pages per day
6
Summary reports
7
Proper referencing
8
No original work
9
May catch students learning to become part of the academy
May have come from a ‘teaching only’ background (e.g. India)
May have limited experience of using journals and refereed conference papers (e.g. China)
May have limited experience of writing long ‘essays’ (e.g. Greece)
10
Implications for practice
Need to rethink recruitment and selection policies
Need to provide advice about the why and how of referencing
–At the time of need, not administrative convenience
11
More generally
May have limited skills for paraphrasing and critical engagement with the literature (argumentation)
May be unaware of regulations and penalties regarding plagiarism
12
Continued
Need to provide opportunity to learn (i.e. make mistakes) and get feedback
Need to provide clear guidance on what is expected from student work
13
What does this indicate?
14
What might not be being caught?
15
‘Copy’ detection software
Dependent on coverage of database of texts
Dependent on algorithm used to match texts
16
Database coverage
Inevitably limited to a subset of available materials
–Must be in electronic form
–Must be in ‘readable’ electronic form
–Must have access to materials
–Must have up-to-date materials
17
Actual coverage
Some indications
–A total of 15,308 fragments were submitted to Turnitin
–48.4% of fragments were ‘found’ (i.e. similarity index > 25%)
18
Based on our study, there is roughly a 50% chance of going undetected when using random texts taken from the internet
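The arithmetic behind the "50% chance" figure can be sketched as follows. This is only a rough reconstruction: the fragment counts are derived from the rounded 48.4% on the previous slide, so they are approximate, and the variable names are illustrative.

```python
# Figures taken from the "Actual coverage" slide.
submitted = 15308      # fragments submitted to Turnitin
found_rate = 0.484     # share reported as 'found' (similarity index > 25%)

# Approximate counts, reconstructed from the rounded percentage.
found = round(submitted * found_rate)
missed = submitted - found
missed_rate = 1 - found_rate

print(f"found:  about {found} fragments ({found_rate:.1%})")
print(f"missed: about {missed} fragments ({missed_rate:.1%})")
```

On these figures slightly more than half of the fragments (about 51.6%) were not found, which is where the "roughly 50%" estimate comes from.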
19
Matching algorithm
Based on system-specific criteria for what counts as a match, e.g. number of characters
If there is sufficient variation within the matching block, then no match is detected
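A minimal sketch of this kind of criterion, assuming a match means "the longest block of identical consecutive characters is at least `min_chars` long". The threshold of 30 characters and the function names are hypothetical, chosen only for illustration; this is not the criterion of any particular product.

```python
from difflib import SequenceMatcher


def longest_shared_block(a, b):
    """Length, in characters, of the longest block the two texts share."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return match.size


def is_match(a, b, min_chars=30):
    """Flag a match only if the shared block reaches the assumed threshold."""
    return longest_shared_block(a, b) >= min_chars


original = "students may have limited experience of writing long essays"
varied = "students might have little experience in writing lengthy essays"

print(is_match(original, original))  # verbatim copy
print(is_match(original, varied))    # small changes spread through the text
```

The verbatim copy is flagged, while the varied sentence is not: every shared block is interrupted by a changed word before it reaches the threshold.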
20
Turnitin’s algorithm
Based on matching consecutive characters
7 consecutive words + 4 new words will probably never be detected
Minor changes at the right place can mean the difference between detection and non-detection
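The "7 consecutive words + 4 new words" pattern can be illustrated with a toy matcher. This sketch assumes a match needs at least 8 consecutive identical words; that window size, like the function names, is a hypothetical choice for illustration, since the slides do not give Turnitin's actual threshold.

```python
def shingles(words, n):
    """Return the set of all runs of n consecutive words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shares_run(source, suspect, n=8):
    """True if the texts share at least one run of n consecutive words.

    n=8 is an assumed threshold, not a documented value.
    """
    return bool(shingles(source.lower().split(), n)
                & shingles(suspect.lower().split(), n))


def interleave(source, copied=7, new=4, filler="xx"):
    """Copy `copied` words, insert `new` filler words, and repeat."""
    words = source.split()
    out = []
    for start in range(0, len(words), copied):
        out.extend(words[start:start + copied])
        out.extend([filler] * new)
    return " ".join(out)


# Placeholder 40-word source text: "w0 w1 ... w39".
source = " ".join(f"w{i}" for i in range(40))
disguised = interleave(source)

print(shares_run(source, source))          # unmodified copy
print(shares_run(source, disguised))       # 7 copied + 4 new, window of 8
print(shares_run(source, disguised, n=7))  # same text, window of 7
```

With a window of 8 words the disguised text shares no run with the source, because no copied run is longer than 7 words. Lowering the window to 7 catches it, which shows how much depends on where the changes fall relative to the threshold.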
21
Implications
Ability to paraphrase affects likelihood of match being found
Not all misuse of sources will be picked up
–Absence of match does not mean no inappropriate use of sources
22
Why plagiarism detection software might not catch cheats
Some of what is caught is not cheating but learning to become part of an academic community
Some cheating might not be picked up by algorithm and database
23
Who, in your institution, should we inform about our project work?
24
For more information
Resources website
–http://www.sdaw.info
Email E.A.Whitley@lse.ac.uk