Using copy-detection and text comparison algorithms for cross-referencing multiple editions of literary works. A. Zaslavsky, Alejandro Bia, K. Monostori

1 Using copy-detection and text comparison algorithms for cross-referencing multiple editions of literary works
A. Zaslavsky, Alejandro Bia, K. Monostori
School of Computer Science & Software Engineering, Monash University, Australia, A.Zaslavsky@monash.edu.au
Miguel de Cervantes DL, University of Alicante, Alicante, Spain, abia@dlsi.ua.es
European Conference on Digital Libraries, Darmstadt, 2001

2 Overview
- Copy-detection, plagiarism and comparative literary analysis
- Text processing in DLs and humanities research
- Tools and approaches
- MatchDetectReveal architecture
- Cervantes's Quijote DL & MDR
- Conclusion

3 Introduction
- Problems
  - Intellectual property
  - Plagiarism
  - Search results
- Copy-prevention
  - Special hardware
  - Active documents
- Copy-detection
  - Plagiarism.org
  - SCAM
  - Koala
  - sif

4 Copy-detection
- Digital watermarking
  - Codewords
  - Line-shift coding
  - Word-shift coding
  - Feature coding
- String comparison
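
To make the "String comparison" item concrete, here is a minimal sketch of how two editions of a passage could be compared word by word. It uses Python's standard `difflib` module, which is a stand-in chosen for illustration and not the comparison method of any tool named on the slides; the function name `shared_passages`, the `min_words` threshold and the two sample sentences are all assumptions.

```python
from difflib import SequenceMatcher


def shared_passages(text_a, text_b, min_words=3):
    """Return word runs of at least min_words that occur in both texts."""
    words_a = text_a.split()
    words_b = text_b.split()
    # Compare word lists rather than characters; autojunk is disabled so
    # that frequent words are not silently ignored in longer texts.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    passages = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_words:
            passages.append(" ".join(words_a[block.a:block.a + block.size]))
    return passages


# Two hypothetical variant readings of the same sentence.
edition_1 = "in a village of la Mancha the name of which I do not wish to recall"
edition_2 = "in a certain village of la Mancha whose name I do not wish to recall"

for passage in shared_passages(edition_1, edition_2):
    print(passage)
```

Runs shorter than `min_words` are dropped, so isolated common words do not count as overlap between the editions.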

5 Copy-Detection Architecture
- Registration Module
- Comparison Module
- Parsing Module
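
The slide only names the three modules, so the following is a rough sketch of how they might fit together, not the architecture's actual implementation. It assumes overlapping word chunks stored as hashes, a common copy-detection approach that the slides do not confirm; `parse`, `Registry`, `register`, `compare` and `chunk_size` are hypothetical names.

```python
import hashlib
import re


def parse(text, chunk_size=4):
    """Parsing module: lower-case the text and cut it into overlapping word chunks."""
    words = re.findall(r"\w+", text.lower())
    count = max(len(words) - chunk_size + 1, 0)
    return [" ".join(words[i:i + chunk_size]) for i in range(count)]


def _digest(chunk):
    """Hash a chunk so the index stores fixed-size keys instead of raw text."""
    return hashlib.sha1(chunk.encode("utf-8")).hexdigest()


class Registry:
    def __init__(self):
        self.index = {}  # chunk hash -> set of document ids containing it

    def register(self, doc_id, text):
        """Registration module: add every chunk of a document to the index."""
        for chunk in parse(text):
            self.index.setdefault(_digest(chunk), set()).add(doc_id)

    def compare(self, text):
        """Comparison module: fraction of the text's chunks found in each registered document."""
        chunks = parse(text)
        hits = {}
        for chunk in chunks:
            for doc_id in self.index.get(_digest(chunk), ()):
                hits[doc_id] = hits.get(doc_id, 0) + 1
        return {doc_id: n / len(chunks) for doc_id, n in hits.items()}


registry = Registry()
registry.register("edition_1", "one two three four five six")
print(registry.compare("two three four five nine ten"))
```

Because each chunk is looked up independently, the order of passages in the candidate text does not affect the score.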

6 MatchDetectReveal (MDR)
- Internet
- MDR users
- MDR customizer
- matching engine
- format converter
- search engine
- visualiser
- local repository
- matching rule DB
- indexes
- Similarity & overlap rule interpreter
- IEEE DL
- ACM DL
- Local cluster
- Global resources
- Base Document Set Generator

7 Example screen dump

8 Conclusion
- Comparative analysis of editions
- Cleaning up OCR output
- Performance
- Text ordering not necessary
- Fine granularity of overlap detection

9 Future Work
- Similar blocks of text
- XML output
- Rules for overlap & similarity
- Visualisation of results

