Authorship Attribution By Allison Pollard

What is Authorship Attribution? Authorship attribution is the process of determining who wrote a text when its authorship is unknown or disputed. It is useful when two or more people claim to have written something, or when no one is willing (or able) to say that he or she wrote the piece.

The Basis: A text makes use of all linguistic domains: semantics, syntax, lexicography, phonology (orthography) and morphology. Each of these domains is rule-governed, yet within these rules and among the components, the grammar offers the writer choices. The text as an end product is an outcome of the particular choices taken by its author. This is why each specific text carries the fingerprints of its creator.

The Assumptions: (1) there is a specific single author; (2) there are choices to be made; (3) the author is consistent in his or her preferred choices; (4) these choices are present, and can be detected, in all end products of that author.

Computerized Analysis: Developed in the 1980s. Based on stylometry, the statistical analysis of literary style (quantifying some of the features of an author’s style).

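Method 1: Word or Sentence Length. The origin of stylometry. Word length was first studied in 1887 (T. C. Mendenhall), and the approach was extended to sentence length in 1938 (G. Udny Yule). Neither measure is considered reliable.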
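
The two measures named on this slide can be sketched in a few lines of Python. This is only an illustration under simple assumptions: words are runs of letters and apostrophes, sentences end at ".", "!" or "?", and the function names and sample text are hypothetical, not taken from the presentation.

```python
import re
from collections import Counter

WORD_PATTERN = r"[A-Za-z']+"  # assumption: a word is a run of letters/apostrophes

def word_length_profile(text):
    """Return the relative frequency of each word length found in the text."""
    words = re.findall(WORD_PATTERN, text)
    counts = Counter(len(w) for w in words)
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}

def mean_sentence_length(text):
    """Return the average number of words per sentence (0.0 for empty text)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    word_counts = [len(re.findall(WORD_PATTERN, s)) for s in sentences]
    return sum(word_counts) / len(word_counts)

# Hypothetical sample text, used only to exercise the functions.
sample = "The cat sat. The dog ran away quickly."
profile = word_length_profile(sample)   # {3: 0.75, 4: 0.125, 7: 0.125}
average = mean_sentence_length(sample)  # 4.0
```

Two texts would then be compared by looking at how far apart their profiles are. As the slide notes, such surface measures are not reliable indicators of authorship.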

Method 2: Function Words. Relies on the usage of context-free (“function”) words. Analyzes the frequency, position, or immediate context of these words. The method has been criticized because it cannot reliably distinguish between certain types of literature.
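
As a rough sketch of the frequency part of this method, the Python below counts how often each function word occurs per 1,000 words and compares two such profiles. The word list is a deliberately tiny, hypothetical one, and the distance measure (sum of absolute differences) is an assumption chosen for simplicity, not the measure used in any particular study.

```python
import re
from collections import Counter

# Hypothetical, very short list; real analyses use far larger lists.
FUNCTION_WORDS = ("the", "of", "and", "to", "upon", "by")

def function_word_rates(text, function_words=FUNCTION_WORDS):
    """Return occurrences per 1,000 words for each function word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {w: 0.0 for w in function_words}
    counts = Counter(tokens)
    return {w: 1000 * counts[w] / len(tokens) for w in function_words}

def profile_distance(rates_a, rates_b):
    """Sum of absolute rate differences; smaller means more similar usage."""
    return sum(abs(rates_a[w] - rates_b[w]) for w in rates_a)

# Hypothetical sample text, used only to exercise the functions.
rates = function_word_rates("the power of the people and the right to rule")
```

A disputed text would be attributed to the candidate author whose profile lies closest to it, which only works if the candidates' profiles differ more than the texts of a single author do.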

Method 3: Vocabulary Distributions. Measures the “richness” or “diversity” of an author’s vocabulary. Analyzes the frequency profile of word usage to gauge the extent of the author’s vocabulary.
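
A few common richness measures can be computed directly from the frequency profile. The sketch below is an illustration, not the presentation's own procedure: it assumes the type-token ratio, the count of words used exactly once (hapax legomena), and Yule's K, defined as 10^4 * (sum of i^2 * V_i - N) / N^2, where N is the number of tokens and V_i is the number of distinct words occurring i times. The function name and sample text are hypothetical.

```python
import re
from collections import Counter

def vocabulary_richness(text):
    """Return type-token ratio, hapax count, and Yule's K for the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(tokens)
    if n == 0:
        return {"type_token_ratio": 0.0, "hapax_legomena": 0, "yules_k": 0.0}
    word_counts = Counter(tokens)
    # Frequency spectrum: how many distinct words occur exactly i times.
    spectrum = Counter(word_counts.values())
    sum_i2_vi = sum(i * i * v_i for i, v_i in spectrum.items())
    return {
        "type_token_ratio": len(word_counts) / n,
        "hapax_legomena": spectrum[1],
        "yules_k": 10_000 * (sum_i2_vi - n) / (n * n),
    }

# Hypothetical sample text: 8 tokens, 5 distinct words, 3 used once.
richness = vocabulary_richness("the cat and the dog and the bird")
```

The type-token ratio falls as texts get longer, so it is only meaningful when comparing samples of similar length. Yule's K was designed to be less sensitive to text length.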

Method 4: Content Analysis. Tabulates the frequency of types of words in a text. Aims to reach the denotative or connotative meaning of the text.
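
The tabulation step might look like the following Python sketch. The category dictionary is invented for illustration. Real content analysis relies on much larger, carefully built dictionaries, and the slide does not say which categories were used.

```python
import re
from collections import Counter

# Hypothetical category dictionary, for illustration only.
CATEGORIES = {
    "positive": {"good", "happy", "love"},
    "negative": {"bad", "sad", "hate"},
    "self": {"i", "me", "my"},
}

def category_frequencies(text, categories=CATEGORIES):
    """Return, per category, the share of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    total = len(tokens)
    return {name: (counts[name] / total if total else 0.0) for name in categories}

# Hypothetical sample text, used only to exercise the function.
shares = category_frequencies("I love my happy dog but I hate sad days")
```

The resulting shares give a coarse picture of what a text is about and how it is coloured, which can then be compared across texts or authors.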

Method 5: Neural Networks. Recognize the underlying organization of data, which is vitally important for any pattern recognition problem, and stylometry is one.
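
To make the idea concrete, here is a minimal sketch of the simplest possible neural network, a single perceptron, trained to separate two authors. The slide does not specify a network design, so everything here is an assumption: the two features stand in for per-text rates of two marker words, the numbers are made up, and real stylometric networks use many more features and layers.

```python
def train_perceptron(samples, labels, epochs=20, learning_rate=0.1):
    """Train a single perceptron with the classic error-correction rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if activation > 0 else 0
            error = label - predicted  # 0 when correct, +1 or -1 when wrong
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias

def predict(weights, bias, features):
    """Return 0 or 1, the predicted author label for one feature vector."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0

# Made-up feature vectors: [rate of marker word A, rate of marker word B].
samples = [[3.0, 0.1], [2.8, 0.2], [3.2, 0.0],   # texts by author 0
           [0.2, 0.9], [0.1, 1.1], [0.3, 0.8]]   # texts by author 1
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_perceptron(samples, labels)
disputed = predict(weights, bias, [0.2, 1.0])  # classify an unseen text
```

The network is never told which feature matters; it adjusts its weights until the training texts are separated, then applies the same weights to the disputed text.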

Past Uses (Scholarly): Did Shakespeare write his own plays? Who wrote the Federalist Papers?

Recent Uses (Literary): Determining who wrote the anonymously published novel Primary Colors [Joe Klein]. Targeting suspects for the authorship of the Unabomber’s Manifesto [Ted Kaczynski].

Future Uses (Beyond): Identifying and blocking spam. Detecting lies and flagging potential inconsistencies. Locating the authors of malicious code.
