Evaluation of a Stylometry System on Various Length Portions of Books

Presentation transcript:

Evaluation of a Stylometry System on Various Length Portions of Books
Ida Schulstad, Mark Boga, Cranston Jordan, Kara Pally, Vinnie Monaco, Richard DeStefano, John Stewart, and Charles Tappert
Authentication, Identification, Verification

Stylometry “Stylometry is the application of the study of linguistic style, usually to written language …” and “… is often used to attribute authorship to anonymous or disputed documents” – Wikipedia

Book Text Experiments
In this study, stylometry was used to verify the identity of authors.
Data: 30 authors, with 10 books from each author.
System: a previously developed stylometry system, enhanced with additional features.
The performance of the stylometry system was measured on these literary texts, in particular how much performance improves as text length increases.

Classification System: Cha’s Dichotomy Model
Used in all of our biometric authentication systems.
The feature space is transformed into a feature-difference space by calculating vector distances between pairs of samples of the same person (intra-person distances) and between pairs of samples of different people (inter-person distances).
This maps the multidimensional feature space to a two-class space: same person vs. different people.
For example, with 3 people and 3 samples each, calculate the distances between pairs of samples and plot them. A nearest-neighbor (NN) classifier can then determine whether a difference is within-class or between-class (same or different person).
Figure: Transformation from (a) feature space to (b) feature-difference space.
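The transformation described on this slide can be sketched in a few lines. This is an illustrative sketch only, not the authors' implementation: the function names `to_difference_space` and `classify_pair`, the use of absolute per-feature differences, and the random toy data are all assumptions.

```python
from itertools import combinations

import numpy as np


def to_difference_space(features, labels):
    """Turn per-sample feature vectors into per-pair difference vectors.

    Returns (diffs, same): diffs[i] is the absolute feature difference of
    one pair of samples, and same[i] is True for an intra-person pair.
    """
    diffs, same = [], []
    for a, b in combinations(range(len(features)), 2):
        diffs.append(np.abs(features[a] - features[b]))
        same.append(labels[a] == labels[b])
    return np.array(diffs), np.array(same)


def classify_pair(diff, train_diffs, train_same):
    """1-nearest-neighbor decision in difference space (Euclidean distance)."""
    nearest = np.argmin(np.linalg.norm(train_diffs - diff, axis=1))
    return bool(train_same[nearest])


# Toy data (hypothetical): 3 people, 3 samples each, 4 features per sample.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 3)
features = rng.normal(loc=labels[:, None] * 5.0, scale=1.0, size=(9, 4))

diffs, same = to_difference_space(features, labels)
print(diffs.shape)      # (36, 4): C(9, 2) = 36 pairs
print(int(same.sum()))  # 9 intra-person pairs, so 27 inter-person pairs
```

Whatever the number of people, the classifier only ever sees two classes (same person, different people), which is what lets one trained model be applied to subjects it has not seen.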

Book Text Experiments - #1: The 30-Author Main Experiment
Each author's 10 books were split into 5 for training and 5 for testing.
Strong training: the system was trained on the test subjects.
EERs for sample sizes of 2K, 5K, and 10K words: 34%, 30%, and 25%.
Receiver Operating Characteristic (ROC) curves: 250, 500, 1K, 2K, 5K, and 10K words.
The Equal Error Rate (EER) decreases as the text length increases.
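The EER is the point on the ROC curve where the false accept rate equals the false reject rate. A minimal sketch of estimating it from pair distances follows. It assumes that a pair is accepted as "same author" when its distance is at or below a threshold; the function name `equal_error_rate` and the toy distances are hypothetical, and this is not necessarily how the system in the slides derives its ROC curves.

```python
import numpy as np


def equal_error_rate(genuine, impostor):
    """Estimate the EER from same-author (genuine) and different-author
    (impostor) distances by sweeping every observed distance as a threshold."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    # False reject: a genuine pair whose distance exceeds the threshold.
    frr = np.array([(genuine > t).mean() for t in thresholds])
    # False accept: an impostor pair whose distance is within the threshold.
    far = np.array([(impostor <= t).mean() for t in thresholds])
    i = np.argmin(np.abs(far - frr))  # threshold where the two rates are closest
    return (far[i] + frr[i]) / 2.0


# Toy distances: overlapping distributions give a nonzero EER.
print(equal_error_rate([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.25
```

With a finite sample the two rates rarely cross exactly, so the sketch averages them at the closest threshold.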

Book Text Experiments - #2
Strong training on 15 of the authors: trained on 5 books from each author, tested on the remaining 5.
Performance improved with fewer subjects.
EERs: ~20% for 10K, 24% for 5K, and 30% for 2K word samples.
Receiver Operating Characteristic (ROC) curves: 2K, 5K, and 10K words.

Equal Error Rate (EER) vs. Text Length in Literary Book Texts from 30 Authors
EER decreases logarithmically as a function of text length.
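A logarithmic trend of this kind can be checked with a least-squares fit of EER against the logarithm of the text length. The sketch below uses only the three EER values the transcript reports for the 30-author experiment (34%, 30%, and 25% at 2K, 5K, and 10K words). The model form EER = intercept + slope * log10(words) and the helper `predict_eer` are assumptions for illustration, and three points are far too few to confirm the shape of the curve.

```python
import numpy as np

# EERs reported in the transcript for the 30-author experiment.
words = np.array([2000, 5000, 10000], dtype=float)
eer_percent = np.array([34.0, 30.0, 25.0])

# Fit EER = intercept + slope * log10(words); polyfit returns [slope, intercept].
slope, intercept = np.polyfit(np.log10(words), eer_percent, 1)


def predict_eer(n_words):
    """EER (percent) predicted by the fitted log-linear model (illustrative)."""
    return intercept + slope * np.log10(n_words)


# A negative slope means the EER falls as the text sample gets longer.
print(slope < 0)
```

The slope is the change in EER, in percentage points, for each tenfold increase in text length.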