Published by Hester Jenkins. Modified over 9 years ago.
1
Analysing Crime-Scene Reports
Katerina Pastra and Horacio Saggion
University of Sheffield
Scene of Crime Information System
2
Outline
> Project Overview
> SOCIS Architecture
> Corpus
> Linguistic Analysis
> Pointers
3
Project Overview
> Domain: Scene of Crime Investigation (SOC)
> Main Features:
  1. Multimedia briefing: summarisation of text and images
  2. Generation of formal reports and of photo albums
  3. Intelligent search
2000 - 2003
4
Project Overview (2)
> Other systems for Crime Investigation (CI):
  - Academic R&D projects
  - Governmental agencies' systems
  - Commercial systems
  BUT: SOCIS brings 'intelligence' to CI systems
> The 'Digital Evidence in Court' issue:
  - Authenticity has to be verified
  - Recently accepted in court
5
A view of SOCIS
[Diagram: Image processing + Text processing, Integrated Knowledge Base]
6
Text Processing
- Text Corpus
- Information Extraction system
  >> Named Entity Recognition
  >> Co-reference Resolution
Need: Linguistic analysis of the language at the SOC
- Lexical information
- Morphosyntactic information
- Semantic information
7
The Corpus
4 days spent with a SOCO:
* 12 scenes visited
* 2 complete case files examined
* official documentation collected

Official documentation:
  SOC Reports        = 77
  Photo Indexes      = 300
  Witness Statements = 14

Reported SOC information:
  Press Association = 792
  Washington Post   = 233
  Crime Watch       = 8

NEEDED: Reports, photo indexes, witness statements, photographs
! For the same case
! For major crime
! Of significant quantity
8
Examples
9
SOC Language Characteristics
General characteristics:
! Telegraphic
! Descriptive
! Accurate
! Objective
Special text type: Reports
10
Lexical Information
Characteristics:
- Extensive use of abbreviations
- Jargon
Creation of word lists (gazetteers):
- Based on PITO's CDM
- Over 200 lists (domain + general)
Words of interest are assigned a semantic category
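To illustrate how a gazetteer assigns a semantic category to words of interest, here is a minimal sketch in Python. It is not the SOCIS implementation: the category names, the list entries, and the helper names `build_index` and `tag` are all assumptions made for illustration, and the matching strategy (longest match first over lower-cased tokens) is only one plausible choice.

```python
from typing import Dict, List, Tuple

# Hypothetical word lists; the real SOCIS lists (over 200) are not shown here.
GAZETTEERS: Dict[str, List[str]] = {
    "tool": ["screwdriver", "crowbar"],
    "entry_point": ["window", "rear door", "patio door"],
    "evidence": ["footwear mark", "fingerprint"],
}


def build_index(gazetteers: Dict[str, List[str]]) -> Dict[Tuple[str, ...], str]:
    """Map each entry, split into lower-cased tokens, to its category."""
    index: Dict[Tuple[str, ...], str] = {}
    for category, entries in gazetteers.items():
        for entry in entries:
            index[tuple(entry.lower().split())] = category
    return index


def tag(text: str, index: Dict[Tuple[str, ...], str]) -> List[Tuple[str, str]]:
    """Return (phrase, category) pairs, preferring the longest match."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    max_len = max(len(key) for key in index)
    found: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in index:
                found.append((" ".join(key), index[key]))
                i += n
                break
        else:
            i += 1  # no entry starts at this token
    return found
```

Calling `tag("Rear door forced with screwdriver.", build_index(GAZETTEERS))` yields one pair per matched entry, each carrying the category of the list it came from.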
11
Morphosyntactic Features
! Extensive ellipsis
! Simple temporal dimensions
! Limited co-ordination
! Sub-ordination avoided
! POS: NPs, PPs
! Adjuncts of place and time, qualifiers
For identifying entities of interest automatically, we need to write specific rules using:
! The word lists + context information
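A rule that combines a word list with context information might look like the sketch below. It is only an assumption about what such a rule could be, written with Python regular expressions rather than the project's actual rule formalism. The entry-point terms, the trigger words, and the function names `compile_rule` and `find_entry_points` are hypothetical.

```python
import re
from typing import List, Pattern

# Hypothetical word list and context triggers, for illustration only.
ENTRY_TERMS: List[str] = ["rear door", "front door", "window"]
TRIGGERS: List[str] = ["via", "through"]


def compile_rule(terms: List[str], triggers: List[str]) -> Pattern[str]:
    """Build a pattern: trigger word, optional 'the', then a listed term."""
    # Longer terms first so multi-word entries win over shorter ones.
    term_alt = "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))
    trig_alt = "|".join(re.escape(t) for t in triggers)
    return re.compile(
        rf"\b(?:{trig_alt})\s+(?:the\s+)?(?P<entity>{term_alt})\b",
        re.IGNORECASE,
    )


def find_entry_points(text: str, rule: Pattern[str]) -> List[str]:
    """Return listed terms that appear in the required context."""
    return [m.group("entity").lower() for m in rule.finditer(text)]
```

In this sketch a listed term is reported only when a trigger word precedes it, so a telegraphic fragment such as "Entry gained via rear door" matches while a bare mention of "window" elsewhere in the report does not.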
12
Modelling (1)
13
Modelling (2)
14
Pointers
> SOCIS Sheffield Web Page: http://www.dcs.shef.ac.uk/nlp/socis
> SOCIS Surrey Web Page: http://www.computing.surrey.ac.uk/ai/socis
> NLP Group: http://www.dcs.shef.ac.uk/nlp