Published by Shon Hopkins. Modified over 9 years ago.
1
Christopher Johnson
2
What is Computer Mediated Communication (CMC)?
◦ Short Message Service
◦ Blogs (Twitter)
◦ E-mail
◦ Instant messages
Observed language during such communication
◦ Lo (Microsoft Messenger)
◦ Happy bday hpe u hv a gd day x (SMS)
◦ Awe! Ur so welcome! Sorry I was so sleepy! Lol (Twitter)
3
Most people are in contact with some form of CMC
◦ Children
◦ Adults
People can hide behind any persona they create for themselves
For example Paedophiles
◦ Lure children by pretending to be other children
4
A person reading every message? No
◦ Would this suffice anyway?
Autonomous processing of messages? Yes
◦ Well, at least it is the most appropriate way.
5
We need an understanding of the messages
◦ SMS
◦ Blogs
◦ E-mails
◦ And others
We know that abbreviations are used
◦ But how can we expand these abbreviations back to standard text?
◦ What about misspellings?
How do we get a large real-world corpus to train and test on?
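The abbreviation question above can be made concrete with a minimal sketch. It assumes a hand-written lookup table (`ABBREVIATIONS` and `expand` are hypothetical names, not part of the project) and is only a baseline, not the method the project settles on.

```python
# Hypothetical lookup table for illustration only; a real system would
# need a far larger table, learned or curated from a CMC corpus.
ABBREVIATIONS = {
    "bday": "birthday",
    "hpe": "hope",
    "u": "you",
    "hv": "have",
    "gd": "good",
}


def expand(message: str) -> str:
    """Replace each known abbreviation with its expansion, token by token."""
    expanded = []
    for token in message.split():
        # Look the token up case-insensitively; unknown tokens pass through.
        expanded.append(ABBREVIATIONS.get(token.lower(), token))
    return " ".join(expanded)


print(expand("Happy bday hpe u hv a gd day x"))
```

A plain table cannot resolve ambiguous forms: "Ur" in the Twitter example may stand for "your" or "you're", and unseen abbreviations or misspellings pass through unchanged.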
6
VARD
NLP techniques
◦ N-grams (this project will use bigrams and trigrams)
Phonetic algorithms
◦ Soundex
◦ Metaphone
These tools are commonly used for spell checkers
But how well do these apply to CMC?
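As an illustration of two of the techniques listed, the sketch below extracts word n-grams and computes American Soundex codes. The function names `ngrams` and `soundex` are chosen here for illustration; the Soundex rules follow the commonly described scheme (first letter plus three digits, H and W ignored between letters, vowels separating repeated codes). It is not taken from VARD or from the project.

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens (n=2 bigrams, n=3 trigrams)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Map each coded consonant to its Soundex digit.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """American Soundex code: first letter followed by three digits."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        digit = _CODES.get(ch, "")  # vowels and Y have no digit
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code counts again
    return (code + "000")[:4]  # pad with zeros or truncate to four characters


print(ngrams("hope you have a good day".split(), 2))
print(soundex("gd"), soundex("good"))    # same code
print(soundex("hpe"), soundex("have"))   # also the same code
```

Because Soundex ignores vowels after the first letter, vowel-dropping forms such as "gd" and "good" share a code. The code is coarse, though: "hpe", "hope" and "have" also collide, which is one reason surrounding context such as bigrams and trigrams may be needed to choose between candidates.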
7
Research into current techniques which could be applicable
Create a large corpus of CMC text
Improve techniques for very similar languages
◦ (English CMC and CMC)
Create a system which can distinguish between CMC text and unabridged text
◦ Test the system's success rate.
Convert CMC to unabridged text
◦ (Ambitious, therefore only if time permits)
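One possible baseline for the goal of distinguishing CMC text from unabridged text is to measure how many tokens fall outside a standard vocabulary. The sketch below assumes a tiny hand-written word list and an arbitrary threshold (`STANDARD_WORDS`, `oov_ratio`, `looks_like_cmc` and the value `0.3` are all hypothetical); it is not the system the project describes.

```python
import re

# Hypothetical, tiny vocabulary for illustration only; a real system
# would use a full dictionary or a corpus-derived word list.
STANDARD_WORDS = {
    "happy", "birthday", "hope", "you", "have", "a", "good", "day",
    "i", "was", "so", "sorry", "welcome", "sleepy",
}


def oov_ratio(message: str) -> float:
    """Fraction of word tokens that are not in the standard vocabulary."""
    tokens = re.findall(r"[a-z']+", message.lower())
    if not tokens:
        return 0.0
    unknown = sum(1 for t in tokens if t not in STANDARD_WORDS)
    return unknown / len(tokens)


def looks_like_cmc(message: str, threshold: float = 0.3) -> bool:
    """Flag a message as CMC when too many tokens are out of vocabulary.

    The threshold is an assumption; it would need tuning on a labelled corpus.
    """
    return oov_ratio(message) > threshold


print(looks_like_cmc("Happy bday hpe u hv a gd day x"))
print(looks_like_cmc("Happy birthday hope you have a good day"))
```

The success rate mentioned above could then be estimated by running such a classifier over a labelled test set and counting correct decisions.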
8
The Real World - National Education Association Health Information Network
◦ http://bnetsavvy.org/wp/a-teen-talks-about-texting-and-what-parentseducators-need-to-know-about-it/
About VARD 2 - Baron, Alistair
◦ http://www.comp.lancs.ac.uk/~barona/vard2/
Lawrence Philips' Metaphone Algorithm - Atkinson, Kevin
◦ http://aspell.net/metaphone/
The Soundex Indexing System - The National Archives
◦ http://www.archives.gov/publications/general-info-leaflets/55.html