Introduction: NLP Applications
Tweets
- Limited to 140 characters
- Also contain images, emoji 😊, #hashtags, @usernames and links
- Grammatically ambiguous
- Used for customer service requests through social media
Present Research
A method for extracting keywords from Tweets. It obtains the essential keywords by imitating human question-answering logic.
In answering a question, humans focus on the keywords. Example: in "What is your name?", the keywords are "your name".
NLP - Current Tools
- Stanford CoreNLP [1]
- OpenNLP [2]
- NLP4J [3]
Highest token accuracy for POS tagging: NLP4J, 97.64% [4]
Tweets lower the token accuracy of POS taggers: the text is noisy, with linguistic errors and idiosyncratic style.
- Token accuracy of Stanford CoreNLP is 97.32% [4]
- The Twitter POS tagger for Stanford CoreNLP recorded an accuracy of 90.5% [7]
Models for POS tagging of Tweets:
- TwitIE [5]
- TweetNLP [6]
- Twitter POS tagger for Stanford CoreNLP [7]
Methodology
1. Data Collection
2. Keyword Extraction
3. Implementation
Methodology: Data Collection
Tweets from February and March 2016 on the Dialog Axiata Twitter profile were used.
- Keywords: words essential for the meaning of the sentence. Keyword Corpus (258 words)
- Rejected: domain-specific nouns, verbs, interjections and auxiliary verbs. Rejected Words Corpus (64 words)
2. Keyword Extraction Methodology
- Parser 1: Stanford CoreNLP POS tagging with the Twitter model
- Parser 2: Keyword matching
- Parser 3: Rejected words matching
Parser 1: Stanford CoreNLP POS Tagging with Twitter Model
The Tweet is divided into a Subject (Noun Phrase, NP) and a Predicate (Verb Phrase, VP); the NP and VP carry the essence of the meaning.
Included in NP: Numbers (CD), Nouns (NN, all forms), Adjectives (JJ, all forms)
Included in VP: Verbs (VB, all forms)
Excluded from NP: Usernames, Emoji, Hashtags, Pronouns
Excluded from VP: Adverbs, Wh-adverbs, Auxiliary Verbs
Fig. 1 POS Tagged Tweet (Tregex Notation)
Tweet: @dialoglk Please unsubscribe cool club service .my number 0771111111
Nouns: club (NN), service (NN), number (NN), 0771111111 (CD)
Verbs: please (VB)
Other: @dialoglk (USR), unsubscribe (JJ), cool (JJ), my (PRP$)
Fig. 2 Results from Parser 1
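The grouping shown in Fig. 2 can be sketched in a few lines. This is only an illustration in Python (the system itself was implemented in Java), and it starts from tokens that are already tagged rather than calling Stanford CoreNLP. The tag prefixes and the name `group_by_pos` are assumptions made for the example. Fig. 2 lists the two adjectives under "Other", so the sketch does the same; treating JJ as an NP tag would be a one-line change.

```python
# Tagged tokens transcribed from Fig. 2 (tag names as shown on the slide).
TAGGED_TWEET = [
    ("@dialoglk", "USR"), ("Please", "VB"), ("unsubscribe", "JJ"),
    ("cool", "JJ"), ("club", "NN"), ("service", "NN"),
    ("my", "PRP$"), ("number", "NN"), ("0771111111", "CD"),
]

# Assumption: prefix matching stands in for "all forms" (e.g. NNS, NNP, VBD, VBZ).
NOUN_TAG_PREFIXES = ("NN", "CD")
VERB_TAG_PREFIXES = ("VB",)


def group_by_pos(tagged_tokens):
    """Split (word, tag) pairs into nouns, verbs and everything else."""
    groups = {"nouns": [], "verbs": [], "other": []}
    for word, tag in tagged_tokens:
        if tag.startswith(NOUN_TAG_PREFIXES):
            groups["nouns"].append(word)
        elif tag.startswith(VERB_TAG_PREFIXES):
            groups["verbs"].append(word)
        else:
            groups["other"].append(word)
    return groups


parser1_groups = group_by_pos(TAGGED_TWEET)
print(parser1_groups)
```

The "other" group is what the next stage would inspect, while the nouns and verbs are carried forward as candidate keywords.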
Parser 2: Keyword Matching
The Tweet is matched against a domain-specific keywords corpus.
- The words not classified as NPs and VPs
- The NPs and VPs identified by Parser 1
Tweet: @dialoglk Please unsubscribe cool club service .my number 0771111111
Nouns: club (NN), service (NN), number (NN), 0771111111 (CD)
Verbs: please (VB)
Adjectives: unsubscribe (JJ), cool (JJ)
Other: @dialoglk (USR), my (PRP$)
Fig. 3 Result from Parser 2
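A minimal sketch of the keyword-matching step, written in Python for brevity rather than the Java of the actual system. The slides do not say how the match is performed, so case-insensitive exact membership is an assumption here. `KEYWORD_CORPUS` is a hypothetical four-word excerpt (the real corpus has 258 words that are not listed), and `match_keywords` is a name made up for the example.

```python
# Hypothetical excerpt: the real Keyword Corpus has 258 words not shown in the slides.
KEYWORD_CORPUS = {"unsubscribe", "cool", "activate", "package"}


def match_keywords(unclassified_words, keyword_corpus):
    """Return (words found in the corpus, words that stay unclassified)."""
    matched, remaining = [], []
    for word in unclassified_words:
        # Assumption: matching is exact and case-insensitive.
        if word.lower() in keyword_corpus:
            matched.append(word)
        else:
            remaining.append(word)
    return matched, remaining


# The "Other" words left over by Parser 1 in the example Tweet.
other_words = ["@dialoglk", "unsubscribe", "cool", "my"]
recovered, still_other = match_keywords(other_words, KEYWORD_CORPUS)
print(recovered, still_other)
```

With this excerpt, "unsubscribe" and "cool" are promoted to keywords while the username and the pronoun stay out, which mirrors the result in Fig. 3.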
Parser 3: Rejected Words Matching
Removes the noise from the keywords produced by Parser 2: keywords that have a Levenshtein distance of 0 to an entry in the Rejected Words Corpus are discarded.
Tweet: @dialoglk Please unsubscribe cool club service .my number 0771111111
Nouns: club (NN), service (NN), number (NN), 0771111111 (CD)
Verbs: please (VB), removed as noise
Adjectives: unsubscribe (JJ), cool (JJ)
Other: @dialoglk (USR), my (PRP$)
Fig. 4 Noise Removed by Parser 3
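The rejected-words check can be illustrated with a standard dynamic-programming Levenshtein distance. This is a Python sketch under stated assumptions, not the system's Java code: `REJECTED_WORDS` is a hypothetical excerpt of the 64-word corpus, the presence of "please" in it is inferred from the final keyword list, and lowercasing before comparison is an assumption. The `max_distance` parameter is hypothetical; at its default of 0 the check amounts to an exact match.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


# Hypothetical excerpt of the Rejected Words Corpus (64 words in the real system).
REJECTED_WORDS = {"please", "hi", "thanks"}


def remove_noise(keywords, rejected_words, max_distance=0):
    """Drop every keyword within max_distance of any rejected word."""
    kept = []
    for word in keywords:
        is_noise = any(
            levenshtein(word.lower(), rejected) <= max_distance
            for rejected in rejected_words
        )
        if not is_noise:
            kept.append(word)
    return kept


candidates = ["Please", "unsubscribe", "cool", "club", "service", "number", "0771111111"]
final_keywords = remove_noise(candidates, REJECTED_WORDS)
print(final_keywords)
```

Keeping the threshold as a parameter makes it easy to experiment with tolerating misspellings, although the slides only describe a distance of 0.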
Tweet: @dialoglk Please unsubscribe cool club service .my number 0771111111
Final Keywords List = unsubscribe (JJ), cool (JJ), club (NN), service (NN), number (NN), 0771111111 (CD)
3. Implementation
The system was implemented in Java.
Fig. 5 GUI of the Program
Evaluation Methodology
The system was evaluated using the Turing Test [8], which requires “the machine to be linguistically indistinguishable from humans” [9].
Evaluation Methodology: Design
- 14 new Tweets were used.
- Keyword sets were generated by humans (6 categories from different fields), by the non-modified system (Sys.A) and by the modified system (Sys.B).
- Human supervisors evaluated the responses.
Calculation of the test results
n: total number of Tweets
x: instances where the machine and human answers were identical
y: instances where the supervisor detected the answer generated by the machine
z: instances where the supervisor could not detect the answer generated by the machine
T: total instances where the system was successful
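The slides do not spell out how T is derived from the counts. A reading that agrees with the one results row where all three counts are legible (Computer Science Graduates for Sys.B: 4, 9, 1 and 35.71%) is T = x + z, reported as a percentage of n. The Python sketch below assumes that reading; `success_rate` is a name chosen for the example.

```python
def success_rate(n, x, y, z):
    """Percentage of Tweets counted as successful, assuming T = x + z."""
    if x + y + z != n:
        raise ValueError("x, y and z should account for every Tweet")
    successful = x + z  # identical answers plus undetected machine answers
    return 100.0 * successful / n


# Computer Science Graduates, Sys.B: x = 4, y = 9, z = 1 out of n = 14 Tweets.
rate = success_rate(n=14, x=4, y=9, z=1)
print(f"{rate:.2f}%")  # 35.71%
```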
Summary of Turing Test results for Sys.A
Test Case Criteria: x, y, z, T
Academics: 14; T = 0.00%
English Language Experts: 2, 12; T = 85.71%
Undergraduates: 3, 9; T = 35.71%
Graduates: 8; T = 42.86%
Computer Science Graduates: 4, 7; T = 71.43%
General Public: 1; T = 92.86%

TABLE II Summary of Turing Test results for Sys.B
Test Case Criteria: x, y, z, T
Academics: 3, 11; T = 78.57%
English Language Experts: 2, 12; T = 85.71%
Undergraduates: 5, 7; T = 50.00%
Graduates: 6; T = 57.14%
Computer Science Graduates: 4, 9, 1; T = 35.71%
General Public
Legend: Test Case Failed, Test Case Passed
Summary and Conclusions
- The research modifies Stanford CoreNLP with the Twitter POS tagger model using a mix of parsers and corpora.
- The modified system produced keyword sets identical to those of humans.
- The enhancements increase the overall Turing Test result from 50% to 83.33%.
TABLE III Summary of Turing Test Results
System without modifications: 3 test cases passed, 3 failed, success rate 50.00%
System with modifications: 5 test cases passed, 1 failed, success rate 83.33%
Limitations
- The system could be evaluated with a larger population for more nuanced results.
- The only language supported is English.
Future Work
- Use a complete domain-specific corpus to increase accuracy.
- The present approach could be applied to other NLP tools.
References
[1] C. D. Manning, J. Bauer, J. Finkel, S. J. Bethard, M. Surdeanu, and D. McClosky, “The Stanford CoreNLP Natural Language Processing Toolkit,” Proc. 52nd Annu. Meet. Assoc. Comput. Linguist. Syst. Demonstr., pp. 55–60, 2014.
[2] “Welcome to Apache OpenNLP,” 2013. [Online]. Available: http://opennlp.apache.org/.
[3] “emorynlp/nlp4j: NLP tools developed by Emory University,” 2016. [Online]. Available: https://github.com/emorynlp/nlp4j.
[4] “POS Tagging (State of the art),” 2016. [Online]. Available: http://aclweb.org/aclwiki/index.php?title=POS_Tagging_(State_of_the_art). [Accessed: 22-Aug-2016].
[5] K. Bontcheva, L. Derczynski, A. Funk, M. A. Greenwood, D. Maynard, and N. Aswani, “TwitIE: An Open-Source Information Extraction Pipeline for Microblog Text,” 2013.
[6] O. Owoputi, B. O'Connor, C. Dyer, K. Gimpel, N. Schneider, and N. A. Smith, “Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters,” Proc. NAACL, 2013.
[7] L. Derczynski, A. Ritter, S. Clark, and K. Bontcheva, “Twitter part-of-speech tagging for all: Overcoming sparse and noisy data,” Proc. Recent Adv. Nat. Lang. Process., no. September, pp. 198–206, 2013.
[8] A. M. Turing, “Computing Machinery and Intelligence,” Mind, vol. 59, pp. 433–460, 1950.
[9] K. Lacurts, “Criticisms of the Turing Test and Why You Should Ignore (Most of) Them,” Official Blog of MIT’s Course: Philosophy and Theoretical Computer Science, 2011. [Online]. Available: people.csail.mit.edu/katrina/papers/6893.pdf. [Accessed: 23-Jun-2016].
*Images obtained from online sources.