Slide 1: Stanford CoreNLP
Slide 2: Content
- Architecture
- Annotator
- POS Tagger
- NER
- Parser
- Dependency Parser
- Coreference Resolution
- Examples from QALD-5
Slide 3: Architecture
- Annotator: adds some kind of analysis information to an Annotation object.
- Annotation: a type-safe heterogeneous map, the data structure which holds the results of the annotators.
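The "type-safe heterogeneous map" idea can be sketched outside Java as well. Below is a minimal Python illustration; the key classes `TokensAnnotation` and `PosAnnotation` are invented for the example (CoreNLP's real keys live in `CoreAnnotations`), and Python can only approximate the type safety that Java generics provide:

```python
class TokensAnnotation: pass   # key classes play the role of CoreNLP's
class PosAnnotation: pass      # CoreAnnotations.* keys (names invented here)

class Annotation:
    """A heterogeneous map: each key class indexes one kind of analysis."""
    def __init__(self, text):
        self.text = text
        self._data = {}

    def set(self, key_class, value):
        # store one annotator's result under its key class
        self._data[key_class] = value

    def get(self, key_class):
        # retrieve a previously stored analysis, or None
        return self._data.get(key_class)
```

Each annotator reads the layers it needs from the map and writes its own result back under its own key.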
Slide 4: Annotator
16 annotators:
1. TokenizerAnnotator
2. CleanXmlAnnotator
3. WordToSentenceAnnotator
4. POSTaggerAnnotator
5. MorphaAnnotator
6. NERClassifierCombiner
7. RegexNERAnnotator
8. SentimentAnnotator
9. TrueCaseAnnotator
10. ParserAnnotator
11. DependencyParseAnnotator
12. DeterministicCorefAnnotator
13. RelationExtractorAnnotator
14. NaturalLogicAnnotator
15. QuoteAnnotator
16. EntityMentionsAnnotator
NaturalLogicAnnotator: marks quantifier scope and token polarity, according to natural logic semantics.
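How these annotators compose can be sketched as a pipeline that runs each one in order over a shared annotation object, each adding its own layer. This is an illustrative Python sketch, not CoreNLP's API; the two toy annotators merely stand in for TokenizerAnnotator and a downstream tagger:

```python
import re

class TokenizeAnnotator:
    """Toy stand-in for TokenizerAnnotator: splits text into tokens."""
    def annotate(self, ann):
        ann["tokens"] = re.findall(r"\w+|[^\w\s]", ann["text"])

class CapitalTagAnnotator:
    """Toy downstream annotator: reads what the tokenizer stored
    and adds its own layer of analysis."""
    def annotate(self, ann):
        ann["tags"] = ["CAP" if t[0].isupper() else "low" for t in ann["tokens"]]

class Pipeline:
    def __init__(self, annotators):
        self.annotators = annotators

    def annotate(self, text):
        ann = {"text": text}          # the shared annotation object
        for annotator in self.annotators:
            annotator.annotate(ann)   # each one adds its analysis
        return ann
```

The ordering matters: an annotator may rely on layers produced earlier, which is why CoreNLP checks annotator requirements when a pipeline is built.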
Slide 5: POS Tagger
http://nlp.stanford.edu/software/tagger.shtml
- Labels tokens with their part-of-speech (POS) tag
- Maximum-entropy model
- Supports English, Arabic, Chinese, French, Spanish, and German
- Accuracy: 97.24% (Penn Treebank tagset, English Penn Treebank WSJ); 93.46% (LDC Chinese Treebank POS tag set, Chinese and Hong Kong texts, % on unknown words)
- Example
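A maximum-entropy tagger scores each candidate tag by exponentiating a weighted sum of feature values and normalizing over all tags. A toy sketch of that scoring step, with hand-invented features and weights (a real tagger learns millions of weights from the treebank):

```python
import math

# Hypothetical weights for (feature, tag) pairs, set by hand for illustration.
weights = {
    ("suffix=ing", "VBG"): 2.0,
    ("suffix=ing", "NN"): 0.3,
    ("lowercase", "NN"): 0.5,
    ("lowercase", "VBG"): 0.4,
}
tags = ["NN", "VBG"]

def features(word):
    feats = []
    if word.endswith("ing"):
        feats.append("suffix=ing")
    if word.islower():
        feats.append("lowercase")
    return feats

def tag_probabilities(word):
    # p(tag | word) = exp(w . f) / Z, the maximum-entropy (softmax) form
    scores = {t: math.exp(sum(weights.get((f, t), 0.0) for f in features(word)))
              for t in tags}
    z = sum(scores.values())
    return {t: s / z for t, s in scores.items()}
```

With these weights, an "-ing" word such as "running" comes out more probable as VBG than NN; the real model also conditions on surrounding words and previously assigned tags.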
Slide 6: NER, regexner
http://nlp.stanford.edu/software/CRF-NER.shtml
- Recognizes named entities (PERSON, LOCATION, ORGANIZATION, MISC)
- CRF sequence taggers
- Supports English, Dutch, Spanish, German
- F1 ≈ 87.9% (harmonic mean of Prec 88.21% and Rec 87.68%; CoNLL 2003 English news testb)
- Models: english.muc.7class.distsim.crf.ser, english.conll.4class.distsim.crf.ser
- Numerical entities (MONEY, NUMBER, DATE, TIME, DURATION, SET) are recognized rule-based
- regexner rule example: Bachelor of (Arts|Laws|Science|Engineering) → DEGREE
- Example
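The rule-based regexner side can be sketched as a list of (pattern, label) rules applied to the text. The DEGREE rule below is the one from the slide; the MONEY pattern is a simplified stand-in for the real numeric rules, and the matching here is far cruder than CoreNLP's token-level rule engine:

```python
import re

# Hand-written rules in the spirit of a regexner rule file.
rules = [
    (re.compile(r"Bachelor of (Arts|Laws|Science|Engineering)"), "DEGREE"),
    (re.compile(r"\$\d+(\.\d+)?"), "MONEY"),  # simplified MONEY pattern
]

def tag_entities(text):
    """Return every (matched span, label) pair found by the rules."""
    found = []
    for pattern, label in rules:
        for m in pattern.finditer(text):
            found.append((m.group(0), label))
    return found
```

Rule-based recognition like this complements the CRF: it has high precision on closed patterns (degrees, currencies, dates) that a statistical tagger may miss.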
Slide 7: Parser
http://nlp.stanford.edu/software/lex-parser.shtml
- Syntactic analysis
- Supports English, Chinese, Arabic, Spanish, German
- Models: PCFG (2003), recursive neural network (2013)
- F1: 86.36% (PCFG), 90.4% (RNN) (Penn Treebank WSJ, English)
- Model files: englishPCFG.ser, englishRNN.ser
- Example
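How a PCFG scores a parse can be illustrated with the CKY algorithm over a toy grammar in Chomsky normal form. The grammar, probabilities, and sentence below are invented for the example (CoreNLP's grammar is vastly larger and lexicalized):

```python
from collections import defaultdict

# Toy CNF PCFG, probabilities invented for illustration.
lexical = {  # terminal -> list of (tag, prob)
    "she": [("NP", 0.5)],
    "eats": [("V", 1.0)],
    "a": [("Det", 1.0)],
    "fish": [("N", 1.0)],
}
binary = [  # (parent, left child, right child, rule prob)
    ("S", "NP", "VP", 1.0),
    ("VP", "V", "NP", 1.0),
    ("NP", "Det", "N", 0.5),
]

def cky(words):
    """Return the best probability of an S spanning the whole sentence."""
    n = len(words)
    chart = defaultdict(float)  # (i, j, symbol) -> best inside probability
    for i, w in enumerate(words):
        for sym, p in lexical.get(w, []):
            chart[i, i + 1, sym] = max(chart[i, i + 1, sym], p)
    for span in range(2, n + 1):            # widen spans bottom-up
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # split point
                for parent, left, right, p in binary:
                    score = p * chart[i, k, left] * chart[k, j, right]
                    if score > chart[i, j, parent]:
                        chart[i, j, parent] = score
    return chart[0, n, "S"]
```

For "she eats a fish" the best derivation multiplies the rule and lexical probabilities along the tree (1.0 × 0.5 × 1.0 × 0.5 × 1.0 × 1.0 × 1.0 = 0.25).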
Slide 8: DependencyParse
http://nlp.stanford.edu/software/nndep.shtml
- Analyzes the grammatical structure of a sentence, establishing relationships between "head" words and the words which modify those heads
- Two types of output: basic, collapsed
- Supports English, Chinese
- Neural-network, transition-based parser
- Trained on the English Penn Treebank and the Chinese Penn Treebank (2014); Universal Dependencies representation
- Unlabeled attachment score (UAS): English 92.0, Chinese 83.9
- Labeled attachment score (LAS): English 90.7, Chinese 82.4
- Models:
  - english_UD.gz (default; English, Universal Dependencies)
  - PTB_Stanford_params.txt.gz (English, Stanford Dependencies)
  - PTB_CoNLL_params.txt.gz (English, CoNLL Dependencies)
  - CTB_CoNLL_params.txt.gz (Chinese, CoNLL Dependencies)
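A transition-based parser builds the dependency tree with a stack, a buffer, and a small set of actions. Below is a minimal arc-standard sketch: the action sequence is supplied by hand for a 3-word sentence (in the real parser, the neural network's job is to predict each next action from the current stack/buffer configuration):

```python
def parse(n_words, actions):
    """Arc-standard transitions: token 0 is ROOT, words are 1..n_words.
    Returns the list of (head, dependent) arcs."""
    stack, buffer, arcs = [0], list(range(1, n_words + 1)), []
    for act in actions:
        if act == "SHIFT":            # move next buffer item onto the stack
            stack.append(buffer.pop(0))
        elif act == "LEFT-ARC":       # top of stack heads the item below it
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif act == "RIGHT-ARC":      # item below heads the top; pop the top
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs
```

For "She(1) ate(2) fish(3)" the sequence SHIFT, SHIFT, LEFT-ARC, SHIFT, RIGHT-ARC, RIGHT-ARC yields the arcs ate→She, ate→fish, ROOT→ate. UAS counts how many tokens get the right head; LAS additionally requires the right relation label (omitted in this sketch).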
Slide 9: DependencyParse
Slide 10: Coreference Resolution
http://nlp.stanford.edu/software/dcoref.shtml
- Pronominal and nominal coreference resolution:
  - The music was so loud that it couldn't be enjoyed.
  - The project leader is refusing to help. The jerk thinks only of himself.
- Supports English, Chinese
- Rule-based, sieve-based (…, DiscourseMatch, ExactStringMatch, …, RelaxedHeadMatch, PronounMatch)
- Avg F1 59.5% (CoNLL-2011 Shared Task data set, 2013)
- Dictionaries:
  - Demonym (Asia, Asian, Asians)
  - Male (johannsen, johansen, johanson, johansson)
  - Female (kate, katelyn, kater, katerina)
  - …
- Example
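The sieve idea can be sketched as passes run from highest to lowest precision, each allowed to merge mention clusters built by earlier passes. An illustrative two-sieve version; the mention tuples, gender tags, and the crude "pronoun matches nearest gender-compatible antecedent" rule are all invented for the example:

```python
def run_sieves(mentions):
    """mentions: list of (text, is_pronoun, gender) tuples.
    Returns clusters as lists of mention indices."""
    clusters = [[i] for i in range(len(mentions))]  # start: all singletons

    def merge(i, j):
        a = next(c for c in clusters if i in c)
        b = next(c for c in clusters if j in c)
        if a is not b:
            a.extend(b)
            clusters.remove(b)

    # Sieve 1 (high precision): ExactStringMatch on non-pronominal mentions
    for i, (ti, pi, _) in enumerate(mentions):
        for j in range(i):
            tj, pj, _ = mentions[j]
            if not pi and not pj and ti.lower() == tj.lower():
                merge(j, i)

    # Sieve 2 (lower precision): PronounMatch, crudely approximated by
    # linking each pronoun to the nearest gender-compatible antecedent
    for i, (ti, pi, gi) in enumerate(mentions):
        if pi:
            for j in range(i - 1, -1, -1):
                tj, pj, gj = mentions[j]
                if not pj and gj == gi:
                    merge(j, i)
                    break
    return clusters
```

Running the precise sieves first is the key design choice: once two mentions are safely linked, later, riskier sieves can exploit the merged cluster's combined evidence.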
Slide 11: Examples from QALD-5 (1)
Who was vice president under the president who authorized atomic weapons against Japan during World War II?
Slide 12: Examples from QALD-5 (2)
Of the people that died of radiation in Los Alamos, whose death was an accident?
Slide 13: Examples from QALD-5 (3)
Which actress starring in the TV series Friends owns the production company Coquette Productions?
Slide 14: Examples from QALD-5 (4)
Which city does the first person to climb all 14 eight-thousanders come from?
Slide 15: Examples from QALD-5 (5)
What is the largest city in the county in which Faulkner spent most of his life?
Slide 16: Thank you!