Download presentation
Presentation is loading. Please wait.
Published byMaurice Ward Modified over 8 years ago
1
FreeLing Uncovered Lluís Padró Centre de Recerca TALP Universitat Politècnica de Catalunya padro@lsi.upc.edu
2
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Introduction What is FreeLing ? A configurable and extensible linguistic analysis library, developer-oriented. What is FreeLing not? A user-oriented off-the-shelf linguistic analyzer. What do people use it for? analyze -f es.cfg < mydocs.txt
3
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing Classes
4
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Linguistic Data Classes
5
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing sequence Sample program Initialization: Create required modules with desired configuration files. Read and process text: send each input line through processing chain
6
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing Sequence: Tokenizer Tokenizer: Initialization: tokenizer.dat tokenizer.dat Input: TEXT (class: string) El gato come pescado. El Sr. García pasea al perro a las 17.30. Output: TOKENS (class: vector )
7
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing Sequence: Splitter Splitter: Initialization: splitter.dat splitter.dat Input: TOKENS (class: vector ) Output: SENTENCES (class: list )
8
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing Sequence: Maco Maco: Submodules: numbers, punctuation, dates, dictionary, multiwords, NER, quantities, probabilities. Initialization: punct.dat, maco.db, sufixos.dat, locucions.dat, np.dat, quantities.dat, probabilitats.dat punct.dat maco.db sufixos.dat locucions.dat np.dat quantities.dat probabilitats.dat Input: SENTENCES (class: list ) Output: SENTENCES (class: list )
9
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing Sequence: Senses Senses: Initialization: senses16.db senses16.db Input: SENTENCES (class: list ) Output: SENTENCES (class: list )
10
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing Sequence: Tagger Tagger: Initialization: tagger.dat / constr_grammar.dat tagger.dat constr_grammar.dat Input: SENTENCES (class: list ) Output: SENTENCES (class: list )
11
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing Sequence: NEC NEC: Initialization: nec.rgf, nec.lex, nec.abm nec.rgf Input: SENTENCES (class: list ) Output: SENTENCES (class: list )
12
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing Sequence: Chart parser Chart parser: Initialization: grammar-dep.dat grammar-dep.dat Input: SENTENCES (class: list ) Output: SENTENCES (class: list )
13
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Processing Sequence: Dependencies Dependency parser: Initialization: dependences.dat dependences.dat Input: SENTENCES (class: list ) Output: SENTENCES (class: list )
14
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Other modules Semantic DB: Browse EuroWordNet by lemma or by synset. Initialization: senses16.db, wn16.db senses16.db wn16.db
15
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Including new languages (1) Tokenizer & Splitter: Adapt config files. Maco: Index form dictionary Adapt suffixation rules Provide (if any) multiwords file Develop (if needed) date, number, and quantities modules
16
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Including new languages (2) Tagger (and probabilities module) Use a tagged corpus to train taggers and compute lexical probabilities. Scripts are provided with FreeLing Chart parsers and Dependency parsers Develop appropriate grammars (or adapt some of the existing ones to the new language)
17
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Some NLP applications using FreeLing (1) OpenTrad (PROFIT, www.opentrad.org ) www.opentrad.org Spanish & English analysis for es-ba and en-ba syntactic transfer machine translation. Adaptations Improve/develop chunking grammars and dependency parser rules Produce appropriate XML output
18
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Some NLP applications using FreeLing (2) ASOMO (Judo Socialware, www.asomo.net ) www.asomo.net ML-based NER development environment for opinion mining on highly unstructured documents (blogs, forums, etc.) Adaptations: Extend/adapt JAVA API Develop ad-hoc modules to use Omlet&Fries to train NER modules.
19
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Some NLP applications using FreeLing (3) VKM (Cromosoma, essukaia.lsi.upc.edu/~vkm ) essukaia.lsi.upc.edu/~vkm CIDEM project to evaluate the viability of using NLP techniques in interactive Videogames. Closed- domain dialogue and QA system. Adaptations: Use semantic dictionary with basic logical forms instead of WN synsets. FreeLing output is processed by a DCG.
20
Seminari TALP-GPLN 29/09/2016 Have you got a FreeLing ? Some NLP applications using FreeLing (4) T-Incluye (Fundación CTIC, www.tincluye.org ) www.tincluye.org Exclusive language detector Adaptations: Adapt the form dictionary lemma criteria for some words (e.g. Príncipe-princesa) Develop an ad-hoc grammar for noun phrases, to pre-filter correct/irrelevant/incorrect phrases. Improve JAVA API for Semantic DB access.
21
FreeLing Uncovered Lluís Padró Centre de Recerca TALP Universitat Politècnica de Catalunya padro@lsi.upc.edu
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.