Published by Cameron Barkes. Modified over 9 years ago.
1
FNERC (towards final version v.3) Edinburgh, March 2002
2
lingway, Edinburgh meeting
Table of Contents
>FNERC V.2 (recap)
>FNERC V.3
>Improvements
>Application to 2nd Domain
>Machine learning
3
FNERC V2 Overview
4
FNERC V2: Evaluation (Zoning; RE + Context)
5
FNERC: name matching / normalization
>Name matching links co-referential NE, NUMEX and TIMEX mentions within a single product description
>FNERC: relies on a "value" attribute that is added during the FNERC module
>Example: if a PROCESSOR is annotated twice in the same product description (say, Intel PIII and Intel Pentium III), both annotations carry the same value Id, so when the NE – PROCESSOR slot is filled, the module adds only one entry to the slot
>For normalisation, the same value Id is used to fill the slot with the first synonym in the ontology
>Run with an XSLT style sheet against the XHTML input file
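The value Id mechanism described on this slide can be illustrated with a minimal sketch. The slide says the original step is run as an XSLT style sheet; the Python below is only a stand-in to show the logic. The ontology fragment, the value Ids and the function name `fill_slot` are all hypothetical.

```python
# Hypothetical ontology fragment: value Id -> synonyms.
# Assumption: the first synonym in each list is the normalized form.
ONTOLOGY = {
    "proc_p3": ["Intel Pentium III", "Intel PIII", "Pentium III"],
    "proc_celeron": ["Intel Celeron", "Celeron"],
}


def fill_slot(annotations, entity_type):
    """Return one normalized value per distinct value Id, in order of first mention."""
    normalized_by_id = {}  # dicts keep insertion order in Python 3.7+
    for annotation in annotations:
        if annotation["type"] != entity_type:
            continue
        value_id = annotation["value_id"]
        if value_id not in normalized_by_id:
            # Normalization: use the first synonym listed in the ontology.
            normalized_by_id[value_id] = ONTOLOGY[value_id][0]
    return list(normalized_by_id.values())


# Two co-referential mentions inside the same product description.
annotations = [
    {"type": "PROCESSOR", "text": "Intel PIII", "value_id": "proc_p3"},
    {"type": "PROCESSOR", "text": "Intel Pentium III", "value_id": "proc_p3"},
]
processor_slot = fill_slot(annotations, "PROCESSOR")
print(processor_slot)
```

Because both mentions share one value Id, the slot receives a single normalized entry rather than two surface forms.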
6
FNERC V3: improvements
>Conclusions from V2
>Zoning (1st and 2nd domain)
>Adding contextual rules (1st domain)?
7
FNERC: 2nd Domain
>Ontology matters
>Location: Country, Region, City
>Employer organization: non-profit, government body, public, private
>Background knowledge: education, language, skill
>Job categories
>Contract
>Job Title
>Department
>Lexicons
8
FNERC: 2nd Domain
>NERC adaptation:
  –No sentence tokenization needed (no entity at the sentence scale)
  –LgXmlsegmenter for zoning (makes it possible to declare empty tags)
  –Rules: lists, regular expressions, context
9
FNERC: 2nd Domain
>Location: not a necessary feature
>Country: lists + patterns (Pays : France / Dans toute l’Europe)
>Region: lists + patterns (Région parisienne)
>City: lists + patterns (Ville : Boulogne / lieu de travail : Arcueil (94))
>Miscellaneous:
  –(92): area indicator (département number)
  –Situation géographique : Poste basé à Toulouse, déplacements occasionnels à l'étranger.
>Employer organization: "leader", "Ecrivez nous à"
>Generic patterns for MAJMIN: Dans le cadre de son développement, Cybion recherche … / Cybion, leader français de la veille et de l'intelligence / Ecrivez nous à : SOCIETEL, 13 rue des forêts / Illicom recrute !
>Specific patterns: Organisation des nations Unies, Compagnie Française du pétrole
>Other:
  –« grand groupe bancaire »
  –Nous recherchons pour une importante société des Réseaux et des Telecom basée en IDF
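The label-driven location patterns quoted on this slide could be approximated with regular expressions. This is a minimal sketch built only from the slide's own examples; the pattern set, the tag names (COUNTRY, CITY, DEPARTMENT_CODE) and the function `find_locations` are assumptions, and the lexicon ("lists") lookup that would complement the patterns is left out.

```python
import re

# A capitalized word, allowing accented capitals (e.g. "Évry").
CAPITALIZED = r"[A-ZÀ-ÖØ-Ý][\w-]+"

# Hypothetical patterns modelled on the slide's examples.
LABEL_PATTERNS = [
    ("COUNTRY", re.compile(r"\bPays\s*:\s*(?P<value>" + CAPITALIZED + ")")),
    ("CITY", re.compile(r"\b(?:Ville|[Ll]ieu de travail)\s*:\s*(?P<value>" + CAPITALIZED + ")")),
    ("CITY", re.compile(r"\bPoste basé à\s+(?P<value>" + CAPITALIZED + ")")),
]

# Two digits in parentheses, as in "(92)" or "(94)".
DEPARTMENT_CODE = re.compile(r"\((?P<value>\d{2})\)")


def find_locations(text):
    """Return (tag, value) pairs found by the label patterns, then by the code pattern."""
    found = []
    for tag, pattern in LABEL_PATTERNS:
        for match in pattern.finditer(text):
            found.append((tag, match.group("value")))
    for match in DEPARTMENT_CODE.finditer(text):
        found.append(("DEPARTMENT_CODE", match.group("value")))
    return found


for line in [
    "Pays : France",
    "lieu de travail : Arcueil (94)",
    "Situation géographique : Poste basé à Toulouse, déplacements occasionnels à l'étranger.",
]:
    print(line, "->", find_locations(line))
```

Keeping each pattern tied to an explicit label word ("Pays", "Ville", "lieu de travail") trades recall for precision, which is why the slide pairs patterns with lists.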
10
FNERC: 2nd Domain
>Background knowledge:
  –education: lists and patterns (Formation: bac + 4/5 / Formation BTS/DUT / Formation: Economie/Gestion, Sciences, Documentation)
  –language: lists and patterns (langues requises: / bilingue anglais-japonais)
  –skill: lists and patterns (connaissances techniques: / Maîtrise de Word, Excel et Internet nécessaire / Ingénieur réseaux confirmé (Novell, MC2, MCP))
>Job categories:
  –mapping with Job Title?
>Contract: lists
>Job Title: lists + layout
  –Straightforward: “Titre : Administrateurs Systèmes & Réseaux”
  –Specific size and font layout
  –Redundancy of structure: B1_illicom_1.html
>Department: patterns
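A few of the background-knowledge and job-title patterns listed on this slide can be sketched as regular expressions. The three patterns and the helper names below are assumptions derived from the quoted examples only; they do not cover the list-based or layout-based cues the slide also mentions.

```python
import re

# "bac + 4/5", "Bac+2": one level, optionally followed by "/" and a second level.
BAC_LEVEL = re.compile(r"\bbac\s*\+\s*(\d)(?:\s*/\s*(\d))?", re.IGNORECASE)

# "bilingue anglais-japonais": one language, optionally a second after a hyphen.
# [^\W\d_]+ matches a run of letters, accented letters included.
BILINGUAL = re.compile(r"\bbilingue\s+([^\W\d_]+)(?:\s*-\s*([^\W\d_]+))?", re.IGNORECASE)

# "Titre : Administrateurs Systèmes & Réseaux" on a line of its own.
JOB_TITLE = re.compile(r"^\s*Titre\s*:\s*(.+?)\s*$", re.MULTILINE)


def education_levels(text):
    """Return the 'bac + n' levels mentioned, e.g. [4, 5] for 'bac + 4/5'."""
    match = BAC_LEVEL.search(text)
    if match is None:
        return []
    return [int(group) for group in match.groups() if group is not None]


def required_languages(text):
    """Return the lowercased languages following 'bilingue'."""
    match = BILINGUAL.search(text)
    if match is None:
        return []
    return [group.lower() for group in match.groups() if group is not None]


def job_title(text):
    """Return the text after an explicit 'Titre :' label, or None."""
    match = JOB_TITLE.search(text)
    return match.group(1) if match else None


print(education_levels("Formation: bac + 4/5"))
print(required_languages("bilingue anglais-japonais"))
print(job_title("Titre : Administrateurs Systèmes & Réseaux"))
```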
11
NERC V3: adaptation to a new domain
>Adaptation:
  –machine learning techniques
  –human customization of rules
12
Machine learning and NERC V3
>Goal: help write the rules for a new domain
>Approach:
  –3 spaces (left, entity, right)
  –Positive and negative examples
  –Rule induction (iterative)
>References:
  –Markus Junker, Michael Sintek, and Matthias Rinck: Learning for Text Categorization and Information Extraction with ILP
  –Dayne Freitag: Toward General-Purpose Learning for Information Extraction
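The "3 spaces" representation with positive and negative examples might look like the sketch below. The window size of 3 tokens, the class name `Example` and the helper `make_example` are assumptions; the slides do not specify how the contexts are stored.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

WINDOW = 3  # assumed context width on each side of the candidate span


@dataclass(frozen=True)
class Example:
    left: Tuple[str, ...]    # tokens before the candidate span
    entity: Tuple[str, ...]  # tokens of the candidate span itself
    right: Tuple[str, ...]   # tokens after the candidate span
    positive: bool           # True if the span is a real entity


def make_example(tokens: Sequence[str], start: int, end: int,
                 positive: bool, window: int = WINDOW) -> Example:
    """Cut a token sequence into (left, entity, right) around tokens[start:end]."""
    left = tuple(tokens[max(0, start - window):start])
    entity = tuple(tokens[start:end])
    right = tuple(tokens[end:end + window])
    return Example(left, entity, right, positive)


tokens = "Formation : bac + 4 / 5 en informatique".split()

# The education span "bac + 4 / 5" is a positive example ...
positive_example = make_example(tokens, 2, 7, positive=True)
# ... while an arbitrary span such as "en informatique" serves as a negative one.
negative_example = make_example(tokens, 7, 9, positive=False)

print(positive_example)
print(negative_example)
```

Storing the three spaces separately lets rules be induced on each space independently, which matches the later slide that lists rule types for the left space alone.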
13
Example 1
14
Example 1 (representation)
15
Types of rules (left)
>Word in position 3
>Bigram in positions 2,3
>Trigram
>Word (position 1, 2 or 3)
>Bigram in positions 1,2
>Same for properties
>Combination of word + properties
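The rule types listed for the left space can be enumerated mechanically from one left context. The sketch below makes several assumptions that the slide does not state: positions are numbered 1 to 3 from left to right, so position 3 is the token adjacent to the entity; "properties" are coarse token classes (NUM, PUNCT, CAP, LOWER); and only one word + property combination is shown.

```python
def token_property(token):
    """Map a token to a coarse, hypothetical property class."""
    if token.isdigit():
        return "NUM"
    if not any(char.isalnum() for char in token):
        return "PUNCT"
    if token[0].isupper():
        return "CAP"
    return "LOWER"


def left_candidates(left):
    """Enumerate candidate rules for a 3-token left context (w1, w2, w3)."""
    w1, w2, w3 = left
    p1, p2, p3 = (token_property(word) for word in left)
    candidates = [
        ("word@3", (w3,)),
        ("bigram@2,3", (w2, w3)),
        ("trigram@1,2,3", (w1, w2, w3)),
        ("bigram@1,2", (w1, w2)),
        # The same shapes, expressed on properties instead of words.
        ("prop@3", (p3,)),
        ("prop-bigram@2,3", (p2, p3)),
        ("prop-trigram@1,2,3", (p1, p2, p3)),
        ("prop-bigram@1,2", (p1, p2)),
        # One illustrative word + property combination.
        ("word@2+prop@3", (w2, p3)),
    ]
    # "Word (position 1, 2 or 3)": the word occurs anywhere in the window.
    candidates += [("word@any", (word,)) for word in sorted(set(left))]
    return candidates


candidates = left_candidates(("poste", "Formation", ":"))
for name, value in candidates:
    print(name, value)
```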
16
Example: 1st iteration
Rule (left) = "formation" in position 3
Rest: next rule (left) = "Niveau", etc.
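The iteration on this slide (pick a rule, then look for the next rule on the rest) resembles a greedy sequential covering loop. The slides only say "rule induction (iteration)", so the selection criterion below (most positives covered, subject to a minimum precision) is an assumption, as are the toy data, the `<pad>` token and the lowercasing of tokens. Rules are restricted to the "word in position n" type for brevity.

```python
from collections import Counter

PAD = "<pad>"  # hypothetical filler for contexts shorter than 3 tokens


def count_rules(contexts):
    """Count, per (position, word) rule, how many contexts it matches."""
    return Counter(
        (position, word)
        for left in contexts
        for position, word in enumerate(left, start=1)
        if word != PAD
    )


def induce_rules(positives, negatives, min_precision=0.8):
    """Greedy covering: returns (ordered rules, positives left uncovered)."""
    remaining = list(positives)
    negative_counts = count_rules(negatives)
    rules = []
    while remaining:
        best_rule, best_covered = None, 0
        for rule, covered in count_rules(remaining).items():
            precision = covered / (covered + negative_counts[rule])
            if precision >= min_precision and covered > best_covered:
                best_rule, best_covered = rule, covered
        if best_rule is None:
            break  # no acceptable rule left: the rest goes to the expert
        rules.append(best_rule)
        position, word = best_rule
        remaining = [left for left in remaining if left[position - 1] != word]
    return rules, remaining


# Toy left contexts; position 3 is assumed to be adjacent to the entity.
positives = [
    ("profil", ":", "formation"),
    ("de", "la", "formation"),
    (PAD, PAD, "formation"),
    (PAD, PAD, "niveau"),
    ("poste", ":", "niveau"),
    ("et", "un", "diplôme"),
]
negatives = [
    ("et", "un", "diplôme"),
    ("poste", ":", "ingénieur"),
    (PAD, PAD, "contact"),
]

rules, uncovered = induce_rules(positives, negatives)
print(rules)
print(uncovered)
```

On this toy data the first iteration selects "formation" in position 3, the second selects "niveau" in position 3, and the context that also appears among the negatives stays uncovered.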
17
Result = input to the expert
>A set of (evaluated) rules
>A first (evaluated) system
>A set of cases not covered by the rules
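One way to produce the three outputs handed to the expert is to score each rule and the rule set as a whole against labelled examples. The metrics chosen here (per-rule precision, overall recall on positives) and the rule format `(position, word)` are assumptions; the slides do not say how rules were evaluated.

```python
def matches(rule, left):
    """A rule (position, word) matches if the word sits at that 1-based position."""
    position, word = rule
    return left[position - 1] == word


def evaluate(rules, positives, negatives):
    """Return (per-rule report, recall of the rule set, uncovered positives)."""
    report = []
    for rule in rules:
        true_pos = sum(matches(rule, left) for left in positives)
        false_pos = sum(matches(rule, left) for left in negatives)
        total = true_pos + false_pos
        precision = true_pos / total if total else 0.0
        report.append({"rule": rule, "tp": true_pos, "fp": false_pos,
                       "precision": precision})
    uncovered = [left for left in positives
                 if not any(matches(rule, left) for rule in rules)]
    recall = (len(positives) - len(uncovered)) / len(positives) if positives else 0.0
    return report, recall, uncovered


# Toy data: left contexts of 3 tokens, position 3 assumed adjacent to the entity.
rules = [(3, "formation"), (3, "niveau")]
positives = [
    ("profil", ":", "formation"),
    ("de", "la", "formation"),
    ("poste", ":", "niveau"),
    ("et", "un", "diplôme"),
]
negatives = [
    ("un", "bon", "niveau"),
    ("poste", ":", "ingénieur"),
]

report, recall, uncovered = evaluate(rules, positives, negatives)
for entry in report:
    print(entry)
print("recall:", recall)
print("uncovered:", uncovered)
```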
18
FNERC V3 Schedule
>First results: end of March
>Final version and evaluation: mid-April
>Final report for D2.4: end of April