Hiroshi NAKAGAWA, Information Technology Center, University of Tokyo, Japan Postal:


1 Hiroshi NAKAGAWA, Information Technology Center, University of Tokyo, Japan E-mail: nakagawa@r.dl.itc.u-tokyo.ac.jp Postal: 7-3-1 Hongo, Bunkyo, Tokyo, 113-0033, JAPAN The Situation in Japan - A Personal View -

2 Structure of Academic Societies in Japan  Basic data on NLP research activities in Japan  At least 700 NLP researchers (700 is the number of members of the Association of Natural Language Processing (ANLP), Japan)  Several major universities and labs in computer-related companies (NTT, Fujitsu, NEC, Hitachi, Toshiba, …) and more  Several major governmental laboratories (Communication Research Lab., Electro-Technical Lab., National Institute of Informatics)

3 Academic Societies in Japan -1  ANLP: Association of NLP (700 members): the core of NLP research activities in Japan  SIG NLP of Information Processing Society Japan (IPSJ): more than 500 members, mostly overlapping with ANLP members  SIG NLC of Institute of Electronics and Communication Engineering (IECE): also overlapping with ANLP

4 Academic Societies in Japan -2  Japanese Society of AI (JSAI): NLP is not a large part of it  SIG FI of IPSJ, SIG DD of IPSJ: rather small SIGs  These are mainly IR people.  The trend of the 90s was that NLP people moved into IR applications.

5 Academic Societies in Japan -3  Linguistics-related societies  Cognitive Linguistics Society (new)  Social Linguistics Society (2 years old)  Japan Cognitive Science Society (JCSS): its NL group is one of the society's main parts, but it is a rather linguistics-oriented research group.

6 Researchers really are:  Active researchers belong to two or three of the academic societies mentioned above.  Our worry is that very few linguists are interested in NLP.  The reason is that they feel some kind of gap or barrier between themselves and today's corpus-based, statistics-oriented NLP approaches.

7 Historical Perspectives  The 80s: Big projects with MT-related issues, supported by MITI-related agencies –[The 5th generation computer project] –Mu machine translation project –EDR concept dictionary project –CICC translation project among Asian languages  The 90s - Now: Smaller projects with diverse issues, including NLP as one part, supported by diverse funding agencies

8 From the big projects of the 80s towards the sharply focused 90s  Industrial Application: MT in Network Browsers; MT for Abstracts; Automatic Caption Systems  Further Investigation: Generic NLP software and Tuning; Equivalence of different formalisms; KA from comparable corpora; Dialogue in less restricted domains; Statistical learning theories; Ontology Building  Exploratory Integration: NLP in IR, IE, CLIR; Speech and Language; Language in Multimedia

9 Projects currently going on  JSPS project: NLP in IE and IR, basic research on generic NLP technologies  TIT-CRL-NLL project: Speech and Language, summarization of spontaneous speech, speech corpus collection  RWC: Corpus collection and annotation as one of the research activities  ATR Speech Translation: less restricted subject domains, adding new languages like Chinese, Example-based Paradigm

10  KA Matching fund project: Knowledge acquisition from comparable corpora  Center for excellence on theoretical linguistics: Language teaching as one of its application fields, corpus collection  GDA: Text annotation in multimedia environments, now extending to multimedia, multi-modal presentations  (in preparation) Language and Action Usability: Language in the network era

11 [Bottom-Up Initiatives]  IREX, NTCIR: TREC, MUC-type workshops  GSK: LDC, ELRA-type institution [Development Projects]  MT for National Patent Office: MT in an integrated information system  JST: MT for Abstracts, Example-based MT  Automatic Caption System at NHK

12 TREC, MUC-type bottom-up activities in Japan  IREX is a competition-type workshop on information retrieval and extraction: MUC-style conference and evaluation, with training and test corpus collection  NTCIR-1 (1999) is also a competition-type workshop, covering IR, CLIR, IE (named entity recognition), and ATR. Up to 40 groups participated.  NTCIR-2 (2000) takes IR, CLIR, and Automatic Summarization as its tasks  About 20 groups have participated in the CLIR task so far.

13 Real effects of the 80s’ big projects  The 80s’ big projects brought up the core NLP researchers of the 90s.  Funding agencies started to think “Big funding is inefficient.” They prefer small projects with clear targets.  NLP researchers are inclined to work on realistic applications. (Good or bad?)  This NLP research trend fits today’s IT revolution well. For instance, NLP for cell-phone-like devices is promising.

14 International Co-operation: The current state  Co-operation based on individual projects –JSPS project: DFKI, U-Penn, Stanford Univ., KAIST, UMIST/Salford, etc. –ATR: C-Star Consortium  Co-operation based on private companies  Co-operation in broader communities, supported by government agencies –None since CICC (MT among Asian languages) –Human Genome, Physics, Space Science, Brain Science  Language-based technology –Highly sensitive to national priorities –Too closely related to industrial interests

15 Needs for Co-operation  Validation of research results –Language universal vs. language specific –NE, IR, learning methods, grammar formalisms  Resource gathering –Simply many languages to treat  Integration of resources –Ontology for multi-lingual applications  Independent funding  Sponsoring/Independent funding  Standardization  Co-ordinated/Joint funding

16 From the big projects of the 80s towards the sharply focused 90s: Targets  Industrial Application: MT in Network Browsers; MT for Abstracts; Automatic Caption Systems  Further Investigation: Generic NLP software and Tuning; Equivalence of different formalisms; KA from comparable corpora; Dialogue in less restricted domains; Statistical learning theories; Ontology Building  Exploratory Integration: NLP in IR, IE, CLIR; Speech and Language  Independent Funding  Sponsoring/Independent Funding  Standardization  Coordinated Funding

