Information Transfer through Online Summarizing and Translation Technology Sanja Seljan*, Ksenija Klasnić**, Mara Stojanac*, Barbara Pešorda*, Nives Mikelić.


Information Transfer through Online Summarizing and Translation Technology
Sanja Seljan*, Ksenija Klasnić**, Mara Stojanac*, Barbara Pešorda*, Nives Mikelić Preradović*
Faculty of Humanities and Social Sciences, University of Zagreb
*Department of Information and Communication Sciences, **Department of Sociology

Outline
I. Introduction
II. Related work
III. Online text summarization tools
IV. Online translation tools
V. Research Methodology
VI. Results
VII. Conclusion

I. Introduction
- information and communication technology – important role in information transfer
- information access, cross-language retrieval and information transfer – one step further in global communication
- online summarization and machine translation
- evaluation of information transfer

II. Related work
- Europe Media Monitor (EMM) – automatic public service
- MiTAP and MITRE summarization in medical domain
- MuST – multilingual information retrieval, summarization and translation system
- cross-language document summarization
- information system for legal professionals

III. Online text summarization tools
"Text summarization represents a method of extracting relevant portions of the input document, presenting the main ideas of the original text..." (Mikelic Preradović, Vlainic, 2013)
- various summarization systems – statistical, linguistic or combined approach
- basic types of summaries – indicative and informative
- summarization techniques – surface methods, entity level, discourse level methods
- summarized text should answer the questions: who, what, when, where, and how?

IV. Online translation tools
- machine translation technology – education market, the international institutions …
- quick and easy translation from one natural language into another
  – first access to information in other languages (for information assimilation)
  – widely used
  – free translation tools
- the aim – to show the impact of online machine translation tools on information transfer
- knowledge of the tools that are of good quality, precision and accuracy → automatic / human evaluation

V. Research Methodology
- three respondents (native Croatian speakers)
- corpus: texts in English, German and Russian
  – five different categories for each language (politics, news, sport, film and gastronomy)
- a total of N=240 evaluations were analysed
  – 90 evaluations in the first task
  – 90 evaluations in the second task
  – 60 evaluations in the third task

V. Research Methodology
- the first assignment – evaluation of machine-translated sentences at the sentence level
- three language pairs (English-Croatian, German-Croatian and Russian-Croatian)
- two online translation tools (Google Translate and Yandex Translate)
- texts in English and German were first summarized and then machine translated
  – summarization by the online tool Swesum: from 108 sentences to 47 sentences for English and from 103 sentences to 49 sentences for German
- average score ranging from 1 to 5

V. Research Methodology
- the second assignment – quality evaluation of the whole text (score ranging from 1 to 5)
- the third assignment – related to information transfer
  – evaluation of the overall quality of the summarized and translated text from English and German
  – answering the questions who, what, when, where and how?
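
The evaluation counts given in the methodology are consistent with a fully crossed design. The following is a minimal sketch of that arithmetic, assuming (the slides do not state this) that each respondent rated every language, category and tool combination once per task, and that the third task covered only the two summarized languages. All variable names are illustrative.

```python
# Design factors taken from the slides; the full crossing is an assumption.
respondents = 3
categories = 5             # politics, news, sport, film, gastronomy
tools = 2                  # Google Translate, Yandex Translate
languages_translated = 3   # English, German, Russian
languages_summarized = 2   # English, German (third task only, assumed)

# One evaluation per respondent x language x category x tool in each task.
task1 = respondents * languages_translated * categories * tools
task2 = respondents * languages_translated * categories * tools
task3 = respondents * languages_summarized * categories * tools
total = task1 + task2 + task3

print(task1, task2, task3, total)  # 90 90 60 240
```

Under these assumptions the three task counts add up to the reported N=240.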

VI. Results
Description: mean accuracy scores for MT system 1 (Google Translate) and MT system 2 (Yandex Translate)

1. Evaluation at the sentence level
Error bars (mean and 95% CI for means): accuracy by tool and language, for MT system 1 (Google Translate) and MT system 2 (Yandex Translate)
One-way between-subjects ANOVA [F(5,84)=4.78, p=.001] with post hoc comparisons using the Tukey HSD test
- No statistically significant difference between the tools when compared on the same language pair (e.g. English-Croatian for both tools) when transmitting information.
- Two statistically significant mean differences were found:
  – Google Translate from English to Croatian resulted in higher mean accuracy than Yandex Translate from German to Croatian (p<.001).
  – Yandex Translate from English to Croatian resulted in higher mean accuracy than Yandex Translate from German to Croatian (p<.001).
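
The degrees of freedom in F(5,84) follow from six groups (2 tools × 3 language pairs) and 90 evaluations: 6 − 1 = 5 between groups and 90 − 6 = 84 within groups. Below is a minimal sketch of the one-way ANOVA F statistic in plain Python. The scores are made up for illustration and are not the study's data, the equal group size of 15 is an assumption, and `one_way_anova` is a hypothetical helper name.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = mean(x for g in groups for x in g)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical accuracy scores: 6 groups of 15 ratings each (NOT the study's data).
base = [3, 4, 4, 5, 3, 4, 5, 4, 3, 4, 4, 5, 3, 4, 4]
shifts = [0.5, 0.0, -0.5, 0.4, -0.8, -0.3]
groups = [[score + shift for score in base] for shift in shifts]

f_stat, df_b, df_w = one_way_anova(groups)
print(f"F({df_b},{df_w}) = {f_stat:.2f}")
```

The post hoc step (Tukey HSD) is not reproduced here; it compares each pair of group means after the overall F test.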

2. Evaluation at the text level
Comparison of sentence-by-sentence mean scores and text evaluation mean scores, for MT system 1 (Google Translate) and MT system 2 (Yandex Translate)
Quality evaluation of sentence-by-sentence translation has a statistically significantly higher overall mean score than quality evaluation of the translation of the text as a whole [t(89)=7.20, p<.001].
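
The reported t(89) is consistent with a paired comparison of 90 evaluations (degrees of freedom = 90 − 1), although the slides do not name the test. A minimal sketch of a paired-samples t statistic, assuming that design; the scores are invented for illustration and `paired_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for two equally long lists of paired scores."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1

# Hypothetical paired scores (NOT the study's data).
sentence_scores = [4.0, 3.5, 4.5, 3.0, 4.0, 3.5]
text_scores = [3.0, 3.0, 4.0, 2.5, 3.5, 2.5]

t_stat, df = paired_t(sentence_scores, text_scores)
print(f"t({df}) = {t_stat:.2f}")
```

A positive t here means the first list (sentence-level scores) has the higher mean, which is the direction reported on the slide.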

2. Evaluation at the text level
Error bars (mean and 95% CI for means): accuracy by tool and language, for MT system 1 (Google Translate) and MT system 2 (Yandex Translate)
One-way between-subjects ANOVA [F(5,84)=4.78, p=.001] with post hoc comparisons using the LSD test
- One statistically significant difference between the tools when compared on the same language: for German.
- Three additional statistically significant mean differences between languages:
  – Google Translate from English to Croatian resulted in higher mean accuracy than Yandex Translate from German to Croatian (p=.030).
  – Google Translate from English to Croatian resulted in higher mean accuracy than Yandex Translate from Russian to Croatian (p=.019).
  – Google Translate from German to Croatian resulted in higher mean accuracy than Yandex Translate from Russian to Croatian (p=.019).

3. Information transfer evaluation
Information transfer in summaries across all domains, for MT system 1 (Google Translate) and MT system 2 (Yandex Translate)
Codes: 0 = NO, 1 = YES
Overall average information scores by language: - German English 4.4
Overall average information scores by question:
- who? 0.95
- what? 0.87
- how? 0.83
- where? 0.72
- when? 0.60
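
With 0/1 coding, each per-question mean is the share of evaluations in which that question could be answered. A small worked example, assuming all five means are computed over the same set of evaluations: summing them gives the average number of questions (out of five) answered per evaluation.

```python
# Per-question means as reported on the slide (0 = NO, 1 = YES).
question_means = {"who": 0.95, "what": 0.87, "how": 0.83, "where": 0.72, "when": 0.60}

# The mean of a sum equals the sum of the means, so this is the average
# number of questions answered per evaluation, on a 0-5 scale.
average_answered = sum(question_means.values())
print(f"{average_answered:.2f}")  # 3.97
```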

3. Information transfer evaluation
Additional analysis: Binary logistic regression analysis was used to test whether accuracy evaluations for English-Croatian and German-Croatian translations by both systems can predict the odds of answering the five listed questions. This analysis was performed at the sentence level because of the higher accuracy scores. Accuracy proved to be a statistically significant predictor only for the odds of answering the how? question. The analysis showed that for a one-unit increase in accuracy at the sentence-by-sentence level, the odds of answering the question how? for the transmitted information increase 6.3 times (95% C.I.: 2.1 – 18.5) (p=.001).
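
The reported odds ratio can be read on the log-odds scale, where logistic regression coefficients live. The sketch below assumes a Wald-type confidence interval (symmetric around the coefficient on the log scale), which the slides do not state. It recovers an approximate coefficient, standard error and p-value from the published figures only.

```python
from math import erfc, log, sqrt

# Figures reported on the slide.
odds_ratio = 6.3
ci_low, ci_high = 2.1, 18.5

# Coefficient on the log-odds scale: OR = exp(beta).
beta = log(odds_ratio)

# Assumed Wald interval: exp(beta +/- 1.96 * SE), so SE follows from the CI width.
se = (log(ci_high) - log(ci_low)) / (2 * 1.96)

# Two-sided p-value from the standard normal distribution.
z = beta / se
p_two_sided = erfc(abs(z) / sqrt(2))

# Odds ratios multiply: a two-unit increase in accuracy scales the odds by OR squared.
two_unit_ratio = odds_ratio ** 2

print(f"beta = {beta:.2f}, SE = {se:.2f}, z = {z:.2f}")
print(f"two-unit odds ratio = {two_unit_ratio:.2f}")
```

Under this assumption the recovered p-value is of the same order as the reported p=.001, so the interval and the significance level on the slide are mutually consistent.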

VII. Conclusion
We presented data on information transfer in five domains (politics, news, sport, film and gastronomy) for texts taken from online newspapers in three languages (English, German and Russian). The research comprised three types of assignments. Note: this is a preliminary study, owing to the small amount of test data analysed in this pilot research. Taken together, the results suggest significant differences in information transfer when using different online tools. Although the tools work best for English, there are significant differences among the other languages and online tools. Users' perception of information gave significantly higher scores in sentence-by-sentence evaluation than in whole-text evaluation. We detected a significant connection between accuracy and the ability to answer the question how?.

Thank you!