Experimental research design and methodology in TPR
PhD Course in Translation Process Research, Copenhagen, July 2014

Outline
• Research design – basic concepts
• Experimental tools and methods to collect translation process data
• Examples of experimental TPR studies
• Some practical considerations about carrying out experiments

Design
• Starting point: I-wonder-question
• Rephrase as research question/hypothesis
Consider
• What type of question/hypothesis it is
• Sample and population
• Which variables are involved

Design: Research question/hypothesis
• Formulate your I-wonder-question clearly and unambiguously
• Make it falsifiable
• Consider whether your starting point is
  • Question: Is there a difference between students and professional translators in terms of ST reading?
  • Open hypothesis: There is a difference between students and professional translators in terms of ST reading
  • Directional hypothesis: There is a difference between students and professional translators in terms of ST reading, such that professionals spend less time on the ST

Design: Type of question/hypothesis
• Differences
  • Repeated measures: measuring the effect of some difference within one group, e.g.
    • same translators working under different conditions or over a period of time (longitudinal study)
  • Independent groups: difference between different groups doing the same task, e.g.
    • students vs. professionals
    • training vs. control group
• Functional relations
  • Between response and some manipulated variable
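
The two kinds of difference design call for different comparisons. The sketch below is only an illustration, not something taken from the slides: it computes a paired t statistic (repeated measures) and Welch's t statistic (independent groups) with the standard library. The task times and the function names `paired_t` and `welch_t` are invented for the example.

```python
import math
import statistics

def paired_t(first, second):
    """t statistic for repeated measures: the same participants measured twice."""
    diffs = [b - a for a, b in zip(first, second)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

def welch_t(group_a, group_b):
    """t statistic for independent groups (Welch's version, unequal variances)."""
    var_a = statistics.variance(group_a) / len(group_a)
    var_b = statistics.variance(group_b) / len(group_b)
    return (statistics.mean(group_b) - statistics.mean(group_a)) / math.sqrt(var_a + var_b)

# Hypothetical task times in seconds, invented for illustration only.
condition_1 = [310, 295, 402, 350, 288]
condition_2 = [290, 280, 395, 330, 270]

# Treating the same numbers as paired removes between-participant variation,
# so the paired statistic is much larger in magnitude than the unpaired one.
print(paired_t(condition_1, condition_2))
print(welch_t(condition_1, condition_2))
```

Which statistic is appropriate depends on how the data were collected, not on which gives the larger value.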

Design: Sampling and population
• Inferential statistics assumes random sampling
• In practice, a balance between randomness and feasibility
• Consider population and sample
  • Which population does my question pertain to?
  • Is it realistic to sample from that population?
  • Could a realistic sample pertain to a different, but still relevant population?

Design: Variables
Two important distinctions
• Independent/explanatory (EV), dependent (DV) and control (CV) variables
• Categorical and numerical variables

Design: DVs
• Dependent/response variable (DV): what you are measuring or counting, e.g.
  • Translation time
    • Overall
    • Individual fixations
  • Translation quality
  • Number of occurrences of e.g.
    • metaphors
    • specific syntactic constructions
    • …

Design: EVs
• Independent/explanatory variables (EVs):
  • Variables which according to your hypothesis may have an effect on your DV
  • Also called predictors
• Types
  • Item-related: e.g. task difficulty, translation direction, translation tool
  • Participant-related: e.g. sex/gender, L1, professional status, L2 experience

Design: CVs
• Control variables (CVs): variables to control in order to be sure that the EV is responsible for the effect on the DV
  • Experimental control
  • Statistical control
• Avoid confounds

Design: Categorical and numerical variables
• Categorical
  • Unordered categories (nominal): e.g. sex/gender, word class
  • Ordered categories (ordinal): e.g. lower/middle/upper class
• Numerical
  • Discrete
    • Integers, finite values
    • E.g. counts of a word in a corpus
  • Continuous
    • Real numbers, infinitely many values on scale
    • E.g. reading time

Design: Categorical and numerical variables
Translation experience may be construed as
• Nominal scale: student/professional
• Ordinal scale: beginning / advanced student / professional
• Discrete numerical: number of years of experience (1, 2, 3, 4…)
• Continuous numerical: amount of experience (time, output)
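
As a rough illustration of these four construals, the sketch below records translation experience on each scale for the same participants. The class name `Experience`, its fields and all values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass

# Ordinal scale: the order of this list carries meaning.
LEVELS = ["beginning student", "advanced student", "professional"]

@dataclass
class Experience:
    status: str    # nominal: "student" or "professional"
    level: str     # ordinal: one of LEVELS
    years: int     # discrete numerical: whole years of experience
    hours: float   # continuous numerical: e.g. hours spent translating

    def level_rank(self) -> int:
        """Position on the ordinal scale (0 = lowest)."""
        return LEVELS.index(self.level)

# Hypothetical participants, invented for illustration.
participants = [
    Experience("professional", "professional", 6, 9500.0),
    Experience("student", "beginning student", 1, 310.5),
    Experience("student", "advanced student", 3, 1240.0),
]

# Ordinal data can be sorted, but distances between ranks are not meaningful.
by_level = sorted(participants, key=Experience.level_rank)

# Numerical data support arithmetic such as a mean.
mean_years = sum(p.years for p in participants) / len(participants)
```

The same underlying property therefore allows grouping only (nominal), ranking (ordinal) or arithmetic (numerical), depending on how it is recorded.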

Design: Categorical and numerical variables
Important ramifications for
• The questions asked
• The type of statistical test to be applied

Experimental TPR tools and methods
• eye-tracking
• keylogging
• audio recording (in Translog)
• (video recording)
• (think-aloud protocols)
• retrospective interviews and questionnaires

Eye-tracking
• eye-mind assumption (Just and Carpenter 1980)
• cognitive attention
• cognitive load
• areas of interest (AOI)
• eye-tracking measures
  • fixation count
  • total gaze time
  • fixation duration
  • pupil dilation
  • eye movements (transitions, attention shifts)
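
Several of these measures can be derived from a plain list of fixations. The sketch below is a simplified illustration under stated assumptions: each fixation is a pair of AOI label and duration in milliseconds, total gaze time is taken as the sum of fixation durations within an AOI, and the data and function names are hypothetical. Real eye-tracker exports contain more fields and need filtering first.

```python
from collections import defaultdict

# Each fixation: (area of interest, duration in ms). Hypothetical data.
fixations = [("ST", 210), ("ST", 180), ("TT", 250), ("ST", 300), ("TT", 190)]

def aoi_measures(fixations):
    """Fixation count, total gaze time and mean fixation duration per AOI."""
    durations = defaultdict(list)
    for aoi, duration_ms in fixations:
        durations[aoi].append(duration_ms)
    return {
        aoi: {
            "fixation_count": len(values),
            "total_gaze_time_ms": sum(values),
            "mean_fixation_duration_ms": sum(values) / len(values),
        }
        for aoi, values in durations.items()
    }

def transitions(fixations):
    """Number of attention shifts: consecutive fixations in different AOIs."""
    aois = [aoi for aoi, _ in fixations]
    return sum(1 for a, b in zip(aois, aois[1:]) if a != b)

measures = aoi_measures(fixations)
shift_count = transitions(fixations)
```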

Keylogging
• transient versions of target text
• revision/editing
• navigation
• pauses
• production speed
• final target texts
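
Pauses and production speed can be computed from keystroke timestamps. The following is a minimal sketch, not Translog's own method: it assumes the log has already been reduced to a list of timestamps in milliseconds, and the 1000 ms pause threshold is an arbitrary value chosen for the example.

```python
def find_pauses(timestamps_ms, threshold_ms=1000):
    """Return (start time, length) for each inter-keystroke interval >= threshold.

    The default threshold is an assumption; studies choose their own value.
    """
    return [
        (earlier, later - earlier)
        for earlier, later in zip(timestamps_ms, timestamps_ms[1:])
        if later - earlier >= threshold_ms
    ]

def production_speed(n_chars, timestamps_ms):
    """Characters per minute between the first and last logged keystroke."""
    elapsed_min = (timestamps_ms[-1] - timestamps_ms[0]) / 60000
    return n_chars / elapsed_min

# Hypothetical keystroke times in ms, invented for illustration.
keystrokes = [0, 200, 450, 2450, 2600, 6000]
pauses = find_pauses(keystrokes)
speed = production_speed(len(keystrokes), keystrokes)
```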

Audio recording (available in Translog)
• oral translations
• think-aloud
• comments

Questionnaires/retrospective interviews
• language background
• professional background
• perception of source text difficulty
• perception of different tasks
• translation challenges experienced
• etc.

Assessing the product
• translation quality assessments
• examination of translation of individual words (e.g. metaphors, terminology, specific word classes, number of alternative translation solutions, etc.)

Example 1: The Process of Post-Editing: a Pilot Study

Example 1: goal
• to find out (‘I wonder’) how translators with no post-editing training would perform when asked to post-edit MT-produced output, compared with a group of translators who translated the same texts manually, without any dictionary or technical assistance

Example 1: research questions
• what are the differences in quality between manual translations and post-edited MT output?
• do more corrections lead to higher quality in the post-edited texts?
• what are the time differences between manual translation and post-editing?
• what are the differences in allocation of cognitive resources between manual translation and post-editing?

Example 1: design/set-up
• experimental research design with manipulation of circumstances to measure the effect on participants’ behaviour
• lab environment simulating natural conditions
• translation rankings

Example 1: variables
• Dependent/response variables
  • translation time
  • translation quality
  • allocation of cognitive resources (ST vs. TT)
• Independent/explanatory variable
  • translation mode (manual translation vs. post-editing)

Example 1: variables
• Control variables
  • using the same participants with the same text for both tasks might have created an unintended repetition effect
  • using the same participants but different texts might have created an unintended effect of textual differences (e.g. one text more difficult than the other)

Example 1: experiments
• Modes
  • manual translation and post-editing
• Participants
  • 8 translators and 7 post-editors
• Texts
  • three English source texts (same for both groups)

Example 1: experiments
• one group of participants translated three texts (from scratch) from English into Danish
• one group of participants post-edited machine-translated (Google Translate) Danish versions of the same three source texts

Example 1: tools/methods
• eye-tracking
  • allocation of cognitive resources (total gaze time on ST vs. TT)
• keylogging
  • task time
  • keystrokes (edit distance)
  • final output
• translation evaluations
  • translation quality
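
The slides do not say how edit distance was computed in the study. As one common possibility, the sketch below implements character-level Levenshtein distance (insertions, deletions and substitutions each cost 1) between raw MT output and a post-edited version. The function name and the example strings are assumptions made for illustration.

```python
def edit_distance(source, target):
    """Levenshtein distance between two sequences (strings or token lists)."""
    # previous[j] holds the distance between the processed prefix of source
    # and the first j items of target.
    previous = list(range(len(target) + 1))
    for i, source_item in enumerate(source, start=1):
        current = [i]
        for j, target_item in enumerate(target, start=1):
            substitution_cost = 0 if source_item == target_item else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]

# Hypothetical MT output and post-edited version, invented for illustration.
mt_output = "the cat sit on mat"
post_edited = "the cat sits on the mat"
distance = edit_distance(mt_output, post_edited)
```

Passing lists of words instead of strings, e.g. `edit_distance(mt_output.split(), post_edited.split())`, gives a word-level distance with the same code.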

Example 1: quality assessment
• QA method and procedure
  • 7 evaluators
  • presentation of source sentence together with four candidate translations
  • two sentences had been produced using manual translation and two had been produced using post-editing (randomised and blinded)
  • evaluators were instructed to rank candidate translations from best to worst quality (ties permitted)
• inter-rater and intra-rater agreement
  • did evaluators agree with each other?
  • were evaluators consistent in their rankings?

Example 1: design weaknesses
• sample size
• participant qualifications (not all worked as professional translators)
• quality assessments (assessment task too difficult, inter-rater and intra-rater agreement too low)

Example 2: Speaking your translation: students’ first encounter with speech recognition technology

Example 2: goal
• to measure the impact on the translation process and product of using an automatic speech recognition (ASR) system compared with typing a translation and producing a sight translation without ASR
• to measure the effect of training/practice with ASR on task time and quality of translations produced with ASR

Example 2: research questions (quantitative)
• What are the task times in the three translation modalities (written, sight, ASR)?
• Is there any difference in translation quality in the three modalities?
• Is there any difference in cognitive load?
• What is the effect on time and quality of participants training the system and gaining more experience using it?

Example 2: research questions (qualitative)
• What are the students’ own perceptions of working with an ASR system?
• What kinds of strategies are employed by students who experience positive effects on time and quality?

Example 2: design/set-up
• experimental research design with manipulation of circumstances to measure the effect on participants’ behaviour
• lab environment
• analysis of process and product
• longitudinal study
• experimental group compared with control group
• qualitative analyses

Example 2: variables
• Dependent/response variables
  • translation time
  • translation quality
  • cognitive load (average fixation durations)
• Independent/explanatory variables
  • translation mode (written, sight, ASR)
  • training period

Example 2: variables
• Control variables
  • texts had to be as similar as possible to ensure that process/product differences across translation tasks were caused by the mode and not by the text
  • sequence of presentation was rotated to ensure that differences between the written and oral modalities were owing to the translation mode and not, for instance, to varying levels of difficulty
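
The slides state only that the sequence of presentation was rotated, not which scheme was used. One simple possibility is a cyclic rotation in which every mode appears once in every position, sketched below. The function names and participant labels are hypothetical.

```python
def rotated_orders(conditions):
    """All cyclic rotations: each condition occurs once in each position."""
    return [conditions[i:] + conditions[:i] for i in range(len(conditions))]

def assign_orders(participants, conditions):
    """Cycle through the rotations so that orders are spread over participants."""
    orders = rotated_orders(conditions)
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}

modes = ["written", "sight", "ASR"]
# Hypothetical participant labels, invented for illustration.
participants = ["P01", "P02", "P03", "P04", "P05", "P06"]
schedule = assign_orders(participants, modes)
```

A cyclic rotation balances position but not which mode directly precedes which; a fully counterbalanced design would use all six permutations of the three modes.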

Example 2: experiments
• participants
  • 14 translation students divided into two groups of seven: an experimental (training) group and a control group
• modes
  • written translation
  • sight translation
  • sight translation with speech recognition
• text
  • text excerpts taken from the same longer text to ensure the highest possible level of similarity

Example 2: experiments
• Longitudinal study:
  • phase 1 (baseline): all participants translated texts under three different conditions
  • interim period: half of the participants (experimental group) worked with the ASR program at home (partly under controlled conditions) and the other half did not (control group)
  • phase 2 (follow-up): all participants translated texts under three different conditions (similar to phase 1), and results from the experimental group were compared with the control group and related to phase 1

Example 2: tools/methods
• eye-tracking
  • cognitive load (fixation duration)
• Translog
  • timings
  • oral and written output
  • transient versions of oral and written translations
• evaluations of translation output
  • translation quality
• retrospective interviews
  • students’ perceptions of ASR vs. written/sight

Example 2: quality assessment
• QA method and procedure
  • 3 evaluators
  • each evaluator assessed all three texts for all 14 participants (blinded with respect to mode)
  • global scores were given on a scale from 1 to 5
  • comments were provided to back up scores
• inter-rater agreement
  • did evaluators agree with each other?
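
The slides do not name the agreement statistic that was used. As one common option, the sketch below computes unweighted Cohen's kappa for a pair of evaluators; with three evaluators it could be computed for each pair. The scores are invented, and because unweighted kappa ignores the ordering of the 1 to 5 scale, a weighted variant may suit such data better.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's score frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[score] * freq_b[score] for score in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one identical score")
    return (observed - expected) / (1 - expected)

# Hypothetical global scores (1 to 5) for six translations, invented for illustration.
evaluator_1 = [4, 3, 5, 2, 4, 3]
evaluator_2 = [4, 3, 4, 2, 5, 3]
kappa = cohens_kappa(evaluator_1, evaluator_2)
```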

Example 2: retrospective interviews
• All participants:
  • general impression of using ASR
  • benefits and drawbacks/problems
• Experimental group (training group)
  • total training time
  • type of texts produced using ASR
  • general impression
  • problems encountered

Example 2: case studies
• small samples weaken the validity of quantitative results
• large individual differences between translators (individual translator profiles)
• in-depth analyses are extremely time-consuming
• identification of participants who confirm (reject) the hypothesis
• detailed analysis of strategies, gaze and keystroke patterns, choice of words etc.

Example 2: design weaknesses
• ecological validity
• unfamiliar setting
• too little control during training period

Setting up the experiment
• make a detailed description (‘protocol’) of the experiment, including every action that needs to be carried out
• write instructions for participants (so that they all receive the same information)
• run pilot(s)

Running the experiment
• increase eye-tracking data quality by
  • calibrating subjects before each new session
  • optimising light conditions (no direct sunlight)
  • checking distance to screen
• check settings in Translog
• check audio quality if using audio recordings