Emad A. S. Abu-Ayyash The British University in Dubai


1 Errors and non-errors in English-Arabic machine translation of gender-bound constructs in technical texts
Emad A. S. Abu-Ayyash, The British University in Dubai

2 Funny translation
Parking for uncles
Replace Abdullah
Men are parkings

3 English and Arabic forms
In the three examples below, “who” should be rendered in three different ways in Arabic (Shunnaq & Saraireh, 1998):
This is the man who found the answer.
This is the lady who found the answer.
These are the three men who found the answer.

4 Research purpose
To investigate the extent to which MT systems can render texts from English into Arabic when those texts include forms that follow different rules, or have different grammatical representations, in the two languages.

5 Background and setting
MT systems: Systran’s Pure Neural MT (PNMT); Google Translate (GT); Microsoft Bing (MB)
Texts: Four technical texts
Forms: Gender-bound constructs
Languages: SLT: English; TLT: Arabic
When: June 5th, 2017

6 Sample Texts

7 Target grammatical forms
Three gender-bound constructs:
Subject-verb agreement
Adjectival-noun agreement
Pronoun-antecedent agreement

8 Subject-verb agreement
English: Number
Arabic: Gender and number
English | Arabic | Transliteration
The boy sings. | يُغنّي الولد | /yughannii alwaladu/
The girl sings. | تُغنّي البنت | /tughannii albintu/
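The contrast in the table can be sketched in a few lines of Python. This is only an illustration, not part of the study: the helper names `conjugate_3sg` and `verb_subject`, the toy lexicon, and the undiacritised stem are all assumptions made for the example, and the rule covers only third-person singular imperfect verbs.

```python
# Illustrative sketch (hypothetical helpers, not from the study):
# the third-person singular imperfect verb takes the prefix ي for a
# masculine subject and ت for a feminine subject.
PREFIX_BY_GENDER = {"masculine": "ي", "feminine": "ت"}

# Toy lexicon holding the grammatical gender of the two nouns on the slide.
NOUN_GENDER = {"الولد": "masculine", "البنت": "feminine"}


def conjugate_3sg(stem: str, gender: str) -> str:
    """Attach the gender-dependent prefix to an undiacritised verb stem."""
    return PREFIX_BY_GENDER[gender] + stem


def verb_subject(stem: str, noun: str) -> str:
    """Build a verb-subject clause in which the verb agrees with the noun."""
    return f"{conjugate_3sg(stem, NOUN_GENDER[noun])} {noun}"


if __name__ == "__main__":
    # "غني" is the undiacritised stem of "sing" used in the slide's examples.
    print(verb_subject("غني", "الولد"))
    print(verb_subject("غني", "البنت"))
```

An MT system has no such lookup handed to it. It has to infer the gender of the Arabic noun it chose, which is where the subject-verb errors discussed in this section arise.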

9 Subject-verb agreement
Considering the four texts, however, it is clear that gender-related subject-verb agreement constructs were rendered mostly accurately by the three MT systems.
Total number of constructs: 58
Total number of errors: 13 (PNMT 2; GT 4; MB 7)
Total number of non-errors: 45

10 Subject-verb agreement
Examples of errors (SLT-1):
Speed describes…
PNMT: تصف السرعة /taSifu-ssur`atu/
GT: يصف السرعة /yaSifu-ssur`atu/
MB: سرعة يصف /sur`ah yaSif/
Velocity gives…
السرعة يعطيك /assur`atu yu`Tiik/
السرعة تعطيك /assur`atu tu`Tiik/

11 Subject-verb agreement
The three MT systems were fully accurate in subject-verb agreement structures in which the main verb was a copula (i.e. is, are, be, etc.). In SLT-1, “A simple example would be…” was rendered correctly, as far as gender is concerned, by all three MT systems as مثالٌ بسيط هو... /mithaalun basiiTun huwa/. In Arabic, مثال (example) is masculine, so all three systems used the masculine pronoun هو /huwa/ rather than the feminine pronoun هي /hiya/.

12 Adjectival-noun agreement
Adjectival-noun agreement is another area of disparity between English and Arabic. While the former is neutral (e.g. smart boy, smart girl) since the adjective form is not bound to the gender of the head noun, the latter is not. In Arabic, the form of the adjective changes based on whether the head noun the adjective modifies is masculine or feminine.

13 Adjectival-noun agreement
The total number of adjectival-noun constructs in the four SLTs was 42 (4 in SLT-1, 7 in SLT-2, 12 in SLT-3, 19 in SLT-4), while the total number of errors was only 4 (PNMT 0 errors, GT 3 errors, MB 1 error).

14 Adjectival-noun agreement
Examples of non-errors:
SLT-1: A simple example
SLT-2: Local area network
SLT-3: A new TV
SLT-3: A great picture
SLT-4: Large samples
SLT-4: New scheme

15 Adjectival-noun agreement
The errors here occurred in two types of adjectival expressions, which are 1) a gerund modifying a head noun and 2) an adjective within a series of adjectives.

16 Adjectival-noun agreement
Relative clauses
Non-errors prevailed, with the three MT systems rendering “which” accurately, in line with the Arabic requirement for gender-based agreement between the relative pronoun and the noun it modifies. For example, in SLT-1, the constructs “the speed at which…” and “the direction in which…” were correctly translated into Arabic as السرعة التي /assur`ah allatii/ and الاتجاه الذي /alittijaah allathii/ respectively by all three MT systems. The word for speed in Arabic is feminine and the word for direction is masculine, and this was accurately mirrored in translating the relative pronoun “which” as التي (feminine) and الذي (masculine).
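The agreement rule behind these two renditions can be written as a small lookup. The sketch below is a simplification: the function name `relative_phrase` and the two-entry lexicon are hypothetical, and only the singular relative pronouns mentioned above are covered.

```python
# Illustrative sketch (hypothetical names): pick the singular relative
# pronoun that agrees in gender with the head noun.
RELATIVE_PRONOUN = {"masculine": "الذي", "feminine": "التي"}

# Gender of the two head nouns from SLT-1 discussed above.
NOUN_GENDER = {"السرعة": "feminine", "الاتجاه": "masculine"}


def relative_phrase(noun: str) -> str:
    """Return the head noun followed by its agreeing relative pronoun."""
    return f"{noun} {RELATIVE_PRONOUN[NOUN_GENDER[noun]]}"


if __name__ == "__main__":
    for noun in NOUN_GENDER:
        print(relative_phrase(noun))
```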

17 Pronoun-antecedent agreement
The pronoun systems in English and Arabic are similar when it comes to the subject pronouns he, she, I and we, as these have one-to-one equivalents in Arabic: هو /huwa/, هي /hiya/, أنا /’anaa/ and نحن /naHnu/ respectively.

18 Pronoun-antecedent agreement
The remaining three subject pronouns, which are it, they and you, can be rendered in different ways in Arabic based on gender and on singularity, plurality or even duality.
English subject pronoun | Corresponding Arabic pronoun(s) | Meaning of the Arabic pronoun
It | هو /huwa/ | Third person singular masculine (things)
It | هي /hiya/ | Third person singular feminine (things)

19 Pronoun-antecedent agreement
English subject pronoun | Corresponding Arabic pronoun(s) | Meaning of the Arabic pronoun
They | هم /hum/ | Third person plural masculine
They | هن /hunna/ | Third person plural feminine
They | هما /humaa/ | Third person dual masculine and feminine

20 Pronoun-antecedent agreement
English subject pronoun | Corresponding Arabic pronoun(s) | Meaning of the Arabic pronoun
You | أنتَ /’anta/ | Second person singular masculine
You | أنتِ /’anti/ | Second person singular feminine
You | أنتما /’antuma/ | Second person dual masculine and feminine
You | أنتم /’antum/ | Second person plural masculine
You | أنتن /’antunna/ | Second person plural feminine
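The one-to-many mapping for it, they and you can be expressed as a small data structure, which makes the translation problem concrete: without number and gender information about the antecedent, several Arabic candidates remain. The sketch below only encodes the pronouns listed on these slides. The names `Pronoun`, `PRONOUNS` and `candidates` are hypothetical, and the value "both" is an assumed label for the dual forms, which serve masculine and feminine alike.

```python
from typing import List, NamedTuple, Optional


class Pronoun(NamedTuple):
    arabic: str
    number: str  # "singular", "dual" or "plural"
    gender: str  # "masculine", "feminine" or "both"


# Only the subject pronouns listed on the slides are encoded here.
PRONOUNS = {
    "it": [
        Pronoun("هو", "singular", "masculine"),
        Pronoun("هي", "singular", "feminine"),
    ],
    "they": [
        Pronoun("هم", "plural", "masculine"),
        Pronoun("هن", "plural", "feminine"),
        Pronoun("هما", "dual", "both"),
    ],
    "you": [
        Pronoun("أنتَ", "singular", "masculine"),
        Pronoun("أنتِ", "singular", "feminine"),
        Pronoun("أنتما", "dual", "both"),
        Pronoun("أنتم", "plural", "masculine"),
        Pronoun("أنتن", "plural", "feminine"),
    ],
}


def candidates(english: str,
               number: Optional[str] = None,
               gender: Optional[str] = None) -> List[str]:
    """Return the Arabic pronouns compatible with what is known about the antecedent."""
    result = []
    for p in PRONOUNS[english.lower()]:
        if number is not None and p.number != number:
            continue
        if gender is not None and p.gender not in (gender, "both"):
            continue
        result.append(p.arabic)
    return result


if __name__ == "__main__":
    print(candidates("you"))                   # all five options remain
    print(candidates("they", number="dual"))   # narrowed to the dual form
```

With no information about the antecedent, "you" leaves five candidates. Each additional feature the system recovers from context narrows the choice.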

21 Pronoun-antecedent agreement
The total number of pronoun-antecedent constructs in the four SLTs was eight (3 in SLT-1, 0 in SLT-2, 1 in SLT-3, 4 in SLT-4), and the total number of errors made by the three systems in rendering these constructs was nine (PNMT 2 errors, GT 4 errors, MB 3 errors).

22 Pronoun-antecedent agreement

23 Conclusion
Subject-verb agreement: The three MT systems were mostly accurate in their TLT renditions when the main verb was a full verb, and fully accurate with copulas.
Adjectival-noun agreement: The three systems produced natural Arabic TLTs that respected Arabic gender agreement in these structures. However, two environments were evidently problematic, namely when a gerund modified a head noun and when an adjective was used in a series of adjectives modifying the same noun.
Pronoun-antecedent agreement: The most noticeable inconsistency was in this area, probably because of the large discrepancy between the English and Arabic pronoun systems.

24 Conclusion
Statistically, PNMT performed better than the other two systems. In 94 occurrences of the three constructs, PNMT made 4 errors, GT 11 and MB 11. While these results cannot be generalised because the sample is small, the present analysis can form the basis of a broader quantitative investigation that detects errors and non-errors in translating gender-bound constructs across hundreds of texts.
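The per-system totals quoted above follow from the error counts reported for each construct earlier in the presentation. The short tally below reproduces that arithmetic. The dictionary layout and function names are assumptions made for the example, while the counts themselves are taken from the slides.

```python
# Error counts per construct and system, as reported on the slides.
ERRORS = {
    "subject-verb": {"PNMT": 2, "GT": 4, "MB": 7},
    "adjectival-noun": {"PNMT": 0, "GT": 3, "MB": 1},
    "pronoun-antecedent": {"PNMT": 2, "GT": 4, "MB": 3},
}


def totals_by_system(errors):
    """Sum each system's errors across the three constructs."""
    totals = {}
    for per_system in errors.values():
        for system, count in per_system.items():
            totals[system] = totals.get(system, 0) + count
    return totals


def totals_by_construct(errors):
    """Sum the errors of all systems within each construct."""
    return {name: sum(per_system.values()) for name, per_system in errors.items()}


if __name__ == "__main__":
    print(totals_by_system(ERRORS))     # {'PNMT': 4, 'GT': 11, 'MB': 11}
    print(totals_by_construct(ERRORS))  # 13, 4 and 9 errors respectively
```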

