TaLC 2000 - Graz The English Italian Translational Corpus: A resource for learning about translation Federico Zanettin Università di Bologna SSLMIT – Forlì.

1 TaLC 2000 - Graz The English Italian Translational Corpus: A resource for learning about translation Federico Zanettin Università di Bologna SSLMIT – Forlì federico@sslmit.unibo.it

2 CEXI Project: Corpus English X / Cross / Translational Italian. Bi-lingual; Parallel; Bi-directional; Translation-driven

3 Types of comparison: Italian, English, TTs, STs

4 Aims: CEXI as a resource for learning about (language, culture, translation) and learning to (read, write, translate). Limitations: funds (corpus size: 4M words); copyright

5 Design criteria: Selection features; Description features

6 Primary selection criteria: Translations / translated texts; Medium: books; Time: contemporary; Audience: adult; Prose; Country of publication

7 Secondary selection criteria: Author; Translator; Publisher; Price and availability

8 Descriptive features: Size vs. variety; Full texts vs. samples; Domain: fiction vs. non-fiction

9 Fiction vs. non-fiction (Italy)

10 Corpus description: Translation vs. non-translation; Fiction vs. non-fiction; English vs. Italian

11 Translations

                          E → I (Italy 76-95)     I → E (USA 77-96)
UDC category              Texts        %          Texts        %
Literature/Child. Lit.     4817      40%            502      28%
Art/Games/Sports            757       6%            343      19%
Edu/Law/So. Sci.           1251      11%            187      11%
Applied Science            1835      16%            138       8%
History/Geo./Biog.          919       8%            171      10%
Natural & Ex. Sci           643       6%            111       6%
Philosophy/Psycol.          833       7%             53       3%
Generalities/Info. Sci      101       1%              2       0%
Religion/Theology           477       4%            267      15%
Total                     11633     100%           1774     100%

12 Translation components, non-fiction

                  % from I.T.                         No. of texts (provisional)
UDC               English (USA)    Italian (Italy)    English    Italian
Rel/Theo               21%               7%               8          3
Art/G/S                27%              11%              11          4
Edu/Law/SocSci         15%              19%               6          8
AppSci                 11%              28%               4         11
His/Geo/Bio            13%              13%               5          5
Nat & ExSci             9%               9%               4          4
Phi/Psy                 4%              12%               2          5
Gen/InfoSci             0%               1%               0          0
Total                 100%             100%              40         40
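The provisional text counts on this slide look consistent with allocating 40 texts per language in proportion to the percentage shares and rounding to the nearest whole text. That rule is an assumption drawn from the numbers, not something the slides state. A minimal Python sketch of it follows; `allocate` and `TARGET` are hypothetical names.

```python
# Non-fiction shares per UDC category, as percentages taken from the slide:
# (English (USA), Italian (Italy)).
shares = {
    "Rel/Theo": (21, 7),
    "Art/G/S": (27, 11),
    "Edu/Law/SocSci": (15, 19),
    "AppSci": (11, 28),
    "His/Geo/Bio": (13, 13),
    "Nat & ExSci": (9, 9),
    "Phi/Psy": (4, 12),
    "Gen/InfoSci": (0, 1),
}

TARGET = 40  # non-fiction texts per language (figure from the slide)


def allocate(percent, target=TARGET):
    """Assumed rule: proportional share of the target, rounded to a whole text."""
    return round(target * percent / 100)


english = {cat: allocate(en) for cat, (en, _) in shares.items()}
italian = {cat: allocate(it) for cat, (_, it) in shares.items()}

for cat in shares:
    print(f"{cat:15s} {english[cat]:3d} {italian[cat]:3d}")
print("Total", sum(english.values()), sum(italian.values()))
```

Under this assumption both columns sum to the 40 texts given on the slide, which is why simple rounding needs no further adjustment here.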

13 Non-translations

               Italian (titles)                       English (titles)
               Translations (E → I)  Book production  Translations (I → E)    Book production
Fiction        40%                   27% (Italy)      28% (USA), 31% (UK)     22% (USA), 24% (UK)
Non-fiction    60%                   73% (Italy)      72% (USA), 69% (UK)     78% (USA), 76% (UK)

14 Non-fictional, non-translational components in the corpus vs. total book production

                  Non-translation     Production   Non-translation     Production   Production
                  Italian component   (Italy)      English component   (USA)        (UK)
Rel/Theo               21%               8%             7%                7%          12%
Art/G/S                27%              16%            11%                9%          27%
Edu/Law/SocSci         15%              28%            19%               29%          20%
AppSci                 11%              16%            28%               23%          15%
His/Geo/Bio            13%              15%            13%               13%          13%
Nat & ExSci             9%               5%             9%                8%           4%
Phi/Psy                 4%               8%            12%                5%           3%
Gen/InfoSci             0%               4%             1%                6%           6%
Total                 100%             100%           100%              100%         100%

15 Corpus composition (non-fiction)

                  No. of texts          Supp. texts
                  English   Italian     English   Italian     Total
Rel/Theo             8         3                     5           8
Art/G/S             11         4                     7
Edu/Law/SS           6         8           2                     8
AppSci               4        11           7
His/Geo/Bio          5         5                                 5
Nat & ExSci          4         4                                 4
Phi/Psy              2         5           3                     5
Gen/InfoSci          0         0                                 0
Total               40                                          52
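The supplementary counts on this slide appear to top up the smaller language in each category until both sides match, so each category total equals the larger of the two counts. This reading is an assumption, but it reproduces the overall total of 52. A short Python sketch, using the English and Italian counts from the slide and hypothetical variable names:

```python
# (English, Italian) non-fiction text counts per category, from the slide.
counts = {
    "Rel/Theo": (8, 3),
    "Art/G/S": (11, 4),
    "Edu/Law/SS": (6, 8),
    "AppSci": (4, 11),
    "His/Geo/Bio": (5, 5),
    "Nat & ExSci": (4, 4),
    "Phi/Psy": (2, 5),
    "Gen/InfoSci": (0, 0),
}

composition = {}
for cat, (english, italian) in counts.items():
    # Assumed rule: each category is raised to the larger of the two counts.
    total = max(english, italian)
    composition[cat] = {
        "supp_english": total - english,  # texts added on the English side
        "supp_italian": total - italian,  # texts added on the Italian side
        "total": total,
    }

grand_total = sum(row["total"] for row in composition.values())
print(grand_total)
```

With these inputs the sketch gives 5 supplementary Italian texts for Rel/Theo and 2 supplementary English texts for Edu/Law/SS, matching the cells that survive on the slide.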

16 Core corpus

Sub-category                            No. of text samples   No. of words per sample
Fiction                                          40                  12,500
Non-fiction                                      52                  12,500
Total for one component                       92 texts           1,150,000 words
Total for core corpus (4 components)         368 texts           4,600,000 words
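The totals on this slide follow directly from the sample counts and the fixed sample size. A small Python check of that arithmetic (variable names are illustrative only):

```python
WORDS_PER_SAMPLE = 12_500          # words per text sample (from the slide)
COMPONENTS = 4                     # components in the core corpus (from the slide)
samples = {"fiction": 40, "non-fiction": 52}

texts_per_component = sum(samples.values())
words_per_component = texts_per_component * WORDS_PER_SAMPLE

core_texts = texts_per_component * COMPONENTS
core_words = words_per_component * COMPONENTS

print(f"One component: {texts_per_component} texts, {words_per_component:,} words")
print(f"Core corpus:   {core_texts} texts, {core_words:,} words")
```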

17 Expansions: Full texts; Corrections to corpus composition; Satellite corpora

