An investigation into Corpus-based learning about language in the primary-school: CLLIP. Corpus evidence of the features of children's literature.


The CLLIP Project: Background
CLLIP: Corpus-based Learning about Language In the Primary-school
ESRC-funded project
Exploring the potential for using corpus evidence with primary school children (9-11 year olds) for learning about language (L1)

Linguistic analysis of the CLLIP corpus
The CLLIP corpus is a collection of the texts in the British National Corpus (BNC) that were written for a child audience.
The corpus contains imaginative fiction, factual prose and other texts.
Linguistic analysis was conducted on the imaginative fiction texts only.

Project research question 1
1. Does linguistic analysis of the corpus data confirm, extend or challenge the descriptions of English lexis and syntax which are identified as teaching targets in the National Curriculum and the National Literacy Strategy?
1a. Does any such analysis suggest a need for further research on the basis of a larger dedicated corpus of writing for children?

Corpora: CLLIP and comparison
CLLIP corpus: imaginative fiction written for a child audience, from the BNC (31 texts)
Comparison corpus (hereafter Comp): imaginative fiction written for an adult audience, from the BNC (315 texts)
Newspaper corpus: newspaper texts from the BNC (114 texts)

Purpose of the linguistic analysis
To determine the characteristic features of the language of imaginative fiction written for children
To compare and contrast the language of these texts with the language of imaginative fiction written for adults, and also with the language of newspapers

Questions
What is distinctive about the discourse of the CLLIP corpus?
What similarities and differences are there in the overall frequencies of words and of POSgrams in the three corpora?
Is there a difference in the uses of certain lexical items between the child and adult fiction corpora?
A POSgram is a sequence of parts of speech, such as an article followed by an adjective followed by another adjective then a noun (e.g. "a bright red car"; "the last chocolate biscuit"). In this study, we look at 6-grams (sequences of six parts of speech).
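The POSgram definition above can be illustrated with a minimal Python sketch. It is not the tooling used in the CLLIP study, which the slides do not describe. The function name `posgrams`, the tag labels and the hand-tagged sample sentence are all assumptions made for illustration; they are not BNC tags or BNC data.

```python
from collections import Counter


def posgrams(tags, n=6):
    """Return every run of n consecutive part-of-speech tags as a tuple."""
    # A sequence shorter than n yields an empty list.
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


# Hypothetical hand-tagged sentence; the tag labels are illustrative only.
tagged = [
    ("at", "PREP"), ("the", "ART"), ("end", "NOUN"), ("of", "OF"),
    ("the", "ART"), ("road", "NOUN"), ("stood", "VERB"), ("a", "ART"),
    ("bright", "ADJ"), ("red", "ADJ"), ("car", "NOUN"),
]
tags = [tag for _, tag in tagged]

# Count how often each 6-POSgram occurs in the tag sequence.
counts = Counter(posgrams(tags, n=6))
for gram, freq in counts.most_common(3):
    print(" + ".join(gram), freq)
```

Run over a whole tagged corpus rather than one sentence, the same count would give a ranked list of POSgrams that could be compared across corpora.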

Frequency of Parts of Speech
For each part of speech you can see 3 columns. The first two columns (left and middle) are for the CLLIP and Comp corpora respectively. What is remarkable is the similarity between the two for most parts of speech. There are proportionally many more nouns in the Newspaper corpus, while there are more lexical verbs in the fiction corpora.

Frequency data
CLLIP – 22.0%; Comp – 22.4%; News – 23.5%
These figures show the percentage of the overall frequency that the top ten tokens account for in each corpus. The top ten most frequent tokens for the CLLIP and Comp corpora are remarkably similar, particularly the top 4. Note the greater frequency of "of" in the News corpus, which is related to the higher number of nouns, in expressions such as "the resignation of".
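A coverage figure of this kind (the share of all tokens accounted for by the most frequent items) could be computed along the following lines. This is a sketch only: the function name `top_n_coverage` and the toy token list are assumptions, and the slides do not say how the reported percentages were actually calculated.

```python
from collections import Counter


def top_n_coverage(tokens, n=10):
    """Percentage of all tokens accounted for by the n most frequent types."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(freq for _, freq in counts.most_common(n))
    return 100.0 * top / total


# Toy example, not corpus data: "the" occurs 3 times and "and" twice in 8 tokens.
tokens = "the cat and the dog and the bird".split()
print(top_n_coverage(tokens, n=2))
```

Applying the same function to each corpus with n=10 would give one comparable percentage per corpus.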

Frequency - adjectives
CLLIP – 14.6%; Comp – 11.3%; News – 11.9%
Once again, the top 11 adjectives in the two fiction corpora are remarkably similar, while the Newspaper corpus contains many adjectives that refer to social attributes. The figures above indicate that the top 11 adjectives in the CLLIP corpus do a larger amount of work than those in the other two corpora.

Frequency - nouns
CLLIP – 8.3%; Comp – 7.8%; News – 6.7%

POSgram information
This table shows the most frequent 6-POSgrams for each corpus. For each corpus, the sequence preposition + article + noun + of + article + noun is most common, followed by preposition + article + noun + preposition [not of] + article + noun in the two fiction corpora.

Prep + art + [ ] + of + art + noun
This slide shows the nouns that most frequently fill the third slot in the preposition + article + noun + of + article + noun sequence. The sequence most commonly indicates spatial or temporal relations in the fiction corpora, while in the newspaper corpus it can also express causal relations. The top six nouns in the CLLIP corpus account for 51% of the 6-POSgrams of this sequence.
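Counting the nouns that fill the third slot of this sequence might look like the sketch below. The function name `slot_fillers`, the tag labels and the three sample phrases are assumptions made for illustration; they are not taken from the CLLIP corpus or from the project's own software.

```python
from collections import Counter

# Illustrative tag labels for the sequence discussed above (not BNC tags).
PATTERN = ("PREP", "ART", "NOUN", "OF", "ART", "NOUN")


def slot_fillers(tagged, pattern=PATTERN, slot=2):
    """Count the words found at position `slot` in every match of `pattern`."""
    n = len(pattern)
    fillers = Counter()
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        if tuple(tag for _, tag in window) == pattern:
            fillers[window[slot][0].lower()] += 1
    return fillers


# Hypothetical tagged text made of three phrases that match the pattern.
tagged = [
    ("at", "PREP"), ("the", "ART"), ("end", "NOUN"),
    ("of", "OF"), ("the", "ART"), ("road", "NOUN"),
    ("in", "PREP"), ("the", "ART"), ("middle", "NOUN"),
    ("of", "OF"), ("the", "ART"), ("night", "NOUN"),
    ("at", "PREP"), ("the", "ART"), ("end", "NOUN"),
    ("of", "OF"), ("the", "ART"), ("day", "NOUN"),
]

fillers = slot_fillers(tagged)
total = sum(fillers.values())
for noun, freq in fillers.most_common():
    print(noun, freq, f"{100.0 * freq / total:.0f}%")
```

Summing the shares of the six most frequent fillers would give a figure comparable to the percentage reported for the top six nouns.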

Body parts: NECK
Do nouns in the CLLIP corpus more typically refer to physical entities in the world than the equivalent nouns in the Comp corpus? The two right-hand columns show the percentage of uses of the word "neck" that refer to part of a piece of clothing, or that are used in an idiomatic sense. The adult corpus contains only a marginally higher percentage of idiomatic uses.

Neck
CLLIP: "stick your neck out"; little physical contact; intimacy with animals; neck as site of pain
Comp: "breathing down your neck"; lots of physical contact; intimacy between humans; neck as site of desire, tenderness, place for ornamentation

Finger
CLLIP: figurative – 13%; jab, prod, lay, run, put; accusing, admonishing; used for drawing, for indicating the need for silence and for pulling triggers
Comp: figurative – 19%; put, raise, point, run, jab, wag; furtive, tentative, negligent; used for communicating, for feeling [contours & textures], for wearing rings

"in time" – CLLIP
We looked at uses of "in time" in the CLLIP corpus. The dominant meaning is immediate: characters are concerned to accomplish something before the expiry of an implied, externally imposed deadline. A childly perspective seems often to imply staying on the right side of trouble or sanction.

"in time" – Comp
"In time" in the Comp corpus is used in several senses:
i: in the fullness of time, time on a large scale, which the speaker can perceive from a distance
ii: within an appropriate period of time
iii: others, as in the last line, where "in" and "time" have more separate meanings than is usual in the phrase