Published by Ethel Lawson. Modified over 8 years ago.
1
Typology: Language Sampling Anna Siewierska & Dik Bakker
6
Empirical Cycle
[Diagram with the elements: languages (L), Definition, Categories: C1 … C3, Hypotheses, PROVISIONAL DATA, a second set of languages (L), TEST]
8
Overview
1. Collecting language data
2. Why a sample?
3. Types of biases in samples
4. Two strategies
5. Samples in typological literature
6. The DV method
10
Data collecting
Languages of the world: n ≈ 7000
Sample: 50 – 500 languages
11
Data collecting
Why not all languages in our database?
- Too many
- Only <1000 well described (grammar); <2000 with a partial sketch
- Not (always) necessary
- Sometimes even wrong
- Impossible even in principle
21
All languages: impossible
Extant languages: 7000
Extinct languages: 500 (Ruhlen 1991)
- Latin, Cl. Greek, Gothic, Hebrew, Hittite, …
- Cl. Turkic, Cl. Tibetan, Archaic Chinese, …
- Manx, Cornish, …
- Illinois, Mohican, Massachusett, Carolina, …
Problem? No native speaker intuitions …
- X1, X2, X3, …, Xn: ????
22
All languages: impossible
Homo sapiens: 200,000 BP
Great Leap Forward: 40,000 BP
Average n of lgs: 6000
Diachronic change: 1000 years
X lgs: (40,000 / 1000) * 6000 = 240,000
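The estimate on this slide can be reproduced in a few lines. This is only a minimal sketch: the function and parameter names are illustrative and do not come from the slides, and the figures are the slide's own rough assumptions (a constant language population that is fully replaced once per 1000 years).

```python
def estimate_lost_languages(years_of_language: int = 40_000,
                            turnover_years: int = 1_000,
                            average_languages: int = 6_000) -> int:
    """Rough count of languages that existed but left no record.

    Assumes, as the slide does, that the whole language population is
    replaced once per `turnover_years` and stays at a constant size.
    """
    turnovers = years_of_language // turnover_years  # 40 replacement cycles
    return turnovers * average_languages


lost = estimate_lost_languages()
print(lost)  # 240000
```

Changing any of the three inputs shifts the estimate proportionally, which is why the result is best read as an order of magnitude.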
24
All languages: impossible
Extant languages: 7000
Extinct languages: 500
X1, X2, X3, …, Xn: 240,000
Human languages: 247,500
Share of all human languages: 3.0%
33
All languages: impossible
Extant, documented: 1500
Extinct, documented: <100
X1, X2, X3, …, Xn: 240,000
Human languages: 247,500
Share of all human languages: 0.6% (spoken anno 2000)
Typology: Universals of “Human Language”
Uniformitarianism (Lass 1997)
35
All languages: impossible
Typology: Variety among human languages (0.6%, spoken anno 2000)
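The two percentages on these slides can be checked against the counts given. The slides do not say which counts each percentage is taken over, so the two ratios below are an assumption; they are simply the ones that reproduce 3.0% and 0.6%.

```python
extant = 7_000
extinct_known = 500
unrecorded = 240_000
total = extant + extinct_known + unrecorded  # all human languages


def share(count: int, whole: int) -> float:
    """Percentage of `whole` represented by `count`, to one decimal."""
    return round(100 * count / whole, 1)


# Assumed reading 1: all extant plus known extinct languages.
known_share = share(extant + extinct_known, total)
# Assumed reading 2: documented extant languages only (1500 on the slide).
documented_share = share(1_500, total)

print(total, known_share, documented_share)  # 247500 3.0 0.6
```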
38
Variety: rare types
- Clicks (only in one family, Khoisan: 30 lgs)
- Active nominal marking (Pomo, Laz)
- Opposite person hierarchy Acc-Erg (Tib.Burm.)
- Tripartite agreement on ditransitives
- Syntactic ergativity (Aus, Maya)
- Adverbial agreement with focal (Aus, Cauc)
- OSV main clause order (S.Am)
N.B. combination of (rare) features (cf. Greenberg)
“Rara et Rarissima”
39
Data collecting
Why not all languages in our database?
- Too many
- Only <1000 well described (grammar); <2000 with a partial sketch
- Not (always) necessary
- Sometimes even wrong
- Impossible even in principle: problematic for variety, possibly not for universality
42
Too many languages
Samples in the typological literature (topic: number of languages):
- Greenberg (1963), word order: 30
- Hawkins (1983), word order: 225
- Tomlin (1986), word order: 402
- Nichols (1992), head/dependent marking: 174
- Bybee (1994), tense/aspect/mood: 76
- Siewierska & Bakker (1990-), person agreement: 450
- Dryer (1985-), word order: 1200
Typical PhD project (1 person, 3 years): 50 - 100
44
Data collecting
Why not all languages in our database?
- Too many → sample inevitable
- Only <1000 well described (grammar); <2000 with a partial sketch
- Not (always) necessary
- Sometimes even wrong
47
Lack of material
Bibliographic bias:
- (very) old
- scarce
- theory specific (Tagmemics; GG)
- restricted to phonology and morphology
- biased selection of the world’s languages
52
Lack of material
Further types of bias:
- Genetic: Indo-European, Ugric, Bantu (++); Australian, Amerindian, Papuan (- -)
- Areal: Sprachbund, e.g. Balkan, Circum-Baltic, C. America, S.E. Asia, …
59
Lack of material
Further types of bias:
- Typological
Parametric variables (Hawkins 1983):
- Adposition: Prep
- Noun phrase: [ Dem Num Adj Gen Rel N ] NP
Prepositional Noun Modifier Hierarchy (PrepNounModHierarchy):
Prep → (((NDem OR NNum) → NA) AND (NA → NGen) AND (NGen → NRel))
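The hierarchy can be read as a chain of implications over word-order settings. The sketch below assumes that reading (the connectives between the terms were lost in the slide text), and the function and flag names are illustrative only.

```python
def implies(p: bool, q: bool) -> bool:
    """Material implication: p -> q."""
    return (not p) or q


def satisfies_hierarchy(prep: bool, n_dem: bool, n_num: bool,
                        n_adj: bool, n_gen: bool, n_rel: bool) -> bool:
    """True if a language's order settings are consistent with the hierarchy.

    Each flag n_x means "the noun precedes modifier x" (NDem, NNum, NA, ...).
    """
    chain = (implies(n_dem or n_num, n_adj)
             and implies(n_adj, n_gen)
             and implies(n_gen, n_rel))
    return implies(prep, chain)


# Prepositional, with NA, NGen and NRel: consistent.
print(satisfies_hierarchy(True, False, False, True, True, True))   # True
# Prepositional, with NDem but AdjN: violates the first implication.
print(satisfies_hierarchy(True, True, False, False, True, True))   # False
```

A postpositional language passes trivially, because the hierarchy only constrains languages with prepositions.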
62
Lack of material
Further types of bias:
- Cultural: linguistic relativity (Sapir; Whorf)
  Lucy (1992): count nouns vs classifiers ~ counting tasks
66
Lack of material
Further types of bias:
- Community size: small → high genetic drift (Kimura 1983)
  Also linguistic drift? (Dahl: hunter/gatherer)
  N.B. OSV/OVS only in < 3000 languages
70
Lack of material
Further types of bias:
- Language contact: borrowed phenomenon measured twice
  BUT: contact may also create new types
72
Data collecting
Why not all languages in our database?
- Only <1000 well described (grammar); <2000 with a partial sketch (= bibliographical bias)
  → Cater for biases by stratifying for the relevant dimensions
- Not (always) necessary
- Sometimes even wrong
74
Small is beautiful
A good sample may be better than a large sample.
Sample type and size depend on the goal of the project:
- Establish the probability of a language type (e.g. prepositional vs postpositional) → Probability sample
- Explore the existing variety on a certain dimension (e.g. case systems; combination of order patterns) → Variety sample
75
Small is beautiful
1. Probability sample
- Only independent cases
- Control for: genetic relations; language contact
- But: relative stability of relevant variables, e.g. reflexive passive (Romance vs Slavic)
76
Small is beautiful
Samples in the typological literature (topic: number of languages):
- Greenberg (1963), word order: 30
- Hawkins (1983), word order: 225
- Tomlin (1986), word order: 402
- Nichols (1992), head/dependent marking: 174
- Bybee (1994), tense/aspect/mood: 76
- Siewierska & Bakker (1990-), person agreement: 450
- Dryer (1985-), word order: 1200
probab
77
Large may be better
2. Variety sample
- Maximum (all?) different cases
- Cater for variation in genetic/areal groups
- Typically cyclical: stop when no new cases found
- Research parameters typically unknown!
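The cyclical procedure ("stop when no new cases found") can be sketched as a loop over successive batches of languages. Everything below is hypothetical: the batch structure, the language labels and the word-order values are toy data chosen for illustration, not part of the slides.

```python
from typing import Callable, Iterable, List, Set


def sample_until_saturated(batches: Iterable[List[str]],
                           type_of: Callable[[str], str]) -> Set[str]:
    """Add batches of languages until a batch yields no new type."""
    seen: Set[str] = set()
    for batch in batches:
        new_types = {type_of(lang) for lang in batch} - seen
        if not new_types:      # nothing new in this cycle: stop extending
            break
        seen |= new_types
    return seen


# Hypothetical toy data: language label -> basic word order.
orders = {"A": "SOV", "B": "SVO", "C": "VSO", "D": "SOV", "E": "SVO", "F": "OSV"}
batches = [["A", "B"], ["C", "D"], ["D", "E"], ["F"]]
found = sample_until_saturated(batches, lambda lang: orders[lang])
print(sorted(found))  # ['SOV', 'SVO', 'VSO']
```

In this toy run the loop stops after the third batch, so the rare OSV case in the last batch is never seen. That is one reason a variety sample tends to be large.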
78
Probability vs Variety
Probability sample:
- relatively small (30 – 150)
- may be too large (double cases)
Variety sample:
- relatively large (> 200)
- cannot be too large (just superfluous)
79
Sampling in the literature
Introductions to typology (pages on sampling; number of pages):
- Comrie (1981): pp. 9-12 (4)
- Croft (1990): pp. 18-26 (9)
- Whaley (1997): pp. 36-43 (8)
- Song (2001): pp. 17-38 (22)
80
Probability sampling
Bell (1978)
- genetic, areal and typological bias
- 478 genetic groups (> 3000 year depth)
- per family: n of lgs proportional to n of groups
- problems: sample < 478: selection; small families ‘disappear’
81
Probability sampling
Perkins (1980)
- Bell stratified for culture (Murdock 1967)
- 50 languages with optimal genetic and cultural distance
- good for probability, too small for variety
82
Probability sampling
Dryer (1989)
- ~ Bell, but:
- 322 established genera, 3500 – 4000 years deep
- variable values established per genus, not per language (mainly stable, else the most frequent)
- 5 macro-areas, counting genera per area
84
Probability sampling
Genera per area:
        Africa  Eurasia  Aus-NG  N.Amer  S.Amer
SOV     22      26       29      26      18
SVO     21      19       6       6       5
SOV > SVO
- Good for universal preferences on stable variables
- Unclear how to generalize to other types of sampling, with languages central
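The comparison in the table can be made explicit by checking the preference area by area. This is a minimal sketch using the counts from the slide; treating a preference as established only when it holds in every area is an assumption made here for illustration, and the names are not from the slides.

```python
AREAS = ["Africa", "Eurasia", "Aus-NG", "N.Amer", "S.Amer"]
GENERA = {  # number of genera per area, as given on the slide
    "SOV": [22, 26, 29, 26, 18],
    "SVO": [21, 19, 6, 6, 5],
}


def areas_preferring(a: str, b: str) -> list:
    """Areas in which type `a` is found in more genera than type `b`."""
    return [area for area, x, y in zip(AREAS, GENERA[a], GENERA[b]) if x > y]


sov_areas = areas_preferring("SOV", "SVO")
holds_everywhere = len(sov_areas) == len(AREAS)
print(sov_areas, holds_everywhere)
```

Counting genera rather than languages keeps one large family from dominating the result, and comparing per area guards against a single linguistic area driving it.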
85
Variety sampling
Characteristics:
- Create variety samples of any size
- Free choice of classification used (Gen/Ar/Typ)
- Stratification on other parameter (Gen: Ar/Typ)
- Generate new samples + evaluate existing samples
- Fully formalized and computer implemented
86
Variety sampling
Central idea:
- classifications express linguistic (dis)similarities between languages
- established on the basis of expert knowledge
- subject to cyclical improvement and refinement
- best starting point for explorative research into variation among languages
93
Variety sampling
[Figure: family trees for Afro-Asiatic, Amerindian, Caucasian and Dravidian, with language labels HBR, ARB, QUE, GUA, GEO, CHE, KAN, TAM]
Minimum sample (Basic Sample): 1 language per family
- Select the language with the best description (for the purpose)
- Includes all isolates: Basque, Burushaski, Ket, Nahali, …
Basic Sample:
- Ruhlen (1991): 27
- Ethnologue (2005): 120
- Murdock (1967): 50
98
Variety sampling
[Figure: family trees for Afro-Asiatic, Amerindian, Caucasian and Dravidian, first labelled DV=3, DV=6, DV=2, then DV=55.5, DV=178.4, DV=8.5; 362]
Extending the Basic Sample to the preferred size, e.g. extend the Ruhlen-based sample from 27 → 50
KEY: relative complexity of the family tree (DV)
Adjusting DV values to the full tree structure:
- Recursively down the trees
- Lower levels contribute relatively less to DV
Formula for weight per level (see Rijkhoff & Bakker 1998):
$$ C_k = C_{k-1} + (N_k - N_{k-1}) \cdot \frac{MAX - (k-1)}{MAX} $$
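The recurrence in the formula can be traced on a small example. The sketch below applies it exactly as written; reading N_k as the number of nodes at level k, MAX as the maximum depth, and taking C_0 = N_0 = 0 are assumptions, since the slide does not define the symbols. How the per-level values combine into the final DV is not shown on the slide and is not attempted here.

```python
from typing import List


def level_contributions(nodes_per_level: List[int], max_depth: int) -> List[float]:
    """Apply C_k = C_{k-1} + (N_k - N_{k-1}) * (MAX - (k-1)) / MAX.

    nodes_per_level[k-1] is taken to be N_k; C_0 and N_0 are taken to be 0.
    Both readings are assumptions, not stated on the slide.
    """
    contributions = []
    prev_c, prev_n = 0.0, 0
    for k, n in enumerate(nodes_per_level, start=1):
        # The factor (MAX - (k-1)) / MAX shrinks as k grows, so nodes
        # added at deeper levels contribute relatively less.
        prev_c = prev_c + (n - prev_n) * (max_depth - (k - 1)) / max_depth
        prev_n = n
        contributions.append(prev_c)
    return contributions


# Hypothetical tree: 2 nodes at level 1, 5 at level 2, 5 at level 3.
values = level_contributions([2, 5, 5], max_depth=3)
print(values)
```

In this toy tree, the three nodes added at level 2 are weighted by 2/3, and level 3 adds nothing because it has no more nodes than level 2.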
104
Variety sampling
Computer program:
- Number of lgs per family given sample size
- Optimal distribution over subbranches (maximum distance → maximum variety)
Languages per family by sample size, RUHLEN (1991):
Family (n = 5273)      30    50    100   250
Afro-Asiatic (258)     1     2     6     16
Amerind (854)          2     7     18    51
Austric (1186)         2     5     14    39
Caucasian (38)         1     1     1     3
Chukchi (5)            1     1     1     1
Indo-European (180)    1     2     4     11
Percentages highlighted on the slide: 5.9%, 3.3%, 6.1%
110
Variety sampling
Amerind main branches (n = 854; n = 51 in the 250 sample):
Main branch        Total   DV     In 250 sample
Central            60      19.1   6 (= 10%)
Ge-Pano Carib      193     29.3   9
Northern           232     45.5   14
Eq-Tucanoan        268     45.0   14
Chibchan-Paezan    71      16.9   5
Andean             30      9.9    3
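The last column of this table can be approximated by splitting the family's 51 languages over the branches in proportion to their DV. Plain proportional rounding is an assumption made for illustration: it happens to reproduce the slide's column, but the actual program may distribute languages differently, and the names below are not from the slides.

```python
BRANCHES = {  # main branch: (languages, DV), as given on the slide
    "Central": (60, 19.1),
    "Ge-Pano Carib": (193, 29.3),
    "Northern": (232, 45.5),
    "Eq-Tucanoan": (268, 45.0),
    "Chibchan-Paezan": (71, 16.9),
    "Andean": (30, 9.9),
}


def allocate_by_dv(branches: dict, quota: int) -> dict:
    """Split `quota` languages over branches in proportion to their DV."""
    total_dv = sum(dv for _, dv in branches.values())
    return {name: round(quota * dv / total_dv)
            for name, (_, dv) in branches.items()}


allocation = allocate_by_dv(BRANCHES, 51)
# Sampling fraction per branch, e.g. Central: 6 of 60 languages.
fractions = {name: allocation[name] / BRANCHES[name][0] for name in BRANCHES}
print(allocation)
```

Simple rounding does not guarantee that the parts add up to the quota in general, so a real implementation would need a rule for distributing any remainder. The fractions show the intended effect: a small but diverse branch such as Central is sampled more densely than a large branch such as Eq-Tucanoan.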
114
Variety sampling
[Figure: the Andean subtree (3 / 30) within Amerind (51 / 854), with labels NORTH, SOUTH, AYMA, QUECH, CAHUA, URA]
121
Variety sampling: output
Typical output:
Classification: Ruhlen91
Criterion 1: Diversity Value: dynamic/global/average
Sample size: 100 (1.90 % of 5273)
Each line: family (DV / number of branches / number of languages), followed by the number of languages selected.
Afro-Asiatic (55.53/6/258) 6
Altaic (15.07/2/62) 2
Amerind (178.44/6/854) 18
Australian (67.58/30/262) 7
…
Na-Dene (9.44/2/41) 1
Niger-Kordofanian (90.38/2/1068) 9
…
Basque (1.00/0/0) 1
Etruscan (1.00/0/0) 1
Gilyak (1.00/0/0) 1
129
Variety sampling: output
Classification: Ruhlen91
Criterion 1: Diversity Value: dynamic/global/average
Sample size: 100 (1.90 % of 5273)
…
Niger-Kordofanian (2/1068) 9
  Niger-Congo (2/1036) 8
    Niger-Congo Proper (2/1007) 7
      Central Niger-Congo (2/961) 6
        South Central Niger-Congo (3/755) 3
          Eastern (9/703) 1
          Western (2/47) 1
          Ijo-Defaka (2/5) 1
        North Central Niger-Congo (4/206) 3
      West Atlantic (3/46) 1
    Mande (3/29) 1
  Kordofanian (2/32) 1
…
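The Niger-Kordofanian listing shows how the 9 languages assigned to the family are passed down its subbranches. The sketch below encodes the listing as a tree and checks that each parent's counts equal the sums over its listed children. The nesting is inferred from the language counts, because the indentation was lost in the slide text, so it should be treated as an assumption; the class and function names are illustrative.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    languages: int   # languages under this node
    sampled: int     # languages selected for the sample
    children: List["Node"] = field(default_factory=list)


def is_consistent(node: Node) -> bool:
    """Check that every listed parent's counts equal the sums over its children."""
    if not node.children:   # branches whose daughters are not listed
        return True
    return (node.sampled == sum(c.sampled for c in node.children)
            and node.languages == sum(c.languages for c in node.children)
            and all(is_consistent(c) for c in node.children))


tree = Node("Niger-Kordofanian", 1068, 9, [
    Node("Niger-Congo", 1036, 8, [
        Node("Niger-Congo Proper", 1007, 7, [
            Node("Central Niger-Congo", 961, 6, [
                Node("South Central Niger-Congo", 755, 3, [
                    Node("Eastern", 703, 1),
                    Node("Western", 47, 1),
                    Node("Ijo-Defaka", 5, 1),
                ]),
                Node("North Central Niger-Congo", 206, 3),
            ]),
            Node("West Atlantic", 46, 1),
        ]),
        Node("Mande", 29, 1),
    ]),
    Node("Kordofanian", 32, 1),
])

print(is_consistent(tree))  # True
```

Note how small branches such as Ijo-Defaka (5 languages) and Kordofanian (32) each receive a language, while Eastern (703) also receives only one: the distribution favours structural variety over branch size.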
130
Variety sampling
Side effect of large (variety) sample: hidden diachrony
131
Variety sampling
Problems:
- works only on tree-shaped classifications
- time depth in genetic trees: unbalanced
- not good for probability samples
- Creoles? Extinct languages?
137
Round off
Two sample strategies:
1. Probability sample
- relatively small
- control for Gen/Ar/Typ bias
2. Variety sample
- relatively large
- may be stratified for bias parameters
- may have diachronic dimension
Sample types:
1. Probability sample
2. Variety sample
3. Random sample: when bias is unimportant
4. Convenience sample: when bibliographical constraints kick in...
138
?