Advances in Automated Language Classification. ASJP Consortium. Dik Bakker, Lancaster.

Presentation on theme: "Advances in Automated Language Classification. ASJP Consortium. Dik Bakker, Lancaster." Presentation transcript:

1 Advances in Automated Language Classification. ASJP Consortium. Dik Bakker, Lancaster

2 ASJP: Automatic Reconstruction
Overview
Project: ASJP (Automated Similarity Judgment Program)

3 Overview
Project: the ASJP members are:
- Sören Wichmann (Germany; Netherlands)
- Viveka Velupillai (Germany)
- André Müller (Germany)
- Robert Mailhammer (Germany)
- Hagen Jung (Germany)
- Eric Holman (US)
- Anthony Grant (UK)
- Dmitry Egorov (Russia)
- Pamela Brown (US)
- Cecil Brown (US)
- Dik Bakker (UK; Netherlands)

7 Overview
Project: ASJP (Automated Similarity Judgment Program)
Overall goal: Automatic reconstruction of language relationships
Basis: Distance matrix between individual languages on the basis of linguistic features
Method: Lexicostatistics: mass comparison of lexical items

16 Overview
MAIN GOAL: Reconstruction of Language Relationships
Derived goals (among others):
- Critical assessment and refinement of existing classifications
- Classify newly described and unclassified languages
- Estimate time depths between languages / genera / families
- Search for (ir)regularities in phylogenies
- Test hypotheses (e.g. Atkinson et al 2008; 'elbow' phenomenon)
- Experimentally find the best/optimal dating method
- Detect borrowings

21 Overview
1. The basic list of lexical items
2. Comparing languages
3. Some results: genetic and areal proximity
4. On Inheritance vs Borrowing
5. Conclusions

22 1. The basic list of lexical items

29 Lexical items
Word list: Swadesh 100 basic meanings
- Word coined in most languages
- Collected in fieldwork lexicons and grammars
- Inherited rather than borrowed
- Culturally independent
- Stable over time
- Few synonyms

36 Swadesh 100 basic meanings
 1. I       21. dog      41. nose     61. die    81. smoke
 2. you     22. louse    42. mouth    62. kill   82. fire
 3. we      23. tree     43. tooth    63. swim   83. ash
 4. this    24. seed     44. tongue   64. fly    84. burn
 5. that    25. leaf     45. claw     65. walk   85. path
 6. who     26. root     46. foot     66. come   86. mountain
 7. what    27. bark     47. knee     67. lie    87. red
 8. not     28. skin     48. hand     68. sit    88. green
 9. all     29. flesh    49. belly    69. stand  89. yellow
10. many    30. blood    50. neck     70. give   90. white
11. one     31. bone     51. breasts  71. say    91. black
12. two     32. grease   52. heart    72. sun    92. night
13. big     33. egg      53. liver    73. moon   93. hot
14. long    34. horn     54. drink    74. star   94. cold
15. small   35. tail     55. eat      75. water  95. full
16. woman   36. feather  56. bite     76. rain   96. new
17. man     37. hair     57. see      77. stone  97. good
18. person  38. head     58. hear     78. sand   98. round
19. fish    39. ear      59. know     79. earth  99. dry
20. bird    40. eye      60. sleep    80. cloud  100. name

40 Lexical items: further reduction
Early analyses have shown:
- Optimal 40/100 item subset gives same results
  → Less work
  → Less missing data
  → Faster processing; combinatorial explosion: 40 : 100 ~ 3 * 10^7 : 2 * 10^10

43 Lexical items: stability
Most stable items:
Iteratively throw out the most unstable item in terms of variation within genera (3500-4000 years; Dryer 2001; 2005), e.g. Germanic, Romance, Slavic, …
Formula: S = (E - U) / (100 - U) (weighted average % matches, Eq vs Uneq)

44 [Figure: Ethnologue (Goodman-Kruskal) and WALS (Pearson); axis ends marked ++ and --]

46 40 Most Stable [the 100-item word list with the 40 most stable items highlighted]

47 Homophones [the 100-item word list with homophones highlighted]

51 Lexical items: transcription
First phase of project (2007):
Problems with full IPA representation of words:
- data entry via keyboard
- simple programming language (Fortran; Pascal)
→ Recoding to simplified ASJPcode (ASCII only)

56 Lexical items: transcription
ASJPcode:
- 7 vowels
- 34 consonants
- Operators for: nasalization, labialization, palatalization, aspiration, glottalization
→ (some) complex syllables simplified (VXC → VC)

59 Abaza (Caucasian):
Meaning  IPA            ASJPcode
PERSON   ʕʷɨʧʼʲʷʕʷɨs    Xw~3Cw"yXw~3s
LEAF     bɣʲɨ           bxy~3
SKIN     ʧʷazʲ          Cw~azy~
HORN     ʧʼʷɨʕʷa        Cw"~3Xw~a
NOSE     pɨnʦʼa         p3nc"a
TOOTH    pɨʦ            p3c

61 Lexical items
Collected to date:
- Over 2100 languages, dialects and proto-languages
- Mean number of items/language: 36.2 (/40)

62 Lexical items
Distribution:
Americas:       27%
Eurasia:        23%
Australia/PNG:  18%
Austronesia:    15%
Africa:         14%
Creoles:         2%
Artificial:      1%

63 Languages currently sampled

66 Lexical items: transcription
Second phase of project (2008):
Problems with full IPA representation solved:
1. automatic conversion IPA to integer (Python)
2. (semi-)automatic recoding to ASJPcode: transduction on the basis of a formal grammar

70 Lexical items: transcription
Abaza (Caucasian):
Meaning:  PERSON
IPA:      ʕʷɨʧʼʲʷʕʷɨs
Decimal:  661 695 616 679 700 690 695 661 695 616 115
ASJPcode: 88 119 126 51 67 34 121 119 126 88 119 126 51 115 ( = Xw~3Cw"y ~ Xw~3s)
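The IPA-to-integer step can be illustrated with a minimal Python sketch. It assumes that the "Decimal" row is simply the Unicode code point of each character, which agrees with the numbers shown for the Abaza word. The function names `to_codepoints` and `from_codepoints` are hypothetical and not taken from the project's own tooling.

```python
def to_codepoints(word):
    """Return the decimal Unicode code point of every character in `word`."""
    return [ord(ch) for ch in word]


def from_codepoints(codes):
    """Inverse of to_codepoints: rebuild the string from its code points."""
    return "".join(chr(c) for c in codes)


# Abaza PERSON in IPA, written with escapes so the example is encoding-safe.
person_ipa = "\u0295\u02b7\u0268\u02a7\u02bc\u02b2\u02b7\u0295\u02b7\u0268s"
person_codes = to_codepoints(person_ipa)
print(" ".join(str(c) for c in person_codes))
```

The same function applied to an ASCII-only ASJPcode string yields values below 128, which is what makes the recoded forms easy to handle in older tools.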

73 Lexical items: transcription
Second phase of project (2008):
1. automatic conversion IPA to integer (Python)
2. (semi-)automatic recoding to ASJPcode: transduction on the basis of a formal grammar
Why not run on full IPA?
- correlations IPA ~ ASJP > 0.9
- but: ASJP gives a better fit with classifications → IPA too specific

75 Lexical items: transcription
IPA:      ʕʷɨʧʼʲʷʕʷɨs
Decimal:  661 695 616 679 700 690 695 661 695 616 115
ASJP++ code ( = any Unicode string ), defined by a formal grammar:
A → n661, n695, n616, …
…
P Q → A B C
…
Z → P Q Z
Optimal level of abstraction for historical phonological reconstruction?

76 2. Comparing languages

77 Comparing words
LG      I     YOU   WE
ABAZA   sErE  w3rE  Sw~ErE
ABKHAZ  s3    w3    Sw~3
AGUL    zun   wun   cw~un

82 Comparing words
LG      I     YOU   WE
ABAZA   sErE  bErE  Sw~ErE
ABKHAZ  s3    w3    Sw~3
AGUL    zun   wun   cw~un
LD_i = 3, LD_j = 4, LD_k = 3, …, LD_mean = 3.73 (Abaza vs Abkhaz)

83 Comparing words
LD_i = 4, LD_j = 4, LD_k = 4, …, LD_mean = 4.37 (Abaza vs Agul)

85 Comparing words
Mean LD: 3.73 and 4.37

88 Comparing words: Levenshtein Distance
a. Between 2 words: number of transformations to get from the shorter form to the longer one (changes, additions)
b. Between 2 languages: e.g. mean LD for the overlapping set of items (<= 40)
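As an illustration of the two definitions, here is a minimal Python sketch of the textbook dynamic-programming Levenshtein distance and of a mean LD over shared meanings. It assumes that every ASJPcode character, including modifiers such as `~` and `"`, counts as one symbol. The function names are hypothetical, and this is not the project's own implementation.

```python
def levenshtein(a, b):
    """Minimum number of substitutions, insertions and deletions turning a into b."""
    # previous[j] is the distance between the prefix of a handled so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def mean_ld(words_a, words_b):
    """Mean LD over the meanings attested in both word lists (at least one needed)."""
    shared = [m for m in words_a if m in words_b]
    return sum(levenshtein(words_a[m], words_b[m]) for m in shared) / len(shared)


# Forms taken from the Abaza / Abkhaz comparison slides.
abaza = {"I": "sErE", "YOU": "bErE", "WE": "Sw~ErE"}
abkhaz = {"I": "s3", "YOU": "w3", "WE": "Sw~3"}
for meaning in abaza:
    print(meaning, levenshtein(abaza[meaning], abkhaz[meaning]))
```

For these three items the sketch gives the per-word values 3, 4 and 3 shown on the slides. The slide mean of 3.73 is taken over the full item set, not just these three.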

94 Comparing words: Levenshtein Distance
Two problems with simple LD:
1. Value depends on length of longest word
   → Normalize: LDN = LD / L_max
2. Differences between languages in phonological overlap
   → Eliminate 'noise': LDND = LDN / LDN_different
Both are reported as percentages: 100 * LDN and 100 * LDND
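The two normalizations can be sketched in Python as follows. The sketch assumes that `LDN_different` is the mean LDN over all word pairs of the two languages that express different meanings, and that the language-level LDND divides the mean same-meaning LDN by that value. The slides do not spell this out, so treat the function `ldnd` as one plausible reading, not the definitive formula.

```python
from itertools import product


def levenshtein(a, b):
    """Standard Levenshtein distance (substitution, insertion, deletion)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def ldn(a, b):
    """LD divided by the length of the longer word (both words non-empty)."""
    return levenshtein(a, b) / max(len(a), len(b))


def ldnd(words_a, words_b):
    """Mean same-meaning LDN divided by mean different-meaning LDN."""
    shared = [m for m in words_a if m in words_b]
    same = [ldn(words_a[m], words_b[m]) for m in shared]
    different = [ldn(words_a[m], words_b[n])
                 for m, n in product(shared, repeat=2) if m != n]
    return (sum(same) / len(same)) / (sum(different) / len(different))


# Three Avar / Agul items from the comparison slides.
avar = {"I": "dun", "YOU": "mun", "FIRE": 'c"a'}
agul = {"I": "zun", "YOU": "wun", "FIRE": 'c"a'}
print(round(100 * ldn("dun", "zun"), 1), round(100 * ldnd(avar, agul), 1))
```

Dividing by the different-meaning baseline corrects for chance resemblance: two languages with small, overlapping sound inventories get low LDN values even for unrelated words, and the ratio discounts that.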

97 Comparing languages: Levenshtein Distance for Language Pair
- Mean of all LDNDs of words in common
- Synonyms (12%):
  - take minimum pair
  - take mean
Experimental option
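The two synonym strategies can be sketched in Python. The sketch assumes that, for one meaning, every form of language A is compared with every form of language B and that the result is either the smallest or the average of those values. The function `synonym_distance` and its `mode` parameter are hypothetical names, and plain LDN stands in here for the per-word distance.

```python
from itertools import product
from statistics import mean


def levenshtein(a, b):
    """Standard Levenshtein distance (substitution, insertion, deletion)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def ldn(a, b):
    """LD divided by the length of the longer word (both words non-empty)."""
    return levenshtein(a, b) / max(len(a), len(b))


def synonym_distance(forms_a, forms_b, mode="min", dist=ldn):
    """Distance for one meaning when either language lists several synonyms."""
    values = [dist(x, y) for x, y in product(forms_a, forms_b)]
    if mode == "min":
        return min(values)
    if mode == "mean":
        return mean(values)
    raise ValueError("mode must be 'min' or 'mean'")


# NEW in Avar (one form) and Agul (two forms), as on the comparison slides.
avar_new = ['c"iya']
agul_new = ['c"EyEr', 'c"ayif']
print(synonym_distance(avar_new, agul_new, "min"),
      synonym_distance(avar_new, agul_new, "mean"))
```

Taking the minimum favours the closest candidate cognate. Taking the mean penalizes a language pair when only one of several synonyms matches.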

103 Comparing languages
AVAR (AVA: NAKH-DAGHESTANIAN > AVAR-ANDIC-TSEZIC) / AGUL (AGL: NAKH-DAGHESTANIAN > LEZGIC)
I    : dun=zun       * LDND=36.6
YOU  : mun=wun       * LDND=36.6
HORN : tLar=k"arC    * LDND=66.0
FIRE : c"a=c"a       * LDND= 0.0
FULL : c"ura=ac"uf   * LDND=66.0   ALT: AGL= ac"ar
NEW  : c"iya=c"EyEr  * LDND=55.0   ALT: AGL= c"ayif
COMMON (LDND < 70) = AGL - AVA 6 (=15.8% of 38)
LD = 4.01 / LDN = 81.76 / LDND = 89.87

106 Comparing languages: AVAR / AGUL, with the alternative Agul form for NEW
NEW  : c"iya=c"ayif  * LDND=55.0   ALT: AGL= c"EyEr
COMMON (LDND < 70) = AGL - AVA 6 (=15.8% of 38)
LD = 4.01 / LDN = 81.76 / LDND = 89.87

107 Comparing languages

108 3. Some results: genetic and areal proximity

109 Distance Matrix (0.5 * N * (N-1))
       FRE    DUT    GAL    PRT    ENG    …
FRE
DUT  90.93
GAL  71.62  90.00
PRT  74.38  94.61  51.87
ENG  91.17  63.19  91.30  95.18
…
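The size of the matrix and its lower-triangular layout can be sketched in Python. The sketch assumes that the values on the slide form a lower triangle, with rows and columns in the order FRE, DUT, GAL, PRT, ENG. The helper names are hypothetical.

```python
def n_pairs(n):
    """Number of distinct language pairs among n languages: 0.5 * N * (N - 1)."""
    return n * (n - 1) // 2


def to_symmetric(labels, lower):
    """Expand a lower-triangular list of rows into a full symmetric lookup table."""
    matrix = {a: {b: 0.0 for b in labels} for a in labels}
    for i, row in enumerate(lower):
        for j, value in enumerate(row):
            matrix[labels[i]][labels[j]] = value
            matrix[labels[j]][labels[i]] = value
    return matrix


def closest_pair(labels, matrix):
    """Return (language, language, distance) for the smallest off-diagonal entry."""
    candidates = [(matrix[a][b], a, b)
                  for i, a in enumerate(labels)
                  for b in labels[:i]]
    value, a, b = min(candidates)
    return a, b, value


labels = ["FRE", "DUT", "GAL", "PRT", "ENG"]
lower = [
    [],
    [90.93],
    [71.62, 90.00],
    [74.38, 94.61, 51.87],
    [91.17, 63.19, 91.30, 95.18],
]
matrix = to_symmetric(labels, lower)
print(n_pairs(len(labels)), closest_pair(labels, matrix))
```

Only the lower triangle needs to be stored because the distance is symmetric. Tree-building programs typically accept either the triangle or the full square form.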

114 Tools for Trees
- Create the input file for your preferred phylogenetic software using an editor such as TextPad (www.textpad.com)
- Run the data using phylogenetic software such as SplitsTree (www.splitstree.org)
- Choose the most appropriate algorithm (Neighbour Joining for distance data)
- Prepare the tree for presentation using a tool such as the Tree Explorer of MEGA

115 Salishan Languages (n=30)

116 NeighborJoining: Salishan Languages (n=30)

117 UPGMA vs NeighborJoining

122 NeighborJoining
- specifically meant for phylogenetic trees
- takes distance as point of departure
- does NOT assume equal rate of change

123 Mayan (n=38)

127 Calibration of Method
Calibration: best options, parameters, factors:
A. for pure classification:
- existing classifications (Ethnologue; WALS; mainly the well-documented areas)
- expert knowledge of specific areas
→ divergence of ±12% → niche!

129 Calibration of Method
Calibration: best options, parameters, factors:
B. for dating:
- linguistically crucial historic events:

130 ASJP: Automatic Reconstruction130 Linguistically crucial events c. 250Goths conquer Daciasplit of E-W Romance 4th cIrish invade Scotlandsplit of Irish-Scottish Gaelic 5th c German kingdoms in W Roman Empirebreakup of W Romance 5th cGermans invade Britainsplit of English-Frisian 5th-6th cBritons flee to Brittanysplit of Welsh-Breton 400-600Hieroglyphic evidenceCh'olan begins to split 768-814 Name of Charlemagne attestedProto-Slavic Date Historical event Linguistic event

133 ASJP: Automatic Reconstruction133 Calibration of Method Calibration: best options, parameters, factors: B. for dating: - linguistically crucial historic events → Standard formula (Swadesh): TimeDepth = log(Similarity) / (2 · log(Retention))

134 ASJP: Automatic Reconstruction134 Calibration of Method Calibration: best options, parameters, factors: B. for dating: - linguistically crucial historic events → Standard formula: TimeDepth = log(Similarity) / (2 · log(Retention))

135 ASJP: Automatic Reconstruction135 Calibration of Method Calibration: best options, parameters, factors: B. for dating: - linguistically crucial historic events → Standard formula: TimeDepth = log(LDND) / (2 · log(Retention))

137 ASJP: Automatic Reconstruction137 Linguistically crucial events
Time (millennia ago) | Linguistic event               | LDND   | Ret
1.75                 | split of E-W Romance           | 0.6753 | 0.73
1.65                 | split of Irish-Scottish Gaelic | 0.6687 | 0.72
1.55                 | breakup of W Romance           | 0.6411 | 0.72
1.55                 | split of English-Frisian       | 0.6574 | 0.71
1.50                 | split of Welsh-Breton          | 0.5705 | 0.75
1.40                 | Ch'olan begins to split        | 0.5369 | 0.76
1.21                 | Proto-Slavic                   | 0.5877 | 0.69
MEAN                 |                                |        | 0.73
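
The retention column of this table can be re-derived by solving the standard formula for the retention rate. The sketch below assumes two things that the slides do not state explicitly: time is counted in millennia, and the similarity entering the formula is 1 − LDND (the slides write log(LDND), but the tabulated rates are reproduced only with 1 − LDND). The names `CALIBRATION`, `retention` and `time_depth` are hypothetical.

```python
import math

# (linguistic event, time depth in millennia, LDND), copied from the table.
CALIBRATION = [
    ("split of E-W Romance", 1.75, 0.6753),
    ("split of Irish-Scottish Gaelic", 1.65, 0.6687),
    ("breakup of W Romance", 1.55, 0.6411),
    ("split of English-Frisian", 1.55, 0.6574),
    ("split of Welsh-Breton", 1.50, 0.5705),
    ("Ch'olan begins to split", 1.40, 0.5369),
    ("Proto-Slavic", 1.21, 0.5877),
]


def retention(time_depth_kyr, ldnd):
    """Solve TimeDepth = log(s) / (2 log r) for r, with s = 1 - LDND (assumed)."""
    similarity = 1.0 - ldnd
    return math.exp(math.log(similarity) / (2.0 * time_depth_kyr))


def time_depth(ldnd, rate=0.73):
    """Time depth in millennia for a given LDND and retention rate."""
    return math.log(1.0 - ldnd) / (2.0 * math.log(rate))


rates = [round(retention(t, ldnd), 2) for _, t, ldnd in CALIBRATION]
mean_rate = round(sum(rates) / len(rates), 2)
for (event, _, _), rate in zip(CALIBRATION, rates):
    print(f"{event}: {rate:.2f}")
print(f"MEAN: {mean_rate:.2f}")
```

Under these assumptions the seven computed rates agree with the table to two decimals, and averaging the rounded rates gives the 0.73 used in the following slides.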

138 ASJP: Automatic Reconstruction138 Calibration of Method Calibration: best options, parameters, factors: B. for dating: - linguistically crucial historic events: - Standard formula: TimeDepth = log(LDND) / (2 · log(0.73))

139 ASJP: Automatic Reconstruction139 Calibration of Method Calibration: best options, parameters, factors: B. for dating: - linguistically crucial historic events: - Standard formula: TimeDepth = log(LDND) / (2 · log(0.73)); 73% < 75%

140 ASJP: Automatic Reconstruction140 Calibration of Method Calibration: best options, parameters, factors: B. for dating: - linguistically crucial historic events: - Standard formula: TimeDepth = log(LDND) / (2 · log(0.73)); 73% < 75%: Deeper!

141 ASJP: Automatic Reconstruction141 Glottochronology only? Calibration of method: Glottochronology: all based on lexical distance

142 ASJP: Automatic Reconstruction142 Glottochronology only? Calibration of method: Glottochronology: all based on lexical distance Add other linguistic domains …

143 ASJP: Automatic Reconstruction143 Glottochronology only? Calibration of method: Glottochronology: all based on lexical distance Add other linguistic domains … WALS typological database

144 ASJP: Automatic Reconstruction144 Glottochronology only? Calibration of method: Glottochronology: all based on lexical distance Add other linguistic domains … WALS typological database Best result: (75% 40 lex) + (25% 40 Ph/M/S features)
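
One possible reading of the 75% / 25% mix is a weighted sum of a lexical distance and a typological distance. The sketch below is an assumption throughout: the slides do not say how the typological distance is computed, so it is taken here as the share of shared features whose values differ, and the function names, feature names and values are made up for illustration.

```python
def typological_distance(feats_a, feats_b):
    """Share of features coded for both languages whose values differ."""
    shared = [f for f in feats_a if f in feats_b]
    if not shared:
        raise ValueError("no features coded for both languages")
    return sum(feats_a[f] != feats_b[f] for f in shared) / len(shared)


def combined_distance(lexical, typological, w_lex=0.75):
    """Weighted mix: 75% lexical and 25% typological by default."""
    return w_lex * lexical + (1.0 - w_lex) * typological


# Hypothetical feature codings for two languages (not WALS data).
lang_a = {"word_order": "SOV", "adposition": "postposition"}
lang_b = {"word_order": "SVO", "adposition": "postposition"}

typ = typological_distance(lang_a, lang_b)
mix = combined_distance(lexical=0.6, typological=typ)
print(typ, mix)
```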

145 ASJP: Automatic Reconstruction145 4. On Inheritance vs Borrowing

146 ASJP: Automatic Reconstruction146 Inherited or borrowed? AVAR (AVA) / AGUL (AGL)

147 ASJP: Automatic Reconstruction147 Inherited or borrowed? AVAR (AVA) / AGUL (AGL) I : dun=zun * LDND=36.6 YOU : mun=wun * LDND=36.6 HORN : tLar=k"arC * LDND=66.0 FIRE : c"a=c"a * LDND= 0.0 FULL : c"ura=ac"uf * LDND=66.0 NEW : c"iya=c"EyEr * LDND=55.0

148 ASJP: Automatic Reconstruction148 Inherited or borrowed? AVAR (AVA) / AGUL (AGL) I : dun=zun * LDND=36.6 YOU : mun=wun * LDND=36.6 HORN : tLar=k"arC * LDND=66.0 FIRE : c"a=c"a * LDND= 0.0 FULL : c"ura=ac"uf * LDND=66.0 NEW : c"iya=c"EyEr * LDND=55.0 → 6 items < 70.0

149 ASJP: Automatic Reconstruction149 Inherited or borrowed? AVAR (AVA) / AGUL (AGL) I : dun=zun * LDND=36.6 YOU : mun=wun * LDND=36.6 HORN : tLar=k"arC * LDND=66.0 FIRE : c"a=c"a * LDND= 0.0 FULL : c"ura=ac"uf * LDND=66.0 NEW : c"iya=c"EyEr * LDND=55.0 → 6 items < 70.0 → Genetically related !!
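
The per-item scores for Avar and Agul can be approximated with a character-level Levenshtein distance normalised by the length of the longer word (LDN). The sketch below then divides by a baseline to obtain a percentage comparable to the LDND figures on the slide. The baseline of 0.91 is an assumption back-derived from the slide values (for example 33.3 / 36.6 for dun = zun); in ASJP the baseline is computed from pairs of words with different meanings over the full word lists, which are not available here. All function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def ldn(a, b):
    """Levenshtein distance normalised by the longer word's length."""
    return levenshtein(a, b) / max(len(a), len(b))


def ldnd_percent(a, b, baseline):
    """LDN divided by a baseline distance, as a percentage."""
    return 100.0 * ldn(a, b) / baseline


# Word pairs copied from the slide (ASJP transcription, raw characters).
AVAR_AGUL = {
    "I": ("dun", "zun"),
    "YOU": ("mun", "wun"),
    "HORN": ("tLar", 'k"arC'),
    "FIRE": ('c"a', 'c"a'),
    "FULL": ('c"ura', 'ac"uf'),
    "NEW": ('c"iya', 'c"EyEr'),
}
ASSUMED_BASELINE = 0.91  # assumption, see lead-in

scores = {m: ldnd_percent(a, b, ASSUMED_BASELINE) for m, (a, b) in AVAR_AGUL.items()}
below = [m for m, s in scores.items() if s < 70.0]
print(scores)
print(len(below), "items < 70.0")
```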

150 ASJP: Automatic Reconstruction150 Inherited or borrowed? SPANISH (SPA) / CHAMORRO (CHA)

151 ASJP: Automatic Reconstruction151 Inherited or borrowed? SPANISH (SPA) / CHAMORRO (CHA) ONE : uno=unu * LDND=36.9 TWO : dos=dos * LDND= 0.0 PERSON : persona=petsona * LDND=55.3 STAR : estreya=estrecas * LDND=61.2 NIGHT : noCe=noces * LDND=68.2 NEW : nuevo=nueba * LDND=44.2

152 ASJP: Automatic Reconstruction152 Inherited or borrowed? SPANISH (SPA) / CHAMORRO (CHA) ONE : uno=unu * LDND=36.9 TWO : dos=dos * LDND= 0.0 PERSON : persona=petsona * LDND=55.3 STAR : estreya=estrecas * LDND=61.2 NIGHT : noCe=noces * LDND=68.2 NEW : nuevo=nueba * LDND=44.2 → 6 items < 70.0

153 ASJP: Automatic Reconstruction153 Inherited or borrowed? SPANISH (SPA) / CHAMORRO (CHA) ONE : uno=unu * LDND=36.9 TWO : dos=dos * LDND= 0.0 PERSON : persona=petsona * LDND=55.3 STAR : estreya=estrecas * LDND=61.2 NIGHT : noCe=noces * LDND=68.2 NEW : nuevo=nueba * LDND=44.2 → 6 items < 70.0: RELATED ???

154 ASJP: Automatic Reconstruction154 Inherited or borrowed? SPANISH (SPA) / CHAMORRO (CHA) ONE : uno=unu * LDND=36.9 TWO : dos=dos * LDND= 0.0 PERSON : persona=petsona * LDND=55.3 STAR : estreya=estrecas * LDND=61.2 NIGHT : noCe=noces * LDND=68.2 NEW : nuevo=nueba * LDND=44.2 → RELATED ??? NO!!!

155 ASJP: Automatic Reconstruction155 Inherited or borrowed? SPANISH (SPA) / CHAMORRO (CHA) ONE : uno=unu * LDND=36.9 TWO : dos=dos * LDND= 0.0 PERSON : persona=petsona * LDND=55.3 STAR : estreya=estrecas * LDND=61.2 NIGHT : noCe=noces * LDND=68.2 NEW : nuevo=nueba * LDND=44.2 INDO-EUROPEAN AUSTRONESIAN

156 ASJP: Automatic Reconstruction156 Inherited or borrowed? SPANISH (SPA) / CHAMORRO (CHA) ONE : uno=unu * LDND=36.9 TWO : dos=dos * LDND= 0.0 PERSON : persona=petsona * LDND=55.3 STAR : estreya=estrecas * LDND=61.2 NIGHT : noCe=noces * LDND=68.2 NEW : nuevo=nueba * LDND=44.2 CHANCE?

157 ASJP: Automatic Reconstruction157 Inherited or borrowed? SPANISH (SPA) / CHAMORRO (CHA) ONE : uno=unu * LDND=36.9 TWO : dos=dos * LDND= 0.0 PERSON : persona=petsona * LDND=55.3 STAR : estreya=estrecas * LDND=61.2 NIGHT : noCe=noces * LDND=68.2 NEW : nuevo=nueba * LDND=44.2 CHANCE? → ~ 5% (i.e. 1 – 2 items)
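
The chance argument can be made concrete with a small binomial calculation. This is a sketch under stated assumptions: a 40-item word list (the "40 lex" mentioned earlier in the talk), a 5% chance that any single item falls below the threshold, and independence between items. The slides themselves give only the "~5%, i.e. 1 – 2 items" estimate; the function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_ITEMS = 40      # assumed list size
P_CHANCE = 0.05   # per-item chance of a low score, from the slide's ~5%
OBSERVED = 6      # Spanish/Chamorro items with LDND < 70.0

expected_by_chance = N_ITEMS * P_CHANCE
p_six_or_more = prob_at_least(OBSERVED, N_ITEMS, P_CHANCE)
print(f"expected by chance: {expected_by_chance:.1f} items")
print(f"P(at least {OBSERVED} by chance) = {p_six_or_more:.4f}")
```

Under these assumptions six matching items is well above the two expected by chance, which is why the slides look for another explanation.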

158 ASJP: Automatic Reconstruction158 Inherited or borrowed? SPANISH (SPA) / CHAMORRO (CHA) ONE : uno=unu * LDND=36.9 TWO : dos=dos * LDND= 0.0 PERSON : persona=petsona * LDND=55.3 STAR : estreya=estrecas * LDND=61.2 NIGHT : noCe=noces * LDND=68.2 NEW : nuevo=nueba * LDND=44.2 BORROWING through LANGUAGE CONTACT

159 ASJP: Automatic Reconstruction159 Inherited or borrowed? SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO ONE : uno=unu * LDND=36.9

160 ASJP: Automatic Reconstruction160 Inherited or borrowed? SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO ONE : uno=unu * LDND=36.9 SPA <> CHA:

161 ASJP: Automatic Reconstruction161 Inherited or borrowed? SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO ONE : uno=unu * LDND=36.9 SPA <> CHA: fam/gen= 0.24/0.82

162 ASJP: Automatic Reconstruction162 Inherited or borrowed? SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO ONE : uno=unu * LDND=36.9 SPA <> CHA: fam/gen= 0.24/0.82 > 0.03/0.00

163 ASJP: Automatic Reconstruction163 Inherited or borrowed? SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO ONE : uno=unu * LDND=36.9 SPA <> CHA: fam/gen= 0.24/0.82 > 0.03/0.00 phon pattern fit= 12.00 > 0.67

164 ASJP: Automatic Reconstruction164 Inherited or borrowed? SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO ONE : uno=unu * LDND=36.9 SPA <> CHA: fam/gen= 0.24/0.82 > 0.03/0.00 phon pattern fit= 12.00 > 0.67 …

165 ASJP: Automatic Reconstruction165 Borrowed! SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO ONE : uno=unu * LDND=36.9 SPA > CHA: fam/gen= 0.24/0.82 > 0.03/0.00 phon pattern fit= 12.00 > 0.67 …

166 ASJP: Automatic Reconstruction166 Borrowing SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO TWO : dos=dos * LDND= 0.0 SPA > CHA f/g= 0.62/1.00 > 0.12/0.00 swF=100.00 > 0.22

167 ASJP: Automatic Reconstruction167 Borrowing SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO PERSON : persona=petsona * LDND=55.3 SPA > CHA f/g= 0.20/0.64 > 0.01/0.00 swF=32.40 > 0.13

168 ASJP: Automatic Reconstruction168 Borrowing SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO PERSON : persona=petsona * LDND=55.3 SPA > CHA f/g= 0.20/0.64 > 0.01/0.00 swF=32.40 > 0.13 ALT: CHA= taotao (0.41/0.00)

170 ASJP: Automatic Reconstruction170 Borrowing SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO STAR : estreya=estrecas * LDND=61.2 SPA > CHA f/g= 0.17/0.82 > 0.00/0.00 swF=100.00 > 4.44 ALT: CHA= puti7on (0.03/0.00)

171 ASJP: Automatic Reconstruction171 Borrowing SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO NIGHT : noCe=noces * LDND=68.2 SPA > CHA f/g= 0.23/0.55 > 0.04/0.00 swF=100.00 > 0.10 ALT: CHA= pw~eNi (0.23/0.00)

172 ASJP: Automatic Reconstruction172 Borrowing SPANISH (SPA) INDO-EUROPEAN (128) > ROMANCE / CHAMORRO (CHA) AUSTRONESIAN (310) > CHAMORRO NEW : nuevo=nueba * LDND=44.2 SPA > CHA f/g=0.50/0.64 > 0.04/0.00 swF=4.27 > 0.03

173 ASJP: Automatic Reconstruction173 5. Conclusions

174 ASJP: Automatic Reconstruction174 Conclusions - Method for automatic reconstruction of language relationships

175 ASJP: Automatic Reconstruction175 Conclusions - Method for automatic reconstruction of language relationships - Assess, discuss and correct existing classifications

176 ASJP: Automatic Reconstruction176 Conclusions - Method for automatic reconstruction of language relationships - Assess, discuss and correct existing classifications - Test hypotheses about genetic distances in time

177 ASJP: Automatic Reconstruction177 Conclusions - Method for automatic reconstruction of language relationships - Assess, discuss and correct existing classifications - Test hypotheses about genetic distances in time - Locate potential borrowings

178 ASJP: Automatic Reconstruction178 Conclusions - Method for automatic reconstruction of language relationships - Assess, discuss and correct existing classifications - Test hypotheses about genetic distances in time - Locate potential borrowings - C O R E: incremental lexical database (> 35%)

179 ASJP: Automatic Reconstruction179 Conclusions - Method for automatic reconstruction of language relationships - Assess, discuss and correct existing classifications - Test hypotheses about genetic distances in time - Locate potential borrowings - C O R E: incremental lexical database (> 35%) → One day: Online

180 ASJP: Automatic Reconstruction180 Conclusions - Method for automatic reconstruction of language relationships - Assess, discuss and correct existing classifications - Test hypotheses about genetic distances in time - Locate potential borrowings - C O R E: incremental lexical database (> 35%) → One day: Online → Cooperation!!

181 ASJP: Automatic Reconstruction181
Holman et al. (forthc. 2008). Explorations in automated language classification. Folia Linguistica.
Brown et al. (forthc. 2008). Automated Classification of the World’s languages: A description of the method and preliminary results. Sprachtypologie und Universalienforschung.
+ Several working papers: email.eva.mpg.de./~wichmann/ASJPHomePage

182 ASJP: Automatic Reconstruction182 ?

