We have no physical image of the network or database, but only individual objects recognized as nodes.

Presentation transcript:

1

2 We have no physical image of the network or database, but only individual objects recognized as nodes.

3 The distance between two vertices in a graph is the number of edges in a shortest path connecting them.
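The definition on this slide can be illustrated with a breadth-first search. This is only a minimal sketch: the adjacency-dictionary representation, the function name `graph_distance`, and the small example graph are assumptions made for illustration and do not come from the presentation.

```python
from collections import deque


def graph_distance(adjacency, source, target):
    """Return the number of edges on a shortest path, or None if unreachable.

    `adjacency` maps each vertex to an iterable of its neighbours
    (an assumed representation of an unweighted graph).
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Hypothetical example graph: a triangle a-b-c with a pendant vertex d on c.
example = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
print(graph_distance(example, "a", "d"))
```

Breadth-first search visits vertices in order of increasing distance, so the first time the target is reached the edge count is minimal.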

4

5

6

7 A particular case:

8 The Moore-Penrose pseudoinverse First-Encounters restore the Euclidean space structure:

9 R i =1  1 A

10 3 examples:

11

12 "We shape our buildings, and afterwards our buildings shape us." Sir Winston Churchill (October 28, 1943, while requesting that the House of Commons be rebuilt exactly as before, remaining insufficient to seat all its members).

13 The more isolated a place is, the worse the situation in it.

14 First-passage times to Venetian canals

15 SoHo East Harlem Federal Hall Bowery East Village Times Square

16 mean household income The data on the mean household income per year provided by

17 The data taken from the

18

19

20

21 From Gray, R. D. and Q. D. Atkinson. 2003. Language tree divergence times support the Anatolian theory of Indo-European origin. Nature 426: 435-439.
Tree-reconstruction phylogenetic methods based on the simple relation of ancestry fail to reveal the full complexity of the multidimensional phylogenetic signal, in which language affinity is characterized by many phonetic, morphophonemic, lexical, and grammatical isoglosses. Evolutionary trees conflict with each other and with the traditionally accepted family arborescence, and the languages known as isolates cannot be reliably classified into any branch with other living languages.

22 1. We present a fully automated method for building genetic language taxonomies, where the relationships between different languages in the language family are represented geometrically, in terms of distances and angles, as in the Euclidean geometry of everyday intuition.
2. We have tested our method on the 50 major languages of the Indo-European language family,
3. and then investigated the Austronesian phylogeny, again considered over 50 languages.

23 Encoding. Challenges:
1. Languages which belong to the same family may not share many words in common, while languages in two distinct families may share many words in common.
2. The effect of bias between orthographic and phonetic realizations of meanings.
Brahui is Dravidian by its syntactic structure, but 85% of all its words are Indo-European.
1. We have used a short list of 200 words (Swadesh's list) rather than a complete dictionary. The list was adopted to reconstruct systematic sound correspondences between the languages; it contains terms which are common to all cultures and are known to change at a very slow rate.
2. Swadesh's lists for the languages written in different alphabets had already been transliterated into English by Dyen et al. (1997) and Greenhill et al. (2008).
3. We have studied languages within a language family.
Levenshtein distance (edit distance) is a measure of the similarity between two strings: the number of deletions, insertions, or substitutions required to transform one into the other. MILCHK = MILK
The lexical distance between l₁ and l₂ can be interpreted as the average probability to distinguish them by a mismatch between two characters randomly chosen from the orthographic realizations of Swadesh's meanings.
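The edit distance described on this slide can be sketched in a few lines. The normalization by the length of the longer word and the averaging over a word list are assumptions made here to illustrate how a per-meaning distance might be turned into a lexical distance between two languages; the slide does not spell out the exact formula. The word pair MILCH / MILK is likewise an assumed reading of the slide's example.

```python
def levenshtein(a, b):
    """Number of deletions, insertions, or substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def normalized_distance(a, b):
    """Edit distance divided by the longer word's length (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def lexical_distance(words_1, words_2):
    """Average normalized distance over aligned meanings (assumed aggregation)."""
    pairs = list(zip(words_1, words_2))
    return sum(normalized_distance(a, b) for a, b in pairs) / len(pairs)


print(levenshtein("MILCH", "MILK"))
```

Dividing by the longer word's length keeps each per-meaning value between 0 and 1, which is what allows it to be read as a probability of mismatch.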

24 Representation. Challenges:
The multivariate lexical signal is strongly correlated → PCA, ICA.
Any historical development in language cannot be described only in terms of 'pair-wise' interactions; it reflects a genuine higher-order influence among the different language groups.
The kernel PCA method (Schölkopf et al., 1998) generalizes PCA to the case where we are interested in taking into account all higher-order correlations between data instances.
The appropriate kernel was found in Blanchard & Volchenkov (2008): P is the total probability of successful classification by an infinite series of matchings, for the two languages in the language family.
The lexical distance between l₁ and l₂ is the average probability to distinguish them by a mismatch between two characters randomly chosen from the orthographic realizations of a Swadesh meaning.
The rank-ordering of data traits in accordance with their eigenvalues provides us with a natural geometric framework for dimensionality reduction.

25 Representation.
1. The four well-separated monophyletic spines represent the four biggest traditional IE language groups: Romance & Celtic, Germanic, Balto-Slavic, and Indo-Iranian;
2. The Greek, Romance, Celtic, and Germanic languages form a class characterized by approximately the same azimuth angle (they belong to one plane);
3. The Indo-Iranian, Balto-Slavic, Armenian, and Albanian languages form another class, with respect to the zenith angle.

26 The systematic sound correspondences between the Swadesh words across the different languages coincide perfectly with the well-known centum-satem isogloss of the IE family (reflecting the IE numeral '100'), related to the evolution of the phonetically unstable palatovelar order.

27 The normal probability plots fitting the distances r of language points from the ‘center of mass’ to univariate normality. The data points were ranked and then plotted against their expected values under normality, so that departures from linearity signify departures from normality.
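The construction of a normal probability plot described above can be sketched as follows. The plotting positions (i - 0.5)/n used for the expected values, and the use of a correlation coefficient as a numeric summary of linearity, are assumptions chosen for illustration; the presentation does not say which convention was used.

```python
from statistics import NormalDist


def normal_probability_points(values):
    """Pair each ranked observation with its expected standard-normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    standard = NormalDist()
    # Plotting positions (i - 0.5) / n are an assumed convention.
    expected = [standard.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, ordered))


def straightness(points):
    """Pearson correlation of the plotted points; values near 1 mean a straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: exact normal quantiles versus a strongly skewed sample.
normal_like = [NormalDist(5.0, 2.0).inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
skewed = [2.0 ** k for k in range(20)]
print(straightness(normal_probability_points(normal_like)))
print(straightness(normal_probability_points(skewed)))
```

The skewed sample bends away from the straight line, so its coefficient is visibly lower than that of the normal-like sample.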

28 Interpretation. The univariate normal distribution is closely related to the time evolution of a mass-density function under homogeneous diffusion in one dimension, in which the mean value μ is interpreted as the coordinate of the point where all mass was initially concentrated, and the variance σ² ∝ t grows linearly with time. This has nothing to do with the traditional glottochronological assumption about the steady borrowing rates of cognates (Embleton, 1986)!
Anchor events:
1. the last Celtic migration (to the Balkans and Asia Minor) (300 BC),
2. the division of the Roman Empire (500 AD),
3. the migration of German tribes to the Danube River (100 AD),
4. the establishment of the Avars Khaganate (590 AD), overspreading Slavic people who did the bulk of the fighting across Europe.
The values of variance σ² give a statistically consistent estimate of age for each language group.
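The diffusion picture invoked on this slide can be written out explicitly. The following is the standard solution of the one-dimensional diffusion equation for mass initially concentrated at μ; the diffusion constant D is a symbol introduced here for illustration and does not appear in the presentation.

```latex
\frac{\partial \rho}{\partial t} = D\,\frac{\partial^{2} \rho}{\partial x^{2}},
\qquad \rho(x,0) = \delta(x-\mu)
\quad\Longrightarrow\quad
\rho(x,t) = \frac{1}{\sqrt{4\pi D t}}
            \exp\!\left(-\frac{(x-\mu)^{2}}{4 D t}\right),
\qquad \sigma^{2} = 2 D t .
```

Under this reading, a known date for one group fixes the ratio between variance and time, and the same ratio can then be applied to the variance measured for another group.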

29 From the time–variance ratio we can retrieve the probable dates for:
The break-up of the Proto-Indo-Iranian continuum, by 2,400 BC. The migration from the early Andronovo archaeological horizon (Bryant, 2001).
The end of common Balto-Slavic history, before 1,400 BC. The archaeological dating of the Trzciniec-Komarov culture.
The separation of Indo-Aryans from Indo-Iranians. Probably a result of the Aryan migration across India to Ceylon, as early as 483 BC (McLeod, 2002).
The division of the Persian polity into a number of Iranian tribes, after the end of the Greco-Persian wars (Green, 1996), before 400 BC.

30 Einkorn wheat (Triticum boeoticum).
The Anatolian hypothesis places the origin in Neolithic Anatolia and associates the expansion with the Neolithic agricultural revolution in the 8th and 6th millennia BC (Renfrew, 1987).
A graphical test for three-variate normality of the distribution of the distances of the five proto-languages from a statistically determined central point is obtained by extending the notion of the normal probability plot. The χ-square distribution is used to test the goodness of fit of the observed distribution: departures from three-variate normality are indicated by departures from linearity.
The use of the previously determined time–variance ratio then dates the initial break-up of the Proto-Indo-Europeans back to 7,400 BC, pointing at the early Neolithic date.

31 The components probe for a sample of 50 Austronesian (AU) languages immediately uncovers both the Formosan (F) and the Malayo-Polynesian (MP) branches of the entire language family. Headhunters

32

33 The distribution of the languages spoken within Maritime Southeast Asia, Melanesia, and Western Polynesia, and of the Paiwan language group in Taiwan, over the distances from the center of the diagram conforms to univariate normality. This suggests that an interaction sphere had existed encompassing the whole region, from the Philippines and Southern Indonesia through the Solomon Islands to Western Polynesia, where ideas and cultural traits were shared and spread, as attested by trade (Bellwood and Koon, 1989; Kirch, 1997) and translocation of animals (Matisoo-Smith and Robins, 2004; Larson et al., 2007) among shoreline communities.
By 550 AD, pretty well before 600–1200 AD, while descendants from Melanesia settled in the distant apices of the Polynesian triangle, as evidenced by archaeological records (Kirch, 2000; Anderson and Sinoto, 2002; Hurles et al., 2003).

34 A system for using dice to compose music randomly, without having to know either the techniques of composition or the rules of harmony, named Musikalisches Würfelspiel (Musical dice game, MDG), had become quite popular throughout Western Europe in the 18th century.
"The Ever Ready Composer of Polonaises and Minuets" was devised by Ph. Kirnberger as early as 1757.
The famous chance music machine attributed to W.A. Mozart ("K 516f"), known since 1787, consisted of numerous two-bar fragments of music named after the different letters of the Latin alphabet and destined to be combined together either at random or following an anagram of the name of your beloved.

35

36 Every pitch in a musical piece is characterized, with respect to the entire structure of the Markov chain, by its level of accessibility. This is estimated by the first-passage time to the pitch, that is, the expected number of steps a random walk takes before it first reaches the pitch, starting from any other pitch randomly chosen over the musical score. The values of first-passage times to notes are strictly ordered in accordance with their role in the tone scale of the musical composition.
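The first-passage times described on this slide can be sketched for a toy note sequence. Several choices below are assumptions made for illustration and are not taken from the presentation: transition probabilities are estimated from consecutive notes, the sequence is wrapped around so that every pitch has a successor, the linear equations are solved by simple fixed-point iteration, and starting pitches are weighted by how often they occur in the score.

```python
def transition_probabilities(sequence):
    """Estimate P[i][k] from consecutive notes (sequence wrapped around, an assumption)."""
    counts = {}
    for current, following in zip(sequence, sequence[1:] + sequence[:1]):
        row = counts.setdefault(current, {})
        row[following] = row.get(following, 0) + 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


def mean_first_passage_times(P, target, tol=1e-12, max_iter=100000):
    """Expected number of steps to first reach `target` from every other state.

    Iterates m_i = 1 + sum over k != target of P[i][k] * m_k until the values settle.
    """
    times = {state: 0.0 for state in P if state != target}
    for _ in range(max_iter):
        updated = {
            s: 1.0 + sum(p * times[k] for k, p in P[s].items() if k != target)
            for s in times
        }
        change = max((abs(updated[s] - times[s]) for s in times), default=0.0)
        times = updated
        if change < tol:
            break
    return times


def accessibility(sequence, target):
    """Average first-passage time to `target` over the other notes of the score."""
    times = mean_first_passage_times(transition_probabilities(sequence), target)
    starts = [note for note in sequence if note != target]
    return sum(times[note] for note in starts) / len(starts)


# Hypothetical three-note score: C -> D -> E -> (back to C).
print(accessibility("CDE", "E"))
```

A lower value means the pitch is reached sooner on average, which is the sense of accessibility used on the slide.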

37 By analyzing the typical magnitudes of first-passage times to notes in one octave, we can discover the individual creative style of a composer and trace the stylistic influences between different composers.

38 Correlation and covariance matrices calculated for the medians of the first passage times in a single octave provide the basis for the classification of composers, with respect to their tonality preferences.

39

