Facticity, Complexity and Big Data

Facticity, Complexity and Big Data. Pieter Adriaans, IvI, SNE group, Universiteit van Amsterdam.

Gerben de Vries, Cees de Laat, Steven de Rooij, Peter Bloem, Pieter Adriaans; group of Frank van Harmelen, Vrije Universiteit; group of Luis Antunes, Universidade do Porto.

Need for new measure(s) of information:
- Aesthetic measure (Birkhoff, Bense)
- Sophistication (Koppel)
- Logical depth (Bennett)
- Statistical complexity (Crutchfield, Young)
- Effective complexity (Gell-Mann and Lloyd)
- Meaningful information (Vitanyi)
- Self-dissimilarity (Wolpert and Macready)
- Computational depth (Antunes et al.)

Research cycle (diagram): system, observations, processing, model.

Some examples:

Data D                          | Structural theory T | Ad hoc part T(D)
Description of our solar system | Kepler's laws       | Trajectories and masses of the planets
Reuters database                | English grammar     | The order of the individual sentences
A composition by Bach           | Bach's style        | Themes, repetition, etc.

Basic intuition: learning is data compression. Regularities in a data set allow us to compress it. If the data set is 'representative', then regularities in the training set will also be representative of the regularities in an unseen test set.
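This intuition can be illustrated with a general-purpose compressor as a crude upper bound on Kolmogorov complexity: regular data compresses well, random data does not. A minimal sketch; the helper name `compressed_size` is ours, not from the slides.

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes after zlib compression: a crude upper
    bound on the Kolmogorov complexity of the data."""
    return len(zlib.compress(data, 9))

regular = b"ab" * 500      # highly regular: compresses very well
noise = os.urandom(1000)   # random bytes: essentially incompressible
```

On data like this, `compressed_size(regular)` is a few dozen bytes while `compressed_size(noise)` stays close to 1000: the compressor finds the regularity in one and none in the other.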

Some conventions: U = universe, M = model, D = data set. For the moment M and D are finite; |M| = m, |D| = d.

Selecting the best model according to Bayes (diagram: universe, model and data, with a model code, a data-to-model code, and an optimal Shannon code).

The minimum description length (MDL) principle (J. Rissanen): the best theory to explain a set of data is the one that minimizes the sum of
- the length, in bits, of the description of the theory, and
- the length, in bits, of the data when encoded with the help of the theory.
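The MDL trade-off can be sketched on a toy model class. Assume the 'theory' is a repeating pattern of some period and the residual encodes the exceptions; the function name, the bit-cost conventions (8 bits per character, log-cost for positions), and the example string are our illustrative assumptions, not from the slides.

```python
import math

def two_part_length(data: str, period: int) -> float:
    """Two-part code length in bits: describe one period literally
    (8 bits per character, plus the length of the data), then encode
    each deviation from periodic repetition as position + character."""
    model_bits = period * 8 + math.log2(len(data) + 1)
    mismatches = sum(1 for i, c in enumerate(data) if c != data[i % period])
    residual_bits = mismatches * (math.log2(len(data)) + 8)
    return model_bits + residual_bits

data = "abababababababababababXb"
best_period = min(range(1, len(data) + 1),
                  key=lambda p: two_part_length(data, p))
```

Here the period-2 theory "ab" wins: a short model plus one exception (the 'X') beats both the literal encoding (long model, no exceptions) and shorter theories that generate many exceptions.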

Two-part code: optimal compression given a model class C (diagram: the model code is the facticity, the data-to-model code is the residual entropy).

Facticity. Objective: study the balance between structural and ad hoc data in (large) data sets. Methodology: Kolmogorov complexity, with the partial recursive functions (all possible computer programs) as model class: the most general theory of model induction we currently have.

Turing two-part code: optimal compression of data x as an index i plus a program p (diagram labelling facticity, residual entropy, Kolmogorov complexity and randomness deficiency).

Two problems (and their solutions):
- Kolmogorov complexity is not computable. Solution: 'A Safe Approximation for Kolmogorov Complexity', P. Bloem, F. Mota, S. de Rooij, L. Antunes, P. Adriaans, Algorithmic Learning Theory, 2014.
- Are indexes of Turing machines a reliable measure of model complexity (the 'nickname problem')? Solution: an invariance theorem for indexes (Bloem, de Rooij 2014): there is a compact acceptable enumeration of the partial recursive functions.

Facticity of random strings (diagram).

Facticity for compressible data sets with small K(x) (diagram).

Facticity for large compressible sets: collapse onto the smallest universal machine (diagram).

Collapse point conjecture: if we have a data set D and a universal Turing machine U_u such that [condition given as a formula in the slide], then U_u will be the preferred model under facticity. Realistic example: if |u| = 20, then all models above 10^6 bits will collapse.

Optimal facticity = edge of chaos: maximal instability. (Diagram over the range K(x) = 0 to K(x) = |x|: stochastic models at high density and low probability; mixed non-stochastic models at low density and high probability; absolutely non-stochastic strings at low density and low probability; at the zero-probability edge, max φ(x) = δ(x).)

Some applications:
- the notion of an ideal teacher
- modeling non-maximum-entropy systems in physics
- game playing (digital bluff poker)
- Honing's problem of the 'surprise' value of musical information
- the dialectic evolution of art
- the problem of interesting, balanced esthetic composition
- optimal product mix selection
- Schmidhuber's notion of low-complexity art and Ramachandran's neuro-esthetics

Classification of processes: random processes, recursive processes, factic processes.

Factic processes:
- are the only processes that create meaningful information (random processes and deterministic processes do not);
- have no sufficient statistic;
- are characterized by randomness aversion (a factic process that appears to be random will structure itself) and model aversion (a factic process is maximally unstable when it appears to have a regular model);
- are abundant in nature: game playing, learning, genetic algorithms, the stock exchange, evolution, etc.

Regular languages: deterministic finite automata (DFA). Example: a DFA for {w ∈ {a,b}* | #a(w) and #b(w) both even} (state diagram, with sample strings aa, abab, abaaaabbb). DFA = NFA (non-deterministic) = REG.
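This DFA can be written down directly; a minimal sketch in Python, tracking the two parities as the state (the function name `accepts` is ours):

```python
def accepts(w: str) -> bool:
    """DFA for {w in {a,b}* : the numbers of a's and b's are both even}.
    The state is the pair (parity of a's seen, parity of b's seen);
    the start state (0, 0) is also the only accepting state."""
    state = (0, 0)
    for c in w:
        if c == "a":
            state = (1 - state[0], state[1])
        else:  # c == "b"
            state = (state[0], 1 - state[1])
    return state == (0, 0)
```

The four reachable pairs are exactly the four states of the DFA; every input symbol flips one parity, so the machine needs no memory beyond this pair.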

An example of a simple DTM program. The transition matrix (rows: current state; columns: symbol read, where b is the blank):

state  read 0    read 1    read b
q0     q0,0,+1   q0,1,+1   q1,b,-1
q1     q2,b,-1   q3,b,-1   qN,b,-1
q2     qY,b,-1   qN,b,-1   qN,b,-1
q3     qN,b,-1   qN,b,-1   qN,b,-1

The machine starts in state q0. In each step the read/write head reads a symbol, writes a symbol, moves one place left (-1) or right (+1), and the state changes according to the matrix. This program accepts strings that end with '00'.
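The DTM on this slide can be simulated in a few lines; a minimal sketch (transition table transcribed from the slide, with 'b' as the blank and a sparse dict as the tape; the function name is ours):

```python
def runs_dtm(input_str: str) -> bool:
    """Simulate the slide's DTM, which accepts binary strings
    ending in '00'. delta[(state, symbol)] = (new state, write, move)."""
    delta = {
        ("q0", "0"): ("q0", "0", +1),
        ("q0", "1"): ("q0", "1", +1),
        ("q0", "b"): ("q1", "b", -1),
        ("q1", "0"): ("q2", "b", -1),
        ("q1", "1"): ("q3", "b", -1),
        ("q1", "b"): ("qN", "b", -1),
        ("q2", "0"): ("qY", "b", -1),
        ("q2", "1"): ("qN", "b", -1),
        ("q2", "b"): ("qN", "b", -1),
        ("q3", "0"): ("qN", "b", -1),
        ("q3", "1"): ("qN", "b", -1),
        ("q3", "b"): ("qN", "b", -1),
    }
    tape = dict(enumerate(input_str))  # sparse tape; missing cells read as blank
    state, head = "q0", 0
    while state not in ("qY", "qN"):   # qY accepts, qN rejects
        symbol = tape.get(head, "b")
        state, write, move = delta[(state, symbol)]
        tape[head] = write
        head += move
    return state == "qY"
```

The run mirrors the slide's description: the head scans right until it hits the first blank, then steps back over the last two symbols, accepting only if both are '0'.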

Graph of the delta function of the TM (diagram: states q0, q1, q2, q3, qN).

Erdős–Rényi models; percolation models (diagrams).

Conclusions / further research:
- Compression-based automated model selection for unique strings? No!
- Is big data just more simple data? No!
- What is a really complex system?
- Collapse conjecture: model collapse phenomena are relevant for the study of complex systems in the real world (brain, cell, stock exchange, etc.).
- What are viable restrictions on model classes?

Additional material

There are very small universal machines (Neary, Woods, SOFSEM 2012).

And very large data sets:
10^10 bits   human genetic code
10^14 bits   human brain
10^17 bits   salt crystal at the quantum level
10^30 bits   ultimate laptop (1 kg of plasma)
10^92 bits   total universe
10^123       total number of computational steps since the Big Bang (Seth Lloyd)