On the futility of attempts to formalize clustering within conventional formal frameworks Lev Goldfarb ETS group Faculty of Computer Science UNB Fredericton, Canada


On the futility of attempts to formalize clustering within conventional formal frameworks Lev Goldfarb ETS group Faculty of Computer Science UNB Fredericton, Canada

2 About the talk
To approach the foundations of clustering, one must rely on an adequate concept of class, which I claim is completely lacking. This most fundamental issue in our area (understood broadly to include machine learning) has been systematically neglected, which puts any progress in the area in question. So, what is a class? In particular: can the concept of class be adequately addressed within conventional math/CS formalisms? As the title of the talk suggests, the answer is “no”. (For a radically new representational formalism, ETS, not considered here, see the references in the abstract of this talk.)

3 What is a numeric representation?
Classical measurement is a systematic method for representing objects by numbers. (Classes of objects do not enter into consideration at all.)
Restricting ourselves to natural numbers:
[Diagram: physical objects → object representation map (via a fixed measurable property) → natural numbers (Peano representation)]
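As a minimal sketch of the measurement idea on this slide, the following Python fragment maps objects to Peano numerals. The names `Peano`, `succ` and `measure`, and the choice of "number of parts" as the fixed measurable property, are hypothetical illustrations and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Peano:
    """A natural number in Peano form: zero, or the successor of a numeral."""
    pred: Optional["Peano"] = None  # None means this numeral is zero

    def to_int(self) -> int:
        # Count how many successor steps separate this numeral from zero.
        n, cur = 0, self
        while cur.pred is not None:
            n, cur = n + 1, cur.pred
        return n


ZERO = Peano()


def succ(n: Peano) -> Peano:
    """The only constructive operation: take the successor."""
    return Peano(n)


def measure(obj: list) -> Peano:
    """Hypothetical object representation map: count the parts of an object."""
    numeral = ZERO
    for _ in obj:
        numeral = succ(numeral)
    return numeral
```

Under these assumptions, `measure([1, 2])` and `measure(["x", "y"])` yield the same numeral: quite different objects collapse to one number, which is one way to read the remark that classes do not enter into consideration.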

4 What is a representational formalism?
[Diagram: physical objects and their classes are mapped into the representational formalism by an object representation map and a class representation map; the two mappings are coupled]
I. The two representation mappings should not be decoupled.
II. We should postulate that all classes have “inductive generative structure”.

5 What is a representational formalism?
I. Formal implications of the tight link between the two mappings (for objects and classes):
a) for the interpretation of basic operations in the chosen formalism: we should treat them as object operations and hence take them seriously, in contrast to present practice in applied mathematics;
b) for the general structure of class representation: a class representation must be expressed via the basic operations. In modern mathematics, this is a standard structural requirement (ignored in ML)!

6 What is a representational formalism? (Lev Goldfarb, NIPS 2005, Clustering)
II. Refinement of the general structure of class representation [I b)]. Relying on our understanding of the structure of classes in nature, the above “inductive generative structure” of classes should mean that a class representation must be
a) of generative form: it must incorporate the mechanism by which the members of the class are constructed via the basic operations (also a standard structural mathematical requirement), and
b) inductive: it must be effectively and reliably learnable from a very small training set.

7 Inadequacy of formal grammars
Grammars do not offer an “inductive” class representation [II b)]. The main reason: a string over a finite alphabet does not carry within itself enough representational information to link it “effectively and reliably” with the corresponding grammar, i.e. to identify the class to which it belongs (see also the next slide).
Thus, the overall deficiency of formal grammars is twofold: first, the object representation is poor; second, the class representation is “disconnected” from the object representation (e.g. nonterminals are not derivable from the object representation).

8 Inadequacy of the string representation: there are better choices
[Figure, labeled “contexts”: two of the possible formative histories for the string abaca]
[Figure: an ETS representation, which has nothing to do with a tree and captures the temporal sequence of insertions]
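To make the point about multiple formative histories concrete, here is a small Python sketch that enumerates them for a given string. It rests on a simplifying assumption that is not taken from the talk: a "formative history" is modeled as a chain of strings starting from the empty string and growing by one inserted character per step. This is not the ETS representation; the function name `histories` is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def histories(target: str) -> tuple:
    """All insertion histories of `target`, each a tuple of intermediate strings.

    Assumption: a history starts at the empty string and each step inserts
    exactly one character somewhere in the current string.
    """
    if target == "":
        return (("",),)
    # Undo the last insertion in every possible way (deduplicated by result).
    predecessors = {target[:i] + target[i + 1:] for i in range(len(target))}
    return tuple(
        history + (target,)
        for pred in sorted(predecessors)
        for history in histories(pred)
    )


if __name__ == "__main__":
    for history in histories("abaca")[:2]:
        print(" -> ".join(repr(s) for s in history))
```

Even under this crude model the finished string `abaca` is compatible with more than one history, and nothing in the string itself says which one occurred.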

9 The vector space as a representational formalism
Vector space representation ⇒ the basic operations are {+, ·}. From the above, the only candidate for a class (and class representation) is the affine subspace.
However, the overwhelming practice amounts to “take the vectors and run”, i.e. do what you want with them. When modeling various phenomena in science, classes have not yet become the focus of attention; hence it is up to us to address these new scientific representational issues.

10 Inadequacy of the vector space formalism
Obviously, it lacks a generative [II a)] class representation. Why? The absence of “sufficient” representational structure results in the operations {+, ·} being too “simple”, and in linear generativity producing only very “regular” classes.
To compensate, a class description had to be brought in from outside the algebraic formalism proper (which again violates the standard “structural” wisdom of mathematics). The resulting class description is structurally and representationally “alien” and “meaningless” (there is no tight link between an object and its class representation), and it includes non-class vectors that satisfy the class description.
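The remark about linear generativity can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the "true" class is taken to be the unit-length vectors in the plane, and the only class representation available from {+, ·} is taken to be the affine subspace through the training vectors. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vector = Tuple[float, ...]


def affine_combination(vectors: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Combine vectors using only {+, ·}, with weights that sum to 1."""
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("affine weights must sum to 1")
    dim = len(vectors[0])
    return tuple(
        sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)
    )


def in_hypothetical_class(v: Vector) -> bool:
    """Hypothetical 'true' class for illustration: vectors of unit length."""
    return math.isclose(math.hypot(*v), 1.0)


# Both training vectors are members of the hypothetical class.
training = [(1.0, 0.0), (0.0, 1.0)]

# A vector generated from them by the basic operations alone.
generated = affine_combination(training, [5.0, -4.0])
```

Here `generated` lies in the affine subspace through the training vectors, yet `in_hypothetical_class(generated)` is false: the generated set is very "regular" (a line) and contains vectors that are not members of the class.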

11 Inadequacy of the vector space formalism
Unfortunately, the prevailing trend in machine learning is to assume that clever distance measures or kernels should “solve the problem”. However, these have to be crafted manually and, more importantly, they cannot rectify the inadequacy of a vector as an object representation; again, they are brought in from “outside” the representational (algebraic) formalism.

12 Inadequacy of the vector space formalism
Thus, ML practice reinforces the scientifically counterproductive view that classes are our creation rather than existing in nature (because class representation is not “related” to object representation). On the other hand, once we develop a formalism in which the concept of class follows the “structural” mathematical wisdom, we will be able to offer the sciences a formal language of inestimable value, i.e. something that mathematics has traditionally provided.

13 Conclusion
No adequate class representation ⇒ no foundation for clustering.
The golden age of classification (and “clustering”) is still ahead of us, though its arrival depends on the development of the “right” representational formalism.