Data Mining 2 (ex Análisis Inteligente de Datos y Data Mining) Lluís A. Belanche.


2 Data Mining 2 (ex Análisis Inteligente de Datos y Data Mining) Lluís A. Belanche

3 www.lsi.upc.edu/... /~belanche/docencia/aiddm/aiddm.html /~avellido/teaching/data_mining.htm

4 Contents of the course (hopefully) 1. Introduction & methodologies 2. Exploratory DM through visualization 3. Pattern recognition: introduction 4. Pattern recognition: the Gaussian case 5. Feature extraction 6. Feature selection & weighting 7. Error estimation 8. Linear methods are nice! 9. Probability in Data Mining 10. Latency, generativity, manifolds and all that 11. Application of GTM: from medicine to ecology 12. DM Case studies Sorry guys! … no fuzzy systems …

7 Error estimation

8 Feature extraction, selection and weighting have many uses

9 Linear classifiers are nice! (I)

10 Linear classifiers are nice! (II) Transformation $\Phi(x) = [\phi_1(x), \phi_2(x), \ldots, \phi_m(x)]$ with $x = [x_1, x_2, \ldots, x_n]$. Useful for “ascending” (m > n) or “descending” (m < n), with 0 < m, n < ∞ (integers) … an example?
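
One possible answer to the "an example?" prompt, offered only as a sketch: XOR is not linearly separable in its original n = 2 inputs, but it becomes so after "ascending" to m = 3 with an extra product feature. The map `phi`, the weights `W` and the bias `B` below are assumptions chosen by hand for illustration; they do not come from the slides.

```python
def phi(x):
    """Hypothetical 'ascending' map from n = 2 to m = 3 dimensions."""
    x1, x2 = x
    return [x1, x2, x1 * x2]


def linear_classifier(z, w, b):
    """Plain linear decision rule, applied in the transformed space."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    return 1 if score > 0 else 0


# Weights picked by hand for this illustration, not learned from data.
W = [1.0, 1.0, -2.0]
B = -0.5

xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_predictions = [linear_classifier(phi(x), W, B) for x in xor_inputs]
print(xor_predictions)  # [0, 1, 1, 0]
```

The classifier itself stays linear; all the non-linearity lives in the transformation.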

11 Linear classifiers are nice! (III) Nets: $\Phi(x) = [\phi_1(x), \phi_2(x), \ldots, \phi_m(x)]$ with $x = [x_1, x_2, \ldots, x_n]$; $x \to \Phi(x)$

12 Utility This is a very powerful setting. Let us suppose: r > s → increase in dimension: more expressive power, easing the task for almost any learning machine. r < s → decrease in dimension: visualization, compaction, noise reduction, removal of useless information. Contradictory!?

13 On intelligence … What is Intelligence? What is the function of Intelligence? → To ensure survival in nature. What are the ingredients of intelligence? – Perceive in a changing world – Reason under partial truth – Plan & prioritize under uncertainty – Coordinate different simultaneous tasks – Learn under noisy experiences

14 Parking a Car (difficult or easy?) “Generally, a car can be parked rather easily because the final position of the car is not specified exactly. If it were specified to within, say, a fraction of a millimeter and a few seconds of arc, it would take hours of maneuvering and precise measurements of distance and angular position to solve the problem.” → High precision carries a high cost.

15 The primordial soup. Soft Computing: Rough Sets, Fuzzy Logic, Neural Networks, Evolutionary Algorithms, Chaos & Fractals, Belief Networks

16 What could MACHINE LEARNING possibly be? In the beginning, there was a set of examples … To exploit imprecision, uncertainty, robustness, data dependencies, learning and/or optimization ability, to achieve a working solution to a problem which is hard to solve. To find an exact (approximate) solution to an imprecisely (precisely) formulated problem.

17 So what is the aim? The challenge is to put these capabilities into use by devising methods of computation which lead to an acceptable solution at the lowest possible cost. This should be the guiding principle.

18 Different methods = different roles. Fuzzy Logic: the algorithms for dealing with imprecision and uncertainty. Neural Networks: the machinery for learning and function approximation with noise. Evolutionary Algorithms: the algorithms for adaptive search and optimization. Rough Sets: handling uncertainty arising from the granularity in the domain of discourse.

19 Examples of soft computing TSP: $10^5$ cities – accuracy within 0.75%: 7 months – accuracy within 1%: 2 days. Compare “absolute best for sure” with “very good with very high probability”

20 Are you one of the top guns? Consider … – Search space of size s – Draw N random samples – What is the probability p that at least one of them is in the top t? Answer: $p = 1 - (1 - t/s)^N$. Example: $s = 10^{12}$, N = 100,000, t = 1,000 → p ≈ 1 in 10,000!
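
The "top guns" formula can be checked numerically. The sketch below assumes the N samples are drawn independently and uniformly, with replacement, which is what the formula on the slide implies; the function name `prob_top_t` is made up for this illustration.

```python
import math


def prob_top_t(s, n, t):
    """Probability that at least one of n independent uniform draws
    (with replacement) from a search space of size s lands in the top t.

    Computes 1 - (1 - t/s)**n; log1p/expm1 keep precision when t/s is tiny.
    Requires t < s.
    """
    return -math.expm1(n * math.log1p(-t / s))


# Values from the slide: s = 10^12, N = 100,000, t = 1,000.
p = prob_top_t(s=10**12, n=100_000, t=1_000)
print(p)  # roughly 1e-4, i.e. about 1 in 10,000
```

When N·t/s is small, p is close to N·t/s, which is why the slide's numbers give about 10^5 · 10^3 / 10^12 = 10^-4.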

21 On Algorithms: what is worth? Specialized algorithms: best performance for special problems. Generic algorithms: good performance over a wide range of problems. [Figure: efficiency versus problems, comparing a specialized algorithm with generic algorithms]

22 Words are important! What is a theory? What is an algorithm? What is an implementation? What is a model? What does “non-linear” mean? What does “non-parametric” mean?

23 Learning “Foreignia” (Poggio & Girosi’93) Can a machine learn to pronounce? 1. Do nothing and wait 2. Learn all the pronunciation rules 3. Memorize pronunciation examples 4. Pick a subset of pronunciation pairs and learn/memorize them 5. Pick subsets of pronunciation examples and develop a model explaining them

24 The problem of induction. Classical problem in Philosophy. Example: 1, 2, 3, 4, 5, ? A more thorough example: JT
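
A small illustration of why 1, 2, 3, 4, 5, ? has no single "correct" continuation: for any proposed sixth value there is a polynomial that reproduces the first five terms exactly and then yields that value. This is only a sketch using Lagrange interpolation; the helper names `lagrange_eval` and `continuation` are made up for the illustration, and the value 42 is arbitrary.

```python
from fractions import Fraction


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through all the given (xi, yi) points (exact arithmetic)."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total


def continuation(sixth_value):
    """Sequence values at positions 1..6 for the polynomial that fits
    1, 2, 3, 4, 5 and then takes the chosen sixth value."""
    points = [(k, k) for k in range(1, 6)] + [(6, sixth_value)]
    return [int(lagrange_eval(points, x)) for x in range(1, 7)]


print(continuation(6))   # [1, 2, 3, 4, 5, 6]
print(continuation(42))  # [1, 2, 3, 4, 5, 42]
```

Both continuations agree perfectly with the observed data; only a learning bias (for example, a preference for the simplest rule) favors one over the other.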

25 What are the conditions for successful learning? Training data (sufficiently) representative; principle of similarity; target function within capacity of the learner; non-dull learning algorithm; enough computational resources; a correct (or close to) learning bias

26 And the Oscar goes to … The real problem is not whether machines think, but whether men do. B.F. Skinner, Contingencies of Reinforcement

