Biologically Inspired Computing: Optimisation This is mandatory additional material for `Biologically Inspired Computing'. Contents: optimisation; hard problems and easy problems; computational complexity; trial and error search.

This Material A formal statement about optimisation, and an explanation of the close relationship with classification. About optimisation problems: formal notions for hard problems and easy ones. Formal notions about optimisation algorithms: exact algorithms and approximate algorithms. The status of EAs (and other bio-inspired optimisation algorithms) in this context. The classical computing `alternatives' to EAs.

What to take from this material: A clear understanding of what an optimisation problem is, and an appreciation of how classification problems are also optimisation problems. What it means, technically, for an optimisation problem to be hard or easy. What it means, technically, for an optimisation algorithm to be exact or approximate. What kinds of algorithms, technically speaking, EAs are. An appreciation of non-EA techniques for optimisation. A basic understanding of when an EA might be applicable to a problem, and when it might not.

Search and Optimisation Imagine we have 4 items as follows: (item 1: 20kg; item 2: 75kg; item 3: 60kg; item 4: 35kg). Suppose we want to find the subset of items with the highest total weight. Here is a standard treatment of this as an optimisation problem. The set S of all possible solutions is the set of all subsets, which we can encode as binary strings from 0000 to 1111 (a 1 means the item is included). The fitness of a solution s is the function total_weight(s). We can solve the problem (i.e. find the fittest s) by working out the fitness of each one in turn. But … any thoughts? I mean, how hard is this?
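A minimal Python sketch of this brute-force treatment, encoding each subset of the four items as a binary string (the weights are the ones given above):

```python
from itertools import product

# The four item weights from the example (kg).
weights = [20, 75, 60, 35]

def total_weight(bits):
    """Fitness: total weight of the subset encoded by a binary string."""
    return sum(w for w, b in zip(weights, bits) if b == 1)

# Enumerate all 2**4 = 16 candidate solutions and keep the fittest.
best = max(product([0, 1], repeat=len(weights)), key=total_weight)
print(best, total_weight(best))  # (1, 1, 1, 1) 190
```

As expected, enumeration confirms that 1111 (take everything) is optimal.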

Of course, this problem is much easier to solve than that. We already know that 1111 has to be the best solution. We can prove, mathematically, that any problem of the form “find heaviest subset of a set of k things”, is solved by the “all of them” subset, as long as each thing has positive weight. In general, some problems are easy in this sense, in that there is an easily provable way to construct an optimal solution. Sometimes it is less obvious than in this case though.

Search and Optimisation In general, optimisation means that you are trying to find the best solution you can (usually in a short time) to a given problem. We always have a set S of all possible solutions, S = {s1, s2, s3, …}. In realistic problems, S is too large to search one by one. So we need to find some other way to search through S. One way is random search. E.g. in a 500-iteration random search, we might randomly choose something in S and evaluate its fitness, repeating that 500 times.
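Random search can be sketched in a few lines. This is a generic version; the onemax toy problem at the bottom (fitness = number of 1s in a bit string) is my own illustration, not from the slides:

```python
import random

def random_search(fitness, sample, iterations=500):
    """Evaluate `iterations` random candidates from S and return the best seen."""
    best, best_f = None, float("-inf")
    for _ in range(iterations):
        s = sample()          # draw a random candidate from S
        f = fitness(s)
        if f > best_f:
            best, best_f = s, f
    return best, best_f

# Toy use: random 10-bit strings scored by the number of 1s (onemax).
sample = lambda: tuple(random.randint(0, 1) for _ in range(10))
best, best_f = random_search(sum, sample)
```

Note that nothing here guarantees the returned solution is optimal; it is simply the best of the 500 candidates examined.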

The Fitness function Every candidate solution s in the set of all solutions S can be given a score, or a "fitness", by a so-called fitness function. We usually write f(s) to indicate the fitness of solution s. Obviously, we want to find the s in S which has the best score. Examples: timetabling: f could be the number of clashes; wing design: f could be the aerodynamic drag; delivery schedule: f could be the total distance travelled.

An Aside about Classification In a classification problem, we have a set of things to classify, and a number of possible classes. To classify s we use an algorithm called a classifier. So, classifier(s) gives us a class label for s. We can assign a fitness value to a classifier – this can be simply the percentage of examples it gets right. In finding a good classifier, we are solving the following optimisation problem: Search a space of classifiers, and find the one that gives the best accuracy. E.g. the classifier might be a neural network, and we may use an EA to evolve the NN with the best connection weights.
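The idea that finding a classifier is itself an optimisation problem can be made concrete. The tiny 1-D dataset and the space of threshold classifiers below are hypothetical illustrations (the slides use a neural network as the example):

```python
def accuracy(classifier, examples):
    """Fitness of a classifier: fraction of labelled examples it gets right."""
    return sum(classifier(x) == label for x, label in examples) / len(examples)

# Hypothetical 1-D data: each pair is (value, class label).
data = [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

# The search space S is a family of threshold classifiers; we search it
# for the member with the best accuracy (here, by simple enumeration).
best_t = max((t / 10 for t in range(11)),
             key=lambda t: accuracy(lambda x: int(x > t), data))
```

With an EA, the same fitness function would score evolved classifiers (e.g. neural networks) instead of simple thresholds.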

Searching through S When S is small (e.g. 10, 100, or only 1,000,000 or so items), we can simply do so-called exhaustive search. Exhaustive search: generate every possible solution, work out its fitness, and hence discover which is best (or which solutions share the best fitness). This is also called enumeration.

However … In all interesting/important cases, S is much much much too large for exhaustive search (ever). There are two kinds of `too-big' problem: easy (or `tractable', or `in P'), and hard (or `intractable', or `not known to be in P'). There are rigorous mathematical definitions of the two types - that's what the `P' is about - it stands for the class of problems that can be solved by a deterministic Turing machine in polynomial time. That's outside the scope of this module. But, what's important for you to know is that almost all important problems are technically hard.

About Optimisation Problems To solve a problem means to find an optimal solution. I.e. to deliver an element of S whose fitness is guaranteed to be the best in S. An Exact algorithm is one which can do this (i.e. solve a problem, guaranteeing to find the best). Is 500-iteration random search an Exact algorithm?

Problem complexity `Problem complexity', in the context of computer science, is all about characterising how hard it is to solve a given problem. Statements are made in terms of functions of n, which is meant to be some indication of the size of the problem. E.g.: correctly sort a set of n numbers: can be done in around n log n steps; find the closest pair out of n vectors: can be done in O(n^2) steps; find the best design for an n-structural-element bridge: can be done in O(10^n) steps; …

Polynomial and Exponential Complexity Given some problem Q, with `size' n, imagine that A is the fastest algorithm known for solving that problem exactly. The complexity of problem Q is the time it takes A to solve it, as a function of n. There are two key kinds of complexity: Polynomial: the dominant term in the expression is polynomial in n. E.g. n^34, n log n, n^2.2, etc … Exponential: the dominant term is exponential in n. E.g. 1.1^n, 2^n, n^(n+2), …

Polynomial and Exponential Complexity (A table on the slide compared values of an exponential, 1.1^n, against a polynomial in n as n grows.) Problems with exponential complexity take too long to solve at large n. Note how an exponential always eventually catches up with, and then rapidly overtakes, a polynomial.
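The overtaking can be checked numerically. Here 1.1^n is compared against n^5; the polynomial n^5 is an arbitrary choice for illustration:

```python
# Find the first n at which the exponential 1.1**n overtakes the
# polynomial n**5 (an arbitrary polynomial chosen for illustration).
n = 2  # start past n = 1, where n**5 = 1 is trivially small
while 1.1 ** n <= n ** 5:
    n += 1
print(n)  # 300: from here on the exponential dominates
```

The exponential looks harmless for a long stretch, then wins permanently, which is exactly why exponential-time algorithms are hopeless at large n.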

Hard and Easy Problems Polynomial Complexity: these are called tractable, or easy, problems. Fast algorithms are known which provide the best solution. Pairwise alignment is one such problem. Sorting is another. Exponential Complexity: these are called intractable, or hard, problems. The fastest known algorithm which exactly solves them is usually not significantly faster than exhaustive search.

(Plot: polynomial and exponential running-time curves against increasing n.) An exponential curve always takes over a polynomial one. E.g. the time needed on the fastest computers to search all protein structures with 500 amino acids: trillions of times longer than the current age of the universe.

Example: Minimum Spanning Tree Problems This is a case of an easy problem, but not as obvious as our first easy example. Find the cheapest tree which connects all (i.e. spans) the nodes of a given graph. Applications: Comms network backbone design; Electricity distribution networks, water distribution networks, etc …

A graph, showing the costs of building each pair-to-pair link. What is the minimal-cost spanning tree? (A spanning tree visits all nodes and has no cycles; its cost is the sum of the costs of the edges used in the tree.)

Here's one tree, with cost = 36.

Here's a cheaper one, with cost = 20 …

The problem `find the minimal-cost spanning tree' (aka the MST) is easy in the technical sense. Several fast algorithms are known which solve this in polynomial time; here is the classic one: Prim's algorithm: Start with an empty tree (no edges). Repeat: choose the cheapest edge which feasibly extends the tree. Until: n - 1 edges have been chosen.
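A sketch of Prim's algorithm in Python. The 4-node graph at the bottom is a hypothetical example (not the graph from the slides), with costs chosen so the MST is easy to check by hand:

```python
import heapq

def prim(graph, start):
    """Prim's algorithm: grow a minimum spanning tree outwards from `start`.

    `graph` maps each node to a dict of {neighbour: edge_cost}.
    Returns the chosen edges and their total cost.
    """
    visited = {start}
    # Heap of (cost, node_in_tree, node_outside_tree) candidate edges.
    edges = [(c, start, v) for v, c in graph[start].items()]
    heapq.heapify(edges)
    tree, total = [], 0
    while edges and len(visited) < len(graph):
        cost, u, v = heapq.heappop(edges)
        if v in visited:
            continue          # this edge no longer feasibly extends the tree
        visited.add(v)
        tree.append((u, v, cost))
        total += cost
        for w, c in graph[v].items():
            if w not in visited:
                heapq.heappush(edges, (c, v, w))
    return tree, total

# Hypothetical 4-node graph: symmetric edge costs for illustration.
g = {"A": {"B": 1, "C": 4}, "B": {"A": 1, "C": 2, "D": 6},
     "C": {"A": 4, "B": 2, "D": 3}, "D": {"B": 6, "C": 3}}
tree, total = prim(g, "A")    # MST uses edges A-B, B-C, C-D; cost 1 + 2 + 3 = 6
```

The greedy choice (always the cheapest feasible edge) is provably optimal for the plain MST, which is exactly what makes this problem `easy'.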

Prim’s step 1:

Prim’s step 2:

Prim’s step 3:

Prim’s step 4:

Prim's final tree: Guaranteed to have the minimal possible cost for this graph; i.e. this is the (or a) MST in this case.

But change the problem slightly: We may want the degree-constrained MST (i.e. the MST, but where no node in the tree has a degree above 4). Or we may want the optimal communication spanning tree, which is the MST, but constrained among those trees which satisfy certain bandwidth requirements between certain pairs of nodes. There are many constrained/different forms of the MST. These are essentially problems where we seek the cheapest tree structure, but where many, or even most, trees are not actually feasible solutions. Here's the thing: these constrained versions are almost always technically hard, and real-world MST-style problems are invariably of this kind.

Approximate Algorithms For hard optimisation problems (again, these turn out to be nearly all the important ones), we need Approximate algorithms. These: deliver solutions in reasonable time; try to find pretty good (`near optimal') solutions, and often get optimal ones; but do not (cannot) guarantee that they have delivered the optimal solution.

Typical Performance of Approximate Methods Evolutionary Algorithms turn out to be the most successful and generally useful approximate algorithms around. They often take a long time though; it's worth getting used to the following curve, which tends to apply across the board. (Plot of solution quality against time: a simple method gets good solutions fast; a sophisticated method is slow, but eventually finds better solutions.)

So … Most real-world problems that we need to solve are hard problems. We can't expect truly optimal solutions for these, so we use approximate algorithms and do as well as we can. EAs (and other BIC methods we will see) are very successful approximate algorithms.