Speed up the local inhibition
Hideaki Suzuki
September 5, 2013

Preface

While studying HTM CLA, I came up with some ideas to accelerate the spatial pooler. This presentation explains one such experiment: accelerating the local inhibition of the SP.

The missions of the local inhibition

There are three missions, as I understand it:
▫ Sparsity: activate only a fixed number of columns.
▫ Stability: produce similar activations for similar input patterns.
▫ Coverage: capture the entire input image.

The current local inhibition

The algorithm:
▫ For each column in the region:
   - Look at its preactive neighbors.
   - If its intensity is within the top k among those neighbors, activate the column.

Issues
▫ Three issues have been observed (see the next slide).

I use the term "preactive" for a column that has gotten a good enough overlap and has become a candidate to become active, but whose final state is not decided until the inhibition process runs. A minimal sketch of this baseline algorithm appears below.
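For reference, here is a minimal Python sketch of the baseline per-column top-k inhibition described above. The column fields and the neighbors_of lookup are illustrative stand-ins, not the NuPIC API.

    def current_local_inhibition(columns, neighbors_of, desired_local_activity):
        """Baseline: a column stays active if its intensity (overlap) is within
        the top k among its preactive neighbors."""
        active = []
        for col in columns:
            if not col.preactive:
                continue
            rivals = [c.intensity for c in neighbors_of(col) if c.preactive]
            stronger = sum(1 for v in rivals if v > col.intensity)
            if stronger < desired_local_activity:   # within the top k
                active.append(col)
        return active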

Issues

1. Speed: O(N * n * k + A) with unordered partial sorting.
▫ N is the number of columns in the region (> 1K), n is the number of columns within the inhibition radius (typically 10-200), k is the desired local activity (a few), and A is the number of active columns.
▫ The local inhibition is many times slower than the global inhibition. Global inhibition can be done in O(p), where p is the number of preactive columns before inhibition (0 ≤ p ≤ N).

2. Coverage: active columns are biased toward the densest area of the input bits.
▫ If the synapses of a column do not cover the entire input space, the column has locality over the input pattern. The densest input area strongly excites the columns connected to it, so those columns have a higher probability of surviving the local inhibition.
▫ When columns have such locality, the active columns tend to gather around the core of the input pattern, so the SDR covers the input pattern less well (though still much better than with global inhibition).

3. Control: it is difficult (not intuitive) to find parameters that achieve the target sparsity.
▫ The desired local activity k relates to sparsity only indirectly; the shape of the input pattern (a big pattern, a tiny pattern, a scattered pattern, etc.) also affects it.
▫ With a square inhibition area, there is 1 column at inhibition radius 0, 9 columns at radius 1, then 25, 49, 81, ..., and (2R + 1)^2 at radius R in general, which is too coarse a progression.
▫ For example, when we want 3% sparsity, what is the best setup for the desired local activity and the inhibition radius? 1/(2*2+1)^2 = 4%, 1/(2*3+1)^2 = 2%, or some mix of the two? (a small illustration follows below)
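A quick illustrative calculation, assuming a square inhibition area and a desired local activity of 1, shows how coarsely the radius controls sparsity:

    # Sparsity achievable with k = 1 active column per square inhibition area,
    # for a few inhibition radii R: 1 / (2R + 1)^2.
    for R in range(1, 6):
        area = (2 * R + 1) ** 2
        print(f"radius {R}: area {area} columns -> sparsity {1 / area:.1%}")
    # radius 2 gives 4.0% and radius 3 gives 2.0%; nothing lands on 3%.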

The improved local inhibition

The algorithm (pseudo code follows on a later slide):
▫ Repeat the steps below until all active columns are selected:
   - Choose the column with the highest intensity among all preactive columns globally, and mark it active.
   - Apply a penalty to the intensities of its neighbor columns.

The result is:
▫ Faster than the current local inhibition
▫ Better coverage over the input pattern
▫ Less pain to control sparsity

The control parameters

The inhibition radius
▫ If we have P preactive columns and the target number of active columns is A, one active column should inhibit P/A preactive columns on average.
▫ So P/A + 1 is a good candidate for the area of the local inhibition, giving an inhibition radius of sqrt(P/A + 1) / 2 - 1.
▫ Since preactive columns are not densely packed and there is some space between them, sqrt(P/A) seems to work well in practice.

The inhibition penalty
▫ This parameter is analogous to the current desired local activity.
▫ If the penalty is set to W% of the maximum possible intensity I, and C active neighbors have been chosen around a column, that column receives a total penalty of W% * I * C.
▫ For example (a small worked snippet follows this slide):
   - Assume W = 8% and the minimum overlap threshold is T = 2%.
   - If a preactive column has intensity 40% of I, choosing three active columns around it drops its intensity to 40% - 3*8% = 16% of I.
   - In the same way, four surrounding active columns drop it to 8%.
   - A fifth active column around it finally drops it to 0%, which is below T. This column will then not be selected, unless all not-yet-inhibited preactive columns are exhausted and we are still short of the target sparsity.
▫ Note that the final intensity of a column can go negative.
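A minimal sketch of that arithmetic, using the example values above (the names are illustrative, not taken from any implementation):

    def remaining_intensity(initial_pct, penalty_pct, active_neighbors):
        """Intensity (as % of the maximum I) after penalties from active neighbors."""
        return initial_pct - penalty_pct * active_neighbors

    penalty = 8.0          # W = 8% of the maximum intensity I
    threshold = 2.0        # minimum overlap threshold T = 2%
    for c in range(3, 6):  # 3, 4, then 5 active neighbors
        left = remaining_intensity(40.0, penalty, c)
        status = "still above" if left >= threshold else "below"
        print(f"{c} active neighbors -> {left:.0f}% of I ({status} the threshold)")
    # 3 -> 16%, 4 -> 8%, 5 -> 0% (below the 2% threshold)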

Computational complexity

O((P + A)*(P - A + 1) + A*n + A)
▫ P is the number of preactive columns (A < P ≤ N), A is the number of active columns (the target sparsity), and n is the number of columns within the inhibition radius.

This is still slower than global inhibition, O(P + A), but not as slow as the current O(N*n*k + A). The worst case is when P = N; still, the total is typically much less than N*n*k. Though A*n is an additional factor, the new algorithm is faster than the current one. As learning in the SP progresses, the number of columns to inhibit tends to decrease (the preactive SDR becomes leaner) as columns become segregated to their own input patterns. As P → A and n → 0, this local inhibition approaches O(3A), which is similar to the cost of global inhibition.
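As a rough illustration, plugging example values into the two cost formulas above (the parameter values here are assumptions chosen for illustration, not measurements):

    # Illustrative operation counts from the cost formulas above.
    N, k, A = 1024, 3, 41     # columns in the region, desired local activity, ~4% active
    n_old = 150               # columns in the old (fixed) inhibition radius
    P = 300                   # preactive columns in a typical case
    n_new = 25                # columns in the new radius, roughly (2*int(sqrt(P/A)) + 1)^2

    current  = N * n_old * k + A                      # O(N*n*k + A)
    improved = (P + A) * (P - A + 1) + A * n_new + A  # O((P+A)*(P-A+1) + A*n + A)
    print(f"current ~{current:,}, improved ~{improved:,}")  # ~460,841 vs ~89,726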

Pseudo Code

Input: List preactiveColumns; int targetActiveColumnCount
Output: List activeColumns

    int inhibitionRadius = (int)sqrt(preactiveColumns.Count / targetActiveColumnCount);
    for (;;)
    {
        chosen = findColumnWithLargestIntensity(preactiveColumns);
        preactiveColumns.Remove(chosen);
        activeColumns.Add(chosen);
        markColumnState(chosen, Active);
        if (activeColumns.Count == targetActiveColumnCount) break;
        for (c in neighborsOf(chosen))
            c.intensity -= localInhibitionPenalty;
    }

The number of scan lookups: (P + A)*(P - A + 1)
The constant amount of work: A
The number of subtractions: A * n
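For experimentation, here is a runnable Python sketch of the same loop, assuming a simple 2-D grid of columns; the Column class and the neighborhood test are simplified stand-ins, not NuPIC data structures.

    import math
    from dataclasses import dataclass

    @dataclass(eq=False)          # identity-based equality so remove() is unambiguous
    class Column:
        x: int
        y: int
        intensity: float          # overlap score; may go negative after penalties
        active: bool = False

    def improved_local_inhibition(preactive, target_active_count, penalty):
        """Pick winners globally by intensity, penalizing each winner's neighbors."""
        preactive = list(preactive)
        radius = int(math.sqrt(len(preactive) / target_active_count))
        active = []
        while preactive and len(active) < target_active_count:
            # Choose the globally strongest remaining preactive column.
            chosen = max(preactive, key=lambda c: c.intensity)
            preactive.remove(chosen)
            chosen.active = True
            active.append(chosen)
            # Penalize the remaining preactive columns within the inhibition radius.
            for c in preactive:
                if abs(c.x - chosen.x) <= radius and abs(c.y - chosen.y) <= radius:
                    c.intensity -= penalty
        return active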

Speed Measurement

When there are not many preactive columns (< 100), the new local inhibition is less than 4 times slower than global inhibition. With many columns to inhibit, the local inhibition can become about ten times slower, but not 60 or 100 times slower.

[Chart: elapsed-time ratio (new local / global); 1024 columns, 4% sparsity]

The coverage of SDR

Conditions compared: fanout diameter 10% of the input space (fanout radius 13) and fanout diameter 30% of the input space (fanout radius 38), each with global inhibition vs. the new local inhibition.

Setup: image size 256x256 pixels; region size 32x32 columns; target sparsity 4%; potential synapses 1024; connected synapses 50%; minimum overlap 20; inhibition penalty 8%.

[Figures: input image, initial SDR, and reverse map for global vs. new local inhibition. SDR colors: red = active columns, green = preactive with positive intensity, blue = preactive with negative intensity. Reverse-map colors: gray = disconnected synapse, green = inactive synapse, yellow = active synapse.]

With the new local inhibition, the active columns are more evenly distributed; thus it provides better coverage over the input pattern.

The coverage of SDR

Conditions compared: fanout diameter 100% of the input space (fanout radius 128), and fanout diameter 100% with no permanence bias (perm sinker 0), each with global inhibition vs. the new local inhibition.

Setup: image size 256x256 pixels; region size 32x32 columns; target sparsity 4%; potential synapses 1024; connected synapses 50%; minimum overlap 20; inhibition penalty 8%.

[Figures: input image, initial SDR, and reverse map for global vs. new local inhibition.]

There is not much difference here, as all columns see the same input; if no bias is given to the permanences, all columns are equal.

Effect of inhibition penalty

Setup: image size 256x256; region size 32x32; potential synapses 1024; minimum threshold 2%; connected synapses 50%; fanout diameter 50%.

[Figures: input image, initial SDR, and the resulting SDRs for global inhibition and for inhibition penalties of 0%, 1%, 4%, 8%, 32%, and 80%.]

Penalty 0% and global inhibition are identical, and as the penalty grows the results converge to one final SDR.

How the three missions are achieved

Sparsity
▫ The target sparsity is guaranteed, provided enough preactive columns are given, because the new algorithm does not stop until the target sparsity is reached.

Stability
▫ Stability is achieved because the new algorithm always picks the column with the globally highest intensity next. Similar inputs with similar intensity distributions end up with similar SDRs.

Coverage
▫ With a good inhibition penalty, active columns cannot gather at a single location and are distributed, which gives better coverage over the input image.

Directions for further work

▫ Implement the improvement in the NuPIC code and see whether it has any positive or negative impact on prediction results.
▫ Further optimize the algorithm, e.g. by dividing the space to narrow the search area for the next winner column, or by making the algorithm more GPU friendly.