Linear Regression review
http://youtu.be/GAmzwIkGFgE
http://youtu.be/ocGEhiLwDVc
http://youtu.be/qPga0OBV-O8
http://youtu.be/MwokVxy5tvg

Search and LR
- LR minimizes the sum of the squared errors between the regression line and the data points.
- LR finds the values of A and B in y = Ax + B that minimize the sum of the squared errors.
- Are there other ways of “finding” A and B? Yes.
- Do they guarantee minimizing the sum of the squared errors?
- Suppose the relationship is not linear?

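To make "LR finds values for A and B" concrete, here is a minimal sketch of the standard closed-form least-squares fit for a single input variable. The function names `fit_line` and `sse` and the sample data are assumptions made for illustration; they do not come from the slides.

```python
def fit_line(xs, ys):
    """Return (A, B) for y = A*x + B that minimizes the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (not divided by n; the ratio is the same).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    A = sxy / sxx
    B = mean_y - A * mean_x
    return A, B


def sse(A, B, xs, ys):
    """Sum of squared errors between the line y = A*x + B and the data."""
    return sum((A * x + B - y) ** 2 for x, y in zip(xs, ys))


# Made-up, roughly linear data for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
A, B = fit_line(xs, ys)
print("A =", A, "B =", B, "SSE =", sse(A, B, xs, ys))
```

Nudging A or B away from the fitted values and calling `sse` again should never give a smaller error, which is what "minimizes" means here.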
Problem solving as search
- Through the lens of search, all problems look the same.
- There is a space of candidate solutions.
- There is a candidate solution generator.
- There is a way to measure “progress” so you know when you reach a “good solution”.
- You can tell if you have found a “good solution”.
- You can compare two candidates and tell which is better.
- Every candidate has a cost (to minimize) or a utility (to maximize), which can guide progress.

Generate and test
    repeat
        candidate = generate()
        if test(candidate) == “good solution”
            break

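The loop above can be sketched in runnable form as follows. This is only one possible reading: the random generator, the error tolerance that defines a “good solution”, the cap on the number of tries, and the name `is_good_solution` (standing in for `test`) are all assumptions added for illustration.

```python
import random


def generate(rng):
    # Hypothetical generator: a random (A, B) pair drawn from a small box.
    return rng.uniform(-10, 10), rng.uniform(-10, 10)


def is_good_solution(candidate, xs, ys, tolerance):
    # Plays the role of test(): "good" is assumed to mean a sum of squared
    # errors no larger than the tolerance.
    A, B = candidate
    error = sum((A * x + B - y) ** 2 for x, y in zip(xs, ys))
    return error <= tolerance


def generate_and_test(xs, ys, tolerance=5.0, max_tries=1_000_000, seed=0):
    rng = random.Random(seed)
    for tries in range(1, max_tries + 1):
        candidate = generate(rng)
        if is_good_solution(candidate, xs, ys, tolerance):
            return candidate, tries
    return None, max_tries  # gave up without finding a good solution


# Made-up data lying exactly on y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
candidate, tries = generate_and_test(xs, ys)
print("candidate:", candidate, "found after", tries, "tries")
```

Tightening `tolerance` makes the accepted region smaller, so a blind generator like this one needs many more tries; that trade-off is the motivation for smarter generators.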
Search
- How is linear regression like generate and test?
- Linear regression has a very, very good generator that produces a “good solution” in one iteration.
- But it only works on linearly related data.
- Quadratic regression only works on quadratically related data.

Poorly understood data
- Stock markets
- GDP
- Cancer
- Car buying
- Aisle stocking
- Recommendations
- Images, videos

Poorly understood data
- Visualization can help when the data is two- or three-dimensional (maybe up to 10 dimensions). This is still an art.
- Generate and test might be slow.
- Consider using a simple generator for LS regression: it would generate all possible values of A and B within [-1024.00 ... +1024.00].
- Suppose we have 100 dimensions?

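A rough sketch of such an exhaustive generator shows why it scales badly. The step size of 0.01 is an assumption read off the two decimal places in the range above, and the function names are hypothetical. The search itself is run on a much smaller, coarser grid so that it finishes quickly.

```python
from itertools import product


def grid(low, high, step):
    """All values from low to high (inclusive) spaced by step."""
    count = int(round((high - low) / step)) + 1
    return [low + i * step for i in range(count)]


def exhaustive_search(xs, ys, low=-8.0, high=8.0, step=0.5):
    """Try every (A, B) pair on the grid and keep the one with the lowest error."""
    best, best_error = None, float("inf")
    for A, B in product(grid(low, high, step), repeat=2):
        error = sum((A * x + B - y) ** 2 for x, y in zip(xs, ys))
        if error < best_error:
            best, best_error = (A, B), error
    return best, best_error


def candidate_count(low, high, step, dimensions):
    """Number of grid points when every dimension uses the same range and step."""
    per_axis = int(round((high - low) / step)) + 1
    return per_axis ** dimensions


print(exhaustive_search([0, 1, 2, 3, 4], [1, 3, 5, 7, 9]))
print(candidate_count(-1024.0, 1024.0, 0.01, 2))            # two parameters: A and B
print(len(str(candidate_count(-1024.0, 1024.0, 0.01, 100))), "digits for 100 dimensions")
```

Under these assumptions there are 204,801 values per parameter, so about 4.2 × 10^10 (A, B) pairs for a single line fit, and the count for 100 dimensions has more than 500 digits.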
Can we use gradients?
- A gradient is a local slope.
- If we can tell which of two candidates is better, can we make progress towards a solution?
- Think about the Connect 4 learner: if one set of weights is better than another, can we make progress towards the “best” set of weights?

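One way to read "a gradient is a local slope" is to estimate the slope of the error numerically by comparing two nearby candidates, then step downhill. The sketch below uses finite-difference gradient descent on the line-fitting error. This technique is not named in the slides, and the step size `rate`, the probe distance `h`, and the iteration count are assumptions chosen for this small example.

```python
def sse(params, xs, ys):
    """Sum of squared errors for the line y = A*x + B, with params = [A, B]."""
    A, B = params
    return sum((A * x + B - y) ** 2 for x, y in zip(xs, ys))


def numerical_gradient(f, params, h=1e-6):
    """Estimate the local slope of f along each parameter by central differences."""
    grad = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def gradient_descent(xs, ys, start=(0.0, 0.0), rate=0.01, steps=5000):
    """Repeatedly move the parameters a small step against the local slope."""
    params = list(start)

    def error(p):
        return sse(p, xs, ys)

    for _ in range(steps):
        grad = numerical_gradient(error, params)
        params = [p - rate * g for p, g in zip(params, grad)]
    return params


# Made-up data lying exactly on y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
A, B = gradient_descent(xs, ys)
print("A =", A, "B =", B)
```

Each slope estimate costs two error evaluations per parameter, so the work per step grows with the number of dimensions rather than exponentially as in an exhaustive grid.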
Search algorithm
    solutionOld = generate()
    solutionNew = generate()
    repeat
        if evaluate(solutionNew) < evaluate(solutionOld)
            solutionOld = solutionNew
            solutionNew = modify(solutionNew)
        else
            solutionNew = modify(solutionOld)

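The search algorithm above can be sketched in Python for the line-fitting problem as follows. The slide leaves `generate`, `modify`, `evaluate`, and the stopping rule unspecified, so the choices here are assumptions: a random start, a small Gaussian nudge as the modification, the sum of squared errors as the cost, and a fixed iteration budget.

```python
import random


def evaluate(solution, xs, ys):
    # Cost to minimize: sum of squared errors of the line y = A*x + B.
    A, B = solution
    return sum((A * x + B - y) ** 2 for x, y in zip(xs, ys))


def generate(rng):
    # Hypothetical generator: a random (A, B) pair drawn from a small box.
    return rng.uniform(-10, 10), rng.uniform(-10, 10)


def modify(solution, rng, scale=0.1):
    # Hypothetical modification: nudge both parameters by a small random amount.
    A, B = solution
    return A + rng.gauss(0, scale), B + rng.gauss(0, scale)


def search(xs, ys, iterations=20000, seed=0):
    rng = random.Random(seed)
    solution_old = generate(rng)
    solution_new = generate(rng)
    for _ in range(iterations):  # stands in for the open-ended "repeat"
        if evaluate(solution_new, xs, ys) < evaluate(solution_old, xs, ys):
            solution_old = solution_new               # keep the better candidate
            solution_new = modify(solution_new, rng)
        else:
            solution_new = modify(solution_old, rng)  # retry from the best so far
    return solution_old


# Made-up data lying exactly on y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
best = search(xs, ys)
print("best (A, B):", best, "cost:", evaluate(best, xs, ys))
```

Because only improvements are kept, the cost of `solution_old` never increases; on an error surface with several dips this same property is what can leave the search stuck in a local optimum.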
Issues
- Time versus quality
- Limiting the search space
- Discretizing the search space
- Susceptibility to local optima