Stock Market Prediction

Slides:



Advertisements
Similar presentations
FUNCTION FITTING Student’s name: Ruba Eyal Salman Supervisor:
Advertisements

Computational Learning An intuitive approach. Human Learning Objects in world –Learning by exploration and who knows? Language –informal training, inputs.
CSCI 347 / CS 4206: Data Mining Module 07: Implementations Topic 03: Linear Models.
0 Pattern Classification All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John.
1 lBayesian Estimation (BE) l Bayesian Parameter Estimation: Gaussian Case l Bayesian Parameter Estimation: General Estimation l Problems of Dimensionality.
Support Vector Machines
Bayesian Estimation (BE) Bayesian Parameter Estimation: Gaussian Case
CSCI 347 / CS 4206: Data Mining Module 04: Algorithms Topic 06: Regression.
Forecasting with Twitter data Presented by : Thusitha Chandrapala MARTA ARIAS, ARGIMIRO ARRATIA, and RAMON XURIGUERA.
COURSE OVERVIEW ADVANCED TEXT ANALYTICS Thomas Tiahrt, MA, PhD CSC492 – Advanced Text Analytics.
Classifiers, Part 3 Week 1, Video 5 Classification  There is something you want to predict (“the label”)  The thing you want to predict is categorical.
Walter Hop Web-shop Order Prediction Using Machine Learning Master’s Thesis Computational Economics.
CSCI 347 / CS 4206: Data Mining Module 05: WEKA Topic 01: WEKA Navigation.
Chapter 1: Introduction to Predictive Modeling 1.1 Applications 1.2 Generalization 1.3 JMP Predictive Modeling Platforms.
1 7-Speech Recognition (Cont’d) HMM Calculating Approaches Neural Components Three Basic HMM Problems Viterbi Algorithm State Duration Modeling Training.
Data Mining Joyeeta Dutta-Moscato July 10, Wherever we have large amounts of data, we have the need for building systems capable of learning information.
COMP3503 Intro to Inductive Modeling
7-Speech Recognition Speech Recognition Concepts
© Negnevitsky, Pearson Education, Will neural network work for my problem? Will neural network work for my problem? Character recognition neural.
IE 585 Introduction to Neural Networks. 2 Modeling Continuum Unarticulated Wisdom Articulated Qualitative Models Theoretic (First Principles) Models Empirical.
Data Mining Techniques in Stock Market Prediction
Bayesian networks Classification, segmentation, time series prediction and more. Website: Twitter:
1 SUPPORT VECTOR MACHINES İsmail GÜNEŞ. 2 What is SVM? A new generation learning system. A new generation learning system. Based on recent advances in.
Machine Learning Using Support Vector Machines (Paper Review) Presented to: Prof. Dr. Mohamed Batouche Prepared By: Asma B. Al-Saleh Amani A. Al-Ajlan.
Neural Networks Automatic Model Building (Machine Learning) Artificial Intelligence.
Comparison of Bayesian Neural Networks with TMVA classifiers Richa Sharma, Vipin Bhatnagar Panjab University, Chandigarh India-CMS March, 2009 Meeting,
Data Mining Practical Machine Learning Tools and Techniques Chapter 4: Algorithms: The Basic Methods Section 4.6: Linear Models Rodney Nielsen Many of.
Hidden Markov Models in Keystroke Dynamics Md Liakat Ali, John V. Monaco, and Charles C. Tappert Seidenberg School of CSIS, Pace University, White Plains,
Jun-Won Suh Intelligent Electronic Systems Human and Systems Engineering Department of Electrical and Computer Engineering Speaker Verification System.
Copyright © 2012, SAS Institute Inc. All rights reserved. ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY,
1 CONTEXT DEPENDENT CLASSIFICATION  Remember: Bayes rule  Here: The class to which a feature vector belongs depends on:  Its own value  The values.
CHAPTER 8 DISCRIMINATIVE CLASSIFIERS HIDDEN MARKOV MODELS.
Automated Interpretation of EEGs: Integrating Temporal and Spectral Modeling Christian Ward, Dr. Iyad Obeid and Dr. Joseph Picone Neural Engineering Data.
Supervised Machine Learning: Classification Techniques Chaleece Sandberg Chris Bradley Kyle Walsh.
WHAT IS DATA MINING?  The process of automatically extracting useful information from large amounts of data.  Uses traditional data analysis techniques.
Machine Learning: A Brief Introduction Fu Chang Institute of Information Science Academia Sinica ext. 1819
Pattern Recognition NTUEE 高奕豪 2005/4/14. Outline Introduction Definition, Examples, Related Fields, System, and Design Approaches Bayesian, Hidden Markov.
WHAT IS DATA MINING?  The process of automatically extracting useful information from large amounts of data.  Uses traditional data analysis techniques.
SUPPORT VECTOR MACHINES Presented by: Naman Fatehpuria Sumana Venkatesh.
Neural Networks Lecture 4 out of 4. Practical Considerations Input Architecture Output.
Feasibility of Using Machine Learning Algorithms to Determine Future Price Points of Stocks By: Alexander Dumont.
WEKA: A Practical Machine Learning Tool WEKA : A Practical Machine Learning Tool.
DATA MINING and VISUALIZATION Instructor: Dr. Matthew Iklé, Adams State University Remote Instructor: Dr. Hong Liu, Embry-Riddle Aeronautical University.
Support Vector Machine
Neural networks and support vector machines
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Machine Learning overview Chapter 18, 21
Document Filtering Social Web 3/17/2010 Jae-wook Ahn.
School of Computer Science & Engineering
Table 1. Advantages and Disadvantages of Traditional DM/ML Methods
Announcements HW4 due today (11:59pm) HW5 out today (due 11/17 11:59pm)
Supervised Learning Seminar Social Media Mining University UC3M
Basic machine learning background with Python scikit-learn
Chapter 3: Maximum-Likelihood and Bayesian Parameter Estimation (part 2)
LINEAR AND NON-LINEAR CLASSIFICATION USING SVM and KERNELS
Machine Learning Week 1.
Hidden Markov Models Part 2: Algorithms
CONTEXT DEPENDENT CLASSIFICATION
CSCI N317 Computation for Scientific Applications Unit Weka
Voice Activation for Wealth Management
MACHINE LEARNING TECHNIQUES IN IMAGE PROCESSING
MACHINE LEARNING TECHNIQUES IN IMAGE PROCESSING
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Data Mining, Machine Learning, Data Analysis, etc. scikit-learn
Speech recognition, machine learning
Neural Networks Weka Lab
Chapter 3: Maximum-Likelihood and Bayesian Parameter Estimation (part 2)
Speech recognition, machine learning
Is Statistics=Data Science
Presentation transcript:

Stock Market Prediction using Linear Regression and Big Data Sentiment Analysis -Surya Bhatt (NOTE: The actual formatting of my PPT has changed because I had made it in a different editor. Which I'll be using in the class)

Researches Done to-date

Markov Models Used for dependent random events Markov chains depend only on the immediate preceding event It won’t predict situations which have never occurred before

Dynamic Bayesian Networks Extension of Hidden Markov Models Predict something based on history Assume that all states are described by same variables and same dependencies

Machine Learning Learn from data The machines are taught to identify patterns Which can further be used to predict future instances

Support Vector Machines It is a discriminative classifier defined by a separating hyperplane The algorithm outputs an optimal hyperplane which categorizes new examples

For a linearly separable set of 2-D points which belong to one of the 2 classes , find a separating straight line

A line is bad if it passes too close to the points because it will be noise sensitive and it will not generalize correctly. Therefore, our goal should be to find the line passing as far as possible from all points. .

SVM or DBN or Linear Regression? SVM is useful mostly for 1 or 2 class classifiers DBNs produce HMM equivalent results Linear Regression improves accuracy with increase in data input.

Linear Regression The independent variable in this case is time, with minute values The dependent variable is the stock price Weka is used to predict future values

Forecast Plugin[1]: Weka[2] Time series analysis used for forecast Plugin developed by Pentaho corporation Based on simple linear regression model [1]- http://wiki.pentaho.com/display/DATAMINING/Time+Series+Analysis+and+Forecasting+with+Weka [2]- Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, Ian H. Witten (2009); The WEKA Data Mining Software: An Update; SIGKDD Explorations, Volume 11, Issue 1.

Time Series Analysis Time series forecasting is the process of using a model to generate predictions for future events based on known past events Ex- Sales forecasting, DNA sequence prediction, speech recognition etc.

Output with 5% confidence level

Output with 95% confidence level

Python based platform for Human Language Analysis Natural Language Processing Toolkit (NLTK)[3] Python based platform for Human Language Analysis Breaks the input text into tokens, which can be handled as we want A live demo will be better… [3]- Bird, Steven, Edward Loper and Ewan Klein (2009), Natural Language Processing with Python. O’Reilly Media Inc.

Results of streaming sample data

Another classification

Integrating and validating the results For the present test case, the stock prices were omitted for the last 40 days and were left for prediction. The linear regression showed a significant accuracy in finding the stock prices for all of those days.

Integrating and validating the results Also the sentiment analysis showed positive sentiments for those days. This analysis was as expected for the test data set

Scope for thesis research For the big data sentiment analysis, there is a lot to be done Presently the data is taken only from a stream of input of a web page Future study shall include data from all the tweets and further, from other social media

Scope for thesis research Further improvement in the validation methods can be brought about. Use of Artificial Neural Networks to check the levels of accuracy between methods

Thank You!