Download presentation
Presentation is loading. Please wait.
Published bySharyl Smith Modified over 9 years ago
1
Analysis of Complex Systems John Sherwood Period 2
2
Abstract My project is involved with using data mining techniques on the internet in order to gather enough information for the use for a genetic algorithm in trend analysis of a complex system; e.g. the stock market
3
Scope The most fundamental element of my program is creating a correlation between news about a company and its stock and the price of the stock itself. In order to do this, a huge amount of data on both stock prices and news regarding companies must be processed into a quantitative format, and then extensively analyzed.
4
Expected Results In this project, I expect to at the very least have a very useful genetic algorithm, that given a list of independant and dependant data, can generate equations to create a tentative correlation. While the extremely chaotic nature of the specific application may prevent quantitative success in this instance, I do expect to have success on general terms.
5
Other's Work Due to the very lucrative nature of a program that could predict the stock market: Many have tried All have failed
6
Procedures Differs for each part of program Quantitative tests of success XML parser Data classification Trial and error tests Evaluation algorithms Discriminant generation
7
Design Several program segments: Data miner uses XML parser Stock Robots Discriminant generator Prediction shell Equation Refiner Module
8
Program Tests Bot Success Early program stage involved making subroutines that kept portfolios and invested Evaluated by profit Prediction Algorithm Not working yet Evaluation to be based on accuracy of predictions
9
Algorithms Different program segments use different algorithms Data mining algorithm Discriminant Generation Algorithm XML parsing algorithm Equation Refinement algorithm
10
Data Mining 1 st Generation Data Miner – Albert Used hardcoded processing of RSS feeds to gather news on stocks Data was hard to classify and generate a quantitative score for
11
Albert's Algorithm start_item = news.index(" ") end_item = news.index(" ")+7 item = news[start_item:end_item] news = news[end_item:] if len(item) == 0:break title = item[13:item.index(" ")] item = item[(item.index(" ")+8):] link = item[6:item.index(" ")].split("*")[-1] item = item[item.index(" ")+7:] pubdate = item[item.index(" ")+9:item.index(" ")] if " " in item:description = item[item.index(" ")+13:item.index(" ")] else:description = "" cursor = config.cursor() if cursor.execute("SELECT * FROM " + table + " WHERE stock=\""+stock+"\""+\ "AND pubdate=\""+pubdate+"\" AND title=\""+title+"\"") < 1: print title,"\n\t",pubdate,"\n\t",description cursor.execute("INSERT INTO " + table + " SET stock=\""+stock+"\", "+\ "pubdate=\""+pubdate+"\", title=\""+title+"\","+\ " description=\""+description+"\", link=\""+link+"\"")
12
Data Mining, cont 2 nd Generation Algorithm – Beatrice Planned to use XML parser to go through online search engine results Should generate much more data than Albert did, as well as more detailed data
13
XML Parser Two potential methods Iterative Uses a set of flags to determine what action to take with each character Recursive Splits XML document into sets of tags and processes each tag's child elements
14
Problems Main holdup on predictor is malformed XML Bots however are working very well (over a period of 4 weeks, averaged 7% profit for the 4, with all 4 making a profit and the highest profit being 19%)
15
Results and Conclusions Based on success of bots with very primitive prediction algorithms, encouraging for a sophisticated algorithm to be able to do anything Success/failure hinges on analysis of data
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.