Introduction to Data Management Lecture 2
Overview
A little history
Other computer-based spreadsheet systems
What is a spreadsheet?
Types of Data
KDD Process
Spreadsheet Beginnings
Pencil and paper
’78: Dan Bricklin, Harvard Business School
–Options for a large project: a time-sharing mainframe or pencil and paper (and probably an eraser too)
–Neither was a viable choice
Created VisiCalc for the PC of the day
–Apple II
Spreadsheet Beginnings
In the beginning, VisiCalc could handle only 5 columns x 20 rows, which was not very useful.
Made more robust by Bob Frankston and became a huge success
Mitch Kapor developed Lotus 1-2-3 in the early ’80s
–Lotus 1-2-3 was the first application to drive the widespread adoption of computers in the business environment.
Spreadsheet Beginnings
Then Microsoft enters with Excel in ’85.
–Started out on the Apple Macintosh because of its windowing ability
–Moved to Windows in ’87
–Remained the flagship Windows product for 3 years
Little competition until 1992
First Spreadsheet Program
Other Spreadsheets
Lotus (older Windows ver.)
Open Office Calc
Excel 2003
What is a Spreadsheet?
Rows and columns intersecting in cells
Allows a 2D spatial relationship to be imposed on data

      C1    C2    C3
R1                =c1r1*c2r1
R2                =c1r2*c2r2
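The grid above can be sketched in a few lines of Python. This is only an illustration, not how any particular spreadsheet program stores its cells: the dictionary layout, the `get` helper, and the numbers placed in columns 1 and 2 are all assumptions made for the example.

```python
# A tiny grid: each cell holds either a plain value or a formula.
# Cell names follow the slide's style (c1r1 = column 1, row 1).
# The numbers in columns 1 and 2 are made up for illustration.
cells = {
    "c1r1": 2, "c2r1": 5,
    "c1r2": 3, "c2r2": 7,
    # Formulas are stored as functions that read other cells by name.
    "c3r1": lambda get: get("c1r1") * get("c2r1"),
    "c3r2": lambda get: get("c1r2") * get("c2r2"),
}


def get(name):
    """Return a cell's value, evaluating its formula if it has one."""
    content = cells[name]
    return content(get) if callable(content) else content


row_products = [get("c3r1"), get("c3r2")]
print(row_products)  # [10, 21]
```

Column 3 holds no data of its own; its values exist only because of the positions of the cells it refers to.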
What is a spreadsheet?
Pieces of data and...
Formulas to look at the data in a particular way.
–Where information comes from
–Power lies in the ability to use complex formulas.
Value Recalculation
When a cell on which a calculated value depends is changed, the calculated value is automatically updated.
How?
–Is the value of every cell in the whole spreadsheet recalculated?
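One common way to avoid recalculating every cell is to record which cells read which, and to re-evaluate only the cells downstream of a change. The sketch below illustrates that idea under simplifying assumptions: the `Sheet` class and its method names are hypothetical, formulas are plain Python functions, and there are no circular references.

```python
from collections import defaultdict


class Sheet:
    """Hypothetical spreadsheet that recalculates only affected cells."""

    def __init__(self):
        self.values = {}                    # cell name -> current value
        self.formulas = {}                  # cell name -> (function, input names)
        self.dependents = defaultdict(set)  # cell name -> cells that read it
        self.recalculated = []              # log of formula cells re-evaluated

    def set_value(self, name, value):
        self.values[name] = value
        self._recalculate_dependents(name)

    def set_formula(self, name, func, inputs):
        # Inputs must already have values when the formula is entered.
        self.formulas[name] = (func, inputs)
        for source in inputs:
            self.dependents[source].add(name)
        self._evaluate(name)
        self._recalculate_dependents(name)

    def _evaluate(self, name):
        func, inputs = self.formulas[name]
        self.values[name] = func(*(self.values[i] for i in inputs))
        self.recalculated.append(name)

    def _recalculate_dependents(self, name):
        # Visit only cells downstream of the change. A cell reachable by
        # several paths is evaluated more than once here; a real engine
        # would order the work so each cell is evaluated a single time.
        for dependent in sorted(self.dependents[name]):
            self._evaluate(dependent)
            self._recalculate_dependents(dependent)


sheet = Sheet()
sheet.set_value("A1", 2)
sheet.set_value("B1", 5)
sheet.set_value("A2", 3)
sheet.set_value("B2", 7)
sheet.set_formula("C1", lambda a, b: a * b, ["A1", "B1"])
sheet.set_formula("C2", lambda a, b: a * b, ["A2", "B2"])
sheet.set_formula("D1", lambda c1, c2: c1 + c2, ["C1", "C2"])

sheet.recalculated.clear()
sheet.set_value("A1", 4)  # only C1 and D1 depend on A1
print(sheet.values["C1"], sheet.values["D1"], sheet.recalculated)
# 20 41 ['C1', 'D1']
```

Changing A1 leaves C2 untouched, so in this sketch the answer to the slide's question is no.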
Types of Data
Different types of data require different reactions to operators
–August 24, 2005 - 4 = ?
–August 24, 2005 - 25 = ?
Some operators don’t operate on some types of data
–Addition and string data
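The two date questions can be worked through in Python, assuming the number being subtracted is a count of days (the usual spreadsheet reading). Python is stricter than most spreadsheets here: the number has to be wrapped in a `timedelta`, and adding a number to text is rejected outright.

```python
from datetime import date, timedelta

start = date(2005, 8, 24)

# Assumption: subtracting a number from a date means "that many days earlier".
four_days_earlier = start - timedelta(days=4)
twenty_five_days_earlier = start - timedelta(days=25)

print(four_days_earlier)         # 2005-08-20
print(twenty_five_days_earlier)  # 2005-07-30 (crosses the month boundary)

# Some operators do not apply to some types: text plus a number fails.
try:
    "total" + 4
    string_plus_number_ok = True
except TypeError:
    string_plus_number_ok = False

print(string_plus_number_ok)     # False
```

The same minus sign does month-aware calendar arithmetic on a date and ordinary subtraction on a number, which is why a spreadsheet needs to know the type of each cell.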
Types of Data
Textual
Numeric
Dates
Calculated values
–Contain references (absolute or relative) to other cells in the spreadsheet, workbook, etc.
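The difference between absolute and relative references shows up when a formula is copied to another cell. The sketch below assumes the Excel-style convention in which `$` marks the absolute part of a reference (for example `$B$1`); the slides do not name a notation, and `copy_formula` and its helpers are hypothetical. The pattern matching is deliberately simple and would misread function names that end in digits.

```python
import re

# Optional "$", column letters, optional "$", row digits (e.g. A1, $B$1, C$2).
REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def col_to_num(col):
    """Convert column letters to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n):
    """Convert a column number back to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def copy_formula(formula, row_offset, col_offset):
    """Rewrite a formula as if it were copied by the given offsets."""
    def shift(match):
        col_abs, col, row_abs, row = match.groups()
        if not col_abs:  # relative column: moves with the copy
            col = num_to_col(col_to_num(col) + col_offset)
        if not row_abs:  # relative row: moves with the copy
            row = str(int(row) + row_offset)
        return f"{col_abs}{col}{row_abs}{row}"
    return REF.sub(shift, formula)


copied_down = copy_formula("=A1*$B$1", row_offset=1, col_offset=0)
print(copied_down)  # =A2*$B$1
```

The relative reference `A1` follows the formula to its new row, while the absolute reference `$B$1` keeps pointing at the same cell.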
KDD
Knowledge Discovery in Data (KDD)
–Finding useful, possibly hidden, information in data
Initial Set of Data -> [Data Selection] -> Data Required -> [Pre-processing] -> Preprocessed Data -> [Transformation] -> Transformed Data -> [Mining] -> Model -> [Analysis/Interpretation] -> Knowledge/Information
KDD Process
Selection
–Finding the data that you want to analyze
Preprocessing
–Erroneous data removed
–Missing data supplied or predicted
Transformation
–“Massage” data into its most usable form
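The three steps above can be sketched on a handful of made-up sales records. Everything here is an assumption chosen for illustration: the field names, the rule that negative unit counts are erroneous, and the choice to fill a missing value with the mean of the known ones.

```python
# Made-up records; None marks a missing value.
raw_records = [
    {"region": "East", "units": 10, "price": 2.5},
    {"region": "East", "units": None, "price": 2.5},  # missing units
    {"region": "West", "units": -5, "price": 3.0},    # erroneous: negative units
    {"region": "East", "units": 8, "price": 2.5},
    {"region": "West", "units": 21, "price": 3.0},
    {"region": "North", "units": 4, "price": 4.0},    # not part of this analysis
]

# Selection: keep only the data we want to analyze.
selected = [r for r in raw_records if r["region"] in {"East", "West"}]

# Preprocessing, part 1: remove erroneous data.
valid = [r for r in selected if r["units"] is None or r["units"] >= 0]

# Preprocessing, part 2: supply missing data, here the mean of known values.
known = [r["units"] for r in valid if r["units"] is not None]
mean_units = sum(known) / len(known)
cleaned = [
    {**r, "units": mean_units if r["units"] is None else r["units"]}
    for r in valid
]

# Transformation: reshape into the form the analysis needs (revenue per record).
transformed = [
    {"region": r["region"], "revenue": r["units"] * r["price"]}
    for r in cleaned
]

for row in transformed:
    print(row)
```

Each step produces a new list and leaves the previous one unchanged, so the intermediate data sets stay available for inspection.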
KDD Process
Data Mining
–Apply data mining algorithms
–Discovery
Interpretation/Evaluation
–Interpret what you’ve discovered
–Put in usable form
–Visualization techniques
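As one small example of a mining step followed by interpretation, the sketch below counts which pairs of items appear together in made-up shopping baskets and reports the pairs that occur in at least half of them. Pair counting is only one of many mining techniques and is not taken from the slides; the baskets and the `min_support` threshold are assumptions.

```python
from collections import Counter
from itertools import combinations

# Made-up transformed data: one set of items per purchase.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "eggs"},
]

# Mining: count how often each pair of items is bought together.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Interpretation: keep pairs found in at least half of the baskets
# and express each count as a share of all baskets.
min_support = 0.5
frequent_pairs = {
    pair: count / len(transactions)
    for pair, count in pair_counts.items()
    if count / len(transactions) >= min_support
}

# Put the result in a usable form.
for (first, second), support in frequent_pairs.items():
    print(f"{first} and {second} appear together in {support:.0%} of baskets")
# bread and milk appear together in 60% of baskets
```

The raw counts are the discovery; the threshold and the printed sentence are the interpretation that turns them into something a reader can act on.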
Questions?