Computing and Data Analysis

Presentation transcript:

Computing and Data Analysis Unit 6

Today’s Lesson Objectives: Basic Analytics (20 min); Computing with Text Activity Part 1 (30 min); Focusing on words (20 min)

Text Analysis, Part 1: Basic Analytics. Text can be analyzed in many different ways. Research areas like “stylometrics” attempt to say something quantitative about an author’s work, e.g., by computing the average number of words per sentence or the average number of letters per word written by an author. Analytics can also be used to find patterns in other types of text. See Computing with Text Background.
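The two stylometric measures mentioned above can be sketched in a few lines of Python. This is only an illustration, not the tool used in the lesson: the function names and the sample text are made up for the example, and the sentence splitting is deliberately naive (it treats every run of ".", "!" or "?" as a sentence end, so abbreviations like "e.g." would be miscounted).

```python
import re

def average_words_per_sentence(text):
    # Naive assumption: a sentence ends at any run of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return len(words) / len(sentences) if sentences else 0.0

def average_letters_per_word(text):
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    # Count only alphabetic characters, so apostrophes are not letters.
    letters = sum(ch.isalpha() for word in words for ch in word)
    return letters / len(words)

# Hypothetical sample text, not taken from the lesson's data file.
sample = "The cat sat. The dog ran away!"
print(average_words_per_sentence(sample))
print(average_letters_per_word(sample))
```

Comparing these two numbers across texts by different authors is the kind of quantitative statement stylometrics aims for.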

Basic Analytics: The first part covers the basics of counting words in a file and creating bar charts from those counts, working with a subset of the Twitter data file. Load the CATwitter data file and look at the tweets. Change the size of the column in order to view the entire tweet. Scroll and look at the variety of tweets.

Text Mining: Tweets are stored in an array (or vector), where the number in front of each tweet indicates its position in the vector. Arrays are an important concept in computer science. Storing items in an array allows us to access particular elements and to search and sort them. Create a frequency table that separates out each word and counts how many times it appears across all the tweets. What is the word that appears least frequently? What is the word that appears most frequently?
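As a rough sketch of what building such a frequency table involves, the Python below stores a few tweets in a list (Python's array-like type) and counts every word. The tweets are invented for illustration and are not from the CATwitter file; the function name and the tokenizing rule (lowercase, keep letters, digits, "#", "@" and apostrophes) are assumptions. Note that Python numbers positions from 0, whereas the position numbers shown in the lesson's tool may start at 1.

```python
import re
from collections import Counter

# Hypothetical tweets; the real data comes from the CATwitter file.
tweets = [
    "Sunny day in LA",
    "Traffic in LA again",
    "Love a sunny beach day",
]

def word_frequencies(tweet_list):
    """Count how often each word appears across all tweets."""
    counts = Counter()
    for tweet in tweet_list:
        # Lowercase first so "Sunny" and "sunny" count as the same word.
        counts.update(re.findall(r"[a-z0-9#@']+", tweet.lower()))
    return counts

freq = word_frequencies(tweets)

# Access one element of the array by its position (0 is the first tweet).
print(tweets[0])

# Most frequent word; ties are possible, and this reports one of them.
most_word, most_count = freq.most_common(1)[0]

# Least frequent words; usually many words tie at a count of 1.
least_count = min(freq.values())
least_words = sorted(w for w, c in freq.items() if c == least_count)

print(most_word, most_count)
print(least_words, least_count)
```

In real tweet data, many words typically appear only once, so "the least frequent word" is usually a long list of ties rather than a single answer.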

Produce frequency tables that show only the most frequently appearing words, and try the different sorting options. Demo how to produce a bar chart of frequently occurring words. Complete Part 1 of the Computing with Text Activity.
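For readers who want to see the idea outside the lesson's tool, here is a minimal Python sketch of restricting a frequency table to its top entries, sorting it two ways, and drawing a text-only bar chart. The word counts and all function names are hypothetical; the lesson's own software produces real charts, which this sketch only imitates with "#" characters.

```python
from collections import Counter

# Hypothetical word counts, standing in for a real frequency table.
counts = Counter({"the": 5, "la": 3, "beach": 3, "sunny": 2, "again": 1})

def top_words(word_counts, n):
    """Return the n most frequent (word, count) pairs.

    Sorted by count, highest first; ties are broken alphabetically.
    """
    ordered = sorted(word_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:n]

def alphabetical(word_counts):
    """Return all (word, count) pairs sorted by word."""
    return sorted(word_counts.items())

def text_bar_chart(rows):
    """Render one line per word, with one '#' per occurrence."""
    return "\n".join(f"{word:<8}{'#' * count} {count}" for word, count in rows)

top3 = top_words(counts, 3)
print(text_bar_chart(top3))
print(alphabetical(counts))
```

Sorting by count answers "which words dominate?", while sorting alphabetically makes it easier to look up a particular word.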