Homework 3 Responses: This was meant to give you practice with data management/mining given a non-ideally formatted input data structure.



Sorting and List Management: “The strategy I'm testing on these shorter lists is to try and wind up with a list containing only ['year','storm id','min nonzero pressure'], so I want to try and do it by sorting through the lists, for a given year and storm id finding the min pressure and adding it to a new list... so far my attempts have been fruitless.” This was the typical problem encountered by some of you.
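One way to carry out the strategy quoted above is to keep a dictionary keyed on (year, storm id) instead of sorting. The following is a minimal sketch in Python, assuming each record has already been parsed into a (year, storm_id, pressure) tuple. The function name and the sample records are made up for illustration and are not taken from the homework data.

```python
def min_nonzero_pressure(records):
    """Return [year, storm_id, min nonzero pressure] rows, one per storm."""
    lowest = {}
    for year, storm_id, pressure in records:
        if pressure <= 0:
            # Assumption: a pressure of 0 means "not reported", so skip it.
            continue
        key = (year, storm_id)
        # Keep the smallest nonzero pressure seen so far for this storm.
        if key not in lowest or pressure < lowest[key]:
            lowest[key] = pressure
    # Sort by (year, storm_id) so the output order is predictable.
    return [[year, storm_id, p] for (year, storm_id), p in sorted(lowest.items())]


# Hypothetical sample records: (year, storm id, central pressure in mb)
sample = [
    (1950, 1, 0),
    (1950, 1, 990),
    (1950, 1, 975),
    (1950, 2, 1002),
    (1951, 1, 0),
    (1951, 1, 960),
]
result = min_nonzero_pressure(sample)
print(result)
```

Note that a storm whose records all report a pressure of 0 would drop out of the result entirely under this filter.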

Some Success

Many Counting Problems

Producing the right file to “count”: What is the “problem” here with the raw data file?

Code to produce a new data file: I could have filtered on “0” pressure, but that would have reduced the actual number of hurricanes carried forward.

Next pass produces initial “counting” file

Producing the same counting file using csh scripts (practice this in csh!):
1. grep -i "HURR" master1.txt > new.txt
2. sed s/NAMED//g new.txt > new1.txt
3. awk '{print $1 FS $6 FS $7}' new1.txt > new2.txt (**)
4. sort -n -k 2 new2.txt > new3.txt
new3.txt is then a file with 3 columns in which the FIRST occurrence of each unique storm ID is the lowest central pressure. The actual files are on the following slides.
(**) Could use cut -d" " -f1,6,7 new1.txt > new2.txt, but I always screw up CUT.
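For comparison, the same four steps can be sketched in Python. This is only an illustration under stated assumptions: fields are whitespace-separated, fields 1, 6 and 7 are the ones kept (as in the awk step), and columns 2 and 3 of the result are storm ID and central pressure (as the sort step implies). The function name and the sample lines are hypothetical and do not reproduce the real layout of master1.txt.

```python
def build_counting_rows(lines):
    """Mimic the grep / sed / awk / sort steps on a list of text lines."""
    rows = []
    for line in lines:
        if "hurr" not in line.lower():        # like: grep -i "HURR"
            continue
        line = line.replace("NAMED", "")      # like: sed s/NAMED//g
        fields = line.split()                 # whitespace split, as awk does
        if len(fields) < 7:
            continue                          # choice made here: skip short lines
        # like: awk '{print $1 FS $6 FS $7}' (Python indexes from 0)
        rows.append((fields[0], int(fields[5]), int(fields[6])))
    # Order by storm ID, then by pressure, both compared as numbers.
    rows.sort(key=lambda r: (r[1], r[2]))
    return rows


# Hypothetical input lines, invented only to exercise the code.
sample_lines = [
    "1950 HURRICANE NAMED ABLE 08 12 101 990",
    "1950 HURRICANE NAMED ABLE 08 13 101 975",
    "1950 TROPICAL NAMED STORM 08 14 102 1002",
    "1950 HURRICANE NAMED BAKER 08 20 103 985",
    "1950 HURRICANE NAMED BAKER 08 21 103 1001",
]
rows = build_counting_rows(sample_lines)
for row in rows:
    print(*row)
```

Converting the pressure to an integer before sorting matters: compared as text, "1001" would sort ahead of "985".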

New.txt

New1.txt

New2.txt

New3.txt: Count this file any way you like; conditional IF statements are useful for counting.
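As one possible way to do the counting with conditional statements, here is a minimal Python sketch. It assumes the rows are (year, storm_id, pressure) tuples ordered so that the first row for each storm carries its lowest central pressure, and it counts storms per year below a 965 mb threshold. The column meanings, the function name and the sample rows are assumptions for illustration, not the contents of the real file.

```python
from collections import Counter


def count_strong_storms(rows, threshold=965):
    """Count storms per year whose first-listed pressure is below threshold."""
    seen = set()
    counts = Counter()
    for year, storm_id, pressure in rows:
        key = (year, storm_id)
        if key in seen:
            continue                 # only the first row per storm is used
        seen.add(key)
        if pressure < threshold:     # the conditional that does the counting
            counts[year] += 1
    return dict(counts)


# Hypothetical rows: (year, storm id, central pressure in mb)
sample_rows = [
    (1950, 101, 958),
    (1950, 101, 990),
    (1950, 103, 985),
    (1951, 201, 902),
    (1951, 201, 950),
    (1951, 202, 920),
]
strong = count_strong_storms(sample_rows)
print(strong)
```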

SCIENCE
- An example of science that is enabled by doing list management operations on databases.
- Current project: track analysis of Atlantic Basin hurricanes; all previous analysis is only for LANDFALL events.

Frequency: What detection efficiency improvement happened in 1950 relative to 1940?

Lots of Information!

Pressure analysis for strong storms < 965 mb; 2005 is dramatic!

New Important Result: Dynamics are changing; evolution is faster