Data and Information How and why do we organize data? Differences between data and information? What about knowledge?

Slides:



Advertisements
Similar presentations
Introduction to Java 2 Programming Lecture 5 The Collections API.
Advertisements

Collections Framework A very brief look at Java’s Collection Framework David Davenport May 2010.
Sets and Maps Chapter 9. Chapter 9: Sets and Maps2 Chapter Objectives To understand the Java Map and Set interfaces and how to use them To learn about.
CSE 373 Data Structures and Algorithms Lecture 18: Hashing III.
CSE 143 Lecture 7 Sets and Maps reading: ; 13.2 slides created by Marty Stepp
Maps A map is an object that maps keys to values Each key can map to at most one value, and a map cannot contain duplicate keys KeyValue Map Examples Dictionaries:
(c) University of Washingtonhashing-1 CSC 143 Java Hashing Set Implementation via Hashing.
CS2110: SW Development Methods Textbook readings: MSD, Chapter 8 (Sect. 8.1 and 8.2) But we won’t implement our own, so study the section on Java’s Map.
CompSci 100 Prog Design and Analysis II Sept 7, 2010 Prof. Rodger 1CompSci 100 Fall 2010.
Liang, Introduction to Java Programming, Sixth Edition, (c) 2007 Pearson Education, Inc. All rights reserved Chapter 22 Java Collections.
Liang, Introduction to Java Programming, Eighth Edition, (c) 2011 Pearson Education, Inc. All rights reserved Chapter 22 Java Collections.
CS212: DATA STRUCTURES Lecture 10:Hashing 1. Outline 2  Map Abstract Data type  Map Abstract Data type methods  What is hash  Hash tables  Bucket.
Collections in Java. Kinds of Collections Collection --a group of objects, called elements –Set-- An unordered collection with no duplicates SortedSet.
(c) University of Washington14-1 CSC 143 Java Collections.
Compsci 06/101, Fall What will we do today? l Practice solving problems  Solving problems without a computer  Is this different than solving.
1 Concrete collections II. 2 HashSet hash codes are used to organize elements in the collections, calculated from the state of an object –hash codes are.
Collections. The Plan ● Why use collections? ● What collections are available? ● How are the collections different? ● Examples ● Practice.
Compsci 100, Fall Data and Information How and why do we organize data? Differences between data and information? What about knowledge?
Chapter 18 Java Collections Framework
Data structures Abstract data types Java classes for Data structures and ADTs.
Sets, Maps and Hash Tables. RHS – SOC 2 Sets We have learned that different data struc- tures have different advantages – and drawbacks Choosing the proper.
Sets and Maps Chris Nevison. Set Interface Models collection with no repetitions subinterface of Collection –has all collection methods has a subinterface.
Compsci 06/101, Fall Vocabulary l What's the Spanish word for 'boy'? 'eat'?  How do you know? l What about the Turkish equivalents?  How do.
CompSci 100e 3.1 Plan for the week l Classes and Objects l Model-View-Controller l Maps and Hashing l Upcoming  Towards Algorithm analysis l APTs:  IsomorphicWords,
CPS 100, Spring Data and Information How and why do we organize data? Differences between data and information? What about knowledge?
Fall 2015, Kevin Quinn CSE373: Data Structures & Algorithms Lecture 25: Problem Solving CSE373: Data Structures and algorithms1.
CPS APTs and structuring data/information l Is an element in an array, Where is an element in an array?  DIY: use a loop  Use Collections, several.
CMSC 202 Containers and Iterators. Container Definition A “container” is a data structure whose purpose is to hold objects. Most languages support several.
Sets and Maps Chapter 9. Chapter Objectives  To understand the Java Map and Set interfaces and how to use them  To learn about hash coding and its use.
CPS What's in Compsci 100? l Understanding tradeoffs: reasoning, analyzing, describing…  Algorithms  Data Structures  Programming  Design l.
3-1 Java's Collection Framework Another use of polymorphism and interfaces Rick Mercer.
CompSci 100e 7.1 Plan for the week l Review:  Boggle  Coding standards and inheritance l Understand linked lists from the bottom up and top-down  As.
Introduction to Java Collection. Java Collections What are they? –A number of pre-packaged implementations of common ‘container’ classes, such as LinkedLists,
CompSci Toward understanding data structures  What can be put in a TreeSet?  What can be sorted?  Where do we find this information?  How do.
Java: Base Types All information has a type or class designation
Collections ABCD ABCD Head Node Tail Node array doubly linked list Traditional Arrays and linked list: Below is memory representation of traditional.
Sets and Maps Chapter 9.
Using the Java Collection Libraries COMP 103 # T2
Sort & Search Algorithms
Searching, Maps,Tries (hashing)
Java: Base Types All information has a type or class designation
CSC 222: Object-Oriented Programming
Chapter 19 Java Data Structures
CS 332: Algorithms Hash Tables David Luebke /19/2018.
Data Structures and Algorithms for Information Processing
Data Structures TreeMap HashMap.
Compsci 201 Priority Queues & Autocomplete
Efficiency add remove find unsorted array O(1) O(n) sorted array
Searching.
Road Map CS Concepts Data Structures Java Language Java Collections
TreeSet TreeMap HashMap
CS313D: Advanced Programming Language
© A+ Computer Science - Arrays and Lists © A+ Computer Science -
CSE 373: Data Structures and Algorithms
Sets, Maps and Hash Tables
Object Oriented Programming in java
CSE 373 Data Structures and Algorithms
CSE 373: Data Structures and Algorithms
CS2110: Software Development Methods
Sets and Maps Chapter 9.
Compiler Construction
CSE 373 Separate chaining; hash codes; hash maps
CS2013 Lecture 7 John Hurley Cal State LA.
slides created by Marty Stepp
Part of the Collections Framework
Java String Class String is a class
Compiler Construction
Arrays.
Introduction to Java Collection
Presentation transcript:

Data and Information How and why do we organize data? Differences between data and information? What about knowledge?

Organizing Data: ideas and issues Often there is a time/space tradeoff If we use more space (memory) we can solve a data/information problem in less time: time efficient If we use more more time, we can solve a data/information problem with less space: space efficient Search v Store: repeating the same thing again … We’re not “smart” enough to avoid the repetition Learn new data structures or algorithms! The problem is small enough or done infrequently enough that being efficient doesn’t matter Markov illustrates this (next assignment)

Map: store pairs of (key,value) Search engine: web pages for “clandestine” Key: word or phrase, value: list of web pages This is a map: search query->web pages DNS: domain name duke.edu  IP: 152.3.25.24 Key: domain name, value: IP address This is a map: domain name->IP address Map (aka table, hash) associates keys with values Insert (key,value) into map, iterate over keys or pairs Retrieve value associated with a key, remove pair

Maps, another point of view An array is a map, consider array arr The key is an index, say i, the value is arr[i] Values stored sequentially/consecutively, not so good if the keys/indexes are 1, 100, and 1000, great if 0,1,2,3,4,5 Time/space trade-offs in map implementations, we’ll see more of this later TreeMap: most operations take time log(N) for N-elements HashMap: most operations are constant time on average Time for insert, get, … doesn’t depend on N (wow!) But! Elements in TreeMap are in order and TreeMap uses less memory than HashMap

Map (foreshadowing or preview) Any kind of Object can be inserted as a key in a HashMap But, performance might be terrible if hashValue isn’t calculated well Every object has a different number associated with it, we don’t want every object to be associated with 37, we want things spread out Only Comparable object can be key in TreeMap Basically compare for less than, equal, or greater Some objects are naturally comparable: String, Integer Sometimes we want to change how objects are compared Sometimes we want to invent Comparable things

Traceroute: where’s the map here? traceroute www.cs.dartmouth.edu traceroute to katahdin.cs.dartmouth.edu (129.170.213.101), 64 hops max, 1 lou (152.3.136.61) 2.566 ms 2 152.3.219.69 (152.3.219.69) 0.258 ms 3 tel1sp-roti.netcom.duke.edu (152.3.219.54) 0.336 ms 4 rlgh7600-gw-to-duke7600-gw.ncren.net (128.109.70.17) 184.752 ms 5 rlgh1-gw-to-rlgh7600-gw.ncren.net (128.109.70.37) 1.379 ms 6 rtp11-gw-to-rpop-oc48.ncren.net (128.109.52.1) 1.840 ms 7 rtp7600-gw-to-rtp11-gw-sec.ncren.net (128.109.70.122) 1.647 ms 8 dep7600-gw2-to-rtp7600-gw.ncren.net (128.109.70.138) 2.273 ms 9 internet2-to-dep7600-gw2.ncren.net (198.86.17.66) 10.494 ms 10 ge-0-1-0.10.nycmng.abilene.ucaid.edu (64.57.28.7) 24.058 ms 11 so-0-0-0.0.rtr.newy.net.internet2.edu (64.57.28.10) 45.609 ms 12 nox300gw1-vl-110-nox-internet2.nox.org (192.5.89.221) 33.839 ms 13 … 14 … 15 border.ropeferry1-crt.dartmouth.edu (129.170.2.193) 50.991 ms 16 katahdin.cs.dartmouth.edu (129.170.213.101) 50.480 ms

John von Neumann “Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin.” “There's no sense in being precise when you don't even know what you're talking about. “ “There are two kinds of people in the world: Johnny von Neumann and the rest of us.” Eugene Wigner, Noble Physicist

Interface at work: Frequencies.java From recitation: key is a string, value is # occurrences Code below is slightly modified version of recitation code What clues for prototype of map.get and map.put? What if a key is not in map, what value returned? What kind of objects can be put in a map? Kinds of maps? for(String s : words) { s = s.toLowerCase(); Integer count = map.get(s); if (count == null){ map.put(s,1); } else{ map.put(s,count+1);

Coding Interlude: FrequenciesSorted Nested classes in FrequenciesSorted WordPair: combine word and count together, why? WPFreq: allows WordPair objects to be compared by freq How are WordPair objects created? In doFreqsA is the comparable-ness leveraged? What about in sorting? Alternative in doFreqsB Use TreeMap, then ArrayList then sort, why? Is comparable-ness leveraged? Sorting?

What can an Object do (to itself)? http://www.cs.duke.edu/csed/java/jdk1.6/api/index.html Look at java.lang.Object What is this class? What is its purpose? toString() Used to print (System.out.println) an object overriding toString() useful in new classes String concatenation: String s = "value "+ x; Default is basically a pointer-value

What else can you do to an Object? equals(Object o) Determines if guts of two objects are the same, must override, e.g., for using a.indexOf(o) in ArrayList a Default is ==, pointer equality hashCode() Hashes object (guts) to value for efficient lookup If you're implementing a new class, to play nice with others you must Override equals and hashCode Ensure that equal objects return same hashCode value

What's in the boxes? "genome" is in the boxes Objects and values Primitive variables are boxes think memory location with value Object variables are labels that are put on boxes String s = new String("genome"); String t = new String("genome"); if (s == t) {they label the same box} if (s.equals(t)) {contents of boxes the same} s t What's in the boxes? "genome" is in the boxes

Objects, values, classes For primitive types: int, char, double, boolean Variables have names and are themselves boxes (metaphorically) Two int variables assigned 17 are equal with == For object types: String, ArrayList, others Variables have names and are labels for boxes If no box assigned, created, then label applied to null Can assign label to existing box (via another label) Can create new box using built-in new Object types are references/pointers/labels to storage

Anatomy of a class public class Foo { private int mySize; private String myName; public Foo(){ // what’s needed? } public int getSize(){ return mySize; public double getArea(){ double x; x = Math.sqrt(mySize); return x; What values for vars (variables) and ivars (instance variables)?

David Parnas "For much of my life, I have been a software voyeur, peeking furtively at other people's dirty code. Occasionally, I find a real jewel, a well-structured program written in a consistent style, free of kludges, developed so that each component is simple and organized, and designed so that the product is easy to change. "

Parnas on re-invention "We must not forget that the wheel is reinvented so often because it is a very good idea; I've learned to worry more about the soundness of ideas that were invented only once. "

David Parnas (entry in Wikipedia) Module Design: Parnas wrote about the criteria for designing modules, in other words, the criteria for grouping functions together. This was a key predecessor to designing objects, and today's object-oriented design. Social Responsibility: Parnas also took a key stand against the Strategic Defense Initiative (SDI) in the mid 1980s, arguing that it would be impossible to write an application that was free enough from errors to be safely deployed. Professionalism: He believes that software engineering is a branch of traditional engineering.

Tomato and Tomato, how to code java.util.Collection and java.util.Collections one is an interface add(), addAll(), remove(), removeAll(), clear() toArray(), size(), iterator() one is a collection of static methods sort(), shuffle(), reverse(), max() frequency(), indexOfSubList () java.util.Arrays Also a collection of static methods sort(), fill(), binarySearch(), asList()