Compsci 101.2, Fall 2015: Plan for LDO101

Presentation transcript:

Plan for LDO101
- Ethical webpage scraping
  - Illustrate power of regular expressions
  - Python makes trying things relatively easy
- What's left: grades, finals, work
  - Optional APT, lab, finishing, studying
- What can't be done in Computer Science
  - Practical knowledge of theoretical concepts
- Acknowledging completion

Grading
- There are 11 labs, each worth 4 points
  - Will grade with max of 38 points needed, 10%
- Forty APTs are required (53 given)
  - Grades for 41 in Sakai, missing 8-10, 10%
- Reading quizzes, we drop 20 points
  - Class activity will update and drop 4 points
- For any concerns, fill out form by 12/4

Final Exam
- Material from the whole semester; emphasizes recent material, builds on all
  - Coding questions like midterm exams
- Multiple choice questions similar to in-class questions
  - We have to grade these quickly
- Best study? Look at previous midterms, be able to do our last midterm

Be a UTA!! Help next semester's 101

Ignorable: Dictionary Comprehensions
Given ["x", "y", "z", "w"], create a dictionary with a key for each element; each value is an empty list
- Use dictionary comprehension
- Initializes dictionary, just update

    d = {}
    for val in letters:
        d[val] = []

    d = {elt:[] for elt in letters}
    for elt in letters:
        d[elt].append(word.find(elt))
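The loop and the comprehension on this slide build the same dictionary. Here is a minimal runnable sketch in Python 3; the sample value of `word` is an assumption, since the slide never defines it.

```python
letters = ["x", "y", "z", "w"]
word = "xylophone wizard"  # hypothetical sample text, not from the slide

# Loop version: start empty and add one key at a time.
d_loop = {}
for val in letters:
    d_loop[val] = []

# Comprehension version: the same dictionary in one expression.
d = {elt: [] for elt in letters}
same_start = (d == d_loop)

# The comprehension evaluates [] once per key, so every key has its own list.
for elt in letters:
    d[elt].append(word.find(elt))  # find returns -1 if elt is absent

print(same_start)
print(d)
```

Because the comprehension creates a fresh list for each key, appending to one key's list leaves the others untouched.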

FriendScore APT
- What is a two-friend? Work an example by hand with paper and pencil
  - How do we find indexes of our friends?
  - How could we find indexes of another person's friends?
  - If Sam is my friend, and Pat is Sam's friend, is Pat my two-friend? Is Pat's friend Chris my 2F?
- Try in-class questions toward going green
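The questions on this slide can be answered with a short sketch. It assumes the input is a list of strings in which `friends[i][j] == "Y"` means persons i and j are friends; that format and the function names are assumptions for illustration, not taken from the slide.

```python
def friend_indexes(friends, i):
    """Indexes j such that person i and person j are direct friends."""
    return [j for j, ch in enumerate(friends[i]) if ch == "Y"]

def two_friends(friends, i):
    """People who are friends of i, or friends of one of i's friends."""
    result = set()
    for j in friend_indexes(friends, i):
        result.add(j)
        result.update(friend_indexes(friends, j))
    result.discard(i)  # a person is not their own two-friend
    return result

def highest_score(friends):
    """Largest number of two-friends that any one person has."""
    return max(len(two_friends(friends, i)) for i in range(len(friends)))

# Hypothetical chain: Me(0) - Sam(1) - Pat(2) - Chris(3)
chain = ["NYNN", "YNYN", "NYNY", "NNYN"]
print(two_friends(chain, 0))
print(highest_score(chain))
```

In this chain Pat is my two-friend through Sam, but Chris is three steps away and so is not.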

Answer Questions

Scraping email addresses from websites
- Suppose we want to send email to all Duke faculty to let them know …
  - Visit departmental website, people, faculty
  - View (HTML) source
  - Develop regex to access email addresses, if possible!
- RegexScraper.py
  - Python makes this simple
  - Ethical hacking?

Scraping math.duke.edu faculty
- Pattern:
  - r'math/faculty/(.*?)\"\>(.+?)\<'
- URL
- Matches: …
    ('motta', 'Francis C. Motta')
    ('jmmza', 'James Murphy')
    ('ryser', 'Marc D. Ryser')
    ('sv113', 'Stefano Vigogna')
    ('haizhao', 'Haizhao Yang')
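To see how the two capture groups in this pattern produce tuples like the ones listed, here is a small sketch that applies it with `re.findall` to a hand-written HTML fragment. The fragment and its link layout are assumptions; the real page source is not part of the transcript, and nothing is fetched from the network.

```python
import re

# Pattern copied from the slide: group 1 is the id in the link,
# group 2 is the link text up to the next "<".
PATTERN = r'math/faculty/(.*?)\"\>(.+?)\<'

# Hypothetical fragment standing in for the faculty page source.
sample_html = '''
<li><a href="/math/faculty/motta">Francis C. Motta</a></li>
<li><a href="/math/faculty/ryser">Marc D. Ryser</a></li>
<li><a href="/math/courses/">Courses</a></li>
'''

matches = re.findall(PATTERN, sample_html)
for netid, name in matches:
    print(netid, name)
```

The lazy quantifiers `.*?` and `.+?` stop at the first `">` and the first `<`, which keeps each match inside a single link. The third link is skipped because it does not contain `math/faculty/`.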

Scraping Sanford/PubPol faculty
- Pattern:
- URL
- Matches (call 16 times with different URL) …
    ('schanzer', 'duke.edu')
    ('steveschewel', 'gmail.com')
    ('michael.schoenfeld', 'duke.edu')
    ('schroeder', 'law.duke.edu')

Scraping Biology faculty
- Pattern:
- URL
- Matches (call 26 times with different URL) …
    ('emily.bernhardt', 'duke.edu')
    ('bhandawat', 'gmail.com')
    ('jboynton66', 'gmail.com')
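The pattern used for these pages is not preserved in the transcript. As a stand-in, the sketch below shows one way a regex could yield (user, domain) tuples of the shape listed above. The pattern, the `mailto:` link layout, and the sample addresses are all assumptions, not what was shown in lecture.

```python
import re

# Hypothetical pattern: capture the user and domain parts of a mailto link.
EMAIL_PATTERN = r'mailto:([\w.+-]+)@([\w.-]+\.\w+)'

# Hypothetical page fragment with made-up addresses.
sample_html = '''
<a href="mailto:jane.doe@duke.edu">Email</a>
<a href="mailto:jdoe42@gmail.com">Email</a>
'''

emails = re.findall(EMAIL_PATTERN, sample_html)
print(emails)
```

Calling the same function once per page, each time with a different URL, would match the "call 16 times" and "call 26 times" notes on the slides.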

What is Computing? Informatics?
- What is computer science, what is its potential?
  - What can we do with computers in our lives?
  - What can we do with computing for society?
  - Will networks transform thinking/knowing/doing?
  - Society affecting and affected by computing?
  - Changes in science: biology, physics, chemistry, …
  - Changes in humanity: access, revolution (?), …
- Privileges and opportunities available if you know code
  - Writing and reading code, understanding algorithms
  - Majestic, magical, mathematical, mysterious, …

What can be programmed?
- What class of problems can be solved?
  - Linux, Cloud, Mac, Windows10, Android, …
  - Alan Turing contributions: Halting problem, Church-Turing thesis
- What class of problems can be solved efficiently?
  - Problems with no practical solution
  - What does practical mean?

Schedule students, minimize conflicts
- Given student requests, available teachers
  - Write a program that schedules classes
  - Minimize conflicts
- Add a GUI too
  - Web interface
  - …
"I can’t write this program because I’m too dumb"

One better scenario
"I can’t write this program because it’s provably impossible"
Still another scenario, is this better?
"I can’t write this program but neither can all these famous people"

Summary of Problem Categories
- Some problems can be solved 'efficiently'
  - Run large versions fast on modern computers
  - What is 'efficient'? It depends
- Some cannot be solved by computer
  - Provable! We can't wait for smarter algorithms
- Some problems have no efficient solution
  - Provably exponential, 2^n, so for "small" n …
- Some have no known efficient solution, but
  - If one does they all do!

Entscheidungsproblem
- What can we program?
  - What kind of computer?
- What can't we program?
  - Can't we try harder?
- Can we write a program that will determine if any program P will halt when run on input S?
  - Input to halt: P and S
  - Output: yes/no halts
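The classic argument that no such program exists can be sketched in Python. Given any candidate `halts(P, S)` that always returns an answer, we build a program that does the opposite of what the candidate predicts about it. The names `make_paradox` and `refute` are hypothetical, and the two candidates below are deliberately trivial.

```python
def make_paradox(halts):
    """Build a program that contradicts whatever `halts` predicts for it."""
    def paradox(program):
        if halts(program, program):
            while True:      # predicted to halt, so loop forever
                pass
        return "halted"      # predicted to loop, so halt right away
    return paradox

def refute(halts):
    """Describe how `halts` is wrong about paradox run on itself."""
    paradox = make_paradox(halts)
    if halts(paradox, paradox):
        # Predicted to halt, but by construction it would loop forever.
        # We do not run it; the contradiction follows from the code above.
        return "claimed halt, actually loops"
    # Predicted to loop forever, so it is safe to run and watch it halt.
    paradox(paradox)
    return "claimed loop, actually halts"

def always_no(program, data):
    return False

def always_yes(program, data):
    return True

print(refute(always_no))
print(refute(always_yes))
```

The same construction defeats any candidate, however clever, which is why trying harder does not help.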

Some problems take forever, but …
- Can we visit all cities, no repeats, using Southwest, for less than $123,
  - RDU->MCO->…->…->…->…->DEN
  - RDU->DEN->…->…->…->…->MCO
  - Repeat and test, what's the issue here?
  - Can we find shortest path for packets on Internet? Yes!
  - Can we find longest path for silent meditation? No!
  - We don't know how, but if we did!!!
- Contrast towers of Hanoi, 2^n moves always!
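The "repeat and test" idea can be written as a brute-force search over every ordering of the cities. The fares below are made up for illustration. The sketch also counts Towers of Hanoi moves, where the exact count from the recurrence is 2^n - 1.

```python
from itertools import permutations
from math import factorial

# Hypothetical one-way fares in dollars; not real airline prices.
FARES = {
    ("RDU", "MCO"): 90, ("MCO", "RDU"): 95,
    ("RDU", "DEN"): 180, ("DEN", "RDU"): 170,
    ("MCO", "DEN"): 150, ("DEN", "MCO"): 140,
}

def route_cost(route):
    """Total fare for visiting the cities in the given order."""
    return sum(FARES[(a, b)] for a, b in zip(route, route[1:]))

def cheapest_route(start, others):
    """Try every ordering of the other cities and keep the cheapest."""
    best = None
    for order in permutations(others):
        route = (start,) + order
        cost = route_cost(route)
        if best is None or cost < best[0]:
            best = (cost, route)
    return best

def hanoi_moves(n):
    """Moves needed for n disks: move n-1, move the largest, move n-1 again."""
    if n == 0:
        return 0
    return 2 * hanoi_moves(n - 1) + 1

best_cost, best_route = cheapest_route("RDU", ["MCO", "DEN"])
print(best_cost, best_route)
print(factorial(20))    # orderings to test with 20 other cities
print(hanoi_moves(10))
```

The issue with repeat-and-test is the number of orderings: with k other cities there are k! routes, which grows faster than 2^k. Hanoi is different in kind: the 2^n - 1 moves are required by the puzzle itself, not a sign that a cleverer algorithm is missing.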

Are hard problems easy? Clay Prize
- P = easy problems, NP = “hard” problems
  - P means solvable in polynomial time
    - Difference between N, N^2, N^10?
  - NP means non-deterministic, polynomial time
    - Guess a solution and verify it efficiently
- Question: P = NP?
  - If yes, a whole class of difficult problems, the NP-complete problems, can be solved efficiently
  - If no, none of the NP-complete problems can be solved efficiently
  - Showing the first problem was NP-complete was an exercise in intellectual bootstrapping: satisfiability, Cook (1971)
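"Guess a solution and verify it efficiently" can be illustrated with satisfiability, the problem Cook used. In the sketch below, checking one proposed assignment takes time proportional to the formula size, while the only search shown tries all 2^n assignments. The formula and its encoding as lists of integers are assumptions chosen for the example.

```python
from itertools import product

# A formula in conjunctive normal form: a list of clauses.
# Literal k means variable k is true; -k means variable k is false.
FORMULA = [[1, -2], [2, 3], [-1, -3]]

def verify(formula, assignment):
    """Check a proposed assignment: every clause needs one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

def brute_force(formula, num_vars):
    """Try all 2**num_vars assignments and return the first that verifies."""
    for values in product([False, True], repeat=num_vars):
        assignment = dict(zip(range(1, num_vars + 1), values))
        if verify(formula, assignment):
            return assignment
    return None

guess_ok = verify(FORMULA, {1: True, 2: True, 3: False})
found = brute_force(FORMULA, 3)
print(guess_ok)
print(found)
```

The P = NP question asks whether the gap between `verify` and `brute_force` is real, that is, whether some polynomial-time search always exists when a polynomial-time check does.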

How is Python like all other programming languages, how is it different?

A Rose by any other name…C or Java?
- Why do we use [Python|Java] in courses?
  - [is|is not] Object oriented
  - Large collection of libraries
  - Safe for advanced programming and beginners
  - Harder to shoot ourselves in the foot
- Why don't we use C++ (or C)?
  - Standard libraries weak or non-existent (comparatively)
  - Easy to make mistakes when beginning
  - No GUIs, complicated compilation model
  - What about other languages?

Why do we learn other languages?
- Perl, Python, PHP, Ruby, C, C++, Java, Scheme, Haskell, …
  - Can we do something different in one language?
    - In theory: no; in practice: yes
  - What languages do you know? All of them.
  - In what languages are you fluent? None of them
- In later courses why do we use C or C++?
  - Closer to the machine, understand abstractions at many levels

Find all unique/different words in a file, in sorted order
Across different languages: do these languages have the same power?

Unique Words in Python

    def main():
        # read the whole file, split on whitespace, keep one copy of each word
        with open('/data/melville.txt', 'r') as f:
            words = f.read().strip().split()
        allWords = set(words)
        for word in sorted(allWords):
            print(word)

    if __name__ == "__main__":
        main()

Unique words in Java

    import java.util.*;
    import java.io.*;

    public class Unique {
        public static void main(String[] args) throws IOException {
            Scanner scan = new Scanner(new File("/data/melville.txt"));
            // a TreeSet keeps its elements unique and sorted
            TreeSet<String> set = new TreeSet<String>();
            while (scan.hasNext()) {
                String str = scan.next();
                set.add(str);
            }
            for (String s : set) {
                System.out.println(s);
            }
        }
    }

Unique words in C++

    #include <iostream>
    #include <fstream>
    #include <string>
    #include <set>
    using namespace std;

    int main(){
        ifstream input("/data/melville.txt");
        set<string> unique;   // std::set keeps elements unique and sorted
        string word;
        while (input >> word){
            unique.insert(word);
        }
        set<string>::iterator it = unique.begin();
        for(; it != unique.end(); it++){
            cout << *it << endl;
        }
        return 0;
    }

Unique words in PHP

    <?php
    $wholething = file_get_contents("file:///data/melville.txt");
    $wholething = trim($wholething);
    $array = preg_split("/\s+/",$wholething);
    $uni = array_unique($array);
    sort($uni);
    foreach ($uni as $word){
        echo $word." ";
    }
    ?>

Kernighan and Ritchie
- First C book, 1978
- First ‘hello world’
- Ritchie: Unix too!
  - Turing award 1983
- Kernighan: tools
  - Strunk and White
- "Everyone knows that debugging is twice as hard as writing a program in the first place. So if you are as clever as you can be when you write it, how will you ever debug it?" Brian Kernighan

How do we read a file in C?

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* comparison function for qsort: a and b point to char * elements */
    int strcompare(const void * a, const void * b){
        char ** stra = (char **) a;
        char ** strb = (char **) b;
        return strcmp(*stra, *strb);
    }

    int main(){
        FILE * file = fopen("/data/melville.txt","r");
        char buf[1024];
        char ** words = (char **) malloc(5000*sizeof(char *));
        int count = 0;
        int k;

Storing words read when reading in C

        while (fscanf(file,"%s",buf) != EOF){
            int found = 0;
            // look for word just read
            for(k=0; k < count; k++){
                if (strcmp(buf,words[k]) == 0){
                    found = 1;
                    break;
                }
            }
            if (!found){ // not found, add to list
                words[count] = (char *) malloc(strlen(buf)+1);
                strcpy(words[count],buf);
                count++;
            }
        }

- Complexity of reading/storing? Allocation of memory?

Sorting, Printing, Freeing in C

        qsort(words,count,sizeof(char *), strcompare);
        for(k=0; k < count; k++) {
            printf("%s\n",words[k]);
        }
        for(k=0; k < count; k++){
            free(words[k]);
        }
        free(words);
    }

- Sorting, printing, and freeing
  - Ugh!

You have (almost) finished Compsci 101
- Let's talk about next steps and finishing this semester