Nov String Algorithms, Q Algorithmic programming...and testing your suffix trees...

Slides:



Advertisements
Similar presentations
Lecture 22, Revision 1 Lecture notes Java code – CodeFromLectures folder* Example class sheets – 4 of these plus solutions Extra examples (quicksort) Lab.
Advertisements

CSE 331 SOFTWARE DESIGN & IMPLEMENTATION DEBUGGING Autumn 2011 Bug-Date: Sept 9, 1947.
Recursion vs. Iteration The original Lisp language was truly a functional language: –Everything was expressed as functions –No local variables –No iteration.
Greedy Algorithms Amihood Amir Bar-Ilan University.
Object Oriented Design An object combines data and operations on that data (object is an instance of class) data: class variables operations: methods Three.
Software Engineering and Design Principles Chapter 1.
Testing and Debugging pt.2 Intro to Complexity CS221 – 2/18/09.
Discussion #36 Spanning Trees
Debugging CPSC 315 – Programming Studio Fall 2008.
Copyright © 2008 Pearson Addison-Wesley. All rights reserved. Chapter 13 Pointers and Linked Lists.
Minimal Spanning Trees. Spanning Tree Assume you have an undirected graph G = (V,E) Spanning tree of graph G is tree T = (V,E T E, R) –Tree has same set.
27-Jun-15 Profiling code, Timing Methods. Optimization Optimization is the process of making a program as fast (or as small) as possible Here’s what the.
4-Searching1 Searching Dan Barrish-Flood. 4-Searching2 Dynamic Sets Manage a changing set S of elements. Every element x has a key, key[x]. Operations:
30-Jun-15 Profiling. Optimization Optimization is the process of making a program as fast (or as small) as possible Here’s what the experts say about.
Data Structures and Programming.  John Edgar2.
1 What NOT to do I get sooooo Frustrated! Marking the SAME wrong answer hundreds of times! I will give a list of mistakes which I particularly hate marking.
Testing. Definition From the dictionary- the means by which the presence, quality, or genuineness of anything is determined; a means of trial. For software.
Introduction to Engineering Computing GEEN 1300 Lecture 7 15 June 2010 Review for midterm.
Identifying Reversible Functions From an ROBDD Adam MacDonald.
Fundamental Structures of Computer Science Feb. 24, 2005 Ananda Guna Lempel-Ziv Compression.
AITI Lecture 20 Trees, Binary Search Trees Adapted from MIT Course 1.00 Spring 2003 Lecture 28 and Tutorial Note 10 (Teachers: Please do not erase the.
1 Principles of Computer Science I Prof. Nadeem Abdul Hamid CSC 120 – Fall 2005 Lecture Unit 10 - Testing.
Priority Queues and Binary Heaps Chapter Trees Some animals are more equal than others A queue is a FIFO data structure the first element.
ICS 220 – Data Structures and Algorithms Lecture 11 Dr. Ken Cosh.
CS 111 – Nov. 22 Chapter 7 Software engineering Systems analysis Commitment –Please read Section 7.4 (only pp ), Sections –Homework #2.
Chapter 1 Data Structures and Algorithms. Primary Goals Present commonly used data structures Present commonly used data structures Introduce the idea.
CS 206 Introduction to Computer Science II 10 / 05 / 2009 Instructor: Michael Eckmann.
Reasoning about programs March CSE 403, Winter 2011, Brun.
1 Legacy Code From Feathers, Ch 2 Steve Chenoweth, RHIT Right – Your basic Legacy, from Subaru, starting at $ 20,295, 24 city, 32 highway.
ECE450 - Software Engineering II1 ECE450 – Software Engineering II Today: Introduction to Software Architecture.
Design - programming Cmpe 450 Fall Dynamic Analysis Software quality Design carefully from the start Simple and clean Fewer errors Finding errors.
CSE 374 Programming Concepts & Tools Hal Perkins Fall 2015 Lecture 11 – gdb and Debugging.
David Streader Computer Science Victoria University of Wellington Copyright: David Streader, Victoria University of Wellington Debugging COMP T1.
Chapter 5 Linked List by Before you learn Linked List 3 rd level of Data Structures Intermediate Level of Understanding for C++ Please.
The Last Lecture CS 5010 Program Design Paradigms "Bootcamp" Lesson © Mitchell Wand, This work is licensed under a Creative Commons Attribution-NonCommercial.
1 The Software Development Process ► Systems analysis ► Systems design ► Implementation ► Testing ► Documentation ► Evaluation ► Maintenance.
BASICS OF CODE DESIGN.  Modular  Reusable  Easy to Read  Maintainable  Testable  Easy to Change  Easy to Understand THE GOAL.
Loop Invariants and Binary Search Chapter 4.4, 5.1.
Efficiently Solving Computer Programming Problems Doncho Minkov Telerik Corporation Technical Trainer.
1 ENERGY 211 / CME 211 Lecture 14 October 22, 2008.
COMP 103 Course Review. 2 Menu  A final word on hash collisions in Open Addressing / Probing  Course Summary  What we have covered  What you should.
Advanced Data Structures Lecture 8 Mingmin Xie. Agenda Overview Trie Suffix Tree Suffix Array, LCP Construction Applications.
Nov String algorithms, Q Ukkonen’s suffix tree algorithm ● Recall McCreight’s approach: – For i = 1.. n+1, build compressed trie of {x[j..n]$
CSE373: Data Structures & Algorithms Priority Queues
Support for Program Analysis as a First-Class Design Constraint in Legion Michael Bauer 02/22/17.
McCreight's suffix tree construction algorithm
Binary Trees Linked lists: efficient insertion/deletion, inefficient search ArrayList: search can be efficient, insertion/deletion not Binary trees: efficient.
Andrzej Ehrenfeucht, University of Colorado, Boulder
March 29 – Testing and Priority QUeues
CSE 374 Programming Concepts & Tools
Binary Trees Linked lists: efficient insertion/deletion, inefficient search ArrayList: search can be efficient, insertion/deletion not Binary trees: efficient.
COSC160: Data Structures Linked Lists
Ukkonen's suffix tree construction algorithm
Minimal Spanning Trees
Objective: Be able to add and subtract directed numbers.
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
CSCE 315 – Programming Studio, Fall 2017 Tanzir Ahmed
CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions
CSE373: Data Structures & Algorithms Lecture 5: AVL Trees
Debugging at Scale.
DEBUGGING CS2110.
CSE 303 Concepts and Tools for Software Development
Debugging “Why you were up till 2AM”
CSE 373 Data Structures and Algorithms
CSE373: Data Structures & Algorithms Implementing Union-Find
CSC 143 Binary Search Trees.
Objective: Be able to add and subtract directed numbers.
CSE 373: Data Structures and Algorithms
Linked lists Low-level (concrete) data structure, used to implement higher-level structures Used to implement sequences/lists (see CList in Tapestry) Basis.
Lecture 20 – Practice Exercises 4
Presentation transcript:

Nov String Algorithms, Q Algorithmic programming...and testing your suffix trees...

Nov String Algorithms, Q KISS ● Keep It Simple, Stupid! – You know the algorithm – You know what you need – Implement that! – Don’t get inventive! ● Choose simple solutions first – Simple (but perhaps slow) solutions in 1 st iteration – Then improve them – Premature optimization is the root of all evil!

Nov String Algorithms, Q Example... ● Implement a simple compressed trie – Insert strings one at a time (using slowscan) – Test correctness ● Use it to build a suffix tree – Insert suffixes one at a time – Test correctness ● Use McCreight’s algorithm – Use fastscan and test correctness – Add suffix links and test correctness

Nov String Algorithms, Q Go easy on the OO modeling ● This is not an OO course – Don’t do OO modeling – Attributes determined by algorithm – If you need a pointer, use a bloody pointer! ● e.g. a suffix link or an edge is not an “object” with identity, attributes and methods ● OO (as you know it) is for modeling and architecture – Use it for data structure interfaces – Ignore it for the implementation – (I am exaggerating here...)

Nov String Algorithms, Q Defensive programming ● Annotate your code with assertions – Be able to turn them on/off for efficiency in production code – But test everything you reasonably can ● Pre- and post-conditions ● Invariants – Remember to test cases that “can’t happen”! – Do this as you write the code, not afterwards!

Nov String Algorithms, Q Aggressive testing ● Test early and test often – Testing suite you can run after each change ● Automate your testing! – Unit testing of all structures and routines – Regression testing – Test common and boundary cases – “Random” crash testing ● Never ever delete a (non-redundant) test! – Let the testing suite grow – Don’t delete a test after you have used it

Nov String Algorithms, Q Make debugging easy ● Add code for inspection – Be able to print algorithmic state – Be able to print data structures – Make sure you can turn this on and off (but don’t ever delete inspection code!)

Nov String Algorithms, Q Example > ~mailund/suffix_tree-2.0/print-structure.py abaab The tree structure: `- a | `- b | | `- aab$ : abaab$ | | `- $ : ab$ | | | `- ab$ : aab$ | `- b | `- aab$ : baab$ | `- $ : b$ | `- $ : $ ●...or consider e.g. Graphviz for visualization

Nov String Algorithms, Q Make debugging easy ● Use your test suite for debugging – First you build a test to trigger a bug – Then you fix it –...recurse as necessary

Nov String Algorithms, Q Test by profiling ● Annotate the code for timing – Plot time used for various routines – Compare with expected running times – Be able to turn this profiling code on/off ● For McCreight’s algorithm: – Linear time for slowscan – Linear time for fastscan – Linear time for total construction

Nov String Algorithms, Q Optimize using profiling ● Never ever optimize before profiling! ● Use profiler tools to find hotspots ● Focus on the hotspots ● Only optimize if you expect to gain from it! ● You won’t gain from the micro-optimizations!