Design and Implementation* Objective: To design and implement a program for a relatively small yet reasonably complicated problem. To introduce and review.


1 Design and Implementation* Objective: To design and implement a program for a relatively small yet reasonably complicated problem. To introduce and review a variety of implementation languages and to have students review the pros and cons of different implementation choices and languages. “Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious” – Frederick P. Brooks, The Mythical Man Month *The examples in these slides come from Brian W. Kernighan and Rob Pike, The Practice of Programming, Addison-Wesley, 1999.

2 Themes “Once the data structures are laid out, the algorithms tend to fall into place, and the coding is comparatively easy” The choice of programming language is relatively unimportant to the overall design. Comparing implementations demonstrates how languages can help or hinder, and ways in which they are unimportant.

3 Topics Problem: Generate random English text that reads well. Program: some data comes in, some data goes out, and the processing depends on a little ingenuity. Implementations: C, C++, Java, Perl

4 Failed Attempts Generate random letters (even with proper frequency). Generate random words chosen from a dictionary Need a statistical model with more structure (frequency of phrases)

5 The Markov Algorithm - Learn "Learn" from input: Look at all n-word phrases (prefixes). Keep a list of the words that follow (one-word suffixes) for each prefix. Store each prefix/suffix-list pair in a dictionary. The key (entry) is the prefix. The satellite data, associated w/each prefix, is the list of suffixes. Note, we're "faking" a multi-map: each prefix has, potentially, several possible suffixes.

6 Sample The following example uses a subset of Dr. Brooks' quote Uses a prefix length of 2 words We won't bother w/stripping out punctuation We won't worry about capitalisation (so, "We" and "we" are different strings) We will use a special (Null, Null) prefix to indicate the start.

7 Sample Markov Algorithm (partial list)
Input prefix            Suffix words that follow
(null) (null)           show
(null) show             me
show me                 your
me your                 flowcharts tables
your flowcharts         and
flowcharts and          conceal
your tables             and
will be                 mystified. obvious.
be obvious              (null)

8 Sample (notes) We don't filter out duplicates E.g., "and" appears twice for the prefix ("your", "tables") This is fine. "and" is, from the input, a nice word to follow. It should be a more probable choice We also use the suffix (null) to say "Here's a decent place to end our story."

9 Markov Algorithm – generate text
set w1 and w2 to the first two words in the text
print w1 and w2
loop:
    randomly choose w3, one of the successors of w1 and w2 (in sample text)
    print w3
    replace w1 and w2 by w2 and w3

10 Sample Say the current prefix is (me, your) 1. Go to our dictionary, find the entry 2. Choose a random word from the suffix list (say, "tables") 3. Write that word out 4. Make a new prefix: ("your", "tables") Repeat, until we choose the suffix (null), or decide we've output enough words
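The steps above can be sketched in Python. This is a minimal illustration, not one of the linked implementations: the table is written by hand in the prefix-to-suffix-list shape the slides describe, `NONWORD` is an assumed name for the (null) marker, and generation starts from the special (null, null) prefix rather than from the first two words of the text.

```python
import random

NONWORD = None  # stands in for the (null) marker; the name is an assumption

# Hand-written table: prefix tuple -> list of suffixes (illustrative data only).
table = {
    (NONWORD, NONWORD): ["show"],
    (NONWORD, "show"): ["me"],
    ("show", "me"): ["your"],
    ("me", "your"): ["flowcharts", "tables"],
    ("your", "flowcharts"): [NONWORD],
    ("your", "tables"): [NONWORD],
}

def generate(table, max_words, rng=random):
    """Walk the table from the start prefix, collecting up to max_words words."""
    prefix = (NONWORD, NONWORD)
    out = []
    for _ in range(max_words):
        suffix = rng.choice(table[prefix])   # steps 1 and 2: find entry, pick a suffix
        if suffix is NONWORD:                # chose (null): a decent place to stop
            break
        out.append(suffix)                   # step 3: write the word out
        prefix = prefix[1:] + (suffix,)      # step 4: make the new prefix
    return out

words = generate(table, 100, random.Random(0))
print(" ".join(words))
```

With this tiny table the output always begins "show me your" and ends with either "flowcharts" or "tables", since (me, your) is the only prefix with more than one suffix.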

11 Implementation See lecture outline for links to the different implementations: C (see Makefile), C++, Java, Perl, Python. What are the pros and cons of the different implementations?

12 The Data Structures Python and Perl have everything we need built in Java and C++ provide appropriate containers in their standard libraries In C we'll need to roll these things ourselves

13 The Data Structures - Python The dictionary (dict) is given to us The prefixes, the keys in the dictionary, will be 2-element tuples (immutable) The satellite data will be stored in a list (an array) If a prefix doesn't already exist in the table, we insert it, along w/an empty list, [] We just append the new suffix onto the end of this list
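A minimal sketch of the table-building step as this slide describes it: a dict keyed by 2-element tuples, with a list of suffixes as the value. The names `build`, `NPREF` (borrowed from the C slides) and `NONWORD` are assumptions for illustration, and the input sentence is made up so that one prefix collects a duplicate suffix.

```python
NPREF = 2        # number of words in a prefix
NONWORD = None   # stands in for the (null) marker; the name is an assumption

def build(words, npref=NPREF):
    """Map each npref-word prefix (a tuple) to the list of words that follow it."""
    table = {}
    prefix = (NONWORD,) * npref                 # special start-of-text prefix
    for word in list(words) + [NONWORD]:        # trailing NONWORD marks a place to end
        # Insert the prefix with an empty list if it is new, then append the suffix.
        table.setdefault(prefix, []).append(word)
        prefix = prefix[1:] + (word,)           # slide the prefix along by one word
    return table

text = ("show me your flowcharts and conceal your tables and "
        "show me your tables and done")
table = build(text.split())
print(table[("me", "your")])      # the two words seen after "me your"
print(table[("your", "tables")])  # duplicates are kept on purpose
```

Because duplicates stay in the list, a plain uniform choice from the list already favours the words that followed the prefix more often in the input.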

14 Hash Tables We need a dictionary to map a prefix to its list of suffixes (given a prefix, what are the possible suffixes?) In C we'll roll our own hash table. The prefix (key) is hashed, returning a value in [0, N-1], where N is the table size. Each element in the table (array) is a bucket, a linked list of prefixes. Each prefix is associated w/its own list of suffixes.
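To make the bucket idea concrete, here is a rough Python model of a chained hash table. It is only a sketch: Python lists stand in for the C array and linked lists, and the table size `NHASH`, the `MULTIPLIER`, and the function names are assumed values chosen for illustration, not taken from the slides.

```python
NHASH = 4093      # table size N (assumed value)
MULTIPLIER = 31   # multiplier for the string hash (assumed value)

def hash_prefix(prefix, nhash=NHASH):
    """Hash all words of the prefix into a bucket index in [0, nhash-1]."""
    h = 0
    for word in prefix:
        for byte in word.encode("utf-8"):
            h = (MULTIPLIER * h + byte) % 2**32   # mimic 32-bit unsigned wraparound
    return h % nhash

# One bucket per table slot; each bucket is a chain of [prefix, suffix_list] entries.
statetab = [[] for _ in range(NHASH)]

def lookup(prefix, create=False):
    """Return the entry for prefix, optionally creating it; None if absent."""
    bucket = statetab[hash_prefix(prefix)]
    for state in bucket:                 # walk the chain, comparing keys
        if state[0] == prefix:
            return state
    if not create:
        return None
    state = [prefix, []]
    bucket.append(state)                 # a collision just lengthens the chain
    return state

def add_suffix(prefix, word):
    lookup(prefix, create=True)[1].append(word)

add_suffix(("me", "your"), "flowcharts")
add_suffix(("me", "your"), "tables")
print(lookup(("me", "your"))[1])
```

Two different prefixes that hash to the same index simply share a bucket; the key comparison in `lookup` tells them apart.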

15 The prefix in C Prefix – stored in an array of strings (ptrs to char ): char* pref[ NPREF ]; NPREF – the # of words in a prefix, a global constant Note, each string exists once in memory (after call to strdup ); various objects just store pointers to these strings.

16 Hash table entries in C - State Use chains (linked lists of entries) to handle collisions. Each entry (or State ) is a node in a linked list. It contains: The key (the prefix), pref[0] and pref[1]. The satellite data ( suf, a pointer to a list of Suffix structs). A pointer to the next State ( next ).

17 Satellite Data – C More than one suffix may be associated w/a given entry ( State ). Stored in a linked list. List made of structs, of type Suffix : char* word, the suffix (word) itself; next, a pointer to another Suffix.

18 The hash table in C statetab – the table itself. Use chains (linked lists of States) to handle collisions. statetab is an array of pointers to States (each pointing to the beginning of a linked list).

19 The hash table – C [Diagram: a State with pref[0] = "me", pref[1] = "your", suf and next; next leads to another State (same hash key); suf leads to a Suffix ( word = "flowcharts", next ), whose next leads to another Suffix ( word = "tables", next ).]

20 C - eprintf These are simply some user-defined functions (mostly wrappers) to help w/the coding: void eprintf( char *,... ); prints an error msg to stderr, then exits void weprintf( char *,... ); prints a warning to stderr ; doesn't exit char* estrdup( char *s ); calls strdup( s ) to dynamically allocate memory and copy s; exits if malloc fails

21 C – eprintf (cont.) void* emalloc(size_t n); Calls malloc(n). Exits upon failure void* erealloc(void *p, size_t n); Calls realloc( p, n ). Exits upon failure void setprogname(char *); If set, uses program name in messages char* progname(void);

22 memmove Moves (low-level) a block of memory. memmove( d, s, n ) moves n bytes, starting at s (source), to d (destination); the two regions may overlap. Given prefix (w_1, …, w_(n-1)), with suffix w_n : memmove(prefix, prefix+1, (NPREF-1)*sizeof(prefix[0]) ); prefix[NPREF-1] = suffix; prefix is now (w_2, …, w_n). Just slides everybody down 1, to make/get the next prefix.

23 Choosing a suffix Consider this line, in generate(): if ( rand() % ++nmatch == 0 ) We walk the list of suffixes. For the first one, we mod by 1 (100% chance of choosing this one). At the next one, we mod by 2 (50% chance of replacing the choice with this one). At the 3rd, 1 in 3. At the 4th, 1 in 4 … Convince yourself that each suffix has equal probability of being chosen.
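One way to convince yourself: the k-th of n suffixes ends up chosen if it is picked at step k (probability 1/k) and never replaced afterwards (probability k/(k+1) at step k+1, and so on up to (n-1)/n at step n). The product telescopes to 1/n. The Python sketch below mirrors the C idiom and checks the claim empirically; the function name `choose` and the test data are made up for illustration.

```python
import random
from collections import Counter

def choose(suffixes, rng=random):
    """Pick one item uniformly in a single pass, without knowing the length up front."""
    chosen = None
    nmatch = 0
    for word in suffixes:
        nmatch += 1
        if rng.randrange(nmatch) == 0:   # true with probability 1/nmatch
            chosen = word                # replace the current choice
    return chosen

rng = random.Random(1)
counts = Counter(choose(["a", "b", "c", "d"], rng) for _ in range(40000))
print(counts)   # each of the four items should land near 10000
```

This single-pass trick suits a linked list, where the number of suffixes is not known until the list has been walked.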

