
1 HANGMAN OPTIMIZATION Kyle Anderson, Sean Barton and Brandyn Deffinbaugh

2 ABSTRACT Our group attempted to find an optimal strategy for solving a game of hangman by comparing three algorithms: a unigram model, a bigram model, and an exhaustive search. The game was played randomly using an imported dictionary of 8988 common English words, which we modified to remove words containing symbols, such as contractions.

3 RULES OF THE GAME
1) A random word is chosen and presented to the player as a string of underscores. Ex. The chosen word is CAT; the presented string is "_ _ _".
2) A correct guess fills in the blank(s) of the target string at the location or locations of the correctly guessed letter. Ex. A is guessed; the player is presented with "_ A _".
3) An incorrect guess does not fill in any of the target string.
4) There is no limit on the number of guesses.
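The reveal step in rule 2 can be sketched as follows (a minimal illustration, not the authors' code; `play_round` is a hypothetical helper name):

```python
def play_round(word, guess, revealed):
    """Fill in every blank whose position in `word` matches the guessed letter."""
    return [guess if w == guess else r for w, r in zip(word, revealed)]

word = "CAT"
revealed = ["_"] * len(word)               # presented as "_ _ _"
revealed = play_round(word, "A", revealed)
print(" ".join(revealed))                  # -> "_ A _"
```

An incorrect guess simply leaves the revealed list unchanged, matching rule 3.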

4 FORMAL PROBLEM STATEMENT Given a string of length N randomly chosen from an internal dictionary, find the letters of the string using n-gram models to estimate letter frequencies, guessing the letter or letter combination with the highest occurrence. If a character or combination of characters matches a portion of the target string, the matching positions in the target string are revealed to the player. The problem is considered solved when the entire length-N string is revealed.

5 ALGORITHMS USED
1) A unigram algorithm that guesses using single-letter occurrences in the supplied dictionary. The first guess is always the most commonly occurring letter. Ex. E, B, C, etc.
2) A bigram algorithm that also guesses using letter occurrences in the supplied dictionary, but over letter pairs instead of single letters. Ex. AA, AB, AC, etc.
3) An exhaustive search algorithm that simply guesses the letters A through Z in order and does not reference the dictionary or letter frequencies at all.
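The unigram strategy can be sketched like this (a hedged sketch under our reading of the slides, with alphabetical tie-breaking added for determinism; the function names are hypothetical, not the authors' code):

```python
from collections import Counter

def unigram_order(dictionary):
    """Rank letters by how many dictionary words contain them
    (repeated letters within a word are counted once)."""
    counts = Counter()
    for word in dictionary:
        counts.update(set(word))
    # Highest count first; ties broken alphabetically for determinism.
    return sorted(counts, key=lambda c: (-counts[c], c))

def unigram_guesses(word, dictionary):
    """Number of guesses the unigram strategy needs to reveal `word`."""
    remaining = set(word)
    for guesses, letter in enumerate(unigram_order(dictionary), start=1):
        remaining.discard(letter)
        if not remaining:
            return guesses
    return None  # word contains a letter absent from the dictionary
```

With a toy bank like ["cat", "car", "can"], the guess order begins a, c (each appears in all three words), so revealing "can" takes three guesses.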

6 EXPERIMENTAL PROCEDURE The dictionary used in the experiment was 8988 words long and was formatted to remove words containing symbols, such as contractions. The letter frequencies each algorithm used were generated at run time from the entire dictionary. Repeated letters within a word are counted only once toward the letter frequency. The exhaustive search algorithm did not reference the dictionary at all.
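The dictionary preparation described above can be sketched as follows (a minimal sketch; `clean_dictionary` is a hypothetical helper, and the actual preprocessing may have differed):

```python
def clean_dictionary(words):
    """Keep only purely alphabetic entries, dropping contractions and
    any word containing symbols; normalize to lower case."""
    return [w.lower() for w in (x.strip() for x in words) if w.isalpha()]
```

For example, an entry like "don't" or "x-ray" is dropped, while ordinary words pass through unchanged.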

7 DATA (run times in seconds)
                     Unigram      Unigram      Bigram       Bigram       Exhaustive   Exhaustive
                     run time     guesses      run time     guesses      run time     guesses
Average              0.020428315  16.62712807  0.016454012  130.3533994  0.022994693  20.75943029
Standard deviation   0.009308823  3.782054959  0.018931606  67.66271315  0.008596487  3.092395801
Maximum              0.42000103   26           1.400002003  477          0.100000143  26
Minimum              0.002000093  3            0.000999928  2            0.009999994

8 PROBLEMS WITH OUR WORK Our time measurements were not accurate for smaller words. The time complexity of the exhaustive search could be better optimized.

9 RESULTS SUMMARY

10

11 RESULTS SUMMARY CONTINUED

12 CONCLUSIONS The unigram performed best in terms of average guesses, needing 16.6 guesses on average, and second in run time, taking 0.020 seconds on average to complete. The bigram performed worst in average guesses, needing 130.35 guesses on average to complete the word, but best in run time, averaging 0.016 seconds. The exhaustive search was second in average guesses, taking 20.76 guesses, and worst in run time, averaging 0.023 seconds.

13 FUTURE WORK The unigram and bigram algorithms could update their letter frequencies based on the length of the given word. They could also remove from the word bank any words that do not match the letters already revealed at guessed positions, then recompute the unigram and bigram frequencies from the reduced bank. Implementing the algorithms in a different programming language may yield more accurate run times.
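The word-bank pruning idea above can be sketched as follows (a hypothetical sketch, not the authors' code; note that because a correct guess reveals every occurrence of a letter, surviving words must also avoid the revealed letters in still-blank positions):

```python
def filter_bank(bank, pattern, wrong):
    """Keep words consistent with the revealed pattern ('_' = unknown)
    and containing none of the incorrectly guessed letters."""
    n = len(pattern)
    fixed = {i: c for i, c in enumerate(pattern) if c != "_"}
    seen = set(fixed.values())
    keep = []
    for w in bank:
        # Wrong length or contains a letter already ruled out: discard.
        if len(w) != n or any(c in w for c in wrong):
            continue
        # Revealed positions must match, and revealed letters cannot
        # reappear at positions that are still blank.
        if all(w[i] == c for i, c in fixed.items()) and \
           all(w[i] not in seen for i in range(n) if i not in fixed):
            keep.append(w)
    return keep
```

For example, with pattern "_a_" and the wrong guess N, the bank ["cat", "car", "can", "baa"] reduces to ["cat", "car"]: "can" contains the ruled-out N, and "baa" has an A in a still-blank position. The unigram and bigram counts would then be recomputed over the reduced bank before the next guess.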

14 DISCUSSION QUESTIONS
1) What was the most effective algorithm with respect to average guesses?
A) The unigram.
2) What is a unigram? What is a bigram?
A) A unigram is the probability of a single token appearing in a set, using that set or another set to estimate the initial probability. A bigram is the probability of each combination of two tokens appearing in a given set, estimated the same way.
3) What is the big-O of the unigram algorithm? The bigram? The exhaustive search?
A) Unigram: O(M*N^2). Bigram: O(M*N^2). Exhaustive search: O(M^2*N^2).


