Published by Francis Austin Hopkins. Modified over 9 years ago.
GOOGLE N-GRAMS ON AMAZON WEB SERVICES, PART 2
Thomas Tiahrt, MA, PhD
Computer Science 482 – Introduction to Text Analytics
Google Books N-Grams
n-gram viewer: http://books.google.com/ngrams/info
n-gram datasets: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html
File Format for Google's N-Grams
Data is compressed
Fields are separated by tabs ('\t')
One record per line; a newline character ('\n') ends each record
The n-gram is a 1-gram, 2-gram, 3-gram, 4-gram, or 5-gram
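To make the file format concrete, here is a minimal Python sketch that reads compressed, tab-separated records line by line. The slide says only that the data is compressed; gzip is an assumption here, and the names iter_records and read_compressed are hypothetical helpers introduced for illustration.

```python
import gzip
import io


def iter_records(stream):
    """Yield each non-empty line of a text stream as a list of tab-separated fields."""
    for line in stream:
        line = line.rstrip("\n")  # the newline character ends the record
        if line:
            yield line.split("\t")  # fields are separated by tabs


def read_compressed(source):
    """Read records from a gzip-compressed file (assumption: gzip compression).

    `source` may be a file path or a binary file-like object.
    """
    with gzip.open(source, "rt", encoding="utf-8") as f:
        yield from iter_records(f)
```

Reading the file as a stream, one record at a time, avoids loading an entire n-gram file into memory, which matters because the compressed files can be large.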
Version 2
Data created July 2012
Version 2 file format: N-gram \t year \t match_count \t volume_count \n
N-gram: the 1-gram, 2-gram, 3-gram, 4-gram, or 5-gram
year: publication year
match_count: number of occurrences of the n-gram in that year
volume_count: number of books in which the n-gram occurred in that year
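The Version 2 record layout can be sketched in Python as a small typed record plus a parser. This is an illustration only: NgramRecord and parse_record are hypothetical names, and the sample line uses invented values rather than real dataset counts.

```python
from typing import NamedTuple


class NgramRecord(NamedTuple):
    ngram: str         # the 1-gram through 5-gram text
    year: int          # publication year
    match_count: int   # occurrences of the n-gram in that year
    volume_count: int  # books containing the n-gram in that year


def parse_record(line: str) -> NgramRecord:
    """Parse one Version 2 line: ngram TAB year TAB match_count TAB volume_count."""
    ngram, year, match_count, volume_count = line.rstrip("\n").split("\t")
    return NgramRecord(ngram, int(year), int(match_count), int(volume_count))


# Invented sample values, for illustration only.
sample = "text analytics\t2005\t42\t17\n"
record = parse_record(sample)
```

Splitting on tabs rather than on whitespace keeps a multi-word n-gram such as a 2-gram together in the first field, since the words inside an n-gram are separated by spaces.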
End of Part Two
This is the end of part two. Please proceed to part three.