Technologies of Google Seminar
Week 1
Old Dominion University, Department of Computer Science
CS 791/891, Spring 2007
Michael L. Nelson
1/10/07
Purpose of This Class
- We will examine the technologies that Google has created or adopted in the process of becoming the company it is today.
- All papers (with few exceptions) were (co)authored by present (or future) Google employees and collaborators:
  - this covers only public material about Google’s algorithms, applications, and infrastructure
  - some of the papers were written ~10 years ago; keep the proper historical context in mind when reading
Class Format
- In groups of 2, each group will:
  - read 2 assigned papers
  - prepare a 30-40 minute presentation on each paper
  - lead a class discussion on the material presented
- Presenters will prepare at least 3 questions for the class in advance.
Warning: Tough Papers Ahead!
- These papers can be tough reading. They are research papers that assume detailed knowledge about their specific application areas.
- You have all semester to read two papers and prepare your presentations!
- You have the help of a teammate.
- You have the help of the rest of the class:
  - you will likely have to understand the papers presented by other students to understand your assigned papers
  - feel free to ask your fellow students questions; collaboration is encouraged
This is a Seminar Class!
- You will have to self-teach the background material for these papers:
  - I will assist to a limited extent, but seminars are about self-learning
- To learn your paper, you will likely need to read other papers or web resources:
  - start early!
- Each group will give a dry run of its presentation approximately 1 week before the presentation is due:
  - normally, this is the Friday before the presentation is to be made, but check with the instructor
Grading
- Paper presentations + class participation = 100% of your grade.
- If you have only a superficial understanding of the material, your grade will suffer:
  - at the instructor’s discretion, group members can get different grades if their contributions are not equal
- Attendance is required!
  - if you’re not here, you’re not participating
  - I will occasionally take roll; if you’re not here, you will lose 1 letter grade for each absence unless prior arrangements are made
Plagiarism is an Honor Code Violation
- What is plagiarism? It is OK to use material from other resources found on the web, but:
  - you must cite any and all work that is not your own
  - unattributed slides, graphs, sentences, etc. are plagiarism!
  - at least 75% of your material needs to be your own
    - e.g., if you have 30 slides, then the content of about 22 of those slides must be in your own words and/or figures
    - the remaining 8 slides must cite where the material comes from
- If you’re not sure, ask the instructor.
Administrivia
- Important URLs
  - Class homepage:
  - Email list:
- Class homepage:
  - readings are posted; start now
  - check when your presentation is scheduled
- Email list:
  - sign yourself up
  - use the email list to ask questions of the class
What We Will Learn This Semester
- A deep understanding of various technologies employed by Google, including:
  - their early contributions to crawling and ranking
  - information retrieval problems at web scale
  - the specialized hardware and software infrastructure created specifically by/for Google
- How to read, understand, and present research papers:
  - deep understanding of concepts requires one to go beyond a Wikipedia-level understanding
  - you don’t truly understand something until you know it well enough to explain it to others
- How to read about a new topic and self-teach the necessary background knowledge:
  - in real life, the problem you face today does not always build on the skills you learned yesterday