PhD in Databases

introduction
why a PhD in databases?
–everyone should have a reason!
which topic should I choose?
–usually the supervisor chooses it for you, when you’re young
–when you grow up, you get to choose the things you get involved with
what are the requirements? how long does it take?
–typically, at NTUA it takes at least 3 years, and a refereed international journal publication is required
–in reality, the requirements are stricter and are set by your supervisor
what is a good PhD?
–it should cover an interesting research area in detail and offer novel technical insight
–as a result, a PhD student should have a good publication record

example publication venues
conferences
–Very Large Data Bases (VLDB)
journals
–ACM Transactions on Database Systems (TODS)
workshops
–International Workshop on the Web and Databases (WebDB)
national events
–Hellenic Data Management Symposium (HDMS)
other?
–symposia

conferences
a conference attracts the most recent, state-of-the-art work on a broad range of topics
–(usually) an annual event, held at a different location around the world each time
–highly competitive, great visibility/impact
submission process
–first, an abstract is submitted
–a week later, the full version is due (12 double-column pages, strict formatting)
–around 2 months later, the decisions are announced
review process
–three referees summarize the contributions and give marks on relevance, novelty, presentation, impact, technical depth, and an overall recommendation
–scores: reject, weak reject, neutral, weak accept, accept, strong accept
–discussion phase among the referees before the final decision

conferences
types of papers
full paper
–requires a presentation: 25 mins + 5 mins for questions
short paper / poster paper
–a reduced printed version of the paper, e.g., 6 pages
–shorter presentation (20 mins) or no presentation
demo paper
–describes a system implementation
–a demonstration and a poster are required in a special session
all papers appear in the proceedings of the conference
–published by major publishers (ACM, IEEE Press, Springer)

conferences
some terms
Call for Papers (CFP): contains all the information and defines the topics
Important Dates: specify the submission deadlines and the event dates
Organizing Committee: people responsible for the local arrangements
Program Committee (PC): people who carry out the review process
Organizing / PC Chair: person in charge of the respective committee
Acceptance Rate: the ratio of accepted to submitted papers
–not always representative of the quality of the venue
Tracks/Areas: separation by target group
–research vs. industrial track
–separate committees
Sessions: accepted papers are thematically organized into parallel sessions

conference ranking
a very old (partially subjective) list (from NUS, 1999)
AREA: Data Bases
Rank 1:
–SIGMOD: ACM SIGMOD Conf on Management of Data
–PODS: ACM SIGMOD Conf on Principles of DB Systems
–VLDB: Very Large Data Bases
–ICDE: Intl Conf on Data Engineering
–ICDT: Intl Conf on Database Theory
Rank 2:
–SSD: Intl Symp on Large Spatial Databases
–DEXA: Database and Expert System Applications
–FODO: Intl Conf on Foundations of Data Organization
–EDBT: Extending DB Technology
–DOOD: Deductive and Object-Oriented Databases
–DASFAA: Database Systems for Advanced Applications
–CIKM: Intl Conf on Information and Knowledge Management
–SSDBM: Intl Conf on Scientific and Statistical DB Mgmt
–CoopIS: Conference on Cooperative Information Systems
–ER: Intl Conf on Conceptual Modeling
Rank 3: …

workshops
like conferences, but with a more focused subject
–they are organized in parallel with a conference
–some have unofficial proceedings, others have fully published proceedings
how to recognize good workshops?
–long-running, well-established
–the good ones are always held in collaboration with good conferences
what should go in a workshop?
–dblab diploma theses are excellent candidates
somewhat limited impact
–however, some very good papers have appeared in workshops

journals
how do they differ from conferences?
–no call for papers (except for special issues); submit at any time
–longer, more detailed reviewing phase that can take a year, with revisions and answer letters
–longer page limits
–often high acceptance rates
–only a few good journals, but a lot of bad ones
which paper should you send there?
–extended versions of conference papers, which need only about 30% new material
–papers not previously accepted at a conference are also good candidates
why bother with journals?
–in most scientific disciplines other than CS, journals are more important/prestigious than conferences
–people not familiar with CS and DB usually judge a CV by the number of journal papers

journal ranking
a rather subjective list
–ACM Transactions on Database Systems (TODS)
–VLDB Journal (VLDBJ)
–IEEE Transactions on Knowledge and Data Engineering (TKDE)
–Elsevier Information Systems
–Elsevier Data and Knowledge Engineering (DKE)
–…

reading papers
is it important?
–yes! sometimes it is more important than reading textbooks
what do I read?
–read the recent bibliography on your topic
–find a good survey paper
–read good papers, even if they are not directly related to your topic
where do I find them?
–look for them in the most recent and relevant good conferences
–ask your supervisor!
–search the web
–get them from DBLP, Google (Scholar), etc.

writing papers
what goes in a paper?
–research paper: presents an interesting solution to an interesting problem
–survey paper: reviews all the relevant literature on an interesting topic
structure of a paper
–abstract, (1) introduction, (2) related work, (3) problem statement, (4) solution, (5) experiments, (6) conclusion, references
a lot of advice is available on the web
–http://people.csail.mit.edu/mernst/advice/write-technical-paper.html
–http://…ines_byPV.pdf

submitting papers
choose the right venue carefully
–based on the topic
–based on the quality of your work: be objective, compare your work with that of others
–your supervisor knows better
how to handle rejection
–don’t despair!
–reviews can be noisy/unfair; resubmit…
–if you consistently get bad reviews, you should lower your expectations
–consider journals, where you can answer the reviewers’ comments
how to handle acceptance
–don’t cheer!
–reviews can be noisy/unfair; you may simply have been lucky
–consider creating an extended version and submitting it to a journal

web sources
DBLP (computer science archive)
DBWorld (db announcements mailing list)
citeseer (digital library)
Google Scholar (search engine for papers)