CS505: Final Exam Review
Jinze Liu
Major Topics
Before Mid-Term
  – Security and Access Control
  – Indexing
After Mid-Term
  – Transaction Management: Locking, Concurrency Control, Logging and Recovery
  – Query Processing & Optimization
  – Introduction to Information Retrieval
Concurrency Control
Basic Problem: Given two sets of transactions, determine
  – whether they are conflict serializable
  – which schedule can maximize their concurrency.
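The conflict-serializability question above can be checked with a precedence graph: add an edge Ti -> Tj whenever an action of Ti conflicts with a later action of Tj, then test for cycles. The following is a minimal sketch, assuming a schedule is given as a list of `(txn, op, item)` tuples; the function name and the tuple encoding are illustrative choices, not part of the course material.

```python
from itertools import combinations

def conflict_serializable(schedule):
    """schedule: list of (txn, op, item) tuples, op is 'r' or 'w' (assumed encoding)."""
    edges = set()
    # combinations() keeps schedule order, so the first action precedes the second.
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        # Two actions conflict if they come from different transactions,
        # touch the same item, and at least one of them is a write.
        if t1 != t2 and x1 == x2 and 'w' in (op1, op2):
            edges.add((t1, t2))
    # Cycle test: repeatedly remove transactions with no incoming edge.
    nodes = {t for t, _, _ in schedule}
    while nodes:
        sources = {n for n in nodes
                   if not any(v == n and u in nodes for u, v in edges)}
        if not sources:
            return False  # every remaining node has a predecessor: cycle
        nodes -= sources
    return True

# r1(A) w2(A) w1(A) gives edges T1->T2 and T2->T1, which form a cycle.
bad = [('T1', 'r', 'A'), ('T2', 'w', 'A'), ('T1', 'w', 'A')]
serial = [('T1', 'r', 'A'), ('T1', 'w', 'A'), ('T2', 'r', 'A'), ('T2', 'w', 'A')]
print(conflict_serializable(bad), conflict_serializable(serial))
```

An acyclic graph means some topological order of the transactions is an equivalent serial schedule.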
Locking
How does the system enforce concurrency control?
Locking
  – 1. Basic read and write locks
  – 2. Incremental locks
  – 3. Tree-based locks
Validation-based concurrency control
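As a rough illustration of how a lock manager might decide whether to grant a request, the sketch below models three modes: shared (`S`, read), exclusive (`X`, write), and increment (`I`). It assumes the usual textbook compatibility rule that only S/S and I/I pairs can coexist on one item. The `LockTable` class and its method names are hypothetical, and lock upgrades, queues, and deadlock handling are left out.

```python
from collections import defaultdict

def compatible(held, requested):
    # Assumed rule: shared locks coexist with shared locks, increment
    # locks with increment locks; every other pairing conflicts.
    return held == requested and held in ('S', 'I')

class LockTable:
    """Hypothetical in-memory lock table: item -> {txn: mode}."""

    def __init__(self):
        self.held = defaultdict(dict)

    def request(self, txn, item, mode):
        # Only locks held by *other* transactions can block the request.
        others = [m for t, m in self.held[item].items() if t != txn]
        if all(compatible(m, mode) for m in others):
            # Upgrades are not modeled: a repeated request overwrites the mode.
            self.held[item][txn] = mode
            return True
        return False  # a real system would make the caller wait

    def release_all(self, txn):
        for holders in self.held.values():
            holders.pop(txn, None)

locks = LockTable()
print(locks.request('T1', 'A', 'S'))  # granted
print(locks.request('T2', 'A', 'S'))  # granted, S is compatible with S
print(locks.request('T3', 'A', 'X'))  # refused while readers hold A
```

Under two-phase locking, a transaction would call `release_all` only after it has acquired every lock it needs.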
Logging and Recovery
Undo logs
  – What's the content of the log?
  – How to recover from undo logs
Redo logs
  – What's the content of the log?
  – How to recover from redo logs
Similar questions for undo/redo logs.
How to use checkpoints
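Recovery from an undo log can be sketched as a backward scan that restores the old value of every write made by a transaction with no commit record. The record layout below (`('WRITE', txn, item, old_value)` and so on) is an assumed encoding for illustration, and checkpoints and abort records are ignored.

```python
def undo_recover(log, db):
    """Restore old values written by uncommitted transactions.

    log: list of records in the order they were written (assumed format):
         ('START', txn), ('WRITE', txn, item, old_value), ('COMMIT', txn)
    db:  dict of item -> value as found on disk after the crash
    """
    committed = {rec[1] for rec in log if rec[0] == 'COMMIT'}
    # Scan from the end so the earliest old value is the one that survives.
    for rec in reversed(log):
        if rec[0] == 'WRITE' and rec[1] not in committed:
            _, txn, item, old_value = rec
            db[item] = old_value
    return db

log = [
    ('START', 'T1'),
    ('WRITE', 'T1', 'A', 5),    # T1 overwrote A, whose old value was 5
    ('START', 'T2'),
    ('WRITE', 'T2', 'B', 10),   # T2 overwrote B, whose old value was 10
    ('COMMIT', 'T1'),
    ('WRITE', 'T2', 'B', 20),   # T2 overwrote B again, old value 20
]
print(undo_recover(log, {'A': 8, 'B': 30}))
```

A redo log works the other way around: it records new values, and recovery scans forward, reapplying the writes of committed transactions.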
Query Processing
Which common queries are the most time-consuming?
  – Join
What's the basic algorithm to implement Join?
  – Why is it time-consuming?
How to improve it?
  – Sort-Merge Join
  – Hash-based Join
  – What's their performance?
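To make the contrast concrete, here is a minimal in-memory sketch of a nested-loop join next to a hash-based join. Relations are assumed to be lists of tuples joined on a column position; the function names are illustrative. The nested loop compares every pair of tuples, while the hash join builds a table on one relation and probes it once per tuple of the other.

```python
from collections import defaultdict

def nested_loop_join(r, s, key_r, key_s):
    # Compares every tuple of r with every tuple of s: len(r) * len(s) checks.
    return [(a, b) for a in r for b in s if a[key_r] == b[key_s]]

def hash_join(r, s, key_r, key_s):
    # Build phase: hash r on its join column.
    buckets = defaultdict(list)
    for a in r:
        buckets[a[key_r]].append(a)
    # Probe phase: one lookup per tuple of s.
    return [(a, b) for b in s for a in buckets.get(b[key_s], [])]

r = [(1, 'a'), (2, 'b'), (3, 'c')]
s = [(2, 'x'), (3, 'y'), (3, 'z'), (4, 'w')]
# Both produce the same set of pairs, possibly in a different order.
print(sorted(nested_loop_join(r, s, 0, 0)) == sorted(hash_join(r, s, 0, 0)))
```

On disk the same idea is measured in block I/Os rather than comparisons, which is where the performance differences between the join algorithms come from.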
Query Plan
Basic Question: Given a query, what's the most efficient plan to execute it?
  – How many equivalent plans are there? (Query Rewrite)
  – What's the plan with the best performance? How do you estimate performance based on data distribution?
  – How to choose the best plan?
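One common way to estimate performance from the data distribution is to estimate result sizes from tuple counts and distinct-value counts. The sketch below uses the standard textbook estimates, assuming values are uniformly distributed and that the smaller set of join values is contained in the larger one. Here `t_r` stands for the number of tuples T(R) and `v_r_a` for the number of distinct values V(R, A); the function names are illustrative.

```python
def estimate_selection(t_r, v_r_a):
    # Size of sigma_{A = c}(R): assume each of the V(R, A) values is equally likely.
    return t_r / v_r_a

def estimate_join(t_r, t_s, v_r_y, v_s_y):
    # Size of R join S on attribute Y: T(R) * T(S) / max(V(R, Y), V(S, Y)).
    return t_r * t_s / max(v_r_y, v_s_y)

# R has 1000 tuples and 20 distinct Y values; S has 2000 tuples and 50.
print(estimate_join(1000, 2000, 20, 50))  # 40000.0
print(estimate_selection(1000, 50))       # 20.0
```

An optimizer would compute such estimates for the intermediate results of each candidate plan and prefer the plan with the smallest estimated cost.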
Information Retrieval
What's the data structure to store the relationships between documents and words?
  – Dictionary and postings.
How to speed up queries on words?
How to tolerate errors in the queries?
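The dictionary-and-postings structure can be sketched as a map from each term to a sorted list of document IDs, with AND queries answered by merging two postings lists. Edit distance is shown as one possible way to tolerate misspelled query terms; the slides do not say which technique is expected, so treat that part as an assumption. All function names are illustrative.

```python
from collections import defaultdict

def build_index(docs):
    """docs: dict of doc_id -> text. Returns term -> sorted list of doc IDs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def intersect(p1, p2):
    # Linear merge of two sorted postings lists (answers "term1 AND term2").
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return out

def edit_distance(a, b):
    # Levenshtein distance by dynamic programming, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]

docs = {1: "database index", 2: "index structure", 3: "database recovery index"}
index = build_index(docs)
print(intersect(index["database"], index["index"]))  # [1, 3]
print(edit_distance("kitten", "sitting"))            # 3
```

A misspelled query term could then be replaced by the dictionary term with the smallest edit distance before the postings are looked up.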
Final Exam
Exam questions
  – 5 big questions, just like the mid-term exam
  – 1 extra credit question
Exam
  – Location: CB242
  – Time: Dec 18th, 3pm-5pm
  – Just bring yourself and a pen or pencil
Exam Week Office Hours
Time
  – Monday, Wednesday, 11am-1pm
Location
  – Hardymon building 237
Thank You
Questions?
  – Send an email or drop by