Published byἸούδας Δράκος Modified over 6 years ago
1
Special Topics in Data Mining Applications Focus on: Text Mining
INFS-795 Spring GMU
2
General Info
Instructor: Carlotta Domeniconi
Office: S&T2, Rm 449
Phone: (703)
Office hours: Tue 4-6pm, or by appointment
Visit the class webpage often!
3
Course Format
Lectures by the instructor
One midterm
Paper presentations by students
One project: project proposal, project presentation, project paper
4
Important Dates
March 10: Project proposal due
March 24: Midterm exam
March 31: Students' presentations start
May 12: Paper on the project due
Visit the class webpage often!
5
The final grade is based on:
Midterm: 25%
Paper presentation: 15%
Project (proposal, presentation, paper): 50%
Participation in class and quizzes on papers presented: 10%
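As a minimal sketch of how these weights combine, the snippet below computes a weighted average. The weights come from the slide; the function name, the 0-100 scale, and the example scores are assumptions made only for illustration. The slide does not say how the 50% project weight is split among proposal, presentation, and paper, so the project is treated as a single score here.

```python
# Weights as listed on the slide (they sum to 100%).
WEIGHTS = {
    "midterm": 0.25,
    "paper_presentation": 0.15,
    "project": 0.50,        # proposal + presentation + paper, treated as one score
    "participation": 0.10,  # class participation and quizzes on presented papers
}


def final_grade(scores):
    """Return the weighted average of component scores (assumed 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {
    "midterm": 80,
    "paper_presentation": 90,
    "project": 85,
    "participation": 100,
}
print(round(final_grade(example), 2))
```

Because the project carries half of the total weight, a change in the project score moves the final grade twice as much as the same change in the midterm score.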
6
Course Overview
Classification:
  Bayes decision theory
  Density estimation
  Discriminant analysis
  Decision trees
  Nearest neighbors
  Curse of dimensionality
  Dimensionality reduction: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA)
  Support Vector Machines
7
Course Overview
Clustering:
  Basics
  Distance measures
  K-means
  Subspace clustering
8
Course Overview
Text categorization:
  Document representation
  Latent semantic indexing
  Unsupervised and supervised feature selection
  Feature weighting
  Similarity measures
  Semantic distances
  Kernel methods
  Detecting spam
9
Course Overview
Presentation/discussion of papers (list of papers provided)
Project proposals
Project presentations
Paper on the project
10
We will study and learn:
Fundamental principles and techniques in data mining / machine learning
Problems that arise in document classification
Existing approaches in data mining to address these problems
Their limitations
Can we do better?
11
Some useful books
12
On Pattern Classification:
R. O. Duda, P. E. Hart, D. G. Stork, “Pattern Classification”, Second Edition, Wiley, 2001.
13
On Document Classification:
S. Chakrabarti, “Mining the Web: Discovering Knowledge from Hypertext Data”, Elsevier Science, 2003. Thorsten Joachims, “Learning to Classify Text using Support Vector Machines”, Kluwer 2002.
14
On Text Retrieval: M. Berry and M. Browne, “Understanding Search Engines. Mathematical Modeling and Text Retrieval”, SIAM, 1999.
15
On Statistical Learning:
T. Hastie, R. Tibshirani, and J. Friedman, “The Elements of Statistical Learning. Data Mining, Inference and Prediction”, Springer, (Last Print!)