Special Topics in Data Mining Applications Focus on: Text Mining


1 Special Topics in Data Mining Applications Focus on: Text Mining
INFS-795 Spring GMU

2 General Info
Instructor: Carlotta Domeniconi
Office: S&T2, Rm 449
Phone: (703)
Office hours: Tue 4-6pm, or by appointment
Visit the class webpage often!

3 Course Format
Lectures by the instructor;
One midterm;
Paper presentations by students;
One project: project proposal; project presentation; project paper.

4 Important Dates
March 10: Project proposal due;
March 24: Midterm Exam;
March 31: Students’ presentations start;
May 12: Paper on the project due.
Visit the class webpage often!

5 The final grade is based on…
Midterm: 25%
Paper presentation: 15%
Project (proposal, presentation, paper): 50%
Participation in class and quizzes on papers presented: 10%
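
As a quick illustration of how these weights combine, here is a minimal sketch in Python. The weights come from the slide. The function name `final_grade`, the 0-100 score scale, and the example scores are assumptions made for illustration only, not part of the course material.

```python
# Weights as listed on the slide; they should sum to 100%.
WEIGHTS = {
    "midterm": 0.25,
    "paper_presentation": 0.15,
    "project": 0.50,        # proposal, presentation, and paper together
    "participation": 0.10,  # class participation and quizzes on papers
}


def final_grade(scores):
    """Return the weighted average of component scores (assumed 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "midterm": 80,
    "paper_presentation": 90,
    "project": 85,
    "participation": 100,
}
print(round(final_grade(example_scores), 2))
```

Because the project carries half the weight, a 10-point change in the project score moves the final grade by 5 points, twice the effect of the same change on the midterm.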

6 Course Overview
Classification:
Bayes decision theory;
Density estimation;
Discriminant analysis;
Decision trees;
Nearest neighbors;
Curse of dimensionality;
Dimensionality reduction: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA);
Support Vector Machines.

7 Course Overview
Clustering:
Basics;
Distance measures;
K-means;
Subspace clustering.

8 Course Overview
Text categorization:
Document representation;
Latent semantic indexing;
Unsupervised and supervised feature selection;
Feature weighting;
Similarity measures;
Semantic distances;
Kernel methods;
Detecting spam.
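
To make three of the topics above concrete (document representation, feature weighting, and similarity measures), the following is a minimal sketch in Python. It assumes a bag-of-words representation, a simple TF-IDF weighting (raw term count times log(N/df)), and cosine similarity. These are common textbook choices and not necessarily the variants covered in the course. The helper names and the toy documents are made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Represent each document as a sparse {term: weight} dictionary."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus, for illustration only.
docs = [
    "cheap pills buy now",
    "buy cheap pills online",
    "meeting agenda for the project",
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # the two spam-like messages share terms
print(cosine(vecs[0], vecs[2]))  # no shared terms
```

With this weighting, a term that appears in every document gets weight log(1) = 0, so it contributes nothing to the similarity. That is the intuition behind feature weighting: terms that do not discriminate between documents are down-weighted.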

9 Course Overview
Presentation/Discussion of papers (list of papers provided);
Project proposals;
Project presentations;
Paper on the project.

10 We will study and learn…
Fundamental principles and techniques in data mining / machine learning;
Problems that arise in document classification;
Existing approaches in data mining to address these problems;
Their limitations;
Can we do better?

11 Some useful books

12 On Pattern Classification:
R. O. Duda, P. E. Hart, D. G. Stork, “Pattern Classification”, Second Edition, Wiley, 2001.

13 On Document Classification:
S. Chakrabarti, “Mining the Web: Discovering Knowledge from Hypertext Data”, Elsevier Science, 2003.
T. Joachims, “Learning to Classify Text using Support Vector Machines”, Kluwer, 2002.

14 On Text Retrieval:
M. Berry and M. Browne, “Understanding Search Engines: Mathematical Modeling and Text Retrieval”, SIAM, 1999.

15 On Statistical Learning:
T. Hastie, R. Tibshirani, and J. Friedman, “The Elements of Statistical Learning. Data Mining, Inference and Prediction”, Springer, (Last Print!)

