Presentation is loading. Please wait.

Presentation is loading. Please wait.

Learning to Classify Documents Edwin Zhang Computer Systems Lab

Similar presentations


Presentation on theme: "Learning to Classify Documents Edwin Zhang Computer Systems Lab"— Presentation transcript:

1 Learning to Classify Documents Edwin Zhang Computer Systems Lab 2009-2010

2 Abstract Classifying documents
Will use a Bayesian method and calculate conditional probability Use a set of Training Documents Choose a set of features

3 Introduction Learning to Classify Documents Use a Bayesian Method
Code in Python/Java

4 Background Naïve Bayes Classifier/Bayesian Method
computes the conditional probability p(T|D) for a given document D for every topic Assigns the document D to the topic with the largest conditional probability

5 Development Program has two steps: Learning Prediction
training documents conditional probability features selection

6 Development Prediction
Predicting what a unknown document is talking about based on prediction section

7 Expected Results Initially, the program may have trouble classifying documents into the correct category As the program learns more and improves its formulas, it will get better at classifying documents into the correct categories.

8 Works Cited http://www.nltk.org/book My dad
Eyheramendy, Susana, and David Madigan. "A Flexible Bayesian Generalized Linear Model for Dichotomous Response Data with an Application to Text Categorization." Lecture Notes-Monograph Series 54 (2007): JSTOR. Web. 25 Oct <


Download ppt "Learning to Classify Documents Edwin Zhang Computer Systems Lab"

Similar presentations


Ads by Google