
1 NEW EVENT DETECTION AND TOPIC TRACKING STEPS

2 PREPROCESSING
Removal of check-ins and other redundant data
Removal of URLs (maybe)
Stemming of words using TRMorph
  – Get the root form of a word
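The preprocessing steps on this slide could be sketched roughly as follows. This is only an illustration: the check-in pattern (Foursquare-style "I'm at ..." posts) is an assumption, and `naive_stem` is a hypothetical stand-in for a real call to TRMorph, whose interface is not described in the slides.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Assumption: check-ins look like Foursquare-style posts ("I'm at ...").
CHECKIN_RE = re.compile(r"^\s*i'm at\b", re.IGNORECASE)


def naive_stem(word):
    """Hypothetical placeholder stemmer; a real pipeline would use TRMorph."""
    for suffix in ("ler", "lar"):  # Turkish plural suffixes, illustration only
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def preprocess(tweet, stem=naive_stem):
    """Return a list of stemmed tokens, or None if the tweet is dropped."""
    if CHECKIN_RE.search(tweet):
        return None  # check-in: treated as redundant data
    text = URL_RE.sub(" ", tweet.lower())
    tokens = re.findall(r"#?\w+", text)  # keep a leading '#' on hashtags
    return [stem(t) for t in tokens]
```

For example, `preprocess("Depremler oldu http://t.co/abc #deprem")` drops the URL and reduces "depremler" to "deprem", while a tweet starting with "I'm at" is discarded.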

3 PREPROCESSING(2)
Expand tweets with co-occurrence statistics of words
  – OzerOzdikisAsonam (language independent)
Syntagmatic relations -> two words appear together very frequently in texts
Paradigmatic relations -> words can replace each other
Use of WordNet (BalkaNet for Turkish, not so successful)
Latent Semantic Indexing might be used for expanding the tweets
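A minimal sketch of co-occurrence-based expansion, covering only the syntagmatic case (words that frequently appear in the same tweet). The function names and the `min_count` and `top_n` parameters are assumptions for illustration and are not taken from the cited work.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(token_lists):
    """Count how often each pair of distinct words shares a tweet."""
    cooc = defaultdict(Counter)
    for tokens in token_lists:
        for a, b in combinations(sorted(set(tokens)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def expand(tokens, cooc, min_count=2, top_n=1):
    """Append the words that most often co-occur with each token."""
    expanded = list(tokens)
    for tok in tokens:
        neighbours = cooc.get(tok, Counter())
        for word, count in neighbours.most_common(top_n):
            if count >= min_count and word not in expanded:
                expanded.append(word)
    return expanded
```

With a corpus in which "deprem" and "van" share two tweets, `expand(["deprem"], cooc)` would add "van"; pairs seen only once stay below `min_count` and add nothing.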

4 PREPROCESSING(3)
Normalize the tweets to produce unit-length vectors
Put the tweets and words in a vector space model using the words' tf-idf values
The tf-idf values of terms with hashtags can be increased to get a better result (an idea)
*Tweet timestamps could also be used in some way*
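One way the vector space step might look, as a sketch under stated assumptions: the smoothed idf formula, the default weight for unseen terms, and the `hashtag_boost` factor of 2.0 are illustrative choices, not values from the slides. Boosting hashtag terms is one reading of the hashtag idea above.

```python
import math
from collections import Counter


def idf_table(token_lists):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(token_lists)
    df = Counter(t for tokens in token_lists for t in set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_unit_vector(tokens, idf, hashtag_boost=2.0):
    """Sparse tf-idf vector (dict) scaled to unit Euclidean length."""
    tf = Counter(tokens)
    vec = {}
    for term, count in tf.items():
        # Assumption: terms missing from the idf table get weight 1.0.
        weight = count * idf.get(term, 1.0)
        if term.startswith("#"):
            weight *= hashtag_boost  # assumed boost factor
        vec[term] = weight
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}
```

Because every vector has length 1, the cosine similarity between two tweets reduces to their dot product.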

5 ALGORITHM
Clusters are vectors of the average values of their member tweets
Calculate cosine similarity between a new tweet and all the clusters
If the similarity is greater than a threshold
  – Add the tweet to the corresponding cluster
  – Update the cluster
? Add the tweet to more than one cluster if the similarity is above the threshold for more than one cluster ?

6 ALGORITHM(2)
If the cosine similarity is below the threshold for all the clusters, the tweet is a new event and starts a new cluster
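The two algorithm slides together describe a single-pass, threshold-based clustering. A minimal sketch follows, assuming sparse vectors stored as dicts, a threshold of 0.5, and assignment to the single most similar cluster only. The open question of adding a tweet to several clusters is left aside, and all names here are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


class Cluster:
    def __init__(self, vec):
        self.centroid = dict(vec)
        self.size = 1

    def add(self, vec):
        """Update the centroid as the running mean of member vectors."""
        self.size += 1
        for term in set(self.centroid) | set(vec):
            old = self.centroid.get(term, 0.0)
            self.centroid[term] = old + (vec.get(term, 0.0) - old) / self.size


def assign(vec, clusters, threshold=0.5):
    """Return (cluster index, is_new_event) for one incoming tweet vector."""
    best, best_sim = None, threshold
    for i, cluster in enumerate(clusters):
        sim = cosine(vec, cluster.centroid)
        if sim > best_sim:
            best, best_sim = i, sim
    if best is None:  # no cluster exceeded the threshold: new event
        clusters.append(Cluster(vec))
        return len(clusters) - 1, True
    clusters[best].add(vec)
    return best, False
```

Feeding tweets one at a time through `assign` grows the cluster list as new events appear; the threshold controls how readily a tweet is declared a new event.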

7 ALGORITHM(3)
We might extract queries (word groups that represent the topics) for clusters and use them to compute the cluster-tweet similarities [2]
Update the query with each update to the cluster
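The slide does not say how a query is built, so the following is only one plausible sketch: take the k highest-weighted terms of the cluster centroid as the query, and score a tweet by the fraction of query terms it contains. Both functions and the value of `k` are assumptions; the method in [2] may differ.

```python
def extract_query(centroid, k=3):
    """Return the k highest-weighted terms of a cluster centroid."""
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(centroid.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def query_similarity(tokens, query):
    """Fraction of query terms that occur in the tweet."""
    if not query:
        return 0.0
    present = set(tokens)
    return sum(1 for term in query if term in present) / len(query)
```

Re-running `extract_query` after every centroid update keeps the query in step with the cluster, as the slide suggests.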

8 EVALUATION
Precision-Recall, F-score
Intra-distance similarities [1]
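For the first evaluation item, precision, recall and the F-score could be computed as below. The sketch assumes the system output and the ground truth are both given as sets of tweet IDs flagged as new events; that representation is an assumption, and the intra-distance measure is not covered here.

```python
def precision_recall_f1(predicted, gold):
    """Precision, recall and F1 for predicted vs. gold sets of tweet IDs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

The F1 value is the harmonic mean of precision and recall, so it is high only when both are high.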

9 REFERENCES
[1] http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6425790
[2] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.42.8942

