IR, IE and QA over Social Media


1 IR, IE and QA over Social Media
Social media (blogs, community QA, news aggregators)
- Complementary to “traditional” news sources (Rathergate)
- Grow faster than “traditional” web content, gap widening
  Traditional/published: 4 Gb/day; social media: 10 Gb/day [from Andrew Tomkins/Yahoo!, “Future of Web Search”, May 2007]
Research challenges
- Low(er) quality
- Content more dynamic
- User interactions crucial: ratings, comments, link structure to retrieve documents and to evaluate extracted information

2 Finding High Quality Content for IE/QA
Goal: find high-quality content (accurate & well-presented)
- Setting: Community QA (Yahoo! Answers)
- Classifying social media (e.g., cQA) is substantially different from document classification
Sources of information
- Content analysis
- Usage data (page views, etc.)
- Community ratings, link analysis
General framework for quality estimation in social media
Graph-based model of contributor relationships, combined with content and usage analysis
Can identify high-quality items with accuracy close to human agreement
E. Agichtein, C. Castillo, D. Donato, A. Gionis, G. Mishne, Finding High Quality Content in Social Media, in Proc. of WSDM 2008
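
As a rough illustration of what a graph-based contributor feature could look like, the sketch below runs a PageRank-style power iteration over a small asker-to-answerer graph. The slide does not say which graph algorithm the framework uses, so the choice of PageRank, the edge direction, the user names, and the parameter values are all assumptions made for this example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """PageRank-style scores by power iteration over directed (src, dst) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over contributors.
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node receives the same teleport mass.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # A node with no outgoing edges spreads its mass over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical interaction graph: an edge (asker, answerer) means
# "answerer replied to a question posted by asker".
edges = [
    ("ann", "bob"),
    ("cat", "bob"),
    ("dan", "bob"),
    ("bob", "eve"),
    ("ann", "eve"),
]
scores = pagerank(edges)
for user in sorted(scores, key=scores.get, reverse=True):
    print(f"{user}: {scores[user]:.3f}")
```

In a setup like the one the slide describes, a score of this kind would be only one input, used next to content features and usage data rather than as a quality estimate on its own.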

3 Finding Relevant Content for IE/QA
Goal: given a query, rank social content (cQA) by expected relevance and quality
Approach: learn ranking functions specifically for social media retrieval
- Features
  Textual content: relevance, stylistics, language models
  User interactions: link structure, discussion threads
  User ratings: incorporate user-provided content ratings
- Method: Gradient boosting (GBrank)
  Developed a new objective function for learning ranking functions from (noisy) preference data
Results:
- Outperforms Yahoo! default ranking and naïve ranking by user votes
- Can be made robust to ratings spam [same authors, to appear in AIRWeb 2008]
J. Bian, Y. Liu, E. Agichtein and H. Zha. Finding the Right Facts in the Crowd: Factoid Question Answering over Social Media, to appear in Proc. of WWW 2008
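
To make "learning a ranking function from preference data" concrete, here is a minimal sketch that fits a linear scorer with margin-perceptron updates on pairwise preferences. This is a deliberately simpler stand-in, not GBrank and not the objective function the authors developed. The three features, their values, the answer ids, and the preference pairs are hypothetical.

```python
import random


def score(weights, features):
    """Linear ranking score: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(pairs, n_features, epochs=100, lr=0.5, margin=1.0, seed=0):
    """Fit weights so each preferred item outscores the other by `margin`.

    pairs: list of (preferred_features, other_features) tuples.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for better, worse in pairs:
            diff = [b - w for b, w in zip(better, worse)]
            # Update only when the pair is mis-ordered or inside the margin.
            if score(weights, diff) < margin:
                weights = [wi + lr * di for wi, di in zip(weights, diff)]
    return weights


# Hypothetical feature vectors per answer:
# [text relevance, answerer reputation, scaled user votes]
answers = {
    "a1": [0.9, 0.8, 0.2],
    "a2": [0.4, 0.3, 0.9],
    "a3": [0.1, 0.2, 0.1],
}
# Hypothetical preference pairs (preferred, other), e.g. derived from judgments.
preferences = [("a1", "a2"), ("a1", "a3"), ("a2", "a3")]

pairs = [(answers[p], answers[q]) for p, q in preferences]
weights = train_pairwise(pairs, n_features=3)
ranking = sorted(answers, key=lambda a: score(weights, answers[a]), reverse=True)
print(ranking)
```

Note that a2 has far more votes than a1 yet is ranked below it once the preferences are learned, which is the kind of case where ranking by user votes alone would go wrong.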

