• Goal recap
• Implementation
• Experimental Results
• Conclusion
• Questions & Answers



• Our goal is to implement a framework that predicts network traffic by mining mainstream news articles
• Method
  › Latent Dirichlet Allocation (LDA) identifies and classifies popular topics in articles
• An ISP can then query and pre-cache highly popular videos to reduce overall traffic and delay
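To make the LDA step concrete, here is a minimal sketch of topic inference on tokenized articles. It uses collapsed Gibbs sampling, which is a swapped-in technique chosen for brevity and is not the Online LDA implementation used in this work. The function names (`gibbs_lda`, `top_words`), the hyperparameters `alpha` and `beta`, and the toy articles are all illustrative assumptions.

```python
import random


def gibbs_lda(docs, num_topics, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not Online LDA).

    docs is a list of tokenized documents (lists of words).
    Returns the vocabulary, per-document topic counts, and per-topic word counts.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(num_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(row)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, topic, n=3):
    """Return the n words with the highest count in the given topic."""
    ranked = sorted(range(len(vocab)), key=lambda i: topic_word[topic][i], reverse=True)
    return [vocab[i] for i in ranked[:n]]


# Toy articles (hypothetical data), already tokenized.
docs = [
    ["storm", "coast", "rain", "storm"],
    ["rain", "flood", "coast"],
    ["team", "final", "goal", "team"],
    ["goal", "final", "match"],
]
vocab, doc_topic, topic_word = gibbs_lda(docs, num_topics=2)
for topic in range(2):
    print(topic, top_words(vocab, topic_word, topic))
```

The per-document topic counts in `doc_topic` are what a later popularity ranking would aggregate; on a real corpus an online variational implementation scales far better than this sampler.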

• Implemented a Python program to parse the news articles and collect the title and content
• The original LDA implementation processed random Wikipedia articles; we modified it to parse and process news articles
• Wrote a script to extract and store YouTube statistical data, such as view counts, number of subscribers, YouTube IDs, date of upload, user profile data, etc.
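The article-parsing step could look roughly like the sketch below. It assumes the news articles arrive as RSS 2.0 feeds with `title` and `description` elements, which the slides do not confirm; the sample feed, `parse_feed`, and `tokenize` are hypothetical names introduced only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample feed standing in for a real news source.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>Storm hits coast</title>
      <description>Heavy rain and wind reached the coast on Monday.</description>
    </item>
    <item>
      <title>Team wins final</title>
      <description>The home team won the final after extra time.</description>
    </item>
  </channel>
</rss>"""


def parse_feed(xml_text):
    """Collect the title and content of every <item> in an RSS document."""
    root = ET.fromstring(xml_text)
    articles = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        content = (item.findtext("description") or "").strip()
        articles.append({"title": title, "content": content})
    return articles


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


articles = parse_feed(SAMPLE_FEED)
for article in articles:
    print(article["title"], tokenize(article["content"]))
```

The token lists produced by `tokenize` are the kind of input a topic model expects; a real pipeline would likely also remove stop words.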

• Wrote a program to sort topics by popularity; we pick the most popular topics and compare them with news websites
  › Popular news websites (such as CNN and BBC) generate popularity charts over time from click-view data
• Implemented the ZOOM operation
  › Wrote a program to distribute the articles by source/category
  › Query words using frequent pattern mining and LDA results to check the relevance and accuracy of popular topics
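The two steps above, ranking topics by popularity and mining frequent word patterns for queries, can be sketched as follows. This is a simplified illustration under stated assumptions: popularity is taken to be the summed per-article topic weight, and frequent pattern mining is reduced to counting word pairs that co-occur in at least `min_support` articles. The function names and toy data are hypothetical, not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def rank_topics(doc_topic_weights):
    """Order topic indices by total weight across all articles, highest first."""
    totals = Counter()
    for weights in doc_topic_weights:
        for topic, weight in enumerate(weights):
            totals[topic] += weight
    return [topic for topic, _ in totals.most_common()]


def frequent_word_pairs(articles, min_support):
    """Return word pairs that co-occur in at least min_support articles."""
    counts = Counter()
    for words in articles:
        # Count each pair once per article, in a canonical (sorted) order.
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Toy per-article topic weights (rows: articles, columns: topics).
weights = [
    [0.1, 0.7, 0.2],
    [0.2, 0.6, 0.2],
    [0.5, 0.3, 0.2],
]
ranking = rank_topics(weights)

# Toy tokenized articles assumed to belong to the most popular topic.
topic_articles = [
    ["storm", "coast", "rain"],
    ["storm", "coast", "flood"],
    ["team", "final", "storm"],
]
pairs = frequent_word_pairs(topic_articles, min_support=2)
print(ranking, pairs)
```

Pairs that survive the support threshold are candidate query words for a video portal; comparing them with the top LDA words for the same topic is one way to check that the two methods agree.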

Figure: number of feeds (x-axis) vs. video relevance to the topic (y-axis)

Figure: number of feeds (x-axis) vs. accuracy of selecting the video with the most traffic (y-axis)

• Online LDA alone chooses the most popular topic correctly around 57% of the time using 1k articles. With 100k articles it is around 91% accurate. The blue line is the accuracy using both Online LDA and frequent pattern mining. With 1k articles the accuracy is around 92%. Using 100k articles the accuracy is close to 100%.
• When using only Online LDA with 10k articles, there is only around a 60% chance that the selected video will be relevant to the actual topic. With 100k articles the probability rises to about 87%. When using frequent pattern mining and Online LDA with 10k articles, there is around a 94% chance that the selected video is relevant. With 100k articles the probability is close to 100%.
• From these results we conclude that Online LDA combined with frequent pattern mining can predict popular topics from mainstream media and identify relevant videos from video portals with high accuracy

• Thank you
• Q&A!!