GeoFeed: A Location-Aware News Feed System Jie Bao Mohamed F. Mokbel Chi-Yin Chow Department of Computer Science and Engineering University of Minnesota – Twin Cities Department of Computer Science City University of Hong Kong \
Background Social Networking Services (e.g., Facebook & Twitter) People on Facebook More than 500 million active users 50% of our active users log on to Facebook in any given day Average user has 130 friends People spend over 700 billion minutes per month on Facebook Facebook No.1 Twitter No.17th Recently 5 billion IPO Become one of the most popular Web services!!!
What is News Feeds? News Feed function Examples: Display a set of messages/news from friends / subscribed news agents Examples: Social networking system, i.e., Facebook, Twitter News Aggregators, i.e., My Yahoo!, iGoogle
Motivation Traditional News Feed Motivating Scenarios Organized by either message issuing time, e.g., Twitter, or some user requirements, e.g., Facebook Spatial relevance is overlooked, user gets the same news feed from different log on locations Motivating Scenarios Travelling user is more interested in the news/messages that are close to her current location to explore the new place Stationary users may NOT be interested in the news/messages that are issued very far from their locations If the news feed functionality is aware of the inherent locations of users and messages, more relevant news feed will be delivered
“Locations” in Existing Social Networking Systems Google Latitude Facebook Place Twitter Nearby “Real” Location-Aware News Feed Social Relevance Messages from friends/ subscribed news agents 2. Spatial Relevance Message relevant to the user’s location Unfortunately not “real” location awareness currently Share only user’s current location, e.g., Google Latitude Use location information as a tag , e.g., Facebook Place View all the messages in a spatial range, e.g., Twitter Nearby
Location-Aware News Feeds Location-Based Messages Issuer: user/ news agent Spatial extent: point/range Location-Aware News Feeds Recent k spatial relevant messages from each of my friends Example: Carol wants her news feed from friends (Alice and Bob) Alice’s Messages Message Content Spatial Timestamp M5 Raining S5 14:30 M3 A nice bar S3 14:10 M2 Eating at bar S2 14:04 A location-based query is issued to retrieve the most recent k=2 relevant messages from Alice M2 M6 M5 Bob’s Messages M3 M4 Message Content Spatial Timestamp M6 Local Sale S6 15:30 M4 An accident S4 14:21 M1 Work finished S1 11:40 Carol M1 A location-based query is issued to retrieve the most recent k=2 relevant messages from Bob
An Overview of GeoFeed For a user U with N friends, GeoFeed abstracts location-aware news feed to a set of N location-based queries, such that: The N location-based queries are fired upon U logging on to the system Each location- based query is directed to one friend to retrieve the set of k relevant messages GeoFeed employs three approaches for each location-based query Spatial Pull approach Spatial Push approach Shared Push approach GeoFeed employs a decision model that decides upon the best approach to evaluate each query such that: The system computational overhead is minimized Each user U will get the required news feed in TU time units
GeoFeed Preliminary : Problem Formulation Given: User location User friend list User response time requirement User activity patterns, i.e., offline time and update frequency Find: Best approach among spatial pull, spatial push, and shared push approaches, to evaluate q once u logs on to the system next time Objective: Provide location-aware news feed for the user Guarantee a the response time that u will encounter to get all the requested location-aware news feeds Minimize the computational overhead for all queries in the system
The Spatial Pull Approach in GeoFeed Do nothing when the user offline Once the user logs on, compute al the queries for the user Advantage: No extra overhead during offline period Disadvantages: High user response time and not efficient for the user with short offline time Bob location-based query Alice Messages 3. Get cell 2. Alice’s location Spatial Filter Grid Index 5. Relevant messages 4. Messages in the cell
The Spatial Push Approach in GeoFeed Maintain a materialized view for the pre-computed messages Once the user logs on, the answer is ready Advantage: Users are very happy with very low response time Disadvantages: System is overwhelmed with maintaining large number of views that may not be necessary Bob 1. location-based query Alice Materialized view 3. Range query New message Other Friends Other Materialized views 2. Relevant messages 4.Update Grid Index
The Shard Push Approach in GeoFeed Shared Push approach Share one view among queries for the nearby friends Once the user logs on, the answer is ready Advantages: Users are still very happy with very low response time, and system overhead could be significantly lower Disadvantages: Users need to be close enough, continuously check if views can be shared Bob 1. location-based query 3. Range query Alice Shared materialized view Filter New message Nearby Friends 2. Relevant messages 4.Update Grid Index
GeoFeed Cost Model Spatial pull approach (based on per user-friend evaluation) Response time Evaluating the location query Spatial push approach (based on per user-friend evaluation) Response time/Query processing cost Return messages from materialized view System overhead Cost to update the materialized view with the user’s the offline time and the friend’s update frequency Shared push approach (based on per cell evaluation) Return messages from the shared view with filtering Cost to update the shared view with the user’s update frequency and friends’ minimum offline time
Challenges in Decision Model Main Challenges: Guarantee a response time requirement for the user Do not overwhelm the system Consider the wide diversity of the user activity patterns in social networking systems, e.g., offline times, update frequencies To favor user response time More spatial push approaches will be adapted System is overkilled to maintain a large number of materialized views and continuous queries To favor system overhead More spatial pull approaches may be adapted Users suffer significant delays to get their news feeds
Which is the Best Approach for a Query System-wide decision Per-User decision Per-Query decision (GeoFeed) A B C D E F A B C D E F A B C D E F A B C D E F OR Users Friends Users Friends Consider the wide diversity in user activities in social networking systems e.g., offline times and update frequencies Users Friends
GeoFeed Decision Algorithm Step 1. Response Time Guarantee For each user, this step uses our cost model to decide the MAX number queries (N) to be evaluated by the spatial pull approach Step 2. Spatial Pull & Push Selection For each user, this step selects N queries to be evaluated by the spatial pull approach based on our cost model Step 3. Shared Push Refinement For each user, this step attempts to share the execution of his/her friends’ queries that are selected to be evaluated by the spatial push approach.
Experiments (1/4) Data Sets Experimental Settings Get final 646,697 tweets issued in State of Minnesota Use location information in tweets Coordinate locations Semantic location, e.g., a city name (use Google Geocoder) Experimental Settings Based on a Postgresql database Based on the statistics from Facebook A set of evaluation experiments to get the parameters to build the cost model and decision algorithm
Experiments (2/4) Inside GeoFeed Decision model Insights: With the increase of Tu, more spatial pull approaches are selected. When Tu=0 no spatial pull approaches are applied When Tu=∞, GeoFeed aims to only minimize the system overhead through employing much of the spatial pull approach. Comparing two figures shows that with a smaller offline time, more spatial push approaches are applied. (a) Offline time = 1 hour (b) Offline time = 8 hours
Experiments (3/4) Compare with traditional approaches Insights: Pure spatial pull has bad response time Pure spatial push had bad system overhead
Experiments (4/4) System overall overhead Insight: GeoFeed with shared push refinement has the similar response time but saves significant in system overhead
System Prototype Sindbad: A Location-Aware Social Networking System (SIGMOD 2012 demo)
Conclusion Location-Aware News Feeds Social relevance, i.e., a user’s friends/subscribed news agents Spatial relevance, i.e., messages overlap user’s location GeoFeed is an efficient system equipped with a smart decision algorithm, which chooses the best approach among spatial pull, spatial push and shared push to evaluate location-aware news feed: Guarantee the user’s required response time Minimize the system overhead
Thanks