Social Information Processing March 26-28, 2008 AAAI Spring Symposium Stanford University
ISI USC Information Sciences Institute, March 2008, AAAI Social Information Processing Symposium
Definition
Social Information Processing is
- an activity through which collective human actions organize knowledge
- a process which allows us to collectively solve problems far beyond any individual’s capabilities
- a new information processing paradigm enabled by the Social Web
The Social Web
The Social Web is a collection of technologies, practices and services that turn the Web into a platform for users to create and use content in a social context
- Authoring tools: blogs
- Collaboration tools: wikis, Wikipedia
- Tagging systems: del.icio.us, Flickr, CiteULike
- Social networking: Facebook, MySpace, Essembly
- Collaborative filtering: Digg, Amazon, Yahoo! Answers
Social Web features
- Users create content
  - Articles, opinions, creative products
- Users annotate content
  - Metadata (e.g., tags)
  - Ratings
- Users create connections
  - Between content and metadata
  - Between content or metadata and users
  - Among users (social networks)
- Users interact
  - Discuss and rate content
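The features above can be read as a small data model. The following is only a minimal sketch under stated assumptions: the `SocialWeb` class, its field names, and the sample users and items are hypothetical illustrations, not something defined in the slides.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SocialWeb:
    """Hypothetical toy model of the annotations and connections listed above."""

    # (user, item, tag) triples connect content, metadata, and users at once.
    tag_assignments: set = field(default_factory=set)
    # (user, item) -> numeric rating.
    ratings: dict = field(default_factory=dict)
    # user -> set of users they link to (the social network).
    contacts: dict = field(default_factory=lambda: defaultdict(set))

    def tag(self, user, item, tag):
        self.tag_assignments.add((user, item, tag))

    def rate(self, user, item, score):
        self.ratings[(user, item)] = score

    def connect(self, user, other):
        self.contacts[user].add(other)

    def tags_for(self, item):
        """Connection between content and metadata: all tags on an item."""
        return {t for (_, i, t) in self.tag_assignments if i == item}

    def items_tagged_by(self, user):
        """Connection between content and users: items a user annotated."""
        return {i for (u, i, _) in self.tag_assignments if u == user}


# Assumed sample data, for illustration only.
web = SocialWeb()
web.tag("alice", "photo1", "sunset")
web.tag("bob", "photo1", "beach")
web.rate("bob", "photo1", 5)
web.connect("alice", "bob")
```

Storing tag assignments as triples is one possible design choice: a single record then answers queries about content, metadata, and users without separate tables.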
Social Web is interesting
Social Web as a complex dynamical system
- Complex collective behavior emerges from actions taken by many users
  - Patterns emerge on large scale
- Variety of interactions between users
  - Coordination, collaboration, conflict …
  - Network vs environment-mediated
Social Web is interesting
Social Web as a knowledge-generating system
- Users express personal knowledge (through articles, tags, links, …) or modify knowledge expressed by others
- Tailor information to individual user …
  - Personalization and recommendation
- … or combine users’ knowledge to create a knowledgebase
  - Wikipedia, wikis
  - folksonomy
  - FAQs, …
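As one illustration of combining users’ knowledge, the sketch below builds a small folksonomy by keeping only the tags that several users independently applied to the same item. The function name `build_folksonomy`, the `min_users` threshold, and the sample data are assumptions made for this example; the slides do not prescribe a particular method.

```python
from collections import defaultdict


def build_folksonomy(assignments, min_users=2):
    """Keep, per item, the tags applied by at least `min_users` distinct users.

    `assignments` is an iterable of (user, item, tag) triples. Tags are
    lower-cased so that "Tagging" and "tagging" count as the same tag.
    """
    users_per_tag = defaultdict(lambda: defaultdict(set))
    for user, item, tag in assignments:
        users_per_tag[item][tag.lower()].add(user)
    return {
        item: sorted(t for t, users in tags.items() if len(users) >= min_users)
        for item, tags in users_per_tag.items()
    }


# Assumed sample data, for illustration only.
assignments = [
    ("alice", "paper42", "Tagging"),
    ("bob", "paper42", "tagging"),
    ("carol", "paper42", "toread"),
    ("bob", "paper7", "networks"),
    ("carol", "paper7", "networks"),
]
consensus = build_folksonomy(assignments)
```

Here the personal tag "toread" is dropped because only one user applied it, while the tags that two users agree on are kept.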
Social Web is interesting
Social Web as a problem-solving system
- By exposing human activity, Social Web allows users to harness the power of collective intelligence to solve problems
  - Manage the commons
  - Help the visually impaired get around in new places
  - Figure out who to trust
Social Web is interesting
- Lots of data for empirical studies
  - Large-scale experimentation
- Social Web is amenable to analysis
  - Design systems for optimal performance
Social Web is challenging
Social Web is enormous and growing rapidly
- Some popular sites have >1 million users and >1 billion objects
- 2G/day of “authored” content
- 10-15G/day of user generated content
  [From Andrew Tomkins, Yahoo! Research]
- Need new computational techniques to process massive data
Social Web is challenging
Social Web is highly dynamic
- New users and content
- Links are created and destroyed
- Need new computational approaches to deal with dynamic data
Social Web is challenging
Social Web is highly heterogeneous
- Variety of content and media types
- Variety of information domains
- Needs to be even more heterogeneous
  - Ability to express knowledge at different granularity levels
    - Micro-tagging: tag data within pages
  - Ability to express more complex knowledge
    - Specify relations: e.g., semantics of links
- Need algorithms to combine heterogeneous data
Social Web is challenging
Social Web is highly diverse
- User participation has power law distribution
- User expertise has power law distribution
- Need approaches that go beyond ‘wisdom of crowds’ to combine knowledge from users
  - Averaging is not always the best solution
  - How do we best exploit diversity?
- Understand incentives for user participation
- Methods for improving content/metadata quality
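To make the point about averaging concrete, here is a minimal sketch comparing a plain average with an expertise-weighted average. Everything in it is an assumption for illustration: the true value, the estimates, and the weights are made up, and weighting by expertise is just one possible alternative to simple averaging, not a method taken from the slides.

```python
def simple_average(estimates):
    """Plain 'wisdom of crowds' aggregate: every user counts equally."""
    return sum(estimates) / len(estimates)


def weighted_average(estimates, weights):
    """Aggregate in which each estimate counts in proportion to its weight."""
    total = sum(weights)
    return sum(e * w for e, w in zip(estimates, weights)) / total


# Hypothetical scenario: the true value is 100; the first user is an
# expert and the other four are not.
true_value = 100.0
estimates = [100.0, 60.0, 70.0, 150.0, 40.0]
# Assumed expertise weights (skewed, as a power-law distribution would suggest).
weights = [8.0, 1.0, 1.0, 1.0, 1.0]

avg = simple_average(estimates)
wavg = weighted_average(estimates, weights)

print(f"simple average:   {avg:.2f}, error {abs(avg - true_value):.2f}")
print(f"weighted average: {wavg:.2f}, error {abs(wavg - true_value):.2f}")
```

With these made-up numbers the weighted aggregate lands closer to the true value. In practice the weights are not known in advance and would themselves have to be estimated, which is part of the open question the slide raises.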