From Research of Social Media to Socially Mediated Research 2010 HCIL Symposium Workshop - UMD Government Applications of Social Media Networks and Communities.

Presentation transcript:

From Research of Social Media to Socially Mediated Research 2010 HCIL Symposium Workshop - UMD Government Applications of Social Media Networks and Communities May 28, 2010 Natasa Milic-Frayling Microsoft Research Cambridge

Outline
• Microsoft Research: Integrated Systems team, research areas and approach
• ‘Social’ as a research topic: Modelling Human to Human Interaction in Technology Mediated Communities
• ‘Social’ as facilitator of research: Leveraging Communities of Practice

Microsoft Research (MSR)
• MSR Sites
– Redmond, Washington (September 1991)
– San Francisco, California (June 1995)
– Cambridge, United Kingdom (July 1997)
– Beijing, China (November 1998)
– Silicon Valley, California (July 2001)
– Bangalore, India (January 2005)
– Cambridge, Massachusetts (July 2008)
[Map labels: MSR New England, MSR Asia, MSR India, Redmond, MSR Cambridge, Silicon Valley]

Research Areas: Web and On-line Communities; Content Analysis and Rich UI; Mobile and Cross Platform Media
Academic Disciplines: Information retrieval & NLP; Machine Learning and Statistics; Mathematical Modelling; Graph Theory and Analysis; HCI and Design

Research Areas: Web and On-line Communities; Content Analysis and Rich UI; Mobile and Cross Platform Media
Information retrieval & NLP; Machine Learning and Statistics; Mathematical Modelling; Graph Theory and Analysis; HCI and Design
Team: Gabriella, Janez, Annika, Rachel, Gerard, Natasa, Eduarda, Gavin, Jamie

Research Areas: Web and On-line Communities; Content Analysis and Rich UI; Mobile and Cross Platform Media
Academic Disciplines: Information retrieval & NLP; Machine Learning and Statistics; Mathematical Modelling; Graph Theory and Analysis; HCI and Design
Team: Gabriella, Janez, Annika, Rachel, Gerard, Natasa, Eduarda, Gavin, Jamie
Also shown: Vinay, Aleks Ignjatovic, Ben Shneiderman, Elizabeth Bonsignore, Cody Dunne, Dana Rotman, Marc Smith, Derek Hansen, Tom Lee

Research Areas: Web and On-line Communities; Content Analysis and Rich UI; Mobile and Cross Platform Media
Projects
• InSite Live: Web site structure analysis and decomposition into subsites
• Social Footprints: Analysis of social interaction in online communities
• NodeXL: Interactive graph analysis and visualization
• Research Desktop: Research in information management and tagging practices in the Desktop environment
• Social IR: Extension of IR models with social network and models of approval, trust and reputation
• weConnect: Investigating narrow-cast of personalized content in close relationships and potential for mobile advertising
• VideoSnaps: Investigating concepts and services for cross platform media editing and streaming

Research Platforms
• Methodology: how to develop mobile and social applications
• Integration with the ecosystem: pre-requisites for adoption
• Connect the quantitative analyses with the qualitative analyses
• Principles, mechanisms, and tools for knowledge management
• Trust and reputation
• Shared summaries and overviews
Research Areas: Web and On-line Communities; Content Analysis and Rich UI; Mobile and Cross Platform Media
Projects: InSite Live, Social Footprints, NodeXL, Research Desktop, Social IR, weConnect, VideoSnaps

INTERACTIONS IN TECHNOLOGY MEDIATED COMMUNITIES
social as a research topic

Community Question-Answering Online Communities Web Boards Question Answering Distribution Lists Forums Newsgroups Blogs

Community Question-Answering Question Answers

Content Organization, Browsing and Search Topic categories Tags

100 Most Frequent Tags on Live QnA

Politics

100 Most Frequent Tags on Live QnA Fun, Life, People, Philosophy

Community Analysis and Health Index
Towards a sustainable community
• Support novice users in becoming active community participants
• Support frequent users in increasing the volume and quality of their content contributions
• Promote high quality contributions (for external exploitation through search)
85% of new users start with a question
72% never ask a question again
5% will engage in answering
61% of questions from new users don’t get more than 1 answer (23% get 0 answers)

Example: Investigate QnA Voting Practice
Approach:
• Statistical analysis of the user logs
• Manual inspection of the content
– Taxonomy of the users’ intent; to be evolved by the community of practice
• Define the basic features of the individuals and governing assumptions
• Derive a mathematical model of the voters’ metric
• Observe the properties with regard to irregular voting behaviour: random voting or collusion
Social network activities:
– Answer to a question
– Comment on an answer
– Vote on the best answer
[Diagram: nodes Q, A, A, C, V; edge labels: answer to, comment to, vote on]

Which Answer to Vote On?
• Different ‘best answer’ connotations
The notion of the ‘best answer’ depends on the context and nature of the answers, from correctness and usefulness to entertainment value.
• Social bias
Assignment of votes may be influenced by social and personal ties, voter’s perception, familiarity, and preferential treatment of familiar community members.
“Microsoft or Apple? Feel free to argue and point out their good and bad points. Also feel free to rebut or debate on other people's standpoint. Best argument/answer will get my friends’ and my "best answer" reward.”
• Self-promotion
Individuals’ aspirations to excel in their social status can adversely affect the quality of their contribution to the community.

Reliability as Conformity?
• Reliability of a voter
– The relative reliability of two voters is determined by the proportion of all the voters who made the same choice of the best answer.
– The reliability scores represent a fixed point of the function F (apply the Brouwer Fixed Point Theorem).
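The slide's formula for F is not preserved in this transcript, so the following is only a minimal sketch of the general idea, under stated assumptions: a voter's reliability is taken to be the average reliability-weighted share of voters who chose the same answer, and the fixed point is approximated by plain iteration. The function names, the update rule, and the normalisation by the maximum score are all hypothetical illustrations, not the authors' method.

```python
from collections import defaultdict


def fixed_point_reliability(votes, max_iter=200, tol=1e-9):
    """Iterate an assumed conformity-based update until it stabilises.

    votes: dict mapping (voter, question) -> chosen answer.
    Returns a dict voter -> reliability, scaled so the top score is 1.0.
    """
    voters = sorted({v for v, _ in votes})
    r = {v: 1.0 for v in voters}
    for _ in range(max_iter):
        # Reliability-weighted support per (question, answer) and per question.
        support = defaultdict(float)
        total = defaultdict(float)
        for (v, q), a in votes.items():
            support[(q, a)] += r[v]
            total[q] += r[v]
        # Assumed update: mean weighted share of voters agreeing with v.
        sums = defaultdict(float)
        counts = defaultdict(int)
        for (v, q), a in votes.items():
            sums[v] += support[(q, a)] / total[q]
            counts[v] += 1
        new_r = {v: sums[v] / counts[v] for v in voters}
        top = max(new_r.values())
        new_r = {v: x / top for v, x in new_r.items()}
        if max(abs(new_r[v] - r[v]) for v in voters) < tol:
            return new_r
        r = new_r
    return r


def best_answers(votes, r):
    """Pick, per question, the answer with the largest reliability-weighted support."""
    support = defaultdict(float)
    for (v, q), a in votes.items():
        support[(q, a)] += r[v]
    best = {}
    for (q, a), s in support.items():
        if q not in best or s > support[(q, best[q])]:
            best[q] = a
    return best
```

In this simplified update a persistent minority voter's score keeps shrinking towards zero, which is one reason a real scoring function would likely be smoother than the one assumed here.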

Real Data Analysis
[Figure: Vote Count vs. FP Method; panels ‘FUN’ and ‘PHILOSOPHY’]

Random Voting
• Simulate random voting by drawing votes from a uniform distribution in place of Zipf’s Law
• We vary the percentage of affected questions (from 1% to 10%) and the percentage of voters who voted randomly (from 1% to 10%)
• The number of best answers changed is lower for fixed point scoring (right) than for plurality voting (left)
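The random-voting experiment can be sketched roughly as follows, for the plurality baseline only. This is an illustrative simulation under assumptions not taken from the slides: the numbers of questions, voters and answers, the 1/rank Zipf weights, and the tie-breaking rule are all hypothetical.

```python
import random
from collections import Counter


def plurality_winner(votes):
    """Answer with the most votes; ties go to the smallest answer id."""
    counts = Counter(votes)
    return min(counts, key=lambda a: (-counts[a], a))


def count_changed_best_answers(n_questions=200, n_voters=50, n_answers=5,
                               frac_questions=0.10, frac_voters=0.10, seed=0):
    """Count questions whose plurality winner changes under random voting."""
    rng = random.Random(seed)
    answers = list(range(n_answers))
    # Assumed Zipf-like preference: answer at rank k has weight 1/k.
    weights = [1.0 / rank for rank in range(1, n_answers + 1)]
    baseline = [rng.choices(answers, weights=weights, k=n_voters)
                for _ in range(n_questions)]
    affected = set(rng.sample(range(n_questions),
                              int(frac_questions * n_questions)))
    random_voters = rng.sample(range(n_voters), int(frac_voters * n_voters))
    changed = 0
    for q, votes in enumerate(baseline):
        perturbed = list(votes)
        if q in affected:
            for v in random_voters:
                # Random voter: uniform choice instead of the Zipf draw.
                perturbed[v] = rng.choice(answers)
        if plurality_winner(votes) != plurality_winner(perturbed):
            changed += 1
    return changed
```

Sweeping `frac_questions` and `frac_voters` from 0.01 to 0.10 would mirror the ranges on the slide. Comparing against fixed point scoring would need a scoring function in place of `plurality_winner`.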

Ballot Stuffing
• Simulate the collusion: fix the number of involved voters (‘stuffers’, here 4 and 10) and the percentage of questions affected (here 50%)
• Both majority voting and fixed point scoring are susceptible to ballot stuffing
• Fixed point scoring flags the outliers and helps identify collusion

Detecting Sybil Attack - Leveraging Social Networks
• Social networks are fast mixing
– Random walks quickly converge to the stationary distribution
• Sybil attacks induce a bottleneck cut
– Fast mixing is disrupted
• Knowledge of an a priori honest node
– Breaks symmetry
[Diagram labels: Honest Nodes, Sybil Nodes, Attack edges]
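To illustrate the bottleneck intuition, here is a toy sketch rather than any particular published detection scheme. It assumes a hypothetical graph of two cliques (honest and Sybil) joined by a single attack edge, and measures how often short random walks started at a known honest node end inside the Sybil region. All sizes, names and parameters are assumptions.

```python
import random


def build_graph(n_honest=10, n_sybil=10):
    """Two cliques joined by one attack edge (hypothetical toy topology)."""
    adj = {i: set() for i in range(n_honest + n_sybil)}

    def add_clique(nodes):
        for u in nodes:
            for v in nodes:
                if u != v:
                    adj[u].add(v)

    add_clique(range(n_honest))
    add_clique(range(n_honest, n_honest + n_sybil))
    # Single attack edge between the last honest node and the first Sybil node.
    adj[n_honest - 1].add(n_honest)
    adj[n_honest].add(n_honest - 1)
    return {u: sorted(vs) for u, vs in adj.items()}


def sybil_end_fraction(adj, honest_seed, sybil_nodes,
                       walk_len=5, n_walks=2000, seed=0):
    """Fraction of short random walks from the honest seed ending at a Sybil node."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_walks):
        node = honest_seed
        for _ in range(walk_len):
            node = rng.choice(adj[node])
        hits += node in sybil_nodes
    return hits / n_walks
```

In a fast-mixing graph of this size, short walks would spread roughly evenly over all nodes. Here the single attack edge keeps most walks on the honest side, which is the asymmetry the known honest node makes observable.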

LEVERAGING COMMUNITIES OF PRACTICE social as facilitator of research

Issue: the Scale and the Limitations of Humans
• We require user input in order to inform the systems’ design and verify our hypotheses
• In search we build test collections:
– A set of topics, a corpus of documents, and relevance judgements for documents in the corpus
• Question: how do we build test collections for books?
– Search over Web pages involves low cost of inspection of individual Web pages
– Search over Book collections increases the cost due to the size and the coherence of topics across pages

Web scenario

Book scenario …

Read’n Play
• Architecture comprises four functional layers
• Implemented using Web services - no client based interaction with the content
• Can be repurposed for other research projects
Layers:
– Social game support
– User annotations
– Search and navigation support
– Data store and searchable index: Image Database (Scanned Document Page), OCR Text Database, Text and Metadata Index

Social game
• Explorers
– Reward for finding relevant content
• Reviewers
– Reward for finding mistakes in explorers’ work
– Reward for re-assessment (agreement is not necessary)
• Conflicts
• Penalty

Explore

Pilot Study
Participants
• Open to everyone
• 48 registered + 81 INEX participants
• 17 contributed assessments (16 INEX participants)
Collected data
• Relevance assessments
– 3,478 judged books with
– 23,098 judged pages from
– 29 topics
• Log data
– 32,112 navigational events
– 45,126 judgement events
– 2,970 ‘search inside a book’ events
Incentives for participation
• Tangible, e.g., monetary
– Winners: Microsoft hardware and software
– All: Access to collected data
• Intangible reward, e.g., fun, social gain
– Leader board: Social status

Feasibility
Averages across the 17 assessors
• 7.2 days with activity, out of 42
• 11.4 hours judging time
• 220 judged books
Average effort
• 7.3 minutes per relevant book, 2.7 minutes per irrelevant book (comparable to INEX 2003 ad hoc track)
• 37 seconds per relevant page, 22 seconds per irrelevant page
Extrapolated statistics
• 1000 books takes 52.7 hours, 1 : 9 ratio of relevant : irrelevant
• 33.3 days to judge one topic, with 95 minutes a day
• 70 topics, 200 books per topic with 20 judges takes 36.9 days
• 737 judges to complete task in one hour
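The extrapolated figures can be reproduced with simple arithmetic from the per-book averages. The sketch below assumes, as the numbers suggest, that "one topic" means 1000 books and that the 1 : 9 relevant-to-irrelevant ratio also holds for the 200-book case. The function and variable names are hypothetical.

```python
REL_MIN = 7.3   # minutes per relevant book (from the slide)
IRR_MIN = 2.7   # minutes per irrelevant book (from the slide)


def judging_minutes(n_books):
    """Minutes to judge n_books, assuming a 1:9 relevant:irrelevant ratio."""
    relevant = n_books / 10
    irrelevant = n_books - relevant
    return relevant * REL_MIN + irrelevant * IRR_MIN


hours_1000_books = judging_minutes(1000) / 60      # about 52.7 hours
days_one_topic = judging_minutes(1000) / 95        # about 33.3 days at 95 min/day
total_hours = 70 * judging_minutes(200) / 60       # about 737 judge-hours
hours_per_judge = total_hours / 20                 # about 36.9 with 20 judges

print(round(hours_1000_books, 1), round(days_one_topic, 1),
      round(total_hours), round(hours_per_judge, 1))
```

Under these assumptions the 36.9 figure comes out as hours of judging per judge. At 95 minutes a day, the same workload would be about 23.3 days per judge.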

Productivity Games

Summary
• Understanding social media requires a cross-disciplinary approach and new methods to study them
• Defining the characteristics and metrics of ‘healthy communities’ is a challenging task
• ‘Social’ is increasing its role as an enabler for large scale experiments
Generally, we need to reflect on the methods and approaches we take when studying online communities.

Thank you Microsoft Research Cambridge