Published by Ralph Roy Morton. Modified over 9 years ago.
Summarization, Search, and Community Analysis in Social Networks

David Fuhry (fuhry@cse.ohio-state.edu | http://www.cse.ohio-state.edu/~fuhry)
Yiye Ruan (ruan@cse.ohio-state.edu | http://www.cse.ohio-state.edu/~ruan)
Advisor: Srinivasan Parthasarathy (srini@cse.ohio-state.edu | http://www.cse.ohio-state.edu/~srini)
Department of Computer Science and Engineering, The Ohio State University

Social network mining and analysis. Summarize, query, and analyze large social network datasets.

My work:
1. Co-cluster users and content to summarize user content relationships.
2. Define a new similarity index to efficiently answer complex queries.
3. Identify and visualize differences between communities by local/global term analysis.

[Figure: Types of social network queries, ranging from simple to complex: Time, Content, Location, Network, Content+Time, Content+Location, Content+Time+Loc, …]

[Panel titles: Discovering Community Differences; Complex Query Support for Social Networks; Succinctly Summarizing User Content Relationships]

Co-clustering: Find the best overlapped hyperrectangles to represent the data. NP-hard. Our greedy approximation algorithm HYPER utilizes frequent itemsets as candidates. Extension HYPER+ allows false positives.

In KDD 2008, one hyperrectangle is social media researchers:
Keywords: network, social, massiv, behavior, process, time
Authors: A Sridharan, C Faloutsos, D Cosley, DB Neill, DP Huttenlocher, DJ Crandall, J Bolot, JG Schneider, JM Kleinberg, J Leskovec, K Das, L Backstrom, M Seshadri, S Suri, S Machiraju

Summarizing transactional databases with overlapped hyperrectangles. Yang Xiang, Ruoming Jin, David Fuhry, and Feodor F. Dragan. DMKD (2011) 23:215-251.
Local/Global Term Analysis for Discovering Community Differences in Social Networks. David Fuhry, Yiye Ruan, and Srinivasan Parthasarathy. ACM Web Science 2012.
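The greedy co-clustering idea described above can be illustrated with a small sketch. This is not the HYPER implementation: it is a simplified greedy cover under stated assumptions. The toy user/term data, the function names `candidate_itemsets` and `greedy_cover`, the brute-force candidate enumeration, and the price function (rectangle size |users| + |terms| divided by newly covered cells) are all hypothetical choices made for illustration. Each chosen hyperrectangle is a set of users paired with a set of terms that every one of those users contains, so no false positives are introduced.

```python
from itertools import combinations

# Toy user -> terms data, invented for illustration only.
transactions = {
    "u1": {"network", "social", "time"},
    "u2": {"network", "social", "time"},
    "u3": {"network", "social"},
    "u4": {"camera", "lens"},
    "u5": {"camera", "lens"},
}


def candidate_itemsets(transactions, min_support=2, max_size=3):
    """Brute-force frequent itemsets (fine for toy data), plus all singletons."""
    items = sorted(set().union(*transactions.values()))
    # Singletons are always candidates so that every cell can be covered.
    candidates = {frozenset([item]) for item in items}
    for size in range(2, max_size + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = sum(1 for terms in transactions.values() if itemset <= terms)
            if support >= min_support:
                candidates.add(itemset)
    # Sort so that ties are broken deterministically.
    return sorted(candidates, key=lambda s: (len(s), sorted(s)))


def greedy_cover(transactions, candidates):
    """Greedily pick (users, terms) rectangles until every cell is covered."""
    uncovered = {(u, t) for u, terms in transactions.items() for t in terms}
    summary = []
    while uncovered:
        best, best_price = None, float("inf")
        for itemset in candidates:
            # Users that contain the whole itemset and still gain a new cell.
            users = frozenset(
                u for u, terms in transactions.items()
                if itemset <= terms and any((u, t) in uncovered for t in itemset)
            )
            if not users:
                continue
            gained = sum(1 for u in users for t in itemset if (u, t) in uncovered)
            # Assumed price: description size per newly covered cell.
            price = (len(users) + len(itemset)) / gained
            if price < best_price:
                best, best_price = (users, itemset), price
        users, itemset = best
        summary.append((users, itemset))
        uncovered -= {(u, t) for u in users for t in itemset}
    return summary


candidates = candidate_itemsets(transactions)
summary = greedy_cover(transactions, candidates)
for users, terms in summary:
    print(sorted(users), sorted(terms))
```

Because singleton itemsets are always candidates, the loop terminates with every user/term cell covered. A real system would replace the brute-force enumeration with a frequent itemset miner.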
Domain Challenges:
Scale: The social network domain involves huge amounts of rapidly changing data, e.g. 1 billion tweets every three days.
Interpretation: How can we summarize huge graph data, including for human consumption?
Performance: How can we quickly and interactively find users, messages, and communities of interest in these datasets?
Analysis: Once we find different communities of users, how can we identify their similarities and differences?

Many large datasets give us relationships between users and content terms. For processing, storage, and visualization, how can we summarize co-occurring groups of users and terms?

Social networks are huge, and traditional indexes can only answer simple queries. How can we build an index to efficiently answer more complex queries?

Complex queries can be expressed in terms of similarity. We define separate similarity measures for content, time, location, and network distance. Instead of building a separate index for each measure, we build a single integrated index over all similarity measures, which avoids problems with skew.

We integrate both relationship structure and content in social networks to understand what communities are talking about, what they are not talking about, and how communities differ.

Between the Nikon and Olympus communities, the Olympus community talks more about blogs. Between the camera and global communities, the camera community talks less about health, teeth, and success.

[Panel title: Overview]

Acknowledgement: This work is supported by NSF Social-Computational Systems (SoCS) grant IIS-1111118: Social Media Organizational Sensemaking in Emergency Response. (Parthasarathy @ Ohio State, Sheth @ Wright State)
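The community comparisons described above (for example, which terms one community uses more than another) can be illustrated with a minimal sketch. The poster does not state the scoring formula, so this assumes a smoothed log-ratio of relative term frequencies, which is one common way to compare vocabularies and not necessarily the authors' method. The messages, the function names `term_counts` and `log_ratio`, and the smoothing constant are hypothetical.

```python
import math
from collections import Counter

# Made-up messages for two communities, for illustration only.
nikon_messages = ["new lens review", "lens firmware update"]
olympus_messages = ["blog post about lens", "new blog entry", "photo blog"]


def term_counts(messages):
    """Count whitespace-separated, lowercased terms across all messages."""
    return Counter(word for msg in messages for word in msg.lower().split())


def log_ratio(counts_a, counts_b, smoothing=1.0):
    """Score each term by log(P(term | A) / P(term | B)) with add-k smoothing.

    Positive scores mean community A uses the term relatively more often,
    negative scores mean community B does.
    """
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + smoothing * len(vocab)
    total_b = sum(counts_b.values()) + smoothing * len(vocab)
    return {
        term: math.log(
            ((counts_a[term] + smoothing) / total_a)
            / ((counts_b[term] + smoothing) / total_b)
        )
        for term in vocab
    }


scores = log_ratio(term_counts(olympus_messages), term_counts(nikon_messages))
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {score:+.2f}")
```

Passing two specific communities gives a local comparison; passing one community and the counts for the whole network as `counts_b` gives a comparison against the global background.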