Understanding and Organizing User Generated Data

Presentation transcript:

Understanding and Organizing User Generated Data: Methods and Applications

August 16, 1977 June 25, 2009

officially pronounced dead

Social Media
In 1977, only a few people could address others by means of radio and TV. They broadcast to the public whatever they felt was interesting. Nowadays things look different. People are networked and have become directly involved in the information flow. Similarly, media items are interconnected, and they are both produced and consumed by people. A significant share of today's information flow occurs over the Internet, using computers and, more and more, mobile devices. The result is novel data structures that demand investigation. This is the context in which this thesis lives: we investigate and exploit the data and structures that have emerged from the use of social media.

This talk: Results that are directly applicable in end-user services
Part 1: Direct Links
Part 2: Similarity

Part 1: Direct Links

High clustering: The probability that two of my friends are (or are becoming) friends themselves is high!
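The clustering property can be made concrete with the local clustering coefficient: the fraction of pairs of a person's friends who are friends with each other. The sketch below is illustrative only; the toy graph and the function name `local_clustering` are assumptions, not material from the talk.

```python
from itertools import combinations


def local_clustering(graph, node):
    """Fraction of pairs of `node`'s friends that are friends themselves."""
    friends = graph[node]
    if len(friends) < 2:
        return 0.0  # fewer than two friends: no pair to check
    pairs = list(combinations(sorted(friends), 2))
    linked = sum(1 for a, b in pairs if b in graph[a])
    return linked / len(pairs)


# Hypothetical toy friendship graph (undirected, stored as adjacency sets).
friends = {
    "me": {"ann", "bob", "cat"},
    "ann": {"me", "bob"},
    "bob": {"me", "ann"},
    "cat": {"me"},
}

print(local_clustering(friends, "me"))
```

In this toy graph, one of the three pairs of "me"'s friends (ann and bob) is connected, so the coefficient is 1/3.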

VENETA: Friend Finding

I want to meet people! Same contact = friend of a friend. Privacy preserving!

Cluestr: Contact Recommendation

Clustering
Survey: Communities are often addressed as groups!
"There's no training tonight!"
"Let's have a BBQ tomorrow!"
"Our next meeting is at 2pm!"

Clustering: Communities can be identified using a clustering algorithm. Recommend contacts from the clusters of already selected contacts.
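A minimal sketch of the recommendation step, assuming the clusters have already been computed by some clustering algorithm (not shown). The scoring rule (count how many selected contacts share a cluster with a candidate), the function name `recommend_contacts`, and the sample data are all assumptions made for illustration, not the method used in Cluestr.

```python
from collections import Counter


def recommend_contacts(clusters, selected, limit=5):
    """Rank unselected contacts by how many selected contacts share a cluster with them."""
    selected = set(selected)
    scores = Counter()
    for members in clusters:
        overlap = len(selected & members)
        if overlap == 0:
            continue  # cluster is unrelated to the current selection
        for contact in members - selected:
            scores[contact] += overlap
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [contact for contact, _ in ranked[:limit]]


# Hypothetical clusters of an address book (clusters may overlap).
clusters = [
    {"anna", "ben", "carla", "dave"},
    {"dave", "eve", "frank"},
    {"gina", "hugo"},
]

print(recommend_contacts(clusters, ["anna", "dave"]))
```

Each newly selected contact changes the overlap counts, so the recommendations can be recomputed after every selection.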

Group (i.e. "invited" contacts), recommended contacts, updated group, new recommendations. Considerable time savings possible!

Part 2: Similarity

Academic Conferences

Similarity between Scientific Conferences (publication, conference, author)

Confsearch (screenshot). Highlights: ratings; related conference search.

Music Similarity

How similar is Michael Jackson to Elvis Presley?

Audio Analysis Usage Data

"Users who listen to Elvis also listen to ..."
Ingredients: #common users (co-occurrences), occurrences of song A, occurrences of song B.
Problem: Only pairwise similarity, but no global view!
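A sketch of a usage-based similarity built from these three quantities. The slide only names the ingredients; the cosine-style normalization used here (common users divided by the square root of the product of the two occurrence counts) is an assumption, as are the function name and the sample listening data.

```python
import math
from collections import defaultdict
from itertools import combinations


def cooccurrence_similarity(listening_history):
    """Return {(song_a, song_b): similarity} from a {user: [songs]} mapping."""
    occurrences = defaultdict(int)  # number of users per song
    common = defaultdict(int)       # number of users per song pair
    for songs in listening_history.values():
        unique = sorted(set(songs))
        for song in unique:
            occurrences[song] += 1
        for a, b in combinations(unique, 2):
            common[(a, b)] += 1
    # ASSUMPTION: cosine-style normalization; the talk does not spell it out.
    return {
        (a, b): n / math.sqrt(occurrences[a] * occurrences[b])
        for (a, b), n in common.items()
    }


# Hypothetical listening data.
history = {
    "u1": ["Elvis", "Jackson"],
    "u2": ["Elvis", "Jackson", "AC/DC"],
    "u3": ["Elvis"],
    "u4": ["AC/DC"],
}

similarity = cooccurrence_similarity(history)
print(similarity[("Elvis", "Jackson")])
```

Each value describes one pair only, which is exactly the limitation named on the slide: there is no global picture of how all songs relate.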

Getting a global view... (d = ?)
1. Pairwise similarities
2. Graph for all-pairs distances (shortest path)
3. MDS to embed graph into Euclidean space while approximately preserving distances (=> global view)
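The all-pairs distance step can be sketched with the Floyd-Warshall algorithm, which is cubic in the number of nodes and therefore only suitable for small graphs. How similarities are turned into edge lengths is not stated in the talk, so the edge lengths below are simply given as hypothetical inputs.

```python
import math


def all_pairs_shortest_paths(nodes, edges):
    """Floyd-Warshall on an undirected graph given as {(u, v): length}."""
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    for (u, v), length in edges.items():
        dist[u][v] = min(dist[u][v], length)
        dist[v][u] = min(dist[v][u], length)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical graph: a chain a-b-c-d plus one long direct edge a-d.
nodes = ["a", "b", "c", "d"]
edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0, ("a", "d"): 5.0}

dist = all_pairs_shortest_paths(nodes, edges)
print(dist["a"]["d"])
```

The resulting matrix defines a distance for every pair of nodes, including pairs that have no direct edge, which is what an embedding step needs as input.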

Classical Multidimensional Scaling (MDS)
Principal Component Analysis (PCA): Project onto the hyperplane that maximizes variance. Computed by solving an eigenvalue problem.
Basic idea of MDS: Assume that the exact positions y1,...,yN in a high-dimensional space are given. It can be shown that, knowing only the distances d(yi, yj) between points, we can calculate the same result as applying PCA to y1,...,yN.
Problem: Complexity O(n^2 log n). Use an approximation: LMDS [de Silva and Tenenbaum, 2002]
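A minimal NumPy sketch of classical MDS, assuming a full symmetric distance matrix is available. It double-centers the squared distances to obtain a Gram matrix and keeps the top eigenvectors. This is the plain textbook variant, not the LMDS approximation mentioned on the slide; the function name and the test points are illustrative.

```python
import numpy as np


def classical_mds(distances, dims=2):
    """Embed points into `dims` dimensions from a symmetric distance matrix."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double centering turns squared distances into inner products.
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]       # largest eigenvalues first
    top_vals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical points whose positions we pretend not to know.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
distances = np.linalg.norm(points[:, None] - points[None, :], axis=-1)

embedding = classical_mds(distances, dims=2)
recovered = np.linalg.norm(embedding[:, None] - embedding[None, :], axis=-1)
print(np.allclose(recovered, distances))
```

The recovered coordinates differ from the originals by a rotation or reflection, but the pairwise distances are preserved, which is all the map needs.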

Problem: Some links erroneously shortcut certain paths.
Use embedding as estimator for distance: Remove edges that get stretched most and re-embed.
Example: Kleinberg graph (20x20 grid with random edges). Kleinberg graph: grid edges + one random edge per node (follows 1/d^2 distribution, d = Manhattan distance).
[Figure panels: original embedding (spring embedder); after 6 rounds; after 12 rounds; after 30 rounds]
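One pruning round could look like the sketch below. It assumes all edges have the same target length, so an edge's Euclidean length in the current embedding serves as its stretch. The fraction removed per round and the function name `prune_most_stretched` are assumptions, and the re-embedding step is not shown.

```python
import math


def prune_most_stretched(edges, coords, fraction=0.1):
    """Return `edges` without the `fraction` that is longest in the embedding."""
    by_length = sorted(
        edges,
        key=lambda edge: math.dist(coords[edge[0]], coords[edge[1]]),
        reverse=True,
    )
    n_remove = int(len(edges) * fraction)
    removed = set(by_length[:n_remove])
    return [edge for edge in edges if edge not in removed]


# Hypothetical embedding: edge (0, 3) is strongly stretched.
coords = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (10.0, 0.0)}
edges = [(0, 1), (1, 2), (0, 3), (2, 3)]

print(prune_most_stretched(edges, coords, fraction=0.25))
```

Repeating this after each re-embedding gradually removes the long-range shortcut edges while keeping edges that the embedding reproduces well.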

Play "random" songs that match my mood! User feedback: skip or listen.

Realization using our map? After only a few skips, we know pretty well which songs match the user's mood.

Browsing Covers: "On my shelf, AC/DC is next to ZZ Top..."

Conclusion: "from users for users"

Thank you

List of Publications
Social Audio Features for Advanced Music Retrieval Interfaces. M. Kuhn, R. Wattenhofer, S. Welten. Multimedia 2010.
Visually and Acoustically Exploring the High-Dimensional Space of Music. L. Bossard, M. Kuhn, R. Wattenhofer. SocialCom 2009.
Cluestr: Mobile Social Networking for Enhanced Group Communication. R. Grob, M. Kuhn, R. Wattenhofer, M. Wirz. GROUP 2009.
From Web to Map: Exploring the World of Music. O. Goussevskaia, M. Kuhn, M. Lorenzi, R. Wattenhofer. WI 2008.
VENETA: Serverless Friend-of-Friend Detection in Mobile Social Networking. M. von Arb, M. Bader, M. Kuhn, R. Wattenhofer. WiMob 2008.
Exploring Music Collections on Mobile Devices. O. Goussevskaia, M. Kuhn, R. Wattenhofer. MobileHCI 2008.
The Layered World of Scientific Conferences. M. Kuhn and R. Wattenhofer. APWeb 2008.
The Theoretic Center of Computer Science. M. Kuhn and R. Wattenhofer. (Invited paper) SIGACT News, December 2007.
Layers and Hierarchies in Real Virtual Networks. WI 2007.