Object Fusion in Geographic Information Systems
Catriel Beeri, Yaron Kanza, Eliyahu Safra, Yehoshua Sagiv
Hebrew University, Jerusalem, Israel

Presentation transcript:

The Goal: Fusing Objects that Represent the Same Real-World Entity
Example: three data sources that provide information about hotels in Tel-Aviv
- MAPI: the Survey of Israel
- MAPA: a commercial corporation
- MUNI: the municipality of Tel-Aviv

The Goal: Fusing Objects that Represent the Same Real-World Entity
Each data source provides data that the other sources do not provide
- MAPI: cadastral and building information
- MAPA: tourist information
- MUNI: municipal information
[Figure labels: "Hotel Rank", "Is there a nearby parking lot?", "polygon", "points"]

The Goal: Fusing Objects that Represent the Same Real-World Entity
Object fusion enables us to utilize the different perspectives of the data sources
- MAPI: cadastral and building information
- MAPA: tourist information
- MUNI: municipal information
[Figure label: Radison Moria]

Why Are Locations Used for Fusion?
- There are no global keys to identify objects that should be fused
- Names cannot be used
  - Change often
  - May be missing
  - May be in different languages
- It seems that locations are keys:
  - Each spatial object includes location attributes
  - In a "perfect world," two objects that represent the same entity have the same location

Why Is It Difficult to Use Locations?
- In real maps, locations are inaccurate
- The map on the left is an overlay of the three data sources about hotels in Tel-Aviv
- For example, the Basel Hotel has three different locations in the three data sources

Inaccuracy => Difficult to Use Locations
It is difficult to distinguish between:
1. A pair of objects that represent close entities
2. A pair of objects that represent the same entity
Partial coverage complicates the problem
[Figure: objects labeled a and 2, with a question mark]

Fusion Methods
Assumptions:
- There are only two data sources
- Each data source has at most one object for each real-world entity, i.e., the matching is one-to-one

Corresponding Objects
Objects from two distinct sources that represent the same real-world entity

Fusion Sets
A fusion algorithm creates two types of fusion sets:
- A set with a single object
- A set with a pair of objects, one from each data source

Confidence
- Our methods are heuristics, so they may produce incorrect fusion sets
- A confidence value between 0 and 1 is attached to each fusion set
- It indicates the degree of certainty in the correctness of the fusion set
[Figure: fusion sets with high confidence and fusion sets with low confidence]

The Mutually-Nearest Method
The result includes:
- All mutually-nearest pairs
- All singletons, when an object is not part of a pair
[Figure: input (objects 1, a, 2), finding nearest objects, resulting fusion sets]
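The mutually-nearest rule can be sketched in a few lines of Python. This is only an illustration under assumptions not stated on the slide: objects are reduced to 2-D points, distance is Euclidean, and the names `nearest` and `mutually_nearest` are hypothetical.

```python
from math import dist

def nearest(p, others):
    """Index of the point in `others` that is closest to p (Euclidean)."""
    return min(range(len(others)), key=lambda j: dist(p, others[j]))

def mutually_nearest(A, B):
    """Return (pairs, singles_a, singles_b) as lists of indices.

    A pair (i, j) is emitted only when B[j] is the nearest object to A[i]
    AND A[i] is the nearest object to B[j]. Every other object becomes a
    singleton fusion set.
    """
    if not A or not B:
        return [], list(range(len(A))), list(range(len(B)))
    near_a = [nearest(a, B) for a in A]   # nearest B-object for each A-object
    near_b = [nearest(b, A) for b in B]   # nearest A-object for each B-object
    pairs = [(i, j) for i, j in enumerate(near_a) if near_b[j] == i]
    paired_a = {i for i, _ in pairs}
    paired_b = {j for _, j in pairs}
    singles_a = [i for i in range(len(A)) if i not in paired_a]
    singles_b = [j for j in range(len(B)) if j not in paired_b]
    return pairs, singles_a, singles_b
```

For example, with `A = [(0, 0), (10, 0)]` and `B = [(1, 0)]`, the object at `(0, 0)` and the object at `(1, 0)` choose each other, while `(10, 0)` is left as a singleton.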

The Probabilistic Method
- An object from one dataset has a probability of choosing an object from the other dataset
- The probability is inversely proportional to the distance
- Confidence of a pair: the probability of the mutual choice
- Confidence of a singleton: the probability that the object is not chosen by any object
- A threshold value is used to discard fusion sets with low confidence
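A minimal sketch of these rules follows. The slide gives no exact formula, so several details here are assumptions: the choice probability is taken to be 1/distance normalized over all candidates, choices are treated as independent, objects are 2-D points, and the names `choice_probs`, `probabilistic_fusion`, `EPS` and the default threshold are hypothetical.

```python
from math import dist, prod

EPS = 1e-9  # assumed guard against division by zero for coincident points

def choice_probs(src, dst):
    """probs[i][j]: probability that src[i] chooses dst[j] (assumed ~ 1/distance)."""
    probs = []
    for p in src:
        weights = [1.0 / (dist(p, q) + EPS) for q in dst]
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs

def probabilistic_fusion(A, B, threshold=0.5):
    """Return a list of ((i, j), confidence); None marks the missing side of a singleton."""
    pa = choice_probs(A, B)   # A-objects choosing B-objects
    pb = choice_probs(B, A)   # B-objects choosing A-objects
    result = []
    # Pair confidence: probability of the mutual choice.
    for i in range(len(A)):
        for j in range(len(B)):
            conf = pa[i][j] * pb[j][i]
            if conf >= threshold:
                result.append(((i, j), conf))
    # Singleton confidence: probability that no object of the other set chooses it.
    for i in range(len(A)):
        conf = prod(1 - pb[j][i] for j in range(len(B)))
        if conf >= threshold:
            result.append(((i, None), conf))
    for j in range(len(B)):
        conf = prod(1 - pa[i][j] for i in range(len(A)))
        if conf >= threshold:
            result.append(((None, j), conf))
    return result
```

Raising `threshold` discards more low-confidence fusion sets, which trades recall for precision.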

Mutual Influences Between Probabilities
[Figure: Case I and Case II, two configurations of the objects 1, a, 2 and b, with the probabilities we expect in each case]

The Normalized-Weights Method
- Normalization captures mutual influence
- Iterating the normalization leads to equilibrium
- Results are superior to those of the previous two methods (at a cost of only a small increase in the computation time)
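The slide does not spell out the procedure, so the following is only one plausible reading of "normalize, then iterate to equilibrium", not necessarily the authors' algorithm. It assumes initial weights of 1/distance, an extra row and column standing for "no counterpart" with a hypothetical constant weight `no_match`, and alternating row and column normalization for a fixed number of rounds.

```python
from math import dist

def normalized_weights(A, B, no_match=0.1, iterations=50):
    """Return an (m+1) x (n+1) weight matrix after alternating normalization.

    W[i][j] is the weight of pairing A[i] with B[j]; the last column and the
    last row hold the weight of an object having no counterpart.
    All parameters and the update scheme are assumptions for illustration.
    """
    m, n = len(A), len(B)
    W = [[1.0 / (dist(a, b) + 1e-9) for b in B] + [no_match] for a in A]
    W.append([no_match] * n + [0.0])
    for _ in range(iterations):
        # Each A-object distributes a total weight of 1 over its options.
        for i in range(m):
            s = sum(W[i])
            W[i] = [w / s for w in W[i]]
        # Each B-object does the same, which feeds back into the rows.
        for j in range(n):
            s = sum(W[i][j] for i in range(m + 1))
            for i in range(m + 1):
                W[i][j] /= s
    return W
```

Because a column normalization changes the row sums and vice versa, each object's weights depend on the choices available to its neighbors, which is the mutual influence the slide refers to.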

Measuring the Quality of the Result
- E: entities in the world
- R: fusion sets in the result
- C: correct fusion sets in the result
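The formulas themselves did not survive in the transcript. Assuming the usual definitions, recall is |C| / |E| and precision is |C| / |R|; the helper below (a hypothetical name) computes both from the three set sizes.

```python
def recall_precision(num_entities, num_result, num_correct):
    """Recall = |C| / |E| and precision = |C| / |R| (standard definitions, assumed).

    num_entities: |E|, entities in the world
    num_result:   |R|, fusion sets in the result
    num_correct:  |C|, correct fusion sets in the result
    """
    recall = num_correct / num_entities if num_entities else 0.0
    precision = num_correct / num_result if num_result else 0.0
    return recall, precision
```

For instance, 10 entities, 8 reported fusion sets and 6 correct ones give a recall of 0.6 and a precision of 0.75.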

A Case Study: Hotels in Tel-Aviv
[Chart: recall and precision for four methods]
- State of the art: the traditional nearest neighbor (best results)
- Our three methods: mutually nearest, the probabilistic method, the normalized-weights method
All three methods perform much better than the nearest-neighbor method

Extensive tests on synthesized data are described in the paper

Conclusions
- The novelty of our approach is in developing efficient methods that find fusion sets with high recall and precision, using only the locations of objects
- You are invited to visit our poster and our web site
Thank you!