Exploiting Geographic Dependencies for Real Estate Appraisal Yanjie Fu Joint work with Hui Xiong, Yu Zheng, Yong Ge, Zhihua Zhou, Zijun Yao Rutgers, the.

Slides:



Advertisements
Similar presentations
Vincent W. Zheng, Yu Zheng, Xing Xie, Qiang Yang Hong Kong University of Science and Technology Microsoft Research Asia This work was done when Vincent.
Advertisements

Mining User Similarity Based on Location History Yu Zheng, Quannan Li, Xing Xie Microsoft Research Asia.
Temporal Query Log Profiling to Improve Web Search Ranking Alexander Kotov (UIUC) Pranam Kolari, Yi Chang (Yahoo!) Lei Duan (Microsoft)
By Venkata Sai Pulluri ( ) Narendra Muppavarapu ( )
Suleyman Cetintas 1, Monica Rogati 2, Luo Si 1, Yi Fang 1 Identifying Similar People in Professional Social Networks with Discriminative Probabilistic.
LEARNING INFLUENCE PROBABILITIES IN SOCIAL NETWORKS Amit Goyal Francesco Bonchi Laks V. S. Lakshmanan University of British Columbia Yahoo! Research University.
Detecting Nearly Duplicated Records in Location Datasets Microsoft Research Asia Search Technology Center Yu Zheng Xing Xie, Shuang Peng, James Fu.
Geographical and Temporal Similarity Measurement in Location-based Social Networks Chongqing University of Posts and Telecommunications KTH – Royal Institute.
Software Quality Ranking: Bringing Order to Software Modules in Testing Fei Xing Michael R. Lyu Ping Guo.
Chen Cheng1, Haiqin Yang1, Irwin King1,2 and Michael R. Lyu1
1 Learning to Detect Objects in Images via a Sparse, Part-Based Representation S. Agarwal, A. Awan and D. Roth IEEE Transactions on Pattern Analysis and.
Object Class Recognition Using Discriminative Local Features Gyuri Dorko and Cordelia Schmid.
Scalable Information-Driven Sensor Querying and Routing for ad hoc Heterogeneous Sensor Networks Maurice Chu, Horst Haussecker and Feng Zhao Xerox Palo.
Collaborative Ordinal Regression Shipeng Yu Joint work with Kai Yu, Volker Tresp and Hans-Peter Kriegel University of Munich, Germany Siemens Corporate.
Visual Recognition Tutorial
Lec 15 LU, Part 1: Basics and simple LU models (ch6.1 & 2 (A), ch (C1) Get a general idea of urban planning theories (from rading p (A)
Ensemble Learning (2), Tree and Forest
Sparse Real Estate Ranking with Online User Reviews and Offline Moving Behaviors Yanjie Fu Hui Xiong, Yu Zheng, Yong Ge, Zijun Yao, Yanchi Liu, Jing Yuan.
Walter Hop Web-shop Order Prediction Using Machine Learning Master’s Thesis Computational Economics.
Incomplete Graphical Models Nan Hu. Outline Motivation K-means clustering Coordinate Descending algorithm Density estimation EM on unconditional mixture.
Title: Spatial Data Mining in Geo-Business. Overview  Twisting the Perspective of Map Surfaces — describes the character of spatial distributions through.
Using Friendship Ties and Family Circles for Link Prediction Elena Zheleva, Lise Getoor, Jennifer Golbeck, Ugur Kuter (SNAKDD 2008)
Urban Point-of-Interest Recommendation by Mining User Check-in Behaviors 游晟佑
Latent (S)SVM and Cognitive Multiple People Tracker.
Modeling Relationship Strength in Online Social Networks Rongjing Xiang: Purdue University Jennifer Neville: Purdue University Monica Rogati: LinkedIn.
Machine Learning1 Machine Learning: Summary Greg Grudic CSCI-4830.
EM and expected complete log-likelihood Mixture of Experts
Using Neural Networks in Database Mining Tino Jimenez CS157B MW 9-10:15 February 19, 2009.
1 The Aggregate Rail Ridership Forecasting Model: Overview Dave Schmitt, AICP Southeast Florida Users Group November 14 th 2008.
Group Recommendations with Rank Aggregation and Collaborative Filtering Linas Baltrunas, Tadas Makcinskas, Francesco Ricci Free University of Bozen-Bolzano.
1 Formal Models for Expert Finding on DBLP Bibliography Data Presented by: Hongbo Deng Co-worked with: Irwin King and Michael R. Lyu Department of Computer.
COMMON EVALUATION FINAL PROJECT Vira Oleksyuk ECE 8110: Introduction to machine Learning and Pattern Recognition.
International Conference on Intelligent and Advanced Systems 2007 Chee-Ming Ting Sh-Hussain Salleh Tian-Swee Tan A. K. Ariff. Jain-De,Lee.
CIKM’09 Date:2010/8/24 Advisor: Dr. Koh, Jia-Ling Speaker: Lin, Yi-Jhen 1.
Exploring Online Social Activities for Adaptive Search Personalization CIKM’10 Advisor : Jia Ling, Koh Speaker : SHENG HONG, CHUNG.
Bug Localization with Machine Learning Techniques Wujie Zheng
Wang-Chien Lee i Pervasive Data Access ( i PDA) Group Pennsylvania State University Mining Social Network Big Data Intelligent.
Google News Personalization: Scalable Online Collaborative Filtering
Learning Geographical Preferences for Point-of-Interest Recommendation Author(s): Bin Liu Yanjie Fu, Zijun Yao, Hui Xiong [KDD-2013]
Exploiting Context Analysis for Combining Multiple Entity Resolution Systems -Ramu Bandaru Zhaoqi Chen Dmitri V.kalashnikov Sharad Mehrotra.
ECE 8443 – Pattern Recognition ECE 8423 – Adaptive Signal Processing Objectives: ML and Simple Regression Bias of the ML Estimate Variance of the ML Estimate.
A Passive Approach to Sensor Network Localization Rahul Biswas and Sebastian Thrun International Conference on Intelligent Robots and Systems 2004 Presented.
Chapter 11 Statistical Techniques. Data Warehouse and Data Mining Chapter 11 2 Chapter Objectives  Understand when linear regression is an appropriate.
Multi-Speaker Modeling with Shared Prior Distributions and Model Structures for Bayesian Speech Synthesis Kei Hashimoto, Yoshihiko Nankaku, and Keiichi.
Xutao Li1, Gao Cong1, Xiao-Li Li2
Learning to Rank From Pairwise Approach to Listwise Approach.
Intelligent DataBase System Lab, NCKU, Taiwan Josh Jia-Ching Ying, Eric Hsueh-Chan Lu, Wen-Ning Kuo and Vincent S. Tseng Institute of Computer Science.
Detecting New a Priori Probabilities of Data Using Supervised Learning Karpov Nikolay Associate professor NRU Higher School of Economics.
Trajectory Data Mining Dr. Yu Zheng Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Editor-in-Chief of ACM Trans.
Forecasting Fine-Grained Air Quality Based on Big Data Date: 2015/10/15 Author: Yu Zheng, Xiuwen Yi, Ming Li1, Ruiyuan Li1, Zhangqing Shan, Eric Chang,
The traffic noise influence in the housing market A case study for Lisbon Sandra Vieira Gomes PhD in Civil Engineering 1 Escola Superior de Actividades.
Date: 2011/1/11 Advisor: Dr. Koh. Jia-Ling Speaker: Lin, Yi-Jhen Mr. KNN: Soft Relevance for Multi-label Classification (CIKM’10) 1.
Dec. 13, 2003W 2 Implementation and Evaluation of an Adaptive Neighborhood Information Retrieval System for Mobile Users Yoshiharu Ishikawa.
NTU & MSRA Ming-Feng Tsai
1 Chapter 8: Model Inference and Averaging Presented by Hui Fang.
ICONIP 2010, Sydney, Australia 1 An Enhanced Semi-supervised Recommendation Model Based on Green’s Function Dingyan Wang and Irwin King Dept. of Computer.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Predicting User Interests from Contextual Information R. W. White, P. Bailey, L. Chen Microsoft (SIGIR 2009) Presenter : Jae-won Lee.
Learning to Rank: From Pairwise Approach to Listwise Approach Authors: Zhe Cao, Tao Qin, Tie-Yan Liu, Ming-Feng Tsai, and Hang Li Presenter: Davidson Date:
Boosting and Additive Trees (2)
System Control based Renewable Energy Resources in Smart Grid Consumer
Asymmetric Correlation Regularized Matrix Factorization for Web Service Recommendation Qi Xie1, Shenglin Zhao2, Zibin Zheng3, Jieming Zhu2 and Michael.
Machine Learning Basics
Using Friendship Ties and Family Circles for Link Prediction
Passenger Demand Prediction with Cellular Footprints
Q4 : How does Netflix recommend movies?
GANG: Detecting Fraudulent Users in OSNs
Machine Learning Algorithms – An Overview
Learning to Rank with Ties
Presenter: Sihong LIN Adviser: Bei ZHOU 9 July, 2019
Presentation transcript:

Exploiting Geographic Dependencies for Real Estate Appraisal Yanjie Fu Joint work with Hui Xiong, Yu Zheng, Yong Ge, Zhihua Zhou, Zijun Yao Rutgers, the State University of New Jersey Microsoft Research, UNC Charlotte, Nanjing University York City, NY

 Background and Motivation  Problem Statement  Methodology  Evaluation  Conclusions Agenda 2

Housing Matters 3 Call for an intelligent system to rank estates based on estate investment value Zillow Yahoo Homes Win your lover Settle down your family Make extra money as investment Realtors housing consultant services

What Special in Estate Investment Value  Location! Location! Location!  Urban Geography features the geographical utility of houses  Human Mobility reflects neighborhood popularity of houses  Prosperity of Business Area shows influence on houses 4 Road NetworksSubway Stations Bus Stops Places of interests Taxi GPS traces Businesses area

Quantifying Estate Investment Value 5  We don’t predict future price  We predict growth potential of resale value (long term thing)  Estate investment return rate of a given market period  Prepare the benchmark investment values of estates for training data  Identify rising market period and falling market period of Beijing  Calculate the investment returns of each real estate during rising market period and falling market period  Value-adding capability  investment returns of rising market period  Value-protecting capability  investment returns of falling market period represent Investment return

Geographic Dependencies (1) 6  Individual dependency  the investment value of an estate is determined by the geographic characteristics of its own neighborhood Point of Interest Transportation AccessibilityTaxi Mobility Your personal profiles tell your research interests

Geographic Dependencies (2) 7  Peer dependency  Inside a business area, the estate investment value can be reflected by its nearby estates. AttributeA B Distance to Level2 road network156 meters143 meters Distance to subway station 1385 meters 1585 meters #Restaurants34 #Transportation facilities88 A B A B Your friends tell your research interests

Geographic Dependencies (3) 8  Zone dependency  The estate value can also be influenced by the values of its associated latent business area. Rising Market Period (02/ /2012) Average regional prices Depressed business area Prosperous business area Depressed business area Prosperous business area Your associated research groups tell your research interests

Problem Definition  Given  Estates with locations and historical prices  Urban geography (poi, road networks)  Human mobility (taxi GPS traces)  Objective  Rank estates based on their investment values  Core tasks  Predict estate investment value using urban geography and human mobility  Jointly model three geographic dependencies as objective function to learn estate ranking predictor 9

Methodology Overview 10 Estate Investment Value Geographic Utility Neighborhood Popularity Road NetworksSubway Stations Bus Stops Places of interests Urban Geography Cell -Tower TracesTaxicab GPS Traces Check-ins Bus Traces Human Mobility Business Area Influence of Business Area’s Prosperity Prediction Model Objective Function Individual Dependency Peer Dependency Zone Dependency Location

Modeling Estate Investment Value (1)  Geographic Utility  Feature extraction by spatial indexing  linearly regress geographic utility from geographic features of the neighborhood of each estate 11

Modeling Estate Investment Value (2)  Influence of business area (A generative view)  There are K business areas in a city  Each business area is a cluster of estates; estates tend to co- locate along multiple business areas  The more prosperous, the more easier we identify an estate from business area  The prosperities of K business areas (K spatial hidden states ) show influence on estates  The inverse influence of geo-distance between estate and business area center  Gaussian Mixture Model + Learning To Rank 12 Prosperities of K business areas

Modeling Estate Investment Value (3)  Neighborhood Popularity (A propagation view)  Propagate visit probability to POIs per drop-off point  Aggregate visit probability per POI  Aggregate visit probability per POI category  Compute popularity score  Spatial propagation and aggregation from taxi to house 13 Point of Interests HouseTaxi Passenger (drop-off point) Categories of POI

Modeling Three Dependencies  Individual Dependency  Capture prediction accuracy of investment values, locations and business area assignment 14  Peer Dependency (pairwise analysis on estate level)  Capture ranking consistency of predicted investment value of each estate pair  Zone Dependency (pairwise analysis on business area level)  Capture ranking consistency of learned prosperities of the corresponding business area pair of each estate pair The more accurate, the higher likelihood  Overall likelihood

Parameter Estimation  Given where Y, and L are the investment value, ranks and locations of I estates respectively  To maximize the posterior Parameters Priors  Expectation Maximization (EM) to learn the parameters by treating latent business area of each estate as a latent variable  Geo-clustering updates the latent business area by maximizing the posterior of latent business area  After business area assignments are updated, maximizing the posterior of model parameters 15

Experimental Data  Beijing real-world Data  Beijing estate data 2851 estates with transaction records from 04/2011 to 09/2012 Falling market(04/2011 to 02/2012) and Rising market (02/2012 to 09/2012)  Beijing transportation facility data including bus stop, subway, road networks  Beijing POI data  Beijing taxi GPS traces 16

Evaluation Methods and Metrics  Baseline algorithms  MART: it is a boosted tree model, specifically, a linear combination of the out puts of a set of regression trees  RankBoost: it is a boosted pairwise ranking method, which trains multiple weak rankers and combines their outputs as final ranking  Coordinate Ascent: it uses domination loss and applies coordinate descent for optimization  ListNet: it is a listwise ranking model with permutation top-k ranking likelihood as the objective function  Evaluation metrics  Normalized Discounted Cumulative Gain (NDCG)  Precision  Recall  Kendall’s Tau Coefficient 17

Overall Performances  We investigate the ranking performances by comparing to baseline algorithms in terms of Tau, NDCG, Pecision and Recall 18 Our method mines urban geography and human mobility to predict estate investment value and exploit three dependencies, and thus outperform traditional LTR methods

Study on Geographic Dependencies  We study the impact of three geographic dependencies by designing different variants of objective function (posterior likelihoods) 19 Individual dependency can achieve good overall ranking performance; Peer and zone dependency can achieve good top-k ranking performance; We recommend combination usage of three dependencies

Hierarchy of Needs for Human Life (1) 20 Triangle Need Structure of Human Being We hope we go shopping, work, eat, access transportation quickly and easily, children go to schools near our home We DONOT need hotels/hospitals/sports/spots located near our home Good house is always balance people’s need The POI density spectral of estates over multiple poi categories

Hierarchy of Needs for Human Life (2) 21 The triangle need structure of human life in Beijing

Conclusions  Housing analysis is funny  Use urban geography and human mobility to model estate investment value  Capture geographic individual, peer, and zone dependencies to better learn estate ranking predictor  Business applications  Decision making support for homebuyers  Improve price structure for housing agent/broker/consultant  Optimize site selection for housing developer 22

23 Thank You ! Homepage: