ISCTSC Workshop A7 Best Practices in Data Fusion.

1 ISCTSC Workshop A7 Best Practices in Data Fusion

2 Objectives
– Identify the state of the art and the state of practice
– Identify key research challenges and opportunities
– Identify tangible ways to accelerate methodological innovation and adoption in practice

3 What exactly is data fusion? Using more than one data source to estimate a parameter of interest

5 SOP & SOA (1)
There is a long history of data fusion in transport, but it is very fragmented.
Examples
– Synthetic population generation
– OD matrix updating
– Data enrichment in discrete choice model estimation
– Network state estimation
– Activity pattern feature extraction from trace data
– Use of multiple survey modes
– Activity and time use survey consolidation
– Population exposure modelling
– Public transport (e.g. UK bus) OD matrix estimation
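Synthetic population generation, the first example on this slide, is often approached with iterative proportional fitting (IPF). The slide does not name a method, so IPF is an assumption here, as are the function name `ipf` and all the numbers. The sketch fuses a small survey cross-tabulation (the seed) with census-style marginal totals:

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Scale a seed table until its row and column sums match the target margins."""
    table = [row[:] for row in seed]  # work on a copy
    for _ in range(max_iter):
        # Row step: rescale each row to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            if row_sum > 0:
                table[i] = [v * target / row_sum for v in table[i]]
        # Column step: rescale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            if col_sum > 0:
                for row in table:
                    row[j] *= target / col_sum
        # Columns match after the column step, so only the rows need checking.
        err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if err < tol:
            break
    return table


# Hypothetical inputs: a 2x2 survey cross-tab and two sets of marginal totals.
seed = [[10.0, 20.0], [30.0, 40.0]]
row_targets = [400.0, 600.0]
col_targets = [450.0, 550.0]
fitted = ipf(seed, row_targets, col_targets)
```

The fitted table reproduces both sets of margins while keeping the association structure (the cross-product ratio) of the seed, which is the sense in which the two sources are fused.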

6 Summary: SOP & SOA (2)
Problem types:
– Direct observation by multiple methods
    – Requires error model
    – Does not in general require system process model
– Direct and indirect observation
    – Requires error model
    – Requires additionally a system process model to link indirect observations to parameters of interest
Methods:
– ‘Record linking’ methods (e.g., statistical matching, data mining, imputation, fuzzy logic)
– Model-based inference (e.g., FIML, filtering, Bayesian inference)
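The two problem types can be illustrated with a minimal scalar sketch. It assumes independent Gaussian errors, which the slide does not specify, and the function names and numbers are hypothetical. `fuse_direct` needs only an error model (the variances); `update_indirect` also needs a process model, reduced here to a single known factor `h` linking the indirect observation to the parameter of interest.

```python
def fuse_direct(estimates, variances):
    """Inverse-variance weighted mean of independent direct observations."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fused = sum(w * x for w, x in zip(weights, estimates)) / total
    return fused, 1.0 / total  # fused estimate and its variance


def update_indirect(prior_mean, prior_var, y, h, obs_var):
    """Scalar Bayesian (Kalman-style) update for an observation y = h * x + noise."""
    gain = prior_var * h / (h * h * prior_var + obs_var)
    post_mean = prior_mean + gain * (y - h * prior_mean)
    post_var = (1.0 - gain * h) * prior_var
    return post_mean, post_var


# Hypothetical example: two direct measurements of the same quantity ...
mean_direct, var_direct = fuse_direct([10.0, 14.0], [1.0, 3.0])

# ... followed by an indirect one that observes half of the quantity (h = 0.5).
mean_post, var_post = update_indirect(mean_direct, var_direct, y=5.0, h=0.5, obs_var=0.25)
```

In both steps the variance of the estimate falls as sources are added, and the less precise source receives the smaller weight.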

7 Research needs (1)
Enabling research
– Better metadata (survey/data collection process + context) to support informed fusion (especially important in the era of Web 2.0)
– More professional and disciplined protocols in reporting data treatments in published work
– Better techniques of disclosure management
– Understanding how to make the business case for data fusion
    – Benefits: sample size, precision
    – Barriers: perception of ‘made up data’, threat to incumbent data providers

8 Research needs (2)
Methodological research
– Detecting genuinely conflicting information (not fusable), a form of specification test
– Better means of validating fused data
– Better methods for modelling the propagation of data and model uncertainty during data fusion, to enhance confidence in fused data
– Are deterministic/‘mean imputation’ approaches adequate? How seriously do they distort the covariance structure?
– Better re-sampling/Bayesian methods in high dimensions
– Integrate methods from SAE
– Opportunities to reduce respondent burden by split designs and ex-post fusion (à la SP surveys and analysis); question substitutability
– For record matching, what are the key connecting variables?
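The question about mean imputation and the covariance structure can be made concrete with a small simulation. This is a sketch on synthetic data under assumed settings (a linear relationship, half of the `y` values missing completely at random), not a result from the workshop.

```python
import random
from statistics import mean


def sample_cov(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 200
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.8 * xi + rng.gauss(0.0, 0.6) for xi in x]

# Assume y is missing for every other record and is filled with the observed mean.
observed = [i for i in range(n) if i % 2 == 0]
y_obs_mean = mean(y[i] for i in observed)
y_imputed = [y[i] if i % 2 == 0 else y_obs_mean for i in range(n)]

x_obs = [x[i] for i in observed]
y_obs = [y[i] for i in observed]

cov_observed = sample_cov(x_obs, y_obs)      # complete cases only
cov_imputed = sample_cov(x, y_imputed)       # after mean imputation
var_observed = sample_cov(y_obs, y_obs)
var_imputed = sample_cov(y_imputed, y_imputed)

# Imputed records add nothing to the cross-products, so both statistics
# shrink by exactly (m - 1) / (n - 1), with m the number of observed records.
shrink = (len(observed) - 1) / (n - 1)
```

With half the values imputed, the variance of `y` and its covariance with `x` are both roughly halved relative to the complete cases, so the correlation is attenuated as well. Stochastic or multiple imputation is the usual remedy, though the slide leaves the question open.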

9 Research needs (3)
Research infrastructure
– Establish a more consistent and complete taxonomy of data fusion problems, methods, outcomes
– Establish reference datasets and reference ‘cases’

