Building Entity Profiles over Time


To harness the wealth of information available on the Web today, many organizations aggregate public (and private) data to build profiles of real-world entities and to understand how these entities evolve over time. Different sources may describe the same real-world entity in different ways, with overlapping information and possibly conflicting or even erroneous values. We therefore need to collate the data records that refer to the entity and correct any erroneous values. We also need to understand how records from different sources that refer to the same entity relate to one another over time.

In this project, we develop a framework that interleaves record linkage with error correction, taking the reliability of data sources into account to reduce the impact of erroneous values. We also design a novel transition model that captures how attribute values change over time, and a source-aware temporal matching algorithm that jointly considers value transitions and the freshness of data sources to link temporal records to entities in the right time period. The goal is to obtain an increasingly complete and up-to-date entity profile as more records are aggregated from different sources.
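
As a rough illustration of two ideas mentioned above (weighting conflicting values by source reliability, and modeling how attribute values change over time), the following minimal Python sketch may help. It is not the method from the publications below: the function names `fuse_values` and `transition_probabilities`, the source names, the reliability scores, and the simple weighted-vote and frequency-count estimators are all assumptions made for illustration only.

```python
from collections import defaultdict


def fuse_values(claims, reliability, default_reliability=0.5):
    """Pick one value for an attribute from conflicting claims.

    claims: list of (source, value) pairs.
    reliability: dict mapping source -> weight in [0, 1] (hypothetical scores).
    Each value is scored by the summed reliability of the sources claiming it.
    """
    scores = defaultdict(float)
    for source, value in claims:
        scores[value] += reliability.get(source, default_reliability)
    return max(scores, key=scores.get)


def transition_probabilities(histories):
    """Estimate P(new | old) for one attribute from observed value histories.

    histories: list of time-ordered value lists, one per known entity.
    Returns a dict mapping (old, new) -> relative frequency of that change
    among all observed changes out of `old`.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for history in histories:
        for old, new in zip(history, history[1:]):
            counts[old][new] += 1
    probs = {}
    for old, nexts in counts.items():
        total = sum(nexts.values())
        for new, count in nexts.items():
            probs[(old, new)] = count / total
    return probs


# Hypothetical example data: three sources disagree on a job title.
claims = [("src_a", "Engineer"), ("src_b", "Manager"), ("src_c", "Manager")]
reliability = {"src_a": 0.9, "src_b": 0.3, "src_c": 0.4}
fused = fuse_values(claims, reliability)  # "Engineer": 0.9 outweighs 0.3 + 0.4

# Hypothetical title histories used to estimate value transitions.
histories = [
    ["Student", "Engineer", "Manager"],
    ["Student", "Engineer"],
    ["Student", "Manager"],
]
transitions = transition_probabilities(histories)
```

In this toy setup, a single reliable source can outvote two unreliable ones, and the estimated transition frequencies could be used to judge whether a later record with a different value plausibly describes the same entity at a later time. A real system would also need to estimate the reliabilities themselves and account for how fresh each source is.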

Selected Publications

  • Furong Li, Mong Li Lee, Wynne Hsu. Profiling Entities over Time in the Presence of Unreliable Sources, in IEEE Transactions on Knowledge and Data Engineering (TKDE), 2017.

  • Furong Li, Mong Li Lee, Wynne Hsu. MAROON+: A System for Profiling Entities over Time (Demo), in 33rd IEEE International Conference on Data Engineering (ICDE), San Diego, CA, USA, April 2017.

  • Furong Li, Mong Li Lee, Wynne Hsu, Wang-Chiew Tan. Linking Temporal Records for Profiling Entities, in ACM SIGMOD International Conference on Management of Data, Melbourne, Australia, May 2015.

  • Furong Li, Mong Li Lee, Wynne Hsu. Entity Profiling with Varying Source Reliabilities, in 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York City, USA, August 2014.