Record Linkage Methodology Under Fellegi-Sunter Paradigm, with Extensions

12:00 noon to 1:00 pm PST

This webinar is part of the Advanced Methods Webinar Series

In 1969, Ivan Fellegi and Alan Sunter formalized a strategy for conducting probabilistic record linkage that had been developed previously. This formalization included a demonstration that the scoring method used in the strategy is optimal under certain assumptions. While other record linkage methods have been developed (including Bayesian ones), the Fellegi-Sunter approach should be a strong candidate for large-scale linkages.

In this talk, Mr. Resnick will give an overview of the Fellegi-Sunter approach, explaining how candidate pairs are evaluated under it. He will also cover extensions and modifications to it, which include the following:

  • Data editing and other preparation
  • Estimation of scoring parameters using machine learning (E-M algorithm)
  • Use of name (and other comparison variable) frequencies
  • Use of partial string agreements
  • Hierarchical (nested) comparisons
  • Use of blocking and development of optimal blocking strategies
  • Estimation of match probability and linkage error
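As a rough illustration of how a candidate pair is scored under the Fellegi-Sunter approach (not taken from the talk itself), the sketch below sums per-field log-likelihood-ratio weights and compares the total against two thresholds. It assumes exact agreement on each field and conditional independence between fields. The field names, the m- and u-probabilities, and the threshold values are all hypothetical, chosen only to make the arithmetic visible.

```python
import math

# Hypothetical parameters, for illustration only.
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
M_U = {
    "surname": (0.95, 0.01),
    "birth_year": (0.98, 0.05),
    "postal_code": (0.90, 0.02),
}


def field_weight(agrees, m, u):
    """Log2 likelihood-ratio weight for one comparison field."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_weight(rec_a, rec_b, params=M_U):
    """Sum the field weights, assuming conditional independence of fields."""
    return sum(
        field_weight(rec_a[field] == rec_b[field], m, u)
        for field, (m, u) in params.items()
    )


def classify(weight, lower=0.0, upper=10.0):
    """Three-way decision; the thresholds here are arbitrary placeholders."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "possible link"


a = {"surname": "SMITH", "birth_year": 1970, "postal_code": "V6T1Z4"}
b = {"surname": "SMITH", "birth_year": 1971, "postal_code": "V6T1Z4"}
print(classify(match_weight(a, b)))
```

In practice the m- and u-probabilities would be estimated from the data (for example with the E-M algorithm listed above) and the thresholds set according to the linkage error rates that can be tolerated, rather than fixed by hand as they are here.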



Presenter

Mr. Resnick is a principal data scientist with NORC at the University of Chicago. He has worked in data analysis and statistical programming for several decades. Most of this work has focused on using survey and administrative data for policy analysis, often in the healthcare domain.

For more than 10 years he worked in the administrative records area at the U.S. Census Bureau. There he became familiar with record linkage, which was being used to link very large surveys, enumerations, and administrative record files. During this period, he developed a SAS-based record linkage module for high-volume linkages that is still in use at the Bureau.

At NORC, much of his work is focused on record linkage, and he has developed (in collaboration with colleagues) a new SAS-based record linkage package that incorporates the E-M algorithm and several enhanced strategies for improving the quality of record linkage analyses.

 
