Splink is a Python package for probabilistic record linkage (entity resolution) that allows you to deduplicate and link records from datasets that lack unique identifiers. It is used widely within ...
Fast, accurate and scalable data linkage and deduplication. Splink is a Python package for probabilistic record linkage (entity resolution) that allows you to deduplicate and link records from datasets ...
The Ministry of Justice (MoJ) has urged other government bodies to use its Splink software for linking datasets. MoJ data scientist Robin Linacre said in a blog post that the software, in ...