Dedupe.io was shut down Jan 31, 2023.
The Dedupe.io team has decided to dedicate our focus to our consulting practice at DataMade and work on projects more aligned with our mission to support our clients in working toward democracy, justice, and equity.
We are continuing our consulting practice around the open source dedupe library and would be happy to consult with you on setting up a solution based on it. Contact us to get started >
Cleaning and preparing the data was done with Python, relying heavily on the pandas library and DataMade’s Dedupe. NetworkX was used to create the network graph and to perform operations on the graph. Next, the graph was imported into the open-source visualization software Gephi, which was used to compute the layout of the nodes and edges. A static version of visualization was exported from Gephi, and then the x and y coordinates were translated into latitude/longitude using Python. Then the data was imported into Mapbox for styling, interactivity, and hosting. The Mapbox GL javascript library was used to create a simple website that could be embedded in an article or act as a standalone visualization.