Dedupe.io was shut down Jan 31, 2023.
The Dedupe.io team has decided to dedicate our focus to our consulting practice at DataMade and work on projects more aligned with our mission to support our clients in working toward democracy, justice, and equity.
We are continuing our consulting practice around the open source dedupe library and would be happy to consult with you on setting up a solution based on it. Contact us to get started >
Guides for how to download and use your results from Dedupe.io.
Downloaded results from Dedupe.io
Dedupe.io provides data downloads in the form of a ZIP archive containing CSV
files for each dataset in your project. Each CSV represents your original
dataset with an additional cluster_id column. Rows that share a cluster_id
value belong to the same cluster.
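For users who prefer scripting, the following is a minimal Python sketch of reading every CSV out of a downloaded archive. The file names, column names, and sample rows are hypothetical and built in memory for illustration only. The sketch also assumes the CSV files are UTF-8 encoded, which may not hold for your download.

```python
import csv
import io
import zipfile


def build_sample_archive():
    """Build an in-memory ZIP with hypothetical file names and columns."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "customers.csv",
            "name,city,cluster_id\n"
            "Acme Corp,Chicago,1\n"
            "ACME Corporation,Chicago,1\n",
        )
        archive.writestr(
            "vendors.csv",
            "vendor,cluster_id\n"
            "Acme,1\n"
            "Widget LLC,2\n",
        )
    buffer.seek(0)
    return buffer


def load_datasets(zip_source):
    """Return {CSV file name: list of row dicts} for every CSV in the archive."""
    datasets = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip anything in the archive that is not a CSV
            with archive.open(name) as raw:
                # Assumption: UTF-8 text; "utf-8-sig" also tolerates a leading BOM.
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                datasets[name] = list(csv.DictReader(text))
    return datasets


datasets = load_datasets(build_sample_archive())
for name, rows in datasets.items():
    print(name, len(rows), "rows; columns:", list(rows[0].keys()))
```

With a real download, pass the path to the ZIP file to load_datasets in place of the in-memory sample.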
The following are a few common operations that users perform after downloading their results.
Users of spreadsheet apps like Microsoft Excel and Google Sheets can
create a pivot table on the cluster_id column to select the
columns they’d like to view in each sheet.
Read more about pivot tables in Microsoft Excel »
Read more about pivot tables in Google Sheets »
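Outside a spreadsheet, a rough scripted equivalent of a pivot on cluster_id is to group rows by that column and summarize one chosen column per cluster. The sketch below uses only the Python standard library. The sample data and the summarize_clusters helper are hypothetical, not part of any Dedupe.io tooling.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for one downloaded CSV.
SAMPLE_CSV = """name,city,cluster_id
Acme Corp,Chicago,1
ACME Corporation,Chicago,1
Widget LLC,Boston,2
"""


def summarize_clusters(csv_file, column):
    """Group rows by cluster_id and collect the values of one column per cluster."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["cluster_id"]].append(row[column])
    return {
        cluster_id: {"count": len(values), "values": values}
        for cluster_id, values in groups.items()
    }


summary = summarize_clusters(io.StringIO(SAMPLE_CSV), "name")
for cluster_id, info in summary.items():
    print(cluster_id, info["count"], info["values"])
```

Clusters with a count above one are the groups of records that were matched to each other.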
Users of spreadsheet apps can merge multiple tables based on the common
cluster_id column. Microsoft Excel users can use merge queries
to join multiple tables, while Google Sheets users can install the third-party
add-on Merge Sheets.
Read more about merge queries in Excel »
Read more about the Merge Sheets add-on in Google Sheets »
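The same kind of merge can be sketched in plain Python by indexing one table on cluster_id and pairing its rows with the matching rows of the other. The two sample tables and the merge_on_cluster helper are hypothetical. The sketch performs an inner join, so rows whose cluster_id appears in only one table are dropped.

```python
import csv
import io
from collections import defaultdict

# Hypothetical samples standing in for two CSVs from the same project.
CUSTOMERS_CSV = """name,cluster_id
Acme Corp,1
ACME Corporation,1
Solo Inc,3
"""

VENDORS_CSV = """vendor,cluster_id
Acme,1
Widget LLC,2
"""


def merge_on_cluster(left_rows, right_rows):
    """Inner join: pair each left row with every right row sharing its cluster_id."""
    right_index = defaultdict(list)
    for row in right_rows:
        right_index[row["cluster_id"]].append(row)

    merged = []
    for left in left_rows:
        for right in right_index.get(left["cluster_id"], []):
            # If both tables share other column names, the right value wins here.
            merged.append({**left, **right})
    return merged


customers = list(csv.DictReader(io.StringIO(CUSTOMERS_CSV)))
vendors = list(csv.DictReader(io.StringIO(VENDORS_CSV)))
merged = merge_on_cluster(customers, vendors)
for row in merged:
    print(row)
```

If the two tables have other column names in common, rename them before merging so that neither value is overwritten.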
Users of SQL databases like PostgreSQL and MySQL can import the files into
tables and join them with a SQL JOIN operation on the cluster_id column.
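As an illustration of the import-and-join approach, the sketch below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL or MySQL. The table names, column names, and sample rows are hypothetical. The JOIN statement itself is standard SQL, but the import step differs between database systems.

```python
import csv
import io
import sqlite3

# Hypothetical samples standing in for two downloaded CSVs.
CUSTOMERS_CSV = """name,cluster_id
Acme Corp,1
ACME Corporation,1
Solo Inc,3
"""

VENDORS_CSV = """vendor,cluster_id
Acme,1
Widget LLC,2
"""


def read_rows(csv_text):
    """Return the data rows of a CSV as tuples, skipping the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # discard the header
    return [tuple(row) for row in reader]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, cluster_id TEXT)")
conn.execute("CREATE TABLE vendors (vendor TEXT, cluster_id TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", read_rows(CUSTOMERS_CSV))
conn.executemany("INSERT INTO vendors VALUES (?, ?)", read_rows(VENDORS_CSV))

# Join the two tables on the shared cluster_id column.
rows = conn.execute(
    """
    SELECT c.name, v.vendor, c.cluster_id
    FROM customers AS c
    JOIN vendors AS v ON c.cluster_id = v.cluster_id
    ORDER BY c.name
    """
).fetchall()

for row in rows:
    print(row)

conn.close()
```

A plain JOIN keeps only clusters present in both tables. A LEFT JOIN would also keep customers rows with no matching vendors row.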
Still not sure how to work with your data? Contact us for help.