Tutorials

To get familiar with Dedupe.io's features and advanced capabilities, explore our tutorials and documentation.


Intro to Dedupe.io

30 minutes

Dedupe.io is a software-as-a-service platform for quickly and accurately identifying clusters of similar records across one or more files or databases. In this tutorial, we will go over how to de-duplicate your first dataset using Dedupe.io.

Merging and matching multiple datasets

20 minutes

In this tutorial, we will go over how to merge or find matches across multiple datasets using Dedupe.io.


Documentation

Deep dives into how Dedupe.io works and advanced settings.

Dedupe - How it works

Using advanced machine learning and statistics, Dedupe.io learns the best way to identify similar records in any dataset. Learn the specifics of our research-driven approach to record matching and entity resolution.

Formatting files for upload

Instructions and tips on formatting and processing files for upload.

Field comparators

Dedupe.io can compare your fields in different ways depending on the makeup of the data.

Should I use Dedupe.io or the dedupe Python library?

While you can use either Dedupe.io or the dedupe library to de-duplicate or link your data, there are some important differences to note when choosing which one to use.

Frequently asked questions (FAQ)

Frequently asked questions (and answers) from Dedupe.io users.

Working with results

Guides on downloading and using your results from Dedupe.io.