Dedupe.io

De-duplicate and find matches in your messy data

Dedupe.io is a powerful tool that learns the best way to match similar rows in your data.

Using cutting-edge research in machine learning, we quickly and accurately clean up your data, saving you time and money.


Watch the demo



Sign up for private beta

Dedupe.io is currently in private beta

Try it for free on up to 1,000 rows of data


Email us to sign up


Paid plans are available starting at $100 USD for 10,000 rows

Pay per row or with a monthly subscription



A simple tool for a complex problem

In today’s world of big data, there’s never been more information available to work with. Unfortunately, all this data is hard to use, especially if it’s been entered by hand or comes from different systems. Even the seemingly simple job of figuring out who is who in a spreadsheet or database can be daunting and time-consuming.

That’s where Dedupe.io comes in. We developed a dynamic, scalable solution for de-duplicating and linking datasets, and built a simple step-by-step process so that anyone can use it.

Read more about how and why we built Dedupe.io »


How can you use Dedupe.io?

Find duplicate rows

Upload a spreadsheet and find all exact and similar records within it

Link datasets together

Link together two or more spreadsheets and find overlapping records in each

Continuous matching

Automatically match new data as it comes in based on existing training

We find the hard matches

Real-world data is messy, and Dedupe.io was built to work with it

We find matches even when there are major data quality issues

Typos, misspellings and abbreviations

Data that is hand-typed can have misspellings, abbreviations and other typos
We match them using powerful text similarity algorithms

name | address
atty title guaranty fund | One S. Wacker Dr. 24th Floor Chicago, IL 60606
attorneys' title guarantee fund, inc. | 1 s. wacker drive 24th floor chicago il 60606

Some words are hard to spell—that's ok!
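
To give a rough sense of what text similarity means here, the sketch below scores the two names from the example using Python's standard-library difflib. It is only an illustration that assumes a simple character-based ratio. It is not the algorithm Dedupe.io itself uses, and the `similarity` helper is a hypothetical name.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a score from 0.0 (nothing in common) to 1.0 (identical), ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


name_a = "atty title guaranty fund"
name_b = "attorneys' title guarantee fund, inc."
unrelated = "kennedy-king college"

print(similarity(name_a, name_b))     # relatively high: likely the same organization
print(similarity(name_a, unrelated))  # lower: probably a different one
```

A score like this is only one signal. Where to draw the line between "match" and "no match" depends on the data.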


Inconsistent formatting

Different people and systems format data differently
We parse out names, addresses and any text to make smart comparisons

site_name | address | phone
Chicago Commons Guadalupano | 1814 S. Paulina 60608 | 6663883
Chicago Commons Guadalupano Family Center | 1814 South Paulina 60608 | 6663884
Chicago Commons Association - Guadalupano Family Center | 1814 S Paulina St | 6663883
CHICAGO COMMONS ASSOCIATION GUADALUPANO FAMILY CENTER | 1814 S PAULINA 60608 | 6663883

Four different ways to refer to the same place
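
As a minimal sketch of why formatting differences need not block a match, the example below normalizes case, punctuation and a few abbreviations before comparing addresses word by word. The abbreviation table and the helper names `normalize` and `token_overlap` are assumptions made for illustration. This is a simple stand-in, not the parser Dedupe.io uses.

```python
import re

# Hypothetical, deliberately tiny abbreviation table; a real one would be far larger.
ABBREVIATIONS = {"s": "south", "n": "north", "st": "street", "dr": "drive", "ave": "avenue"}


def normalize(text):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", text.lower()).split()
    return [ABBREVIATIONS.get(token, token) for token in tokens]


def token_overlap(a, b):
    """Jaccard overlap of the normalized word sets: 1.0 means the same set of words."""
    set_a, set_b = set(normalize(a)), set(normalize(b))
    if not (set_a | set_b):
        return 0.0  # two empty strings share nothing worth matching on
    return len(set_a & set_b) / len(set_a | set_b)


print(token_overlap("1814 S. Paulina", "1814 South Paulina"))  # 1.0
print(token_overlap("1814 S PAULINA", "1814 S Paulina St"))    # 0.75
```

Once case, punctuation and abbreviations are out of the way, the first two addresses are word-for-word identical, and the last pair differs only by the trailing "St".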


Contradictory fields

Sometimes, your data doesn't agree with itself
We compare using multiple fields to find records with the most agreement

site_name | address | phone
kennedy-king college | 6301 s halsted street 60621 | 6025340
kennedy-king college | 6800 s wentworth avenue 60621 | 6025340

Some places have multiple addresses
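
One way to picture comparing across multiple fields is a weighted average of per-field similarity scores, sketched below for the two records above. The weights are hand-picked placeholders and the function names are hypothetical. Dedupe.io learns how to weigh fields from your training, so treat this as an illustration of the idea rather than the actual method.

```python
from difflib import SequenceMatcher

# Hypothetical hand-picked weights; a trained system would learn these from examples.
WEIGHTS = {"site_name": 4, "address": 3, "phone": 3}


def field_similarity(a, b):
    """Character-based similarity between two field values, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_score(rec_a, rec_b):
    """Weighted average of the per-field similarities."""
    total = sum(WEIGHTS.values())
    weighted = sum(
        weight * field_similarity(rec_a[field], rec_b[field])
        for field, weight in WEIGHTS.items()
    )
    return weighted / total


halsted = {
    "site_name": "kennedy-king college",
    "address": "6301 s halsted street 60621",
    "phone": "6025340",
}
wentworth = {
    "site_name": "kennedy-king college",
    "address": "6800 s wentworth avenue 60621",
    "phone": "6025340",
}

# The name and phone agree exactly, so the score stays high despite the address conflict.
print(record_score(halsted, wentworth))
```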


How it works

Upload your data

Upload any spreadsheet or connect directly to your database

Train it

You provide training on the right way to identify similar records in your data

Validate and download

Matches are automatically found for you to review and then download


Learn more about how it works »


Still have questions?

We're happy to help!


dedupe@datamade.us
(312) 725-0195

Ready to use Dedupe.io?

Try it for free on up to 1,000 rows of data.


Sign up for beta