Dedupe.io

De-duplicate and find matches in your Excel spreadsheet or database

Dedupe.io is a powerful tool that learns the best way to find similar rows in your data.

Using cutting-edge research in machine learning, we quickly and accurately identify matches in your Excel spreadsheet or database, saving you time and money.





Sign up for a free trial

Try it for free on up to 1,000 rows of data
Paid plans start at $10. Learn more »

A simple tool for a complex problem

In today’s world of big data, there’s never been more information available to work with. Unfortunately, all this data is hard to use, especially if it’s been entered by hand or comes from different systems. Even the seemingly simple job of figuring out who is who in a spreadsheet or database can be daunting and time-consuming.


That’s where Dedupe.io comes in. We developed a dynamic, scalable solution for de-duplicating and linking datasets, and built a simple step-by-step wizard so that anyone can use it.

Read more about how and why we built Dedupe.io »

Dedupe.io is great for

De-duplicating customer records in Salesforce

Combining lists of addresses or businesses

Master data management

Merging different database systems together

Creating a master list of products or parts

Cleaning up lists of names and emails

Finding contributions in campaign finance data

Cross-referencing government records


And much more!
Not sure about your use case? Drop us a line at dedupe@datamade.us


How can you use Dedupe.io?

Find duplicates in a spreadsheet

Upload a spreadsheet and find all exact and similar records within it

Merge multiple files

Link together two or more spreadsheets and find overlapping records in each

Check against a canonical list

Upload a master list and check new spreadsheets against it

We find the hard matches

Real-world data is messy, and Dedupe.io was built to work with it

We find matches even when there are major data quality issues

Typos, misspellings and abbreviations

Hand-typed data can contain typos, misspellings and abbreviations
We match them using powerful text similarity algorithms

name                                  | address
atty title guaranty fund              | One S. Wacker Dr. 24th Floor Chicago, IL 60606
attorneys' title guarantee fund, inc. | 1 s. wacker drive 24th floor chicago il 60606
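
As a rough illustration of text similarity, the two names above can be scored with Python's standard-library difflib. This is only a sketch: the page does not say which algorithms Dedupe.io uses, and the `similarity` helper and the choice of `SequenceMatcher` are assumptions made for the example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means identical after lowercasing."""
    # SequenceMatcher is a stand-in here, not Dedupe.io's actual method.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


name_a = "atty title guaranty fund"
name_b = "attorneys' title guarantee fund, inc."
unrelated = "kennedy-king college"

# The abbreviated and full names score much closer to each other
# than either does to an unrelated name.
print(round(similarity(name_a, name_b), 2))
print(round(similarity(name_a, unrelated), 2))
```

A score near 1.0 suggests the same entity written differently. Where to draw the cutoff depends on the data.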

Inconsistent formatting

Different people and systems format data differently
We parse out names, addresses and any text to make smart comparisons

site_name                                               | address                  | phone
Chicago Commons Guadalupano                             | 1814 S. Paulina 60608    | 6663883
Chicago Commons Guadalupano Family Center               | 1814 South Paulina 60608 | 6663884
Chicago Commons Association - Guadalupano Family Center | 1814 S Paulina St        | 6663883
CHICAGO COMMONS ASSOCIATION GUADALUPANO FAMILY CENTER   | 1814 S PAULINA 60608     | 6663883
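
One way to picture the effect of parsing is a small normalization step run before comparison. The sketch below is not Dedupe.io's parser. The `normalize` function and the `ABBREVIATIONS` table are hypothetical and cover only the abbreviations seen in the examples on this page.

```python
import re

# Hypothetical abbreviation table, for illustration only. A real address
# parser has to handle ambiguity (for example, "st" can also mean "saint").
ABBREVIATIONS = {"s": "south", "st": "street", "dr": "drive"}


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, expand known abbreviations."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)


addresses = [
    "1814 S. Paulina 60608",
    "1814 South Paulina 60608",
    "1814 S PAULINA 60608",
]

# All three spellings collapse to a single normalized form.
normalized = {normalize(a) for a in addresses}
print(normalized)
```

After normalization, differences in case, punctuation and abbreviation no longer get in the way of the comparison.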

Contradictory fields

Sometimes, your data doesn't agree with itself
We compare using multiple fields to find records with the most agreement

site_name            | address                       | phone
kennedy-king college | 6301 s halsted street 60621   | 6025340
kennedy-king college | 6800 s wentworth avenue 60621 | 6025340
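
A minimal sketch of comparing on multiple fields is to score each field separately and combine the scores. The `WEIGHTS` values and the `record_score` helper below are assumptions for illustration. The page does not describe how Dedupe.io weighs fields, and a trained system would learn such weights rather than hard-code them.

```python
from difflib import SequenceMatcher

# Hypothetical weights that sum to 1.0; not taken from Dedupe.io.
WEIGHTS = {"site_name": 0.5, "address": 0.2, "phone": 0.3}


def field_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] using the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted average of per-field similarities, in [0, 1]."""
    return sum(
        weight * field_similarity(rec_a[field], rec_b[field])
        for field, weight in WEIGHTS.items()
    )


a = {
    "site_name": "kennedy-king college",
    "address": "6301 s halsted street 60621",
    "phone": "6025340",
}
b = {
    "site_name": "kennedy-king college",
    "address": "6800 s wentworth avenue 60621",
    "phone": "6025340",
}

# Name and phone agree exactly, so the score stays high even though
# the addresses contradict each other.
score = record_score(a, b)
print(f"{score:.2f}")
```

Because two of the three fields agree, the conflicting address lowers the score without ruling out the match.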

How it works

Upload your data

Upload any spreadsheet or connect directly to your database

Train it

You provide training on the right way to identify similar records in your data

Validate and download

Matches are automatically found for you to review and then download
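
To give a feel for the training step, here is a deliberately simplified sketch: given pairs that a person has labeled as match or not, pick the similarity cutoff that agrees with the most labels. The scores and labels are made-up illustrative values, `best_threshold` is a hypothetical helper, and Dedupe.io's actual learning procedure is not described on this page.

```python
def best_threshold(labeled: list[tuple[float, bool]]) -> float:
    """Pick the cutoff that classifies the most labeled pairs correctly."""
    candidates = sorted({score for score, _ in labeled})

    def correct(cutoff: float) -> int:
        # A pair is predicted to match when its score reaches the cutoff.
        return sum((score >= cutoff) == is_match for score, is_match in labeled)

    return max(candidates, key=correct)


# (similarity score, human label) pairs; values invented for the example.
labeled_pairs = [
    (0.95, True),
    (0.88, True),
    (0.75, True),
    (0.60, False),
    (0.40, False),
]

threshold = best_threshold(labeled_pairs)
print(threshold)  # 0.75
```

Each additional label narrows down where the line between matches and non-matches belongs, which is why a short training session can improve results.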


Learn more about how it works »

Read our white paper

We’re told that we’re in the age of big data and the analytics revolution, leading to “the algorithmic business.” Not only will managers be able to make better strategic decisions based on data, but systems will also generate and then follow data-driven algorithms to make thousands of operations-related decisions each second.

But has your company fully exploited that potential?

Contact us to get a copy of our white paper, Entity Resolution with Machine Learning: Dedupe.io’s Scalable Foundation for Data Quality.



Still have questions?

We're happy to help!


dedupe@datamade.us
(312) 725-0195