Field comparators


Dedupe.io can compare your fields in different ways depending on the makeup of the data.

Types of field comparators

Field type | Description | Good for
Default | Default field type. Compares fields character by character, based on how similar they are to each other. | Person or company names
Address | Splits addresses into separate components using usaddress. | Addresses in the United States
Name | Splits names into separate components using probablepeople. | American names, corporations and households
Categorical | Used for comparing long lists of pre-defined elements. | Keywords or categories
Company Name | | Company names
Exact Match | Checks whether the fields match exactly. | Cleaned and consistent data
Exists? | Measures whether both, one, or neither of the fields are defined. | Sparsely populated data (when presence is significant)
Fuzzy Categorical | | Occupations or employers
Latitude, Longitude | Compares fields based on geographic proximity, using the haversine formula. | Latitude and longitude pairs
Long Text | Compares entire words. Useful for longer text fields. | Product descriptions or article abstracts
Person Name | | Western names
Number | Measures the difference between two prices in the same currency. | Price amounts in the same currency
Set | Used for comparing short lists of pre-defined elements. | Keywords or categories
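
To make the differences between comparators more concrete, here is a minimal Python sketch of how a few of them could behave. It is an illustration only, not Dedupe.io's actual implementation: the function names are made up for this example, difflib's SequenceMatcher stands in for whatever character-level measure the Default comparator really uses, Jaccard overlap is an assumed scoring rule for Set, and the Earth radius of 6371 km is an assumed constant.

```python
import math
from difflib import SequenceMatcher

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def default_similarity(a, b):
    """Character-level similarity in [0, 1] (stand-in for the Default type)."""
    return SequenceMatcher(None, a, b).ratio()


def exact_match(a, b):
    """Return 1 if the two fields are identical, 0 otherwise."""
    return 1 if a == b else 0


def exists(a, b):
    """Count how many of the two fields are defined: 0, 1 or 2."""
    return sum(1 for value in (a, b) if value not in (None, ""))


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def set_overlap(a, b):
    """Jaccard overlap in [0, 1] between two short lists of elements."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

For example, exists(None, "Chicago") reports that only one of the two fields is defined, while exact_match("IL", "Il") returns 0 because the values differ in case, which is why Exact Match is best suited to cleaned and consistent data.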

Custom parsers

For particularly messy datasets, we can improve Dedupe.io's results by building custom parsers fine-tuned for your data. Custom parsers enable smarter matches by breaking up semi-structured text into separate fields that can each be compared on their own.
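
As a rough sketch of the idea, the Python snippet below splits a semi-structured string into separate fields with a regular expression. Everything in it is hypothetical: the "Last, First (b. YYYY)" format, the field names, and the parse_record function are invented for illustration and do not describe any parser built for Dedupe.io.

```python
import re

# Hypothetical input format: "Last, First (b. YYYY)"; the birth year is optional.
PATTERN = re.compile(
    r"^\s*(?P<last>[^,]+),\s*(?P<first>[^(]+?)\s*"
    r"(?:\(b\.\s*(?P<birth_year>\d{4})\))?\s*$"
)


def parse_record(text):
    """Split one semi-structured string into named fields.

    Strings that do not fit the assumed format are returned under a
    single "raw" key so they can still be compared as plain text.
    """
    match = PATTERN.match(text)
    if match is None:
        return {"raw": text.strip()}
    return {
        name: (value.strip() if value else None)
        for name, value in match.groupdict().items()
    }
```

Once a record is split this way, each piece can be given its own field type, for instance a name comparator for the first and last name and Exact Match for the birth year, instead of comparing the whole string character by character.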

For more information, contact us at dedupe@datamade.us