
How to Find and Merge Duplicate Records in Airtable

Filla Editorial | Intermediate | Mar 19, 2026


Duplicate records are one of the most common data quality problems in Airtable. They creep in through form submissions, CSV imports, manual entry, and integrations that create records without checking for existing ones. Over time, duplicates silently corrupt your data -- inflating counts, sending double communications, and making it impossible to trust your reports.

Airtable has no built-in duplicate detection. You can sort a view and visually scan for repeats, but that stops working past a few hundred records. Formulas can flag exact matches on a single field, but they cannot handle fuzzy matching, multi-field comparison, or automated cleanup.

This guide covers how to find and handle duplicate records in Airtable using Filla's Duplicate Finder tool -- from simple exact-match flagging to fuzzy merging across multiple fields.

Cleaning up your Airtable data: Filla's Duplicate Finder scans your table, groups duplicates using exact, normalized, or fuzzy matching, and lets you flag, delete, or merge them in bulk.


Why duplicates happen in Airtable

Before fixing duplicates, it helps to understand how they get created. Most duplicate problems come from one of these sources:

Form submissions without deduplication. When multiple people submit the same person's information, or the same person submits twice, Airtable creates separate records each time. Airtable's native form has no duplicate prevention. Filla's form builder does offer duplicate prevention at the field level, but if you are collecting data from multiple sources, duplicates can still appear.

CSV imports. Importing a CSV that overlaps with existing records creates new records for every row. Airtable does not match imported rows against existing data.

Automation and integration syncing. Zapier, Make, or custom scripts that create records based on external triggers can produce duplicates when the same event fires twice or when the external system contains its own duplicates.

Manual data entry. Team members create records without checking whether one already exists. This is common in CRM workflows where multiple people interact with the same contact.

Merged tables. Combining data from multiple Airtable tables or bases often introduces overlapping records.


The manual approach and why it breaks down

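The simplest duplicate check in Airtable uses a formula field. Airtable formulas are evaluated one record at a time and cannot read values from other records directly, so a common workaround is to link every record to a single helper record, roll the values of a field such as {Email} up into one text string, look that string back up on each record, and use a formula to count how often the record's own value occurs in it. This can flag records with matching emails, but the approach has serious limitations: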

  • It only checks one field at a time
  • It cannot handle case differences ("john@example.com" vs "John@Example.com")
  • It cannot handle whitespace issues ("John Smith" vs "John Smith ")
  • It cannot do fuzzy matching ("Jon Smith" vs "John Smith")
  • It tells you duplicates exist but cannot merge or clean them
  • Performance degrades significantly on large tables

For anything beyond a trivial table, you need a dedicated tool.


How Filla's Duplicate Finder works

Filla's Duplicate Finder is a processor tool that connects to any Airtable table, scans all records for duplicates based on configurable matching rules, and takes action on the groups it finds. It runs in the background with real-time progress tracking, so it handles tables with thousands of records without blocking your workflow.

Here is how to set it up and run it.

Step 1: Create a new Duplicate Finder tool

In your Filla workspace, open the base that contains the table you want to deduplicate. Create a new processor tool and select Duplicate Finder as the tool type. Give it a descriptive name like "Contact Deduplication" or "Order Cleanup."

Step 2: Select the source table

Choose the Airtable table you want to scan. The tool will read all records from this table (or a filtered subset if you configure record filters).

Step 3: Configure match fields

This is the most important step. Match fields define how Filla determines whether two records are duplicates.

You can add one or more fields to compare. For each field, choose a match type:

  • Exact: Strict string equality. "John Smith" matches "John Smith" but not "john smith."
  • Case Insensitive: Ignores capitalization. "john@example.com" matches "John@Example.com."
  • Normalized: Trims whitespace, lowercases, and collapses multiple spaces. " John Smith " matches "john smith."
  • Fuzzy: Uses Jaro-Winkler similarity scoring with a configurable threshold (0.0 to 1.0). A threshold around 0.8 catches many common typos and abbreviations; raising it reduces false positives at the cost of missing more distant variants.
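
To make the first three match types concrete, here is a minimal Python sketch of how such comparisons could be written. It is an illustration under assumptions, not Filla's implementation: the function names are hypothetical, and fuzzy matching is left out because it needs a similarity score rather than a yes/no comparison.

```python
import re

def normalize(value: str) -> str:
    """Trim the ends, collapse runs of whitespace to one space, and lowercase."""
    return re.sub(r"\s+", " ", value.strip()).lower()

def exact(a: str, b: str) -> bool:
    """Strict string equality."""
    return a == b

def case_insensitive(a: str, b: str) -> bool:
    """Equality that ignores capitalization only."""
    return a.casefold() == b.casefold()

def normalized(a: str, b: str) -> bool:
    """Equality after whitespace and case normalization."""
    return normalize(a) == normalize(b)

print(exact("John Smith", "john smith"))                         # False
print(case_insensitive("john@example.com", "John@Example.com"))  # True
print(case_insensitive("John Smith", "John Smith "))             # False: trailing space
print(normalized("  John   Smith ", "john smith"))               # True
```

Each type is strictly more forgiving than the one before it, so it is usually worth starting with the strictest type that still catches the variants present in your data.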

When you configure multiple match fields, records must match on all specified fields to be considered duplicates. This means you can combine an exact email match with a fuzzy name match for higher precision.

Important: two records where both values are empty on a match field are not counted as a match. This prevents false positives from blank fields.
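
The all-fields rule and the blank-value rule can be sketched together in a few lines of Python. This is an assumption-laden illustration: `records_match`, `is_blank`, and the rule format are hypothetical names, and the sketch treats a blank on either side as a non-match, which is consistent with (but slightly broader than) the both-empty rule described above.

```python
from typing import Callable, Mapping, Sequence, Tuple

Matcher = Callable[[str, str], bool]

def is_blank(value) -> bool:
    """None and whitespace-only strings count as blank."""
    return value is None or str(value).strip() == ""

def records_match(a: Mapping, b: Mapping,
                  rules: Sequence[Tuple[str, Matcher]]) -> bool:
    """Two records are duplicates only if every rule matches."""
    if not rules:
        return False  # no match fields configured: nothing can match
    for field, matcher in rules:
        left, right = a.get(field), b.get(field)
        # Blank values never count as a match (assumed to apply per side).
        if is_blank(left) or is_blank(right):
            return False
        if not matcher(str(left), str(right)):
            return False
    return True

rules = [
    ("Email", lambda x, y: x.lower() == y.lower()),  # case insensitive
    ("Name", lambda x, y: x.strip() == y.strip()),   # ignores outer spaces
]
a = {"Email": "john@example.com", "Name": "John Smith"}
b = {"Email": "John@Example.com", "Name": "John Smith "}
c = {"Email": "", "Name": "John Smith"}
d = {"Email": "", "Name": "John Smith"}

print(records_match(a, b, rules))  # True: both fields match
print(records_match(c, d, rules))  # False: Email is blank on both records
```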

Step 4: Choose an action

Filla offers three actions for duplicate groups:

Flag duplicates. Writes a configurable value (default: "DUPLICATE") to a field you specify on each duplicate record. This is the safest option -- it marks duplicates without changing or deleting anything, so you can review the results before taking further action.

Delete duplicates. Keeps one record from each duplicate group and deletes the rest. You choose which record to keep: the newest (latest created time), the oldest (earliest created time), or the most complete (fewest empty fields).
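
The three keep strategies can be sketched as follows. This is a hypothetical illustration rather than Filla's code: the record shape, the `pick_primary` name, and the tie-breaking behavior (the first record in the group wins) are all assumptions. The field list is passed in explicitly so that a field missing from a record is counted as empty.

```python
from datetime import datetime

def is_empty(value) -> bool:
    """Treat None, empty strings, and empty lists as empty."""
    return value is None or value == "" or value == []

def pick_primary(group, strategy, all_fields):
    """Return the record to keep from one duplicate group.

    Each record is a dict with a 'created' datetime and a 'fields' dict.
    On ties, the earliest record in the list is kept (an assumption).
    """
    if strategy == "newest":
        return max(group, key=lambda r: r["created"])
    if strategy == "oldest":
        return min(group, key=lambda r: r["created"])
    if strategy == "most_complete":
        def empty_count(r):
            return sum(1 for f in all_fields if is_empty(r["fields"].get(f)))
        return min(group, key=empty_count)
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical duplicate group.
group = [
    {"id": "rec1", "created": datetime(2026, 1, 5),
     "fields": {"Name": "John Smith"}},
    {"id": "rec2", "created": datetime(2026, 2, 1),
     "fields": {"Name": "John Smith", "Email": "john@example.com",
                "Phone": "555-0100"}},
    {"id": "rec3", "created": datetime(2026, 3, 9),
     "fields": {"Name": "John Smith", "Email": "john@example.com"}},
]
all_fields = ["Name", "Email", "Phone"]

for strategy in ("newest", "oldest", "most_complete"):
    print(strategy, "->", pick_primary(group, strategy, all_fields)["id"])
# newest -> rec3, oldest -> rec1, most_complete -> rec2
```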

Merge into primary. The most powerful option. Selects a primary record using the same strategies (newest, oldest, most complete), fills in any empty fields on the primary with values from the duplicate records, then deletes the duplicates. This preserves the maximum amount of data.
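
The fill-in step of a merge can be sketched like this. The function name and record shape are hypothetical, and the precedence among duplicates (the first non-empty value found wins) is an assumption; the source only states that empty fields on the primary are filled from the duplicates.

```python
def is_empty(value) -> bool:
    """Treat None, empty strings, and empty lists as empty."""
    return value is None or value == "" or value == []

def merge_into_primary(primary: dict, duplicates: list) -> dict:
    """Return a copy of the primary's fields with gaps filled from duplicates.

    Values already present on the primary are never overwritten.
    """
    merged = dict(primary)
    for dup in duplicates:
        for field, value in dup.items():
            if is_empty(merged.get(field)) and not is_empty(value):
                merged[field] = value
    return merged

# Hypothetical field values for one duplicate group.
primary = {"Name": "John Smith", "Email": "", "Phone": None}
duplicates = [
    {"Name": "Jon Smith", "Email": "john@example.com"},
    {"Email": "j.smith@example.com", "Phone": "555-0100"},
]

print(merge_into_primary(primary, duplicates))
# {'Name': 'John Smith', 'Email': 'john@example.com', 'Phone': '555-0100'}
```

Note that the primary's existing Name is kept even though a duplicate spells it differently, which is why the choice of primary still matters when merging.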

Step 5: Preview the scan results

Before running the tool, Filla provides an interactive preview that shows you which duplicate groups were detected and what action will be taken. Review this carefully, especially when using delete or merge actions.

Step 6: Run the tool

Click Run to start the deduplication. The tool processes records in the background with real-time progress tracking showing records processed, succeeded, and failed. A detailed execution log records every action taken.

Step 7: Review the execution log

After the run completes, check the execution log for any errors or unexpected results. If you used the Flag action, you can now filter your Airtable table by the flag field to review all identified duplicates.


Matching strategies for common scenarios

Different data types call for different matching approaches. Here are practical configurations for common deduplication scenarios.

Contact deduplication (CRM)

Match on Email (case insensitive) plus Name (fuzzy, threshold 0.85). Because both fields must match, the email comparison anchors the match, while the fuzzy name comparison tolerates variants of the same person such as "Robert Johnson" and "Bob Johnson." The high threshold helps avoid merging genuinely different people who share an email address.

Order or transaction deduplication

Match on Order Number or Transaction ID (exact). These should be unique identifiers, so exact matching is appropriate. If your IDs have inconsistent formatting, use normalized matching instead.

Company deduplication

Match on Company Name (fuzzy, threshold 0.8) plus Domain or Website (normalized). Company names vary widely -- "Acme Corp" vs "Acme Corporation" vs "ACME Corp." -- so fuzzy matching is essential. Adding domain as a second match field reduces false positives.

Product catalog deduplication

Match on SKU (exact) or Product Name (normalized). If SKUs are reliable, exact match is sufficient. For name-based matching, normalization handles capitalization and spacing inconsistencies.


Best practices for deduplication

Start with Flag, not Delete. Always run the Duplicate Finder in Flag mode first. Review the flagged records in Airtable to confirm the matching rules are producing accurate results. Only switch to Delete or Merge after you are confident in the match quality.

Use source view filters for large tables. If your table has thousands of records but you only need to deduplicate a subset (for example, records from a specific import batch), set up a filtered view in Airtable and use it as the source view. This speeds up the scan and limits the blast radius.

Back up before merging. Before running a Merge action, duplicate your Airtable table as a backup. Merge operations cannot be undone through Airtable's undo system.

Set up ongoing prevention. After cleaning up existing duplicates, prevent new ones from forming. Filla's form builder supports a "reject duplicate values" validation rule on individual fields -- for example, rejecting a form submission if the email already exists in your table.

Schedule periodic scans. Even with prevention measures, duplicates can appear through imports and integrations. Run the Duplicate Finder periodically to catch any that slip through.


How the matching algorithm works

For those who want to understand the technical details: Filla's Duplicate Finder compares records pairwise, which is O(n^2) in the number of records, and uses a Union-Find (disjoint set) data structure to cluster the matching pairs into groups. Because every matching pair is merged into the same set, transitive duplicates are detected: if A matches B and B matches C, all three are grouped together even if A and C do not directly match.
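
As a rough illustration of this clustering step, the following Python sketch pairs a small Union-Find with an all-pairs comparison. It is a minimal sketch, not Filla's implementation: the class, the `group_duplicates` helper, and the hard-coded set of matching pairs are assumptions made for the example.

```python
class UnionFind:
    """Disjoint-set structure over the indices 0..n-1."""

    def __init__(self, n: int):
        self.parent = list(range(n))

    def find(self, x: int) -> int:
        # Path halving: point each visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a: int, b: int) -> None:
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

def group_duplicates(records, is_match):
    """Compare all pairs (O(n^2)) and return groups of record indices."""
    uf = UnionFind(len(records))
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if is_match(records[i], records[j]):
                uf.union(i, j)
    groups = {}
    for i in range(len(records)):
        groups.setdefault(uf.find(i), []).append(i)
    return [members for members in groups.values() if len(members) > 1]

# Stand-in for a real comparison: only these pairs are treated as matches.
names = ["Jon Smith", "John Smith", "John Smyth", "Amy Lee"]
pairs = {("Jon Smith", "John Smith"), ("John Smith", "John Smyth")}

def is_match(a, b):
    return (a, b) in pairs or (b, a) in pairs

print(group_duplicates(names, is_match))  # [[0, 1, 2]]
```

"Jon Smith" and "John Smyth" are never matched directly here, yet they land in the same group because each matches "John Smith".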

For fuzzy matching, Filla uses the Jaro-Winkler similarity algorithm, which is specifically designed for short strings like names and addresses. It gives higher weight to matching prefixes, making it effective at catching common typos while maintaining low false-positive rates.
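
For readers who want to see the scoring itself, here is a compact Python sketch of the textbook Jaro-Winkler calculation. It makes several assumptions that may differ from Filla's version: the comparison is case sensitive, the prefix bonus uses the conventional scale of 0.1 over at most four characters, and the bonus is applied regardless of the base Jaro score.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if equal and within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3

def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)

print(round(jaro_winkler("MARTHA", "MARHTA"), 3))               # 0.961
print(round(jaro_winkler("DWAYNE", "DUANE"), 3))                # 0.84
print(round(jaro_winkler("Acme Corp", "Acme Corporation"), 3))  # 0.912
```

With a threshold of 0.8, all three example pairs would count as matches. The first two pairs are the standard worked examples for this algorithm, which makes them convenient for checking any implementation.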


FAQ

Can I undo a delete or merge operation?

Airtable supports record-level undo for a limited time window, but for bulk operations it is not reliable. Filla recommends using the Flag action first to review results, and backing up your table before running Delete or Merge.

How many records can the Duplicate Finder handle?

The tool processes records in configurable batches with built-in Airtable API rate limit management. It handles tables with thousands of records, running in the background with real-time progress tracking so it does not block your browser.
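
As a rough idea of what batching with rate limiting involves, here is a small Python sketch. It is not Filla's implementation: `run_in_batches` and `send_batch` are hypothetical names, and no real HTTP request is made. The defaults assume the limits Airtable documents for its public Web API at the time of writing, 10 records per batch request and 5 requests per second per base.

```python
import time

def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def run_in_batches(record_ids, send_batch, batch_size=10,
                   max_requests_per_second=5, sleep=time.sleep):
    """Call send_batch on each chunk, pausing to stay under the rate limit.

    send_batch is a caller-supplied function (hypothetical here) that would
    perform one API request for the given record IDs.
    """
    min_interval = 1.0 / max_requests_per_second
    results = []
    for batch in chunked(record_ids, batch_size):
        started = time.monotonic()
        results.append(send_batch(batch))
        elapsed = time.monotonic() - started
        if elapsed < min_interval:
            sleep(min_interval - elapsed)
    return results

# Dry run: 25 fake record IDs, a stand-in request that returns the batch size.
ids = [f"rec{i:03d}" for i in range(25)]
print(run_in_batches(ids, send_batch=len, sleep=lambda seconds: None))
# [10, 10, 5]
```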

Can I match on more than two fields?

Yes. You can add as many match fields as needed. All fields must match for two records to be considered duplicates. This lets you build precise matching rules -- for example, matching on email, first name, and last name simultaneously.

What happens to linked records when duplicates are deleted?

When a duplicate record is deleted, any links pointing to it are lost from the records that referenced it. If you are using the Merge action, the primary record is preserved, but linked records from deleted duplicates are not automatically re-linked to the primary. Review linked record fields after merging.

Does fuzzy matching work with non-English text?

The Jaro-Winkler algorithm operates on character sequences, so it works with any text. However, its accuracy is optimized for Latin-script names and short strings. For very different writing systems, normalized or exact matching may produce more predictable results.


Clean data starts with deduplication

Duplicate records are not just an annoyance -- they are a data integrity problem that compounds over time. Every report, automation, and decision based on your Airtable data becomes less reliable when duplicates are present.

Filla's Duplicate Finder gives Airtable teams a purpose-built tool for finding and resolving duplicates without leaving the Airtable ecosystem. Flag them for review, delete them with smart primary selection, or merge them to preserve the most complete data.

Try the Duplicate Finder, or explore all of Filla's Airtable processor tools.
