Identifying Duplicates for Merging in Specify 7

:open_book: This guide explains how to identify duplicate records in your database ahead of record merging. For more information on merging records, see Record Merging.

About Identifying Duplicates for Merging

Before merging records in Query Builder, it helps to know which duplicates exist in your database. Specify 7 has duplicate finding built into the Uniqueness Rules tool. With this dual-purpose function, Specify generates a CSV file listing the duplicates for a given scope, which you can download to your device from within Schema Config.

The Export Duplicates function is helpful for routine quality checks and general QA/QC, or when cleanup is overdue and many duplicate records are anticipated. Although it was built with Uniqueness Rules in mind, it works just as well for finding duplicates to merge.

[!callout] Note
While duplicate record exports can be generated for any table, record merging can only be done on select tables. See Supported Tables for Record Merging for more information.

Steps to Identify Duplicates

1. Navigate to Schema Config

  • Open the User Tools menu and select Schema Config.
  • Select your language, followed by the table you wish to export duplicates for (and merge records on).

2. Open Uniqueness Rules

  • Click the Uniqueness Rules button from the bottom left of the page.

  • Select Add Uniqueness Rule from the dialog that appears.

  • Click the pencil icon to the left of the uniqueness rule name.

This opens the Configure Uniqueness Rule dialog.

3. Configure Duplicate Parameters

  • Under Unique Fields, choose the field(s) you would like to base the duplicate search on. To add more than one field to the duplicate parameters, click the Add button.

[!callout] Note on Fields
For Agent, choosing First Name and Last Name means that any agent records whose first name and last name fields both match exactly will be considered duplicates.

For Collecting Event, Field Number alone may suffice, or Field Number combined with Start Date.

  • Next, select the scope you would like to base duplication on. Scope options depend on the table, and not every scope is available for every table, but Database is always an option.
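To make the idea of fields plus scope concrete, here is a minimal Python sketch of exact-match duplicate counting. It is only an illustration of the concept and not how Specify implements the search; the function name `find_duplicates`, the `discipline_id` key, and the sample records are all made up for this example.

```python
from collections import Counter

def find_duplicates(records, fields, scope_field):
    """Count records that share identical values in `fields` within one scope.

    Hypothetical helper for illustration; not part of Specify.
    """
    keys = Counter(
        tuple(rec[f] for f in fields) + (rec[scope_field],) for rec in records
    )
    # Keep only value combinations that occur more than once.
    return {key: n for key, n in keys.items() if n > 1}

# Made-up agent records, for illustration only.
agents = [
    {"firstName": "Tom", "lastName": "Smith", "discipline_id": 2},
    {"firstName": "Tom", "lastName": "Smith", "discipline_id": 2},
    {"firstName": "Tom", "lastName": "Smith", "discipline_id": 3},
    {"firstName": "P", "lastName": "Mills", "discipline_id": 2},
]

print(find_duplicates(agents, ["firstName", "lastName"], "discipline_id"))
# {('Tom', 'Smith', 2): 2}
```

In this sketch the third "Tom Smith" is not counted because it sits in a different scope value, which is the effect a narrower scope has on the duplicate search.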

4. Export Duplicates

  • The Export Duplicates button appears in this dialog if duplicates are found.

  • After clicking Export Duplicates, you will be prompted to choose the file name and storage location of the file on your device.
  • Once exported, delete the Uniqueness Rule by clicking Delete in the bottom left corner of the dialog.

:sparkles: You now have a file listing all duplicate records to guide your record merging :sparkles:

Duplicate Export File Structure

Every export file follows the same structure:

  • The first (leftmost) column contains the number of duplicates found for that combination of values.
  • Next comes one column per selected field, holding the value shared by the duplicate records (e.g., firstName for Agent).
  • The last (rightmost) column is the ID of the selected scope (e.g., Discipline_ID).

Example:

| Number of Duplicates | firstName | lastName | Discipline_ID |
|---|---|---|---|
| 3 | "W" | "Davis" | 2 |
| 3 | "S" | "Campbell" | 2 |
| 2 | "Tom" | "Smith" | 2 |
| 2 | "P" | "Mills" | 2 |
| 2 | "Bernard" | "K" | 2 |
| 2 | "Peter" | "Wainwright" | 2 |
| 2 | "C" | "Johnston" | 2 |
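If you prefer to inspect the export programmatically, the following Python sketch reads a file shaped like the example above and lists the largest duplicate groups first. It assumes the file is comma-delimited with a header row named exactly as in the example; the guide does not confirm either detail, and `load_duplicates` is a hypothetical helper name.

```python
import csv
import io

def load_duplicates(csv_file):
    """Read a duplicate export and sort rows by duplicate count, highest first.

    Assumes a header row containing 'Number of Duplicates' (an assumption
    based on the example in this guide).
    """
    rows = list(csv.DictReader(csv_file))
    rows.sort(key=lambda r: int(r["Number of Duplicates"]), reverse=True)
    return rows

# In-memory stand-in for a downloaded export file.
sample = io.StringIO(
    'Number of Duplicates,firstName,lastName,Discipline_ID\n'
    '2,"Tom","Smith",2\n'
    '3,"W","Davis",2\n'
)

for row in load_duplicates(sample):
    print(row["Number of Duplicates"], row["firstName"], row["lastName"])
# 3 W Davis
# 2 Tom Smith
```

For a real file, you would pass an opened file object instead, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever you saved the export.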

Additional Guidance

With your duplicate export file on hand, you can go into a record merging session with confidence. You can use this file as a checklist by adding a column or highlighting rows to track if/when records were merged.
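If you would rather prepare the checklist with a script than in a spreadsheet, a sketch along these lines appends an empty tracking column to every row. The column name `Merged` and the helper `add_merged_column` are illustrative choices, and the sketch again assumes a comma-delimited file with a header row.

```python
import csv
import io

def add_merged_column(src, dst, column="Merged"):
    """Copy a duplicate export from src to dst, appending an empty tracking column."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames) + [column]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        row[column] = ""  # fill in by hand once the records are merged
        writer.writerow(row)

# In-memory stand-ins for the input and output files.
src = io.StringIO(
    'Number of Duplicates,firstName,lastName,Discipline_ID\n'
    '3,"W","Davis",2\n'
)
dst = io.StringIO()
add_merged_column(src, dst)
print(dst.getvalue())
```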

[!caution] Remember
Always review record data in the Merge Records interface before clicking Merge.

[!tip] Query Tip
Use the In operator to filter results by a comma-separated list of field values.
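As a rough sketch of preparing such a list from the export, the Python snippet below collects the distinct values of one column and joins them with commas. The helper `in_operator_values` is hypothetical, and the guide does not say how the In operator treats spaces or values that themselves contain commas, so the sketch joins with a bare comma and leaves such values for you to check by hand.

```python
def in_operator_values(rows, field):
    """Join the distinct, non-empty values of `field` into a comma-separated string."""
    seen = []
    for row in rows:
        value = row[field].strip()
        if value and value not in seen:
            seen.append(value)  # preserve the order of first appearance
    return ",".join(seen)

# Rows shaped like the example export; values are taken from the example table.
rows = [
    {"firstName": "W", "lastName": "Davis"},
    {"firstName": "S", "lastName": "Campbell"},
    {"firstName": "Tom", "lastName": "Smith"},
    {"firstName": "T", "lastName": "Smith"},
]

print(in_operator_values(rows, "lastName"))
# Davis,Campbell,Smith
```

The resulting string can then be pasted into the query field that uses the In operator.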