Agent name protocol

@VHough this is an age-old question, and one that has no perfect answer, I’m afraid.
For people, I use the agent’s current legal name as the agent name, and all other forms become Agent Variants. This is somewhat problematic in that reprinting a label for an old lot may in some cases produce a different name than the agent had when the lot was collected. Also, legal transaction records such as loans, accessions, or permit records become inconsistent with what ‘actually’ happened. I believe Darwin Core expects data providers not to do this, and instead to keep verbatim names and use separate IDs (dwc:recordedByID) to link to agent records.
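
As a rough illustration of that verbatim-name-plus-ID pattern, here is a minimal Python sketch. The function name, the people, and the identifier URL are all made up for the example; only the two Darwin Core term names (recordedBy, recordedByID) are real.

```python
# Hypothetical illustration: the names and the identifier below are invented.
def to_dwc_occurrence(verbatim_collector: str, agent_uri: str) -> dict:
    """Build a minimal Darwin Core-style record.

    recordedBy keeps the name exactly as written on the label;
    recordedByID carries a stable identifier for the person.
    """
    return {
        "recordedBy": verbatim_collector,
        "recordedByID": agent_uri,
    }


# Same person, collected under two different names over the years.
agent_uri = "https://example.org/agent/0001"  # placeholder identifier
old_lot = to_dwc_occurrence("J. Smith", agent_uri)
new_lot = to_dwc_occurrence("J. Jones", agent_uri)
```

The label name stays as it was at the time of collection, while the shared ID is what ties both lots to one agent.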

However, that isn’t easy to do in Specify, and I view these changes as pretty easy to understand if one examines the Agent record and sees the Agent Variant record for the prior name. In the few prominent cases I’ve dealt with so far in my collection (someone being married and divorced a few times) I believe I added the full married names with MarriedName (née BirthName) as a variant, even if it didn’t occur on a label.
RE institutions, I’m still deciding exactly how I want to deal with these, but right now I give each name its own agent record, because the legal institutional name is more important for provenance and variation is generally less easily understood than personal name changes. Sometimes a name change can indicate a change in the actual administrative structure of an organization, like the splitting off of a division or something. Our own institution has a very convoluted administrative history. There are 3 separate entities within our university that over the years have called themselves The Ohio State University Museum of Zoology (or something close to it), for example, and though there’s a through-line connecting all of those to our current museum, those entities are not the exact same institution as we are now. I’d be interested to hear if you encounter examples like that!
As an example from an adjacent field: the Biodiversity Heritage Library typically gives unique records to journals when they change names, but they reciprocally link the new journal name to the old journal name to indicate the continuity.

On the Specify technical side of things, this question also brings up some problems I’ve noticed with the Agent Variant tables that I’ve spoken about with @SpecifyMembership before.

  • Agent Variants are single fields, meaning the name part parsing and structure of the Agent record is lost when turning it into an Agent Variant.
  • Agent Variants do not get considered upon upload (a duplicate agent is created even if that agent string exists as the agent variant of an existing agent).
  • When Agent Variants are automatically created using the record merge tool on records in the Agent table, the GUID for the Agent record that becomes an Agent Variant gets destroyed in the process, meaning agent GUIDs in Specify are not persistent identifiers (for example, if you accidentally upload one or more duplicate agents and then merge them with the existing agent with the same data, you can only keep one GUID, meaning existing external references to the merged agent records by GUID would be unresolvable in the future, because the original GUIDs are no longer stored anywhere).
    • I believe this situation is partly what @dshorthouse was referring to in this post, and I think it’s a very good point that should probably be addressed in a future update (by making Agent Variants retain the GUID of the Agent they came from, at least).
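
To make the second and third points concrete, here is a minimal Python sketch of the behaviour I’d like to see. This is not Specify code and does not reflect its schema or API: the `Agent` and `Variant` classes, the `source_guid` field, and all function names are hypothetical, and the matching is a simple case- and whitespace-insensitive string comparison.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Variant:
    name: str
    # Hypothetical field: GUID of the agent record this variant was merged from.
    source_guid: Optional[str] = None


@dataclass
class Agent:
    name: str
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    variants: List[Variant] = field(default_factory=list)


def normalize(name: str) -> str:
    """Case-fold and collapse whitespace so trivial differences still match."""
    return " ".join(name.casefold().split())


def find_agent(agents: List[Agent], name: str) -> Optional[Agent]:
    """Match an uploaded string against preferred names AND variants."""
    key = normalize(name)
    for agent in agents:
        if normalize(agent.name) == key:
            return agent
        if any(normalize(v.name) == key for v in agent.variants):
            return agent
    return None


def merge_agents(agents: List[Agent], keep: Agent, dup: Agent) -> None:
    """Fold dup into keep, retaining dup's GUID on the new variant."""
    keep.variants.append(Variant(dup.name, source_guid=dup.guid))
    keep.variants.extend(dup.variants)
    agents[:] = [a for a in agents if a is not dup]


def resolve_guid(agents: List[Agent], guid: str) -> Optional[Agent]:
    """Resolve a current GUID or the retained GUID of a merged-away agent."""
    for agent in agents:
        if agent.guid == guid:
            return agent
        if any(v.source_guid == guid for v in agent.variants):
            return agent
    return None


# Usage: an accidental duplicate is merged, but its GUID still resolves.
agents = [Agent("Jane Doe"), Agent("J. Doe")]
keep, dup = agents
old_guid = dup.guid
merge_agents(agents, keep, dup)
matched = find_agent(agents, "j. doe")      # found via the variant
resolved = resolve_guid(agents, old_guid)   # old GUID still points somewhere
```

With something like this, an upload of “J. Doe” would link to the existing agent instead of creating a new one, and an external reference to the merged record’s GUID would still land on the surviving agent.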

With the way these two tables currently function, I’ve frequently thought that it would probably result in fewer duplicate agents and less work and cleaning on the part of curatorial staff to leave agent strings completely unparsed in Specify 7 databases, because parsing to name parts offers almost no benefit beyond being able to sort the name parts alphabetically in query results. Name strings can’t be abbreviated or matched programmatically (i.e. I can’t tell Specify to print my name as N.F. Shoobs on a label unless I manually make that the preferred form of my agent record and make my full name a variant, or add initials in a separate field), and a partially parsed or unparsed name string uploaded via the WorkBench always results in a duplicate agent record, even when an Agent Variant record exists that is a verbatim match to the uploaded string.
It’s a shame, because there’s a lot of power that could be leveraged with some trivial changes!
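
For instance, abbreviating a parsed name for a label is only a few lines once the name parts exist. This is a minimal Python sketch under my own assumptions (a hypothetical `abbreviate` function taking first, middle, and last name parts as plain strings); it is not something Specify provides, and it ignores harder cases like hyphenated or multi-word given names.

```python
def abbreviate(first: str, middle: str, last: str) -> str:
    """Build a label form like 'J.Q. Doe' from parsed name parts.

    Empty first or middle parts are simply skipped.
    """
    initials = "".join(f"{part[0].upper()}." for part in (first, middle) if part)
    return f"{initials} {last}".strip()


# Hypothetical agent, parsed into name parts.
label_name = abbreviate("Jane", "Quinn", "Doe")
```

That is the kind of payoff that would justify parsing names into parts in the first place.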
