I’ve been watching and reading about Record Merging in Specify 7 with interest. It’s fascinating to me that Agents are the first to be rolled into these features. The effort, UI and logic are commendable.
I have a particular slant on these activities that leans toward the outcome of post-merge actions in Specify and how Agent strings are then shared with the wider community via dwc:recordedBy or dwc:identifiedBy. I believe HeatherC has raised comparable observations, such as the ordering of Agent strings: Collector order in groups and multiple collectors.

As all may know, there is a remarkable push to support community curation and linking to shared concepts using URIs/GUIDs via concepts like digital extended specimens. For those who embark on round-tripping community-led data curation or enhancements, this puts pressure on collection management systems. What’s emerging from these externally-facing activities is a need to accurately store and share elements of provenance, such as near-verbatim, in-context Agent strings in terms like dwc:recordedBy or dwc:identifiedBy, as opposed to computed values produced through local disambiguation. We all stand to benefit in various ways when terms are populated with stable, representative, and evidence-based values (e.g. those plainly visible on labels) that other institutions are also likely to share. I have more thoughts on this here: https://github.com/tdwg/dwc/issues/450.

As a brief summary, there is value in retaining in-context representations of Agent text strings, not least to afford a local, post-mortem view of the evidence that illustrates why a merge was executed. At the risk of being axiomatic, people have many names and many people have the same name. However, there is in fact a benefit to embracing some of this chaos. It may be more inclusive and sensitive to cultural differences to do so.
While I do believe this merge of Agents is a critical feature, I wish to raise a caution. What, for example, happens to the non-canonical URIs/GUIDs of suppressed Agents post-merge? Are these records internally deleted with links rebuilt, or are they retained in the back-end with a redirect? Is undo too intractable a feature to implement if Agents are indeed deleted post-merge? See `agent-merging`: "Undo" button for agent merge · Issue #2906 · specify/specify7 · GitHub.
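To make the "retained with a redirect" option concrete, here is a minimal Python sketch. It is not how Specify 7 stores or merges Agents; the `Agent` and `AgentStore` classes, the `merged_into` field, and the example GUIDs are all hypothetical names invented for illustration. The idea is only that a suppressed record keeps its GUID and name and points at the surviving record, so old identifiers still resolve and a merge can be reversed.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Agent:
    guid: str
    name: str
    # Hypothetical redirect field: GUID of the surviving agent, if suppressed.
    merged_into: Optional[str] = None


@dataclass
class AgentStore:
    agents: Dict[str, Agent] = field(default_factory=dict)

    def add(self, guid: str, name: str) -> None:
        self.agents[guid] = Agent(guid, name)

    def resolve(self, guid: str) -> str:
        """Follow redirects until reaching an agent that is not suppressed."""
        current = self.agents[guid]
        while current.merged_into is not None:
            current = self.agents[current.merged_into]
        return current.guid

    def merge(self, keep_guid: str, suppress_guid: str) -> None:
        """Suppress one agent by redirecting it; nothing is deleted."""
        target = self.resolve(keep_guid)
        if target == self.resolve(suppress_guid):
            raise ValueError("agents already resolve to the same record")
        self.agents[suppress_guid].merged_into = target

    def undo_merge(self, suppress_guid: str) -> None:
        """Reverse a merge by clearing the redirect."""
        self.agents[suppress_guid].merged_into = None


store = AgentStore()
store.add("urn:uuid:a", "J. Smith")      # example GUIDs, not real identifiers
store.add("urn:uuid:b", "John Smith")
store.merge(keep_guid="urn:uuid:b", suppress_guid="urn:uuid:a")
print(store.resolve("urn:uuid:a"))       # the old GUID redirects to the kept agent
store.undo_merge("urn:uuid:a")
print(store.resolve("urn:uuid:a"))       # after undo, it resolves to itself again
```

Under this assumption, undo is cheap because the suppressed record and its original name string are never lost. If records are deleted and links rebuilt instead, undo would need a separate log of what was changed.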
If there’s one recommendation I can make, it’s to consider a verbatim field for collectors and determiners, used in context in a non-relational way. These text string values would then be used in dwc:recordedBy and dwc:identifiedBy, despite the fact that the definitions of these terms would lead us to populate them with computed values. We have dwc:recordedByID and dwc:identifiedByID for the purpose of communicating shared identity.
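A rough Python sketch of that recommendation follows. The `Occurrence` class, its field names, and the `example.org` identifiers are hypothetical and do not reflect Specify's schema or export code. The verbatim strings are stored as plain text on the record and exported unchanged, while the linked Agent identifiers go out separately in the ID terms, joined with the " | " separator that Darwin Core recommends for lists.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Occurrence:
    catalog_number: str
    # Hypothetical verbatim fields: plain text as it appears on the label,
    # not derived from the relational Agent table.
    verbatim_collectors: str
    verbatim_determiner: str = ""
    # Identifiers of the linked (possibly merged) Agents.
    collector_ids: List[str] = field(default_factory=list)
    determiner_ids: List[str] = field(default_factory=list)


def to_dwc(occ: Occurrence) -> Dict[str, str]:
    """Build a Darwin Core row: verbatim strings in the name terms,
    resolved identifiers in the ID terms."""
    return {
        "catalogNumber": occ.catalog_number,
        "recordedBy": occ.verbatim_collectors,
        "recordedByID": " | ".join(occ.collector_ids),
        "identifiedBy": occ.verbatim_determiner,
        "identifiedByID": " | ".join(occ.determiner_ids),
    }


occ = Occurrence(
    catalog_number="12345",
    verbatim_collectors="J. Smith & M. Jones",
    verbatim_determiner="Smith, J.",
    collector_ids=["https://example.org/agent/1", "https://example.org/agent/2"],
    determiner_ids=["https://example.org/agent/1"],
)
for term, value in to_dwc(occ).items():
    print(f"{term}: {value}")
```

Because the name terms never pass through the Agent table, a later merge changes only what the ID terms resolve to. The strings shared with the community stay as they appear on the label.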