Taxon identifiers

Would it be possible to provide a Taxon Identifiers table, similar to the Agent Identifier table, where multiple identifiers can be recorded? We are currently using a text field to record one identifier; however, we would like to be able to record multiple identifiers when they exist.

Hi @rdrinkwater,

Can you provide a few examples for others to reference? We would like to see if there is any additional community interest!

Thank you!

For our collections, some of the identifiers we would be looking to include are:

This should be part of a broader discussion about taxonomy (and nomenclature) in Specify.

I understand why this idea sounds appealing, but it goes awry pretty quickly when given a bit of thought. “Taxon Identifiers” are nothing like Agent Identifiers.

The issue here is that the Specify (and Darwin Core) Taxon is not a taxon, but a data artefact that conflates the taxon with its name (taxa cannot be synonyms and do not have types; names, on the other hand, do not have parents).

Of the examples given, IPNI is a purely nomenclatural system, so IPNI IDs are name identifiers. MycoBank does a bit of both, but its identifiers are clearly name identifiers. PoWo is a purely taxonomic system, but it uses the IPNI identifiers, so while PoWo IDs are taxon identifiers, they behave like name identifiers: a new identifier is issued when the name changes, regardless of whether the definition of the taxon has changed, and the same identifier is kept when the definition of a taxon changes but the name stays the same.

So, while Agent identifiers such as ORCID and Wikidata IDs assign different identifiers to different Agents with the same name, and the same identifier to Agent records for the same Agent (or Person) under different names, these proposed “Taxon Identifiers” do the exact opposite: they assign the same identifier to different taxa with the same name and could assign different identifiers to the same taxon that has gone under different names (or combinations).

Getting proper IDs for online taxonomies that continually change is perhaps the single biggest challenge in dealing with taxonomic data. I do not think it is a good idea for Specify to muddy the waters even further by creating a TaxonIdentifier table for things that are not taxon identifiers.

You’ll also find that if we had proper taxon identifiers there would be a one-to-one relationship between Taxon and Taxon Identifier. Taxa are a bit more complicated than your general database object, so one cannot just link them up with an identifier (or a scientific name). Instead you have to map them with relationships that are more akin to spatial relationships: is congruent with (equal), includes, is included in, partially overlaps and is disjoint from.
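To make those five relationships concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a taxon concept can be approximated by the set of specimens it circumscribes; real concept alignment is more involved than that, and the names `ConceptRelation` and `relate` as well as the specimen labels are hypothetical, not anything in Specify or Darwin Core.

```python
from enum import Enum


class ConceptRelation(Enum):
    CONGRUENT = "is congruent with"
    INCLUDES = "includes"
    INCLUDED_IN = "is included in"
    OVERLAPS = "partially overlaps"
    DISJOINT = "is disjoint from"


def relate(a: frozenset, b: frozenset) -> ConceptRelation:
    """Classify how concept `a` relates to concept `b`.

    Assumption: each concept is modelled as the set of specimens
    (hypothetical IDs) that it circumscribes.
    """
    if a == b:
        return ConceptRelation.CONGRUENT
    if a > b:  # a is a proper superset of b
        return ConceptRelation.INCLUDES
    if a < b:  # a is a proper subset of b
        return ConceptRelation.INCLUDED_IN
    if a & b:  # some shared members, but neither contains the other
        return ConceptRelation.OVERLAPS
    return ConceptRelation.DISJOINT


# Two concepts that share a name but were circumscribed differently.
broad = frozenset({"spec1", "spec2", "spec3"})
narrow = frozenset({"spec1", "spec2"})
print(relate(broad, narrow).value)
```

The point of the sketch is that the result depends only on the circumscriptions, never on the names attached to them, which is why a name identifier cannot stand in for a taxon identifier.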

I personally think Specify does not need to do all this (although I think Symbiota and Arctos might already do it), but we should try to stay compliant with standards (Darwin Core). So, if one wants to have those identifiers in Specify, the IPNI (and probably MycoBank) identifiers can go into a dwc:scientificNameID field and the PoWo identifiers (properly versioned) into a dwc:taxonConceptID field.
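As a rough sketch of that routing, the Python below sends each identifier to a Darwin Core term according to its source. The two Darwin Core terms are real; the function name `to_dwc_fields`, the lookup table, and the placeholder identifier strings are made up for illustration and say nothing about how Specify stores fields or how PoWo versions should be written.

```python
# Which Darwin Core term each source's identifier belongs in,
# following the suggestion above.
DWC_TERM_BY_SOURCE = {
    "IPNI": "dwc:scientificNameID",
    "MycoBank": "dwc:scientificNameID",
    "PoWo": "dwc:taxonConceptID",
}


def to_dwc_fields(identifiers):
    """Map (source, identifier) pairs onto Darwin Core terms.

    Raises KeyError for a source with no mapping, and ValueError if
    two different identifiers compete for the same term.
    """
    fields = {}
    for source, value in identifiers:
        if source not in DWC_TERM_BY_SOURCE:
            raise KeyError(f"no Darwin Core term configured for {source!r}")
        term = DWC_TERM_BY_SOURCE[source]
        if term in fields and fields[term] != value:
            raise ValueError(f"conflicting values for {term}")
        fields[term] = value
    return fields


# Placeholder identifiers, not real IPNI or PoWo values.
record = to_dwc_fields([
    ("IPNI", "ipni-id-placeholder"),
    ("PoWo", "powo-id-placeholder"),
])
print(record)
```

Each term holds a single value here, which fits the argument above: a name has one name identifier per registry, and a taxon concept has one concept identifier.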

Going a bit off-topic, I would also like to see a dwc:nameAccordingTo (but just call it ‘accordingTo’) field and a dwc:nameAccordingToID field that links to the ReferenceWork table; and a Taxon dropdown that we can configure like other dropdowns, so that it can show Taxon Concept Labels rather than just taxon names (which is not going to work when there are multiple taxa with the same name). Once Specify has all that, we can link determinations to taxa rather than taxon names and Specify will be dealing with taxonomic data properly.

We provide the WoRMS AphiaID for (nearly) all taxa in our tree (I use the integer1 field for it, with a uniqueness rule to prevent duplicate entries). The AphiaID is the identifier for a given name record. It can be combined with the URL prefix “https://www.marinespecies.org/aphia.php?p=taxdetails&id=” to form a permalink, and with “urn:lsid:marinespecies.org:taxname:” to form the LSID for a name.
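For anyone who wants to derive both forms from the stored integer, a small Python sketch follows. The two prefixes are the ones quoted above; the function names and the example AphiaID (an arbitrary number, not tied to any particular taxon) are assumptions for illustration.

```python
APHIA_URL_PREFIX = "https://www.marinespecies.org/aphia.php?p=taxdetails&id="
APHIA_LSID_PREFIX = "urn:lsid:marinespecies.org:taxname:"


def _check_aphia_id(aphia_id: int) -> int:
    """Reject anything that is not a positive integer."""
    if isinstance(aphia_id, bool) or not isinstance(aphia_id, int) or aphia_id <= 0:
        raise ValueError(f"AphiaID must be a positive integer, got {aphia_id!r}")
    return aphia_id


def aphia_permalink(aphia_id: int) -> str:
    """Build the WoRMS taxon details URL for a name record."""
    return f"{APHIA_URL_PREFIX}{_check_aphia_id(aphia_id)}"


def aphia_lsid(aphia_id: int) -> str:
    """Build the LSID for a name record."""
    return f"{APHIA_LSID_PREFIX}{_check_aphia_id(aphia_id)}"


example_id = 123456  # arbitrary example value
print(aphia_permalink(example_id))
print(aphia_lsid(example_id))
```

Storing only the integer and deriving the URL and LSID on export keeps the field usable with a uniqueness rule.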

I think this feature might be better implemented as part of a different way of dealing with nomenclatural data in general. It is not sustainable for collections to maintain entirely separate classifications from global species files, IMO. We should be able to selectively pull data in from existing sources like ITIS, WoRMS, GBIF, etc., without having to manually upload through the workbench. Specify should have built-in connections to the existing APIs of all the major nomenclatural databases, configurable by each collection.

Currently, a semi-functional version of this exists as “Specify Network”. However, for whatever reason, the GBIF name pulled by the Specify Network plugin is often an incorrect match (at least for many of our taxa), and there’s no way to turn the Specify Network button off. :frowning: