Establishing relationship between synonym and preferred/accepted taxon, en masse

We need to upload a (long and fairly clean) list of new taxa prior to uploading the actual specimens/records that bear those names. I know many of those binomials are synonyms of others (specific ones) in that same list, where each synonym is unequivocally linked to its accepted/preferred name.
However, I can’t figure out how to map that relationship in the WorkBench. Fields like Taxon.acceptedTaxon and Taxon.isAccepted (populating the latter would help down the line) are not available in the WB. Neither is Determination.preferredTaxon, in case one wanted to try an alternative route.
Again, it’s a long list and synonymizing binomials one by one on the taxon tree would be painful.
I must be missing something. Thanks.

Hi @igranzow,

Unfortunately, you are not missing anything. Currently, there is no way to bulk upload synonyms through the WorkBench; they need to be set up one by one via the tree interface. Updating Taxon records individually through SQL or the API is feasible, but it is more complex and riskier than using the user interface.

This has been requested a number of times, and I’ve added your comment to the official tracking issue. If you have more comments or want to share your ideal workflow once this request is implemented, please reply to this topic below or on GitHub!

Hi @igranzow,

As mentioned by @Specify, the best way to accomplish this without individually synonymizing the Taxon records in the Tree Viewer would be through SQL, through a script that uses the API, or through some combination of the two.

If such a solution appeals to you before the ability to upload synonymies via the WorkBench is implemented, and you need assistance, can you clarify the structure of the input or provide an (optionally fictitious) sample of the dataset?

For example, what are the mapped column headings (to fields/relationships from CollectionObject, Determination, or Taxon, etc.) in the dataset?

Hi, @jason_m

We have 32.4k taxon names that we wish to import into Specify. 28% (9.1k) of them are synonyms. I’m interested to find out more about the options I have here to avoid forcing our curatorial staff to hand-edit that many synonyms post-launch.

By way of example, our data contains parent/child relationships between members of the taxonomic hierarchy and a relationship between a prior taxon name (a synonym of some kind) and a more recent name, which may itself be a synonym. That is, there may be several “hops” across names before arriving at one or more accepted taxon names.

| NAME_ID | RANK_NAME | IS_CURRENT | NAME | AUTHOR |
|---|---|---|---|---|
| 9024 | Species | false | Boronia machardiana | F.Muell. |
| 9025 | Species | false | Boronia viminea | Lindl. |
| 16636 | Subspecies | true | Boronia crenulata subsp. viminea | (Lindl.) Paul G. Wilson |

| OLD_NAME_ID | NEW_NAME_ID | XREF_TYPE |
|---|---|---|
| 9024 | 16636 | TSY |
| 9025 | 16636 | TSY |

This data documents the following synonym relationships.

  1. Boronia machardiana F.Muell. is a taxonomic synonym of Boronia crenulata subsp. viminea (Lindl.) Paul G. Wilson.
  2. Boronia viminea Lindl. is a taxonomic synonym of Boronia crenulata subsp. viminea (Lindl.) Paul G. Wilson.
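The multi-hop case described above can be sketched in Python. This is only an illustration, assuming the cross-reference table has been loaded as (OLD_NAME_ID, NEW_NAME_ID) pairs and that the IDs with IS_CURRENT = true are known; the function name `resolve_accepted` and its parameters are hypothetical and not part of Specify. It follows each non-current name across as many hops as needed and collects every current name it reaches.

```python
from collections import defaultdict


def resolve_accepted(xrefs, current_ids):
    """Map each non-current name ID to the set of current name IDs it leads to.

    xrefs: iterable of (old_name_id, new_name_id) pairs (hypothetical input shape).
    current_ids: set of name IDs flagged as current/accepted.
    """
    successors = defaultdict(set)
    for old_id, new_id in xrefs:
        successors[old_id].add(new_id)

    def walk(name_id, seen):
        if name_id in current_ids:
            return {name_id}          # reached an accepted name, stop here
        if name_id in seen:
            return set()              # cycle guard: give up on this path
        seen = seen | {name_id}
        found = set()
        for next_id in successors.get(name_id, ()):
            found |= walk(next_id, seen)
        return found

    return {
        old_id: walk(old_id, frozenset())
        for old_id in successors
        if old_id not in current_ids
    }


# The two cross-references from the sample data above.
xrefs = [(9024, 16636), (9025, 16636)]
resolved = resolve_accepted(xrefs, current_ids={16636})

# A made-up chain (IDs 1 -> 2 -> 3) to show several hops being collapsed.
chained = resolve_accepted([(1, 2), (2, 3)], current_ids={3})
```

An entry whose set holds more than one ID corresponds to a name with several accepted names, and an empty set flags a chain that never reaches a current name; both cases would need a curatorial decision before import.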

I had planned to convert this data into a CSV suitable for uploading to the Workbench and hoped to build sufficient data into that CSV file to allow Specify’s taxon tree to recognise the synonym relationships.

Thanks Iñigo and Ben for bringing up the issue. We are facing exactly the same problem here in the French Guiana Herbarium (CAY) for migrating our taxon tree to Specify.
We have ~55k taxa to import and ~30k are synonyms, so this is definitely not something that can be dealt with manually.
We +1 the feature request, and in the meantime we are happy to share our dataset and brain power to come up with an SQL-based (semi-)automated solution.
Thanks,
Philippe V. (CAY)

Hi @pverley,

Could you share the dataset of synonyms you want to import into Specify? We will explore and suggest an interim method for importing synonyms until WorkBench fully supports this feature.

Thank you!

Hi Grant.

You asked Philippe for the dataset of synonyms he’s dealing with, so I’ll tag along. I have a couple of them (seed plants & fishes) from the MNCN Biobank database, which I’m very close to finishing migrating. The taxa lists are nowhere near the size of what Philippe is talking about, because the Biobank contains only a small subset of taxa, but I will encounter much, much larger volumes when I tackle the main collections.

… if it’s of any help

Thanks so much.

Íñigo

FishesACCEPTED in MNCN-CSIC biobank.xlsx (37.1 KB)

PlantsACCEPTED in MNCN-CSIC biobank.xlsx (22.2 KB)

Hi everyone!

Thank you all for providing example datasets and being patient.

I have created a repository on GitHub which demonstrates how the API can be used with Python and the requests library to create an application which mass-imports taxonomic data (including synonyms) to a Specify 7 instance.

In short, the demo takes a CSV containing information in the following format, creates a Mammalia taxon node if one does not exist, and uploads the taxon records under the Mammalia node (at the ranks specified by the CSV columns).

| Order | Family | Genus | Species | isAccepted | Author | AcceptedGenus | AcceptedSpecies | AcceptedAuthor |
|---|---|---|---|---|---|---|---|---|
| Afrosoricida | Tenrecidae | Microgale | talazaci | Yes | Major, 1896 | | | |
| Afrosoricida | Tenrecidae | Oryzorictes | talpoides | No | G.Grandidier & Petit, 1930 | Oryzorictes | hova | A.Grandidier, 1870 |
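To give a feel for how rows in this format could be prepared before any upload, here is a minimal Python sketch. It is not taken from the repository: the function `plan_upload` is hypothetical, and it assumes the file is comma-separated with quoted author strings. It separates accepted rows from synonym rows, pairs each synonym with its accepted name, and lists accepted names that have no row of their own.

```python
import csv
import io

# The two sample rows above, written as comma-separated text (an assumption
# about the file layout; author strings are quoted because they contain commas).
SAMPLE = '''Order,Family,Genus,Species,isAccepted,Author,AcceptedGenus,AcceptedSpecies,AcceptedAuthor
Afrosoricida,Tenrecidae,Microgale,talazaci,Yes,"Major, 1896",,,
Afrosoricida,Tenrecidae,Oryzorictes,talpoides,No,"G.Grandidier & Petit, 1930",Oryzorictes,hova,"A.Grandidier, 1870"
'''


def plan_upload(csv_text):
    """Split rows into accepted names and synonym links (hypothetical helper)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    accepted = [r for r in rows if r["isAccepted"].strip().lower() == "yes"]
    synonyms = [r for r in rows if r["isAccepted"].strip().lower() != "yes"]

    accepted_keys = {(r["Genus"], r["Species"]) for r in accepted}
    links = []        # (synonym binomial, accepted binomial) pairs
    missing = set()   # accepted names that have no row of their own
    for r in synonyms:
        target = (r["AcceptedGenus"], r["AcceptedSpecies"])
        links.append(((r["Genus"], r["Species"]), target))
        if target not in accepted_keys:
            missing.add(target)
    return accepted, links, sorted(missing)


accepted_rows, synonym_links, missing_accepted = plan_upload(SAMPLE)
```

In the two sample rows, Oryzorictes hova appears only as an accepted name and never as a row of its own, so it ends up in `missing_accepted`; any import script would have to create such a record before the synonym can point to it.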

By default, the application is set to connect to https://sp7demofish.specifycloud.org/ using the sp7demofish user and logging into the KUFishvoucher collection, so you can see it in action and independently make edits to the code/data and see the result without worrying about making changes to a live production instance.
(If you plan on developing your own application or adopting the one in the repository, you can use this sp7demofish instance for API testing purposes. The data in the instance should be wiped regularly.)

The code was developed as a minimum viable product (demo) without optimization in mind, so there is room to optimize it.
You can also host a Specify 7 instance locally and have the application connect to it to improve performance.


⚠️ If interested, please read the README of the repository.

If this is not helpful, or an alternative approach should be considered, a demo using SQL directly to accomplish the same task can be made.

Hi, @jason_m

Thanks for this. I will be able to create rows in that format quite easily.

Some questions:

  • Does it also handle subordinate ranks of Species, e.g. Subspecies, Variety etc.?

  • Also, will the tool cope when there are several accepted names for a taxon with isAccepted = No?

Hi @Benr,

This is not a one-size-fits-all solution; it is merely a starting point for a more evolved import system. As I understand it, the tool would need to be adjusted to accommodate both of these situations.

The repository provides more details about the specifics. This tool only allows for importing or updating Taxon records and linking them to the appropriate “accepted” Taxon record. While it does not offer additional functionality, it would be great to share this tool with the community if you or someone else further develops it!