Establishing relationship between synonym and preferred/accepted taxon, en masse

igranzow · March 31, 2024, 11:03pm

We need to upload a (long and fairly clean) list of new taxa prior to uploading the actual specimens/records that bear those names. I know many of those binomials are synonyms of others (specific ones) in that same list, where each synonym is unequivocally linked to its accepted/preferred name.
However, I can’t figure out how to map that relationship in the WorkBench. Fields like Taxon.acceptedTaxon and Taxon.isAccepted (populating it would help down the line) are not available in the WB. But neither is Determination.preferredTaxon if one is to try an alternative route.
Again, it’s a long list and synonymizing binomials one by one on the taxon tree would be painful.
I must be missing something. Thanks.

Grant · April 1, 2024, 6:46pm

Hi @igranzow,

Unfortunately, you are not missing anything. Currently, there is no method to bulk upload synonyms through WorkBench; they need to be set up one by one via the tree interface. Updating Taxon records individually through SQL or the API is feasible but more complex and risky compared to using the user interface.

github.com/specify/specify7

Allow configuring synonyms during initial imports

opened 05:32PM - 24 May 23 UTC

grantfitzsimmons

1 - Request 2 - Trees

> Is there a way to configure synonyms through batch import of taxons to the tree (either through code or through formatting a table a certain way?). Dragging and dropping after a taxonomy tree is setup is doable, but would be a lot easier to just configure the synonym relationships from the get go if possible. **Requested By:** Mark Pitblado on the [Speciforum](https://discourse.specifysoftware.org/t/trees-in-specify-7/534/2?u=specify) (on behalf of the University of British Columbia - Beaty Biodiversity Museum)

This has been requested a number of times and I’ve added your comment to the official issue tracking. If you have more comments or want to share your ideal workflow once this request is implemented, please reply to this topic below or on GitHub!

jason_m · April 3, 2024, 12:08am

Hi @igranzow,

As mentioned by @Grant, the best way this can be accomplished without individually synonymizing the Taxa records using the Tree Viewer would be through SQL or through a script utilizing the API (or some combination of the two).

If such a solution appeals to you before the ability to upload synonymies via the WorkBench is implemented and you are in need of assistance, can you clarify the structure of the input or provide a (optionally fictitious) sample of the dataset?

For example, what are the mapped column headings (to fields/relationships from CollectionObject, Determination, or Taxon, etc.) in the dataset?

Benr · April 4, 2024, 2:57am

Hi, @jason_m

We have 32.4k taxon names that we wish to import into Specify. 28% (9.1k) of them are synonyms. I’m interested to find out more about the options I have here to avoid forcing our curatorial staff to hand-edit that many synonyms post-launch.

By way of example, our data contains parent/child relationships between members of the taxonomic hierarchy and a relationship between a prior taxon name (a synonym of some kind) and a more recent name, which may itself be a synonym. That is, there may be several “hops” across names before arriving at one or more accepted taxon names.

| NAME_ID | RANK_NAME | IS_CURRENT | NAME | AUTHOR |
| --- | --- | --- | --- | --- |
| 9024 | Species | false | Boronia machardiana | F.Muell. |
| 9025 | Species | false | Boronia viminea | Lindl. |
| 16636 | Subspecies | true | Boronia crenulata subsp. viminea | (Lindl.) Paul G. Wilson |

| OLD_NAME_ID | NEW_NAME_ID | XREF_TYPE |
| --- | --- | --- |
| 9024 | 16636 | TSY |
| 9025 | 16636 | TSY |

This data documents the following synonym relationships.

Boronia machardiana F.Muell. is a taxonomic synonym of Boronia crenulata subsp. viminea (Lindl.) Paul G. Wilson.
Boronia viminea Lindl. is a taxonomic synonym of Boronia crenulata subsp. viminea (Lindl.) Paul G. Wilson.

I had planned to convert this data into a CSV suitable for uploading to the Workbench and hoped to build sufficient data into that CSV file to allow Specify’s taxon tree to recognise the synonym relationships.
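
A minimal sketch of how the “hops” described above could be flattened before building such a CSV, so that every non-current name points directly at the current name(s) it eventually leads to. The column meanings come from the two tables above; the function name `resolve_accepted` and the dictionary layout are hypothetical, and this is not part of Specify.

```python
from collections import defaultdict

# Sample rows copied from the two tables above.
names = {
    9024: {"name": "Boronia machardiana", "is_current": False},
    9025: {"name": "Boronia viminea", "is_current": False},
    16636: {"name": "Boronia crenulata subsp. viminea", "is_current": True},
}
xrefs = [(9024, 16636, "TSY"), (9025, 16636, "TSY")]


def resolve_accepted(names, xrefs):
    """Map each non-current NAME_ID to the set of current NAME_IDs it leads to."""
    successors = defaultdict(set)
    for old_id, new_id, _xref_type in xrefs:
        successors[old_id].add(new_id)

    def walk(name_id, seen):
        if name_id not in names or name_id in seen:
            return set()  # unknown ID, or a cycle in the cross-references
        if names[name_id]["is_current"]:
            return {name_id}
        found = set()
        for next_id in successors.get(name_id, ()):
            found |= walk(next_id, seen | {name_id})
        return found

    return {
        name_id: walk(name_id, frozenset())
        for name_id, row in names.items()
        if not row["is_current"]
    }


resolved = resolve_accepted(names, xrefs)
```

A synonym whose set holds more than one ID has several accepted names, and an empty set means the chain never reaches a current name; both cases probably need a curatorial decision before import.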

pverley · April 9, 2024, 10:37am

Thanks Iñigo and Ben for bringing up the issue. We are facing exactly the same problem here in the French Guiana Herbarium (CAY) for migrating our taxon tree to Specify.
We have ~55k taxa to import and ~30k are synonyms, so this is definitely not something that can be dealt with manually.
We +1 the feature request and, in the meantime, we are happy to share our dataset and brain power to come up with an SQL-based (semi-)automated solution.
Thanks,
Philippe V. (CAY)

Grant · April 10, 2024, 7:46pm

Hi @pverley,

Could you share the dataset of synonyms you want to import into Specify? We will explore and suggest an interim method for importing synonyms until WorkBench fully supports this feature.

Thank you!

igranzow · April 22, 2024, 10:01am

Hi Grant.

You asked Philippe for the dataset of synonyms he’s dealing with, so I’ll tag along. I have a couple of them (seed plants & fishes) from the MNCN Biobank database, which I’m close to finishing migrating. The taxa lists are nowhere near the size of what Philippe is talking about, because the Biobank contains only a small subset of taxa, but I will encounter much larger volumes when I tackle the main collections.

… if it’s of any help

Thanks so much.

Íñigo

FishesACCEPTED in MNCN-CSIC biobank.xlsx (37.1 KB)

PlantsACCEPTED in MNCN-CSIC biobank.xlsx (22.2 KB)

jason_m · April 28, 2024, 7:14pm

Hi everyone!

Thank you all for providing example datasets and being patient.

I have created a repository on GitHub which demonstrates how the API can be used with Python and the requests library to create an application which mass-imports taxonomic data (including synonyms) to a Specify 7 instance.

In short, the demo takes a CSV containing information in the following format, creates a Mammalia taxon node if one does not exist, and uploads the taxon records under the Mammalia node (at the correct ranks specified in the CSV columns)

| Order | Family | Genus | Species | isAccepted | Author | AcceptedGenus | AcceptedSpecies | AcceptedAuthor |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Afrosoricida | Tenrecidae | Microgale | talazaci | Yes | Major, 1896 | | | |
| Afrosoricida | Tenrecidae | Oryzorictes | talpoides | No | G.Grandidier & Petit, 1930 | Oryzorictes | hova | A.Grandidier, 1870 |
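
For anyone preparing a file in this layout, here is a minimal sketch of how the rows could be read and split into accepted names and synonym-to-accepted pairs. It is not the repository's code: the column names are taken from the sample above, while `read_taxa`, the comma-delimited quoting, and the "Yes"/"No" handling are assumptions for illustration.

```python
import csv
import io

# The two sample rows from above, written as comma-separated text.
# Authors contain commas, so those fields are quoted.
SAMPLE = '''Order,Family,Genus,Species,isAccepted,Author,AcceptedGenus,AcceptedSpecies,AcceptedAuthor
Afrosoricida,Tenrecidae,Microgale,talazaci,Yes,"Major, 1896",,,
Afrosoricida,Tenrecidae,Oryzorictes,talpoides,No,"G.Grandidier & Petit, 1930",Oryzorictes,hova,"A.Grandidier, 1870"
'''


def read_taxa(text):
    """Return (accepted binomials, [(synonym binomial, accepted binomial), ...])."""
    accepted, synonyms = [], []
    for row in csv.DictReader(io.StringIO(text)):
        binomial = f"{row['Genus']} {row['Species']}"
        if row["isAccepted"].strip().lower() == "yes":
            accepted.append(binomial)
        else:
            target = f"{row['AcceptedGenus']} {row['AcceptedSpecies']}"
            synonyms.append((binomial, target))
    return accepted, synonyms


accepted, synonyms = read_taxa(SAMPLE)
```

Splitting the rows this way makes it easy to check, before any upload, that every accepted name referenced by a synonym can actually be created or found.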

By default, the application is set to connect to https://sp7demofish.specifycloud.org/ using the sp7demofish user and logging into the KUFishvoucher collection, so you can see it in action and independently make edits to the code/data and see the result without worrying about making changes to a live production instance.
(If you plan on developing your own application or adopting the one in the repository, you can use this sp7demofish instance for API testing purposes. The data in the instance should be regularly wiped.)

The code was developed as a minimum viable product (demo) without optimization in mind, so there is room to optimize it.
You can also host a Specify 7 instance locally and have the application connect to that local instance to improve performance.

If interested, please read the README of the repository.

If this is not helpful, or an alternative approach should be considered, a demo using SQL directly to accomplish the same task can be made.

Benr · April 30, 2024, 1:44am

Hi, @jason_m

Thanks for this. I will be able to create rows in that format quite easily.

Some questions:

Does it also handle subordinate ranks of Species, e.g. Subspecies, Variety etc.?
Also, will the tool cope when there are several accepted names for a taxon with isAccepted = No?

Grant · April 30, 2024, 7:30pm

Hi @Benr,

This is not a one-size-fits-all solution and is merely a starting point for a more evolved import system. Based on my understanding, it would need to be adjusted to accommodate these situations.

The repository provides more details about the specifics. This tool only allows for importing or updating Taxon records and linking them to the appropriate “accepted” Taxon record. While it does not offer additional functionality, it would be great to share this tool with the community if you or someone else further develops it!

NielsKlazenga · May 3, 2024, 12:15am

I do not know the WorkBench in Specify7 at all, but if you can upload Taxon Citations in it, that is what I would try.

As of last week, I am managing five different taxonomies in the Taxon Tree: VicFlora, the Australian Plant Census (APC), the World Checklist of Vascular Plants (WCVP/PoWo), AusMoss and Bryonames. Each taxonomy is a Reference Work, and in the Taxon Citation I have a field (Text1) in which I record the accepted name according to that taxonomy/Reference Work. From there it is a simple update query to get the synonyms (provided the accepted names are in the Taxon Tree).
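
A rough sketch of what such an update could look like, run here against an in-memory SQLite stand-in rather than a real Specify database. The table and column names (`taxon.TaxonID`, `FullName`, `IsAccepted`, `AcceptedID`, `taxoncitation.Text1`, `ReferenceWorkID`) are assumptions about the schema, the rows are toy data, and `apply_synonymy` is a hypothetical helper; anything like this should be tried on a backup first.

```python
import sqlite3


def apply_synonymy(conn, ref_work_id):
    """Point each cited taxon at the accepted name stored in Text1."""
    # Find (synonym, accepted) pairs by matching Text1 against taxon names.
    pairs = conn.execute(
        """
        SELECT s.TaxonID, a.TaxonID
        FROM taxoncitation c
        JOIN taxon s ON s.TaxonID = c.TaxonID
        JOIN taxon a ON a.FullName = c.Text1
        WHERE c.ReferenceWorkID = ? AND a.TaxonID <> s.TaxonID
        """,
        (ref_work_id,),
    ).fetchall()
    # Flag each synonym and link it to its accepted taxon.
    conn.executemany(
        "UPDATE taxon SET IsAccepted = 0, AcceptedID = ? WHERE TaxonID = ?",
        [(accepted_id, synonym_id) for synonym_id, accepted_id in pairs],
    )
    return len(pairs)


# Toy stand-in tables; a real Specify database has many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE taxon (
        TaxonID INTEGER PRIMARY KEY,
        FullName TEXT,
        IsAccepted INTEGER DEFAULT 1,
        AcceptedID INTEGER
    );
    CREATE TABLE taxoncitation (
        TaxonID INTEGER,
        ReferenceWorkID INTEGER,
        Text1 TEXT
    );
    INSERT INTO taxon (TaxonID, FullName) VALUES
        (1, 'Boronia crenulata subsp. viminea'),
        (2, 'Boronia viminea');
    INSERT INTO taxoncitation VALUES
        (1, 10, 'Boronia crenulata subsp. viminea'),
        (2, 10, 'Boronia crenulata subsp. viminea');
    """
)
linked = apply_synonymy(conn, 10)
```

Selecting the pairs first and updating afterwards keeps the two steps easy to inspect; names that match more than one taxon (homonyms) would need to be resolved by hand before running the update.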

I have not had the chance to tie this all up in a bow yet, but I plan to do this only once and have it update automatically and periodically after that. So, @Benr, if you can wait until September, when I am back from long service leave, I can help you out.

pverley · May 14, 2024, 4:01pm

I was not very responsive in sending demo data, but the import file is basically just like the one sent by Íñigo: taxon details + accepted taxon details.
I will read the README carefully, but at first glance I understand that I can:

either use it to upload new taxa with synonyms
or update existing taxa and link them as synonyms

Both ways of using the tool will be very useful. Thank you so much for the hard work !

pverley · May 30, 2024, 1:51pm

Thank you so much @jason_m for sharing the python code as an example of how to use the API in a programming way, so handy!

I slightly adapted the code to our needs, since we are importing an entire taxon tree (plants of Guyane) into Specify from an existing database where we already had a synonym index. I’m writing down what I did in case it is ever useful to others:

| Rank | Name | ID | SynonymRank | SynonymName | SynonymID |
| --- | --- | --- | --- | --- | --- |
| Forma | acrocarpa | 80667 | Subspecies | sphaerocarpa | 73402 |
| Forma | acropteron | 18346 | Species | hemipteron | 18345 |
| Forma | aculeata | 50701 | Variety | aculeata | 50699 |
| Forma | acuminatum | 99284 | Species | auriculatum | 17972 |
| Forma | acutifolia | 81079 | Species | subrevoluta | 1465 |
| Forma | alabamense | 16918 | Species | jenmanii | 16915 |
| Forma | albiflora | 77914 | Species | volubilis | 5452 |
| Forma | albolana | 77954 | Species | pentandra | 838 |
| Forma | alternatum | 99289 | Species | auriculatum | 17972 |
| Forma | althaeoides | 80673 | Species | althaeoides | 5326 |
| Forma | amazonica | 17157 | Species | sellowii | 16554 |
| Forma | amurensis | 100008 | Species | sibirica | 12046 |
| Forma | anadencum | 99217 | Species | campyloptera | 98013 |
| … | … | … | … | … | … |

That way we could handle synonymy for every taxon rank and handle the fact that the accepted taxon may not be of same rank.

We synonymized ~30k taxa this way. To avoid a “Session closed” error, I split the import into 60 CSV files of ~500 taxa each and looped the import script over them. It took a working day to understand and adapt the Python code and prepare the CSV files. The import ran overnight, in less than 12 hours.
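
The batching step could be sketched roughly as follows. This is not the script that was actually used: `batches`, `write_batches`, and the `taxa_001.csv` file naming are hypothetical, and the placeholder rows only stand in for real data in the six-column layout above.

```python
import csv
from itertools import islice
from pathlib import Path


def batches(rows, size=500):
    """Yield consecutive lists of at most `size` rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def write_batches(header, rows, out_dir, size=500):
    """Write one CSV per batch, repeating the header in every file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, batch in enumerate(batches(rows, size), start=1):
        path = out_dir / f"taxa_{number:03d}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(batch)
        paths.append(path)
    return paths


header = ["Rank", "Name", "ID", "SynonymRank", "SynonymName", "SynonymID"]
# Placeholder rows, only to show how the batch sizes come out.
rows = [["Forma", f"name{i}", str(i), "Species", "target", "0"] for i in range(1234)]
batch_sizes = [len(batch) for batch in batches(rows)]
```

Each file written by `write_batches` can then be passed to the import script in a loop, so a dropped session costs at most one batch.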

Cheers,
Philippe V. (CAY, NOU)
