It would be great if Specify had a TaxonGeography table, like the AgentGeography table that it already has.
Hi @NielsKlazenga,
Thank you for your comment! I’m interested to know more about your vision for a TaxonGeography table.
Would the main goal be to view associated Geography from the Taxon side, or would the TaxonGeography table be unrelated to determined objects in your collection?
Currently, the Geography tree provides taxonomic context for each node when queried from the tree (query button). Querying a node from the Taxon tree includes associated Locality data, but not Geography. Would adding Geography to the Taxon query help satisfy this request?
@bronwynscc, I won’t speak for Niels but I’ve often thought that it would be neat to be able to indicate the type locality of a taxon in Specify via a linking table between Taxon and Geography.
Being able to indicate a type locality for a species-group taxon and then being able to automatically flag records that match it as topotypes would be pretty cool.
I think the main use, however, would be for indicating the range of taxa via geography. One could imagine automatic out-of-range flagging of specimens based on geography using this relationship.
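To make the out-of-range idea concrete, here is a minimal sketch of how such a check might work. Everything in it is an assumption for illustration: the node IDs, the `GEOGRAPHY_PARENT` and `TAXON_GEOGRAPHY` structures, and the function names are hypothetical and are not part of the Specify data model. A specimen counts as in range if its Geography node, or any ancestor of that node in the tree, is linked to the taxon.

```python
# Hypothetical, simplified Geography tree: node id -> parent id (None for the root).
GEOGRAPHY_PARENT = {
    1: None,  # Earth
    2: 1,     # Oceania
    3: 2,     # Australia
    4: 3,     # Victoria
    5: 3,     # Tasmania
    6: 1,     # Europe
    7: 6,     # Spain
}

# Hypothetical Taxon-Geography links: taxon id -> Geography node ids in its range.
TAXON_GEOGRAPHY = {
    100: {3},     # taxon 100: all of Australia
    200: {4, 7},  # taxon 200: Victoria and Spain
}


def ancestors_and_self(geo_id):
    """Yield a Geography node and then each of its ancestors up to the root."""
    while geo_id is not None:
        yield geo_id
        geo_id = GEOGRAPHY_PARENT[geo_id]


def is_out_of_range(taxon_id, geo_id):
    """Return True if the specimen's Geography falls outside the taxon's range.

    A taxon with no recorded range is never flagged, since there is
    nothing to compare against.
    """
    range_nodes = TAXON_GEOGRAPHY.get(taxon_id)
    if not range_nodes:
        return False
    return not any(node in range_nodes for node in ancestors_and_self(geo_id))
```

With this made-up data, a specimen of taxon 100 from Victoria passes (Victoria sits under Australia), while one from Spain would be flagged. Note that a specimen georeferenced only to "Australia" would be flagged for taxon 200, because the check cannot tell whether it came from Victoria; a real implementation would have to decide how to treat records that are coarser than the range.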
Thanks @bronwynscc for elevating my off-the-cuff remark to a feature request.
I had proposed this as a solution for @igranzow’s problem in Taxon Attribute: loading up data. I think having a Taxon geographies subform in the Taxon form would be preferable to some string tucked away in the Taxon Attributes.
I might have misunderstood the issue, but I would like to have the table anyway. I myself would indeed use it mainly to indicate which areas we’ve got collections from. We started delivering our Taxon Tree to ChecklistBank last year (well, I’ve done it once) and ChecklistBank lets you include distribution, e.g., https://www.checklistbank.org/dataset/307515/taxon/104681 (I still need to work out how to deliver them in a way ChecklistBank recognises them).
I like @nfshoobs’s idea of having the authoritative distribution of a taxon in Taxon Geographies and using that as a DQ test and issuing a warning when data is entered that places a Taxon outside its known distribution. Even without the DQ test you could use it to manage checklists in Specify (since you already have the infrastructure).
You can do both things, and others, just by adding a pick list to indicate whether the distribution is Authoritative or Inferred. In our Agent geographies subform, I’ve got an Agent role field with possible values Collector and Determiner.
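As a rough illustration of the pick-list idea, the sketch below models a Taxon Geography row with a distribution-type field and filters on it. The class, field, and function names are hypothetical and do not reflect an existing Specify table; the point is only that one link table can serve both uses if each row says which kind of distribution it records.

```python
from dataclasses import dataclass
from enum import Enum


class DistributionType(Enum):
    """Hypothetical pick-list values for a Taxon Geography row."""
    AUTHORITATIVE = "Authoritative"
    INFERRED = "Inferred"


@dataclass(frozen=True)
class TaxonGeography:
    """One hypothetical link row between a Taxon and a Geography node."""
    taxon_id: int
    geography_id: int
    distribution_type: DistributionType


def geographies_for(rows, taxon_id, distribution_type):
    """Return the Geography ids linked to a taxon with the given pick-list value."""
    return {
        row.geography_id
        for row in rows
        if row.taxon_id == taxon_id and row.distribution_type is distribution_type
    }


# Made-up example rows: one authoritative range entry, two inferred from holdings.
rows = [
    TaxonGeography(100, 3, DistributionType.AUTHORITATIVE),
    TaxonGeography(100, 4, DistributionType.INFERRED),
    TaxonGeography(100, 5, DistributionType.INFERRED),
]
```

A DQ test would then read only the Authoritative rows, while a "where do we have collections from" report would read the Inferred ones.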
As botanists, we do not really do type localities, but if Geography is granular enough for this purpose that would be a good use too. Just goes to show that there are many possible uses of a Taxon Geography table.
Thanks for your interest, all three of you. The proposed features sound great and would do a great deal of service. The granularity @NielsKlazenga refers to deserves serious consideration, as it would be extremely useful.
It’ll be challenging to adopt unless one resorts to the literature or to authoritative standard or universal source lists for Taxon Geographies, at least for a given taxonomic group (or geographical region). It’s a good way to start nonetheless.
I say this because what I encounter in the collection datasets I’m currently migrating is just an unstructured string of haphazardly constructed geographic ranges. These include strings like “throughout northern X”, “east of southern X”, “from X to Y”, “in X, Y and Z”, “Atlantic coast of X”, etc. The most I could do was to make sure that, for a given taxon, all occurrences of the “distribution” attribute are written in exactly the same way. This info is of limited use, but I need to keep it so it can be accessed in a query by plain text searches for specific terms (wishful thinking) as a proxy for true locality/geography (if only to keep my collection managers happy, despite my efforts to convey that searching in this manner is quite dysfunctional).
Yes, having messy strings of geographic descriptors as a TaxonAttribute is no long-term solution, but a less than ideal way of preserving very unstructured data. At least while we work on implementing Taxon Geographies.
Thanks again
Íñigo
I just realized that the type locality function would probably need to be a separate linking table that links taxon with one or more localities, and actually may represent a different table request. I agree with @igranzow that it’s difficult to think of a way to comprehensively list the taxon’s geographic range in political boundaries of geography tree nodes. There are GIS-based approaches using convex hulls of distributions that might make sense, but those would be quite difficult to adopt in Specify. @NielsKlazenga I’m wondering if there’s a way to do this via automated querying of the GBIF data for taxon records, as Specify Network already links the accepted name to occurrence data (though I’m not a huge fan of the way this is implemented). One could imagine taking the geographic data from those points, aggregating it, and using it to auto-populate Taxon Geography entries for taxa in Specify.
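The aggregation step could look something like the sketch below. It assumes the occurrence records have already been downloaded and are available as dictionaries keyed by the Darwin Core terms `country` and `stateProvince`; the function name, the `min_records` threshold, and the sample records are made up for illustration, and nothing here calls GBIF or Specify.

```python
from collections import Counter


def candidate_geographies(occurrences, min_records=2):
    """Aggregate occurrence records into candidate (country, stateProvince) areas.

    Records missing either field are skipped, and areas with fewer than
    `min_records` occurrences are dropped as possible errors or strays.
    """
    counts = Counter(
        (occ["country"], occ["stateProvince"])
        for occ in occurrences
        if occ.get("country") and occ.get("stateProvince")
    )
    return sorted(area for area, n in counts.items() if n >= min_records)


# Made-up occurrence records for a single taxon.
sample_occurrences = [
    {"country": "Australia", "stateProvince": "Victoria"},
    {"country": "Australia", "stateProvince": "Victoria"},
    {"country": "Australia", "stateProvince": "Victoria"},
    {"country": "Australia", "stateProvince": "Tasmania"},
    {"country": "Australia", "stateProvince": None},
]
```

The resulting areas would still need to be matched to nodes in the local Geography tree and reviewed before being saved, which is probably where most of the real work lies.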
@nfshoobs, Catalogue of Life seems to be the obvious place, if you do not want to fill it in yourself, but I have never downloaded that data myself. I would not describe data from GBIF as authoritative or complete, but it could be useful depending on how you use it. At the Atlas of Living Australia (ALA) we use a reverse jackknife algorithm to detect outliers based on five climate variables and other records in ALA.
The KU Biodiversity Institute & Natural History Museum used to have a project that created species distributions based on GBIF data. I do not know what has become of that.
And you do not have to do it for absolutely everything. We have the data for the Australian flora and, despite being a global collection, Australian flora is still about half of our collection. Many countries have lists for many groups of organisms, although I suspect there won’t be any for many groups of invertebrates.
It does not really matter where the data comes from. I would not know where to get authoritative data for Agent Geography either and yet it is there, and it is being used. Taxon Geography is at least as obvious a thing to have as Agent Geography and most likely has more uses. I would use it.
The most important use case for me is still the recording of geographic areas we’ve got collections of a taxon from.
@nfshoobs, if I wanted to have a link to the type locality of a taxon name, I would create a Collection Object for the type specimen, even if I do not have it in my collection, and create a new Prep. Type and/or Collection, at least for the records for which we do not have the specimen.
I agree with this
I think it would make sense to have regardless of where the data comes from, but I think that for most collections one would want to have an external authority that could be monitored for changes. The jackknife outlier detection approach you describe seems cool!
I personally wouldn’t add an unassociated object to my collection, since we have a hard rule that only physical specimens that we have or had physical ownership of get records in the CollectionObject table. I don’t use CollectionObject records for literature records, observations, other institutions’ specimens, etc. But even in collections that do use CO for non-owned vouchers, I think having the type localities linkable through the Taxon table makes much more sense because it can be applied to all records identified as that particular species-group taxon, and in addition the type locality data is most likely to be included in global species databases that other taxonomic information gets pulled from.
It might make more sense to think about it from the Locality side of the interaction. It would be nice to be able to have a subform on the Locality form that indicates whether that locality is the type locality for any taxa, by linking those taxon records to the Locality.
In principle, I would support, or at least not too strongly object to, a Taxon–Locality map, as, outside Specify, Locality is the lowest level in the dcterms:Location (the Darwin Core class) hierarchy (as in dwc:continent, dwc:country, dwc:stateProvince, dwc:county, dwc:municipality, dwc:locality), making Taxon–Geography and Taxon–Locality one and the same thing. However, data-model artifact or not, in Specify, TaxonLocality (or LocalityTaxon) is a different feature from the one we are talking about here, so it would be best to create a new feature request if people want to pursue it. (Let’s forget for a moment that I piggybacked my feature request on another issue as well.)
