Specify v7.9.6.2
I’ve just made a DwCA export and RSS feed for our collection data (attached: DwCA_OSUM_Bivalves.xml (12.7 KB)). I have a few quick questions, for those with experience setting these up, about configuring the export so that it conforms better to Darwin Core.
-
Since our date data is in the form of a start date and an end date, both conforming to ISO 8601 (YYYY-MM-DD), we want dwc:eventDate to contain the start and end dates joined with “/” when both are present, and the end date with a “/” prefix when there is no start date. Is there a way to do this without an intermediate processing step?
-
Similar issue: dwc:associatedMedia should include the media associated with both the CO and the CE, since CE and CO are both “associated with” an occurrence record. (Really, any table linked to CO with an attachment might be considered associatedMedia to GBIF.) Is there a way to concatenate both aggregated lists of media links (CO Attachments Aggregated and CE Attachments Aggregated) into one single field in the DwCA file? The order doesn’t matter; I just want the two aggregated strings joined with “ | ” between them. I currently have them formatted correctly, so each displays a list of links to the attachment files.
For the above two things: I’ve tried a couple of different tricks with the XML (like the conditional logic or concatenation rules that normally work on labels made in Jaspersoft), but haven’t been able to make them work in the DwCA file; the export fails with a traceback:null error.
-
Our taxon table uses the WoRMS AphiaID to match taxa to WoRMS. In the DwC export, in order to conform to the dwc:taxonID and dwc:acceptedNameUsageID terms, the AphiaID should be exported with its LSID prefix (urn:lsid:marinespecies.org:taxname:), but I don’t record that prefix in the field because the WoRMS API / URL stems often use the AphiaID by itself. Is there a way to format the AphiaID with a static text prefix without using the taxon table formatter? Since I can only put one formatted record per table in a query, I can’t use the formatter for both fields, and I am already using it for another dwc field.
-
The Darwin Core modified term (dcterms:modified) is broader than the function of the Specify CO table’s timestampModified field. It is meant to capture the last time that any of the data in a single row was modified. If I modify a collecting event, the CO table’s timestampModified field doesn’t update for all related records, even though the data I have changed fundamentally affects them. Is there a way to have Specify take the most recent timestampModified of the CE, CO, Locality, Preparations, Determinations etc. tables and use that as the modified value? As it stands, I could edit the value of almost every single field that appears in the DwCA row, and the CO timestampModified would not change.
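To make the four transformations above concrete, here is a minimal Python sketch of what I am after, written as the kind of intermediate processing step I would rather avoid. It is not Specify configuration and says nothing about how the export mapping itself works. The function names and the sample values are hypothetical. The timestamp helper assumes the values are ISO 8601 strings that `datetime.fromisoformat` can parse, and that they are all in the same timezone (or all without one).

```python
from datetime import datetime

# LSID prefix for WoRMS taxon names, as quoted in the question above.
LSID_PREFIX = "urn:lsid:marinespecies.org:taxname:"


def event_date(start, end):
    """Build dwc:eventDate: 'start/end' if both exist, '/end' if only end."""
    start = (start or "").strip()
    end = (end or "").strip()
    if start and end:
        return f"{start}/{end}"
    if end:
        return f"/{end}"
    return start  # start date only, or an empty string if neither is present


def associated_media(*aggregated, sep=" | "):
    """Join already-aggregated link strings, skipping empty or missing ones."""
    return sep.join(s.strip() for s in aggregated if s and s.strip())


def aphia_lsid(aphia_id):
    """Prefix a bare AphiaID with the LSID stem; empty input stays empty."""
    text = "" if aphia_id is None else str(aphia_id).strip()
    return f"{LSID_PREFIX}{text}" if text else ""


def latest_modified(*timestamps):
    """Return the most recent of several ISO 8601 timestamps (as given)."""
    present = [t for t in timestamps if t]
    return max(present, key=datetime.fromisoformat) if present else ""


# Hypothetical values for one occurrence row.
print(event_date("2001-05-01", "2001-05-03"))  # 2001-05-01/2001-05-03
print(event_date("", "2001-05-03"))            # /2001-05-03
print(associated_media("co1.jpg | co2.jpg", "", "ce1.jpg"))
print(aphia_lsid(140480))
print(latest_modified("2023-01-01 10:00:00", None, "2024-03-05 08:30:00"))
```

The idea would be to export the raw start/end dates, both aggregated attachment strings, the bare AphiaID, and each table's timestampModified as separate columns, then collapse them into the final Darwin Core fields with functions like these. I would much prefer to get the same result directly from the export mapping.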
Some general DwCA export questions:
-
The GBIF validator reports “The description of the dataset is missing or too short”:“DESCRIPTION_MISSING_OR_TOO_SHORT” and “The EML document does not validate against the schema”:“URI is not absolute”. I can’t find descriptions of these issues online. I set up the EML following the instructions on the forum here (swapping in the relevant info for our collection). I feel like this is probably an easy fix, but I haven’t been able to figure it out yet. EML here: Test EML.xml (1.3 KB)
-
Is there a way to make an export that contains all of the collections using the same query? I assume not, but I do recall reading that some settings allow queries across collections. I don’t normally want queries to access all collections, but in this specific context it would be convenient, because it would mean one occurrence dataset to keep track of on GBIF instead of three.
Thanks!