Export as CSV with alternative delimiter e.g. semicolon

A recurrent nuisance for us at our side of the pond is trying to open a csv-file generated by Specify7. Since our edition of MS Office Excel expects the so-called “comma-separated” files to have a semicolon as delimiter instead. So all the columns are squeezed into a single one when opening it in that program.

Of course, there are a multitude of workarounds, like importing the file into MS Excel or using a different program. The reality on the ground is that people just have MS Excel installed and expect it to open the spreadsheet file as is.

Right now, I’m hacking Specify7 to use a semicolon as delimiter instead, but this will be overwritten every time we update.

So a feature, where the delimiter can be specified using a global setting would be greatly appreciated.

Hi!

In user preferences, there is an option to change the export file delimiter for WorkBench data sets:

You can change the above for all users in the collection/institution by setting default user preferences - Setting default user preferences · specify/specify7 Wiki · GitHub

There is no similar option for Query Builder exports at the moment, but it would be a good idea to add it.

https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba

A quick glance at this Microsoft support site says this:

For: “Excel for Microsoft 365 Excel 2021 Excel 2019 Excel 2016 Excel 2013 Excel 2010 Excel 2007”

“You can change the separator character that is used in both delimited and .csv text files. This may be necessary to make sure that the import or export operation works the way that you want it to.”

Are you sure you cannot change the input file delimiter in your version of Excel? Just curious.

That option is not visible in my version of Specify7 (7.6.1)

As I wrote, there is the workaround of importing the file using the Import Text Wizard, but this is a real hazzle compared to simply opening the file you just downloaded. There is, unfortunately, not a way to tell MS Excel to open csv files assuming a different delimiter like a comma. A semicolon is always assumed.

Is there any reason in particular that you are not on the latest version (7.8.4)?
Do you use multiple versions of Specify 7 with your database?

The “;” delimiter vs. a “,” can be reset in your OS’ by defining the decimal marker (comma vs. point) in your preferences. MSExcel sticks to the settings in the OS general preferences in this regard.

Despite the great flexibility that Sp7.8 allowes for choosing a delimiter while import/export to/from WorhBench, Query Results are only downloadable as CSVs (as far as I’ve seen)

This said, one can work with Excel to get around the issue: Rather than simply opening your CSV with Excel, one should import it. With a Mac it’s pretty straightforward: [File > Import]. With Windows –as expected– requires more steps: [Data > Get external data > From external file (CSV)] … and just pick whatever delimiter your file comes with.
Those on this same side of the pond, remember to also choose the right character encoding: Unicode (UTF 8) in that same dialog if you don’t want characters other than English becoming a total mess.

Good luck

I appreciate your well-meaning input, but I’ve mentioned the import option already several times myself. I’m aware of it, but as I point it, it is a lot of a hassle, especially for the average user on the floor. The user experience with Specify7 needs to be as smooth as possible for specific reasons. The same goes for adjusting settings in the OS and what have you. It’s unreasonable to expect a common user to have to jump through several hoops just because they live on the wrong continent.

And I just discovered that fixing the delimiters is only half the problem; Excel also expects the encoding to be ANSI rather than the UTF-8 that is delivered.

I know these are issues of Microsoft and Excel and not actually Specify7 itself. Yet, it would be nice if there were to be some options for alleviating this. For instance, it would be nice to set an alternative delimiter as well as encoding for CSV downloads in general.

Is there any reason in particular that you are not on the latest version (7.8.4)?
Do you use multiple versions of Specify 7 with your database?

The particular reason is that we limit ourselves to only upgrade every 3 months, because often when we upgrade this leads to a plethora of user support cases that we need to address and thus extra work.

I’m replying for the benefit of other readers.

In my understanding it is necessary to allow all users (using different spreadsheet software, which might change at any time) to be able to manipulate query results. So you can never satisfy everyone all the time. A delimited text file therefore caters to everyone, provided that they are willing to do a small amount of work.

I import the csv into Excel. All I have to do is state that there are delimiters, state that there are headers, select the UTF-8 character set, select the delimiter and state whether there is a text qualifier in the csv file (i.e. if I know that commas will appear within a single field, this field will be enclosed in quotes so that it doesn’t get separated according to those commas). So yes, I have to understand text files and then I have to select these 5 properties.

For some reason I do not find this particularly onerous. I guess I don’t mind it because I understand the reason. I have googled to see whether there is a way to save this import configuration, but there doesn’t appear to be (I think you can save a file export configuration).

One thing that I would really like to see is the headers of the csv (query output) being created from the form captions rather than the backend field names. To me that is far more onerous to change.

1 Like

Hi,

Thank you for your feedback
The headers in query exports have been improved in 7.8.4. They are now using localized field names rather than database field names

2 Likes

Thanks for your take on this, Willem.

Listen… It is of course not onerous for those of us who are fluent in data and regularly handle files in different formats. I am speaking on behalf of our end users who are working with directly with their collections materials and digitization is a side-quest to them for which they, honestly speaking, have only limited time and resources to spend on.

As said, there are various specific reasons why we need their user experience to be as smooth as possible. It’s already a challenge to get them used to Specify7 to begin with and have all of them finally switch over from Specify6. The smallest kink puts them off and makes our work even harder. Also, we’re running two major projects that really put a strain on a lot of staff: Our new exhibits and the DiSSCo-funded mass digitization effort. Things need to be intuitive and snappy for common end users on the floor.

Also, whereas I understand the technical explanation for this phenomenon, I do not understand the reason for why M$ Excel has not made this easier for the end user. I’ve found LibreOffice can figure out this fine.

I’d think it’s a modest addition to help compensate for this situation, so I would appreciate if this is kept on the radar.

We have an open issue now on GitHub for the project:

Thank you again!

@fedoras

This is fixed in our Specify 7.8.7 release that is coming later today!