
Postem: Issues with Spanish characters

Description

One of my students is named Ramón Núñez Pérez. So far the other tools in Sakai display the name correctly, but Post’em does not. I’m attaching the file I used.

Attachments

7

Details

Environment

nightly servers
Created August 23, 2024 at 11:15 PM
Updated September 18, 2024 at 6:40 PM

Activity

Wilma Hodges September 18, 2024 at 6:40 PM

Yes, if the UTF-8 format is required, it should definitely be added to the upload instructions.

While it would be great to update PostEm to accept Excel files, or even better Google Sheets, I think those would both be feature requests beyond the scope of this Jira.

Adding a note in the upload instructions page should solve the immediate issue.

Matthew Jones August 25, 2024 at 11:36 PM
Edited

Many of the Spanish characters exist in extended ASCII, and it looks like that’s how your file was encoded.

For instance, Ramón in extended ASCII (Windows-1252, which Windows labels ANSI) is 52 61 6d f3 6e

However, in UTF-8 it is 52 61 6d c3 b3 6e
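The two byte sequences can be reproduced with a short Python sketch. It assumes the original file was saved as Windows-1252 (`cp1252`), which is what "ANSI" usually means on Western European Windows; the thread itself does not confirm the exact code page.

```python
name = "Ramón"

# Assumption: the "ANSI" file used Windows-1252, where ó is the single byte f3.
ansi_bytes = name.encode("cp1252")
# In UTF-8 the same character takes two bytes, c3 b3.
utf8_bytes = name.encode("utf-8")

print(ansi_bytes.hex(" "))  # 52 61 6d f3 6e
print(utf8_bytes.hex(" "))  # 52 61 6d c3 b3 6e

# A reader that expects UTF-8 cannot decode the ANSI bytes,
# because f3 is not followed by the continuation bytes UTF-8 requires.
try:
    ansi_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    print("not valid UTF-8:", err.reason)
```

This is why a strict UTF-8 reader shows the name incorrectly (or fails) when given the ANSI file, while the UTF-8 file works.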

My guess is the CSV reader in PostEm isn’t as “good” as the CSV reader in other places and only supports standard ASCII and UTF-8 encodings. Looking at the code, yes, it does have custom code for parsing CSVs rather than using a library like Apache Commons CSV. If someone did come into this code, I’d probably also make it able to read Excel files directly, which typically have better i18n support than CSVs do. And maybe also provide a sample file. But that seems like a lot of features for a tool that hasn’t had many changes.

So I guess you could leave this open as an issue and maybe someone would replace the entire CSV reader, but that feels like a large project with low value if putting a tip about UTF-8 saving solves the problem for now. While it is not the default, Excel has a “Save As” option for “CSV UTF-8” as well.
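For illustration only, here is a minimal Python sketch of what a more tolerant reader could do: try UTF-8 first and fall back to a legacy code page. It is not PostEm's code (PostEm is Java with its own parser); the function names `decode_upload` and `read_rows` are hypothetical, and the choice of Windows-1252 as the fallback is an assumption.

```python
import csv
import io


def decode_upload(raw: bytes) -> str:
    """Decode uploaded bytes as UTF-8, falling back to Windows-1252."""
    try:
        # "utf-8-sig" also drops the byte-order mark that Excel's
        # "CSV UTF-8" export writes at the start of the file.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Windows-1252.
        # Bytes undefined in that code page become U+FFFD instead of failing.
        return raw.decode("cp1252", errors="replace")


def read_rows(raw: bytes) -> list[list[str]]:
    """Parse the decoded text with the standard csv module."""
    text = decode_upload(raw)
    return list(csv.reader(io.StringIO(text, newline="")))


sample = "username,name\nrnunez,Ramón Núñez Pérez\n"
print(read_rows(sample.encode("cp1252")))
print(read_rows(sample.encode("utf-8")))
```

Both calls yield the same rows, so the student's name survives either encoding. The fallback is a guess, though: a file in some other legacy encoding would still decode to the wrong characters, which is one reason a documented UTF-8 requirement is the simpler fix.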

Andrea Schmidt August 25, 2024 at 11:19 PM

Saving the file as a UTF-8 csv does work: 25x: https://trunk-maria.nightly.sakaiproject.org/, build:fc042b24

I still don’t understand what the difference is between the users file upload and Post’em. If Post’em requires a UTF-8 file, then it definitely needs to be stated in the instructions on the upload page. Clicking the Add tab, this is what the user sees:

Instructions:

Your feedback file must be saved in .csv format.
The first column of your file must contain individual usernames.
The first row of your file must contain headings.

Your thoughts?

Andrea Schmidt August 25, 2024 at 10:49 PM

This is strange. When I uploaded the file, it displayed properly for me. When I downloaded the attachment (postem_text.csv) and opened it, it also displayed properly.

The users file I upload every evening to trunk creates the users correctly, including Ramón. Why is there a difference between files I can use to create new users and the csv for Postem?

Matthew Jones August 25, 2024 at 7:21 PM
Edited

The file you uploaded (postem_text.csv) looks like it was saved in ANSI format. I uploaded one (postem_text_utf8.csv) encoded with UTF-8. Are you able to try that to see if it works? The original doesn’t even display correctly in Jira.

There are a few ways to save a file as UTF-8. On Windows with regular Notepad, you have to change the Encoding dropdown to UTF-8, though the other options might also work.
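Another way, for anyone comfortable with a script, is to re-encode the file directly. The sketch below is a minimal Python example; the helper name `convert_to_utf8` is hypothetical, and the default source encoding of Windows-1252 is an assumption about what "ANSI" means on the machine that produced the file.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Re-encode a text file to UTF-8, leaving line endings untouched."""
    # Read and write raw bytes so no newline translation takes place.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

For example, `convert_to_utf8("postem_text.csv", "postem_text_utf8.csv")` would write a UTF-8 copy next to the original, assuming the original really is Windows-1252.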

This could be a documentation issue if it doesn’t mention this but I think it’s more common for people who frequently work with these languages and plain text files. It could be added as a tip on
