Accent-insensitive user search

Description

We are Spanish speakers, and our names often contain accents.
However, writing names without accents is widely accepted, even though it is strictly a spelling error. As a result, names are sometimes stored in the database with accents and sometimes without.
The problem is that the user search tool distinguishes between a name with an accent and the same name without it, even though they are really the same name, and this causes problems for our teachers.
We have therefore changed the SQL query so that searches are accent-insensitive. The change could probably be improved to account for the characteristics of other languages; we have focused on solving the problem for Spanish.

Index: user/user-impl/impl/src/java/org/sakaiproject/user/impl/DbUserService.java
===================================================================
--- user/user-impl/impl/src/java/org/sakaiproject/user/impl/DbUserService.java (revision 379)
+++ user/user-impl/impl/src/java/org/sakaiproject/user/impl/DbUserService.java (working copy)
@@ -311,7 +311,7 @@
fields[4] = search;
List rv = super
.getSelectedResources(
- "SAKAI_USER.USER_ID = SAKAI_USER_ID_MAP.USER_ID AND (SAKAI_USER.USER_ID = ? OR UPPER(EID) LIKE UPPER(?) OR EMAIL_LC LIKE ? OR UPPER(FIRST_NAME) LIKE UPPER(?) OR UPPER(LAST_NAME) LIKE UPPER(?))",
+ "SAKAI_USER.USER_ID = SAKAI_USER_ID_MAP.USER_ID AND (SAKAI_USER.USER_ID = ? OR UPPER(EID) LIKE UPPER(?) OR EMAIL_LC LIKE ? OR TRANSLATE(UPPER(FIRST_NAME), 'ÁÉÍÓÚ-', CONCAT('AEIOU', CHR(32))) LIKE TRANSLATE(UPPER(?), 'ÁÉÍÓÚ-', CONCAT('AEIOU', CHR(32))) OR TRANSLATE(UPPER(LAST_NAME), 'ÁÉÍÓÚ-', CONCAT('AEIOU', CHR(32))) LIKE TRANSLATE(UPPER(?), 'ÁÉÍÓÚ-', CONCAT('AEIOU', CHR(32))))",
"SAKAI_USER_ID_MAP.EID",
fields,
"SAKAI_USER_ID_MAP");
@@ -330,7 +330,7 @@
fields[4] = search;
int rv = super
.countSelectedResources(
- "SAKAI_USER.USER_ID = SAKAI_USER_ID_MAP.USER_ID AND (SAKAI_USER.USER_ID = ? OR UPPER(EID) LIKE UPPER(?) OR EMAIL_LC LIKE ? OR UPPER(FIRST_NAME) LIKE UPPER(?) OR UPPER(LAST_NAME) LIKE UPPER(?))",
+ "SAKAI_USER.USER_ID = SAKAI_USER_ID_MAP.USER_ID AND (SAKAI_USER.USER_ID = ? OR UPPER(EID) LIKE UPPER(?) OR EMAIL_LC LIKE ? OR TRANSLATE(UPPER(FIRST_NAME), 'ÁÉÍÓÚ-', CONCAT('AEIOU', CHR(32))) LIKE TRANSLATE(UPPER(?), 'ÁÉÍÓÚ-', CONCAT('AEIOU', CHR(32))) OR TRANSLATE(UPPER(LAST_NAME), 'ÁÉÍÓÚ-', CONCAT('AEIOU', CHR(32))) LIKE TRANSLATE(UPPER(?), 'ÁÉÍÓÚ-', CONCAT('AEIOU', CHR(32))))",
fields,
"SAKAI_USER_ID_MAP");
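
To make the effect of the TRANSLATE expression concrete, here is a minimal Java sketch that mimics it outside the database. The class and method names (AccentFoldSketch, patchFold, generalFold) are hypothetical and not part of Sakai. patchFold mirrors the mapping in the patch (uppercase, then Á É Í Ó Ú to A E I O U and the hyphen to a space). generalFold is only an assumed, more language-neutral alternative based on java.text.Normalizer; the patch itself does nothing of the kind.

```java
import java.text.Normalizer;
import java.util.Locale;

class AccentFoldSketch {

    // Same mapping as the patch: accented uppercase vowels (A, E, I, O, U with
    // acute) plus the hyphen, written as Unicode escapes to stay encoding-safe.
    private static final String FROM = "\u00C1\u00C9\u00CD\u00D3\u00DA-";
    private static final String TO = "AEIOU ";

    /** Mimics TRANSLATE(UPPER(s), <accented vowels + hyphen>, 'AEIOU' + space). */
    static String patchFold(String s) {
        String upper = s.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder(upper.length());
        for (int i = 0; i < upper.length(); i++) {
            char c = upper.charAt(i);
            int idx = FROM.indexOf(c);
            // Replace listed characters, keep everything else unchanged.
            out.append(idx >= 0 ? TO.charAt(idx) : c);
        }
        return out.toString();
    }

    /** Assumed alternative: decompose to NFD, then drop all combining marks. */
    static String generalFold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "").toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // "Jose" with e-acute against plain "jose": the patch's mapping folds them together.
        System.out.println(patchFold("Jos\u00E9").equals(patchFold("jose")));       // prints true
        // "Muller" with u-diaeresis: not in the patch's list, so it still differs.
        System.out.println(patchFold("M\u00FCller").equals(patchFold("Muller")));   // prints false
        // The Normalizer-based variant folds any combining mark.
        System.out.println(generalFold("M\u00FCller").equals(generalFold("Muller"))); // prints true
    }
}
```

Note that generalFold also folds Ñ to N, which may not be wanted for Spanish, where they are distinct letters; patchFold leaves Ñ untouched.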

Activity

Aaron Zeckoski February 12, 2013 at 7:14 AM

My understanding is that this refers to the user search (from the admin tool or API calls) and improving it to work better with accents and special chars.

If there is a patch that is general enough to use here, the kernel team will review it and possibly apply it. Otherwise, I don't think we have the expertise to adjust the searches ourselves at the moment, or the community resources to investigate this. Resolving as no resources for now. Please reopen if there is a solution or someone wants to work on this.

Beth Kirschner January 22, 2011 at 7:03 PM

The database query should be reviewed to allow accent-insensitive searches.

Beth Kirschner January 22, 2011 at 6:50 PM

"...The strength property you choose depends on what your application is trying to accomplish. For example, when performing a text search you may allow a "weak" match, in which accents and differences in case (upper vs. lower) are ignored. This type of search employs the PRIMARY strength."

Use java.text.Collator.setStrength(java.text.Collator.PRIMARY) to compare and sort strings in a way that is both case-insensitive and accent-insensitive.
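
As an illustration of the Collator suggestion, the following is a minimal sketch rather than Sakai code. The class and method names (CollatorSearchSketch, primaryCollator, sameName) are hypothetical, and the choice of the "es" locale and of canonical decomposition are assumptions made for the example.

```java
import java.text.Collator;
import java.util.Locale;

class CollatorSearchSketch {

    /** Builds a collator that ignores case and accent differences. */
    static Collator primaryCollator(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        // PRIMARY strength: only base-letter differences count.
        collator.setStrength(Collator.PRIMARY);
        // Decompose precomposed letters so that e-acute is compared as
        // "e" followed by a combining accent.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator;
    }

    /** True when the two names are equal under the given collator. */
    static boolean sameName(Collator collator, String a, String b) {
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        Collator collator = primaryCollator(Locale.forLanguageTag("es"));
        // "Jose" with e-acute against upper-case "JOSE" without the accent.
        System.out.println(sameName(collator, "Jos\u00E9", "JOSE"));
        // Different base letters are still told apart.
        System.out.println(sameName(collator, "Jose", "Josu"));
    }
}
```

A Collator compares whole strings, so it fits equality checks and sorting; a LIKE-style substring search would need additional work on top of it.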

Anthony Whyte January 21, 2011 at 11:55 AM

Downgrading status to critical following triage review by Berg, Kirschner, Leveque, May and Whyte.

David Roldán Martínez September 16, 2010 at 1:33 PM

Sorting can probably be delegated to the DBMS. For example, Oracle has an NLS parameter named NLS_SORT that specifies which language's rules are used to sort results. This is faster than sorting strings in Java code.
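
A rough sketch of how this could look from Java over JDBC, assuming an Oracle connection is already available. The class and method names (OracleNlsSketch, sessionStatements, apply) are hypothetical; NLS_SORT and NLS_COMP are real Oracle session parameters, and sort names ending in _AI (for example BINARY_AI or SPANISH_AI) request accent-insensitive and case-insensitive behaviour. Which operators honour these settings depends on the Oracle version, so this should be checked against the documentation for the release in use.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

class OracleNlsSketch {

    /** Returns the ALTER SESSION statements for the given sort name. */
    static List<String> sessionStatements(String sortName) {
        // The name is concatenated into SQL, so accept only simple identifiers.
        if (!sortName.matches("[A-Z0-9_]+")) {
            throw new IllegalArgumentException("Unexpected sort name: " + sortName);
        }
        return Arrays.asList(
                // Make comparisons follow the linguistic rules set by NLS_SORT.
                "ALTER SESSION SET NLS_COMP = LINGUISTIC",
                "ALTER SESSION SET NLS_SORT = " + sortName);
    }

    /** Applies the settings to one session; the caller owns the connection. */
    static void apply(Connection connection, String sortName) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            for (String sql : sessionStatements(sortName)) {
                statement.execute(sql);
            }
        }
    }
}
```

With these settings in place, an ORDER BY on the name columns would be sorted by the database rather than in Java code.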