Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Can't use rules to match string fields by a Polish letter if it's at the beginning of a word #840

Closed
1 task done
Jackenmen opened this issue Apr 2, 2023 · 8 comments · Fixed by #865
Closed
1 task done
Labels
bug Something isn't working help wanted Extra attention is needed user interface Related to the user interface

Comments

@Jackenmen
Copy link
Contributor

Verified issue does not already exist?

  • I have searched and found no existing issue

What happened?

I have transactions with an imported payee named "Ing Bank Śląski S.A.". I noticed that it is impossible to match them using is against that exact string and I noticed that when using contains instead, it only works as long as I match with anything that doesn't contain Ś, i.e. both "Ing Bank" and "ląski S.A." are matching fine. It seems that the rules in Actual are having trouble with matching string fields by a Polish letter if that letter is the first letter of any word in the field. I've also been able to reproduce this with the "Notes" field.

Reproduction steps:

  1. Start a fresh Actual instance.
  2. Click on "Try demo"
  3. Choose any account (i.e. "Ally Savings")
  4. Click on import.
  5. Choose this file: polish-letters-repro.csv
  6. Ensure CSV fields are properly matched, either of these configurations can be used for reproduction:
    image
    image
  7. After import, 4 transactions should be added:
    image
  8. Go to the rules page and start creating a new rule.
  9. Depending on what you chose in step 6, choose the "imported payee" or "notes" field.
  10. Use the "contains" matcher (this can be reproduced with "is" as well but you can test it more easily with "contains")
  11. Type in any single Polish letter: ĄĆĘŁŃÓŚŹŻąćęłńóśźż
  12. See only 2 transactions even though 4 should match. When you look closer, both of the matched transactions are the ones that don't start with a Polish letter:
    image

What error did you receive?

The relevant transactions do not show up.

Where are you hosting Actual?

Fly.io

What browsers are you seeing the problem on?

Chrome

Operating System

Linux

@Jackenmen Jackenmen added bug Something isn't working needs triage labels Apr 2, 2023
@Jackenmen
Copy link
Contributor Author

This is reproducible on the current edge image.

@MatissJanis MatissJanis added help wanted Extra attention is needed user interface Related to the user interface and removed needs triage labels Apr 6, 2023
@MatissJanis
Copy link
Member

👋 Thanks for the bug report. I was able to reproduce it and dug into it a bit. This smells like a case sensitivity bug somewhere.

When we query for transactions to display in the table we use a query like this.. WHERE LOWER(notes) LIKE %ē%. If you have two transactions with Ē and ē - only the lowercased transaction will be returned.

So it smells like a bug in better-sqlite3? Or perhaps misconfiguration on our end?

@Jackenmen
Copy link
Contributor Author

When I was looking into it, I was unable to reproduce the problem with an analogous query in SQLite Browser on Actual's db file so it would make sense for it to be a better-sqlite3 bug or a misconfiguration.

@Jackenmen
Copy link
Contributor Author

My testing confirms that this is an issue with better-sqlite3's behavior:
image

It looks like better-sqlite3 ships with a SQLite3 build without ICU support:
WiseLibs/better-sqlite3#465

@MatissJanis
Copy link
Member

Good investigation @Jackenmen ! Would you want to pursue this further by opening a ticket with better-sqlite3? Or by continuing conversations in the linked ticket.

@Jackenmen
Copy link
Contributor Author

I would prefer someone else to pursue it. I'm not sure if there's any chance for this to be improved in better-sqlite3 any time soon, though, considering it's a 2.5-year-old issue.

@j-f1
Copy link
Contributor

j-f1 commented Apr 6, 2023

I imagine ICU support would be very large (I believe Unicode data takes up most of the ~5MB size of SwiftWasm binaries) so perhaps the better option would be to do the case folding in JS?

@Jackenmen
Copy link
Contributor Author

After further investigation, I figured that we actually care more about sql.js not supporting ICU rather than better-sqlite3 since the former is actually being used in the web client while the latter no longer matters much now that there's no Electron client. The issue for that can be found here: sql-js/sql.js#458

@j-f1 j-f1 closed this as completed in #865 Apr 7, 2023
j-f1 pushed a commit that referenced this issue Apr 7, 2023

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
…es (#865)

Fixes #840 by creating application-defined SQL functions
(https://www.sqlite.org/appfunc.html) for Unicode-aware implementations
of `LOWER()` and `UPPER()`. This uses
`String.prototype.toLower/UpperCase()` JS method.

I initially wanted to just redefine `LOWER()` and `UPPER()` but due to
[sql.js not supporting the definition of deterministic
functions](sql-js/sql.js#551), I had to just
define them as separate functions and use that in the appropriate
places. It's probably better like that anyway...
FlorianLang06 pushed a commit to FlorianLang06/actual that referenced this issue Mar 7, 2024
…es (actualbudget#865)

Fixes actualbudget#840 by creating application-defined SQL functions
(https://www.sqlite.org/appfunc.html) for Unicode-aware implementations
of `LOWER()` and `UPPER()`. This uses
`String.prototype.toLower/UpperCase()` JS method.

I initially wanted to just redefine `LOWER()` and `UPPER()` but due to
[sql.js not supporting the definition of deterministic
functions](sql-js/sql.js#551), I had to just
define them as separate functions and use that in the appropriate
places. It's probably better like that anyway...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed user interface Related to the user interface
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants