Package with Unicode characters is not searched correctly using the gallery #3186

Closed
skofman1 opened this issue Aug 8, 2016 · 2 comments
skofman1 commented Aug 8, 2016

Although the package was indexed correctly in Lucene, the gallery keeps showing this message:
[image: screenshot of the gallery message, not preserved]

Link to package: https://dev.nugettest.org/packages/Z̡̜͍̈̍̐̃̊͋́A̜̣͍̬̞̝͉̽ͧ͗L̸̖͕̤̠̹̘͖̃̌ͤG͓̝͓̰̀ͪO͈͌/

@maartenba

Looks like an encoding mismatch between gallery and search service.

Gallery encodes the package id as: Z%CC%88%CC%8D%CC%90%CC%83%CC%8A%CD%8B%CC%81%CC%A1%CC%9C%CD%8DA%CC%BD%CD%A7%CD%97%CC%9C%CC%A3%CD%8D%CC%AC%CC%9E%CC%9D%CD%89L%CC%83%CC%8C%CD%A4%CC%B8%CC%96%CD%95%CC%A4%CC%A0%CC%B9%CC%98%CD%96G%CC%80%CD%AA%CD%93%CC%9D%CD%93%CC%B0O%CD%8C%CD%88

Search service encodes the package id as: Z%CC%A1%CC%9C%CD%8D%CC%88%CC%8D%CC%90%CC%83%CC%8A%CD%8B%CC%81A%CC%9C%CC%A3%CD%8D%CC%AC%CC%9E%CC%9D%CD%89%CC%BD%CD%A7%CD%97L%CC%B8%CC%96%CD%95%CC%A4%CC%A0%CC%B9%CC%98%CD%96%CC%83%CC%8C%CD%A4G%CD%93%CC%9D%CD%93%CC%B0%CC%80%CD%AAO%CD%88%CD%8C
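
To make the mismatch concrete, here is a minimal Python sketch (not code from the gallery or the search service) that percent-decodes the trailing "GO" portion of the two IDs quoted above and compares their code points. The helper name `code_points` is hypothetical. The point it illustrates: both sides hold the same characters, only the order of the combining marks differs.

```python
from urllib.parse import unquote

# Trailing "GO" portion of the two percent-encoded IDs quoted above.
gallery_encoded = "G%CC%80%CD%AA%CD%93%CC%9D%CD%93%CC%B0O%CD%8C%CD%88"
search_encoded = "G%CD%93%CC%9D%CD%93%CC%B0%CC%80%CD%AAO%CD%88%CD%8C"


def code_points(encoded: str) -> list:
    """Percent-decode as UTF-8 and list each code point as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in unquote(encoded)]


gallery_points = code_points(gallery_encoded)
search_points = code_points(search_encoded)

# Same set of characters on both sides...
same_characters = sorted(gallery_points) == sorted(search_points)
# ...but the combining marks after each base letter are ordered differently.
same_order = gallery_points == search_points

print("gallery:", " ".join(gallery_points))
print("search: ", " ".join(search_points))
print("same characters:", same_characters, "| same order:", same_order)
```

An ordinal string comparison of the two decoded values therefore fails even though they render identically.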

I checked the database and the search service. Database

The bigger question here is: do we want to allow non-ASCII characters in package IDs? @yishaigalatzer
If so, this will require some work to ensure all systems use the same encoding (and thus that search returns the correct result here).

@maartenba

Looked a bit deeper into this one; normalizing the string seems to work. Here's a PR: #3188
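
As an illustration of why normalization helps, here is a small Python sketch using the standard `unicodedata` module. It is not the code from the PR: the thread does not say which normalization form the fix uses, so NFC is an assumption here, and `normalized_id` is a hypothetical helper. The inputs are the trailing "GO" portion of the two percent-encoded IDs from the earlier comment.

```python
import unicodedata
from urllib.parse import unquote


def normalized_id(encoded_id: str, form: str = "NFC") -> str:
    """Percent-decode an ID and apply a Unicode normalization form.

    NFC is an assumed default; the thread does not name the form used.
    """
    return unicodedata.normalize(form, unquote(encoded_id))


# Trailing "GO" portion of the gallery and search service encodings.
gallery_id = "G%CC%80%CD%AA%CD%93%CC%9D%CD%93%CC%B0O%CD%8C%CD%88"
search_id = "G%CD%93%CC%9D%CD%93%CC%B0%CC%80%CD%AAO%CD%88%CD%8C"

# Raw comparison fails because the combining marks are ordered differently.
raw_match = unquote(gallery_id) == unquote(search_id)

# Normalization puts combining marks into canonical order, so both sides agree.
normalized_match = normalized_id(gallery_id) == normalized_id(search_id)

print("raw match:", raw_match)
print("normalized match:", normalized_match)
```

Whatever form is chosen, the comparison only works if every system that stores or looks up the ID applies the same one.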
