Ignore some extra chars in no-combining search #2929

abdnh · 2024-01-04T01:34:28Z

I took the translation list from the WordPress page and kept only the letters that NFKD normalization leaves unchanged.

dae · 2024-01-04T02:06:55Z

rslib/src/text.rs

        .nfkd()
        .filter(|c| !is_combining_mark(*c))
-        .collect::<String>()
-        .into()
+        .collect::<String>();


Use of phf is nice! I think the step below could be done more efficiently, though: use .map() here to transform the unwanted characters, instead of modifying the string afterwards. You should also be able to look up each character directly, rather than iterating over the phf map.

(It might be more efficient to do this in a 'for char in ... { output.push(char) or output.push_str(repl) }' loop than with flat_map.)
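A minimal sketch of the loop suggested above, not the code that was merged. To stay std-only, it assumes the characters have already been through NFKD and the combining-mark filter, and it uses a hypothetical `replacement` function with a few example entries in place of the phf map. The name `replace_extra_chars` is made up for illustration too.

```rust
/// Hypothetical stand-in for the phf lookup table: maps a character that
/// NFKD leaves unchanged to a plain replacement. The entries here are
/// examples only, not the list used in the PR.
fn replacement(c: char) -> Option<&'static str> {
    match c {
        'ø' => Some("o"),
        'Ø' => Some("O"),
        'ß' => Some("ss"),
        'æ' => Some("ae"),
        'đ' => Some("d"),
        _ => None,
    }
}

/// Build the output string in a single pass: one lookup per character,
/// pushing the replacement when there is one and the character otherwise.
/// Assumes `chars` has already been normalized and stripped of combining marks.
fn replace_extra_chars(chars: impl Iterator<Item = char>) -> String {
    let mut output = String::new();
    for c in chars {
        match replacement(c) {
            Some(repl) => output.push_str(repl),
            None => output.push(c),
        }
    }
    output
}
```

Calling `replace_extra_chars("Straße søster".chars())` yields `"Strasse soster"`. Because a replacement can be longer than one character, pushing into a `String` avoids the per-character iterator that flat_map would need.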


dae · 2024-01-05T04:22:42Z

Thanks Abdo!

Ignore some extra chars in no-combining search

2d99fe7

dae reviewed Jan 4, 2024


Construct new string

3d2d1f3

dae reviewed Jan 5, 2024


Update rslib/src/text.rs

f4146a0

dae merged commit 646ba41 into ankitects:main Jan 5, 2024

abdnh deleted the nc branch January 5, 2024 09:37
