[xExtension-ReadingTime] Reading time of Japanese articles is too short #292

hkcomori · 2025-02-14T08:19:22Z

Word counts seem to be incorrect in languages such as Japanese, where words are not separated by spaces.
Therefore, reading time calculated from word counts is very short compared to actual reading time.

I think it is better to calculate the reading time from letters, because it is difficult to accurately count words in these languages.

Therefore, it would be nice to be able to set up the following:

Source metric: Select what to calculate reading time from: words or letters. (Default: words)
Conversion factor: Factor to convert source metrics to reading time. (Default: 300)
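
A rough sketch of how these two settings might fit together. The names `defaults`, `countUnits` and `readingTimeMinutes` are hypothetical and not part of the extension, and the factor is assumed here to mean "units read per minute":

```javascript
// Hypothetical settings; names and semantics are assumptions for illustration.
const defaults = { metric: 'words', factor: 300 };

// Count the chosen unit in a plain-text string.
function countUnits(text, metric) {
  const trimmed = text.trim();
  if (trimmed === '') return 0;
  if (metric === 'letters') {
    // Count Unicode code points, ignoring all white-space.
    return Array.from(trimmed.replace(/\s+/g, '')).length;
  }
  // Default: white-space separated words.
  return trimmed.split(/\s+/).length;
}

// Reading time in whole minutes, rounded up.
function readingTimeMinutes(text, options = {}) {
  const { metric, factor } = { ...defaults, ...options };
  return Math.ceil(countUnits(text, metric) / factor);
}
```

With `{ metric: 'letters' }`, a Japanese article would be measured by its characters instead of being counted as a handful of very long "words". A suitable default factor for letters would have to be chosen separately from the one for words.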

Alkarex · 2025-02-20T14:41:47Z

Before making a new option, I think it would be worth trying to make a more robust function:

Extensions/xExtension-ReadingTime/static/readingtime.js

Lines 46 to 50 in 8e31be6

    
    reading_time.textContent = reading_time.textContent.replace(/(^\s*)|(\s*$)/gi, ''); // exclude start and end white-space
    reading_time.textContent = reading_time.textContent.replace(/[ ]{2,}/gi, ' '); // 2 or more space to 1
    reading_time.textContent = reading_time.textContent.replace(/\n /, '\n'); // exclude newline with a start spacing
    return reading_time.textContent.split(' ').length;

One idea to investigate would be to add the number of ideograms to the number of Latin words.
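
A minimal sketch of that idea, assuming plain text as input. The function name `countWordsAndIdeograms` is hypothetical, and only Han ideograms are counted individually; kana and other scripts still fall through to the white-space split, which may or may not be what is wanted:

```javascript
// Matches a single Han ideogram (kanji, hanzi, hanja).
// Unicode property escapes require the `u` flag (ES2018+).
const HAN = /\p{Script=Han}/gu;

function countWordsAndIdeograms(text) {
  // Each ideogram counts as one unit.
  const ideograms = (text.match(HAN) || []).length;
  // Replace ideograms with spaces so they are not counted again as words.
  const rest = text.replace(HAN, ' ').trim();
  // Whatever remains is counted as white-space separated words.
  const words = rest === '' ? 0 : rest.split(/\s+/).length;
  return ideograms + words;
}

console.log(countWordsAndIdeograms('hello world'));      // 2
console.log(countWordsAndIdeograms('I read 漢字 today')); // 5
```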

hkcomori · 2025-02-21T01:19:26Z

In my experience, there does not seem to be a correlation between the number of Japanese ideograms (i.e., kanji) and reading time.

I did some more research on how to count the number of words.
It requires morphological analysis, and there seem to be several libraries for that (e.g., MeCab, kuromoji.js).
But it seems to be difficult; for example, when proper nouns appear, the results are incorrect.

So I believe it is much more accurate and stable to calculate reading time based on the number of letters.

Alkarex · 2025-02-21T10:35:41Z

We could try to use letters for all languages. PRs and tests welcome.

math-GH added the xExtension-ReadingTime label Feb 15, 2025

Alkarex added the help wanted, good first issue, and javascript labels Feb 20, 2025
