Words counting issue for mixed languages #21

@liushuping

Content very often mixes languages, for example a paragraph containing both English and Chinese. The big difference between counting English and CJK words is that CJK text does not separate words with spaces; the characters are simply adjacent (for counting purposes, a "word" and a "character" are effectively the same thing in CJK).
For example:
"The quick brown fox jumps over the lazy dog" is counted as 9 words.
The Chinese translation of that sentence, "敏捷的棕毛狐狸从懒狗身上跃过", should be counted as 14 words, but the actual result is 1.

This causes problems such as TryGhost/Ghost#2656 when writing a blog post in mixed languages.
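
To make the expected behavior concrete, here is a minimal sketch in Python. It is an illustration only, not this library's code. The name `count_words` and the chosen character ranges are assumptions: it treats each Han ideograph or Japanese kana character as one word and every other run of non-space characters as one word. Hangul is deliberately left to whitespace splitting, since Korean separates words with spaces.

```python
import re

# Assumed character ranges (illustrative, not exhaustive):
CJK = (
    "\u4e00-\u9fff"  # CJK Unified Ideographs
    "\u3400-\u4dbf"  # CJK Unified Ideographs Extension A
    "\u3040-\u30ff"  # Hiragana and Katakana
)

# A token is either a single CJK character, or a run of characters
# that are neither whitespace nor CJK (e.g. an English word).
TOKEN = re.compile(rf"[{CJK}]|[^\s{CJK}]+")


def count_words(text: str) -> int:
    """Count words, treating each CJK character as its own word."""
    return len(TOKEN.findall(text))


print(count_words("The quick brown fox jumps over the lazy dog"))  # 9
print(count_words("敏捷的棕毛狐狸从懒狗身上跃过"))  # 14
print(count_words("Hello 世界"))  # 3
```

Punctuation is not handled specially in this sketch: a full-width mark such as "。" falls outside the assumed ranges and would be counted as its own token, so a real fix would probably strip punctuation first.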
