This repository was archived by the owner on Dec 15, 2022. It is now read-only.

Problem with Unicode support #971

@digital-carver

Description

Auto-completion doesn't work properly when writing Tamil text. This is probably because the unicode-helpers.js file includes only the codepoints from the Letter (L) Unicode general category, whereas it should include all codepoints that have the Alphabetic property. For example, it has 0B95 for க, but not 0BC0, which is a vowel mark that combines with க to make கீ.

To give an idea of how weird this is, Atom gives completion when I type the equivalent of "SaRaGaMa", but not when I type the equivalent Tamil text of "SaReGaMa" - having any vowel other than a (அ) in the prefix disables the autocompletion.

A lot of codepoints for many Indic scripts (and some other Asian scripts) are placed in the M (Mark) categories in Unicode, and then given the Other_Alphabetic property (search for Other_Alphabetic in https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt). This means the current completion probably doesn't work properly for any of those languages. The solution, afaict, is just to add all the Other_Alphabetic codepoints from that file to unicode-helpers.js.
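To illustrate the difference between the Letter category and the Alphabetic property, here is a minimal sketch using JavaScript's Unicode property escapes (available in runtimes that support ES2018 regular expressions). The helper names `isWordLetterOnly` and `isWordAlphabetic` are hypothetical and are not taken from unicode-helpers.js, and this is not necessarily how that file builds its character classes; it only shows which codepoints each class accepts.

```javascript
// Hypothetical helpers for illustration; not the actual unicode-helpers.js code.
// \p{L} matches the Letter general category only.
const LETTER_ONLY = /^\p{L}+$/u;
// \p{Alphabetic} matches the derived Alphabetic property,
// which also covers codepoints listed as Other_Alphabetic.
const ALPHABETIC = /^\p{Alphabetic}+$/u;

function isWordLetterOnly(text) {
  return LETTER_ONLY.test(text);
}

function isWordAlphabetic(text) {
  return ALPHABETIC.test(text);
}

const ka = '\u0B95';        // க (TAMIL LETTER KA)
const kii = '\u0B95\u0BC0'; // கீ (KA + TAMIL VOWEL SIGN II)

console.log(isWordLetterOnly(ka));  // letter only, accepted by both classes
console.log(isWordLetterOnly(kii)); // rejected: U+0BC0 is a Mark, not a Letter
console.log(isWordAlphabetic(kii)); // accepted: U+0BC0 has Other_Alphabetic
```

This sketch only checks the vowel sign mentioned above. Other marks, such as the pulli (U+0BCD) that appears in the reproduction text below, would need to be checked against the Unicode data separately.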

Steps to Reproduce

  1. Type (or paste) எழுத்துக்கள் on the first line of a file
  2. Type (or paste) எழுத்து on the second line

Expected behavior: Completion to எழுத்துக்கள் should appear

Actual behavior: No completion appears - and pressing Ctrl-Space does nothing either

Reproduces how often: 100%

Versions

Version 1.27.1 on Windows 7 64-bit
