Feature request: at least some support for hsc2hs pragmas #17

@sergv

I'd like to suggest incorporating at least some knowledge of hsc2hs pragmas into the tree-sitter grammar for Haskell.

While complete support may require some work, I would be glad to see the grammar not erroring out on the simpler pragmas that constitute 80%+ of typical hsc2hs use, for example:

// Given a C header containing something like
typedef struct foo {
...
    child_t child;
...
} bar_t;

test1 ptr foo =
  #{poke bar_t, child} ptr foo

test2 ptr foo =
  (#poke bar_t, child) ptr foo

test3 =
  #poke struct foo, child

All of the above are equivalent from the hsc2hs tool's perspective. Each defines a function of two arguments that, given a pointer to the structure and a new value, writes the value into the field named child of the C struct bar_t (that is, struct foo). Any of the three forms may occur in the wild.
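As far as I understand hsc2hs, poke comes down to the byte offset of the field, which is why the typedef name and the struct tag are interchangeable. A minimal C sketch of that equivalence follows. The field type and the placeholder fields are assumptions standing in for the elided parts of the header, and the two function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for the elided header: the field type and the
   placeholder fields are assumptions, not taken from a real library. */
typedef int child_t;

typedef struct foo {
    long before;    /* placeholder for the fields elided before child */
    child_t child;
    char after;     /* placeholder for the fields elided after child */
} bar_t;

/* Byte offset of `child`, with the struct spelled via its typedef name. */
size_t child_offset_via_typedef(void) {
    return offsetof(bar_t, child);
}

/* The same offset, with the struct spelled via its tag. */
size_t child_offset_via_tag(void) {
    return offsetof(struct foo, child);
}
```

Both functions return the same value, so `#poke bar_t, child` and `#poke struct foo, child` should end up writing to the same location.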

The reason to support this construct in the grammar is that forms 1 and 3 mess up the parsing state, so a file containing either of them is not parsed properly. If not for this, it would have been OK to do nothing and parse pragmas into some soup of tokens, since very few users are likely to want to query for pragma AST nodes specifically (none have asked so far, as far as I can tell).

It would be very useful to support at least all 3 variants of the following pragmas; hopefully their grammar will not prove too bad:

Pragma                          Example
#type ⟨C_type⟩                  #type int32_t
#peek ⟨struct_type⟩, ⟨field⟩    (#peek struct_t, child)
#poke ⟨struct_type⟩, ⟨field⟩    #{poke struct foo, bar}
#ptr ⟨struct_type⟩, ⟨field⟩     #ptr struct bar, baz
#offset ⟨struct_type⟩, ⟨field⟩  #offset bar_t, baz
#size ⟨struct_type⟩             #{size foo}
#alignment ⟨struct_type⟩        (#alignment bar)
#const ⟨C_expression⟩           #const FOO + 1
#const_str ⟨C_expression⟩       #{const_str global_constants[FOO]}

NB Both C_type and struct_type can be any type that is valid from the C compiler's perspective, so either a single-word type identifier like struct_t or a multi-word type denoting a struct referred to via its tag, like struct foo, may be used.

Arbitrary C expressions are likely to be tricky. But fully parsing them may not be necessary: if the grammar can just skip everything up to the closing delimiter, for example by supporting only forms 1 and 2 above, that would already be very useful.
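To illustrate the "skip everything up to the closing delimiter" idea, here is a rough C sketch of the kind of helper a scanner could use. The function name and interface are hypothetical and are not part of tree-sitter or hsc2hs. It assumes the caller has already consumed the opening `#{` or `(#`, counts all bracket kinds together, and ignores string and character literals.

```c
#include <stddef.h>

/* Hypothetical helper, not part of tree-sitter or hsc2hs.
   `s` points just past the opening delimiter of a pragma, and `close`
   is the character that ends it: '}' for #{...} or ')' for (#...).
   Returns the index of the closing delimiter, or -1 if the input ends
   (or an unbalanced closer appears) before it is found.
   Nested (), [] and {} are counted so that C expressions such as
   global_constants[FOO] or f(x) are skipped as a whole.
   Simplifications: bracket kinds are not distinguished from each other,
   and string/character literals are not handled. */
long pragma_body_end(const char *s, char close) {
    long depth = 0;
    for (long i = 0; s[i] != '\0'; i++) {
        char c = s[i];
        if (depth == 0 && c == close) {
            return i;               /* found the pragma's own closer */
        }
        if (c == '(' || c == '[' || c == '{') {
            depth++;                /* entering a nested bracket */
        } else if (c == ')' || c == ']' || c == '}') {
            if (depth == 0) {
                return -1;          /* closer of the wrong kind at top level */
            }
            depth--;                /* leaving a nested bracket */
        }
    }
    return -1;                      /* ran out of input */
}
```

For form 2, the scanner would call this with `')'` on the text after `(#`; for form 1, with `'}'` on the text after `#{`. Form 3 has no closing delimiter, so it would need a different rule.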

For reference, the full syntax of pragmas is documented here: https://github.com/haskell/hsc2hs/tree/2059c961fc28bbfd0cafdbef96d5d21f1d911b53?tab=readme-ov-file#input-syntax
