Adoption of the Apache Arrow memory alignment and padding? #771

Open
@paddyhoran

Hi,

I'm just trying to get a sense of the level of interest from the ndarray developers regarding adopting the Apache Arrow memory layout and padding.

I have been wanting to build integrations between Arrow and ndarray for some time. Today it should be easy enough to build a zero-copy converter from Arrow to ndarray types. Arrow has a tensor type, and this could be converted (dropping Arrow's optional dimension names).
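To illustrate what "zero-copy" means here, below is a minimal sketch using only the standard library. `TensorView` is a hypothetical stand-in written for this example, not a type from Arrow or ndarray: it borrows an existing buffer and interprets it as a row-major 2-D tensor without copying any elements.

```rust
/// Hypothetical stand-in for a tensor view: it borrows the data and never copies it.
struct TensorView<'a> {
    data: &'a [f64],
    shape: [usize; 2],
}

impl<'a> TensorView<'a> {
    /// Wrap an existing buffer; returns None if the shape does not match its length.
    fn new(data: &'a [f64], shape: [usize; 2]) -> Option<Self> {
        if shape[0].checked_mul(shape[1])? == data.len() {
            Some(TensorView { data, shape })
        } else {
            None
        }
    }

    /// Row-major (C order) element lookup with bounds checking.
    fn get(&self, row: usize, col: usize) -> Option<f64> {
        if row < self.shape[0] && col < self.shape[1] {
            Some(self.data[row * self.shape[1] + col])
        } else {
            None
        }
    }
}

fn main() {
    // Pretend this Vec is the memory owned by the producer (e.g. an Arrow buffer).
    let buffer: Vec<f64> = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    let view = TensorView::new(&buffer, [2, 3]).expect("shape matches buffer length");

    // The view points at the same memory as the original buffer: nothing was copied.
    assert_eq!(view.data.as_ptr(), buffer.as_ptr());
    assert_eq!(view.get(1, 0), Some(4.0));
}
```

Going in this direction only requires a pointer, a shape, and a lifetime that keeps the producer's buffer alive, which is why the Arrow-to-ndarray conversion is the easy half.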

However, without guarantees about memory alignment and padding, you could not go back from ndarray to Arrow with zero-copy. The easiest way to provide those guarantees would be for ndarray to allocate its memory with the Arrow functions that back the Arrow Buffer type.
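As a rough illustration of the guarantee in question, here is a standard-library-only sketch. It assumes a 64-byte unit for both alignment and padding (the value the Arrow format recommends, as far as I know; the `ALIGNMENT` constant and the helper names `padded_len` and `is_arrow_compatible` are made up for this example and are not Arrow or ndarray APIs).

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

/// Assumed alignment and padding unit in bytes (hypothetical constant for this sketch).
const ALIGNMENT: usize = 64;

/// Round `len` up to the next multiple of ALIGNMENT.
fn padded_len(len: usize) -> usize {
    (len + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT
}

/// True if the start address and the capacity are both multiples of ALIGNMENT.
fn is_arrow_compatible(ptr: *const u8, capacity: usize) -> bool {
    (ptr as usize) % ALIGNMENT == 0 && capacity % ALIGNMENT == 0
}

fn main() {
    // Space for 100 f64 values: 800 bytes, padded up to 832.
    let len = 100 * std::mem::size_of::<f64>();
    let capacity = padded_len(len);
    let layout = Layout::from_size_align(capacity, ALIGNMENT).expect("valid layout");

    // SAFETY: the layout has a non-zero size, and the pointer is freed with the same layout.
    let ptr = unsafe { alloc_zeroed(layout) };
    assert!(!ptr.is_null(), "allocation failed");
    assert!(is_arrow_compatible(ptr, capacity));
    unsafe { dealloc(ptr, layout) };

    // A plain Vec<f64> only promises the alignment of f64 (8 bytes) and no padding,
    // so this check is not guaranteed to pass.
    let v = vec![0.0f64; 100];
    let bytes = v.capacity() * std::mem::size_of::<f64>();
    println!(
        "Vec<f64> compatible: {}",
        is_arrow_compatible(v.as_ptr() as *const u8, bytes)
    );
}
```

A buffer allocated this way can be handed to a consumer that expects aligned, padded memory without copying; a buffer from the default allocator generally cannot, which is the asymmetry described above.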

Arrow is attempting to make integrations between crates easier. I noticed this issue today, and it is the kind of issue we could avoid.

In general, I think that Arrow and ndarray fit together quite nicely: Arrow could provide a lot of help processing data, and ndarray provides all the algorithms once the data is cleaned and in memory.

I'm not very familiar with the ndarray codebase. If this sounds like a good idea, could you point me to where you allocate memory, along with any other information that might help?
