@philip-paul-mueller philip-paul-mueller commented Oct 30, 2025

This PR refactors how calling an SDFG works.
PR#1467 introduced the fast_call() API, which allows calling a compiled SDFG while skipping some checks.
This was done to support the use case of calling the same SDFG with the same arguments (as in, the same pointers) multiple times.
However, that PR did not introduce a simple way to generate the argument vector that has to be passed to fast_call() without relying on internal implementation details of the class.
This PR, among other things, supports this use case and gives access to all the steps needed to call an SDFG:

  • construct_arguments(): Accepts Python arguments, such as int or NumPy arrays, and turns them into an argument vector in the right order, converted to the required C types.
  • fast_call(): Performs the actual call using the passed argument vectors; if needed, it also runs initialization.
    Note that this function is not new, but it was slightly modified and no longer handles the return values, see below.
  • convert_return_values(): Performs the actual return operation, i.e. composes the specified return type, which is either a single array or a tuple.
    Previously this function was called by fast_call(), but it was moved out to shorten the hot path, because return values are usually passed directly through inout or out arguments.
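
A minimal sketch of the three-step flow described above, using a hypothetical `ToyCompiledProgram` class rather than DaCe's `CompiledSDFG`. Only the three method names are taken from this PR; their signatures, the argument names `a` and `n`, and the doubling "kernel" are assumptions made for illustration.

```python
import ctypes

import numpy as np


class ToyCompiledProgram:
    """Hypothetical stand-in for a compiled SDFG; NOT the DaCe implementation."""

    _ARG_ORDER = ("a", "n")  # assumed fixed order of the "C" signature

    def __init__(self):
        self._retval = None

    def construct_arguments(self, **kwargs):
        """Slow path: validate, allocate the return array, convert to C types."""
        missing = [name for name in self._ARG_ORDER if name not in kwargs]
        if missing:
            raise TypeError(f"missing arguments: {missing}")
        n = int(kwargs["n"])
        self._retval = np.zeros(n, dtype=np.float64)  # fresh return buffer
        double_ptr = ctypes.POINTER(ctypes.c_double)
        return (
            kwargs["a"].ctypes.data_as(double_ptr),
            ctypes.c_int(n),
            self._retval.ctypes.data_as(double_ptr),
        )

    def fast_call(self, argvec):
        """Hot path: no checks and no return-value handling."""
        a_ptr, n, ret_ptr = argvec
        for i in range(n.value):
            ret_ptr[i] = 2.0 * a_ptr[i]

    def convert_return_values(self):
        """Compose the Python-level return value after the call."""
        return self._retval


prog = ToyCompiledProgram()
a = np.arange(4, dtype=np.float64)
argvec = prog.construct_arguments(n=4, a=a)  # keyword order does not matter
for _ in range(3):
    prog.fast_call(argvec)  # reuse the same argument vector
result = prog.convert_return_values()
```

The point of the split is that the conversion work is paid once, while the repeated calls only touch the prebuilt argument vector.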

Besides these changes, the PR also modifies the following:

  • It was possible to pass return values, i.e. __return, as ordinary arguments. This is still supported, but a warning is now issued.
  • CompiledSDFG nominally supported scalar return values; however, due to technical limitations this cannot actually work, so the feature was removed.
    However, the mechanism was not dropped completely and is still used to handle pyobjects, since they are passed as pointers, see below.
  • The handling of pyobject return values was modified.
    Before, it was not possible to use pyobject instances as return values if they were managed outside of, i.e. not allocated by, CompiledSDFG; now they are "handled".
    Note that an array of pyobjects, i.e. multiple instances, is handled as a single object (this is the correct behaviour and retained for bug compatibility with the unit tests); however, a warning is generated.
  • It was possible to pass an argument both as a named argument and as a positional argument; this is now forbidden.
  • safe_call() cannot handle return values; if the method is called on an SDFG that has return values, an error is generated.
  • Before, it was not possible to return a tuple with a single element; in that case the value was always returned directly. This has been fixed and is now handled correctly.
  • The allocation of return values was inconsistent.
    If there was no change in size, then __call__() would always return the same arrays, which might lead to very surprising bugs.
    The new behaviour is to always allocate new memory; this is done by construct_arguments().
  • Shared return values.
  • Before, CompiledSDFG had a member _lastargs which "cached" the last pointer arguments that were used to call the SDFG.
    It was updated by _construct_args() (the old version of construct_arguments()), which did not make much sense.
    The original intention was to remove it, but this proved to be harder than expected, so it is retained.
    However, it is now updated by __call__() and initialize() to support the use case of {get, set}_workspace_size().
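
The return-value allocation change in the list above can be illustrated with two hypothetical toy classes (neither is DaCe code). `ReusingProgram` mimics the old behaviour of handing back the same array while the size is unchanged; `AllocatingProgram` mimics the new behaviour of allocating on every call.

```python
import numpy as np


class ReusingProgram:
    """Hypothetical toy: reuses the return buffer while the size is unchanged."""

    def __init__(self):
        self._ret = None

    def __call__(self, a):
        if self._ret is None or self._ret.shape != a.shape:
            self._ret = np.empty_like(a)
        np.multiply(a, 2, out=self._ret)  # overwrites the previous result
        return self._ret


class AllocatingProgram:
    """Hypothetical toy: allocates a fresh return array on every call."""

    def __call__(self, a):
        return np.multiply(a, 2)


reuse = ReusingProgram()
first = reuse(np.array([1.0, 2.0]))
second = reuse(np.array([10.0, 20.0]))
# `first` and `second` are the same array, so the first result is gone.

fresh = AllocatingProgram()
x = fresh(np.array([1.0, 2.0]))
y = fresh(np.array([10.0, 20.0]))
# `x` still holds the result of the first call.
```

With the reusing variant, a caller who stores `first` and calls the program again silently loses the earlier result, which is the kind of bug the always-allocate policy avoids.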

Due to the refactoring, the case that a variable is passed both as a positional and as a named argument is now detected and asserted.
This test, however, always passed `a` as a positional argument and, if `symbolic` is `True`, also as a named argument.
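
As a rough illustration of such a check, the following sketch binds positional and named arguments onto a fixed argument order and rejects duplicates. The helper `bind_arguments` is hypothetical, and raising `TypeError` is an assumption; the PR does not show how DaCe reports the conflict.

```python
def bind_arguments(argnames, args, kwargs):
    """Map positional and named arguments onto a fixed argument order."""
    if len(args) > len(argnames):
        raise TypeError(f"expected at most {len(argnames)} positional arguments")
    bound = dict(zip(argnames, args))  # positional arguments fill names in order
    duplicates = sorted(set(bound) & set(kwargs))
    if duplicates:
        raise TypeError(f"passed both positionally and by name: {duplicates}")
    bound.update(kwargs)
    return bound


ok = bind_arguments(("a", "b"), (1,), {"b": 2})

try:
    bind_arguments(("a", "b"), (1,), {"a": 3})  # `a` is given twice
    conflict_detected = False
except TypeError:
    conflict_detected = True
```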
:note: If not initialized this function will initialize the memory for
the return values, however, it might also reallocate said memory.
:note: This function will also update the internal argument cache.
:note: The update of `self._lastargs` should be considered a bug rather than a feature.
Collaborator

What does this mean? It's not a bug, it caches the result types to avoid reallocation.

Collaborator Author

In my opinion, a call to construct_arguments() should not update self._lastargs, because at that point the SDFG has not been called yet; that only happens in fast_call() (which, for speed reasons, should also not do the update).
The place where this update should happen is inside __call__(), to be exact between construct_arguments() and fast_call().
And to be honest here, I would remove _lastargs.

Now to the reason why this is not possible.
construct_arguments() must call _initialize_return_values(), because the return value is part of the argument vectors.
Now there are a few very obscure points:

  • _lastargs only contains the pointers to the memory, thus keeping them is not enough to keep the memory allocated.
  • _initialize_return_values() does not always allocate new memory; if the memory still has the right size, it will not allocate it anew (you can force a reallocation by calling clear_return_value()).
    This has the nice side effect that the return values might be shared; I use the word "might" here because, depending on the situation, they are or they are not (BTW: that behaviour is not documented).

These two points imply that if you reallocate the return values you should throw the pointers away, because they might become dangling.
Then again, since the user is free to do whatever they want anyway, the inputs they provided could also become dangling.
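
A small sketch of the first point, assuming CPython's reference counting: a cached raw address is just an integer and does not keep the NumPy array alive. The variable names are illustrative only, and the stale address is deliberately never dereferenced.

```python
import weakref

import numpy as np

buf = np.zeros(8)
address = buf.ctypes.data   # plain integer, comparable to a cached raw pointer
alive = weakref.ref(buf)    # lets us observe the lifetime of the array

del buf                     # drop the only owning reference

# On CPython the array is released immediately; `address` is now dangling.
buffer_was_freed = alive() is None
```

This is why a cache of pointers needs to be paired with owning references to the arrays, or be invalidated whenever the arrays are reallocated.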

Therefore I propose the following changes:

  • Every time construct_arguments() is called, new return values are allocated; there is no sharing.
  • Actually documenting what happens with the memory management.
  • I agree _lastargs might be useful for debugging and would thus propose to keep it, but to never use it: set it in __call__ and ignore it in all other cases.

symbols=kwargs,
callback_retval_references=self._callback_retval_references)
for aval, atype, aname in zip(arglist, argtypes, argnames))
constants = self.sdfg.constants
Collaborator

Semi-related note: If you want the function to be faster, this should be constants_prop (because constants computes a new object every time).

Collaborator Author

This was in the original code.
However, as far as I know, SDFG::constants is a property, thus the line self.sdfg.constants will call SDFG::constants, which returns a dict that is then stored in constants (the variable inside the function).
I also looked at the implementation of SDFG::constants and do not think that it can be replaced with SDFG::constants_prop, as SDFG::constants also considers SDFGs recursively.
However, I think that the output of SDFG::constants can be saved in CompiledSDFG::__init__(), in the same way as free_symbols is.
I will do that.
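
The idea can be sketched with two hypothetical toy classes (not DaCe's `SDFG` or `CompiledSDFG`): a property that rebuilds a merged dict on every access, and a wrapper that stores the result once at construction time. The sketch assumes the constants do not change after the wrapper is created.

```python
class ToyGraph:
    """Hypothetical toy: `constants` rebuilds a merged dict on every access."""

    def __init__(self, own, children=()):
        self._own = dict(own)
        self._children = list(children)
        self.builds = 0  # counts how often the property was evaluated

    @property
    def constants(self):
        self.builds += 1
        merged = {}
        for child in self._children:  # nested graphs are considered recursively
            merged.update(child.constants)
        merged.update(self._own)
        return merged


class ToyCompiled:
    """Hypothetical toy: caches the constants once in __init__."""

    def __init__(self, graph):
        self.graph = graph
        self._constants = graph.constants  # evaluated exactly once here

    def call_uncached(self):
        return self.graph.constants  # rebuilds the dict on every call

    def call_cached(self):
        return self._constants  # plain attribute access


graph = ToyGraph({"N": 4}, children=[ToyGraph({"M": 2})])
compiled = ToyCompiled(graph)
builds_after_init = graph.builds

for _ in range(3):
    compiled.call_cached()
builds_after_cached = graph.builds

for _ in range(3):
    compiled.call_uncached()
builds_after_uncached = graph.builds
```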

@philip-paul-mueller philip-paul-mueller changed the title from "Made _construct_args() Public" to "Refactored How Calling in CompiledSDFG Works" on Oct 31, 2025
philip-paul-mueller added a commit to GridTools/gt4py that referenced this pull request Nov 4, 2025
This PR changes how calls to the underlying `CompiledSDFG` objects are
carried out. Before, the implementation heavily relied on the internal data
format of the DaCe class. However, a recent [change in DaCe](spcl/dace#2185) changed this internal data format, leading to errors that had to [be patched](GridTools/dace#9).
This PR introduces a more stable fix and builds upon a [refactoring in DaCe](spcl/dace#2206) that, among other things, exposes the tools that were needed by GT4Py to
work independently of the internals.
For that reason this PR also updates the DaCe dependency to `2025.11.04`.

The main change is that the argument vector, i.e. the C representation
of the arguments used for the call, is no longer managed by
`CompiledSDFG` but instead by GT4Py's `CompiledDaceProgram`.

Co-authored-by: edopao <[email protected]>