-
Notifications
You must be signed in to change notification settings - Fork 147
Refactored How Calling in CompiledSDFG Works
#2206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Refactored How Calling in CompiledSDFG Works
#2206
Conversation
Due to the refactoring the case that a variable is passed once as positional and as named argument is not detected and asserted. This test however, passed `a` always as positional argument and if `symbolic` is `True` also as named argument.
dace/codegen/compiled_sdfg.py
Outdated
| :note: If not initialized this function will initialize the memory for | ||
| the return values, however, it might also reallocate said memory. | ||
| :note: This function will also update the internal argument cache. | ||
| :note: The update of `self._lastargs` should be considered a bug rather than a feature. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean? It's not a bug, it caches the result types to avoid reallocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion a call to construct_arguments() should not update the self._lastargs because at that point it has not been called this only happens in fast_call() (which should for speed reasons also not do it).
The place where this update should happen is inside __call__(), to be exact between construct_arguments() and fast_call().
And to be honest here, I would remove _lastargs
Now is the reason why this is not possible.
construct_arguments() must call _initialize_return_values(), because the return value is part of the argument vectors.
Now there are a few very obscure points:
_lastargsonly contains the pointer to the memory, thus keep them is not enough for keeping the memory allocated._initialize_return_values()does not always allocate new memory, if the memory still has the right size it will not allocate them anew (you can force a reallocation by callingclear_return_value()).
This has the nice side effect that the return values might be shared, I use the word might here, because depending on the situation they are or they are not (BTW: That behaviour is not documented).
These two reasons implies that if you reallocate the return value you should throw the pointers away because they might become dangling.
I mean because the user is free to do anything what he wants anyway the input he provided could also become dangling.
Therefore I propose the following changes:
- Every time
construct_arguments()is called new return values are allocated, there is no sharing. - Actually describing what happens with the memory management.
- I agree
_lastargsmight be useful for debugging and would thus propose, to keep it, but to never use it and set it in__call__in all other cases ignore it.
dace/codegen/compiled_sdfg.py
Outdated
| symbols=kwargs, | ||
| callback_retval_references=self._callback_retval_references) | ||
| for aval, atype, aname in zip(arglist, argtypes, argnames)) | ||
| constants = self.sdfg.constants |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Semi-related note: If you want the function to be faster, this should be constants_prop (because constants computes a new object every time).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was in the original code.
However, as far as I know SDFG::constants is a property, thus the line self.sdfg.constants will call SDFG::constants which returns a dict that will be stored inside constants (the variable inside the function).
I also looked at the implementation of SDFG::constants and does not think that it can be replaced with SDFG::constant_props as SDFG::constants also considers SDFGs recursively.
However, I think that the output of SDFG::constants can be saved in CompiledSDFG::__init__() in the same way as free_symbols is.
I will do that.
However, I saw that it also, under the hood sometimes tests if the argument is a pyobject. Since that thing is a pointer it is possible and I implemented it for that. But it was again not implemented properly, since for the case when the return value is passed as a regular argument, it was not checking that, only for managed return values.
_construct_args() PublicCompiledSDFG Works
This PR changes how calls to the underlying `CompiledSDFG` objects are carried out. Before, the implementation heavily relied on the internal data format of the DaCe class. However a recent [change in DaCe](spcl/dace#2185) changed this internal data leading to errors, that had to [be patched](GridTools/dace#9). This PR introduces a more stable fix and builds upon a [refactoring in DaCe](spcl/dace#2206) that beside other things, exposes the tools that were needed by GT4Py to work independently of the internals. For that reason this PR also updates the DaCe dependency to `2025.11.04`. The main change is, that the argument vector, i.e. the C representation of the arguments used for the call, are no longer managed by `CompiledSDFG` but instead by GT4Py's `CompiledDaceProgram`. Co-authored-by: edopao <[email protected]>
This PR refactors how calling an SDFG works.
PR#1467 introduced the
fast_call()API, which allowed to call a compiled SDFG and skipping some tests.This was done to support the use case of calling the same SDFG with the same (as in pointers) multiple times.
However, the PR did not introduced a simple way to generate the argument vector that had to be passed to
fast_call()without relying on internal implementation details of the class.This PR, beside other things, introduces this use case and give access to the all steps needed to call an SDFG:
construct_arguments(): It accepts Python arguments, such asintor NumPy arrays and turns them into an argument vector in the right order and converted to the required C type.fast_call(): Performs the actual call using the passed argument vectors, if needed it will also run initialization.Note that this function is not new, but was slightly modified and no longer handles the return values, see below.
convert_return_values(): This function performs the actualreturnoperation, i.e. composes the specified return type, i.e. either a single array or a tuple.Note, before this function was called by
fast_call()but it was moved outside to reduce the hot path, because usually return values are passed using inout or out arguments directly.Beside these changes the PR also modifies the following things:
__return, as ordinary arguments, this is still supported, but now a warning is returned.CompiledSDFGwas technically able to handle scalar return values, however, due to technical limitation this is not possible, thus the feature was removed.However, the feature was not dropped completely and is still used to handle
pyobjectssince they are passed as pointer, see below.pyobjectreturn values was modified.Before it was not possible to use
pyobjectinstances as return values that were manged outside, i.e. not allocated by,CompiledSDFG, now they are "handled".It is important that an array, i.e. multiple instances, of
pyobjects are handled as a single object (this is the correct behaviour and retained for bug compatibility with the unit tests), however, a warning is generated.safe_call()is not possible to handle return values, if the method is called on such an SDFG an error is generated.tuplewith a single argument, in that case the value was always directly returned, this has been fixed and is correctly handled.If there was no change in size, then
__call__()would always return the same arrays, which might lead to very sudden bugs.The new behaviour is to always allocate new memory, this is done by
construct_arguments().CompiledSDFGhad a member_lastargswhich "cached" the last pointer arguments that were used to call the SDFG.It was updated by
_construct_args()(old version ofconstruct_arguments()), which did not make much sense.The original intention was to remove it, but this proved to be harder and it is thus maintained.
However, it is now updated by
__call__()andinitialize()to support the use case for{get, set}_workspace_size().