-
Notifications
You must be signed in to change notification settings - Fork 214
Native function support #211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Notes to self:
Assuming that eager gradients is done by calling SymbolicGradient, I probably want an option to do the gradient calculation when the function is created (like for graph) and use that function in eager, rather than re-calculating it each time. Depends on what gradient tape looks like. |
* | ||
* Functions can also be created from {@link ConcreteFunction}s using {@link #create(ConcreteFunction, boolean)}. | ||
*/ | ||
public class GraphFunction implements AutoCloseable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GraphFunction
is a bit confusing as a name, it sounds like it is a function attached to a Graph
which is not necessarily a case. I'd turn this over to FunctionGraph
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, though lets have the "merge with ConcreteFunction" discussion first. I didn't want to use FunctionGraph
because Python has a FuncGraph
that subclasses Graph, and I'm not sure if we'll end up needing one for further Function support (I'd prefer a subclass of Ops, which should handle most of the same stuff).
if (op != null) { | ||
outputs.add(op.output(index)); | ||
} | ||
Operation op = graph.operationOrError(operation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will break current behavior by throwing an error where it was silently failing before... but I think it's for the best, so let's keep it that way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won't actually change things, I just copied the Session function to Graph. The null check there was extraneous. But I agree about the behavior, I'll note it in the docs.
...core/tensorflow-core-api/src/main/java/org/tensorflow/internal/c_api/presets/tensorflow.java
Outdated
Show resolved
Hide resolved
* @return the outputs of the function | ||
*/ | ||
@Endpoint(name = "callFunction") | ||
public static List<Operand<?>> callFunction(Scope scope, GraphFunction function, List<Operand<?>> inputs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not simply call these endpoints call
? I like the simplicity of tf.call(function, x).get("y")
The function instance type is enough to differentiate the different signatures
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had issues with call(NamedGraphFunction, List<>)
, in that I'd like to have it return a Map using the named outputs even if named inputs aren't used, but that collides with call(GraphFunction, List<>)
which returns a List. Let's delay this till after the merging w/ ConcreteFunction conversation?
tensorflow-core/tensorflow-core-api/src/main/java/org/tensorflow/GraphFunction.java
Outdated
Show resolved
Hide resolved
TF_Output handles = new TF_Output(operands.size()); | ||
for (int i = 0; i < operands.size(); i++) { | ||
Operand<?> input = operands.get(i); | ||
//TODO body.checkInput(input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain more in detail here what is left TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's from #207, I'd originally written this in that branch. I'll probably need to generalize the "handle input from another environment" anyways to handle closure captures in functions, at some point, and we'd want that to apply here, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm expecting it to get merged before this branch anyways, in which case I'll uncomment it.
* | ||
* @param body the graph to use as the function body | ||
* @param name the name of the function | ||
* @param appendHash whether to append the has of the signature to the name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
".. the hash..."
btw, just curious, why do we let the user decide to append a hash or not to the signature name? Can we take this decision for him?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd lean in favor of always appending it, but I could see a case where you want to create a function with a specific name, so I left it in. It's definitely necessary most of the time to avoid name collisions for functions with different arguments (aka overloading support).
* @param function the {@code ConcreteFunction} to create the function from | ||
* @param appendHash whether to append a hash of the signature to the name | ||
*/ | ||
public static NamedGraphFunction create(ConcreteFunction function, boolean appendHash) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure to understand why do we need both NamedGraphFunction
and ConcreteFunction
, they seem to achieve pretty much the same goal, which is to provided named inputs/outputs on an encapsulated function, right?
In the Python paradigm, a Function
is a method that wraps an executable graph (or eager code) while a ConcreteFunction
is the realization of this function for specific type of input arguments (since Python is non-typed, the function is realized (or "concretized") when its invoke for the first time with a given set of input types.
In Java, as everything is typed, we might not need to make this distinction. So why not just calling it simply Function
and merge both classes together? The only difference I see is that one runs a main Graph
(using a session) while the other runs a secondary graph (using a ops). But there might be a way to abstract this to the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty much, yeah, but NamedGraphFunction
has to manage the TF_Function
handle and doesn't require a signature. ConcreteFunction
also allows for Tensor execution. As for function, see #205, this is generally modeling Python's DefinedFunction
/_EagerDefinedFunction
/defun, which is separate from the Function
class.
I could look into merging them, but it would make it a bit harder to declare the GraphFunction
. Would you be in favor of getting rid of the positional GraphFunction
altogether and merging NamedGraphFunction
with ConcreteFunction
?
I'd want to keep a few of the positional function creator methods around somewhere, for gradient functions if nothing else. Probably package private though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm for merging everything together if that can simplify the API for the user. Or we can continue to segregate the logic into multiple classes but only one should be public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When trying to make GraphFunction
(now called DefinedFunction
) package local, I ran into the issue where I'd like to have a Ops.call(ConcreteFunction, Map<String, Operand<?>)
function, but the necessary methods (and DefinedFunction
) won't be accessible there. There's a few options, as far as I can see:
- Only have a call method on
ConcreteFunction
, likeConcreteFunction.call(Ops tf, Map<String, Operand<?>)
. - Have a
@Operator
Function class in the main package - Have a public call method on
ConcreteFunction
that takesScope
instead ofOps
, call that from the operator class - Move
ConcreteFunction
,DefinedFunction
, the Operator class, and any future function classes to a new package (function
, presumably).
4 is my preference, thoughts?
* TODO implement | ||
*/ | ||
TF_Function gradNativeHandle() { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any idea how to implement this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to do some tests, but for Graph I'm thinking I'd call the existing gradient op on it, and then construct a function using the op list constructor. Eager is up in the air until we get gradient tape support.
Hi @rnett , I'm really happy that you tackle this feature that was in my backlog for too long now, let's get it done. I lean to an approach that would abstract as much as possible all the implementation details to the user, whether the function is a graph, a subgraph or code executed eagerly. We'll see where it goes in the scope of this PR |
Yeah, I very much agree, I didn't intend to put that in this PR, just the native support. I've put my thoughts on it in #205. |
9810b26
to
5f37483
Compare
* | ||
* Functions can also be created from {@link ConcreteFunction}s using {@link #create(ConcreteFunction, boolean)}. | ||
*/ | ||
public class DefinedFunction implements AutoCloseable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not convinced that we need this extra class as part of the public API. Can't we have everything fronted by ConcreteFunction
, eventually changing it to an interface that can have multiple implementations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but I'm running to access modifiers (and lack of internal
):
When trying to make
DefinedFunction
package local, I ran into the issue where I'd like to have aOps.call(ConcreteFunction, Map<String, Operand<?>)
function, but the necessary methods (andDefinedFunction
) won't be accessible there. There's a few options, as far as I can see:
- Only have a call method on
ConcreteFunction
, likeConcreteFunction.call(Ops tf, Map<String, Operand<?>)
.- Have a
@Operator
Function class in the main package- Have a public call method on
ConcreteFunction
that takesScope
instead ofOps
, call that from the operator class- Move
ConcreteFunction
,DefinedFunction
, the Operator class, and any future function classes to a new package (function
, presumably).
4 is my preference, thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So like we discussed, let's add endpoints to ConcreteFunction
that accepts a Scope
object and do the actual job. Anyway, most users will always go through the ops so it's ok to have these kind of endpoints in the public API of the core classes even if they are more meant for internal usage.
We also agreed that only ConcreteFunction
should remain and probably be converted to an interface that has 2 different possible implementations: one with a graph (like the actual one) and one with a function graph (the new one). Maybe add these implementations in the org.tensorflow.internal
package?
And last, can we check if it will be possible to build ConcreteFunction
instances out of function graphs found in a saved model that has been loaded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've started on this, and as far as I can tell, the only thing ConcreteFunction
provides that the native functions doesn't is the ability to call with Tensor
s, and even that's relocatable by using an temporary EagerSession. So do we even need it? The Signature
builders can be ported just fine. The only thing I could think of that would cause issues is Saving, which I haven't looked at enough to be able to tell. If I understand it right it puts everything under a default named function anyways, so we could just create a graph and add the function to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| So do we even need it?
The goal is that we don't! Maybe I'll preserve ConcreteFunction
as the name for the common function interface since it is still accurate without making breaking changes again. But if you think it is simple to do everything within a single class instead of one interface and two implementations, you can give it a try and I'll take a look at the PR but I have slightly a preference for the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. I'll start it without saving, and figure that out after. I'm also not sure how variables are handled within TF_Function
, which may cause some differences from ConcreteFunction
(especially wrt saving), but I suspect that resource variables will be handled just fine, it's only the legacy ones that will have issues, so imo we don't need to worry about it.
Signed-off-by: Ryan Nett <[email protected]>
…pparently forgot) Signed-off-by: Ryan Nett <[email protected]>
Signed-off-by: Ryan Nett <[email protected]>
Signed-off-by: Ryan Nett <[email protected]>
Signed-off-by: Ryan Nett <[email protected]>
5f37483
to
f215237
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rnett , I like where this is going. I'm concerned a bit about the support in the saved model though, for which reason the resource ownership of a ConcreteFunction
was added for (and removed in this PR). I can give you more context about this if you want.
Also, I have a feeling that the size of this PR has increased dramatically (did it?), if that the case, is everything in there required to add this new functionality for native functions or there is a mix of other things as well? Reducing the size of the PRs helps a lot with the review process. Thanks!
@@ -39,16 +57,87 @@ | |||
*/ | |||
public class ConcreteFunction implements AutoCloseable { | |||
|
|||
private static TF_Operation outputHandle(Operand<?> operand) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: can we move the private parts at the bottom of the class? It's the Google way to do it and I kind of like it now :)
private final Signature signature; | ||
private final Ownership ownership; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you remove the ownership part? It is quite important to prevent unexpected behavior when closing a function that comes from a saved model vs. one that has been instantiated manually.
Tensor tensor = arguments.get(argName); | ||
if (tensor == null) { | ||
throw new IllegalArgumentException(String.format("Missing argument [%s]", argName)); | ||
public Map<String, Operand<?>> call(Scope scope, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
javadoc?
// session = function.session(); | ||
// } else if (session != function.session()) { | ||
// throw new UnsupportedOperationException("Saving multiple functions with different graphs/sessions is not supported yet."); | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't merge this PR until we support them in saved models properly
* the function, they mean the type. | ||
*/ | ||
@Operator(name = "call") | ||
public abstract class Function { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename this class to FunctionHelper
or something like that, to prevent clashing with java.util.Function
?
// } | ||
// assertThrows(IllegalStateException.class, () -> s.run(Init.DEFAULT_NAME)); | ||
// g.toGraphDef(); // check that graph is still valid | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to fix these tests before merging
I didn't think you'd get to reviewing it yet, parts are still WIP (tests and saving). I'll mark it as draft until they are done. The size hasn't increased much, I think it's mostly bad diffs and perhaps some rebase issues, plus auto-formatting files that weren't previously (Graph mostly, it looks like). I can pull some parts out into an earlier PR. The only large part I added recently was tree-walking functions on Context on saving would be appreciated, I'm going to take a look at how python does it. If I understand it right, both the function native instances and the resource variables are not associated with any specific session or graph (although I need to test that for variables), so I can use a graph and session for loading and then discard them. We're going to need some kind of variable lifting anyways for How much do we need to support non-resource variables, anyways? It seems like resource variables were introduced largely to make the functional stuff easier. Of course, we don't have access to them yet, but we should soon-ish (my gradient PR is sitting in their review queue). |
Replaced by #233. |
Adds support for creating
TF_Function
s from graphs and executing them in eager and graph mode.I'm undecided as to whether I should keep the positional
GraphFunction
or just make the base function use named arguments. Thoughts @karllessard?I'm also considering making the
create
methods automatically find the minimal subgraph for the passed inputs and outputs (and throw if that's impossible, as the function would break anyways). This would help with gradient handling. I'd add another overload to create everything (which could be used by ConcreteFunction), and possibly ones to detect inputs and outputs from the graph.Before merging, I need to handle gradients for graph mode (it doesn't seem to need to be specified in eager mode, I need to check how Python does it), and add overloads with control outputs.