Teach Glow how to support layout requirements for nodes and tensors #3452
Description
Taking the tensor layouts portion out of #2686 as alignment requirements have already been done and this is a different issue.
Quoting @opti-mix:

> Different operations (e.g. convolutions, gemms, etc.) may require each of their inputs and outputs to use a specific memory layout, both in terms of alignments and logical ordering of dimensions (e.g. NHWC or NCHW). Moreover, these requirements may depend on the properties of the operation, e.g. the dimensions of its operands. For example, a convolution with a small filter may need its input operands in a format different from a convolution with a big filter.
>
> Today, Glow core and custom backends implicitly hard-code this knowledge about the operations into (backend-specific) nodes and the code that works with them, i.e. the code that transforms Glow nodes, which are in the canonical `NHWC` format, into backend-specific nodes. This is pretty fragile and involves a lot of boilerplate code. What if Glow were able to ask a backend, for each node, what the requirements are for each of the node's inputs and results in terms of alignments and ordering of dimensions?
>
> The backend would return this information, and Glow core could insert all the required layout transformations using a new node called `LayoutTransform` (maybe we could extend/subsume `Transpose` for these purposes). This new `LayoutTransform` node could be applied to `Constant` weights at compile time. It could also be optimized in almost the same ways as Glow optimizes `Transpose` nodes, e.g. multiple `LayoutTransform`s can be combined into one, opposite `LayoutTransform`s can eliminate each other, etc.
>
> This way, all of the functionality to insert the required layout transforms is handled by the Glow core, which removes a lot of code duplication from backends. It also allows for better optimizations, as the optimizer would understand the semantics of `LayoutTransform`. And it also allows for better verification, as Glow can now check that the layouts of all inputs/outputs of each node satisfy the requirements of the operation as reported by backends.
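To make the proposed query concrete, here is a minimal sketch of what such a backend hook could look like. All names (`TensorLayoutRequirement`, `getInputLayoutRequirement`, `needsLayoutTransform`) are hypothetical, not existing Glow API; a real implementation would inspect the node itself rather than a node-kind string:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical: a layout requirement is a dimension ordering plus a
// per-dimension alignment (in elements).
struct TensorLayoutRequirement {
  std::string dimOrder;             // e.g. "NHWC" or "NCHW"
  std::vector<unsigned> alignments; // alignment per dimension

  bool operator==(const TensorLayoutRequirement &o) const {
    return dimOrder == o.dimOrder && alignments == o.alignments;
  }
};

// Hypothetical backend hook: report the required layout for input `idx`
// of a given node kind. A real backend would also look at operand
// dimensions, filter sizes, etc. before answering.
TensorLayoutRequirement getInputLayoutRequirement(const std::string &nodeKind,
                                                  unsigned idx) {
  (void)idx;
  if (nodeKind == "Convolution") {
    // This example backend wants convolution inputs in NCHW with the
    // channel dimension padded to a multiple of 4.
    return {"NCHW", {1, 4, 1, 1}};
  }
  // Default: Glow's canonical NHWC layout, no extra alignment.
  return {"NHWC", {1, 1, 1, 1}};
}

// Glow core would compare the producer's layout with the consumer's
// requirement and insert a LayoutTransform only when they differ.
bool needsLayoutTransform(const TensorLayoutRequirement &producer,
                          const TensorLayoutRequirement &consumer) {
  return !(producer == consumer);
}
```

With such a hook, the transform-insertion loop lives entirely in Glow core, and backends only declare requirements.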
To expand on this, I'd like to share a suggestion by @artemrakhov:

> First, start with a "raw" state of non-compliance, where we insert `LayoutTransformation` nodes where needed. Then we have a loop to sink and clamp layout transformations together, similarly to what we do for transposes. In fact, `LayoutTransformation` is a generalization of `Transpose`, since we can change not only the order of dimensions, but also their alignments.
Basically, we need to port/expand the transpose sinking/elimination optimization to support layouts.
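The core rewrite rules generalize directly from the transpose optimizations. A minimal sketch, assuming (for illustration only) that a `LayoutTransform` can be summarized by its source and destination layout strings:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical simplification: a LayoutTransform is just {src, dst}.
using LayoutTransform = std::pair<std::string, std::string>;

// Compose t2(t1(x)) into a single transform. Returns std::nullopt when
// the pair is the identity, mirroring how opposite Transpose nodes
// eliminate each other in Glow's graph optimizer today.
std::optional<LayoutTransform> compose(const LayoutTransform &t1,
                                       const LayoutTransform &t2) {
  assert(t1.second == t2.first && "chained transforms must agree at the seam");
  if (t1.first == t2.second) {
    return std::nullopt; // opposite transforms cancel out
  }
  return LayoutTransform{t1.first, t2.second}; // merge into one transform
}
```

In the real graph, alignments would have to compose as well, and the sinking pass would additionally commute transforms past layout-agnostic nodes.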
Additionally, I think that if we pursue this direction, it should also include a verifier that makes sure that a node expecting certain layout constraints on its inputs actually has those constraints satisfied after the layout transformations run.
This would clean up node creation in `Graph.cpp` and create a central location for satisfying layout constraints. Consider `static void assertConv3DDims`, for example: as the name suggests, it performs checks to make sure the tensor layout and dimensions make sense. Such checks could be moved into a central location: the proposed layout verifier.
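The centralized check could be as simple as comparing the required layout string against the actual one and the operand's rank. A hedged sketch (the function name and signature are made up for illustration, not Glow API):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical central verifier check: the input's layout string must
// match the node's requirement, and the tensor's rank must agree with
// the layout (e.g. 5 dims for a 3D-convolution layout), analogous to
// what assertConv3DDims checks locally today.
bool verifyInputLayout(const std::string &requiredLayout,
                       const std::string &actualLayout,
                       const std::vector<size_t> &dims) {
  return requiredLayout == actualLayout &&
         dims.size() == requiredLayout.size();
}
```

Running this over every edge of the graph after the transform-insertion pass would turn today's scattered, per-node assertions into a single verification step.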
Another useful outcome of this work is simplifying model import: we currently add a transpose, silently, before some nodes because they may arrive in a non-Glow-compliant layout. Take `Bucketize` from Caffe2, for example: the Caffe2 operator supports the NCHW layout while Glow does not. When importing from Caffe2, we check for the NCHW layout and add a transpose. We could make the import process for a large number of operators much more intuitive if we did not hard-code a transpose node but added it via layout requirements.
If we support such a change for model loaders, a tricky situation we'd need to tackle is knowing the original layout we imported from. We have that knowledge when importing from, say, Caffe2, but we don't keep track of it in Glow. One possible solution is to include that information in Placeholders (and Constants?).
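The Placeholder idea could be sketched as follows. The `Placeholder` struct here is a stand-in, not Glow's real class; the point is only that the loader records the layout it saw, and Glow core later decides whether a canonicalizing transform is needed instead of the loader hard-coding a `Transpose`:

```cpp
#include <cassert>
#include <string>

// Hypothetical simplified Placeholder carrying the layout the model
// loader observed in the source format.
struct Placeholder {
  std::string name;
  std::string importedLayout = "NHWC"; // Glow's canonical default
};

// Loader records the source layout (e.g. NCHW for a Caffe2 model)
// instead of immediately inserting a Transpose.
Placeholder importPlaceholder(const std::string &name,
                              const std::string &sourceLayout) {
  return Placeholder{name, sourceLayout};
}

// Glow core consults the recorded layout to decide whether a
// canonicalizing layout transform must be inserted.
bool needsCanonicalization(const Placeholder &ph) {
  return ph.importedLayout != "NHWC";
}
```

The same field would let the verifier check graph inputs, and could answer the open question in parentheses above for Constants as well.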