From 82e3021fbf5c558975c9131455fa571e6a0f1abf Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 24 Jul 2023 02:32:55 -0700 Subject: [PATCH 01/51] [SYCL] Add ext_intel_virtual_functions specification --- .../sycl_ext_intel_virtual_functions.asciidoc | 435 ++++++++++++++++++ 1 file changed, 435 insertions(+) create mode 100644 sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc new file mode 100644 index 0000000000000..958c0d31d3317 --- /dev/null +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -0,0 +1,435 @@ += sycl_ext_oneapi_virtual_functions + +:source-highlighter: coderay +:coderay-linenums-mode: table + +// This section needs to be after the document title. +:doctype: book +:toc2: +:toc: left +:encoding: utf-8 +:lang: en +:dpcpp: pass:[DPC++] + +// Set the default source code type in this document to C++, +// for syntax highlighting purposes. This is needed because +// docbook uses c++ and html5 uses cpp. +:language: {basebackend@docbook:c++:cpp} + + +== Notice + +[%hardbreaks] +Copyright (C) 2023-2023 Intel Corporation. All rights reserved. + +Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks +of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by +permission by Khronos. + + +== Contact + +To report problems with this extension, please open a new issue at: + +https://github.com/intel/llvm/issues + + +== Dependencies + +This extension is written against the SYCL 2020 revision 7 specification. All +references below to the "core SYCL specification" or to section numbers in the +SYCL specification refer to that revision. + +This extension also depends on the following other SYCL extensions: + +* link:../experimental/sycl_ext_oneapi_kernel_properties.asciidoc[ + sycl_ext_oneapi_kernel_properties] +* link:../experimental/sycl_ext_oneapi_properties.asciidoc[ + sycl_ext_oneapi_properties] +* link:../experimental/sycl_ext_oneapi_named_sub_group_sizes.asciidoc[ + sycl_ext_oneapi_named_sub_group_sizes] + +== Status + +This is a proposed extension specification, intended to gather community +feedback. Interfaces defined in this specification may not be implemented yet +or may be in a preliminary state. The specification itself may also change in +incompatible ways before it is finalized. *Shipping software products should +not rely on APIs defined in this specification.* + +== Backend support status + +The APIs in this extension may be used only on a device that has +`aspect::ext_oneapi_virtual_functions`. The application must check that the +device has this aspect before submitting a kernel using any of the APIs in this +extension. If the application fails to do this, the implementation throws +a synchronous exception with the `errc::kernel_not_supported` error code +when the kernel is submitted to the queue. + +== Overview + +The main purpose of this extension is to reduce amount of SYCL language +restrictions for device code by allowing to call virtual member functions +from device functions. + +== Specification + +=== Feature test macro + +This extension provides a feature-test macro as described in the core SYCL +specification. An implementation supporting this extension must predefine the +macro `SYCL_EXT_ONEAPI_VIRTUAL_FUNCTIONS` to one of the values defined in the +table below. Applications can test for the existence of this macro to determine +if the implementation supports this feature, or applications can test the +macro's value to determine which of the extension's features the implementation +supports. + +[%header,cols="1,5"] +|=== +|Value +|Description + +|1 +|The APIs of this experimental extension are not versioned, so the + feature-test macro always has this value. +|=== + +=== New language restrictions for device functions + +The following restriction, listed in section 5.4 of the core SYCL specification +does not apply if this extension is supported by the implementation: + +> The odr-use of polymorphic classes and classes with virtual inheritance is +> allowed. *However, no virtual member functions are allowed to be called in a +> device function.* + +However, there are still some limitations of how virtual member functions can +be used: + +- if an object is constructed in host code, calling a virtual function for that + object in device code has undefined behavior +- if an object is constructed in device code, calling a virtual function for + that object in host code has undefined behavior + +=== New properties + +Due to the indirect nature of virtual member functions, compiler may not be able +to understand member function of which exact class is being called. Moreover, +when code is distributed over several translation units, compiler may not be +able to even see definitions of all virtual member functions which may be called +in a kernel. + +Therefore, to provide a mechanism for an implementation to help detect virtual +member functions, which are going to be used from kernels and enforce necessary +restrictions on them, new compile-time-constant properties are proposed: + +[source,dpcpp] +---- +namespace sycl::ext::oneapi::experimental { + + struct indirectly_callable_key { + template + using value_t = property_value; + }; + + struct calls_indirectly_key { + template + using value_t = property_value; + }; + + template + inline constexpr indirectly_callable_key::value_t indirectly_callable; + + template + inline constexpr calls_indirectly_key::value_t calls_indirectly; + + template <> + struct is_property_key : std::true_type {}; + template <> struct is_property_key : std::true_type {}; +} +---- + +|=== +|Property|Description +|`indirectly_callable` +|The `indirectly_callable` property marks a virtual member function as a device +function, thus making it available to be called from SYCL kernel and device +functions. Should only be applied to virtual member functions and to do so, +`SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro should be used. + +Parameter `Set` specifies a group of kernels, which can call this virtual member +function, it must be a C++ typename. Calling a virtual member function from a +kernel without `calls_indirectly` property, or with a `calls_indirectly` +property with a different value of `Set` parameter is an undefined behavior. + +|`calls_indirectly` +|The `calls_indirectly` property marks a SYCL kernel function as performing +calls through virtual member functions. + +Parameter `Set` specifies a group of virtual member functions which can be +called from this kernel, it must be a C++ typename. Calling a virtual member +function without `indirectly_callable` property, or with an +`indirectly_callable` property with a different value of `Set` parameter is an +undefined behavior. +|=== + +If a virtual member function is called from device code, both definition and +declaration of that function must be decorated with the +indirectly_callable+ +property. `Set` property parameter must match between definition and +declaration and implementation should provide a diagnostic in case of mismatch. + +Applying the +indirectly_callable+ property to a SYCL Kernel function is illegal +and an implementation should produce a diagnostic for that. + +Applying the +indirectly_callable+ property to an arbitrary device function, +which is not a virtual member function has no effect. NOTE: This behavior may be +changed in either future version of this extension or in other extensions. + +Virtual member functions that are decorated with the +indirectly_callable+ +property are considered device functions, which must obey the restrictions +listed in section 5.4 of the core SYCL specification "Language restrictions for +device functions". Virtual member functions that are not decorated with this +attribute do not need to obey these restrictions, even if other definitions of +that virtual member function in other classes in the inheritance hierarchy are +decorated with the attribute. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct set_A; + +class Foo { +public: + // properties to functions should be applied using the macro: + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void foo() {} + + // both declaration and definition should be annotated: + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void bar(); + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void baz() {} +}; + +void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) Foo::bar() {} + +// kernel calling virtual function should also be annotated: +/* */.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + Foo *ptr = /* ... */; + ptr->bar() + + // Note: this kernel can only call 'Foo::foo' and 'Foo::bar' but not + // 'Foo::baz', because the latter is declared within a different "set". +}); +---- + +=== Optional kernel features handling + +The core SYCL specification (5.8 Attributes for device code) says the following +in the description of `device_has` attribute for SYCL kernels and non-kernel +device functions. + +When the attribute is applied to a kernel: + +> ... it causes the compiler to issue a diagnostic if the kernel (or any of the +> functions it calls) uses an optional feature that is associated with an aspect +> that is not listed in the attribute. + +When the attribute is applied to a function: + +> ... it causes the compiler to issue a diagnostic if the device function (or +> any of the functions it calls) uses an optional feature that is associated +> with an aspect that is not listed in the attribute. + +Due to dynamic nature of virtual member functions, compiler is not able to +perform static analysis of a call graph in order to understand which exact +virtual functions are called from which kernels, in general case. + +Therefore, compiler is not required to issue a diagnostic if a virtual member +function called from a kernel or a device function uses optional kernel features +which are not listed in `device_has` attribute attached to the kernel or the +device function. + +Calling a virtual function which uses optional kernel features not compatible +with a current device is an undefined behavior. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct Foo { + // properties to functions should be applied using the macro: + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void foo() { + double d = 3.14; + } + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void bar() {} +}; + +sycl::queue q; + +q.single_task(syclext::properties{syclext::calls_indirectly}, + [=] [[sycl::device_has()]] { + Foo *ptr = /* ... */; + // No diagnostic about kernel using 'fp64' aspect not listed in `device_has` + // attribute is not guaranteed to be emitted here. + ptr->foo() +}); +---- + +An implementation may not raise a compile time diagnostic or a run time +exception merely due to speculative compilation of a virtual member function for +a device when the application does not actually call that member function on +that device. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct Foo { +// properties to functions should be applied using the macro: +virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void foo() { + double d = 3.14; +} + +virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void bar() {} +}; + +sycl::queue q(/* device selector choosing a device *without* fp64 support */); +assert(!q.get_device().has(sycl::aspect::fp64)); + +q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + Foo *ptr = /* ... */; + // 'Foo::bar' doesn't use any optional features and this call is legal. + // No compilation issues or runtime exceptions should be reported due to + // 'Foo::foo' using unsupported fp64 aspect, because it is not called. + ptr->bar() +}); +---- + +=== Kernel bundles and device images + +When an object of a polymorphic class is constructed, it stores a pointer to +virtual table, which points to its virtual member functions. Addresses of those +functions are only accessible and valid within a kernel bundle containing a +kernel which used to construct an object. + +Performing calls to virtual member functions of an object constructed in a +kernel from a different kernel bundle is an undefined behavior. + +If an object of a polymorphic class is constructed in a kernel `A`, stored to a +memory and retrieved in a kernel `B` to perform a call through virtual member +function, then both kernels `A` and `B` must be present in the same device +image or otherwise behavior is undefined. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct Base { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void foo() {} +}; + +class Constructor; +class Use; + +int main() { + sycl::queue Q; + + Base *Obj = sycl::malloc_device(1, Q); + int *Result = sycl::malloc_shared(2, Q); + + auto bundleA + = sycl::get_kernel_bundle(Q.get_context(), + {sycl::get_kernel_id()}); + auto bundleB + = sycl::get_kernel_bundle(Q.get_context(), + {sycl::get_kernel_id()}); + + + Q.submit([&](sycl::handler &CGH) { + CGH.use_kernel_bundle(bundleA); + CGH.single_task(syclext::properties{syclext::calls_indirectly}, + [=] { + // Only placement new can be used within device functions. + new (Obj) Derived; + }); + }); + + Q.submit([&](sycl::handler &CGH) { + CGH.use_kernel_bundle(bundleB); + CGH.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + // Call to 'Base::foo' is an undefined behavior here, because 'Obj' was + // constructed within kernel bundle `bundleA` + Obj->foo(); + }); + }); + + return 0; +} +---- + + +== Example usage + +[source,dpcpp] +---- +#include + +using syclext = sycl::ext::oneapi::experimental; + +class Base { +public: + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) int get_random_number() { + return 4; // Chosen by fair dice roll. Guaranteed to be random + } + + // Not considered to be a device function, can use full set of C++ features + virtual int get_host_random_number() { + throw std::runtime_error("Not Implemented"); + } +}; + +class Derived : public Base { +public: + SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + int get_random_number() override { + return 221; + } +}; + +int main() { + sycl::queue Q; + + Base *Obj = sycl::malloc_device(1, Q); + int *Result = sycl::malloc_shared(1, Q); + + Q.single_task([=] { + // Only placement new can be used within device functions. + new (Obj) Derived; + }); + + auto props = syclext::properties{syclext::calls_indirectly}; + Q.single_task(props, [=] { + Base B; + Result[0] = B.get_random_number(); + }).wait(); + assert(A[0] == 4); + + Q.single_task(props, [=] { + A[0] = Obj->get_random_number(); + }).wait(); + assert(A[0] == 221); + + return 0; +} +---- From 0a80c43fb5e74a92c49d3e83f442338b5e0d25cf Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 28 Aug 2023 05:33:59 -0700 Subject: [PATCH 02/51] Updates Add a note that the extension does not intent to support `dynamic_cast` or `typeid`. Update properties interface, fix all examples to correctly use new properties (vatiable templates don't have default template argument values). --- .../sycl_ext_intel_virtual_functions.asciidoc | 90 +++++++++++++------ 1 file changed, 64 insertions(+), 26 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 958c0d31d3317..70557b3d37848 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -72,6 +72,9 @@ The main purpose of this extension is to reduce amount of SYCL language restrictions for device code by allowing to call virtual member functions from device functions. +NOTE: this extension **does not** cover (i.e. enables) things like +`dynamic_cast`, `typeid` or calls through function pointers. + == Specification === Feature test macro @@ -97,7 +100,7 @@ supports. === New language restrictions for device functions The following restriction, listed in section 5.4 of the core SYCL specification -does not apply if this extension is supported by the implementation: +does not apply if this extension is supported by an implementation: > The odr-use of polymorphic classes and classes with virtual inheritance is > allowed. *However, no virtual member functions are allowed to be called in a @@ -133,15 +136,18 @@ namespace sycl::ext::oneapi::experimental { }; struct calls_indirectly_key { - template - using value_t = property_value; + template + using value_t = + std::conditional_t, + property_value>; }; template inline constexpr indirectly_callable_key::value_t indirectly_callable; - template - inline constexpr calls_indirectly_key::value_t calls_indirectly; + template + inline constexpr calls_indirectly_key::value_t calls_indirectly; template <> struct is_property_key : std::true_type {}; @@ -160,17 +166,18 @@ functions. Should only be applied to virtual member functions and to do so, Parameter `Set` specifies a group of kernels, which can call this virtual member function, it must be a C++ typename. Calling a virtual member function from a kernel without `calls_indirectly` property, or with a `calls_indirectly` -property with a different value of `Set` parameter is an undefined behavior. +property with a value which does not include the same `Set` as specified by +`indirectly_callable` in its parameter is an undefined behavior. |`calls_indirectly` |The `calls_indirectly` property marks a SYCL kernel function as performing calls through virtual member functions. -Parameter `Set` specifies a group of virtual member functions which can be +Parameter `Sets` specifies groups of virtual member functions which can be called from this kernel, it must be a C++ typename. Calling a virtual member function without `indirectly_callable` property, or with an -`indirectly_callable` property with a different value of `Set` parameter is an -undefined behavior. +`indirectly_callable` property with a value of `Set` parameter which is not part +of `Sets` is an undefined behavior. |=== If a virtual member function is called from device code, both definition and @@ -198,26 +205,27 @@ decorated with the attribute. using syclext = sycl::ext::oneapi::experimental; struct set_A; +struct set_B; class Foo { public: // properties to functions should be applied using the macro: virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void foo() {} + syclext::indirectly_callable) void foo() {} // both declaration and definition should be annotated: virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void bar(); + syclext::indirectly_callable) void bar(); virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void baz() {} + syclext::indirectly_callable) void baz() {} }; void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) Foo::bar() {} + syclext::indirectly_callable) Foo::bar() {} // kernel calling virtual function should also be annotated: -/* */.single_task(syclext::properties{syclext::calls_indirectly}, [=] { +/* */.single_task(syclext::properties{syclext::calls_indirectly}, [=] { Foo *ptr = /* ... */; ptr->bar() @@ -226,6 +234,36 @@ void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( }); ---- +Reasons about why it may be necessary to put different virtual functions into +different "sets" are explained in further sections, but for simplicity purposes +both properties cane be used without explicitly specifying a "set", thus using +a default "set": + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct set_A; + +class Foo { +public: + // properties to functions should be applied using the macro: + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable<>) void foo() {} + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void bar() {} +}; + +/* */.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { + Foo *ptr = /* ... */; + ptr->bar() + + // Note: this kernel can only call 'Foo::foo' but not 'Foo::bar', because the + // latter is declared within a different (non-default) "set". +}); +---- + === Optional kernel features handling The core SYCL specification (5.8 Attributes for device code) says the following @@ -263,17 +301,17 @@ using syclext = sycl::ext::oneapi::experimental; struct Foo { // properties to functions should be applied using the macro: virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void foo() { + syclext::indirectly_callable<>) void foo() { double d = 3.14; } virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void bar() {} + syclext::indirectly_callable<>) void bar() {} }; sycl::queue q; -q.single_task(syclext::properties{syclext::calls_indirectly}, +q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] [[sycl::device_has()]] { Foo *ptr = /* ... */; // No diagnostic about kernel using 'fp64' aspect not listed in `device_has` @@ -294,18 +332,18 @@ using syclext = sycl::ext::oneapi::experimental; struct Foo { // properties to functions should be applied using the macro: virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void foo() { + syclext::indirectly_callable<>) void foo() { double d = 3.14; } virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void bar() {} + syclext::indirectly_callable<>) void bar() {} }; sycl::queue q(/* device selector choosing a device *without* fp64 support */); assert(!q.get_device().has(sycl::aspect::fp64)); -q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { +q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { Foo *ptr = /* ... */; // 'Foo::bar' doesn't use any optional features and this call is legal. // No compilation issues or runtime exceptions should be reported due to @@ -335,7 +373,7 @@ using syclext = sycl::ext::oneapi::experimental; struct Base { virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void foo() {} + syclext::indirectly_callable<>) void foo() {} }; class Constructor; @@ -357,7 +395,7 @@ int main() { Q.submit([&](sycl::handler &CGH) { CGH.use_kernel_bundle(bundleA); - CGH.single_task(syclext::properties{syclext::calls_indirectly}, + CGH.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { // Only placement new can be used within device functions. new (Obj) Derived; @@ -366,7 +404,7 @@ int main() { Q.submit([&](sycl::handler &CGH) { CGH.use_kernel_bundle(bundleB); - CGH.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + CGH.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { // Call to 'Base::foo' is an undefined behavior here, because 'Obj' was // constructed within kernel bundle `bundleA` Obj->foo(); @@ -389,7 +427,7 @@ using syclext = sycl::ext::oneapi::experimental; class Base { public: virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) int get_random_number() { + syclext::indirectly_callable<>) int get_random_number() { return 4; // Chosen by fair dice roll. Guaranteed to be random } @@ -401,7 +439,7 @@ public: class Derived : public Base { public: - SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) int get_random_number() override { return 221; } @@ -418,7 +456,7 @@ int main() { new (Obj) Derived; }); - auto props = syclext::properties{syclext::calls_indirectly}; + auto props = syclext::properties{syclext::calls_indirectly<>}; Q.single_task(props, [=] { Base B; Result[0] = B.get_random_number(); From 2035d66b4c1df215012510b54f123c55350ee34d Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 28 Aug 2023 07:13:08 -0700 Subject: [PATCH 03/51] Revamp optional kernel features section --- .../sycl_ext_intel_virtual_functions.asciidoc | 122 +++++++++++++----- 1 file changed, 93 insertions(+), 29 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 70557b3d37848..7240e8899db40 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -286,51 +286,111 @@ Due to dynamic nature of virtual member functions, compiler is not able to perform static analysis of a call graph in order to understand which exact virtual functions are called from which kernels, in general case. -Therefore, compiler is not required to issue a diagnostic if a virtual member -function called from a kernel or a device function uses optional kernel features -which are not listed in `device_has` attribute attached to the kernel or the -device function. - -Calling a virtual function which uses optional kernel features not compatible -with a current device is an undefined behavior. +Instead, information from the new properties is used by an implementation to +issue such diagnostic. When determining a set of aspects which are used by a +SYCL kernel function, implementation also takes into account all aspects which +are used by all virtual member functions included into all "sets" listed in +`calls_indirectly` property. [source,dpcpp] ---- using syclext = sycl::ext::oneapi::experimental; +struct set_A; +struct set_B; + struct Foo { - // properties to functions should be applied using the macro: + // This function uses 'fp64' aspect virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable<>) void foo() { + syclext::indirectly_callable) void foo() { double d = 3.14; } + // This function uses 'fp16' aspect virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable<>) void bar() {} + syclext::indirectly_callable) void bar() { + sycl::half h = 2.71; + } }; sycl::queue q; +q.single_task(syclext::properties{syclext::calls_indirectly}, + [=] [[sycl::device_has(sycl::aspect::fp64)]] { + // Diagnostic is required for this kernel, because it is declared as only + // using 'fp64' aspect, but it also uses virtual member functions from + // "set_B", which includes 'Foo:bar' that uses 'fp16' aspect. +}); + +q.single_task(syclext::properties{syclext::calls_indirectly}, + [=] [[sycl::device_has()]] { + // Diagnostic is required for this kernel, because it is declared as not + // using any optional features, but it also uses virtual member functions from + // "set_A", which includes 'Foo::foo' that uses 'fp64' aspect. +}); + +q.single_task(syclext::properties{syclext::calls_indirectly}, + [=] [[sycl::device_has(sycl::aspect::fp64)]] { + // No diagnostic is required for this kernel, because list of declared aspects + // matches list of used aspects. That includes viratul member functions from + // "set_A", which includes 'Foo::foo' that uses 'fp64' aspect +}); + q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] [[sycl::device_has()]] { - Foo *ptr = /* ... */; - // No diagnostic about kernel using 'fp64' aspect not listed in `device_has` - // attribute is not guaranteed to be emitted here. - ptr->foo() + // No diagnostic is required for this kernel, because list of declared aspects + // matches list of used aspects. There are no virtual member functions defined + // in a default "set", which means that no extra optional kernel features + // requirements were attached to the kernel. +}); +---- + +Submitting a kernel with `calls_indirectly` property, which includes virtual +member functions that use optional kernel features to a device which doesn't +support them, should result in an exception at runtime, similar to how it is +defined by the core SYCL specification. + +[source,dpcpp] +---- +using syclext = sycl::ext::oneapi::experimental; + +struct set_A; +struct set_B; + +struct Foo { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void foo() { + double d = 3.14; + } + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void bar() {} +}; + +sycl::queue q(/* device selector returns a device *without* fp64 support */); +assert(!q.get_device().has(sycl::aspect::fp64)); + +q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + // Exception is expected to be thrown, because target device doesn't support + // fp64 aspect and it is required by 'Foo::foo' which is included into 'set_A' +}); + +q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + // No exceptions are expected, because 'set_B' doesn't bring any requirements + // for optional kernel features. }); ---- An implementation may not raise a compile time diagnostic or a run time exception merely due to speculative compilation of a virtual member function for -a device when the application does not actually call that member function on -that device. +a device when the application does not specify a use of virtual member functions +through the correponding properties. [source,dpcpp] ---- using syclext = sycl::ext::oneapi::experimental; struct Foo { -// properties to functions should be applied using the macro: virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( syclext::indirectly_callable<>) void foo() { double d = 3.14; @@ -343,12 +403,16 @@ virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( sycl::queue q(/* device selector choosing a device *without* fp64 support */); assert(!q.get_device().has(sycl::aspect::fp64)); -q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { - Foo *ptr = /* ... */; - // 'Foo::bar' doesn't use any optional features and this call is legal. - // No compilation issues or runtime exceptions should be reported due to - // 'Foo::foo' using unsupported fp64 aspect, because it is not called. - ptr->bar() +auto *Storage = sycl::malloc_device(1, q); + +q.single_task([=] { + // The kernel is not submitted with 'calls_indirectly' property and therefore + // it is not considered to be using any of virtual member functions of 'Foo'. + // This means that the object of 'Foo' can be successfully created by this + // kernel, regardless of whether a target device supports 'fp64' aspect which + // is used by 'Foo::foo'. + // No exceptions are expected to be thrown. + new (Storage) Foo; }); ---- @@ -362,11 +426,6 @@ kernel which used to construct an object. Performing calls to virtual member functions of an object constructed in a kernel from a different kernel bundle is an undefined behavior. -If an object of a polymorphic class is constructed in a kernel `A`, stored to a -memory and retrieved in a kernel `B` to perform a call through virtual member -function, then both kernels `A` and `B` must be present in the same device -image or otherwise behavior is undefined. - [source,dpcpp] ---- using syclext = sycl::ext::oneapi::experimental; @@ -382,7 +441,7 @@ class Use; int main() { sycl::queue Q; - Base *Obj = sycl::malloc_device(1, Q); + Base *Obj = sycl::malloc_device(1, Q); int *Result = sycl::malloc_shared(2, Q); auto bundleA @@ -415,6 +474,11 @@ int main() { } ---- +As a side effect, the restriction above also means that if an object of a +polymorphic class was constructed by a kernel within a kernel bundle associcated +only with a device `DA`, then performing calls to virtual member functions of +that object in any kernels submitted to a device is an undefined behavior. + == Example usage From bfecee1a2513658822ecb3aee4890b0a887df2f5 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 29 Aug 2023 02:21:08 -0700 Subject: [PATCH 04/51] WIP on implementation design doc --- sycl/doc/design/VirtualFunctions.md | 39 +++++++++++++++++++++++++++++ sycl/doc/index.rst | 1 + 2 files changed, 40 insertions(+) create mode 100644 sycl/doc/design/VirtualFunctions.md diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md new file mode 100644 index 0000000000000..36fcd35c010c5 --- /dev/null +++ b/sycl/doc/design/VirtualFunctions.md @@ -0,0 +1,39 @@ +# Implementation design for sycl_ext_oneapi_virtual_functions + +Corresponding language extension specification: +[sycl_ext_oneapi_virtual_functions][1] + + +## Design + +### Changes to the SYCL header files + +New compile-time properties `indirectly_callable` and `calls_indirectly` should +be implemented in accordance with the corresponding [design document][2]. + +**TODO**: `calls_indirectly` requires conversion from C++ typename to a string. +Document how it should be done. `__sycl_builtin_unique_stable_name` should +likely be used. +**TODO**: `calls_indirectly` requires compile-time concatenation of strings. +Document how it should be done. + +### Changes to the compiler front-end + +Compiler front-end should be updated to respect rules defined by the +[extension specifiction][1], such as: + +- virtual member functions annotated with `indirectly_callable` compile-time + property should be emitted into device code; +- virtual member function *not* annotated with `indirectly_callable` + compile-time property should *not* be emitted into device code; + +### Changes to the compiler middle-end + +Note: some of the changes attributed to this category could technically be +implemented in front-end instead. However, it would be more complicated + + + +[1]: <../extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc> +[2]: + diff --git a/sycl/doc/index.rst b/sycl/doc/index.rst index 8f6e0854df8f5..d6b3e4ac08ec1 100644 --- a/sycl/doc/index.rst +++ b/sycl/doc/index.rst @@ -51,6 +51,7 @@ Design Documents for the oneAPI DPC++ Compiler design/DeviceConfigFile design/PropagateCompilerFlagsToRuntime design/SYCLNativeCPU + design/VirtualFunctions New OpenCL Extensions New SPIR-V Extensions From 888313791009d8c65df4f0e5464e14bbec86cf38 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 13 Sep 2023 06:29:40 -0700 Subject: [PATCH 05/51] First complete draft of the design doc --- sycl/doc/design/VirtualFunctions.md | 141 +++++++++++++++++++++++++++- 1 file changed, 139 insertions(+), 2 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 36fcd35c010c5..298ddba69b382 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -27,12 +27,149 @@ Compiler front-end should be updated to respect rules defined by the - virtual member function *not* annotated with `indirectly_callable` compile-time property should *not* be emitted into device code; +**TODO**: investigate if it is possible to enforce SYCL device code restrctions +on virtual member functions marked with the compile-time property. + ### Changes to the compiler middle-end -Note: some of the changes attributed to this category could technically be -implemented in front-end instead. However, it would be more complicated +#### Aspects propagation and related diagnostics + +Aspects propagation pass should be extended to not only gather aspects which are +used directly, but also aspects that are used indirectly, through virtual +functions. + +For that the pass should compile a list of aspects used by each set of +indirectly callable functions (as defined by `indirectly_callable` property set +by user) and then append those aspects to every kernel which uses those sets. + +Diagnostic should be emitted if a kernel is marked with `device_has` attribute +or property which doesn't include an indirectly used aspect. + +#### Device code split and device images + +The extension specification restricts implementation from raising a diagnostic +when a kernel not marked with `calls_indirectly` kernel property creates an +object of a polymorphic class where some virtual functions use optional kernel +features incompatible with a target device. + +Consider the following example: + +```c++ +using syclext = sycl::ext::oneapi::experimental; + +struct fp64_set; +struct regular_set; + +struct Foo { +virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void foo() { + // uses double + double d = 3.14; +} + +virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void bar() {} +}; + +sycl::queue q; + +auto *Storage = sycl::malloc_device(1, q); + +q.single_task([=] { + // The kernel is not submitted with 'calls_indirectly' property and therefore + // it is not considered to be using any of virtual member functions of 'Foo'. + // This means that the object of 'Foo' can be successfully created by this + // kernel, regardless of whether a target device supports 'fp64' aspect which + // is used by 'Foo::foo'. + // No exceptions are expected to be thrown. + new (Storage) Foo; +}); + +if (q.get_device().has(sycl::aspect::fp64)) { + auto props = syclext::properties{syclext::calls_indirectly}; + q.single_task(props, [=] { + Storage->foo(); + }); +} else { + auto props = syclext::properties{syclext::calls_indirectly}; + q.single_task(props, [=] { + Storage->bar(); + }); +} + +``` + +This example should work regardless of whether target device supports 'fp64' +aspect or not. To achieve that, virtual member functions are outlined into +separate device images which are linked at runtime depending on whether they are +compatible with a target device. + +Regardless of device code split mode selected by a user, functions marked with +`indirectly_callable` property should be outlined into a separate device images +by `sycl-post-link` tool based on the property argument. + +Additionally, if any virtual function in such device image uses any optional +kernel features, then the whole image should be cloned with all function bodies +emptied. This cloned device image will be further referred to as "dummy virtual +functions device image". + +This dummy device image is needed to support the example showed above when a +kernel creates an object of a polymorhpic class where some of virtual functions +use optional features. LLVM IR generated by front-end will contain a vtable, +which references all methods of the class. However, not all of them can be +directly included into kernel's device image to avoid speculative compilation. + +When such kernel is submitted to a device, runtime will check which optional +features are supported and link one or another device image with virtual +functions. + +#### New device image properties + +To let runtime know which device images should be linked together to get virtual +functions working, new property set is introduced: "SYCL/virtual functions". + +For device images, which contain virtual functions (i.e. ones produced by +outlining `indirectly_callable` functions into a separate device image), the +following properties are set within the new property set: +- "virtual-functions-set" with a string value containing name of virtual + functions set contained within the image (value of the property argument); +- "dummy-image=1" if an image is a dummy virtual functions device image; + +For other device images, the following properties are set within the new +property set: +- "calls-virtual-functions-set" with a string value containing comma-separated + list of names of virtual function sets used by kernels in the image (as + indicated by `calls_indirectly` kernel property); +- "creates-virtual-functions-set" with a string value containing comma-separate + list of names of virtual function sets which are referenced from functions + included into vtables used by a kernel within a device image; + **TODO:** this item definitely needs better description + +### Changes to the runtime + +When a kernel submitted to a device comes from a device image with some +properties set in "SYCL/virtual functions" property set, then runtime does some +extra actions to link several device images together to ensure that the kernel +can be executed. + +Algorithm for discovery of device images which has to be linked: +- if device image has property "calls-virtual-functions-set=A,B,...,N" on it, + then all device images with "virtual-functions-set" property equal to "A", + "B", ..., "N" are taken to be linked with the initial device image; +- if device image has property "creates-virtual-functions-set=A,B,...,N" on it, + then for each device image with "virtual-functions-set" property equal to "A", + "B", ..., "N" and *without* "dummy-image=1" property on it: + - if that device image is compatible with device, it is taken to be linked + with the initial device image; + - otherwise, runtime looks for a device image with the same + "virtual-functions-set" property, but *with* "dummy-image=1" property on it + and takes that device image to be linked with the initial device image; +Produced list of device images is then linked together and used to enqueue a +kernel. +**TODO:** do we need to say anything about in-memory and on-disk cache +functionality here? [1]: <../extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc> [2]: From 846404115004f84c6ac0ce3608a6fcbf62189316 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 18 Sep 2023 06:59:34 -0700 Subject: [PATCH 06/51] Headers part design doc update --- sycl/doc/design/VirtualFunctions.md | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 298ddba69b382..f340c45a62840 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -11,9 +11,17 @@ Corresponding language extension specification: New compile-time properties `indirectly_callable` and `calls_indirectly` should be implemented in accordance with the corresponding [design document][2]. -**TODO**: `calls_indirectly` requires conversion from C++ typename to a string. -Document how it should be done. `__sycl_builtin_unique_stable_name` should -likely be used. +`indirectly_callable` property should lead to emission of +`"indirectly-callable"="set"` function attribute, where "set" is a string +representation of the property template parameter. + +`calls_indirectly` property should lead to emission of +`"calls-indirectly"="set1,set2"`, where "set1" and "set2" are string +representations of the property template parameters. + +In order to convert a type to a string, [\__builtin_sycl_unique_stable_name][3] +could be used. + **TODO**: `calls_indirectly` requires compile-time concatenation of strings. Document how it should be done. @@ -173,4 +181,5 @@ functionality here? [1]: <../extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc> [2]: +[3]: https://clang.llvm.org/docs/LanguageExtensions.html#builtin-sycl-unique-stable-name From d30ea9ae35bacff6b4a68b8cd468d561783cd190 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 18 Sep 2023 08:53:57 -0700 Subject: [PATCH 07/51] Spec tweaks --- .../sycl_ext_intel_virtual_functions.asciidoc | 142 +++++++++++------- 1 file changed, 86 insertions(+), 56 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 7240e8899db40..0221f8441c55f 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -72,7 +72,7 @@ The main purpose of this extension is to reduce amount of SYCL language restrictions for device code by allowing to call virtual member functions from device functions. -NOTE: this extension **does not** cover (i.e. enables) things like +NOTE: this extension **does not** cover (i.e. doesn't enable) things like `dynamic_cast`, `typeid` or calls through function pointers. == Specification @@ -109,45 +109,45 @@ does not apply if this extension is supported by an implementation: However, there are still some limitations of how virtual member functions can be used: -- if an object is constructed in host code, calling a virtual function for that - object in device code has undefined behavior -- if an object is constructed in device code, calling a virtual function for - that object in host code has undefined behavior +- if an object is constructed in host code, calling a virtual member function + for that object in device code has undefined behavior; +- if an object is constructed in device code, calling a virtual member function + for that object in host code has undefined behavior; === New properties Due to the indirect nature of virtual member functions, compiler may not be able -to understand member function of which exact class is being called. Moreover, -when code is distributed over several translation units, compiler may not be -able to even see definitions of all virtual member functions which may be called -in a kernel. +to understand which exact virtual member function is being called (i.e. which +class it belongs to). Moreover, when code is distributed over several +translation units, compiler may not be able to even see definitions of all +virtual member functions which may be called in a kernel. Therefore, to provide a mechanism for an implementation to help detect virtual member functions, which are going to be used from kernels and enforce necessary -restrictions on them, new compile-time-constant properties are proposed: +restrictions on them, new compile-time-constant properties are proposed. [source,dpcpp] ---- namespace sycl::ext::oneapi::experimental { struct indirectly_callable_key { - template - using value_t = property_value; + template + using value_t = property_value; }; struct calls_indirectly_key { - template + template using value_t = std::conditional_t, - property_value>; + property_value>; }; - template - inline constexpr indirectly_callable_key::value_t indirectly_callable; + template + inline constexpr indirectly_callable_key::value_t indirectly_callable; - template - inline constexpr calls_indirectly_key::value_t calls_indirectly; + template + inline constexpr calls_indirectly_key::value_t calls_indirectly; template <> struct is_property_key : std::true_type {}; @@ -155,50 +155,79 @@ namespace sycl::ext::oneapi::experimental { } ---- +Before describing those properties those properties in more detail, a couple of +new terms are introduced to simplify the extension specification: + +Set of virtual member functions:: a group of virtual member functions which are +defined with `indirectly_callable` property with the same value of the property +parameter `SetId`. For the simplicity, this will also be further refferred to as +a set, or as a set of virtual functions. + +Kernel declares a use of a set of virtual member functions:: a kernel is +considered to be declaring a use of a set of virtual member functions `SetIdA` +when it is submitted with `calls_indirectly` property with `SetIdA` included +into the property parameter `SetIds`. If `SetIdA` is not included into the +property parameter `SetIds`, or if a kernel is submitted without the property, +then it is *not* considered to be declaring a use of the set of virtual member +functions. + +:fn-property-macro-docs: footnote:[To read more about the +`SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro, see +link:../experimental/sycl_ext_oneapi_kernel_properties.asciidoc[ + sycl_ext_oneapi_kernel_properties] extension] + |=== |Property|Description |`indirectly_callable` -|The `indirectly_callable` property marks a virtual member function as a device -function, thus making it available to be called from SYCL kernel and device -functions. Should only be applied to virtual member functions and to do so, -`SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro should be used. +|The `indirectly_callable` property indicates that a virtual member function as +a device function, thus making it available to be called from SYCL kernel and +device functions. Should only be applied to virtual member functions and to do +so, `SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro{fn-property-macro-docs} should be +used. + +Optional parameter `SetId` specifies a set of virtual member functions this +function belongs to and at the same time it defines a group of kernels, which +can call this function, it must be a C++ typename. When the parameter is +ommitted, a virtual member function is considered to belong to a default set. -Parameter `Set` specifies a group of kernels, which can call this virtual member -function, it must be a C++ typename. Calling a virtual member function from a -kernel without `calls_indirectly` property, or with a `calls_indirectly` -property with a value which does not include the same `Set` as specified by -`indirectly_callable` in its parameter is an undefined behavior. +Calling a virtual member function from a kernel which does not declare use of a +set the virtual member function belongs to is an undefined behavior. |`calls_indirectly` -|The `calls_indirectly` property marks a SYCL kernel function as performing -calls through virtual member functions. - -Parameter `Sets` specifies groups of virtual member functions which can be -called from this kernel, it must be a C++ typename. Calling a virtual member -function without `indirectly_callable` property, or with an -`indirectly_callable` property with a value of `Set` parameter which is not part -of `Sets` is an undefined behavior. +|The `calls_indirectly` property indicates that a SYCL kernel function is +performing calls through virtual member functions and declares use one or more +of sets of virtual member functions. + +Optional parameter `SetIds` specifies which sets of virtual member functions are +declared to be used by the kernel, it must be zero or more comma-separated C++ +typenames. If the argument is omitted (zero C++ typenames specified), than a +kernel is considered to be using a default set of virtual member functions. + +Calling a virtual member function, which does not belong to any of sets of +virtual member functions declared to be used is an undefined behavior. |=== If a virtual member function is called from device code, both definition and declaration of that function must be decorated with the +indirectly_callable+ -property. `Set` property parameter must match between definition and +property. `SetId` property parameter must match between definition and declaration and implementation should provide a diagnostic in case of mismatch. Applying the +indirectly_callable+ property to a SYCL Kernel function is illegal and an implementation should produce a diagnostic for that. Applying the +indirectly_callable+ property to an arbitrary device function, -which is not a virtual member function has no effect. NOTE: This behavior may be -changed in either future version of this extension or in other extensions. +which is not a virtual member function has no effect. + +NOTE: This behavior may be changed in either future version of this extension or +in another extensions. Virtual member functions that are decorated with the +indirectly_callable+ -property are considered device functions, which must obey the restrictions -listed in section 5.4 of the core SYCL specification "Language restrictions for -device functions". Virtual member functions that are not decorated with this -attribute do not need to obey these restrictions, even if other definitions of -that virtual member function in other classes in the inheritance hierarchy are -decorated with the attribute. +property are considered to be device functions, i.e. they must obey the +restrictions listed in section 5.4 of the core SYCL specification "Language +restrictions for device functions". Virtual member functions that are not +decorated with this attribute do not need to obey these restrictions, even if +other definitions of that virtual member function in other classes in the +inheritance hierarchy are decorated with the attribute. [source,dpcpp] ---- @@ -230,14 +259,14 @@ void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( ptr->bar() // Note: this kernel can only call 'Foo::foo' and 'Foo::bar' but not - // 'Foo::baz', because the latter is declared within a different "set". + // 'Foo::baz', because the latter is declared within a different set. }); ---- Reasons about why it may be necessary to put different virtual functions into -different "sets" are explained in further sections, but for simplicity purposes -both properties cane be used without explicitly specifying a "set", thus using -a default "set": +different sets are explained in further sections, but for simplicity purposes +both properties can be used without explicitly specifying a set, thus using +a default set: [source,dpcpp] ---- @@ -247,7 +276,7 @@ struct set_A; class Foo { public: - // properties to functions should be applied using the macro: + // This virtual member function belongs to a default set of virtual functions. virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( syclext::indirectly_callable<>) void foo() {} @@ -260,7 +289,7 @@ public: ptr->bar() // Note: this kernel can only call 'Foo::foo' but not 'Foo::bar', because the - // latter is declared within a different (non-default) "set". + // latter belongs to a different (non-default) set of virtual functions. }); ---- @@ -282,15 +311,16 @@ When the attribute is applied to a function: > any of the functions it calls) uses an optional feature that is associated > with an aspect that is not listed in the attribute. -Due to dynamic nature of virtual member functions, compiler is not able to -perform static analysis of a call graph in order to understand which exact -virtual functions are called from which kernels, in general case. +Due to dynamic nature of virtual member functions, compiler in general case is +not able to perform static analysis of a call graph in order to understand which +exact virtual functions are called from which kernels. Instead, information from the new properties is used by an implementation to issue such diagnostic. When determining a set of aspects which are used by a SYCL kernel function, implementation also takes into account all aspects which -are used by all virtual member functions included into all "sets" listed in -`calls_indirectly` property. +are used by all virtual member functions included into all sets of virtual +member functions declared to be used by a kernel. + [source,dpcpp] ---- @@ -340,7 +370,7 @@ q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] [[sycl::device_has()]] { // No diagnostic is required for this kernel, because list of declared aspects // matches list of used aspects. There are no virtual member functions defined - // in a default "set", which means that no extra optional kernel features + // in a default set, which means that no extra optional kernel features // requirements were attached to the kernel. }); ---- From cdaa1bc79daacc47140fdf79085a5ef954822920 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 18 Sep 2023 08:54:43 -0700 Subject: [PATCH 08/51] Try to fix footnote --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 0221f8441c55f..f13a400a72751 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -171,10 +171,7 @@ property parameter `SetIds`, or if a kernel is submitted without the property, then it is *not* considered to be declaring a use of the set of virtual member functions. -:fn-property-macro-docs: footnote:[To read more about the -`SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro, see -link:../experimental/sycl_ext_oneapi_kernel_properties.asciidoc[ - sycl_ext_oneapi_kernel_properties] extension] +:fn-property-macro-docs: footnote:[To read more about the `SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro, see link:../experimental/sycl_ext_oneapi_kernel_properties.asciidoc[sycl_ext_oneapi_kernel_properties] extension] |=== |Property|Description From 89ea15176c7730204bc90c57e7ca5662e5bced30 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 19 Sep 2023 02:50:50 -0700 Subject: [PATCH 09/51] Add sub-group-sizes handling section; some typo fixes and further tweaks --- .../sycl_ext_intel_virtual_functions.asciidoc | 46 +++++++++++++++---- 1 file changed, 36 insertions(+), 10 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index f13a400a72751..c364b079e1c96 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -111,8 +111,9 @@ be used: - if an object is constructed in host code, calling a virtual member function for that object in device code has undefined behavior; -- if an object is constructed in device code, calling a virtual member function - for that object in host code has undefined behavior; +- if an object is constructed in device code on a device `A`, calling a virtual + member function for that object in host code, or on another device `B` has + undefined behavior; === New properties @@ -359,7 +360,7 @@ q.single_task(syclext::properties{syclext::calls_indirectly}, q.single_task(syclext::properties{syclext::calls_indirectly}, [=] [[sycl::device_has(sycl::aspect::fp64)]] { // No diagnostic is required for this kernel, because list of declared aspects - // matches list of used aspects. That includes viratul member functions from + // matches list of used aspects. That includes virtual member functions from // "set_A", which includes 'Foo::foo' that uses 'fp64' aspect }); @@ -443,11 +444,42 @@ q.single_task([=] { }); ---- +==== Interation with `reqd_sub_group_size` attribute + +The `reqd_sub_group_size` attribute is a bit of a special case comparing to +other optional kernel features, because it requires to compile a kernel in a +certain way, which may require special handling for all functions which are +called from it. + +When the same helper function is called from two or more kernels +with different `reqd_sub_group_size` attribute, it may be required for the +implementation to duplicate that helper function to create different versions +of it tailored to different sub-group sizes. It can be done in a straightforward +manner when operating on a static call graph. + +Virtual member functions are essentially called indirectly and pointers to them +are initialized just once when an object of a polymorhpic class is being +created. Therefore, to support calling such virtual member function from two or +more kernels with different `reqd_sub_group_size`, each kernel may need to +receive a different pointer to a different version of a virtual member function. + +To avoid possibly posing such multi-versioning requirements on implementations, +virtual member functions can only be called from a kernel with _primary_ +sub-group-size as defined by +link:../experimental/sycl_ext_oneapi_named_sub_group_sizes.asciidoc[ +sycl_ext_oneapi_named_sub_group_sizes] extension, or otherwise behavior is +undefined. + +NOTE: for implementations that don't support +`sycl_ext_oneapi_named_sub_group_sizes` extension, virtual member functions can +only be called from kernels which *don't* have `reqd_sub_group_size` attribute +set on them explicitly, or otherwise behavior is undefined. + === Kernel bundles and device images When an object of a polymorphic class is constructed, it stores a pointer to virtual table, which points to its virtual member functions. Addresses of those -functions are only accessible and valid within a kernel bundle containing a +functions are accessible and valid only within a kernel bundle containing a kernel which used to construct an object. Performing calls to virtual member functions of an object constructed in a @@ -501,12 +533,6 @@ int main() { } ---- -As a side effect, the restriction above also means that if an object of a -polymorphic class was constructed by a kernel within a kernel bundle associcated -only with a device `DA`, then performing calls to virtual member functions of -that object in any kernels submitted to a device is an undefined behavior. - - == Example usage [source,dpcpp] From b2ab2024ccfc0b1fbe825ea4d72076df755e74db Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 2 Oct 2023 02:37:54 -0700 Subject: [PATCH 10/51] updates to aspects propagation section --- sycl/doc/design/VirtualFunctions.md | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index f340c45a62840..473022b32224f 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -40,7 +40,7 @@ on virtual member functions marked with the compile-time property. ### Changes to the compiler middle-end -#### Aspects propagation and related diagnostics +#### Aspects propagation Aspects propagation pass should be extended to not only gather aspects which are used directly, but also aspects that are used indirectly, through virtual @@ -48,10 +48,20 @@ functions. For that the pass should compile a list of aspects used by each set of indirectly callable functions (as defined by `indirectly_callable` property set -by user) and then append those aspects to every kernel which uses those sets. +by user) and then append those aspects to every kernel which use those sets (as +defiend by `calls_indirectly` property set by user). -Diagnostic should be emitted if a kernel is marked with `device_has` attribute -or property which doesn't include an indirectly used aspect. +NOTE: if the aspects propagation pass is ever extended to track function +pointers, then aspects attached to virtual functions **should not** be attached +to kernels using this mechanism. For example, if a kernel uses a variable, +which is initialized with a function pointer to a virtual function which uses +an aspect, then such kernel **should not** be considered as using that aspect. +Properties-based mechanism which is described above should be used for aspects +propagation for virtual functions. + +#### New compiler diagnostics + +**TBD** #### Device code split and device images From b5b2e26d87b4b3653ca8c9ac3fc6deecfa1c87fb Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 2 Oct 2023 03:15:17 -0700 Subject: [PATCH 11/51] Update properies definition based on implmenetation poc feedback; introduce new macro for indirectly_callable property --- .../sycl_ext_intel_virtual_functions.asciidoc | 68 ++++++++----------- 1 file changed, 28 insertions(+), 40 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index c364b079e1c96..ea98d7bb23e5c 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -137,18 +137,16 @@ namespace sycl::ext::oneapi::experimental { }; struct calls_indirectly_key { - template - using value_t = - std::conditional_t, - property_value>; + template + using value_t = property_value; }; template inline constexpr indirectly_callable_key::value_t indirectly_callable; - template - inline constexpr calls_indirectly_key::value_t calls_indirectly; + template + inline constexpr calls_indirectly_key::value_t + calls_indirectly; template <> struct is_property_key : std::true_type {}; @@ -172,16 +170,15 @@ property parameter `SetIds`, or if a kernel is submitted without the property, then it is *not* considered to be declaring a use of the set of virtual member functions. -:fn-property-macro-docs: footnote:[To read more about the `SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro, see link:../experimental/sycl_ext_oneapi_kernel_properties.asciidoc[sycl_ext_oneapi_kernel_properties] extension] - |=== |Property|Description |`indirectly_callable` |The `indirectly_callable` property indicates that a virtual member function as a device function, thus making it available to be called from SYCL kernel and device functions. Should only be applied to virtual member functions and to do -so, `SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro{fn-property-macro-docs} should be -used. +so, function-style `SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY` macro should +be used. It accepts a single optional argument, which is passed to the property +parameter `SetId` and therefore has all the same requirements. Optional parameter `SetId` specifies a set of virtual member functions this function belongs to and at the same time it defines a group of kernels, which @@ -237,19 +234,15 @@ struct set_B; class Foo { public: // properties to functions should be applied using the macro: - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void foo() {} + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void foo() {} // both declaration and definition should be annotated: - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void bar(); + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void bar(); - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void baz() {} + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) void baz() {} }; -void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) Foo::bar() {} +void SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) Foo::bar() {} // kernel calling virtual function should also be annotated: /* */.single_task(syclext::properties{syclext::calls_indirectly}, [=] { @@ -275,11 +268,9 @@ struct set_A; class Foo { public: // This virtual member function belongs to a default set of virtual functions. - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable<>) void foo() {} + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() {} - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void bar() {} + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void bar() {} }; /* */.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { @@ -291,6 +282,10 @@ public: }); ---- +NOTE: `void` typename is explicitly reserved as identifier of the default +virtual functions set. It can still be used, but with the expectation that it +is also implicitly used by the extension implementation. + === Optional kernel features handling The core SYCL specification (5.8 Attributes for device code) says the following @@ -329,14 +324,12 @@ struct set_B; struct Foo { // This function uses 'fp64' aspect - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void foo() { + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void foo() { double d = 3.14; } // This function uses 'fp16' aspect - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void bar() { + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) void bar() { sycl::half h = 2.71; } }; @@ -386,13 +379,11 @@ struct set_A; struct set_B; struct Foo { - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void foo() { + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void foo() { double d = 3.14; } - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void bar() {} + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) void bar() {} }; sycl::queue q(/* device selector returns a device *without* fp64 support */); @@ -419,13 +410,11 @@ through the correponding properties. using syclext = sycl::ext::oneapi::experimental; struct Foo { -virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable<>) void foo() { +virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() { double d = 3.14; } -virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable<>) void bar() {} +virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void bar() {} }; sycl::queue q(/* device selector choosing a device *without* fp64 support */); @@ -490,8 +479,7 @@ kernel from a different kernel bundle is an undefined behavior. using syclext = sycl::ext::oneapi::experimental; struct Base { - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable<>) void foo() {} + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() {} }; class Constructor; @@ -543,8 +531,8 @@ using syclext = sycl::ext::oneapi::experimental; class Base { public: - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable<>) int get_random_number() { + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() + int get_random_number() { return 4; // Chosen by fair dice roll. Guaranteed to be random } @@ -556,7 +544,7 @@ public: class Derived : public Base { public: - SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() int get_random_number() override { return 221; } From cac42479dfe275825a29b318ab700f58c6d8acd3 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 28 Nov 2023 04:20:39 -0800 Subject: [PATCH 12/51] Pass with minor tweaks and fixes over the extension spec --- .../sycl_ext_intel_virtual_functions.asciidoc | 74 ++++++++++--------- 1 file changed, 38 insertions(+), 36 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index ea98d7bb23e5c..a31a242058a80 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -160,7 +160,7 @@ new terms are introduced to simplify the extension specification: Set of virtual member functions:: a group of virtual member functions which are defined with `indirectly_callable` property with the same value of the property parameter `SetId`. For the simplicity, this will also be further refferred to as -a set, or as a set of virtual functions. +a _set_, or as a _set of virtual functions_. Kernel declares a use of a set of virtual member functions:: a kernel is considered to be declaring a use of a set of virtual member functions `SetIdA` @@ -173,50 +173,50 @@ functions. |=== |Property|Description |`indirectly_callable` -|The `indirectly_callable` property indicates that a virtual member function as +|The `indirectly_callable` property indicates that a virtual member function is a device function, thus making it available to be called from SYCL kernel and device functions. Should only be applied to virtual member functions and to do so, function-style `SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY` macro should -be used. It accepts a single optional argument, which is passed to the property -parameter `SetId` and therefore has all the same requirements. +be used. The macro accepts a single optional argument, which is passed to the +property parameter `SetId` and therefore has all the same requirements. Optional parameter `SetId` specifies a set of virtual member functions this function belongs to and at the same time it defines a group of kernels, which can call this function, it must be a C++ typename. When the parameter is -ommitted, a virtual member function is considered to belong to a default set. +ommitted, a virtual member function is considered to belong to the default set. Calling a virtual member function from a kernel which does not declare use of a set the virtual member function belongs to is an undefined behavior. |`calls_indirectly` |The `calls_indirectly` property indicates that a SYCL kernel function is -performing calls through virtual member functions and declares use one or more -of sets of virtual member functions. +performing calls through virtual member functions and declares use of one or +more sets of virtual member functions. Optional parameter `SetIds` specifies which sets of virtual member functions are -declared to be used by the kernel, it must be zero or more comma-separated C++ -typenames. If the argument is omitted (zero C++ typenames specified), than a -kernel is considered to be using a default set of virtual member functions. +declared to be used by a kernel, it must be zero or more C++ typenames. If the +argument is omitted (zero C++ typenames specified), then a kernel is considered +to be using the default set of virtual member functions. Calling a virtual member function, which does not belong to any of sets of virtual member functions declared to be used is an undefined behavior. |=== If a virtual member function is called from device code, both definition and -declaration of that function must be decorated with the +indirectly_callable+ +declaration of that function must be decorated with the `indirectly_callable` property. `SetId` property parameter must match between definition and -declaration and implementation should provide a diagnostic in case of mismatch. +declaration, or otherwise behavior is undefined. -Applying the +indirectly_callable+ property to a SYCL Kernel function is illegal +Applying the `indirectly_callable` property to a SYCL Kernel function is illegal and an implementation should produce a diagnostic for that. -Applying the +indirectly_callable+ property to an arbitrary device function, +Applying the `indirectly_callable` property to an arbitrary device function, which is not a virtual member function has no effect. NOTE: This behavior may be changed in either future version of this extension or in another extensions. -Virtual member functions that are decorated with the +indirectly_callable+ +Virtual member functions that are decorated with the `indirectly_callable` property are considered to be device functions, i.e. they must obey the restrictions listed in section 5.4 of the core SYCL specification "Language restrictions for device functions". Virtual member functions that are not @@ -236,13 +236,14 @@ public: // properties to functions should be applied using the macro: virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void foo() {} - // both declaration and definition should be annotated: - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void bar(); + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void bar() {}; - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) void baz() {} + // both declaration and definition should be annotated: + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) void baz(); }; -void SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) Foo::bar() {} +// both declaration and definition should be annotated: +void SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) Foo::baz() {} // kernel calling virtual function should also be annotated: /* */.single_task(syclext::properties{syclext::calls_indirectly}, [=] { @@ -257,7 +258,7 @@ void SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) Foo::bar() {} Reasons about why it may be necessary to put different virtual functions into different sets are explained in further sections, but for simplicity purposes both properties can be used without explicitly specifying a set, thus using -a default set: +the default set: [source,dpcpp] ---- @@ -267,7 +268,8 @@ struct set_A; class Foo { public: - // This virtual member function belongs to a default set of virtual functions. + // This virtual member function belongs to the default set of virtual + // functions. virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() {} virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void bar() {} @@ -319,55 +321,55 @@ member functions declared to be used by a kernel. ---- using syclext = sycl::ext::oneapi::experimental; -struct set_A; -struct set_B; +struct set_fp64; +struct set_fp16; struct Foo { // This function uses 'fp64' aspect - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void foo() { + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_fp64) void f64() { double d = 3.14; } // This function uses 'fp16' aspect - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) void bar() { + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_fp16) void f16() { sycl::half h = 2.71; } }; sycl::queue q; -q.single_task(syclext::properties{syclext::calls_indirectly}, +q.single_task(syclext::properties{syclext::calls_indirectly}, [=] [[sycl::device_has(sycl::aspect::fp64)]] { // Diagnostic is required for this kernel, because it is declared as only // using 'fp64' aspect, but it also uses virtual member functions from - // "set_B", which includes 'Foo:bar' that uses 'fp16' aspect. + // "set_fp16", which includes 'Foo::f16' that uses 'fp16' aspect. }); -q.single_task(syclext::properties{syclext::calls_indirectly}, +q.single_task(syclext::properties{syclext::calls_indirectly}, [=] [[sycl::device_has()]] { // Diagnostic is required for this kernel, because it is declared as not // using any optional features, but it also uses virtual member functions from - // "set_A", which includes 'Foo::foo' that uses 'fp64' aspect. + // "set_fp64", which includes 'Foo::f64' that uses 'fp64' aspect. }); -q.single_task(syclext::properties{syclext::calls_indirectly}, +q.single_task(syclext::properties{syclext::calls_indirectly}, [=] [[sycl::device_has(sycl::aspect::fp64)]] { // No diagnostic is required for this kernel, because list of declared aspects // matches list of used aspects. That includes virtual member functions from - // "set_A", which includes 'Foo::foo' that uses 'fp64' aspect + // "set_fp64", which includes 'Foo::f64' that uses 'fp64' aspect }); q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] [[sycl::device_has()]] { // No diagnostic is required for this kernel, because list of declared aspects // matches list of used aspects. There are no virtual member functions defined - // in a default set, which means that no extra optional kernel features + // in the default set, which means that no extra optional kernel features // requirements were attached to the kernel. }); ---- Submitting a kernel with `calls_indirectly` property, which includes virtual -member functions that use optional kernel features to a device which doesn't +member functions that use optional kernel features to a device that doesn't support them, should result in an exception at runtime, similar to how it is defined by the core SYCL specification. @@ -433,7 +435,7 @@ q.single_task([=] { }); ---- -==== Interation with `reqd_sub_group_size` attribute +==== Interaction with `reqd_sub_group_size` attribute The `reqd_sub_group_size` attribute is a bit of a special case comparing to other optional kernel features, because it requires to compile a kernel in a @@ -453,9 +455,9 @@ more kernels with different `reqd_sub_group_size`, each kernel may need to receive a different pointer to a different version of a virtual member function. To avoid possibly posing such multi-versioning requirements on implementations, -virtual member functions can only be called from a kernel with _primary_ +virtual member functions can only be called from kernels with _primary_ sub-group-size as defined by -link:../experimental/sycl_ext_oneapi_named_sub_group_sizes.asciidoc[ +link:../proposed/sycl_ext_oneapi_named_sub_group_sizes.asciidoc[ sycl_ext_oneapi_named_sub_group_sizes] extension, or otherwise behavior is undefined. From b492a06ac7ae351314d77fdfd2e32c7387310f39 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 28 Nov 2023 04:21:24 -0800 Subject: [PATCH 13/51] An attempt to fix ++ rendering --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index a31a242058a80..b18e31f34a1aa 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -194,8 +194,8 @@ performing calls through virtual member functions and declares use of one or more sets of virtual member functions. Optional parameter `SetIds` specifies which sets of virtual member functions are -declared to be used by a kernel, it must be zero or more C++ typenames. If the -argument is omitted (zero C++ typenames specified), then a kernel is considered +declared to be used by a kernel, it must be zero or more C\++ typenames. If the +argument is omitted (zero C\++ typenames specified), then a kernel is considered to be using the default set of virtual member functions. Calling a virtual member function, which does not belong to any of sets of From dd8b1821ea58ad3afe79d442954783c11a0de83a Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 28 Nov 2023 04:23:22 -0800 Subject: [PATCH 14/51] Fixup previous commit --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index b18e31f34a1aa..e2de14ea0587a 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -195,7 +195,7 @@ more sets of virtual member functions. Optional parameter `SetIds` specifies which sets of virtual member functions are declared to be used by a kernel, it must be zero or more C\++ typenames. If the -argument is omitted (zero C\++ typenames specified), then a kernel is considered +argument is omitted (zero C++ typenames specified), then a kernel is considered to be using the default set of virtual member functions. Calling a virtual member function, which does not belong to any of sets of From 012a50f2320c498987d7d36aa928bffaf64c796b Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 28 Nov 2023 06:46:12 -0800 Subject: [PATCH 15/51] attempt to fix render --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index e2de14ea0587a..20898516bfde7 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -296,13 +296,13 @@ device functions. When the attribute is applied to a kernel: -> ... it causes the compiler to issue a diagnostic if the kernel (or any of the +> \... it causes the compiler to issue a diagnostic if the kernel (or any of the > functions it calls) uses an optional feature that is associated with an aspect > that is not listed in the attribute. When the attribute is applied to a function: -> ... it causes the compiler to issue a diagnostic if the device function (or +> \... it causes the compiler to issue a diagnostic if the device function (or > any of the functions it calls) uses an optional feature that is associated > with an aspect that is not listed in the attribute. From 8408f841db69f1a823bc5f27757179795e5699a0 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 28 Nov 2023 06:48:02 -0800 Subject: [PATCH 16/51] Fix doxygen build --- sycl/doc/design/VirtualFunctions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 473022b32224f..9e4fda62c5a3d 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -189,7 +189,7 @@ kernel. **TODO:** do we need to say anything about in-memory and on-disk cache functionality here? -[1]: <../extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc> +[1]: <../extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc> [2]: [3]: https://clang.llvm.org/docs/LanguageExtensions.html#builtin-sycl-unique-stable-name From fc5c99041275a5deb68a6bb86f45237b3f835b3d Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 1 Dec 2023 02:56:42 -0800 Subject: [PATCH 17/51] Add virtual functions implementation design overview --- sycl/doc/design/VirtualFunctions.md | 87 +++++++++++++++++++ .../sycl_ext_intel_virtual_functions.asciidoc | 3 +- 2 files changed, 88 insertions(+), 2 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 9e4fda62c5a3d..e188da851ca19 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -3,6 +3,91 @@ Corresponding language extension specification: [sycl_ext_oneapi_virtual_functions][1] +## Overview + +Main complexity of the feature comes from its co-existence with optional kernel +features ([SYCL 2020 spec][sycl-spec-optional-kernel-features], +[implementaiton design][optional-kernel-features-design]) mechanism. Consider +the following example: + +```c++ +using syclext = sycl::ext::oneapi::experimental; + +struct set_fp64; + +struct Base { + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() {} + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_fp64) void bar() { + // this virtual function uses double + double d = 3.14; + } +}; + +class Constructor; +class Use; +class UseFP64; + +int main() { + // Selected device may not support 'fp64' aspect + sycl::queue Q; + + Base *Obj = sycl::malloc_device(1, Q); + int *Result = sycl::malloc_shared(2, Q); + + Q.single_task([=] { + // Only placement new can be used within device functions. + // When an object of a polymorphic class is created, its vtable is filled + // with pointer to virtual member functions. However, we don't always know + // featuures supported by a target device (in case of JIT) and therefore + // can't decide whether both 'foo' and 'bar' should be both included in the + // resulting device image - the decision must be made at runtime when we + // know the target device. + new (Obj) Derived; + }); + + // The same binary produced by thy sycl compiler should correctly work on both + // devices with and without support for 'fp64' aspect. + Q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { + Obj->foo(); + }); + + if (Q.get_device().has(sycl::aspect::fp64)) { + Q.single_task(syclext::properties{syclext::calls_indirectly}, + [=] { + Obj->bar(); + }); + } + + return 0; +} +``` + +As comments in the snippet say the main issue is with vtables: at compile time +it may not be clear which exact functions can be safely included in there and +which are not in order to avoid speculative compilation and fulfill optional +kernel features requirements from the SYCL 2020 specificaiton. + +To solve this, the following approach is used: all virtual functions marked with +`indirectly_callable` property are grouped by set they belong to and outlined +into separate device images (i.e. device images with kernels using them are left +with declarations only of those virtual functions). + +For each device image with virtual functinos that use optional features we also +create a "dummy" version of it where bodies of all virtual functions are +emptied. + +Dependencies between deivce images are recorded in properties based on +`calls_indirectly` and `indirectly_callable` properties. They are used later by +runtime to link them together. Device images which depend on optional kernel +features are linked only if those features are supported by a target device and +dummyy versions of those device images are used otherwise. + +This way we can emit single unified version of LLVM IR where vtables reference +all device virtual functions, but their definitions are outlined and linked +back dynamically based on device capabilities. + +For AOT flow, we don't do outlining and dynamic linking, but instead do direct +cleanup of virtual functions which are incompatible with a target device. ## Design @@ -192,4 +277,6 @@ functionality here? [1]: <../extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc> [2]: [3]: https://clang.llvm.org/docs/LanguageExtensions.html#builtin-sycl-unique-stable-name +[sycl-spec-optional-kernel-features]: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:optional-kernel-features +[optional-kernel-features-design]: diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 20898516bfde7..b3788894eef9b 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -503,8 +503,7 @@ int main() { Q.submit([&](sycl::handler &CGH) { CGH.use_kernel_bundle(bundleA); - CGH.single_task(syclext::properties{syclext::calls_indirectly<>}, - [=] { + CGH.single_task([=] { // Only placement new can be used within device functions. new (Obj) Derived; }); From 4edec6485f468533fe05b2673d57a0a08a7ee83b Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 1 Dec 2023 06:35:15 -0800 Subject: [PATCH 18/51] design doc update --- sycl/doc/design/VirtualFunctions.md | 47 +++++++++++++++++++++-------- 1 file changed, 35 insertions(+), 12 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index e188da851ca19..b123b88abb521 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -94,15 +94,14 @@ cleanup of virtual functions which are incompatible with a target device. ### Changes to the SYCL header files New compile-time properties `indirectly_callable` and `calls_indirectly` should -be implemented in accordance with the corresponding [design document][2]. +be implemented in accordance with the corresponding [design document][2]: -`indirectly_callable` property should lead to emission of -`"indirectly-callable"="set"` function attribute, where "set" is a string -representation of the property template parameter. - -`calls_indirectly` property should lead to emission of -`"calls-indirectly"="set1,set2"`, where "set1" and "set2" are string -representations of the property template parameters. +- `indirectly_callable` property should lead to emission of + `"indirectly-callable"="set"` function attribute, where "set" is a string + representation of the property template parameter. +- `calls_indirectly` property should lead to emission of + `"calls-indirectly"="set1,set2"`, where "set1" and "set2" are string + representations of the property template parameters. In order to convert a type to a string, [\__builtin_sycl_unique_stable_name][3] could be used. @@ -110,18 +109,42 @@ could be used. **TODO**: `calls_indirectly` requires compile-time concatenation of strings. Document how it should be done. +`indirectly_callable` property is applied to functions using "custom" (comparing +to other properties) `SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY` macro. This +is done to allow implementations to attach some extra attributes alongside the +property. In particular, functions marked with the macro should be considered +SYCL device functions and compiler should emit diagnostics if those functions +do not conform with the SYCL 2020 specification. To achieve that and avoid +extending FE to parse strings within properties, the aforementioned macro should +also set `sycl_device` attribute: + +``` +#define SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(SetId) \ + __attribute__((sycl_device)) [[__sycl_detail__::add_ir_attribute_function( \ + "indirectly-callable", __builtin_sycl_unique_stable_name(SetId))]] +``` + ### Changes to the compiler front-end -Compiler front-end should be updated to respect rules defined by the -[extension specifiction][1], such as: +Most of the handling for virtual functions happens in middle-end and thanks to +compile-time properties, no extra work is required to propagate necessary +information down to passes from headers. + +However, we do need to filter out those virtual functions which are not +considered to be device as defined by the [extension specifiction][1], such as: - virtual member functions annotated with `indirectly_callable` compile-time property should be emitted into device code; - virtual member function *not* annotated with `indirectly_callable` compile-time property should *not* be emitted into device code; -**TODO**: investigate if it is possible to enforce SYCL device code restrctions -on virtual member functions marked with the compile-time property. +There is no need to actually check which exact property is applied to a +function, it is enough to check if `add_ir_attribute_function` attribute was +applied and that we are in SYCL device mode to decide whether or not a virtual +function should be emitted into vtable and device code. + +**TODO:** any extra diagnostics we would like to emit? Like kernel without +`calls_indirectly` property performing virtual function call. ### Changes to the compiler middle-end From 9fa3bc0dc7e2040d64a745b7b555e3d164f2fbdd Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 5 Dec 2023 04:54:39 -0800 Subject: [PATCH 19/51] Design doc updates --- sycl/doc/design/VirtualFunctions.md | 109 +++++++++++++++++++++++++--- 1 file changed, 99 insertions(+), 10 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index b123b88abb521..52c60adcbeebc 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -38,11 +38,11 @@ int main() { // Only placement new can be used within device functions. // When an object of a polymorphic class is created, its vtable is filled // with pointer to virtual member functions. However, we don't always know - // featuures supported by a target device (in case of JIT) and therefore + // features supported by a target device (in case of JIT) and therefore // can't decide whether both 'foo' and 'bar' should be both included in the // resulting device image - the decision must be made at runtime when we // know the target device. - new (Obj) Derived; + new (Obj) Base; }); // The same binary produced by thy sycl compiler should correctly work on both @@ -167,6 +167,70 @@ an aspect, then such kernel **should not** be considered as using that aspect. Properties-based mechanism which is described above should be used for aspects propagation for virtual functions. +To illustrate this, let's once again consider the example from Overview section +which is copied below for convinience: + +```c++ +using syclext = sycl::ext::oneapi::experimental; + +struct set_fp64; + +struct Base { + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() {} + virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_fp64) void bar() { + // this virtual function uses double + double d = 3.14; + } +}; + +class Constructor; +class Use; +class UseFP64; + +int main() { + // Selected device may not support 'fp64' aspect + sycl::queue Q; + + Base *Obj = sycl::malloc_device(1, Q); + int *Result = sycl::malloc_shared(2, Q); + + Q.single_task([=] { + // Even though at LLVM IR level this kernel does reference 'Base::foo' + // and 'Base::bar' though global variable containing `vtable` for `Base`, + // we do not consider the kernel to be using `fp64` optional feature. + new (Obj) Base; + }); + + Q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { + // This kernel is not considered to be using any optional features, because + // virtual functions in default set do not use any. + Obj->foo(); + }); + + if (Q.get_device().has(sycl::aspect::fp64)) { + Q.single_task(syclext::properties{syclext::calls_indirectly}, + [=] { + // This kernel is considered to be using 'fp64' optional feature, because + // there is a virtual function in 'set_fp64' which uses double. + Obj->bar(); + }); + } + + return 0; +} +``` + +This way, "Consturctor" kernel(s) won't pull optional features +requirements from virtual functions it may reference through vtable, making it +independent from those. This allows to launch such kernels on wider list of +devices, even though there could be virtual functions which require optional +features. + +"Use" kernel(s) do pull optional features requirements from virtual functions +they may call through `calls_indirectly` property and associated sets. This +enables necessary runtime diagnistics that a kernel is not submitted to a device +which doesn't support all required optional features. + #### New compiler diagnostics **TBD** @@ -174,9 +238,9 @@ propagation for virtual functions. #### Device code split and device images The extension specification restricts implementation from raising a diagnostic -when a kernel not marked with `calls_indirectly` kernel property creates an -object of a polymorphic class where some virtual functions use optional kernel -features incompatible with a target device. +when a kernel that is not marked with `calls_indirectly` kernel property creates +an object of a polymorphic class where some virtual functions use optional +kernel features incompatible with a target device. Consider the following example: @@ -222,7 +286,6 @@ if (q.get_device().has(sycl::aspect::fp64)) { Storage->bar(); }); } - ``` This example should work regardless of whether target device supports 'fp64' @@ -232,7 +295,11 @@ compatible with a target device. Regardless of device code split mode selected by a user, functions marked with `indirectly_callable` property should be outlined into a separate device images -by `sycl-post-link` tool based on the property argument. +by `sycl-post-link` tool based on the property argument, i.e. all functions +from the same set should be bundled into a dedicated device image. + +Virtual functions in the original device image should be turned into +declarations instead of definitions. Additionally, if any virtual function in such device image uses any optional kernel features, then the whole image should be cloned with all function bodies @@ -269,7 +336,23 @@ property set: - "creates-virtual-functions-set" with a string value containing comma-separate list of names of virtual function sets which are referenced from functions included into vtables used by a kernel within a device image; - **TODO:** this item definitely needs better description + +There is a reason why we need to separate properties and can't just use one for +both kinds of relationships: + +When a kernel only creates an object of a polymorphic class, we should only use +virtual functions which are compatible with a target device. Virtual functions +that use unsupported optional features are expected to be outlined into separate +sets in that case and we need to ensure that we are still able to create an +object so that virtual functions that use suppported optional features are +usable. + +However, when a kernel actually makes calls to virtual functions, we assert +that all optional features used by virtual functions in all sets used by the +kernel are supported on a target device. All those aspects have been already +attached to the kernel as part of aspects propagation phase and therefore at +runtime we will unconditionally pull all device images with virtual functions +which are used by a kernel to make calls to them. ### Changes to the runtime @@ -294,8 +377,14 @@ Algorithm for discovery of device images which has to be linked: Produced list of device images is then linked together and used to enqueue a kernel. -**TODO:** do we need to say anything about in-memory and on-disk cache -functionality here? +NOTE: when shared libraries are involved, they could also provide some +`indirectly_callable` functions in the same sets as application. This means that +there could be more than one image registered with the same value of +"virtual-functions-set" property. + +NOTE: No changes are needed for both in-memory and on-disk caches, because they +take both kernel and device as keys and for that pair list of device images +which needs to be linked together does not change from launch to launch. [1]: <../extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc> [2]: From 5cae901c1f49c1d6f334721e438535d9d6281ddb Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 6 Dec 2023 02:37:12 -0800 Subject: [PATCH 20/51] Fix a bunch of typos --- sycl/doc/design/VirtualFunctions.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 52c60adcbeebc..0665d4705b520 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -7,7 +7,7 @@ Corresponding language extension specification: Main complexity of the feature comes from its co-existence with optional kernel features ([SYCL 2020 spec][sycl-spec-optional-kernel-features], -[implementaiton design][optional-kernel-features-design]) mechanism. Consider +[implementation design][optional-kernel-features-design]) mechanism. Consider the following example: ```c++ @@ -65,22 +65,22 @@ int main() { As comments in the snippet say the main issue is with vtables: at compile time it may not be clear which exact functions can be safely included in there and which are not in order to avoid speculative compilation and fulfill optional -kernel features requirements from the SYCL 2020 specificaiton. +kernel features requirements from the SYCL 2020 specification. To solve this, the following approach is used: all virtual functions marked with `indirectly_callable` property are grouped by set they belong to and outlined into separate device images (i.e. device images with kernels using them are left with declarations only of those virtual functions). -For each device image with virtual functinos that use optional features we also +For each device image with virtual functions that use optional features we also create a "dummy" version of it where bodies of all virtual functions are emptied. -Dependencies between deivce images are recorded in properties based on +Dependencies between device images are recorded in properties based on `calls_indirectly` and `indirectly_callable` properties. They are used later by runtime to link them together. Device images which depend on optional kernel features are linked only if those features are supported by a target device and -dummyy versions of those device images are used otherwise. +dummy versions of those device images are used otherwise. This way we can emit single unified version of LLVM IR where vtables reference all device virtual functions, but their definitions are outlined and linked @@ -157,7 +157,7 @@ functions. For that the pass should compile a list of aspects used by each set of indirectly callable functions (as defined by `indirectly_callable` property set by user) and then append those aspects to every kernel which use those sets (as -defiend by `calls_indirectly` property set by user). +defined by `calls_indirectly` property set by user). NOTE: if the aspects propagation pass is ever extended to track function pointers, then aspects attached to virtual functions **should not** be attached @@ -168,7 +168,7 @@ Properties-based mechanism which is described above should be used for aspects propagation for virtual functions. To illustrate this, let's once again consider the example from Overview section -which is copied below for convinience: +which is copied below for convenience: ```c++ using syclext = sycl::ext::oneapi::experimental; @@ -228,7 +228,7 @@ features. "Use" kernel(s) do pull optional features requirements from virtual functions they may call through `calls_indirectly` property and associated sets. This -enables necessary runtime diagnistics that a kernel is not submitted to a device +enables necessary runtime diagnostics that a kernel is not submitted to a device which doesn't support all required optional features. #### New compiler diagnostics @@ -307,7 +307,7 @@ emptied. This cloned device image will be further referred to as "dummy virtual functions device image". This dummy device image is needed to support the example showed above when a -kernel creates an object of a polymorhpic class where some of virtual functions +kernel creates an object of a polymorphic class where some of virtual functions use optional features. LLVM IR generated by front-end will contain a vtable, which references all methods of the class. However, not all of them can be directly included into kernel's device image to avoid speculative compilation. @@ -344,7 +344,7 @@ When a kernel only creates an object of a polymorphic class, we should only use virtual functions which are compatible with a target device. Virtual functions that use unsupported optional features are expected to be outlined into separate sets in that case and we need to ensure that we are still able to create an -object so that virtual functions that use suppported optional features are +object so that virtual functions that use supported optional features are usable. However, when a kernel actually makes calls to virtual functions, we assert From 6059d0d2b52839ef3c9d670cb71d082853c3dc59 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 6 Dec 2023 03:06:46 -0800 Subject: [PATCH 21/51] More typo fixes --- sycl/doc/design/VirtualFunctions.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 0665d4705b520..5aa57aed24590 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -131,7 +131,8 @@ compile-time properties, no extra work is required to propagate necessary information down to passes from headers. However, we do need to filter out those virtual functions which are not -considered to be device as defined by the [extension specifiction][1], such as: +considered to be device as defined by the [extension specification][1], such +as: - virtual member functions annotated with `indirectly_callable` compile-time property should be emitted into device code; From 3883c9034bcd293c5b11fc46b0b9a4b8f3e76300 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 6 Dec 2023 03:29:38 -0800 Subject: [PATCH 22/51] Add details about AOT flow --- sycl/doc/design/VirtualFunctions.md | 45 +++++++++++++++++++++++++---- 1 file changed, 39 insertions(+), 6 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 5aa57aed24590..efd59a09ca85c 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -160,6 +160,10 @@ indirectly callable functions (as defined by `indirectly_callable` property set by user) and then append those aspects to every kernel which use those sets (as defined by `calls_indirectly` property set by user). +**TODO**: should we consider outlining "indirectly used" aspects into a separate +metadata and device image property? This should allow for more precise and +user-friendly exceptions at runtime + NOTE: if the aspects propagation pass is ever extended to track function pointers, then aspects attached to virtual functions **should not** be attached to kernels using this mechanism. For example, if a kernel uses a variable, @@ -290,14 +294,18 @@ if (q.get_device().has(sycl::aspect::fp64)) { ``` This example should work regardless of whether target device supports 'fp64' -aspect or not. To achieve that, virtual member functions are outlined into -separate device images which are linked at runtime depending on whether they are -compatible with a target device. +aspect or not. Implementation differs for JIT and AOT flows. + +##### JIT flow Regardless of device code split mode selected by a user, functions marked with -`indirectly_callable` property should be outlined into a separate device images -by `sycl-post-link` tool based on the property argument, i.e. all functions -from the same set should be bundled into a dedicated device image. +`indirectly_callable` property should be outlined into separate device images +by `sycl-post-link` tool based on the argument of the `indirectly_callable` +property, i.e. all functions from the same set should be bundled into a +dedicated device image. + +**TODO**: as an optimization, we can consider preserving virtual functions from +sets that do not use any optional kernel features. Virtual functions in the original device image should be turned into declarations instead of definitions. @@ -317,11 +325,35 @@ When such kernel is submitted to a device, runtime will check which optional features are supported and link one or another device image with virtual functions. +##### AOT flow + +In AOT mode, there will be no dynamic linking, but at the same time we know the +list of supported optional features by a device thanks to +[device config file][device-config-file-design]. + +Therefore, `sycl-post-link` should read the device config file to determine list +of optional features supported by a target and based on that drop all virtual +functions from sets that use unsupported optional features. + +Note that we are making decision not based on which aspects are used by each +individual virtual functions, but based on which aspects are used by a set of +virtual functions (as identified by the `indirectlly_callable` property +argument). The latter is computed as conjunction of aspects used by each +virtual function within a set. + +The behavior is defined this way to better match the extension speficiation +which defines virtual functions availability in terms of whole sets and not +individual functions. + #### New device image properties To let runtime know which device images should be linked together to get virtual functions working, new property set is introduced: "SYCL/virtual functions". +NOTE: in AOT mode, every device image is already self-contained and contains +the right (supported by a device) set of virtual functions in it. Therefore, we +do not need to emit any of those properties when we are in AOT mode. + For device images, which contain virtual functions (i.e. ones produced by outlining `indirectly_callable` functions into a separate device image), the following properties are set within the new property set: @@ -392,4 +424,5 @@ which needs to be linked together does not change from launch to launch. [3]: https://clang.llvm.org/docs/LanguageExtensions.html#builtin-sycl-unique-stable-name [sycl-spec-optional-kernel-features]: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:optional-kernel-features [optional-kernel-features-design]: +[device-config-file-design]: From 683bc66e92e3acb4c652791cfd7a815e5afa5783 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 3 Jan 2024 08:07:39 -0800 Subject: [PATCH 23/51] Apply trivial comments from reviewers --- sycl/doc/design/VirtualFunctions.md | 14 +++++----- .../sycl_ext_intel_virtual_functions.asciidoc | 28 +++++++++---------- 2 files changed, 21 insertions(+), 21 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index efd59a09ca85c..423c0206cc52f 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -39,13 +39,13 @@ int main() { // When an object of a polymorphic class is created, its vtable is filled // with pointer to virtual member functions. However, we don't always know // features supported by a target device (in case of JIT) and therefore - // can't decide whether both 'foo' and 'bar' should be both included in the + // can't decide whether both 'foo' and 'bar' should be included in the // resulting device image - the decision must be made at runtime when we // know the target device. new (Obj) Base; }); - // The same binary produced by thy sycl compiler should correctly work on both + // The same binary produced by a sycl compiler should correctly work on both // devices with and without support for 'fp64' aspect. Q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { Obj->foo(); @@ -201,7 +201,7 @@ int main() { Q.single_task([=] { // Even though at LLVM IR level this kernel does reference 'Base::foo' - // and 'Base::bar' though global variable containing `vtable` for `Base`, + // and 'Base::bar' through global variable containing `vtable` for `Base`, // we do not consider the kernel to be using `fp64` optional feature. new (Obj) Base; }); @@ -213,7 +213,7 @@ int main() { }); if (Q.get_device().has(sycl::aspect::fp64)) { - Q.single_task(syclext::properties{syclext::calls_indirectly}, + Q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { // This kernel is considered to be using 'fp64' optional feature, because // there is a virtual function in 'set_fp64' which uses double. @@ -225,7 +225,7 @@ int main() { } ``` -This way, "Consturctor" kernel(s) won't pull optional features +This way, "Constructor" kernel(s) won't pull optional features requirements from virtual functions it may reference through vtable, making it independent from those. This allows to launch such kernels on wider list of devices, even though there could be virtual functions which require optional @@ -335,13 +335,13 @@ Therefore, `sycl-post-link` should read the device config file to determine list of optional features supported by a target and based on that drop all virtual functions from sets that use unsupported optional features. -Note that we are making decision not based on which aspects are used by each +Note that we are making decisions not based on which aspects are used by each individual virtual functions, but based on which aspects are used by a set of virtual functions (as identified by the `indirectlly_callable` property argument). The latter is computed as conjunction of aspects used by each virtual function within a set. -The behavior is defined this way to better match the extension speficiation +The behavior is defined this way to better match the extension specification which defines virtual functions availability in terms of whole sets and not individual functions. diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index b3788894eef9b..7b9c8d43b239d 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -137,7 +137,7 @@ namespace sycl::ext::oneapi::experimental { }; struct calls_indirectly_key { - template + template using value_t = property_value; }; @@ -154,13 +154,13 @@ namespace sycl::ext::oneapi::experimental { } ---- -Before describing those properties those properties in more detail, a couple of -new terms are introduced to simplify the extension specification: +Before describing those properties in more detail, a couple of new terms are +introduced to simplify the extension specification: Set of virtual member functions:: a group of virtual member functions which are -defined with `indirectly_callable` property with the same value of the property -parameter `SetId`. For the simplicity, this will also be further refferred to as -a _set_, or as a _set of virtual functions_. +defined with the `indirectly_callable` property and with the same value of the +property parameter `SetId`. For simplicity, this will also be further referred +to as a _set_, or as a _set of virtual functions_. Kernel declares a use of a set of virtual member functions:: a kernel is considered to be declaring a use of a set of virtual member functions `SetIdA` @@ -183,7 +183,7 @@ property parameter `SetId` and therefore has all the same requirements. Optional parameter `SetId` specifies a set of virtual member functions this function belongs to and at the same time it defines a group of kernels, which can call this function, it must be a C++ typename. When the parameter is -ommitted, a virtual member function is considered to belong to the default set. +omitted, a virtual member function is considered to belong to the default set. Calling a virtual member function from a kernel which does not declare use of a set the virtual member function belongs to is an undefined behavior. @@ -312,7 +312,7 @@ exact virtual functions are called from which kernels. Instead, information from the new properties is used by an implementation to issue such diagnostic. When determining a set of aspects which are used by a -SYCL kernel function, implementation also takes into account all aspects which +SYCL kernel function, an implementation must take into account all aspects which are used by all virtual member functions included into all sets of virtual member functions declared to be used by a kernel. @@ -332,7 +332,7 @@ struct Foo { // This function uses 'fp16' aspect virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_fp16) void f16() { - sycl::half h = 2.71; + sycl::half h = 2.71f; } }; @@ -405,7 +405,7 @@ q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { An implementation may not raise a compile time diagnostic or a run time exception merely due to speculative compilation of a virtual member function for a device when the application does not specify a use of virtual member functions -through the correponding properties. +through the corresponding properties. [source,dpcpp] ---- @@ -505,7 +505,7 @@ int main() { CGH.use_kernel_bundle(bundleA); CGH.single_task([=] { // Only placement new can be used within device functions. - new (Obj) Derived; + new (Obj) Base; }); }); @@ -567,12 +567,12 @@ int main() { Base B; Result[0] = B.get_random_number(); }).wait(); - assert(A[0] == 4); + assert(Result[0] == 4); Q.single_task(props, [=] { - A[0] = Obj->get_random_number(); + Result[0] = Obj->get_random_number(); }).wait(); - assert(A[0] == 221); + assert(Result[0] == 221); return 0; } From 8c8185205f8c4c6e8e08eb055c94c27bb936c40d Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 3 Jan 2024 09:14:29 -0800 Subject: [PATCH 24/51] Apply comments about virtual functions filtering in FE --- sycl/doc/design/VirtualFunctions.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 423c0206cc52f..b10b1233c7210 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -139,10 +139,10 @@ as: - virtual member function *not* annotated with `indirectly_callable` compile-time property should *not* be emitted into device code; -There is no need to actually check which exact property is applied to a -function, it is enough to check if `add_ir_attribute_function` attribute was -applied and that we are in SYCL device mode to decide whether or not a virtual -function should be emitted into vtable and device code. +Since mechanism for attaching the property automatically attaches `sycl_device` +attribute to virtual functions (see the previous section), it is enough for the +FE to only look for the `sycl_device` attribute, following the logic which is +already in place for regular directly called functions. **TODO:** any extra diagnostics we would like to emit? Like kernel without `calls_indirectly` property performing virtual function call. From 82c9303c35ef26d026f59a33b0d2e1faba257159 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Thu, 4 Jan 2024 07:30:40 -0800 Subject: [PATCH 25/51] Rebase to SYCL 2020 rev.8; Update year in the copyright --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 7b9c8d43b239d..ee8e4401edd92 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -20,7 +20,7 @@ == Notice [%hardbreaks] -Copyright (C) 2023-2023 Intel Corporation. All rights reserved. +Copyright (C) 2024-2024 Intel Corporation. All rights reserved. Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by @@ -36,7 +36,7 @@ https://github.com/intel/llvm/issues == Dependencies -This extension is written against the SYCL 2020 revision 7 specification. All +This extension is written against the SYCL 2020 revision 8 specification. All references below to the "core SYCL specification" or to section numbers in the SYCL specification refer to that revision. From 65016fc0248ee61b753d7aa479e0a67774b8ca5f Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 9 Jan 2024 05:54:19 -0800 Subject: [PATCH 26/51] Rewrite properties intro --- .../sycl_ext_intel_virtual_functions.asciidoc | 24 ++++++++++++------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index ee8e4401edd92..c8ee7764c107f 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -117,15 +117,21 @@ be used: === New properties -Due to the indirect nature of virtual member functions, compiler may not be able -to understand which exact virtual member function is being called (i.e. which -class it belongs to). Moreover, when code is distributed over several -translation units, compiler may not be able to even see definitions of all -virtual member functions which may be called in a kernel. - -Therefore, to provide a mechanism for an implementation to help detect virtual -member functions, which are going to be used from kernels and enforce necessary -restrictions on them, new compile-time-constant properties are proposed. +Under the hood virtual functions are essentially function pointers which are +stored in a global variable and managed by compiler-generated code. Therefore, +each call to a virtual member function is an indirect call and compiler may not +be able to understand which exact virtual function is being called (i.e. which +class it belongs to). + +Without any knowledge about which virtual function can be called from which +kernels compiler will have to make all virtual functions available to all +kernels. That may not be desirable because some of those virtual functions could +use optional kernel features and thus would propagate their use into kernels +designed to be submitted to devices without support for those optional features. + +In order to help compiler to build a mapping between kernels and virtual +functions they may call, the extension introduces new compile-time-constant +properties. [source,dpcpp] ---- From 11f3a7d520fc29174fd21055704ecc97bd051c12 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 9 Jan 2024 05:56:16 -0800 Subject: [PATCH 27/51] Cleanup mentions of undefined helper function --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index c8ee7764c107f..d5af3ff5a8777 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -448,11 +448,11 @@ other optional kernel features, because it requires to compile a kernel in a certain way, which may require special handling for all functions which are called from it. -When the same helper function is called from two or more kernels -with different `reqd_sub_group_size` attribute, it may be required for the -implementation to duplicate that helper function to create different versions -of it tailored to different sub-group sizes. It can be done in a straightforward -manner when operating on a static call graph. +When the same function is called from two or more kernels with different +`reqd_sub_group_size` attribute, it may be required for the implementation to +duplicate that function to create different versions of it tailored to different +sub-group sizes. It can be done in a straightforward manner when operating on a +static call graph. Virtual member functions are essentially called indirectly and pointers to them are initialized just once when an object of a polymorhpic class is being From a07048ffca79dec10df6257b2e7d396b4960c7b7 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 10 Jan 2024 07:41:16 -0800 Subject: [PATCH 28/51] Change wording about default set identifier --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index d5af3ff5a8777..7b08adf426ecc 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -290,9 +290,11 @@ public: }); ---- -NOTE: `void` typename is explicitly reserved as identifier of the default -virtual functions set. It can still be used, but with the expectation that it -is also implicitly used by the extension implementation. +NOTE: By convention, the type `void` is used to denote the default set of +virtual functions. When the typename is omitted from the +`SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY` macro, the type `void` is used by +default. Applications may also explicitly use the type `void` to denote this +default set of virtual functions. === Optional kernel features handling From 6507226724b4ae8293be656329f237567a8f865b Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 17 Jan 2024 06:46:38 -0800 Subject: [PATCH 29/51] Explain better why it may be required to split virtual functions into sets --- .../sycl_ext_intel_virtual_functions.asciidoc | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 7b08adf426ecc..f61015d3160d6 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -261,10 +261,10 @@ void SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) Foo::baz() {} }); ---- -Reasons about why it may be necessary to put different virtual functions into -different sets are explained in further sections, but for simplicity purposes -both properties can be used without explicitly specifying a set, thus using -the default set: +The main reason for virtual functions to be split into different sets is use of +optional kernel features in those virtual functions. It is explained in more +details in the next section. However, for simplicity purposes both properties +can be used without explicitly specifying a set, thus using the default set: [source,dpcpp] ---- @@ -324,6 +324,13 @@ SYCL kernel function, an implementation must take into account all aspects which are used by all virtual member functions included into all sets of virtual member functions declared to be used by a kernel. +Therefore, if only default set of virtual functions is used by an application, +it means that every kernel which is submitted with the `calls_indirectly` +property is assumed to use _all_ virtual functions marked with the +`indirectly_callable` property. If some of those virtual functions use optional +kernel features and there are kernels which are supposed to work on devices +without support for those optional kernel features, then virtual functions +using them should be outlined into a separate set. [source,dpcpp] ---- From 531083a82ccb48a2142bfcac6f8a71bd24813047 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 17 Jan 2024 07:07:14 -0800 Subject: [PATCH 30/51] Switch indirectly_callable property to use generic function properties macro --- sycl/doc/design/VirtualFunctions.md | 39 +++++------- .../sycl_ext_intel_virtual_functions.asciidoc | 62 ++++++++++++------- 2 files changed, 54 insertions(+), 47 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index b10b1233c7210..b7c00431fb67e 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -16,8 +16,11 @@ using syclext = sycl::ext::oneapi::experimental; struct set_fp64; struct Base { - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() {} - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_fp64) void bar() { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + void foo() {} + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void bar() { // this virtual function uses double double d = 3.14; } @@ -109,21 +112,6 @@ could be used. **TODO**: `calls_indirectly` requires compile-time concatenation of strings. Document how it should be done. -`indirectly_callable` property is applied to functions using "custom" (comparing -to other properties) `SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY` macro. This -is done to allow implementations to attach some extra attributes alongside the -property. In particular, functions marked with the macro should be considered -SYCL device functions and compiler should emit diagnostics if those functions -do not conform with the SYCL 2020 specification. To achieve that and avoid -extending FE to parse strings within properties, the aforementioned macro should -also set `sycl_device` attribute: - -``` -#define SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(SetId) \ - __attribute__((sycl_device)) [[__sycl_detail__::add_ir_attribute_function( \ - "indirectly-callable", __builtin_sycl_unique_stable_name(SetId))]] -``` - ### Changes to the compiler front-end Most of the handling for virtual functions happens in middle-end and thanks to @@ -139,10 +127,12 @@ as: - virtual member function *not* annotated with `indirectly_callable` compile-time property should *not* be emitted into device code; -Since mechanism for attaching the property automatically attaches `sycl_device` -attribute to virtual functions (see the previous section), it is enough for the -FE to only look for the `sycl_device` attribute, following the logic which is -already in place for regular directly called functions. +To achieve that, the front-end should implicitly add `sycl_device` attribtue to +each function which is marked with the `indirectly_callable` attribute. This +can be done during handling of `[[__sycl_detail__::add_ir_attributes_function]]` +attribute by checking if one of string literals passed in there as a property +name is equal to "indirectly_callable". Later the `sycl_device` attribute can be +used to decide if a virtual function should be emitted into device code. **TODO:** any extra diagnostics we would like to emit? Like kernel without `calls_indirectly` property performing virtual function call. @@ -181,8 +171,11 @@ using syclext = sycl::ext::oneapi::experimental; struct set_fp64; struct Base { - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() {} - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_fp64) void bar() { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + void foo() {} + + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void bar() { // this virtual function uses double double d = 3.14; } diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index f61015d3160d6..490362cda718e 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -182,9 +182,7 @@ functions. |The `indirectly_callable` property indicates that a virtual member function is a device function, thus making it available to be called from SYCL kernel and device functions. Should only be applied to virtual member functions and to do -so, function-style `SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY` macro should -be used. The macro accepts a single optional argument, which is passed to the -property parameter `SetId` and therefore has all the same requirements. +so, function-style `SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro should be used. Optional parameter `SetId` specifies a set of virtual member functions this function belongs to and at the same time it defines a group of kernels, which @@ -240,16 +238,23 @@ struct set_B; class Foo { public: // properties to functions should be applied using the macro: - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void foo() {} + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void + foo() {} - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void bar() {}; + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void + bar(){}; // both declaration and definition should be annotated: - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) void baz(); + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable) void + baz(); }; // both declaration and definition should be annotated: -void SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) Foo::baz() {} +void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) +Foo::baz() {} // kernel calling virtual function should also be annotated: /* */.single_task(syclext::properties{syclext::calls_indirectly}, [=] { @@ -276,9 +281,11 @@ class Foo { public: // This virtual member function belongs to the default set of virtual // functions. - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() {} + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + void foo() {} - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void bar() {} + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void bar() {} }; /* */.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { @@ -291,10 +298,10 @@ public: ---- NOTE: By convention, the type `void` is used to denote the default set of -virtual functions. When the typename is omitted from the -`SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY` macro, the type `void` is used by -default. Applications may also explicitly use the type `void` to denote this -default set of virtual functions. +virtual functions. When the typename is omitted from the `indirectly_callable` +or `calls_indirectly` propertyh, the type `void` is used by default. +Applications may also explicitly use the type `void` to denote this default set +of virtual functions. === Optional kernel features handling @@ -341,12 +348,14 @@ struct set_fp16; struct Foo { // This function uses 'fp64' aspect - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_fp64) void f64() { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void f64() { double d = 3.14; } // This function uses 'fp16' aspect - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_fp16) void f16() { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void f16() { sycl::half h = 2.71f; } }; @@ -396,11 +405,13 @@ struct set_A; struct set_B; struct Foo { - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_A) void foo() { + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void foo() { double d = 3.14; } - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY(set_B) void bar() {} + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + void bar() {} }; sycl::queue q(/* device selector returns a device *without* fp64 support */); @@ -427,11 +438,13 @@ through the corresponding properties. using syclext = sycl::ext::oneapi::experimental; struct Foo { -virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() { - double d = 3.14; -} + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + void foo() { + double d = 3.14; + } -virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void bar() {} + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + void bar() {} }; sycl::queue q(/* device selector choosing a device *without* fp64 support */); @@ -496,7 +509,8 @@ kernel from a different kernel bundle is an undefined behavior. using syclext = sycl::ext::oneapi::experimental; struct Base { - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() void foo() {} + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + void foo() {} }; class Constructor; @@ -547,7 +561,7 @@ using syclext = sycl::ext::oneapi::experimental; class Base { public: - virtual SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) int get_random_number() { return 4; // Chosen by fair dice roll. Guaranteed to be random } @@ -560,7 +574,7 @@ public: class Derived : public Base { public: - SYCL_EXT_ONEAPI_INDIRECTLY_CALLABLE_PROPERTY() + SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) int get_random_number() override { return 221; } From f33bab59e165de42149cad747abb28bb447bbbe1 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 23 Jan 2024 05:22:03 -0800 Subject: [PATCH 31/51] Documents how the calls_indirectly property parameters could be handled in headers --- sycl/doc/design/VirtualFunctions.md | 71 ++++++++++++++++++++++++++++- 1 file changed, 69 insertions(+), 2 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index b7c00431fb67e..ac652838ee4f8 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -109,8 +109,75 @@ be implemented in accordance with the corresponding [design document][2]: In order to convert a type to a string, [\__builtin_sycl_unique_stable_name][3] could be used. -**TODO**: `calls_indirectly` requires compile-time concatenation of strings. -Document how it should be done. +The `calls_indirectly` compile-time property accepts a list of types which +identify virtual functions set. It can be handled using metaprogramming magic to +compile-time concatenate strings to produce a single value out of a set of +parameters. Similar approach is used to handle `reqd_work_group_size` and other +compile-time properties that accept integers: + +```c++ +// Helper to hide variadic list of arguments under a single type +template struct CharList {}; + +// Helper to concatenate several lists of characters into a single string. +// Lists are separated from each other with comma within the resulting string. +template struct ConcatenateCharsToStr; + +// Specialization for a single list +template struct ConcatenateCharsToStr> { + static constexpr char value[] = {Chars..., '\0'}; +}; + +// Specialization for two lists +template +struct ConcatenateCharsToStr, CharList> + : ConcatenateCharsToStr> {}; + +// Specialization for the case when there are more than two lists +template +struct ConcatenateCharsToStr, CharList, + Rest...> + : ConcatenateCharsToStr, + Rest...> {}; + +// Helper to convert type T to a list of characters representing the type (its +// mangled name). +template struct StableNameToCharsHelper { + using chars = CharList<__builtin_sycl_unique_stable_name(T)[Indices]...>; +}; + +// Wrapper helper for the struct above +template struct StableNameToChars; + +// Specialization of that wrapper helper which accepts sequence of integers +template +struct StableNameToChars> + : StableNameToCharsHelper {}; + +// Top-level helper, which should be used to convert list of typenames into a +// string that contains comma-separated list of their string representations +// (mangled names). +template struct PropertyValueHelper { + static constexpr const char *name = "my-fancy-attr"; + static constexpr const char *value = + ConcatenateCharsToStr>::chars...>::value; +}; + +// Example usage: +SYCL_EXTERNAL +[[__sycl_detail__::add_ir_attributes_function( + PropertyValueHelper::name, + PropertyValueHelper::value)]] void +foo() { + // Produced LLVM IR: + // define void @_Z3foov() #0 { ... } + // attributes #0 = { "my-fancy-attr"="_ZTSv,_ZTSi" ... } +} + +``` ### Changes to the compiler front-end From 2ca0e946371ed1f5fd3285de19e29eb0ddf18e0a Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 24 Jan 2024 05:55:24 -0800 Subject: [PATCH 32/51] Definition is a declaration --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 490362cda718e..948c7806f0905 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -206,10 +206,10 @@ Calling a virtual member function, which does not belong to any of sets of virtual member functions declared to be used is an undefined behavior. |=== -If a virtual member function is called from device code, both definition and -declaration of that function must be decorated with the `indirectly_callable` -property. `SetId` property parameter must match between definition and -declaration, or otherwise behavior is undefined. +If a virtual member function is called from device code, all declarations of +that function must be decorated with the `indirectly_callable` property with +the same value of `SetId` property parameter, or otherwise behavior is +undefined. Applying the `indirectly_callable` property to a SYCL Kernel function is illegal and an implementation should produce a diagnostic for that. From 7db0a2aec63cdc0e51bd596ac6f539c2fbfb3a3b Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 24 Jan 2024 05:59:50 -0800 Subject: [PATCH 33/51] Refactor code snippets to bring them closer to a real life code --- .../sycl_ext_intel_virtual_functions.asciidoc | 86 +++++++++++-------- 1 file changed, 48 insertions(+), 38 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 948c7806f0905..03cfe00f93db7 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -256,14 +256,17 @@ public: void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) Foo::baz() {} -// kernel calling virtual function should also be annotated: -/* */.single_task(syclext::properties{syclext::calls_indirectly}, [=] { - Foo *ptr = /* ... */; - ptr->bar() - - // Note: this kernel can only call 'Foo::foo' and 'Foo::bar' but not - // 'Foo::baz', because the latter is declared within a different set. -}); +int main() { + sycl::queue q; + // kernel calling virtual function should also be annotated: + q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + Foo *ptr = /* ... */; + ptr->bar() + + // Note: this kernel can only call 'Foo::foo' and 'Foo::bar' but not + // 'Foo::baz', because the latter is declared within a different set. + }); +} ---- The main reason for virtual functions to be split into different sets is use of @@ -288,13 +291,16 @@ public: void bar() {} }; -/* */.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { - Foo *ptr = /* ... */; - ptr->bar() +int main() { + sycl::queue q; + q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { + Foo *ptr = /* ... */; + ptr->bar() - // Note: this kernel can only call 'Foo::foo' but not 'Foo::bar', because the - // latter belongs to a different (non-default) set of virtual functions. -}); + // Note: this kernel can only call 'Foo::foo' but not 'Foo::bar', because + // the latter belongs to a different (non-default) set of virtual functions. + }); +} ---- NOTE: By convention, the type `void` is used to denote the default set of @@ -414,18 +420,20 @@ struct Foo { void bar() {} }; -sycl::queue q(/* device selector returns a device *without* fp64 support */); -assert(!q.get_device().has(sycl::aspect::fp64)); +int main() { + sycl::queue q(/* device selector returns a device *without* fp64 support */); + assert(!q.get_device().has(sycl::aspect::fp64)); -q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { - // Exception is expected to be thrown, because target device doesn't support - // fp64 aspect and it is required by 'Foo::foo' which is included into 'set_A' -}); + q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + // Exception is expected to be thrown, because target device doesn't support + // fp64 aspect and it is used by 'Foo::foo' which is included into 'set_A' + }); -q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { - // No exceptions are expected, because 'set_B' doesn't bring any requirements - // for optional kernel features. -}); + q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + // No exceptions are expected, because 'set_B' doesn't bring any + // requirements for optional kernel features. + }); +} ---- An implementation may not raise a compile time diagnostic or a run time @@ -447,20 +455,22 @@ struct Foo { void bar() {} }; -sycl::queue q(/* device selector choosing a device *without* fp64 support */); -assert(!q.get_device().has(sycl::aspect::fp64)); - -auto *Storage = sycl::malloc_device(1, q); - -q.single_task([=] { - // The kernel is not submitted with 'calls_indirectly' property and therefore - // it is not considered to be using any of virtual member functions of 'Foo'. - // This means that the object of 'Foo' can be successfully created by this - // kernel, regardless of whether a target device supports 'fp64' aspect which - // is used by 'Foo::foo'. - // No exceptions are expected to be thrown. - new (Storage) Foo; -}); +int main() { + sycl::queue q(/* device selector choosing a device *without* fp64 support */); + assert(!q.get_device().has(sycl::aspect::fp64)); + + auto *Storage = sycl::malloc_device(1, q); + + q.single_task([=] { + // The kernel is not submitted with 'calls_indirectly' property and + // therefore it is not considered to be using any of virtual member + // functions of 'Foo'. This means that the object of 'Foo' can be + // successfully created by this kernel, regardless of whether a target + // device supports 'fp64' aspect which is used by 'Foo::foo'. No exceptions + // are expected to be thrown. + new (Storage) Foo; + }); +} ---- ==== Interaction with `reqd_sub_group_size` attribute From 95fd59fa592e9c66a7c3352d9771252130f68d04 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 22 May 2024 07:14:43 -0700 Subject: [PATCH 34/51] Record issue about reqd_sub_group_size handling --- .../sycl_ext_intel_virtual_functions.asciidoc | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 03cfe00f93db7..b3810c96b99ac 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -616,3 +616,22 @@ int main() { return 0; } ---- + +== Issues + +=== Handling of `reqd_sub_group_size` attribute + +The extension allows virtual calls to be performed only from kernels with +_primary_ sub-group size, which is quite limiting and doesn't allow you to rely +on a particular sub-group size you want within a virtual function. + +This is more of an implementation limitation, rather than a language problem, +because at both SPIR-V and SYCL levels we don't have a mechanism of assigning +`reqd_sub_group_size` attribute to on-kernel SYCL functions and considering +indirect nature of virtual functions, compiler may not be able to figure out +which kernels use which exact virtual functions. + +By implementing some extra interfaces at SPIR-V and SYCL level we should be able +to improve the situation and lift some of the limitations around +`reqd_sub_group_size` attribute use together with virtual functions, but this +won't be a part of the initial language specification and implementation. From b4de2488c4173a6c73fcf516c74b94f67b5df4cb Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 24 May 2024 02:26:26 -0700 Subject: [PATCH 35/51] Start working on a test plan --- .../VirtualFunctions/test-plan.asciidoc | 259 ++++++++++++++++++ 1 file changed, 259 insertions(+) create mode 100644 sycl/test-e2e/VirtualFunctions/test-plan.asciidoc diff --git a/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc b/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc new file mode 100644 index 0000000000000..52132d2af6b88 --- /dev/null +++ b/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc @@ -0,0 +1,259 @@ +:sectnums: + += Test plan for virtual functions support in SYCL + +This is a test plan for virtual functions functionality described by the +corresponding upcoming `ext_sycl_intel_virtual_functions` extension. + +NOTE: This test plan does not cover unit tests, or negative tests related to +compiler diagnostics, it is focused on end-to-end examples to make sure that the +new functionality works as expected in different scenarios. + +== Testing scope + +=== Device coverage + +All of the tests described below are performed on a single device, which could +be any device: the feature is guarded by an aspect, so tests are expected to +exit early if a device doesn't support virtual functions. + +=== Data types coverage + +There is no need to repeat each and every test using different data types, +because it won't bring any significant improvements to tests quality. However, +for some test cases data types used in them matter more. Their description would +contain explicit requirements about data types which should be covered. + +=== Code paths coverage + +Test cases below often describe an example where behavior of a virtual member +function in a base class is overridden by derived classes. In those scenarios +test case should be repeated several times, each time taking a code path to a +different _actual_ class used under the hood. + +For example, for the following scenario: +[source,c++] +---- +class Base { +public: + virtual void foo() { /* ... */ } +}; +class Derived1 : public Base { +public: + void foo() override { /* ... */ } +}; +class Derived2 : public Base { +public: + void foo() override { /* ... */ } +}; +---- + +Test cases should be repeated to invoke both +Derived1::foo+ and ++Derived2::foo+. + +== Tests + +NOTE: Compiler will attempt to de-virtualize the program as much as possible. +Therefore, it is important that it is not statically known which exact method +of which exact class is being called in all test cases. + +=== The simplest case: create and call + +Key feature of this group of test cases is that an object of a polymorphic class +is created and used (virtual member functions of it are called) within the same +kernel. + +Tests in this category should only use default set of virtual functions. + +NOTE: Tests in this category are specifically simplified to use limited set of +available functionality in each case. The intent here is to have a sub-suite of +basic acceptance tests, which are closer to unit tests in context of being +focused on a single aspect of feature, but still being E2E tests. + +==== Virtual functions with no access to object data + +For each test in this sub-category, classes with virtual functions should not +have any data members. Virtual functions should simply return some values, +possibly based on input arguments. + +===== Simple hierarchy + +Test checks that a very basic usage model of virtual functions works. + +There is a base class with a virtual member function, which is being overridden +in several derived classes. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of either one of derived classes, or of a base class. Address of that +object is stored in a variable of type "pointer to a base class" and used within +the same kernel to call a virtual member function. Result is stored in a buffer +and verified on host. + +===== More complex hierarchy + +Test checks that derived classes can be derived further and that pointers to +objects of polymorphic classes can be passed to functions and that virtual +functions continue to work correctly. + +There is a base class with a virtual member function, which is being overridden +in a derived class (further referred as "level 1 class"). That sub-class defines +another virtual member function, which uses first virtual member function from +the base class. That second virtual member function is overridden in several +more sub-derived classes (further referred as "level 2 classes"). + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of level 2 classes and passes it to a function accepting a +pointer to a level 1 class object. The function calls the second virtual member +function, result is stored to a buffer and verified on host. + +===== Missing overrides + +Test checks that the right functions are being called, trying different +combinations of which classes in hierarchy override virtual member function +from a base class. + +There is a base class with a few virtual member functions. There is a set of +derived classes which may themselves be parents to some other classes, building +an hierarchy of 5-7 different classes. Not all virtual member functions from +base class are overridden in every sub-class. Example of cases which are +expected to be tested: + +- `Base` defines `foo`; `Derived1` inherits `Base`; `Derived2` inherits + `Derived1`, overrides `foo` +- `Base` defines `bar`; `Derived3` inherits `Base`, overrides `bar`; + `Derived4` inherits `Derived3`; `Derived5` inherits `Derived4`, overrides + `bar` + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of classes in the hierarchy and passes it to a function +accepting a pointer to a base object. The function calls virtual member +functions, results are stored in a buffer and verified on host. + +==== Virtual functions with access to class/object data + +Tests in this sub-category intended to check access to both static and +non-static class data members. + +===== Static data members access in a simple hierarchy + +Test checks that a static data member can be accessed through virtual member +functions from different overrides in a classes hierarchy. + +There is a base class with static data members and a virtual member function, +which is being overridden in few derived classes. All overrides of the virtual +member function access static data members of the base class. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of either one of derived classes, or of a base class. Address of that +object is stored in a variable of type "pointer to a base class" and used within +the same kernel to call a virtual member function. Result is stored in a buffer +and verified on host. + +===== Static data members access in a more complex hierarchy + +Test checks that a static data member can be accessed through virtual member +functions from different overrides in a classes hierarchy. + +There is a base class with static data members and a virtual member function, +which is being overridden in few derived classes. Some of those classes have +extra static data members and in turn may have derived classes as well. All +overrides of the virtual member function access static data member of their base +classes. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of derived classes and passes it to a function accepting a +pointer to a base class object. The function the virtual member function, result +is stored to a buffer and verified on host. + +===== Non-static data members access to read data in a simple hierarchy + +Test checks that virtual member functions can access non-static data members +of the current and base classes to read their values. + +There is a base class with non-static data members and a virtual member +function, which is being overridden in a few derived classes. All overrides of +the virtual member function access non-static data members described in the +base class to only read their values. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of either one of derived classes, or of a base class. Address of that +object is stored in a variable of type "pointer to a base class" and used within +the same kernel to call a virtual member function. Result is stored in a buffer +and verified on host. + +NOTE: This test case can have a variation where virtual member functions are +additionally marked as `const`. + +===== Non-static data members access to read data in a more complex hierarchy + +Test checks that non-static data members can be accessed through virtual member +functions from different overrides in a classes hierarchy. + +There is a base class with non-static data members and a virtual member +function, which is being overridden in few derived classes. Some of those +classes have extra non-static data members and in turn may have derived classes +as well. All overrides of the virtual member function access non-static data +members of their base classes. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of derived classes and passes it to a function accepting a +pointer to a base class object. The function the virtual member function, result +is stored to a buffer and verified on host. + +=== Passing objects of polymorphic classes between kernels + +Contrary to the previous section, an object of a polymorphic class is +constructed in one kernel, but used in another, which is closer to a real +examples where initialization is a separate phase of an application. + +This category also makes use of non-default sets of virtual functions, i.e. it +tests template arguments that you can pass into the new compile-time properties. + +Both USM and SYCL buffers should be used by tests as a mean of storing data and +transferring it between kernels. + +==== Single construct, single use + +Test submits two kernels: one constructs an object of a polymorphic class and +another performs virtual function calls using that object. The test should +check both default and non-default sets of virtual functions, as well as +access to object's data members. + +==== Single construct, multiple use + +In this test, different virtual functions should be put into different sets, but +there should still be a single kernel that constructs an object of a polymorphic +class. + +Then there should be a few kernels that each perform a virtual function call of +a method from a different set on that single object. + +==== Multiple construct, single use + +In this test, there should be several kernels each constructing an object of a +different derived class. It should be followed by a single kernel that calls +virtual functions from that created object. + +==== Multiple construct, multiple use + +In this test, there should be several kernels each constructing an object of a +different derived class. Those objects should have several virtual functions +each in a different set. The "construct" kernel should be followed by a few +"use" kernels each performing a virtual call of a different virtual method of +that created object. + +=== Separate translation units + +TBD. + +=== Optional kernel features + +TBD. + +=== Misc TODOs + +Test where each work-item in a sub-group calls a different virtual function +Test that experimental::printf works within virtual functions +Test that work-group built-ins work within virtual functions. Barriers? +Test that class can have non-device virtual functions + From eb5ea4d3dfbaae21c8df1629a163673bd7e423b1 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 27 May 2024 03:13:29 -0700 Subject: [PATCH 36/51] Notes about spec constants and in-memory cache --- sycl/doc/design/VirtualFunctions.md | 19 ++++++++++++++++--- .../sycl_ext_intel_virtual_functions.asciidoc | 18 ++++++++++++++++++ 2 files changed, 34 insertions(+), 3 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index ac652838ee4f8..c10cf05860884 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -475,9 +475,22 @@ NOTE: when shared libraries are involved, they could also provide some there could be more than one image registered with the same value of "virtual-functions-set" property. -NOTE: No changes are needed for both in-memory and on-disk caches, because they -take both kernel and device as keys and for that pair list of device images -which needs to be linked together does not change from launch to launch. +#### In-memory cache of kernels and programs + +It is very important that all kernels that use virtual functions from the same +set and operate (construct and perform calls) on the same objects are bundled +into the same program. If that program changes somewhere in between an object +construction and virtual call, it will lead to undefined behavior because of +invalidated vtable pointers. + +Therefore, in-memory cache eviction mechanism should be updated not to evict +kernels that use virtual functions, because otherwise it will lead to functional +issues. + +NOTE: in our experience we have only encountered a situation where in-memory +cache eviction was required with SYCL CTS test for specialization constants, +which is very heavy. Therefore, it is not expected that any changes to in-memory +cache eviction mechanism will be needed any time soon. [1]: <../extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc> [2]: diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index b3810c96b99ac..2b715f0a0a147 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -561,6 +561,24 @@ int main() { } ---- +If no explicit kernel bundle operations are performed by a program, it is +responsibility of a SYCL implementation to ensure that all kernels that use +virtual functions from the same set are implicitly put together into the same +kernel bundle to ensure that everything works correctly. + +Note, however, that there are APIs which may require SYCL implementation to +re-compile a kernel bundle. For example, if a specialization constant value is +changed, SYCL implementation may need to re-compile a kernel bundle to embed +new value of a specialization constant into a device program. Such +re-compilation will invalidate all addresses of virtual functions which may +have been previously recorded in a constructed object making behavior of +virtual function calls through that object undefined. + +Correct manipulation with specialization constants in kernels that also use +virtual functions requires advanced knowledge of implementation details and +therefore it is not recommended to use specialization constants together with +virtual functions. + == Example usage [source,dpcpp] From e59e68bf96fa82b5c9393bb9e168c31bb120d0a4 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 27 May 2024 09:21:31 -0700 Subject: [PATCH 37/51] Reduce amount of used device image properties --- sycl/doc/design/VirtualFunctions.md | 55 ++++++++++++----------------- 1 file changed, 22 insertions(+), 33 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index c10cf05860884..712bad72c30b4 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -421,31 +421,17 @@ following properties are set within the new property set: functions set contained within the image (value of the property argument); - "dummy-image=1" if an image is a dummy virtual functions device image; -For other device images, the following properties are set within the new -property set: -- "calls-virtual-functions-set" with a string value containing comma-separated - list of names of virtual function sets used by kernels in the image (as - indicated by `calls_indirectly` kernel property); -- "creates-virtual-functions-set" with a string value containing comma-separate - list of names of virtual function sets which are referenced from functions - included into vtables used by a kernel within a device image; - -There is a reason why we need to separate properties and can't just use one for -both kinds of relationships: - -When a kernel only creates an object of a polymorphic class, we should only use -virtual functions which are compatible with a target device. Virtual functions -that use unsupported optional features are expected to be outlined into separate -sets in that case and we need to ensure that we are still able to create an -object so that virtual functions that use supported optional features are -usable. - -However, when a kernel actually makes calls to virtual functions, we assert -that all optional features used by virtual functions in all sets used by the -kernel are supported on a target device. All those aspects have been already -attached to the kernel as part of aspects propagation phase and therefore at -runtime we will unconditionally pull all device images with virtual functions -which are used by a kernel to make calls to them. +For other device images (i.e. ones containing actual user-provided kernels): +- "uses-virtual-functions-set" with a string value containing comma-separated + list of names of virtual function sets used by kernels in the image. + +For the purposes of generating "uses-virtual-functions-set" device image +property value the fact that kernel uses a set of virtual functions is inferred +based on two things: +- kernel is set to explicitly use a set of virtual functions through + `calls_indirectly` property; +- kernel constructs an object of a polymorphic class and thus references vtable + global variable which in turn references functions that belong to some sets; ### Changes to the runtime @@ -454,18 +440,21 @@ properties set in "SYCL/virtual functions" property set, then runtime does some extra actions to link several device images together to ensure that the kernel can be executed. -Algorithm for discovery of device images which has to be linked: -- if device image has property "calls-virtual-functions-set=A,B,...,N" on it, - then all device images with "virtual-functions-set" property equal to "A", - "B", ..., "N" are taken to be linked with the initial device image; -- if device image has property "creates-virtual-functions-set=A,B,...,N" on it, - then for each device image with "virtual-functions-set" property equal to "A", - "B", ..., "N" and *without* "dummy-image=1" property on it: - - if that device image is compatible with device, it is taken to be linked +Let's say that a submitted kernel is from device image that has property +"uses-virtual-functions-set=A,B,...,N" on it, then the following other device +images are linked together with it: +- all device images with "virtual-functions-set" property equal to "A", "B", + ..., "N" and *without* "dummy-image=1" property on it: + - if that device image is compatible with a device, it is taken to be linked with the initial device image; - otherwise, runtime looks for a device image with the same "virtual-functions-set" property, but *with* "dummy-image=1" property on it and takes that device image to be linked with the initial device image; +- all other device images with "uses-virtual-functions-set" property equal to + "A", "B", ..., "N" if they are compatible with a device. Note that this + triggers further recursive search for device images that should be linked + together, i.e. runtime should keep track of which device images have already + been looked at to avoid entering an infinite recursion; Produced list of device images is then linked together and used to enqueue a kernel. From 63023d53733dcb3a94798c5da9067061b7ae53f7 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 27 May 2024 09:37:56 -0700 Subject: [PATCH 38/51] Add some clarifications about negative cases and diagnostics --- sycl/doc/design/VirtualFunctions.md | 5 +++++ .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 9 +++++++++ 2 files changed, 14 insertions(+) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 712bad72c30b4..489ac42ae3080 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -456,6 +456,11 @@ images are linked together with it: together, i.e. runtime should keep track of which device images have already been looked at to avoid entering an infinite recursion; +If for any used virtual functions set there is no device image that provides +virtual functions from it, the runtime should throw an exception, because that +is likely a user error (missing or misspelled `indirectly_callable` property +on a virtual function). + Produced list of device images is then linked together and used to enqueue a kernel. diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 2b715f0a0a147..f3b1952dfffc3 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -204,6 +204,11 @@ to be using the default set of virtual member functions. Calling a virtual member function, which does not belong to any of sets of virtual member functions declared to be used is an undefined behavior. + +This property should be attached to a kernel if it contains a virtual member +function call in its call graph, even if the said function is never actually +called. If a kernel submitted without this property calls a virtual member +function, diagnostic should be emitted by an implementation. |=== If a virtual member function is called from device code, all declarations of @@ -211,6 +216,10 @@ that function must be decorated with the `indirectly_callable` property with the same value of `SetId` property parameter, or otherwise behavior is undefined. +If a kernel is submitted with the `indirectly_callable` property that points to +an empty set of virtual functions, an exception should be thrown by an +implementation. + Applying the `indirectly_callable` property to a SYCL Kernel function is illegal and an implementation should produce a diagnostic for that. From 3728caeadfb218e8dc4370ae56d1e909a18ce9e4 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 27 May 2024 09:42:51 -0700 Subject: [PATCH 39/51] Record handling of spec constants as an issue --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index f3b1952dfffc3..1301f089f7ad6 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -662,3 +662,11 @@ By implementing some extra interfaces at SPIR-V and SYCL level we should be able to improve the situation and lift some of the limitations around `reqd_sub_group_size` attribute use together with virtual functions, but this won't be a part of the initial language specification and implementation. + +=== Interaction with specialization constants + +Implementation of specialization constants may involve re-compilation and +therefore can easily break virtual functions functionality. Current extension +spec wording is to _discourage_ use of specialization constants together with +virtual functions, but not to completely prohibit. Should we be more clear here +maybe with the wording and make it stricter or more precise/formal? From c3ed3f802fe8cb389137cfb702c6a95db4bc15bc Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 27 May 2024 09:59:39 -0700 Subject: [PATCH 40/51] Expand a bit on diagnostics that we can emit --- sycl/doc/design/VirtualFunctions.md | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 489ac42ae3080..8fdadd64a55f0 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -194,15 +194,17 @@ as: - virtual member function *not* annotated with `indirectly_callable` compile-time property should *not* be emitted into device code; -To achieve that, the front-end should implicitly add `sycl_device` attribtue to +To achieve that, the front-end should implicitly add `sycl_device` attribute to each function which is marked with the `indirectly_callable` attribute. This can be done during handling of `[[__sycl_detail__::add_ir_attributes_function]]` attribute by checking if one of string literals passed in there as a property name is equal to "indirectly_callable". Later the `sycl_device` attribute can be used to decide if a virtual function should be emitted into device code. -**TODO:** any extra diagnostics we would like to emit? Like kernel without -`calls_indirectly` property performing virtual function call. +When emitting virtual calls, front-end should emit an extra `virtual-call` LLVM +IR attribute at every call site. This attribute will be used by a middle-end +pass to check that there are no virtual function calls in kernels _not_ marked +with the `calls_indirectly` property and emit a diagnostic about that. ### Changes to the compiler middle-end @@ -298,7 +300,14 @@ which doesn't support all required optional features. #### New compiler diagnostics -**TBD** +A new pass should be added to analyze virtual calls and emit diagnostics if a +kernel without the `calls_indirectly` property performs a virtual call and emit +a diagnostic about that. `virtual-call` LLVM IR attribute we attach to such +call instructions should help us with detecting those calls. + +The pass should be launched somewhere at the beginning of the optimization +pipeline so that LLVM IR is as close to the input source file as possible for +better diagnostics. #### Device code split and device images From c9c8da346efab51b5452e461a57ac9e43da9ff7b Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 28 May 2024 02:45:37 -0700 Subject: [PATCH 41/51] Clarification on indirectly_callable property and overrides --- .../proposed/sycl_ext_intel_virtual_functions.asciidoc | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 1301f089f7ad6..7088463da5a34 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -184,6 +184,11 @@ a device function, thus making it available to be called from SYCL kernel and device functions. Should only be applied to virtual member functions and to do so, function-style `SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro should be used. +NOTE: This property affect a particular function and does not impact any of its +overrides in derived classes. If the whole hierarchy of overrides is expected +to be callable from a device, then each and every override should be marked with +the property. + Optional parameter `SetId` specifies a set of virtual member functions this function belongs to and at the same time it defines a group of kernels, which can call this function, it must be a C++ typename. When the parameter is From 4b622e169b8078305d130b0b02c4480b54f712b8 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 4 Jun 2024 09:56:19 -0700 Subject: [PATCH 42/51] Update rules for properties on redeclarations --- .../sycl_ext_intel_virtual_functions.asciidoc | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc index 7088463da5a34..0790fccc9a7b0 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc @@ -197,6 +197,17 @@ omitted, a virtual member function is considered to belong to the default set. Calling a virtual member function from a kernel which does not declare use of a set the virtual member function belongs to is an undefined behavior. +The property must appear on the first declaration of the function in the +translation unit. Redeclarations of the function may optionally be decorated +with the same property if the property argument is the same. The effect is the +same regardless of whether redeclarations are so decorated. + +If a function is decorated with one of these properties in one translation unit, +any other translation unit that declares the same function must also decorate +the function with the same property (with the same argument). + +The programs that decorate the same function with multiple instances of the +property with different argument are ill formed. |`calls_indirectly` |The `calls_indirectly` property indicates that a SYCL kernel function is performing calls through virtual member functions and declares use of one or @@ -216,11 +227,6 @@ called. If a kernel submitted without this property calls a virtual member function, diagnostic should be emitted by an implementation. |=== -If a virtual member function is called from device code, all declarations of -that function must be decorated with the `indirectly_callable` property with -the same value of `SetId` property parameter, or otherwise behavior is -undefined. - If a kernel is submitted with the `indirectly_callable` property that points to an empty set of virtual functions, an exception should be thrown by an implementation. From 2e825208d1a9665e3e37f9388a448bed338f4e95 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 7 Jun 2024 06:20:12 -0700 Subject: [PATCH 43/51] Rename the extension spec file --- ...ctions.asciidoc => sycl_ext_oneapi_virtual_functions.asciidoc} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename sycl/doc/extensions/proposed/{sycl_ext_intel_virtual_functions.asciidoc => sycl_ext_oneapi_virtual_functions.asciidoc} (100%) diff --git a/sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc similarity index 100% rename from sycl/doc/extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc rename to sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc From a09a32a34e6ec2fd79a586b0506fa92c4baad663 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 7 Jun 2024 06:31:02 -0700 Subject: [PATCH 44/51] Record direction for potential implementation design improvements --- sycl/doc/design/VirtualFunctions.md | 34 +++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 8fdadd64a55f0..e59db051f226f 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -495,6 +495,40 @@ cache eviction was required with SYCL CTS test for specialization constants, which is very heavy. Therefore, it is not expected that any changes to in-memory cache eviction mechanism will be needed any time soon. +## Design alternatives + +Discussions over this feature resulted in suggestion for an alternative +implementation that would lift some of the usage restrictions of virtual +functions, but they require more time for investigation and analysis than we +currently have and therefore information below is recorded as a potential +future changes to this design. + +### Do not record an absolute address of a vtable in an object + +One of the significant limitations of the design outlined above is that if a +device image got recompiled in-between object creation and virtual call, then +vtable pointer stored in an object is invalidated. Such re-compilation could +happen if specialization constant value was changed, for example. + +As a possible solution to lift that limitation, we could have recorded an index +of a vtable instead of its address into an object. We will need to change the +LLVM IR we emit for object construction and making virtual function call, but +it will allow to avoid invalidating of vtable pointer on device image +recompilation. + +To introduce an order to vtables, we could generate a couple of helper functions +to map between vtable and its index and vice-versa. + +Theoretically, this solution could be extended further to make sure that vtable +index is still accessible even if an object is passed between different device +images: if we make sure to include every vtable into every device image and +somehow maintain the stable order of those. + +There are many questions that need to be explored and answered and therefore +this implementation design is not being immediatly proposed, but it sounds like +a promising direction to lift some of existing limitations and improve user +experience. + [1]: <../extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc> [2]: [3]: https://clang.llvm.org/docs/LanguageExtensions.html#builtin-sycl-unique-stable-name From b7cdc2d8e86b72368f332fc108faf92b751db205 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 24 Jul 2024 03:06:56 -0700 Subject: [PATCH 45/51] Rename properties in accordance with the discussion --- ...sycl_ext_oneapi_virtual_functions.asciidoc | 203 ++++++++++-------- 1 file changed, 117 insertions(+), 86 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc index 0790fccc9a7b0..408a98ea8786d 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc @@ -147,12 +147,17 @@ namespace sycl::ext::oneapi::experimental { using value_t = property_value; }; - template - inline constexpr indirectly_callable_key::value_t indirectly_callable; + inline constexpr indirectly_callable_key::value_t indirectly_callable; - template + template + inline constexpr indirectly_callable_key::value_t + indirectly_callable_in; + + inline constexpr calls_indirectly_key::value_t assume_indirect_calls; + + template inline constexpr calls_indirectly_key::value_t - calls_indirectly; + assume_indirect_calls_to; template <> struct is_property_key : std::true_type {}; @@ -179,8 +184,19 @@ functions. |=== |Property|Description |`indirectly_callable` -|The `indirectly_callable` property indicates that a virtual member function is -a device function, thus making it available to be called from SYCL kernel and +|This is an alias to `indirectly_callable_in`, please read the description +of the `indirectly_callable_in` property for full documentation. + +This property is expected to be used in situations where application is not that +huge and/or complex and therefore doesn't care about having more than one set +of virtual functions. + +Going forward, the document will only reference the `indirectly_callable_in` +property, but whatever is said about it also applies to the +`indirectly_callable` property because it is a simple alias. +|`indirectly_callable_in` +|The `indirectly_callable_in` property indicates that a virtual member function +is a device function, thus making it available to be called from SYCL kernel and device functions. Should only be applied to virtual member functions and to do so, function-style `SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` macro should be used. @@ -189,10 +205,9 @@ overrides in derived classes. If the whole hierarchy of overrides is expected to be callable from a device, then each and every override should be marked with the property. -Optional parameter `SetId` specifies a set of virtual member functions this -function belongs to and at the same time it defines a group of kernels, which -can call this function, it must be a C++ typename. When the parameter is -omitted, a virtual member function is considered to belong to the default set. +Parameter `SetId` specifies a set of virtual member functions this function +belongs to and at the same time it defines a group of kernels, which can call +this function, it must be a C++ typename. Calling a virtual member function from a kernel which does not declare use of a set the virtual member function belongs to is an undefined behavior. @@ -204,43 +219,54 @@ same regardless of whether redeclarations are so decorated. If a function is decorated with one of these properties in one translation unit, any other translation unit that declares the same function must also decorate -the function with the same property (with the same argument). +the function with the same property (with the same argument). Otherwise the +program is considered ill-formed, but no diagnostic is required. The programs that decorate the same function with multiple instances of the property with different argument are ill formed. -|`calls_indirectly` -|The `calls_indirectly` property indicates that a SYCL kernel function is -performing calls through virtual member functions and declares use of one or +|`assume_indirect_calls` +|This is an alias to `assume_indirect_calls_to`, please read the +description of the `assume_indirect_calls_to` property for full documentation. + +This property is expected to be used in situations where application is not that +huge and/or complex and therefore doesn't care about having more than one set +of virtual functions. + +Going forward, the document will only reference the `assume_indirect_calls_to` +property, but whatever is said about it also applies to the +`assume_indirect_calls` property because it is a simple alias. +|`assume_indirect_calls_to` +|The `assume_indirect_calls_to` property indicates that a SYCL kernel function +may perform calls through virtual member functions and declares use of one or more sets of virtual member functions. -Optional parameter `SetIds` specifies which sets of virtual member functions are -declared to be used by a kernel, it must be zero or more C\++ typenames. If the -argument is omitted (zero C++ typenames specified), then a kernel is considered -to be using the default set of virtual member functions. +Parameter `SetIds` specifies which sets of virtual member functions are +declared to be used by a kernel, it must be zero or more C\++ typenames. Calling a virtual member function, which does not belong to any of sets of virtual member functions declared to be used is an undefined behavior. This property should be attached to a kernel if it contains a virtual member function call in its call graph, even if the said function is never actually -called. If a kernel submitted without this property calls a virtual member -function, diagnostic should be emitted by an implementation. +called. If a kernel submitted without this property contains a virtual member +function call in its call graph, diagnostic should be emitted by an +implementation. |=== -If a kernel is submitted with the `indirectly_callable` property that points to -an empty set of virtual functions, an exception should be thrown by an -implementation. +If a kernel is submitted with the `assume_indirect_calls_to` property that +points to an empty set of virtual functions, a synchronious exception with the +`errc::invalid` error code should be thrown by an implementation. -Applying the `indirectly_callable` property to a SYCL Kernel function is illegal -and an implementation should produce a diagnostic for that. +Applying the `indirectly_callable_in` property to a SYCL Kernel function is +illegal and an implementation should produce a diagnostic for that. -Applying the `indirectly_callable` property to an arbitrary device function, +Applying the `indirectly_callable_in` property to an arbitrary device function, which is not a virtual member function has no effect. NOTE: This behavior may be changed in either future version of this extension or in another extensions. -Virtual member functions that are decorated with the `indirectly_callable` +Virtual member functions that are decorated with the `indirectly_callable_in` property are considered to be device functions, i.e. they must obey the restrictions listed in section 5.4 of the core SYCL specification "Language restrictions for device functions". Virtual member functions that are not @@ -259,27 +285,31 @@ class Foo { public: // properties to functions should be applied using the macro: virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void + syclext::indirectly_callable_in) void foo() {} virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void - bar(){}; + syclext::indirectly_callable_in) void + bar(); - // both declaration and definition should be annotated: + // first declaration must be annotated virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void + syclext::indirectly_callable_in) void baz(); }; -// both declaration and definition should be annotated: -void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) +// redeclarations may be annotated as well +void SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable_in) Foo::baz() {} +// but it is not required +Foo::bar() {} + int main() { sycl::queue q; // kernel calling virtual function should also be annotated: - q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() { Foo *ptr = /* ... */; ptr->bar() @@ -292,7 +322,7 @@ int main() { The main reason for virtual functions to be split into different sets is use of optional kernel features in those virtual functions. It is explained in more details in the next section. However, for simplicity purposes both properties -can be used without explicitly specifying a set, thus using the default set: +have aliases which allow to omit the set, thus using the default set: [source,dpcpp] ---- @@ -304,16 +334,18 @@ class Foo { public: // This virtual member function belongs to the default set of virtual // functions. - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) void foo() {} - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void bar() {} }; int main() { sycl::queue q; - q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { + // This kernel declares a use of default set of virtual functions + q.single_task(syclext::properties{syclext::assume_indirect_calls}, [=]() { Foo *ptr = /* ... */; ptr->bar() @@ -323,11 +355,11 @@ int main() { } ---- -NOTE: By convention, the type `void` is used to denote the default set of -virtual functions. When the typename is omitted from the `indirectly_callable` -or `calls_indirectly` propertyh, the type `void` is used by default. -Applications may also explicitly use the type `void` to denote this default set -of virtual functions. +NOTE: By definition of the `indirectly_callable` and `assume_indirect_calls` +properties above, the type `void` is used to denote the default set of +virtual functions. Applications may also explicitly use the type `void` to +denote this default set of virtual functions when using `indirectly_calleble_in` +and `assume_indirect_calls_to` properties. === Optional kernel features handling @@ -358,12 +390,12 @@ are used by all virtual member functions included into all sets of virtual member functions declared to be used by a kernel. Therefore, if only default set of virtual functions is used by an application, -it means that every kernel which is submitted with the `calls_indirectly` -property is assumed to use _all_ virtual functions marked with the -`indirectly_callable` property. If some of those virtual functions use optional -kernel features and there are kernels which are supposed to work on devices -without support for those optional kernel features, then virtual functions -using them should be outlined into a separate set. +it means that every kernel which is submitted with the +`assume_indirect_calls_to` property is assumed to use _all_ virtual functions +marked with the `indirectly_callable_in` property. If some of those virtual +functions use optional kernel features and there are kernels which are supposed +to work on devices without support for those optional kernel features, then +virtual functions using them should be outlined into a separate set. [source,dpcpp] ---- @@ -374,13 +406,15 @@ struct set_fp16; struct Foo { // This function uses 'fp64' aspect - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void f64() { double d = 3.14; } // This function uses 'fp16' aspect - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void f16() { sycl::half h = 2.71f; } @@ -388,40 +422,32 @@ struct Foo { sycl::queue q; -q.single_task(syclext::properties{syclext::calls_indirectly}, - [=] [[sycl::device_has(sycl::aspect::fp64)]] { +q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() [[sycl::device_has(sycl::aspect::fp64)]] { // Diagnostic is required for this kernel, because it is declared as only // using 'fp64' aspect, but it also uses virtual member functions from // "set_fp16", which includes 'Foo::f16' that uses 'fp16' aspect. }); -q.single_task(syclext::properties{syclext::calls_indirectly}, - [=] [[sycl::device_has()]] { +q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() [[sycl::device_has()]] { // Diagnostic is required for this kernel, because it is declared as not // using any optional features, but it also uses virtual member functions from // "set_fp64", which includes 'Foo::f64' that uses 'fp64' aspect. }); -q.single_task(syclext::properties{syclext::calls_indirectly}, - [=] [[sycl::device_has(sycl::aspect::fp64)]] { +q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() [[sycl::device_has(sycl::aspect::fp64)]] { // No diagnostic is required for this kernel, because list of declared aspects // matches list of used aspects. That includes virtual member functions from // "set_fp64", which includes 'Foo::f64' that uses 'fp64' aspect }); - -q.single_task(syclext::properties{syclext::calls_indirectly<>}, - [=] [[sycl::device_has()]] { - // No diagnostic is required for this kernel, because list of declared aspects - // matches list of used aspects. There are no virtual member functions defined - // in the default set, which means that no extra optional kernel features - // requirements were attached to the kernel. -}); ---- -Submitting a kernel with `calls_indirectly` property, which includes virtual -member functions that use optional kernel features to a device that doesn't -support them, should result in an exception at runtime, similar to how it is -defined by the core SYCL specification. +Submitting a kernel with `assume_indirect_calls_to` property, which includes +virtual member functions that use optional kernel features to a device that +doesn't support them, should result in an exception at runtime, similar to how +it is defined by the core SYCL specification. [source,dpcpp] ---- @@ -431,12 +457,14 @@ struct set_A; struct set_B; struct Foo { - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void foo() { double d = 3.14; } - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( + syclext::indirectly_callable_in) void bar() {} }; @@ -444,12 +472,14 @@ int main() { sycl::queue q(/* device selector returns a device *without* fp64 support */); assert(!q.get_device().has(sycl::aspect::fp64)); - q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() { // Exception is expected to be thrown, because target device doesn't support // fp64 aspect and it is used by 'Foo::foo' which is included into 'set_A' }); - q.single_task(syclext::properties{syclext::calls_indirectly}, [=] { + q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() { // No exceptions are expected, because 'set_B' doesn't bring any // requirements for optional kernel features. }); @@ -466,12 +496,12 @@ through the corresponding properties. using syclext = sycl::ext::oneapi::experimental; struct Foo { - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) void foo() { double d = 3.14; } - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) void bar() {} }; @@ -481,7 +511,7 @@ int main() { auto *Storage = sycl::malloc_device(1, q); - q.single_task([=] { + q.single_task([=]() { // The kernel is not submitted with 'calls_indirectly' property and // therefore it is not considered to be using any of virtual member // functions of 'Foo'. This means that the object of 'Foo' can be @@ -539,7 +569,7 @@ kernel from a different kernel bundle is an undefined behavior. using syclext = sycl::ext::oneapi::experimental; struct Base { - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) void foo() {} }; @@ -562,7 +592,7 @@ int main() { Q.submit([&](sycl::handler &CGH) { CGH.use_kernel_bundle(bundleA); - CGH.single_task([=] { + CGH.single_task([=]() { // Only placement new can be used within device functions. new (Obj) Base; }); @@ -570,7 +600,8 @@ int main() { Q.submit([&](sycl::handler &CGH) { CGH.use_kernel_bundle(bundleB); - CGH.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { + CGH.single_task(syclext::properties{syclext::assume_indirect_calls}, + [=]() { // Call to 'Base::foo' is an undefined behavior here, because 'Obj' was // constructed within kernel bundle `bundleA` Obj->foo(); @@ -609,7 +640,7 @@ using syclext = sycl::ext::oneapi::experimental; class Base { public: - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) int get_random_number() { return 4; // Chosen by fair dice roll. Guaranteed to be random } @@ -622,7 +653,7 @@ public: class Derived : public Base { public: - SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) int get_random_number() override { return 221; } @@ -634,19 +665,19 @@ int main() { Base *Obj = sycl::malloc_device(1, Q); int *Result = sycl::malloc_shared(1, Q); - Q.single_task([=] { + Q.single_task([=]() { // Only placement new can be used within device functions. new (Obj) Derived; }); - auto props = syclext::properties{syclext::calls_indirectly<>}; - Q.single_task(props, [=] { + auto props = syclext::properties{syclext::assume_indirect_calls}; + Q.single_task(props, [=]() { Base B; Result[0] = B.get_random_number(); }).wait(); assert(Result[0] == 4); - Q.single_task(props, [=] { + Q.single_task(props, [=]() { Result[0] = Obj->get_random_number(); }).wait(); assert(Result[0] == 221); From 0d7fb4ea28e88757d33641ac78fad54f03417180 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 24 Jul 2024 04:34:39 -0700 Subject: [PATCH 46/51] Fix language restrictions section --- .../proposed/sycl_ext_oneapi_virtual_functions.asciidoc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc index 408a98ea8786d..9c76c1a06ebe2 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc @@ -100,7 +100,8 @@ supports. === New language restrictions for device functions The following restriction, listed in section 5.4 of the core SYCL specification -does not apply if this extension is supported by an implementation: +does not apply to kernels submitted with the `assume_indirect_calls_to` and +`assume_indirect_calls` properties: > The odr-use of polymorphic classes and classes with virtual inheritance is > allowed. *However, no virtual member functions are allowed to be called in a From ba37bc2db6614aaaac547eb58e8182fee78cc5ca Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 24 Jul 2024 04:37:24 -0700 Subject: [PATCH 47/51] Simplify description of motivation for new properties --- .../proposed/sycl_ext_oneapi_virtual_functions.asciidoc | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc index 9c76c1a06ebe2..b491bb8157668 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc @@ -127,8 +127,7 @@ class it belongs to). Without any knowledge about which virtual function can be called from which kernels compiler will have to make all virtual functions available to all kernels. That may not be desirable because some of those virtual functions could -use optional kernel features and thus would propagate their use into kernels -designed to be submitted to devices without support for those optional features. +use features that are prohibited in device code. In order to help compiler to build a mapping between kernels and virtual functions they may call, the extension introduces new compile-time-constant From 21468f1d6865920af7681036c53f5536dd592d1e Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 24 Jul 2024 04:37:48 -0700 Subject: [PATCH 48/51] Fix docs build --- sycl/doc/design/VirtualFunctions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index e59db051f226f..c067ce5568735 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -529,7 +529,7 @@ this implementation design is not being immediatly proposed, but it sounds like a promising direction to lift some of existing limitations and improve user experience. -[1]: <../extensions/proposed/sycl_ext_intel_virtual_functions.asciidoc> +[1]: <../extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc> [2]: [3]: https://clang.llvm.org/docs/LanguageExtensions.html#builtin-sycl-unique-stable-name [sycl-spec-optional-kernel-features]: https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:optional-kernel-features From 4241c26892a4528b91df6eddc0455a81926f1563 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 24 Jul 2024 05:16:03 -0700 Subject: [PATCH 49/51] Fix some typos --- sycl/doc/design/VirtualFunctions.md | 2 +- .../proposed/sycl_ext_oneapi_virtual_functions.asciidoc | 6 +++--- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index c067ce5568735..9ed7fa2da8f55 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -525,7 +525,7 @@ images: if we make sure to include every vtable into every device image and somehow maintain the stable order of those. There are many questions that need to be explored and answered and therefore -this implementation design is not being immediatly proposed, but it sounds like +this implementation design is not being immediately proposed, but it sounds like a promising direction to lift some of existing limitations and improve user experience. diff --git a/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc index b491bb8157668..09bc0d51cbba1 100644 --- a/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc +++ b/sycl/doc/extensions/proposed/sycl_ext_oneapi_virtual_functions.asciidoc @@ -254,7 +254,7 @@ implementation. |=== If a kernel is submitted with the `assume_indirect_calls_to` property that -points to an empty set of virtual functions, a synchronious exception with the +points to an empty set of virtual functions, a synchronous exception with the `errc::invalid` error code should be thrown by an implementation. Applying the `indirectly_callable_in` property to a SYCL Kernel function is @@ -358,7 +358,7 @@ int main() { NOTE: By definition of the `indirectly_callable` and `assume_indirect_calls` properties above, the type `void` is used to denote the default set of virtual functions. Applications may also explicitly use the type `void` to -denote this default set of virtual functions when using `indirectly_calleble_in` +denote this default set of virtual functions when using `indirectly_callable_in` and `assume_indirect_calls_to` properties. === Optional kernel features handling @@ -537,7 +537,7 @@ sub-group sizes. It can be done in a straightforward manner when operating on a static call graph. Virtual member functions are essentially called indirectly and pointers to them -are initialized just once when an object of a polymorhpic class is being +are initialized just once when an object of a polymorphic class is being created. Therefore, to support calling such virtual member function from two or more kernels with different `reqd_sub_group_size`, each kernel may need to receive a different pointer to a different version of a virtual member function. From 46897b3337cadf7c7ba7ed7f66f81ee1e59e9f18 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 24 Jul 2024 05:22:03 -0700 Subject: [PATCH 50/51] Update design doc with new property names --- sycl/doc/design/VirtualFunctions.md | 103 ++++++++++++++-------------- 1 file changed, 52 insertions(+), 51 deletions(-) diff --git a/sycl/doc/design/VirtualFunctions.md b/sycl/doc/design/VirtualFunctions.md index 9ed7fa2da8f55..572977f8bd4c9 100644 --- a/sycl/doc/design/VirtualFunctions.md +++ b/sycl/doc/design/VirtualFunctions.md @@ -16,10 +16,10 @@ using syclext = sycl::ext::oneapi::experimental; struct set_fp64; struct Base { - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) void foo() {} - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable_in) void bar() { // this virtual function uses double double d = 3.14; @@ -50,13 +50,13 @@ int main() { // The same binary produced by a sycl compiler should correctly work on both // devices with and without support for 'fp64' aspect. - Q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { + Q.single_task(syclext::properties{syclext::assume_inddirect_calls}, [=]() { Obj->foo(); }); if (Q.get_device().has(sycl::aspect::fp64)) { - Q.single_task(syclext::properties{syclext::calls_indirectly}, - [=] { + Q.single_task(syclext::properties{syclext::assum_indirect_calls_to}, + [=]() { Obj->bar(); }); } @@ -71,7 +71,7 @@ which are not in order to avoid speculative compilation and fulfill optional kernel features requirements from the SYCL 2020 specification. To solve this, the following approach is used: all virtual functions marked with -`indirectly_callable` property are grouped by set they belong to and outlined +`indirectly_callable_in` property are grouped by set they belong to and outlined into separate device images (i.e. device images with kernels using them are left with declarations only of those virtual functions). @@ -80,10 +80,10 @@ create a "dummy" version of it where bodies of all virtual functions are emptied. Dependencies between device images are recorded in properties based on -`calls_indirectly` and `indirectly_callable` properties. They are used later by -runtime to link them together. Device images which depend on optional kernel -features are linked only if those features are supported by a target device and -dummy versions of those device images are used otherwise. +`assume_indirect_calls_to` and `indirectly_callable_in` properties. They are +used later by runtime to link them together. Device images which depend on +optional kernel features are linked only if those features are supported by a +target device and dummy versions of those device images are used otherwise. This way we can emit single unified version of LLVM IR where vtables reference all device virtual functions, but their definitions are outlined and linked @@ -96,24 +96,25 @@ cleanup of virtual functions which are incompatible with a target device. ### Changes to the SYCL header files -New compile-time properties `indirectly_callable` and `calls_indirectly` should -be implemented in accordance with the corresponding [design document][2]: +New compile-time properties `indirectly_callable_in` and +`assume_indirect_calls_to` should be implemented in accordance with the +corresponding [design document][2]: -- `indirectly_callable` property should lead to emission of +- `indirectly_callable_in` property should lead to emission of `"indirectly-callable"="set"` function attribute, where "set" is a string representation of the property template parameter. -- `calls_indirectly` property should lead to emission of +- `assume_indirect_calls_to` property should lead to emission of `"calls-indirectly"="set1,set2"`, where "set1" and "set2" are string representations of the property template parameters. In order to convert a type to a string, [\__builtin_sycl_unique_stable_name][3] could be used. -The `calls_indirectly` compile-time property accepts a list of types which -identify virtual functions set. It can be handled using metaprogramming magic to -compile-time concatenate strings to produce a single value out of a set of -parameters. Similar approach is used to handle `reqd_work_group_size` and other -compile-time properties that accept integers: +The `assume_indirect_calls_to` compile-time property accepts a list of types +which identify virtual functions set. It can be handled using metaprogramming +magic to compile-time concatenate strings to produce a single value out of a set +of parameters. Similar approach is used to handle `reqd_work_group_size` and +other compile-time properties that accept integers: ```c++ // Helper to hide variadic list of arguments under a single type @@ -189,17 +190,17 @@ However, we do need to filter out those virtual functions which are not considered to be device as defined by the [extension specification][1], such as: -- virtual member functions annotated with `indirectly_callable` compile-time +- virtual member functions annotated with `indirectly_callable_in` compile-time property should be emitted into device code; -- virtual member function *not* annotated with `indirectly_callable` +- virtual member function *not* annotated with `indirectly_callable_in` compile-time property should *not* be emitted into device code; To achieve that, the front-end should implicitly add `sycl_device` attribute to -each function which is marked with the `indirectly_callable` attribute. This +each function which is marked with the `indirectly_callable_in` attribute. This can be done during handling of `[[__sycl_detail__::add_ir_attributes_function]]` -attribute by checking if one of string literals passed in there as a property -name is equal to "indirectly_callable". Later the `sycl_device` attribute can be -used to decide if a virtual function should be emitted into device code. +attribute by checking if one of string literals passed in there is an attribute +name argument name to "indirectly_callable". Later the `sycl_device` attribute +can be used to decide if a virtual function should be emitted into device code. When emitting virtual calls, front-end should emit an extra `virtual-call` LLVM IR attribute at every call site. This attribute will be used by a middle-end @@ -215,9 +216,9 @@ used directly, but also aspects that are used indirectly, through virtual functions. For that the pass should compile a list of aspects used by each set of -indirectly callable functions (as defined by `indirectly_callable` property set -by user) and then append those aspects to every kernel which use those sets (as -defined by `calls_indirectly` property set by user). +indirectly callable functions (as defined by `indirectly_callable_in` property +set by user) and then append those aspects to every kernel which use those sets +(as defined by `assume_indirect_calls_to` property set by user). **TODO**: should we consider outlining "indirectly used" aspects into a separate metadata and device image property? This should allow for more precise and @@ -240,10 +241,10 @@ using syclext = sycl::ext::oneapi::experimental; struct set_fp64; struct Base { - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable<>) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable void foo() {} - virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable) + virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY(syclext::indirectly_callable_in) void bar() { // this virtual function uses double double d = 3.14; @@ -261,22 +262,22 @@ int main() { Base *Obj = sycl::malloc_device(1, Q); int *Result = sycl::malloc_shared(2, Q); - Q.single_task([=] { + Q.single_task([=]() { // Even though at LLVM IR level this kernel does reference 'Base::foo' // and 'Base::bar' through global variable containing `vtable` for `Base`, // we do not consider the kernel to be using `fp64` optional feature. new (Obj) Base; }); - Q.single_task(syclext::properties{syclext::calls_indirectly<>}, [=] { + Q.single_task(syclext::properties{syclext::assume_indirect_calls}, [=]() { // This kernel is not considered to be using any optional features, because // virtual functions in default set do not use any. Obj->foo(); }); if (Q.get_device().has(sycl::aspect::fp64)) { - Q.single_task(syclext::properties{syclext::calls_indirectly}, - [=] { + Q.single_task(syclext::properties{syclext::assume_indirect_calls_to}, + [=]() { // This kernel is considered to be using 'fp64' optional feature, because // there is a virtual function in 'set_fp64' which uses double. Obj->bar(); @@ -301,9 +302,9 @@ which doesn't support all required optional features. #### New compiler diagnostics A new pass should be added to analyze virtual calls and emit diagnostics if a -kernel without the `calls_indirectly` property performs a virtual call and emit -a diagnostic about that. `virtual-call` LLVM IR attribute we attach to such -call instructions should help us with detecting those calls. +kernel without the `assume_indirect_calls_to` property performs a virtual call +and emit a diagnostic about that. `virtual-call` LLVM IR attribute we attach to +such call instructions should help us with detecting those calls. The pass should be launched somewhere at the beginning of the optimization pipeline so that LLVM IR is as close to the input source file as possible for @@ -326,13 +327,13 @@ struct regular_set; struct Foo { virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void foo() { + syclext::indirectly_callable_in) void foo() { // uses double double d = 3.14; } virtual SYCL_EXT_ONEAPI_FUNCTION_PROPERTY( - syclext::indirectly_callable) void bar() {} + syclext::indirectly_callable_in) void bar() {} }; sycl::queue q; @@ -340,7 +341,7 @@ sycl::queue q; auto *Storage = sycl::malloc_device(1, q); q.single_task([=] { - // The kernel is not submitted with 'calls_indirectly' property and therefore + // The kernel is not submitted with 'assume_indirect_calls_to' property and therefore // it is not considered to be using any of virtual member functions of 'Foo'. // This means that the object of 'Foo' can be successfully created by this // kernel, regardless of whether a target device supports 'fp64' aspect which @@ -350,13 +351,13 @@ q.single_task([=] { }); if (q.get_device().has(sycl::aspect::fp64)) { - auto props = syclext::properties{syclext::calls_indirectly}; - q.single_task(props, [=] { + auto props = syclext::properties{syclext::assume_indirect_calls_to}; + q.single_task(props, [=]() { Storage->foo(); }); } else { - auto props = syclext::properties{syclext::calls_indirectly}; - q.single_task(props, [=] { + auto props = syclext::properties{syclext::assume_indirect_calls_to}; + q.single_task(props, [=]() { Storage->bar(); }); } @@ -368,8 +369,8 @@ aspect or not. Implementation differs for JIT and AOT flows. ##### JIT flow Regardless of device code split mode selected by a user, functions marked with -`indirectly_callable` property should be outlined into separate device images -by `sycl-post-link` tool based on the argument of the `indirectly_callable` +`indirectly_callable_in` property should be outlined into separate device images +by `sycl-post-link` tool based on the argument of the `indirectly_callable_in` property, i.e. all functions from the same set should be bundled into a dedicated device image. @@ -424,7 +425,7 @@ the right (supported by a device) set of virtual functions in it. Therefore, we do not need to emit any of those properties when we are in AOT mode. For device images, which contain virtual functions (i.e. ones produced by -outlining `indirectly_callable` functions into a separate device image), the +outlining `indirectly_callable_in` functions into a separate device image), the following properties are set within the new property set: - "virtual-functions-set" with a string value containing name of virtual functions set contained within the image (value of the property argument); @@ -438,7 +439,7 @@ For the purposes of generating "uses-virtual-functions-set" device image property value the fact that kernel uses a set of virtual functions is inferred based on two things: - kernel is set to explicitly use a set of virtual functions through - `calls_indirectly` property; + `assume_indirect_calls_to` property; - kernel constructs an object of a polymorphic class and thus references vtable global variable which in turn references functions that belong to some sets; @@ -467,15 +468,15 @@ images are linked together with it: If for any used virtual functions set there is no device image that provides virtual functions from it, the runtime should throw an exception, because that -is likely a user error (missing or misspelled `indirectly_callable` property +is likely a user error (missing or misspelled `indirectly_callable_in` property on a virtual function). Produced list of device images is then linked together and used to enqueue a kernel. NOTE: when shared libraries are involved, they could also provide some -`indirectly_callable` functions in the same sets as application. This means that -there could be more than one image registered with the same value of +`indirectly_callable_in` functions in the same sets as application. This means +that there could be more than one image registered with the same value of "virtual-functions-set" property. #### In-memory cache of kernels and programs From 02ba869938b7c77eb7a6a88b9bbbbcc866057084 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 24 Jul 2024 07:31:12 -0700 Subject: [PATCH 51/51] Expand test plan a little bit --- .../VirtualFunctions/test-plan.asciidoc | 49 +++++++++++++++++-- 1 file changed, 46 insertions(+), 3 deletions(-) diff --git a/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc b/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc index 52132d2af6b88..c4a05d53cfad8 100644 --- a/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc +++ b/sycl/test-e2e/VirtualFunctions/test-plan.asciidoc @@ -3,7 +3,7 @@ = Test plan for virtual functions support in SYCL This is a test plan for virtual functions functionality described by the -corresponding upcoming `ext_sycl_intel_virtual_functions` extension. +`ext_sycl_oneapi_virtual_functions` extension. NOTE: This test plan does not cover unit tests, or negative tests related to compiler diagnostics, it is focused on end-to-end examples to make sure that the @@ -200,6 +200,22 @@ an object of one of derived classes and passes it to a function accepting a pointer to a base class object. The function the virtual member function, result is stored to a buffer and verified on host. +===== Handling of non-device virtual functions + +The test checks that presence of virtual member functions which were not marked +to be callable from device code can be handled correctly. + +In a simple hierarchy of classes some of virtual functions and their overrides +should be marked as callable from device, but other should not. Those virtual +functions should perform access to non-static data members. + +Depending on a runtime parameter (passed as a kernel argument) a kernel creates +an object of one of derived classes and passes it to a function accepting a +pointer to a base class object. The function the virtual member function, result +is stored to a buffer and verified on host. In device code we only check virtual +functions which were marked as callable on device. Host part of the program +also does calls to host-only virtual functions to verify their correctness. + === Passing objects of polymorphic classes between kernels Contrary to the previous section, an object of a polymorphic class is @@ -232,7 +248,7 @@ a method from a different set on that single object. In this test, there should be several kernels each constructing an object of a different derived class. It should be followed by a single kernel that calls -virtual functions from that created object. +virtual functions using all those constructed objects. ==== Multiple construct, multiple use @@ -244,7 +260,34 @@ that created object. === Separate translation units -TBD. +Test cases in this section aimed to cover different scenarios where definitions +of virtual functions, kernels that construct objects and kernels which perform +virtual calls are all distributed among several translation units in different +combinations. + +Test cases in this section could be a copy of test cases from the sections above +with only difference that they are split into several source files. + +==== Virtual functions defined in a separate translation unit + +For this test case, definition of virtual functions which are called from device +should be outlined into a separate translation unit, but kernels which construct +objects and perform virtual calls should all be in the same translation unit. + +==== Virtual functions defined in several translation unit + +This is the same test case as one above, except that every virtual function +definition should be placed in its individual translation unit. + +==== Kernels that use virtual functions are defined in different translation units + +For this test cases, both virtual functions and kernels that use them (including +kernels that construct objects) should be outlined into separate translation +units, i.e. there should be at least 3 translation units: + +- virtual functions definitions +- kernels that construct objects +- kernels that perform virtual calls === Optional kernel features