Description
I am encountering some strange performance behavior on the A770, for example with the CIFAR-10 example from the documentation.
Using FP32, I get around 5.75s per epoch and, using BF16, around 6.2s per epoch. I also get exactly the same performance with and without ipex.optimize().
Also, when I compare with a Tesla T4 on Colab, it runs each epoch in around 1s in FP32 and around 0.25s in FP16. That is way faster, even though the A770 technically has better specs.
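In case it helps with reproducing the numbers, here is a minimal sketch of how per-epoch timings like the ones above could be measured. GPU work is usually queued asynchronously, so the clock should only be stopped after the device has finished. The names `time_epochs`, `run_epoch`, and `sync` are hypothetical helpers for illustration and are not part of the documentation example.

```python
import statistics
import time
from typing import Callable, List


def time_epochs(
    run_epoch: Callable[[], None],
    sync: Callable[[], None] = lambda: None,
    warmup: int = 1,
    repeats: int = 3,
) -> List[float]:
    """Return wall-clock seconds for each timed call of run_epoch."""
    # Warm-up runs absorb one-time costs (kernel compilation, caches).
    for _ in range(warmup):
        run_epoch()
    sync()

    timings = []
    for _ in range(repeats):
        sync()  # drain pending device work before starting the clock
        start = time.perf_counter()
        run_epoch()
        sync()  # wait for queued work to finish before stopping the clock
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a GPU.
    def fake_epoch() -> None:
        time.sleep(0.01)

    results = time_epochs(fake_epoch, warmup=1, repeats=3)
    print(f"median epoch time: {statistics.median(results):.3f}s")
```

On the A770 I would expect to pass `torch.xpu.synchronize` as `sync`, and `torch.cuda.synchronize` on the T4; `run_epoch` would wrap one pass over the CIFAR-10 training loader. Both choices are assumptions about the setup rather than something taken from the example.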
Are the XMX engines being used on Arc GPUs? Yes #258?
I don't know if it is related, but I get the following warnings when running the example (EDIT: I started going through the code in the repo; those warnings are not related to the current issue):
/home/fred/.local/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:447: UserWarning: For XPU device, the split master weight is unsupported for now, so temp to disable it
warnings.warn("For XPU device, the split master weight is unsupported for now, so temp to disable it")
/home/fred/.local/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:457: UserWarning: For XPU device to save valuable device memory, temp to do optimization on inplaced model, so make inplace to be true
warnings.warn(
/home/fred/.local/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:464: UserWarning: For XPU, the weight prepack and sample input are disabled. The onednn layout is automatically chosen to use
warnings.warn(
Environment: Ubuntu 22.04 with intel_extension_for_pytorch 1.13.10+xpu.