Low Mixed Precision Performance #296

Open
@fredlarochelle

Description

I am seeing some strange performance behavior on the A770, for example with the CIFAR-10 example from the documentation.

With FP32 I get around 5.75 s per epoch, and with BF16 around 6.2 s per epoch, so BF16 is actually slightly slower than FP32. I also get exactly the same performance with and without ipex.optimize().
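For reference, here is a minimal sketch of how per-epoch times like these could be measured so that the FP32 and BF16 numbers are comparable. It is not the code from the documentation example: `time_epochs`, `run_epoch`, and `sync` are hypothetical names used only for illustration. The assumption is that `sync` would wrap a device synchronization call (for instance `torch.xpu.synchronize()` when IPEX is installed), so that asynchronously queued kernels have finished before the clock is read, and that the first epoch is treated as warm-up.

```python
import statistics
import time
from typing import Callable, List, Optional


def time_epochs(
    run_epoch: Callable[[], None],
    epochs: int = 5,
    warmup: int = 1,
    sync: Optional[Callable[[], None]] = None,
) -> List[float]:
    """Return wall-clock seconds for each timed epoch (warm-up epochs excluded)."""
    timings: List[float] = []
    for i in range(warmup + epochs):
        if sync is not None:
            sync()  # drain pending device work before starting the clock
        start = time.perf_counter()
        run_epoch()
        if sync is not None:
            sync()  # wait for queued kernels so the whole epoch is counted
        elapsed = time.perf_counter() - start
        if i >= warmup:  # skip warm-up epochs (one-time setup costs)
            timings.append(elapsed)
    return timings


if __name__ == "__main__":
    # Stand-in workload; a real run would call the training loop here
    # and pass the device synchronization function as `sync` (assumption).
    def fake_epoch() -> None:
        sum(i * i for i in range(100_000))

    times = time_epochs(fake_epoch, epochs=3, warmup=1)
    print(f"median epoch time: {statistics.median(times):.4f}s over {len(times)} epochs")
```

Reporting the median over several timed epochs, rather than a single epoch, makes it easier to tell a real FP32/BF16 difference from run-to-run noise.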

For comparison, a Tesla T4 on Colab runs each epoch in around 1 s in FP32 and around 0.25 s in FP16. That is way faster, even though the A770 has better specs on paper.
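To put the numbers above side by side, the short sketch below computes the ratios implied by the reported epoch times. The values are the approximate figures quoted in this issue; the `speedup` helper and the dictionary layout are illustrative only. Note that the low-precision format differs between the two devices (BF16 on the A770, FP16 on the T4), so the comparison is indicative rather than exact.

```python
# Approximate seconds per epoch as reported in this issue.
epoch_seconds = {
    "A770": {"fp32": 5.75, "low_precision": 6.2},  # low precision = BF16
    "T4": {"fp32": 1.0, "low_precision": 0.25},    # low precision = FP16
}


def speedup(slow: float, fast: float) -> float:
    """How many times faster `fast` is than `slow` (values < 1 mean slower)."""
    return slow / fast


# Gain from mixed precision on each device.
for device, t in epoch_seconds.items():
    gain = speedup(t["fp32"], t["low_precision"])
    print(f"{device}: low precision vs FP32 = {gain:.2f}x")

# Gap between the two devices for each precision mode.
for mode in ("fp32", "low_precision"):
    gap = speedup(epoch_seconds["A770"][mode], epoch_seconds["T4"][mode])
    print(f"{mode}: T4 vs A770 = {gap:.2f}x")
```

With these figures, mixed precision gives roughly a 4x gain on the T4 but a slight slowdown (about 0.93x) on the A770, and the gap between the two devices grows from about 5.75x in FP32 to about 24.8x in low precision.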

Are the XMX engines actually being used on Arc GPUs? If I read #258 correctly, the answer should be yes?

I don't know if it is related, but I get the following warnings when running the example (EDIT: I started going through the code in the repo, and these warnings are not related to the current issue):

/home/fred/.local/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:447: UserWarning: For XPU device, the split master weight is unsupported for now, so temp to disable it
  warnings.warn("For XPU device, the split master weight is unsupported for now, so temp to disable it")
/home/fred/.local/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:457: UserWarning: For XPU device to save valuable device memory, temp to do optimization on inplaced model, so                     make inplace to be true
  warnings.warn(
/home/fred/.local/lib/python3.10/site-packages/intel_extension_for_pytorch/frontend.py:464: UserWarning: For XPU, the weight prepack and sample input are disabled. The onednn layout                     is automatically chosen to use
  warnings.warn(

Environment: Ubuntu 22.04 with IPEX 1.13.10+xpu.
