I am trying to finetune LLaVA on a custom dataset, but flash attention is not supported by my GPU. Do I need it to finetune?

Replies: 1 comment
No, you do not need flash attention to finetune. In the finetuning script, use train.py instead of train_mem.py. It worked for me, and I didn't see much difference in results.
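For context on why the swap works: train_mem.py enables the FlashAttention kernels, which need a fairly recent GPU, while train.py runs the standard attention path, so you mainly give up some speed and memory savings rather than result quality. Outside the LLaVA scripts, here is a minimal sketch of the same idea when loading a model with Hugging Face transformers; it assumes a recent transformers version, and the helper name and checkpoint below are placeholders, not from the LLaVA code:

```python
# Minimal sketch (assumptions: transformers recent enough to accept the
# attn_implementation argument; "your-base-model" is a placeholder checkpoint).
import importlib.util

import torch
from transformers import AutoModelForCausalLM


def pick_attn_implementation() -> str:
    """Use FlashAttention 2 only if the package is installed and the GPU supports it."""
    has_flash_pkg = importlib.util.find_spec("flash_attn") is not None
    # FlashAttention 2 requires an Ampere-or-newer GPU (compute capability >= 8.0).
    has_supported_gpu = (
        torch.cuda.is_available() and torch.cuda.get_device_capability()[0] >= 8
    )
    return "flash_attention_2" if (has_flash_pkg and has_supported_gpu) else "sdpa"


model = AutoModelForCausalLM.from_pretrained(
    "your-base-model",  # placeholder; substitute your actual checkpoint
    torch_dtype=torch.float16,
    attn_implementation=pick_attn_implementation(),
)
```

The fallback path ("sdpa" here, or the default attention in train.py) produces the same outputs, just more slowly and with higher memory use.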