Can't get multi-gpu to work anymore #14948
Unanswered
EvanZ
asked this question in
DDP / multi-GPU / multi-node
Replies: 1 comment 1 reply
-
@EvanZ We don't currently have a version upgrade guide. For the time being, I would suggest upgrading PL one minor version at a time. For example, if you're using … I am also interested in how the degradation happened in your case. Would it be feasible for you to share your code and environment details here so that I (or someone else) might be able to point out possible causes in your code?
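As a rough sketch of what "environment details" could look like, something along these lines prints the installed versions. The helper name `collect_versions` and the package list are just assumptions for illustration; adjust them to whatever you actually have installed.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Return a dict mapping each package name to its installed version."""
    info = {"python": platform.python_version(), "platform": platform.platform()}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is absent under this distribution name.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    # Assumed distribution names; Lightning may be installed under either one.
    for key, value in collect_versions(["torch", "pytorch-lightning", "lightning"]).items():
        print(f"{key}: {value}")
```

Pasting that output together with the full traceback should make it easier to see which upgrade step introduced the problem.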
-
At one point I was able to run my model on 4 GPUs on a single machine, but since upgrading to the most recent versions of torch and lightning I am getting shared memory errors like this:
unable to open shared memory object </torch_4121_699393955_8164> in read-write mode: Too many open files (24)
Is there a tutorial or any docs that explain what changes I need to make to my code to bring it up to date?
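For reference, two workarounds are commonly suggested for this class of "Too many open files" error: raising the per-process open-file limit, and switching PyTorch's tensor sharing strategy to `file_system`. Neither is confirmed as the cause in this thread, so the following is only a sketch. The helper name `raise_open_file_limit` and the target of 4096 are illustrative assumptions, and the `resource` module is Unix-only.

```python
import resource


def raise_open_file_limit(target=4096):
    """Raise the soft open-file limit toward `target`, capped at the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    # Only raise the limit; never lower an already higher (or unlimited) one.
    if soft != resource.RLIM_INFINITY and soft < target:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


raise_open_file_limit()

try:
    import torch.multiprocessing as mp

    # Share tensors between processes through named files in shared memory
    # instead of passing file descriptors around.
    mp.set_sharing_strategy("file_system")
except ImportError:
    # torch is not installed in this environment; nothing to configure.
    pass
```

Both calls would need to run before the trainer and any DataLoader workers are created. The shell equivalent of the first step is `ulimit -n 4096`.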