this post was submitted on 16 Aug 2023

Machine Learning


When I train my PyTorch Lightning model on two GPUs in JupyterLab with strategy="ddp_notebook", only two CPU cores are used and both sit at 100% usage. How can I overcome this CPU bottleneck?

Edit: I tested with PyTorchProfiler, and the cause turned out to be the old SSDs used on the server.
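
For anyone who lands here with the same symptom: a rough way to check whether the loop is stalling on data loading (slow disk, too few workers) before setting up a full profiler is to time how long each iteration waits for the next batch versus how long the step itself takes. This is only a minimal sketch, not what I actually ran; `data_wait_fraction` and `step_fn` are hypothetical names, and it works on any iterable, so a `DataLoader` can be passed in as `loader`.

```python
import time


def data_wait_fraction(loader, step_fn, max_batches=50):
    """Return the fraction of loop time spent waiting for the next batch.

    loader:  any iterable of batches (e.g. a torch DataLoader).
    step_fn: hypothetical callable that runs one training step on a batch.
             On GPU, it should synchronize (e.g. torch.cuda.synchronize())
             before returning, otherwise asynchronous CUDA work is not
             counted as compute time.
    """
    wait = 0.0
    compute = 0.0
    it = iter(loader)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data loading / disk I/O
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # time spent here is the model step
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    total = wait + compute
    return wait / total if total > 0 else 0.0
```

A value close to 1.0 suggests the GPUs are mostly idle waiting for data, which points at storage or the loader settings rather than the model.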

[–] [email protected] 3 points 1 year ago

Yup, this. If you would like more help, we need the code, or at least a minimal reproducible example.