Okay, I’ll admit it: As a practicing data scientist/junkie, I picked up Docker only after spending days trying to install the unholy combination of nvidia-cudatoolkit packages for PyTorch on my laptop with a measly GPU.
It took me a while to understand what Docker does, but I think I’d best describe it now as: Docker spins up a container from an image, an isolated environment within your system with the OS and system packages pre-specified. So I don’t have to worry about resolving package conflicts on my machine, especially with the kind of random bloat that accumulates in my apt and .deb lists1. Instead, I have this neat little mini-server on my machine that I can log into, run code on, and do analysis in. This is the great blessing of containerization via Docker – you abstract away messy individual installation problems and get a clean, pre-specified environment that is the same across all users. And you know exactly what’s in it, because the Dockerfile specifies it all.
With my forays into hobbyist self-hosting – an effort to re-purpose my old hunk of a laptop and my disparate HDDs – I use Docker extensively now. But I use docker-compose instead of vanilla Docker, for several reasons: I like the .yml declaration style, I don’t have to write multi-line CLI commands, and it works extremely well with VSCode’s Docker and Remote-Containers extensions2.
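To give a flavour of that declaration style, here’s a minimal sketch of what a GPU-enabled PyTorch service might look like as a compose file; the image tag, port, and mount path are placeholders rather than anything from my actual setup, and the GPU reservation syntax needs a reasonably recent Compose release:

```yaml
# docker-compose.yml -- a minimal sketch; image tag, port, and paths are placeholders
version: "3.8"

services:
  pytorch:
    image: pytorch/pytorch:1.8.1-cuda11.1-cudnn8-runtime  # CUDA ships inside the image
    volumes:
      - ./workspace:/workspace        # mount local code into the container
    ports:
      - "8888:8888"                   # e.g. for a Jupyter server
    command: sleep infinity           # keep the container alive so you can exec into it
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]     # hand the container one GPU
```

One `docker-compose up -d` later, the whole environment exists, and blowing it away again is just as painless.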
Coming back to my ongoing adventures with UChicago’s HPC cluster, I ran into the same GPU/CUDA installation issue3 as above. But docker-compose couldn’t be the ready answer, since the cluster doesn’t have any native support for it. What they do have is Singularity.
Singularity is pretty similar to Docker, to my non-sysadmin eyes at least. It claims to be better suited for HPCs, and it also lets you use Docker images from Docker Hub. You know me, I’m all about interoperability.
That said, I couldn’t easily find any translation tips from Docker to Singularity. You can pattern-match your way around it one way or another – it’s what I did about a year-and-a-half(!?) ago when I wanted to use Tiago Peixoto’s remarkably fast graph-tool package to build interaction graphs and compute their properties (don’t worry, they have a conda install now).
This time, however, I was determined to use docker-compose for my problem. Luckily, there exists singularity-compose. It took me about an hour to figure out the idiosyncrasies of singularity-compose, and I was up and running. Or so I thought. All this work ended with me getting the ominous error:
ERROR ['*my_uid* is not in the sudoers file. This incident will be reported.\n'] : return code 1
Well, I tried to get ahead of this by sending out a long request ticket to the good folks at RCC. Let’s see how that pans out.
Until then, here’s to keeping our package managers clean, and striving for the difficult combination of tool standardization, competition, and interoperability.
My singularity-compose.yml file, in case some poor soul on the UChicago network is looking for it. All standard warnings of ‘This is completely untested and volunteered’ apply.
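For structure, a minimal singularity-compose.yml follows the same declarative pattern as its Docker cousin; the instance name, recipe path, bind mount, and port below are illustrative placeholders rather than the exact values from my setup:

```yaml
# singularity-compose.yml -- a minimal sketch; names, paths, and ports are placeholders
version: "1.0"
instances:
  rapids:
    build:
      context: ./rapids          # folder holding the Singularity recipe
      recipe: Singularity        # definition file the .sif image is built from
    volumes:
      - ./data:/data             # bind-mount project data into the instance
    ports:
      - "8888:8888"              # e.g. to reach a Jupyter or Dask dashboard
```

Bringing the instances up is then a `singularity-compose up`, mirroring docker-compose.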
This forces me to reset my Linux install every year or so. But I don’t see that as a bad thing; I think of it as spring cleaning. ↩︎
I think I might be overly attached to my IDE setup. ↩︎
If I ever figure out how to resolve this, I’ll update this post.
Update: So the good folks at RCC (thank you John and Arleth!) resolved this for me by making a global conda env for RAPIDS using cudatoolkit 11.0. And it works! For posterity’s sake, I’ll make a copy of the yaml file available here. ↩︎