Tiled Hacker news on React Router

The 1979 Design Choice Breaking AI Workloads

22 points - today at 4:59 PM

Source

pocksuppet
today at 5:48 PM
Clickbait title. Summary: Their AI docker containers are slow to start up because they are 10GB layers that have to be gunzipped, and gzip doesn't support random access.
andrewvc
today at 5:46 PM
They say an ideal container system would download portions of layers on demand, however is seems far from ideal for many production workloads. What if your service starts, works fine for an hour, then needs to read one file that is only available over the network, but that endpoint is unreachable? What if it is reachable but it is very very slow?
The current system has issues with network stuff, but in a deploy process you can delineate that all to a new container deployment. Perhaps you try to deploy a new container and it fails because the network is slow or broken. Rollback is simpler there. Spreading network issues over time makes debugging much harder.
The current system is simple and resilient but clearly not fast. Trading speed for more complex failure modes for such a widely distributed technology is hardly a clear win.
The de-duplication seems like a neat win however.
MontyCarloHall
today at 5:39 PM
I ran into a similar issue years ago, where the base infrastructure occupied the lion's share of the container size, very similar to the sizes shown in the article:
```
   Ubuntu base      ~29 MB compressed
   PyTorch + CUDA   7 – 13 GB
   NVIDIA NGC       4.5+ GB compressed
```
The easy solution that worked for us was to bake all of these into a single base container, and force all production containers built within the company to use that base. We then preloaded this base container onto our cloud VM disk images, so that pulling the model container only needed to download comparatively tiny layers for model code/weights/etc. As a benefit, this forced all production containers to be up-to-date, since we regularly updated the base container which caused automatic rebuilding of all derived containers.
dsr_
today at 6:59 PM
The problem: "containers that take far too long to start".
Somehow, they don't hit upon the solution other organizations use: having software running all the time.
I suppose if you have a lousy economic model where the cost of running your software is a large percentage of your overall costs, that's a problem. I can only advise them to move to a model where they provide more value for their clients.
cosmotic
today at 6:35 PM
Why does the model data need to be stored in the image? Download the model data on container startup using whatever method works best.
alanfranz
today at 5:57 PM
Looks like they'd like something git repositories (maybe with transparent compression on top) rather than .tar.gz files. Just pull the latest head and you're done.
formerly_proven
today at 5:29 PM
The gzip compression of layers is actually optional in OCI images, but iirc not in legacy docker images. The two formats are not the same. On SSDs, the overhead for building an index for a tar is not that high, if we're primarily talking about large files (so the data/weights/cuda layers instead of system layers). The approach from the article is of course still faster, especially for running many minor variations of containers, though I am wondering how common it is for only some parts of weights changing? I would've assumed that most things you'll do with weights would change about 100% of them when viewed through 1M chunks. The lazy pulling probably has some rather dubious/interesting service latency implications.
The main annoyance imho with gzip here is that it was already slow when the format was new (unless you have Intel QAT and bothered to patch and recompile that into all the go binaries which handle these, which you do not).
notyourbiz
today at 7:00 PM
Super helpful.
aplomb1026
today at 5:32 PM
[dead]
PaulHoule
today at 5:13 PM
I remember dealing with this BS back in 2017. It was clear to me that containers were, more than anything else, a system for turning 15MB of I/O into 15GB of I/O.
So wow and new shiny though so if you told people that they would just plug their ears with their fingers.

The 1979 Design Choice Breaking AI Workloads

pocksuppet

andrewvc

jono_irwin

MontyCarloHall

jono_irwin

dsr_

za_mike157

dsr_

cosmotic

za_mike157

jono_irwin

fwip

alanfranz

formerly_proven

jono_irwin

notyourbiz

za_mike157

aplomb1026

PaulHoule

pocksuppet

PaulHoule