Yes:
* There's an enormous amount of very expensive shared state (context cache) which you do not want to duplicate when you can avoid it.
* Memory locality is crucially important for performance.
* Hardware is extremely over-subscribed.
* Hardware is extremely expensive.
These factors all make hardware or even traditional memory-space (hypervisor/VM/hardware assisted virtualization) isolation a non-starter for most workloads and customers, which forces all isolation to the software layer. This already makes things way harder than they are in commodity SaaS.
Moving beyond that, the tools, frameworks, and hardware which the system runs on (GPU) wasn't designed for task isolation and building this isolation is even moreso an emergent research field than it is in x86 CPU hardware-sharing (which has required a huge amount of effort over the past 30+ years to get where we are today).
And, the ratio of usage/sensitivity to maturity is also just poor overall; these are young companies with rapid development and enormous delivery pressure under incredible customer workload requirements, too.
I can't tell if the original post is a real issue or not, but I'm surprised there aren't more like this overall; the whole thing really is a house of cards in this sense.
jstummbillig
today at 5:10 PM
> which forces all isolation to the software layer. This already makes things way harder than they are in commodity SaaS.
Is this not what happens in most SaaS? Isolation at the software layer? I understand there are special agreements, but they seem to be mostly that – no?
> the ratio of usage/sensitivity to maturity is also just poor overall; these are young companies with rapid development and enormous delivery pressure under incredible customer workload requirements, too.
Mh. The talent density in these companies is apparently quite exceptional. Things like customer data separation is something that is obvious and top of mind. I don't see why they would not hire the best to implement these relatively boring/solved things correctly at an architectural level.
> Is this not what happens in most SaaS?
I think it's fairly popular to try to do more logical isolation in SaaS now, especially with VM-scheduling-as-a-service becoming more popular. For example, I did security architecture at a company who did relatively simple financial processing; we worked to move to a model where customer documents were encrypted using a tenant key which we'd then wrap in both a service key and a login key; users could only get the login key stapled to their session by authenticating against that account, and the processing jobs ran on a cloud vendor's logical isolation. So the user needed a login key, the service needed the attested service key, and the job ran in what amounted to a mini-VM, avoiding issues like "whoops we sent the wrong document ID and the backend gave it back to us" or "whoops, we routed the request to the wrong tenant backend!" This level of isolation would be really hard to achieve in an LLM vendor context.
> I don't see why they would not hire the best to implement these relatively boring/solved things correctly at an architectural level.
I think a lot of these things develop over time; obviously hiring people who have done them before helps, but it's hard. Even the people with strong experience often only know little slices. And unfortunately, every system operating at these scales has emergent behavior which can become really challenging at scale; mistakes like "we used hash(id) as a key in a memory cache without a collision list, and it collided" which would simply never affect most startups become more and more frequent at scale. High rate of change makes it hard to suss these mistakes out and root-cause them, too; "a customer gave us a log where we swapped X and Y" is hard to bisect when you're doing 500 code deploys a day.