
Why are we accepting silent data corruption in Vector Search? (x86 vs. ARM)

5 points - yesterday at 4:57 PM


I spent the last week chasing a "ghost" in a RAG pipeline and I think I've found something that the industry is collectively ignoring.

We assume that if we generate an embedding and store it, the "memory" is stable. But I found that f32 distance calculations (the backbone of FAISS, Chroma, etc.) act as a "Forking Path."

If you run the exact same insertion sequence on an x86 server (AVX-512) and an ARM MacBook (NEON), the memory states diverge at the bit level. It's not just "floating-point noise"; it's a deterministic drift caused by FMA (Fused Multiply-Add) instruction differences.
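The root cause is that floating-point addition is not associative, so any two hardware paths that fuse or reorder operations differently (FMA vs. separate multiply-then-add, different SIMD reduction trees) can legitimately produce different bits. A minimal stdlib-only sketch of the non-associativity itself (not the FMA path, which needs hardware intrinsics):

```python
# Floating-point addition is not associative: grouping the same
# three operands differently yields different bit patterns. This
# is the same effect that makes AVX-512 and NEON reduction orders
# diverge on identical input vectors.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one summation order
right = a + (b + c)  # another summation order

print(left == right)          # False
print(left.hex(), right.hex())  # two distinct bit patterns
```

Both results are "correct" under IEEE 754; they are simply different roundings of the same mathematical sum, which is exactly why a distance computation is not a stable function of its inputs across architectures.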

I wrote a script to inspect the raw bits of a sentence-transformers vector across my M3 Max and a Xeon instance. Semantic similarity was 0.9999, but the raw storage bytes were different.
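For anyone who wants to reproduce this kind of check, the bit-level comparison needs nothing beyond the stdlib `struct` module; the actual embedding values would come from sentence-transformers, so the two values below are just hypothetical stand-ins for the drifted pair:

```python
import struct

def f32_bits(x: float) -> str:
    """Hex dump of the IEEE-754 binary32 representation of x."""
    return struct.pack("<f", x).hex()

# Hypothetical pair: nearly identical values (same to ~7 decimal
# digits) that still land on different f32 bit patterns.
v_xeon = 0.1234567
v_m3   = 0.1234568

print(f32_bits(v_xeon), f32_bits(v_m3))
print(f32_bits(v_xeon) == f32_bits(v_m3))
```

Run element-wise over two dumped index files, this is enough to show that "same vector" at the cosine-similarity level is not the same vector at the storage level.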

For a regulated AI agent (Finance/Healthcare), this is a nightmare. It means your audit trail is technically hallucinating depending on which server processed the query. You cannot have "Write Once, Run Anywhere" index portability.

The Fix (Going no_std)

I got so frustrated that I bypassed the standard libraries and wrote a custom kernel (Valori) in Rust using Q16.16 fixed-point arithmetic. By strictly enforcing integer associativity, I got 100% bit-identical snapshots across x86, ARM, and WASM.
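To make the idea concrete (this is an illustrative Python sketch of Q16.16 in general, not the actual Valori Rust code): each value is stored as an integer equal to value × 2^16, and because integer addition is associative, a dot product reduces to the same bits in any summation order on any platform.

```python
Q = 16  # fractional bits: Q16.16 fixed point

def to_q16(x: float) -> int:
    # Round-to-nearest quantization into Q16.16. This is the only
    # lossy step; everything after it is exact integer arithmetic.
    return round(x * (1 << Q))

def q16_dot(a: list[int], b: list[int]) -> int:
    # Each product of two Q16.16 values is Q32.32; summing in
    # integers is associative, so the result is bit-identical
    # regardless of summation order or hardware. Shift back to Q16.16.
    return sum(x * y for x, y in zip(a, b)) >> Q

vec = [0.25, -0.5, 0.125]
qv = [to_q16(v) for v in vec]
print(q16_dot(qv, qv))  # 21504, i.e. 0.328125 in Q16.16
```

The recall cost comes entirely from the quantization step; once the vectors are in the integer domain, reordering, vectorizing, or moving the computation to WASM cannot change the answer.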

Recall Loss: Negligible (99.8% Recall@10 vs standard f32).

Performance: < 500µs latency (comparable to unoptimized f32).

The Ask / Paper

I've written a formal preprint analyzing this "Forking Path" problem and the Q16.16 proofs. I am currently trying to submit it to arXiv (Distributed Computing / cs.DC) but I'm stuck in the endorsement queue.

If you want to tear apart my Rust code: https://github.com/varshith-Git/Valori-Kernel

If you are an arXiv endorser for cs.DC (or cs.DB) and want to see the draft, I'd love to send it to you.

Am I the only one worried about building "reliable" agents on such shaky numerical foundations?

  • codingdave

    today at 1:56 PM

    > We assume that if we generate an embedding and store it, the "memory" is stable.

    Why do you assume that? In my experience, the "memory" is never stable. You seem to have higher expectations of reliability than would be reasonable.

If you have proven that unreliability, that proof is actually interesting. But it seems less like a bug and more like an observation of how things work.

    • chrisjj

      today at 11:02 AM

      > Am I the only one worried about building "reliable" agents on such shaky numerical foundations?

      You might be the only one expecting a reliable "AI" agent period.

      • varshith17

        yesterday at 4:58 PM

        Github repo: https://github.com/varshith-Git/Valori-Kernel