I remember meeting Bram Cohen (of BitTorrent fame!) around 15 years ago. That was about when I started building web-based distributed collaborative systems, beginning with Qbix.com, and later spun off a company to build blockchain-based smart contracts through Intercoin.org, etc.
Anyway, I wanted to suggest a radical idea based on my experience:
Merges are the wrong primitive.
What organizations (whether centralized or distributed projects) might actually need is:
1) Graph Database - of Streams and Relations
2) Governance per Stream - e.g. ACLs
A code base should be automatically turned into a graph database (functions calling other functions, accessing configs etc) so we know exactly what affects what.
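To make the "code base as a graph" idea concrete, here is a minimal sketch using Python's standard `ast` module. The sample source and the function names in it (`load_config`, `fetch`, `main`) are made up for illustration, and the sketch only tracks direct calls between top-level functions; a real system would also need to handle methods, imports, config access and so on.

```python
import ast
from collections import defaultdict

# Hypothetical sample code base, invented for this example.
SOURCE = '''
def load_config():
    return {"retries": 3}

def fetch(url):
    cfg = load_config()
    return url, cfg["retries"]

def main():
    return fetch("https://example.com")
'''

def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function name to the plain names it calls."""
    tree = ast.parse(source)
    graph = defaultdict(set)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name]  # create the node even if it calls nothing
            for child in ast.walk(node):
                # Only simple calls like f(...); attribute calls are ignored here.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    graph[node.name].add(child.func.id)
    return dict(graph)

call_graph = build_call_graph(SOURCE)
```

Each key is a node and each set is its outgoing edges, so "what affects what" becomes a graph query instead of a guess based on file layout.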
How "near" two changes are to each other, as discussed in the article, is not necessarily what leads to conflicts. Conflicts actually come from conflicting graph topology and from changes propagating through the graph.
People should be able to clone some stream (with permission) and each stream (node in the graph) can be versioned.
Forking should happen into workspaces. Workspaces can be GOVERNED. Publishing some version of a stream just means relating it to your stream. Some people might publish one version, others another.
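As a rough sketch of "publishing is just relating", assume each stream keeps an append-only list of versions and a workspace is itself a stream that points at specific versions of other streams. The `Stream` class, its method names and the example labels below are all hypothetical, not an existing API.

```python
from dataclasses import dataclass, field

@dataclass
class Stream:
    """Hypothetical versioned stream; the list index is the version number."""
    name: str
    versions: list[str] = field(default_factory=list)
    # label -> (related stream, pinned version)
    relations: dict[str, tuple["Stream", int]] = field(default_factory=dict)

    def commit(self, content: str) -> int:
        """Append a new version and return its version number."""
        self.versions.append(content)
        return len(self.versions) - 1

    def publish(self, label: str, other: "Stream", version: int) -> None:
        """Relate this stream to one specific version of another stream."""
        if not 0 <= version < len(other.versions):
            raise ValueError("unknown version")
        self.relations[label] = (other, version)

    def resolve(self, label: str) -> str:
        """Return the content of whichever version this stream is pinned to."""
        other, version = self.relations[label]
        return other.versions[version]

parser = Stream("parser")
v0 = parser.commit("def parse(s): ...")
v1 = parser.commit("def parse(s, strict=False): ...")

team_a = Stream("team-a-workspace")
team_b = Stream("team-b-workspace")
team_a.publish("parser", parser, v0)  # team A stays on the old version
team_b.publish("parser", parser, v1)  # team B has moved to the new one
```

In this model a rebase is just calling `publish` again with a newer version number, and each workspace decides when to do that.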
Rebasing is a first-class primitive, rather than a form of merging. A merge is an extremely privileged operation from a governance point of view, where some actor can just "push" (or "merge") thousands of commits. The more commits, the more chance of conflicts.
The same problem occurs with CRDTs. I like CRDTs, but reconciling a big netsplit will result in merging strategies that create lots of unintended semantic side effects.
Instead, what if each individual stream were guarded by policies, changes were rate-limited, and people and AIs rejected most proposals, only occasionally accepting one with M-of-N sign-offs?
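Here is one way such a per-stream guard could look, as a sketch only. The `StreamPolicy` class, its fields and the signer names are assumptions made for this example; time is passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field

@dataclass
class StreamPolicy:
    """Hypothetical guard: M-of-N sign-offs plus a sliding-window rate limit."""
    signers: frozenset[str]   # the N parties allowed to sign off
    threshold: int            # M, the number of sign-offs required
    max_changes: int          # accepted changes allowed per window
    window_seconds: float
    _accepted_at: list[float] = field(default_factory=list)

    def try_accept(self, approvals: set[str], now: float) -> bool:
        # Approvals from unknown parties do not count.
        valid = approvals & self.signers
        if len(valid) < self.threshold:
            return False
        # Keep only acceptances that still fall inside the window.
        recent = [t for t in self._accepted_at if now - t < self.window_seconds]
        if len(recent) >= self.max_changes:
            return False
        recent.append(now)
        self._accepted_at = recent
        return True

policy = StreamPolicy(
    signers=frozenset({"alice", "bob", "carol"}),
    threshold=2,
    max_changes=1,
    window_seconds=60.0,
)
results = [
    policy.try_accept({"alice"}, now=0.0),                     # too few sign-offs
    policy.try_accept({"alice", "bob"}, now=1.0),              # accepted
    policy.try_accept({"bob", "carol"}, now=10.0),             # rate-limited
    policy.try_accept({"bob", "carol", "mallory"}, now=100.0), # window has passed
    policy.try_accept({"alice", "mallory"}, now=200.0),        # mallory is not a signer
]
```

The point is that rejection is the default: a proposal has to clear both the sign-off threshold and the rate limit before it touches the stream.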
Think of ChatGPT chats that are used to modify evolving artifacts, with people and bots working together. The artifacts are streams. And yes, this can even be done for codebases. It isn't about how "near" things are in a file. Rather, it is about whether there is a conflict on the graph. When I modify a specific function or variable, the system knows everything downstream that calls or uses it. This is true for many other things besides coding too. We can also have AI workflows running 24/7 to try out experiments as a swarm in sandboxes, generate tests, and commit the results that pass. But ultimately, each organization determines whether or not it wants to rebase its stream relations to the next version of something.
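A small sketch of what "conflict on a graph" could mean in practice. It assumes a call graph is already available as a dict of caller to callees, and it uses one possible definition of conflict: two edits conflict when one edited function is in the set of things affected by the other. The graph and function names are invented for the example.

```python
from collections import deque

# Hypothetical call graph: caller -> functions it calls.
CALLS = {
    "main": {"fetch", "render"},
    "fetch": {"load_config"},
    "render": {"format_date"},
    "load_config": set(),
    "format_date": set(),
}

def affected_by(calls: dict[str, set[str]], changed: str) -> set[str]:
    """Return `changed` plus every function that calls it, directly or transitively."""
    # Invert the graph so we can walk from a callee to its callers.
    callers: dict[str, set[str]] = {name: set() for name in calls}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    seen = {changed}
    queue = deque([changed])
    while queue:
        for caller in callers.get(queue.popleft(), ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen

def changes_conflict(calls: dict[str, set[str]], a: str, b: str) -> bool:
    """Assumed definition: edits conflict if either one sits in the other's impact set."""
    return b in affected_by(calls, a) or a in affected_by(calls, b)

impact = affected_by(CALLS, "load_config")
related = changes_conflict(CALLS, "load_config", "fetch")
unrelated = changes_conflict(CALLS, "load_config", "format_date")
```

Under this definition, edits to `load_config` and `fetch` are flagged even if they live in different files, while edits to `load_config` and `format_date` are not, no matter how close together they sit in the text.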
That is what I'm building now with https://safebots.ai
PS: if anyone is interested in this kind of stuff, feel free to schedule a Calendly meeting with me on that site. I just got started recently, but I'm dogfooding my own setup and using AI swarms, which accelerates the work tremendously.