Openrsync: An implementation of rsync, by the OpenBSD team
189 points - today at 10:51 AM
I've been using openrsync here and there since it was announced and it's definitely improved over time. I'm looking forward to when I can use it exclusively.
The one place in my usage where it doesn't match Samba rsync is with the following:
openrsync --rsync-path=openrsync -av -e ssh /etc/services example.com:/tmp/services
I would expect openrsync to create a remote file /tmp/services, but instead it creates /tmp/services/services.
Normal directory mirroring as in -av -e ssh /path/to/src/ example.com:/path/to/dst/ works as it does with Samba rsync.
> I would expect openrsync to create a remote file /tmp/services, but instead it creates /tmp/services/services.
As someone who has also suffered uncountable years of abuse from rsync, I understand the impulse, but I think it makes a lot more sense (and is a safer default) to create a second "services".
If we have a chance to change rsync defaults to something less insane and save future generations from this mess I think we should.
Was there already a /tmp/services directory on the dest?
One of the biggest points of confusion with rsync is how directories and trailing slashes are handled.
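To make the trailing-slash rules concrete, here is a minimal Python sketch of how I understand Samba rsync maps a single source onto a destination path. It is a toy model, not rsync's actual code: the function name `rsync_dest` and the `src_is_dir`/`dst_is_dir` flags are made up for illustration, and it ignores multiple sources, `--relative`, and every other option.

```python
import posixpath


def rsync_dest(src: str, dst: str, src_is_dir: bool, dst_is_dir: bool) -> str:
    """Toy model: where does a single rsync source end up on the receiver?

    src_is_dir / dst_is_dir are hypothetical flags standing in for what
    rsync would learn by stat()ing the paths on each side.
    """
    if src_is_dir and src.endswith("/"):
        # Trailing slash on a source directory: copy its *contents* into dst.
        return dst.rstrip("/") or "/"
    name = posixpath.basename(src.rstrip("/"))
    if src_is_dir or dst_is_dir or dst.endswith("/"):
        # The source keeps its own name underneath the destination.
        return posixpath.join(dst, name)
    # Single file, and the destination is not a directory: dst is the new file name.
    return dst


if __name__ == "__main__":
    print(rsync_dest("/etc/services", "/tmp/services", False, False))
    print(rsync_dest("/etc/services", "/tmp/services", False, True))
    print(rsync_dest("/path/to/src/", "/path/to/dst/", True, True))
```

Under this model, `/tmp/services/services` only shows up when `/tmp/services` already exists as a directory on the destination, which is why that question matters for the report above.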
There is also a (stub) web page:
https://www.openrsync.org/
The problem with this fragmentation of rsync is that Apple and Android will prefer it, but the Linux and greater GPL world will adhere to the original implementation due to inertia. Power users will just have to know the quirks of each version.
The only way to stop this is for the original author(s) to release this under a BSD license.
Edit: For those assuming equivalent/identical behavior, study these words carefully: "accepts only a subset of rsync's command-line arguments."
It's really no different than every other BSD utility (and SysV utility, if you're running one of those) being different than the GNU ones. We've coped with it for fifty years at this point.
Basically like GNU Tar/CPIO and BSD Tar/CPIO. I've largely standardised towards using the BSD variant everywhere (especially since now even Windows ships it and it handles lots of other archive formats using the `tar` command), but it's always a pain to install it everywhere.
thefilmore
today at 3:41 PM
This is the version used in macOS since 15.0.
mrdomino-
today at 4:53 PM
Was it 15.0? I seem to recall it coming in one of the minor point releases in the 15.x line - and I remember it breaking some scripts mysteriously.
EDIT: ah, fun: they did include it in 15.0, but they decided to save the breaking change that removed backwards compatibility for 15.4. https://apple.stackexchange.com/a/479297
The actual work of porting is matching the security features provided by OpenBSD's pledge(2) and unveil(2). These are critical elements to the functionality of the system. Without them, your system accepts arbitrary data from the public network.
https://justine.lol/pledge/
I am not seeing pledge on Alpine Linux in edge. Have people been testing Pledge on Linux? Did I perhaps misunderstand the risk of using Openrsync without pledge? Or is this article just for OpenBSD users?
saidnooneever
today at 4:40 PM
Linux has no such features as pledge or unveil, nor Capsicum. It has cgroups, namespaces, and a mess of other things you need to combine to try to do similar things. (It was built iteratively, as many interacting systems combined to form 'sandboxing' or isolation/limiting of capabilities, rather than as a single isolation concept with specific system calls and kernel paths to enable it.)
There might be newer stuff in Linux land now. I see comments about Landlock, but I assume those will build on the Linux primitives rather than whole new ones. Total assumption there, but it would seem logical to reuse rather than make new.
Part of what they likely mean by 'mess' is that it's all over the place: many different ways to try to lock things down, and it's hard to pick what is best without thoroughly diving into the different subsystems (as opposed to just having one or two relatively simple system calls).
From above your quote:
> The only officially-supported operating system is OpenBSD, as this has considerable security features.
And below your quote:
> This is possible (I think?) with FreeBSD's Capsicum, but Linux's security facilities are a mess, and will take an expert hand to properly secure.
It is portable in the sense that it compiles and runs, not in the sense that it has the same security features.
I'd love to see pledge/unveil on (upstream) Linux - but I'm not holding my breath.
papercrane
today at 3:43 PM
> I'd love to see pledge/unveil on (upstream) Linux - but I'm not holding my breath
There is Landlock now, I believe it would be possible to implement unveil and pledge on top of that.
Ok that makes more sense, thank you.
justinsaccount
today at 4:17 PM
that quote seems to be a bit of an oversimplification to the point of being completely wrong.
> Without them, your system accepts arbitrary data from the public network.
Neither of these features change if you are accepting arbitrary data from the public network. They limit what an exploited process can do. It's explained properly in the 'Security' section, so I'm not sure where this came from.
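A toy Python model may help show what "limit what an exploited process can do" means for unveil. This is not the OpenBSD API: the real unveil(2) is a system call enforced by the kernel, and the class name `ToyUnveil` and its methods are invented purely for illustration. The sketch only models the idea that, once a process has unveiled some paths, everything else on the filesystem is off limits no matter what data came in over the network.

```python
import posixpath


class ToyUnveil:
    """Hypothetical in-process model of unveil-style path restrictions."""

    def __init__(self):
        self._rules = {}  # normalized path -> set of permission letters

    def unveil(self, path: str, perms: str) -> None:
        # Record that `path` (and everything below it) is visible with `perms`.
        self._rules[posixpath.normpath(path)] = set(perms)

    def allowed(self, path: str, perm: str) -> bool:
        if not self._rules:
            # Assumption mirroring the man page: before the first unveil
            # call, the whole filesystem is still visible.
            return True
        path = posixpath.normpath(path)
        best = None
        for root in self._rules:
            inside = path == root or path.startswith(root.rstrip("/") + "/")
            if inside and (best is None or len(root) > len(best)):
                best = root  # the most specific unveiled prefix wins
        return best is not None and perm in self._rules[best]


if __name__ == "__main__":
    box = ToyUnveil()
    box.unveil("/srv/mirror", "rwc")
    print(box.allowed("/srv/mirror/pkg/file", "w"))
    print(box.allowed("/etc/passwd", "r"))
```

The process still reads whatever bytes the peer sends; the restriction is on which files a compromised receiver could then touch.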
> that quote seems to be a bit of an oversimplification to the point of being completely wrong.
Under Portability [1] I don't have access to update that repo. I deleted my accounts when Microsoft took over.
[1] - https://github.com/kristapsdz/openrsync
> rsync has specific running modes for the super-user. It also pumps arbitrary data from the network onto your file-system. openrsync is about 10 000 lines of C code: do you trust me not to make mistakes?
No, but that's why almost nobody runs it outside of strict trust boundaries. This security section would make more sense if rsync was like curl, which routinely deals with hostile counterparties. If the other side of your rsync is hostile, you probably have bigger problems!
(I'm not an rpki person so I don't know if there's some part of that problem domain that changes this equation. I'm not dunking on the project, just saying this snagged me in the README).
> No, but that's why almost nobody runs it outside of strict trust boundaries. This security section would make more sense if rsync was like curl, which routinely deals with hostile counterparties. If the other side of your rsync is hostile, you probably have bigger problems!
I disagree. While rsync is most often used to transfer data between "friendly" systems, it's inherently crossing a security boundary. It's important to make sure that an attacker can't leverage it to transform the breach of one system into the breach of multiple systems.
delusional
today at 3:10 PM
> almost nobody runs it outside of strict trust boundaries.
I guess you can define "strict" however you want, but from what I saw ~10 years ago, most linux distros handled mirroring with rsync. That's a lot of usage in a pretty core part of the foundational open source ecosystem.
Many distros use rsync for that but also support unencrypted HTTP.
They’re layering on checksums and signing such that they mostly don’t think about the trustworthiness of mirrors or the networks between them.
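As a rough illustration of the checksum layer, here is a minimal Python sketch that checks a file fetched from an untrusted mirror against a manifest of SHA-256 digests. The function names and the manifest layout (a dict of file name to hex digest) are assumptions for the example, not any distro's actual format, and verifying the signature on the manifest itself is out of scope here.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def verify_against_manifest(name: str, data: bytes, manifest: dict) -> bool:
    """Check mirrored file contents against a trusted digest manifest.

    `manifest` is assumed to have been authenticated already (for example
    by a signature check that this sketch does not implement).
    """
    expected = manifest.get(name)
    if expected is None:
        return False  # files the manifest does not list are rejected
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sha256_hex(data), expected.lower())


if __name__ == "__main__":
    manifest = {"pkg.tar": sha256_hex(b"package contents")}
    print(verify_against_manifest("pkg.tar", b"package contents", manifest))
    print(verify_against_manifest("pkg.tar", b"tampered contents", manifest))
```

With a check like this on the client, the mirror and the network in between only need to deliver bytes; they are not trusted to deliver the right ones.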
skeledrew
today at 1:16 PM
This attempt to avoid things that use AI is increasingly looking like some weird kind of reverse whack-a-mole where each targeted hole becomes radioactive after. Just grabbing some popcorn to watch.
ranger_danger
today at 1:20 PM
I feel bad for people with the real name Claude.
Yeah, and we thought the most unlucky people were the ones named Alexa.
SoftTalker
today at 4:21 PM
Or those named Karen...
janussunaj
today at 4:28 PM
Rajesh here
formerly_proven
today at 1:45 PM
It took me quite some time to realize what an utterly presumptuous product name Claude Code actually is, but only because Shannon is rarely mentioned with his first name. It's golden calf levels of hubris, even more so if you consider how incapable it was on release. It's like renaming calc.exe Einstein. Incredibly poor taste, but entirely in line with AI tech bro mentality.
kstrauser
today at 4:03 PM
That linkage never occurred to me, or, I suspect, them. Claude used to be a reasonably common name. I have an uncle Claude. Why do you believe they named it after Shannon in particular?
homebrewer
today at 4:17 PM
It seems to be a widely repeated "fact" which can't be traced to anything particularly authoritative:
https://archive.is/pt5fQ
https://britannica.com/topic/Claude-AI
Looks like the 2023 NYT article started it, and it uses this as reference:
> depending on which employee you ask, was either a nerdy tribute to the 20th-century mathematician Claude Shannon
Personally I always associated it with the silent protagonist from GTA3.
https://gta.fandom.com/wiki/Claude
Yeah, especially since most Americans don't know how to properly pronounce Claude.
Hmm, Claude Shannon was an American (the model is ostensibly named after him), so maybe how he pronounced it would be the correct pronunciation.
That said, every language on earth will adapt foreign words into its phonology. The alternative would be to adopt the phonology of every language that loaned a word into your language.
No-slop version for the sane of us
Context:
https://mastodon.gamedev.place/@JeremiahFieldhaven/116654345...
ranger_danger
today at 1:25 PM
https://social.treehouse.systems/@thesamesam/116662824873341...
+1 to this. Other than people's reflexive anger or fear about AI coming for their code, I don't see anything to suggest that these are bugs that are due to the inclusion of AI vs bugs in a program with a bunch of complex interop with the filesystem and network.
In any case, it's important to identify projects that are beginning to actively vibecode and clearly express position on this issue on various platforms so that authors and maintainers receive feedback.
Even if this particular bug was not written by LLM in this particular case, it's not a fact that the release does not include other regressions and that subsequent vibecoded versions will not include them & new ones.
Do not go harassing developers because you think they are doing it wrong. If you can do better and don't want to actually contribute to the upstream, you are always free to fork it.
skeledrew
today at 2:14 PM
> it's not a fact that the release does not include other regressions and [...]
Are you listening to yourself? The same exact thing also has applied, applies and will continue to apply to manually written code, in perpetuity. There's nothing new under the sun here; regressions happen when there's change, and the only way to mitigate is to have healthy feedback loops.
No. It's not important. It's actually pretty shitty to go around looking for projects and then telling the maintainers you disagree with how they develop.
applfanboysbgon
today at 3:49 PM
Friendly reminder that volunteer maintainers owe you literally not a single goddamn thing. I absolutely want no AI slop in my commercial products that I pay money for, but your feedback is not important to people you are not paying to develop software for you. They gave away not only their software but the source code for free; if you have a problem with it, fork it. Which is something you can do with their generous allowance, and that is an allowance any maintainer can instead choose to not bother themselves with if publishing their code for free leads themselves to dealing with entitled internet commenters harassing them with complaints.
> Friendly reminder that volunteer maintainers owe you literally not a single goddamn thing.
Technically true. However, I also do not owe them my silence.
I have not checked with OpenBSD 7.9, but as of 7.8 it did not support --exclude or -z. But outside of that openrsync works great.
(EDIT: --exclude is now supported on 7.9. Not sure when that was added, nice!)
But seems avoiding "slop" is getting very hard. I saw postfix now has a bit of AI code in it.
https://mastodon.sdf.org/@mrmasterkeyboard@mastodon.social/1...
nineteen999
today at 1:41 PM
Somewhat ironic: Postfix has a record of no root/RCE in the default install, whereas opensmtpd doesn't (CVE-2020-7247). Time will tell if it stays that way.
Where do you see that about Postfix? I followed the links and the only thing I see is that AI is being used to find bugs, not write code.
>Claude assisted code found in external/ibm-public/postfix/dist...
That is from the original post in the thread. Is that really due to an LLM? I do not know since I avoid AI as much as I can.
But the person also posted this link too:
https://github.com/NetBSD/src/commit/f764ddf4062e855f73fe2e3...
Right, I read all that and I didn't see anything to indicate that AI is being used to write code - just one person's unsubstantiated claim.
I did not look at details until I saw your post, but I tend to agree with you on this point.
But that is the odd thing, how to tell for sure if an LLM was used :)
Exclude is very commonly used in automation jobs to avoid duplicating big git repos and other big files. I think that would be a show stopper for a number of people.
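For the automation case, a small Python sketch of how such a job might assemble its command line. The helper name `build_rsync_argv`, the paths, and the host name in the usage are hypothetical; `-a`, `-v`, and `--exclude` are the only rsync options used, and whether a given openrsync build accepts them should be checked against its man page.

```python
import shlex


def build_rsync_argv(src, dst, excludes=(), program="rsync"):
    """Build an argv list for a mirror job that skips some paths.

    `program` can be swapped for "openrsync" on systems that ship it.
    """
    argv = [program, "-av"]
    for pattern in excludes:
        # One --exclude option per pattern, e.g. ".git/" to skip repositories.
        argv.append(f"--exclude={pattern}")
    argv += [src, dst]
    return argv


if __name__ == "__main__":
    argv = build_rsync_argv("/home/me/", "backup.example.com:/srv/me/", [".git/", "*.iso"])
    # Print a copy-pasteable command line instead of running it.
    print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would execute it without going through a shell, so patterns like `*.iso` reach rsync unexpanded.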
I just tried openrsync(1) on OpenBSD 7.9, --exclude now works.
I have not tried using exclude in openrsync in a while, but I can see it now works on OpenBSD 7.9!
What's the deal with the name? Openrsync implies to me that it's an open source alternative to a closed source program. But the original Rsync is GPL? Is this just the pushover license making it "more open"?
OpenBSD folks would consider the GPL to be less open due to the requirement to apply the GPL to any derivative works.
ranger_danger
today at 1:32 PM
And GNU folks would say the GPL is actually the more open choice because it forces the project to stay open.
Two different ways of thinking about it I guess... it's nice to have choices and I don't think one is more or less "correct", more a matter of opinion/taste I guess.
It kind of reminds me of the equality of opportunity people versus the equality of outcome people. One sets the starting conditions for developers, the other the ending conditions for users.
Since developers are a subset of users, it's actually possible to calculate which is more open.
> more open choice because it forces the project
A true morality must be based on consent, not coercion. Humanity may not be there yet, and therein lies the argument for force (and thus copyleft); but the ultimate goal should always be to reduce its necessity.
It’s not coercion. You’re free to not use it, or alternatively do what these folks did, write your own. Coercion would be forcing people to use it through some mechanism, which clearly isn’t possible with GPL.
skeledrew
today at 2:26 PM
I see this, and the spiritual example that immediately comes to mind is that which is labeled as "crime". Would it be more moral that a murderer must first consent to being judged and sentenced, or that there is a system which automatically comes into play to hopefully deter but also punish it when it happens?
jcelerier
today at 1:51 PM
Allowing closed-source to exist is always the less moral choice for many reasons (one example being ecological sustainability)
kennywinker
today at 1:46 PM
Is this not the paradox of tolerance restated in different terms?
BSD license is unrestricted, it tolerates taking open source and closing it, thus always being at risk of things closing down.
GPL license doesn’t tolerate taking from open source and closing it, thus ensuring things stay open.
ranger_danger
today at 5:26 PM
The paradox clears itself up if you look at what tolerance actually is. It's simply not interfering with people's agency over themselves. Given that your right to self-agency doesn't entitle you to restrict others' self-agency, behavior that does try restricting others' agency is automatically not included in "tolerance."
The BSD license is why we have Valkey and not a purely closed-source Redis. It would have been much easier to perform the rugpull if Redis had initially been GPLed.
kennywinker
today at 2:16 PM
On top of badreligion42’s point, that both licenses allow forking just as easily - don’t you have the rugpull part backwards?
Afaik BSD licensed stuff can be re-licensed under any more closed licenses at any time, whereas to re-license GPL, you need consent from every single contributor.
But i’m not familiar with the redis-valkey story so, maybe there is some nuance i am missing?
Redis started off as Free Software, but was switched to a source available license in version 7.4. The community promptly forked to Valkey, which is still under the BSD license. Since then, Redis shifted to AGPL 3, with contributor agreements, to try to ensure that they're the only ones who can attempt to commercialize Redis.
AGPL makes commercializing harder only for people who fear the AGPL because they want to keep stuff for themselves. there is no problem commercializing it if you don't mind sharing all your connected code. the only benefit redis has is that they can integrate non-free code in their hosting service, while the rest of us could not. since it is their work, i think it is reasonable that they have an advantage. it does not reduce my freedom as a user. it only hinders AWS and other big players from crushing redis.
badreligion42
today at 2:06 PM
And how exactly did the BSD license make creating Valkey easier? GPL and BSD licenses both have the source in the open. Anyone creating a fork, can easily do so for either BSD or GPL licensed projects. Since Redis is a database, which the user won't be using a binary of, even using a fork of a supposedly GPL-licensed Redis would not require you to share your modifications with your user, same as BSD.
The BSD license made forking Valkey easier because it ensures that everyone has equal footing. The GPL, especially with contributor license agreements and the like, makes it much easier for a single party to control the direction of the product. For another example of this happening, look at MongoDB. It started out under the AGPL, but was rugpulled to a non-free license.
It feels like your actual beef here is with CLAs, which often are designed to allow the current maintainers to relicense.
CLAs are not an attribute of the GPL. They're an agreement that can be applied to contributions to any codebase with any license.
Mongo was already a centralized project. Technically open source agpl but I don’t remember it having a large developer community or really many contributions from outside mongo. When the rug pull happened I think simply most people didn’t care or moved on to equal (or better) alternatives. It’s not beloved software like Redis is.
> The BSD license made forking Valkey easier because it ensures that everyone has equal footing
equal footing on the license is what allowed AWS to crush the original creators of the products they host.
it's a trade off.
the AGPL does not prevent a hosting service. it only prevents creating non-free addons. i see no problem with that. see also my other comment
ranger_danger
today at 1:28 PM
Many projects closely associated with OpenBSD start with "open"... openssh, openbgpd, openntpd, opensmtpd etc.
SoftTalker
today at 4:25 PM
Notable exception, OpenSSL already had the Open prefix so the OpenBSD project is called LibreSSL.
hamdingers
today at 1:43 PM
Not many are reimplementations of existing, much more popular, already open source projects.
throw0101a
today at 1:58 PM
OpenSSH was a 'reaction' to the original SSH(.com) code getting closed source:
> OpenSSH originated in 1999 as a fork of Björn Grönvall's OSSH, which derived from Tatu Ylönen's original SSH 1.2.12 release, the last version distributed under a license permitting open-source redistribution before Ylönen's subsequent software became proprietary under SSH Communications Security.[4]
* https://en.wikipedia.org/wiki/OpenSSH
It was probably the second thing with the "Open" prefix by this group of developers, OpenBSD itself being the first. They simply ran with the naming convention. OpenBGPD/OpenOSPFD were developed as alternatives to Quagga (GPL).
hamdingers
today at 3:58 PM
Is rsync going closed source? If not, how is that the same thing?
kstrauser
today at 4:06 PM
No. The name only means it’s made by the OpenBSD team, nothing more. If they made their own Python port, it’d be called OpenPython, even though the original is FOSS.
hamdingers
today at 4:34 PM
So is OpenSUSE made by the BSD team? OpenOffice? OpenShift? OpenCV? OpenAI?
It is not reasonable to claim this prefix unambiguously refers to the OpenBSD team. I do not understand why so many in this thread are pretending this isn't a confusing choice.
Nobody ever claimed that “Open” is a prefix used unambiguously by only one group of people ever.
In fact, your insistence that “Open” can only be used by projects that are replacing proprietary software is itself very odd.
OpenBSD itself has had its name for thirty years, and is not named for being an “open source” implementation of a proprietary OS.
hamdingers
today at 5:03 PM
The person I replied to said the "open" prefix means it's made by the OpenBSD team and I am responding to that.
Do not invent arguments that I did not make. I have only said that naming it openrsync when rsync already exists and is "open" in the general sense is confusing.
I find the negative reactions to this observation very confusing, especially yours, but I see that you're an OpenBSD developer so that explains your bias.
OpenBSD didn’t get its name from NetBSD going closed source.
Which aren't? It seems all (or most) are.