Careless big-time users are treating FOSS repos like content delivery networks

I'm at the Linux Foundation Members Summit, and Sonatype's CTO Brian Fox introduced me to a new open source problem. I wouldn't have thought that was possible, but here I am.

Fox, who also oversees Apache Maven, a popular Java build tool, explained that its repository site is at risk of being overwhelmed by constant Git pulls. The team has dug into this and found that 82 percent of the demand comes from less than 1 percent of IPs. Digging deeper, they discovered that many companies are using open source repositories as if they were content delivery networks (CDNs). So, for example, a single company might download the same code hundreds of thousands of times in a day, and the next day, and the next. This is unsustainable.

So Maven and other open source repositories are considering introducing a tiered payment system. Lone developers and small groups will still be able to download the code for free, but the hogs will have to pay for every download. In other words, open source software is still free as in speech, but you can forget about it being "free as in beer" going forward.

How bad is it? Fox revealed that last year, major repositories handled 10 trillion downloads. That's double Google's annual search queries, if you're counting from home, and they're doing it on a shoestring. Fox described this as a "tragedy of the commons," where the assumption of "free and infinite" resources leads to structural waste amplified by CI/CD pipelines, security scanners, and AI-driven code generation.

...read more at theregister.com
Digging deeper, they discovered that many companies are using open source repositories as if they were content delivery networks (CDNs). So, for example, a single company might download the same code hundreds of thousands of times in a day, and the next day, and the next. This is unsustainable.

I don't understand, what is the benefit of doing this?

Which part: paying for git pulls, or companies using open source repositories as CDNs?

No, why would someone need to download the same code hundreds of thousands of times a day?

34 sats \ 4 replies \ @k00b 2h

The download is part of some build process script and it's easier to download from the source on every build than manage a cache.

Oh, huh. I guess that makes sense? I don't quite know how these package managers work. I thought you download it once and then you have a copy on your machine. You only need to redownload if you're upgrading to a new version.

68 sats \ 2 replies \ @k00b 1h

They probably aren't using a package manager, or at least one that persists between builds, because package managers cache.

I see. Custom build script then? Coz I thought even amateurs just use package managers like npm and pip most of the time.

68 sats \ 0 replies \ @k00b 1h

Yep, custom, and these build systems often don't persist the package manager's cache, preferring to start from a blank slate.
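
For anyone wondering what "managing a cache" would look like in a build script, here is a rough Python sketch, not how Maven or any particular CI system actually does it. The function name `fetch_cached`, the `cache_dir` argument, and the injectable `download` parameter are all made up for illustration. It assumes the URL pins an immutable version, so a file that is already on disk never needs to be fetched again.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_cached(url, cache_dir, download=None):
    """Return a local path for `url`, hitting the network only on a cache miss.

    Hypothetical helper: assumes `url` points at an immutable, versioned
    artifact, so a cached copy never goes stale.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Key the cache entry by a hash of the URL to get a safe file name.
    target = cache_dir / hashlib.sha256(url.encode("utf-8")).hexdigest()
    if target.exists():
        return target  # cache hit: no request reaches the upstream repository

    if download is None:
        # Default downloader; callers can pass their own (e.g. for testing).
        def download(u):
            with urllib.request.urlopen(u) as resp:
                return resp.read()

    data = download(url)

    # Write to a temporary file first, then rename, so an interrupted
    # build never leaves a half-written artifact in the cache.
    tmp = target.with_suffix(".tmp")
    tmp.write_bytes(data)
    tmp.replace(target)
    return target
```

The catch is the one described above: this only helps if `cache_dir` survives between builds. If the build system wipes everything and starts from a blank slate, every build is a cache miss and goes back to the source.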