
I had been meaning to write an analysis of America's AI Action Plan from the open source perspective last Friday, but IRL took over and I didn't have enough quiet time left to write it up. Better late than never, I hope.

Rationale for this article

From the introduction of the plan:
Winning the AI race will usher in a new golden age of human flourishing, economic competitiveness, and national security for the American people.
If it were a matter of winning and losing, what would such an outcome look like for the other 95% of the world that would then not-win? What does "losing the AI race" look like? Being even more enslaved to the whims of companies like Microsoft, which many, also in the US (#1049092), are already getting tired of?
I also wonder: is "winning" the AI race a long-term winning strategy for the US? Or is it more like the CIA helping the Taliban "win" their quest against the Soviet invader? We can only guess whether there will be blowback from this, and at what the imperial attitude will bring.
Then comes open source, or, in the case of the current closest equivalent for AI models, open weights, where everyone can download a model and execute it on their own, sovereign, computer 1. Open source doesn't really know winners or losers, but it does know progress. It is the complete opposite of winning the AI race, because the race continues in perpetuity.
In the AI space, some of the organizations that aren't market leaders in chatbots release their models as open weights, because doing so has the most impact on further research, and it often disrupts the closed, proprietary market leaders these organizations aim to keep up with. Meta has done maximum damage this way with Llama, and DeepSeek did the same with their R1 release. Many gladly ride these selfless-only-in-appearance donations to the open source community, as they allow those of us without endless, sugar-sweet VC fiat injections to build things with reasonably modern tech that would otherwise be proprietary and unobtainable except through expensive subscriptions.
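To make "download a model and execute it on your own computer" concrete, here's a minimal sketch using the Hugging Face hub and llama-cpp-python; the repo and file names are placeholders, so substitute whichever open-weight GGUF release you actually want.
```python
# Minimal sketch: fetch an open-weight model and run it locally.
# The repo_id and filename below are placeholders; any GGUF release
# of an open-weight model (Llama, DeepSeek R1 distills, ...) works.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="example-org/example-model-gguf",  # hypothetical repo
    filename="example-model-q4_k_m.gguf",      # hypothetical quantized file
)

# Load the weights on your own, sovereign, computer and prompt away.
llm = Llama(model_path=model_path, n_ctx=4096)
result = llm("Explain open weights in one sentence.", max_tokens=128)
print(result["choices"][0]["text"])
```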
Is America's AI Action Plan compatible with open source progress? Let's find out:

Pillar I: Accelerate AI Innovation

The first pillar consists of several proposed actions around what the US Government should encourage. Open source is explicitly mentioned here.

The Open Source mention

Open-source and open-weight AI models are made freely available by developers for anyone in the world to download and modify. Models distributed this way have unique value for innovation because startups can use them flexibly without being dependent on a closed model provider. They also benefit commercial and government adoption of AI because many businesses and governments have sensitive data that they cannot send to closed model vendors. And they are essential for academic research, which often relies on access to the weights and training data of a model to perform scientifically rigorous experiments.
This sounds pretty good, although it looks to me like it was clearly written by people stuck in government and Silicon Valley 2. But let's ignore that for a moment, because there is something else here:
We need to ensure America has leading open models founded on American values.
Prior instances that reference "American values" center on freedom of speech, which seems like a good goal, because censorship can impede cognitive excellence. Let's indeed not build bias into models; perhaps uncensored models already are a great open-source answer to these values 3. Since these already exist, we can consider the problem solved, even for otherwise unrelated listed actions like:
Update Federal procurement guidelines to ensure that the government only contracts with frontier large language model (LLM) developers who ensure that their systems are objective and free from top-down ideological bias.
Perhaps the USG should just use Dolphin-Mistral: a well-performing open model that is free to use! I'm sure NVIDIA would gladly provide preferential pricing for GPUs, and you don't even need huge ones to run the 24B model. Since governments work for the public, it is only beneficial to leverage open source rather than proprietary products. Vendor lock-in is avoided because these open models are interchangeable: they can share a common execution environment.
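As a sketch of what "a common execution environment" means in practice: local runtimes such as Ollama or llama.cpp expose an OpenAI-compatible HTTP API, so swapping one open model for another is a one-line change. The endpoint and model tag below are assumptions about your local setup.
```python
# Minimal sketch: talk to a locally served open model through the
# OpenAI-compatible API that runtimes like Ollama expose. Swapping
# vendors/models means changing the `model` string, nothing else.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # assumed local Ollama endpoint
    api_key="unused",                      # local servers ignore the key
)

response = client.chat.completions.create(
    model="dolphin-mistral",  # assumed local model tag; any pulled model works
    messages=[{"role": "user", "content": "Summarize vendor lock-in in one line."}],
)
print(response.choices[0].message.content)
```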

Adoption

Adoption is also mentioned, and from an open source perspective this could help fund open model efforts, if, but only if, these efforts aren't immediately captured by proprietary players pushing their proprietary models, which I expect is more likely to happen than not:
A coordinated Federal effort would be beneficial in establishing a dynamic, “try-first” culture for AI across American industry:
  • Regulatory sandboxes
  • Domain-specific efforts in healthcare, energy, and agriculture
AI-integrated manufacturing is also mentioned, mainly suggesting that the Federal government's purchasing power be used as a lever to embed the industrial AI/robotics cross-section firmly in the manufacturing sector.

Recognition of GIGO

There's a section on garbage-in-garbage-out called "Build World-Class Scientific Datasets".
Direct the National Science and Technology Council (NSTC) Machine Learning and AI Subcommittee to make recommendations on minimum data quality standards for the use of biological, materials science, chemical, physical, and other scientific data modalities in AI model training.
This sounds nice, but once more: if this is gatekept, there is a good chance that open source model engineering will have reduced exposure to the effort. I hope that, in the spirit of the earlier open source section, this will be made publicly available.
A framework for assessing AI quality is recommended too, which, if public, could help show the merits of open source. Ultimately, I think the whole of this could have a positive impact on open source AI.
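To give a feel for what "minimum data quality standards" could mean mechanically, here is a toy validation pass over a tabular scientific dataset; the column names, file name, and thresholds are invented for illustration, and a real standard would be set per modality.
```python
# Toy sketch of a minimum-data-quality gate for a tabular scientific
# dataset. Column names and thresholds are hypothetical; real standards
# would differ per modality (biological, materials, chemical, ...).
import pandas as pd

def quality_report(df: pd.DataFrame) -> dict:
    return {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "missing_fraction": float(df.isna().mean().mean()),
        # Domain sanity check: a hypothetical measurement column that
        # must be positive to be physically meaningful.
        "nonpositive_measurements": int((df["measurement"] <= 0).sum()),
    }

def passes_minimum_standard(report: dict) -> bool:
    return (
        report["duplicate_rows"] == 0
        and report["missing_fraction"] < 0.01  # arbitrary 1% threshold
        and report["nonpositive_measurements"] == 0
    )

df = pd.read_csv("dataset.csv")  # hypothetical input file
report = quality_report(df)
print(report, "PASS" if passes_minimum_standard(report) else "FAIL")
```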

National security

There are lots of items about national security, including the singling out of Chinese-made models. This is the worrying part, because open source is agnostic to location (your server may not be, but that is easily circumvented, because open licenses often allow re-packaging and redistribution). I worry that these considerations could gain the upper hand: if a choice has to be made between opening up models and boosting a commercial provider that can be strictly regulated, the latter may fit the national security portion of the bill better.

Pillar II: Build American AI Infrastructure

While most of this pillar focuses on building datacenters and deregulating to reduce some environmental friction (in regulation, not IRL, of course), there is also a section about secure-by-design AI.
AI systems are susceptible to some classes of adversarial inputs (e.g., data poisoning and privacy attacks), which puts their performance at risk. The U.S. government has a responsibility to ensure the AI systems it relies on—particularly for national security applications—are protected against spurious or malicious inputs. While much work has been done to advance the field of AI Assurance, promoting resilient and secure AI development and deployment should be a core activity of the U.S. government.
This links back to the GIGO topic from Pillar I. It would be good to have resilient AI models and surrounding tooling implementations on the open source side of things too, or maybe especially there.
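One common and simple building block for resilience against data poisoning is outlier filtering of training examples before they ever reach the model. Below is a hedged sketch using embedding-distance z-scores; all names and the synthetic data are invented for illustration, and this is a heuristic, not a complete defense.
```python
# Toy sketch of one anti-poisoning measure: drop training examples whose
# embeddings sit far from the bulk of the data. Real pipelines would
# layer provenance and integrity checks on top of this.
import numpy as np

def filter_outliers(embeddings: np.ndarray, z_threshold: float = 3.0) -> np.ndarray:
    """Return indices of examples within z_threshold of the centroid."""
    centroid = embeddings.mean(axis=0)
    dists = np.linalg.norm(embeddings - centroid, axis=1)
    z = (dists - dists.mean()) / (dists.std() + 1e-12)
    return np.where(z < z_threshold)[0]

# Hypothetical data: 1000 "clean" embeddings plus 10 injected outliers.
rng = np.random.default_rng(0)
clean = rng.normal(0, 1, size=(1000, 64))
poisoned = rng.normal(8, 1, size=(10, 64))
data = np.vstack([clean, poisoned])

keep = filter_outliers(data)
print(f"kept {len(keep)} of {len(data)} examples")
```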

Pillar III: Lead in International AI Diplomacy and Security

The "diplomacy" part of this pillar feels a bit misleading, because it's mostly about protective measures, such as increased export control on hardware, counter-influencing international communities, surveillance and another round of national security concerns.

Conclusion

While open source is an explicit topic in the plan, many other actions outlined are of questionable usefulness, if not outright incompatible with open source principles and mechanics, such as export controls and surveillance. Because of this, from an open source perspective, the plan has internal conflicts for which we will have to wait and see which side gains the upper hand: open collaboration or protectionism.
Perhaps through some of the mentioned initiatives, such as curating quality model source data, regulatory sandboxes, and opportunities to build specific solutions for targeted industries, the open source community will get opportunities to showcase the merit of open collaboration.
We'll have to keep building, which can be done with encouragement, or despite discouragement from the USG.

Footnotes

  1. One of my personal favorite devices for running small LLMs and other transformer-based models is my "old" MacBook Pro with an M1 chip. Due to errors in the physical chip architecture, Apple M1/M2 devices can no longer serve as a super-secure workspace, but they have a unified memory architecture similar to the more recent M4 chips, plus a built-in neural engine, and are thus fine for activities that don't require my PGP signature or a secure compile environment. Providing compute for small LLM models works okay, though it's a bit slow at times.
  2. It doesn't mention individuals at all, but I'd posit that individuals have even more sensitive, private data that they shouldn't under any circumstances share with a personal-data-harvesting company like Google, Meta, or OpenAI.
  3. It's also much less offensive when your own computer makes a cognitive error than when xAI's $120/mo chatbot insults you. That said, I haven't found many issues with uncensored models that had much or all of their "alignment" removed, like dolphin-mistral. They don't insult you out of the blue, unless you ask for it or manipulate them into generating answers about things that insult you.
116 sats \ 3 replies \ @SatsMate 2h
Pretty interesting! I think open-sourcing all AIs should be key. It is kind of scary what these closed-source software companies could be doing with the information we feed their AIs.
According to Proton's comparison (do take it with at least one grain of salt): no good things.


21 sats \ 1 reply \ @SatsMate 1h
Cool, I may actually give Proton a try; it definitely looks like a friendlier company from a privacy/user-rights standpoint.
It's still better to run your own, but if you don't have acceptable hardware for that, this could at least help in the meantime. From my initial questioning, it didn't feel close to the performance I see from some of the (even smaller) latest-generation open source LLMs.