AI Slop Is Killing Open Source. Here's What to Do About It.

The maintainers were the canary. The mine is everyone who depends on what they maintain.

On January 31, 2026, the curl project ended its bug bounty program. Daniel Stenberg, curl’s founder and lead developer, wrote a blog post explaining why. After six and a half years of running the program — with eighty-seven confirmed vulnerabilities and over a hundred thousand dollars paid to researchers — the math had stopped working. The confirmed-vulnerability rate, historically above fifteen percent, had collapsed to under five percent through 2025. Volume up. Quality down. And the cause had a name: AI slop. Long, confident, plausibly-formatted vulnerability reports generated by language models and submitted by people who had not bothered to verify them.

This was not curl’s bad luck. The same week the bug bounty closed, tldraw’s founder Steve Ruiz announced the project was closing external pull requests entirely; “well-formed noise” had made triage unsustainable. Seth Larson, the Python Software Foundation’s security developer-in-residence, had been documenting the same pattern in his “new era of slop security reports” essay since late 2024. Django’s security team had developed an internal triage protocol specifically for AI-generated reports. Matplotlib’s contribution policy had been bypassed by automated agents opening PRs on “good first issues” that maintainers had deliberately left unimplemented as on-ramps for new contributors.

The pattern is now clear enough to name: AI has dropped the cost of producing an open-source contribution to near zero, while leaving the cost of triaging one unchanged. That asymmetry is the supply-chain problem of 2026, and unlike most supply-chain problems, the people absorbing the damage are mostly volunteers.

What’s actually happening

The mechanism is simple. A well-functioning open-source contribution flow works because generating a credible report (a real vulnerability writeup, a working patch, a thoughtful design proposal) has historically required time and codebase knowledge. That cost acts as a natural filter. The bug bounty could pay out sustainably because the effort of producing something credible did the screening.

LLMs removed the friction on the submission side without touching the verification side. Stenberg’s data is the cleanest illustration: prior to 2025, roughly one in six curl security reports was real; by late 2025 it was closer to one in twenty or one in thirty. The total volume went up while the signal went down — the worst combination for a volunteer security team.

The reports themselves follow a recognizable shape. They are long, written in confident technical English, reference real APIs and CVE numbers, and describe vulnerabilities that do not actually exist. “Critical HTTP/3 stream dependency cycle exploit” — the kind of phrase that takes a maintainer twenty minutes to disprove and a language model fifteen seconds to generate. Stenberg coined “death by a thousand slops” for the cumulative effect, and it is the right metaphor: no single report is catastrophic; the accumulation is.
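The arithmetic behind that accumulation is easy to sketch. The snippet below is an illustration, not a measurement: it assumes every invalid report costs a flat twenty minutes to disprove (the figure used above), ignores the time spent on the real reports, and takes the one-in-six, one-in-twenty, and one-in-thirty rates at face value. The function name is made up for this example.

```python
from fractions import Fraction

# Assumption: a flat twenty minutes to disprove each invalid report.
MINUTES_PER_INVALID_REPORT = 20


def wasted_minutes_per_real_report(valid_rate: Fraction) -> Fraction:
    """Expected minutes spent disproving invalid reports per real report."""
    if not 0 < valid_rate <= 1:
        raise ValueError("valid_rate must be in (0, 1]")
    # With a valid rate of 1/n, each real report arrives with n - 1 invalid ones.
    invalid_per_valid = (1 - valid_rate) / valid_rate
    return invalid_per_valid * MINUTES_PER_INVALID_REPORT


if __name__ == "__main__":
    for label, rate in [
        ("1 in 6", Fraction(1, 6)),
        ("1 in 20", Fraction(1, 20)),
        ("1 in 30", Fraction(1, 30)),
    ]:
        minutes = wasted_minutes_per_real_report(rate)
        print(f"{label}: {int(minutes)} minutes of disproving per real report")
```

Under those assumptions the overhead per real vulnerability goes from 100 minutes at one in six to 380 at one in twenty and 580 at one in thirty, before accounting for the rise in total volume.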

The pattern is not limited to security reports. Pull requests opened by autonomous agents on good-first-issue tickets. Issue threads that continue with progressively more confident, progressively more hallucinated AI replies after the maintainer pushes back. Drive-by contributions that “fix” things by introducing subtle regressions a reviewer has to read carefully to catch. The common feature is transferred triage cost: the AI does the easy part of producing output that looks plausible; the maintainer does the hard part of disproving it.

The economics of slop

It is worth being precise about why this matters beyond the specific projects affected. Open source is load-bearing infrastructure for almost every modern company, and the maintainer pool is small, mostly unpaid, and structurally fragile. The libraries that critical systems depend on are often maintained by one or two people in their evenings.

When you make those people’s lives meaningfully worse, several things happen at once. Some quit. Some stop running public reporting channels. Some start closing external contributions, which slows down legitimate improvements. The projects do not die immediately — but they get a little less healthy, a little less responsive, a little less safe. And the costs propagate downstream to every company that ships software built on top of them.

The economic asymmetry is what makes this different from prior maintainer-burnout discourse. Earlier waves were about entitlement, abuse, and the limits of unpaid labor. The current wave is about a specific mechanism — production costs falling to near zero while triage costs stay the same — that scales arbitrarily. Even well-intentioned users can now flood projects with low-quality contributions just by being slightly lazy with AI tools. The volume problem will get worse before it gets better.

Contribution-policy patterns that work

Several patterns have emerged across projects that have responded effectively. None of them solves the problem entirely; each shifts the cost back toward where it can be borne.

Explicit AI-disclosure requirements. The simplest pattern: require contributors to disclose if and how they used AI tools, and to explicitly attest that they have read, understood, and verified what they are submitting. Several major projects — curl, Gentoo, NetBSD — have adopted versions of this. The disclosure does not eliminate AI use; it eliminates the plausible deniability that lets low-effort submissions hide behind ambiguity. Maintainers can close anything that violates the policy without further investigation.

Reputation gating. Projects like Node.js use HackerOne’s Signal score to filter who can submit security reports without a maintainer triaging them first. New accounts with no track record get filtered into a different queue. This is uncomfortable — it raises the bar for genuinely new contributors — but the alternative is filtering by reading the submissions, which is the cost the project is trying to avoid.

Closing external PRs entirely. The tldraw approach. The project still develops in the open but accepts only internal commits. Maintainers can still respond to issues, but the PR firehose is closed. This is the strongest response, and its cost is obvious: legitimate external contributors are turned away. For some projects it is the only way to keep maintainers functional.

Invitation-only contribution. A middle path. Outside contributions get close review for the first few interactions; contributors who demonstrate they understand the codebase are invited into the regular flow. The early contributions cost the maintainer effort; later ones cost much less.

Eliminating monetary incentives. curl’s specific move. The bug bounty was the mechanism that converted “AI can produce plausible reports” into “AI can produce plausible reports that pay.” Removing the payout removes the slot-machine dynamic. Reports still come in, because researchers who actually find vulnerabilities still want to disclose responsibly, but the financial incentive for speculative submissions is gone.

Public consequences for bad-faith submission. Stenberg’s most controversial move: publicly identifying and ridiculing submitters of obvious AI slop. Uncomfortable. Criticized as harsh. It also works, because the social cost of being publicly identified is one of the few things that scales with the volume of submissions. The alternative — silent triage — does not scale at all.

The pattern across all of these is the same: shift cost back to the submission side, either through policy friction, reputation requirements, or social consequence.
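As a rough sketch of what submission-side friction can look like in practice, the snippet below combines two of the patterns above: a disclosure attestation and a reputation gate. Everything in it is hypothetical. The attestation wording, the `Submission` fields, the queue names, and the `MIN_REPUTATION` threshold are invented for illustration and do not reflect any project's actual policy or HackerOne's real scoring.

```python
from dataclasses import dataclass

# Hypothetical checkbox lines a project might require in its submission template.
REQUIRED_ATTESTATIONS = (
    "[x] I have disclosed any AI tool use",
    "[x] I have read, understood, and verified this submission",
)

# Hypothetical threshold; not a real HackerOne Signal value.
MIN_REPUTATION = 1.0


@dataclass
class Submission:
    author: str
    body: str
    reputation: float  # hypothetical track-record score


def route(sub: Submission) -> str:
    """Return the queue a submission lands in before any human reads it."""
    body = sub.body.lower()
    missing = [a for a in REQUIRED_ATTESTATIONS if a.lower() not in body]
    if missing:
        # Policy violation: closed without further investigation.
        return "close"
    if sub.reputation < MIN_REPUTATION:
        # No track record yet: goes to a separate, lower-priority queue.
        return "newcomer_queue"
    return "maintainer_queue"
```

The point of the design is that both checks are cheap for the project and cost the submitter something: an explicit attestation they can be held to, and a track record they have to earn.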

Triage automation: useful tool, dangerous shortcut

A second wave of response is automating the triage itself. Several projects now run AI-based filters on incoming reports to catch obvious slop before a human looks at it. This is genuinely useful and likely to get more useful. It also has a specific failure mode worth naming: the filter is itself fallible, which means the same kind of subtle, well-presented slop that gets past a tired maintainer also gets past the AI filter.

The honest framing: triage automation is a force multiplier for a working policy, not a substitute for one. Run the filter as a first pass to catch the worst submissions; require human review on anything the filter is uncertain about; treat the filter’s confidence score as a signal, not a verdict. Projects that have tried to fully automate triage report the predictable outcome — the slop adapts to the filter and the maintainer still has to look at everything.
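A minimal sketch of that framing, assuming a filter that emits a slop score between 0 and 1: only the most extreme scores are closed automatically, and everything else still reaches a human, with the score used only to order the queue. The `auto_close_at` threshold and the names below are assumptions made for the example, not values taken from any real project.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    action: str      # "auto_close" or "human_review"
    priority: float  # higher means a human should look sooner


def triage(slop_score: float, auto_close_at: float = 0.98) -> Decision:
    """Use the filter's score as a signal: first pass only, never a full verdict."""
    if not 0.0 <= slop_score <= 1.0:
        raise ValueError("slop_score must be in [0, 1]")
    if slop_score >= auto_close_at:
        # First pass: only the worst submissions are closed without review.
        return Decision("auto_close", 0.0)
    # Everything else gets human review; the score only sets the order.
    return Decision("human_review", 1.0 - slop_score)
```

Keeping the auto-close threshold high means the filter's mistakes mostly cost queue position rather than a wrongly rejected report.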

There is also a counter-pattern worth highlighting because it goes the other way. The OpenSSL collaboration with AISLE — expert-guided AI analysis where domain experts use AI tools to find real bugs and verify them before disclosure — produced twelve CVEs in OpenSSL, some dating back decades, with no invalid reports reaching the maintainer team. The expert team absorbed the false-positive filtering cost rather than transferring it. This is the version of AI in open source that actually helps. It requires expertise, judgment, and discipline; it cannot be done at scale by random submitters with a chatbot. But it is the proof that the problem is not “AI in open source” — it is “unverified AI output dumped onto unpaid volunteers.”

What downstream consumers should do

For enterprises and platform teams whose production systems run on open-source dependencies, the relevant question is not whether AI slop is bad. It is what the downstream consequences are and what the team should do about them.

First, projects that have closed external contributions or removed bug bounties are not less safe — they are responding to a real attack on their ability to function. Treating “the project doesn’t accept external PRs” as a red flag in vendor evaluation is reading the signal backwards. The healthier read: the maintainers are taking their responsibility to their users seriously.

Second, dependency teams should expect more of these closures, not fewer. Companies that built their dependency-management practices around the assumption of perpetually open contribution flows will need to update them. “We’ll just open a PR upstream” works less well when upstream has stopped accepting PRs from the outside. This affects fork strategies, patch backports, and the security-response playbook.

Third, enterprises that depend on open-source projects should consider what they owe back. Most large companies are net consumers of open-source effort, in a relationship where they extract substantial value and contribute relatively little of the maintenance cost. The current crisis is a reminder that the asymmetry has real consequences when the maintainers are under pressure. Sponsorship, paid maintainer time, security-research collaboration of the OpenSSL/AISLE variety — these are not charity. They are the cost of keeping the substrate functional.

Fourth, if your engineering organization uses AI coding tools — and it almost certainly does — establish an internal policy on what is acceptable to submit upstream. Specifically: no AI-generated content goes to an upstream open-source project without a human author who has read it, understood it, verified it, and is willing to attest to its correctness. This is the same standard the maintainers are now requiring. Enforce it on your side before they have to enforce it on theirs.

The bigger picture is that open source is going through the same structural adjustment every system goes through when AI lowers the cost of producing one side of an exchange without lowering the cost of evaluating it. Spam mail, content moderation, academic peer review, code review, security disclosure — each faces a version of the same problem. The open-source community is responding faster than most, partly because the burden falls so directly on identifiable people, partly because those people have the technical means to fight back. The contribution-policy patterns coming out of this period — disclosure requirements, reputation gates, social consequence, expert-guided automation — are likely to be the template for how the rest of the internet handles the same asymmetry. The maintainers are running the experiment for all of us. The least we can do is pay attention to what they learn.