The Internal Developer Platform Anti-Patterns Nobody Talks About
The five most common ways internal developer platforms quietly fail — and the remediations engineering leaders actually need to ship
Most internal developer platforms do not fail loudly. They fail by becoming furniture.
The platform engineering discipline has matured fast. It has its own foundation, conferences, books, certifications, and a growing market of vendor portals. Most of the available content covers what to build. Very little covers the specific ways platforms quietly underperform — the ones that look fine in a leadership dashboard while developers route around them every day. Engineering leaders who have run platforms know these patterns instinctively. Engineering leaders who are about to invest in one usually find them six months in, after the budget is spent and the team is committed.
This post lays out the five anti-patterns I see most often in mid-to-large organizations running an internal developer platform, each with a concrete remediation. They are not exotic. None of them requires throwing the platform out. But noticing them before they compound is the difference between a platform that pays back the investment and one that becomes a permanent line item nobody is brave enough to kill.
A note on framing: most of the canonical guidance — Team Topologies, the platform-as-product literature, the platformengineering.org reference material — gets the principles right. The failure modes below are mostly about not following those principles even when you think you are.
Anti-pattern 1: The portal is the product
The first and most expensive mistake is confusing the developer portal with the platform itself. A team stands up Backstage, spends three months curating the service catalog, customizing the UI, and onboarding teams to keep their catalog-info.yaml up to date. The result looks impressive in a demo. Developers visit the portal, find the link to their service, and then leave the portal to do any actual work — kick off a deploy, provision a database, request an environment.
A catalog is not a platform. A catalog is a phone book with a search bar.
This pattern happens for a structural reason. Visible UI work is easier to demo and easier to justify in a roadmap review than the unglamorous automation work underneath. So the portal gets built first, and the automation engine that was supposed to sit behind it gets perpetually deferred — until the platform team is six engineers deep into curating metadata for a system that has never automated anything end to end.
Remediation. Build the workflow capabilities first — golden-path templates, automated environment provisioning, one-click deploy, infrastructure self-service — and treat the portal as a thin renderer of those capabilities. A useful test: if you removed the portal entirely and let developers consume the platform through a CLI and an API, how much of your real value would survive? If the honest answer is “most of it,” your team is in good shape. If the answer is “very little,” the portal is the product, and that is the problem.
Anti-pattern 2: Plugin sprawl as a strategy
The second pattern follows directly from the first. Once a portal-centric platform team is established, every new request becomes a custom plugin. The mobile team wants iOS build dashboards — write a plugin. The data team wants Spark job monitoring — write a plugin. The security team wants vulnerability views — write a plugin. Six months in, the platform team has thirty custom React plugins, half of which break on every Backstage version upgrade, and most of the team’s capacity is spent on framework-tracking work that produces no new platform capability.
Published accounts of running Backstage are consistent on this: a mature instance typically requires multiple full-time engineers just for upkeep, with a substantial share of that going to plugin maintenance and version compatibility. Vendor write-ups will quote specific FTE numbers; the exact figure varies by organization, but the direction is the same everywhere. Plugin sprawl is one of the fastest ways to convert a platform team from a leverage function into a maintenance cost center.
Remediation. Define extension boundaries deliberately. The portal should integrate with platform capabilities through stable APIs — not be the implementation surface for those capabilities. When a team asks for a “plugin,” the first question is whether the underlying capability is a portal concern at all. Most of the time it is not — it is an integration to an existing tool that belongs in the platform layer, exposed through the portal via a standard pattern. Aggressive curation matters more than feature count: a portal with ten capabilities that survive every Backstage upgrade beats one with fifty that need constant babysitting.
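To make the boundary concrete, here is a minimal sketch of what "capability in the platform layer, portal as a standard renderer" could look like. Everything in it is hypothetical and for illustration only: the `Capability` type, the `REGISTRY`, and the two example capabilities are my own assumptions, not part of Backstage or any real portal API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Capability:
    """One platform capability, implemented in the platform layer (hypothetical type)."""
    name: str
    description: str
    run: Callable[[dict], dict]  # takes request parameters, returns a result payload


# Hypothetical in-process registry; in practice this would sit behind a stable API.
REGISTRY: dict[str, Capability] = {}


def register(cap: Capability) -> None:
    REGISTRY[cap.name] = cap


def portal_menu() -> list[str]:
    """What a portal or CLI renders: one generic listing, no per-capability UI code."""
    return [f"{c.name}: {c.description}"
            for c in sorted(REGISTRY.values(), key=lambda c: c.name)]


def invoke(name: str, params: dict) -> dict:
    """Single entry point shared by every front end."""
    return REGISTRY[name].run(params)


# Stub implementations standing in for integrations with existing tools.
register(Capability(
    "vuln-report",
    "List open vulnerabilities for a service",
    lambda p: {"service": p["service"], "open_vulns": []},
))
register(Capability(
    "spark-jobs",
    "Show recent Spark job runs for a team",
    lambda p: {"team": p["team"], "runs": []},
))

menu = portal_menu()
result = invoke("vuln-report", {"service": "checkout"})
print(menu)
print(result)
```

The point of the shape is that adding a capability means one `register` call in the platform layer, and the portal never grows a new custom front end that has to be carried through the next framework upgrade.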
Anti-pattern 3: No SLOs on the platform itself
The third anti-pattern is one platform teams find genuinely embarrassing once it is pointed out. They enforce SRE discipline on the product teams they serve while running the platform itself without any of the same discipline.
There is no SLO on the CI/CD pipeline. No error budget on environment provisioning. No on-call rotation for the platform team when the deploy system breaks at 2am. When the platform has an outage — and it always eventually does — developers find out the way they find out about a third-party SaaS outage: it just stops working, with no acknowledgement, no ETA, and no postmortem.
This breaks trust in a way that is uniquely hard to recover from. Product teams can survive their own bad deploys because they own them. They cannot survive depending on a platform they cannot rely on. “The platform was down” becomes the explanation for every missed deadline, whether it is true or not, and once that framing takes hold in a leadership channel, the political cost of mandating the platform doubles overnight.
Remediation. Define explicit, public SLOs for each platform capability — pipeline availability, provisioning success rate, time-to-environment, observability ingestion latency. Run an on-call rotation. Post status updates. Write postmortems and circulate them. Treat the platform like the production system it is, because for the rest of the engineering organization, it absolutely is one. The Team Topologies platform-as-product material, the SRE tradition, and the broader platform-engineering literature all agree on this point. It is rarely the part teams skip on purpose; it is just the part they get to last, which is its own kind of choice.
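As a rough illustration of what an SLO with an error budget means for a platform capability, the sketch below computes how much budget is left for a pipeline over some window. The `Slo` type, the capability name, the 99% target, and the counts are all made-up assumptions, and a real setup would pull the counts from your CI system or metrics store rather than hard-coding them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A success-rate SLO for one platform capability (hypothetical type)."""
    capability: str
    target: float  # e.g. 0.99 means 99% of runs in the window must succeed


def error_budget_remaining(slo: Slo, total: int, failed: int) -> float:
    """Fraction of the error budget still unspent; negative means overspent."""
    if total == 0:
        return 1.0  # nothing ran, so nothing was spent
    allowed_failures = (1.0 - slo.target) * total
    if allowed_failures == 0:
        # A 100% target leaves no budget at all.
        return 1.0 if failed == 0 else float("-inf")
    return 1.0 - failed / allowed_failures


# Illustrative numbers only: 2000 pipeline runs in the window, 5 failed.
pipeline_slo = Slo(capability="ci-pipeline", target=0.99)
remaining = error_budget_remaining(pipeline_slo, total=2000, failed=5)
print(f"{pipeline_slo.capability}: {remaining:.0%} of error budget left")
```

Publishing a number like this per capability, next to the on-call and postmortem practices above, is one way to hold the platform to the same standard it asks of product teams.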
Anti-pattern 4: Golden paths nobody takes
The fourth pattern is subtle. The platform team carefully designs golden paths — opinionated, recommended routes for the most common workflows. Documentation gets written. Internal talks get given. And then almost no team actually adopts them.
The reason, almost always, is that the un-golden path still works fine. Existing services keep doing what they were doing. New services often pick the old path because it is the one their tech lead knows. The golden path is “recommended” but not noticeably easier, faster, or better-supported. Six months in, the platform team is frustrated, leadership starts considering mandates, and the conversation about the platform turns adversarial.
Mandates are the worst possible answer here. They solve the symptom (low adoption) while making the underlying problem (the golden path is not actually golden) worse, and they breed shadow IT in the longer term. You can compel a team to file a service in your catalog. You cannot compel them to enjoy using your platform, and you cannot compel them not to quietly maintain a parallel deployment path on the side.
Remediation. The principle is friction asymmetry. The golden path needs to be measurably faster, with fewer steps, and with more capability included by default. Provisioning a new service on the golden path should take minutes; doing it the old way should take hours. Compliance, security review, observability, and operational support should be automatic on the golden path and manual elsewhere. If your golden path is not visibly easier than the alternative, adoption is not a marketing problem — it is a product problem. Fix the product, then market.
Anti-pattern 5: Adoption measured by catalog size
The fifth pattern is the most common one in leadership reporting: measuring platform adoption with vanity metrics. “We have eight hundred services in the catalog.” “Ninety percent of teams have onboarded to Backstage.” Neither tells you whether the platform is doing useful work.
The metrics that actually matter are harder to collect and less flattering. What share of new services are created via the golden path? What share of deployments go through the platform versus around it? How long does a new engineer take to ship their first change in a sanctioned way? What does the platform’s developer-experience survey trend look like over the last four quarters? Is anyone running shadow infrastructure because the platform path is too slow or too restrictive?
The available platform-metric guidance generally agrees that catalog size, plugin count, and similar surface-level numbers are leading indicators of nothing in particular. What you want is an adoption funnel — awareness, trial, sustained use, advocacy — combined with a leak metric: how often teams are quietly going around you. The leak metric is the single most useful signal a platform team can instrument, and it almost never gets reported because it is uncomfortable.
Remediation. Replace catalog-size dashboards with an adoption funnel. Instrument the platform to know what share of deploys, new services, and infrastructure requests flow through the golden path versus around it. Treat the “around it” number as the most important signal you have — it is the leading indicator of platform failure, and it is the one that will not appear in any quarterly review unless the platform team owns it deliberately.
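A leak metric can start out very simple. The sketch below assumes you can tag each deploy, new service, or infrastructure request with whether it went through the golden path; the `Event` type, the `kind` labels, and the sample data are hypothetical, and how you detect "around it" activity in practice (cloud audit logs, registry pushes, and so on) is left open.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One observed action (hypothetical record shape)."""
    kind: str           # e.g. "deploy", "new_service", "infra_request"
    via_platform: bool  # True if it flowed through the golden path


def leak_rates(events: list[Event]) -> dict[str, float]:
    """Share of events of each kind that went around the platform."""
    total: Counter = Counter()
    leaked: Counter = Counter()
    for event in events:
        total[event.kind] += 1
        if not event.via_platform:
            leaked[event.kind] += 1
    return {kind: leaked[kind] / total[kind] for kind in total}


# Illustrative data only.
events = [
    Event("deploy", True),
    Event("deploy", True),
    Event("deploy", False),
    Event("deploy", True),
    Event("new_service", True),
    Event("new_service", False),
]
rates = leak_rates(events)
for kind, rate in rates.items():
    print(f"{kind}: {rate:.0%} went around the platform")
```

Tracked per quarter, the per-kind rates give you the trend that a catalog-size dashboard cannot.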
The pattern across all five anti-patterns is the same. The failure is rarely technical. The failure is taking your eye off the principle the platform engineering literature has insisted on for years — that the platform is a product, and developers are its users. The moment that framing slips, every one of the failure modes above becomes easy to commit by accident. The teams that build platforms that compound value year over year are the ones who keep that framing tight, measure honestly against it, and stay willing to scrap the parts that are not earning their keep. Everyone else is building furniture.