Deployment is the quietest way to burn a SaaS budget. A team ships a working product, reaches for Kubernetes because that's what grown-ups use, and six months later they have three engineers managing YAML instead of shipping features. Another team stays on a single VPS with PM2 until it falls over on a Tuesday at 3am. The honest answer is that each of the four common paths — PM2 on a VPS, Fly.io, Vercel, and Kubernetes — has a stage where it wins and a stage where it's wrong. This post walks through all four, with the tradeoffs that matter at production scale.
The four paths, in one table
These four options cover roughly 95% of Node.js SaaS deployments in 2026. The axes that matter aren't feature lists — they're cost per request at your current scale, how much operational attention the platform demands, and how painful a zero-downtime deploy is on a bad day.
| Path | Best for | Monthly floor | Ops complexity | Where it breaks |
|---|---|---|---|---|
| VPS + PM2 | MVPs, side projects, up to ~1M req/day | $10–40 | Low | Single point of failure, no built-in HA |
| Fly.io | Global backends, regional latency, 1M–50M req/day | $20–300 | Low-medium | Debugging distributed state, region failovers |
| Vercel / Netlify | Next.js frontends, edge APIs, bursty traffic | $20–500+ | Lowest | Egress fees, cold starts on heavy backends, vendor lock-in |
| Kubernetes | Multi-service platforms, >50M req/day, compliance | $300–3000+ | High | YAML sprawl, platform team overhead, misconfig outages |
Cost floors assume a single environment (production only). Add a staging environment and the numbers roughly double. Add preview environments per PR and the models diverge: Vercel spins one up automatically on every push, while Kubernetes needs a namespace, manifests, and CI glue per preview — that distinction matters more than people realize.
Path 1 — A single VPS with PM2 cluster mode
The most underrated deployment in 2026 is still a single well-provisioned VPS running Node.js under PM2 in cluster mode, fronted by Nginx or Caddy. It's cheap (a $20/month 4-vCPU Hetzner box comfortably serves a few thousand requests per second for a typical API workload), it's fast to debug because everything lives on one machine, and it scales vertically further than most teams ever need.
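For the reverse-proxy layer, Caddy is hard to beat for brevity — it also provisions TLS certificates automatically. A minimal Caddyfile sketch, where the domain and upstream port are placeholders:

```
api.example.com {
    encode gzip
    reverse_proxy localhost:3000
}
```

The equivalent Nginx config works just as well; it's simply longer and makes you manage certificates yourself (or via certbot).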
PM2's cluster mode forks one Node.js process per CPU core and load-balances between them with the built-in cluster module. That turns a single machine into an effective multi-process service with graceful reloads, automatic restarts on crash, and a reasonable CLI for ops. A typical ecosystem config is short, readable, and boring in the best possible way:
```javascript
// ecosystem.config.js — production-ready PM2 config
module.exports = {
  apps: [
    {
      name: "api",
      script: "dist/server.js",
      instances: "max",           // one per CPU core
      exec_mode: "cluster",       // enables round-robin load balancing
      max_memory_restart: "512M", // kill + restart if a worker leaks
      kill_timeout: 5000,         // 5s to drain before SIGKILL
      wait_ready: true,           // wait for process.send('ready')
      listen_timeout: 10000,      // fail fast if ready signal never arrives
      env: {
        NODE_ENV: "production",
        PORT: 3000,
      },
      error_file: "/var/log/api/err.log",
      out_file: "/var/log/api/out.log",
      merge_logs: true,
      time: true,
    },
  ],
};
```

A zero-downtime reload on deploy is then `pm2 reload ecosystem.config.js --update-env`. Inside the app, handle `SIGINT` with a graceful `server.close()` before calling `process.exit(0)`.

The gotchas are boring but real. A single VPS is a single point of failure — one failed disk, one botched apt upgrade, one runaway cron job and the service is down. In-memory state (sessions, rate-limiter counters, local caches) doesn't survive a restart. And if traffic is global, one region adds hundreds of milliseconds of latency for users on the far side of the world.
A VPS + PM2 setup is often the right answer through the first year of a B2B SaaS. The team that migrates too early usually ends up paying Kubernetes costs while serving tens of requests per second — an expensive way to feel grown-up.
Path 2 — Fly.io for global backends
Fly.io sits in the sweet spot between PM2-on-a-VPS and Kubernetes. It's a platform built around Firecracker micro-VMs that ships a Node.js app globally with a short Dockerfile and a flyctl command. Health checks, rolling deploys, regional scaling, private networking, and managed Postgres come included. Pricing starts around $10 per small VM and scales with CPU, memory, and egress.
Fly's killer feature is geographic placement. A Node.js API deployed to three regions (Amsterdam, Newark, Singapore) serves every continent within 150ms first-byte latency without touching CDN configuration. That's meaningful for real-time apps, marketplaces with global users, or anything where p95 latency matters to conversion.
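A minimal fly.toml for that kind of deployment might look like the sketch below — the app name, port, and health-check path are placeholders, and the additional regions are attached with flyctl scaling commands rather than in the file:

```
# fly.toml — sketch of a multi-region Node.js API on Fly.io
app = "my-node-api"      # placeholder app name
primary_region = "ams"   # Amsterdam; ewr (Newark) and sin (Singapore)
                         # are added via flyctl when scaling machine counts

[http_service]
  internal_port = 3000
  force_https = true
  min_machines_running = 1

  [[http_service.checks]]
    method = "GET"
    path = "/healthz"    # assumed health endpoint in the app
    interval = "15s"
    timeout = "2s"
    grace_period = "10s"
```

The health check is what makes rolling deploys safe: Fly won't shift traffic to a new machine until the check passes.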
The tradeoffs: distributed state gets complicated (shared Postgres, Redis, or Fly's LiteFS), debugging is harder than on a single box because logs and traces live across regions, and failovers when one region has an issue still need careful testing. Fly's control plane has had occasional incidents over the years — not more than other providers, but single-region teams don't notice those the way global ones do.
Path 3 — Vercel and the serverless frontier
Vercel (and Netlify, and Cloudflare Workers) dominate Next.js hosting and edge APIs. The experience is exceptional: git push to deploy, preview URL per PR, edge caching handled automatically, analytics baked in. For a marketing site, a docs site, or a Next.js app where 80% of pages are statically generated, the path from zero to production is unmatched.
The failure mode is backend-heavy workloads. Vercel's serverless functions have cold starts, time limits (usually 60–300 seconds depending on plan), and egress pricing that compounds fast. A team running a chat API with large context windows, background jobs, or any long-lived connection will hit a Vercel bill that makes a $40/month Fly.io plan look absurd. The recurring pattern at scale: Pro plan teams hitting $500–2000/month on Vercel for a workload that a single $40/month Fly.io machine would serve comfortably.
The right read is that Vercel is excellent for the front half of a Next.js app and progressively worse for the back half. A common 2026 stack splits the difference: frontend and edge middleware on Vercel, API routes that do real work on Fly.io or a VPS, and background jobs on a worker queue somewhere cheap.
Path 4 — Kubernetes, and when it's actually wrong
Kubernetes is the heavyweight option. Done well, it gives a team horizontal scaling across hundreds of machines, self-healing across availability zones, declarative rollouts with canaries and blue-green, and a consistent model for every service in a large platform. EKS, GKE, and AKS remove most of the control-plane pain; Helm, ArgoCD, and Flux make GitOps tractable; Datadog, Grafana, and the CNCF ecosystem make observability manageable.
Done poorly — which is most of the time it's chosen early — Kubernetes adds a full-time ops role to a 5-engineer team, generates YAML faster than the team can review it, and produces outages from misconfigured Ingress, exhausted resource quotas, or a node pool autoscaler that didn't quite do what the docs implied.
Kubernetes is the wrong choice when the company has fewer than about 15 engineers, serves fewer than roughly 50M requests per day, runs fewer than a dozen microservices, and doesn't have a regulatory reason to need fine-grained platform control. Teams that cross one or two of those thresholds often benefit; teams that cross zero of them almost always regret the choice.
The fair case for Kubernetes is multi-service platforms with a dedicated platform team, regulated workloads where reproducibility and audit trails matter, and hybrid or multi-cloud deployments where vendor abstraction is genuinely valuable. Outside of those, there's usually a simpler answer.
Zero-downtime deploys across the four
Every path above supports zero-downtime deploys; the ceremony differs wildly.
- PM2: pm2 reload restarts workers one at a time, draining the old ones before killing them. It's a single command. Make sure the app responds to SIGINT with a graceful server.close() before calling process.exit().
- Fly.io: rolling deploys are the default. Set a health check, tune max_unavailable, and Fly handles the rest. Blue-green is a flag away with fly deploy --strategy bluegreen.
- Vercel: every deploy is immutable and atomic. Preview URLs per PR, promote-to-production on merge. This is the simplest of the four — there's no downtime concept to manage.
- Kubernetes: rolling updates are built into Deployment objects (maxSurge, maxUnavailable). Canaries need a service mesh or a GitOps tool like Argo Rollouts. The machinery is flexible and the failure modes are real — a botched readiness probe can take production down through a rolling deploy.
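For reference, the rolling-update knobs on the PM2 and Fly paths have a direct Kubernetes analogue on the Deployment spec. A sketch with placeholder names — and the readiness probe is exactly the part that takes production down when it's wrong:

```
# deployment.yaml — sketch of a rolling update with surge/availability bounds
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                  # placeholder
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # at most one extra pod during rollout
      maxUnavailable: 0      # never drop below the desired replica count
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/api:1.2.3   # placeholder
          ports:
            - containerPort: 3000
          readinessProbe:    # a wrong path here stalls or breaks rollouts
            httpGet:
              path: /healthz # assumed health endpoint
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
```

With `maxUnavailable: 0`, Kubernetes only removes an old pod after a new one passes its readiness probe — which is also why a misconfigured probe that always fails can wedge a rollout, and one that always succeeds can route traffic to pods that aren't ready.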
What to actually pick
A short decision tree that's served most of our client projects well:
- Under 100K requests per day, single region, small team — VPS + PM2. Nothing else comes close on cost or debuggability.
- Next.js app, frontend-heavy, bursty traffic — Vercel for the frontend, a separate backend (Fly.io or VPS) for anything that does real work.
- Global users, real-time features, 1M–50M requests per day — Fly.io. Strong default; usually cheaper than the equivalent Kubernetes setup by a factor of three to five.
- Multi-service platform, compliance requirements, dedicated platform team — managed Kubernetes (EKS/GKE). Accept the complexity tax; make sure the business case justifies it.
The best deployment target is the boring one the team can debug at 3am. Clever infrastructure is a liability when the person paged doesn't know the platform well enough to read the error messages. Simplicity is the feature.
Key takeaways
- A single VPS with PM2 cluster mode is the most underrated deployment in 2026 — it carries most SaaS apps through their first year comfortably.
- Fly.io is the right default for global backends at mid-scale. It's where most teams should go when a single VPS starts to feel like a risk.
- Vercel wins for frontends and edge APIs, and loses badly on backend-heavy workloads where egress and compute costs compound.
- Kubernetes is the correct choice for a small set of teams and the wrong choice for most that pick it early. Size the ops capacity honestly before committing.
- Zero-downtime deploys are available on every path. The ceremony ranges from a single command (Vercel, PM2) to a YAML file and a service mesh (Kubernetes).