AI | 31 May 2026 | 5 min read

Self-Hosting n8n in Production: A Practical Guide

How to self-host n8n in production: architecture, queue mode, Postgres, security, backups, scaling and the trade-offs versus n8n Cloud

By AI Advisory team

Self-hosting n8n is the right call for most mid-market teams running serious automation volume. You get unlimited executions, full control of credentials and data, and a per-month infrastructure cost that usually lands between £20 and £200 rather than the £500-£2,000+ you would pay on n8n Cloud at equivalent scale. The trade-off is that you own the uptime, the upgrades, the backups, and the security posture.

This guide covers what a production-grade n8n deployment actually looks like: the architecture decisions, the database setup, the queue mode threshold, the security hardening, and the operational habits that keep it running without 2am pages. It assumes you have either run n8n on n8n Cloud or installed it locally with Docker, and you are now deciding whether to take it into production yourself.

When self-hosting beats n8n Cloud

The economics flip quickly. n8n Cloud's Starter plan caps you at 2,500 executions per month for around £20, and the Pro tier sits at £50 for 10,000. Self-hosted n8n is free at the software layer under the Sustainable Use Licence, so the only cost is the VPS or container platform you run it on. A £15/month Hetzner CCX13 (2 vCPU, 8GB RAM) will comfortably handle 100,000+ executions per month for a typical workflow mix.
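As a rough illustration of how quickly the economics flip, the sketch below converts the figures quoted above into a cost per 1,000 executions. The prices and execution counts are the approximate ones from the text, not current list prices, and the self-hosted row assumes the VPS really does absorb 100,000 executions per month.

```python
# Approximate figures quoted in the text: (monthly cost in GBP, executions per month).
# These are illustrative, not current list prices.
PLANS = {
    "n8n Cloud Starter": (20.0, 2_500),
    "n8n Cloud Pro": (50.0, 10_000),
    "Self-hosted VPS": (15.0, 100_000),
}


def cost_per_thousand(monthly_cost_gbp: float, executions: int) -> float:
    """Return the cost in pounds per 1,000 executions."""
    return monthly_cost_gbp / executions * 1_000


for name, (cost, executions) in PLANS.items():
    print(f"{name}: £{cost_per_thousand(cost, executions):.2f} per 1,000 executions")
# n8n Cloud Starter: £8.00 per 1,000 executions
# n8n Cloud Pro: £5.00 per 1,000 executions
# Self-hosted VPS: £0.15 per 1,000 executions
```

The comparison ignores the operational time that self-hosting adds, which is covered later in this guide.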

Beyond cost, self-hosting wins on three dimensions that matter for UK mid-market buyers:

Data residency and GDPR control. Credentials, execution payloads, and HTTP request bodies all sit on infrastructure you control. For workflows touching personal data, this materially simplifies your Article 30 records of processing and removes a sub-processor from your data flow.
Custom nodes and code execution. Self-hosted n8n lets you install community nodes (over 400 available on npm) and run Python in the Code node via the Pyodide runtime, within the set of packages Pyodide supports. On Cloud, community nodes are restricted on lower tiers.
Integration with internal systems. You can place n8n inside your VPC, give it a private IP, and let it talk to internal databases and APIs without exposing them to the internet.

You should stay on n8n Cloud if you are running fewer than 5,000 executions per month, you have no one on the team who is comfortable with Linux and Docker, or your workflows are simple enough that the operational overhead of self-hosting outweighs the savings.

Reference architecture for production n8n

There are three deployment patterns that matter. Pick based on your execution volume and tolerance for downtime.

Single-instance Docker (small production)

One Docker container running n8n in regular mode, backed by an external Postgres database. Suitable for up to roughly 5,000-10,000 executions per day with workflows that complete in under 30 seconds. This is what most mid-market teams should start with. The setup is a single VPS with n8n behind a reverse proxy (Caddy or Nginx), TLS via Let's Encrypt, and Postgres either on the same box or on a managed service like Supabase or AWS RDS.

The hard requirement here is to not use SQLite in production. SQLite is the default for n8n's local Docker setup and it will work until it doesn't. Once you hit any meaningful concurrency, you get database locks, slow execution log queries, and a fragile single file that is awkward to back up safely. Switch to Postgres before you put anything important on it.
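One way to catch an accidental SQLite deployment is a small pre-flight check on the container's environment. The sketch below is a hypothetical helper, not part of n8n; it assumes n8n's documented database variables (DB_TYPE and the DB_POSTGRESDB_* family) and only checks that they are present, not that the database is reachable.

```python
import os

# n8n's Postgres connection variables that this sketch expects to be set.
REQUIRED_POSTGRES_VARS = (
    "DB_POSTGRESDB_HOST",
    "DB_POSTGRESDB_DATABASE",
    "DB_POSTGRESDB_USER",
    "DB_POSTGRESDB_PASSWORD",
)


def database_config_problems(env: dict[str, str]) -> list[str]:
    """Return human-readable problems with the database settings in env."""
    problems = []
    if env.get("DB_TYPE") != "postgresdb":
        problems.append(
            "DB_TYPE is not 'postgresdb' (n8n defaults to SQLite when it is unset)"
        )
    for name in REQUIRED_POSTGRES_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    return problems


# Example: check the current process environment before starting n8n.
for problem in database_config_problems(dict(os.environ)):
    print("WARNING:", problem)
```

Running a check like this in a deploy script makes the SQLite fallback a loud failure rather than something you discover six months later.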

Queue mode with workers (medium-to-large production)

Queue mode separates the n8n main process (the UI and webhook receiver) from worker processes that actually execute workflows. They communicate through a Redis queue. This is the architecture you want once you are running more than 10,000 executions per day, or you have long-running workflows that would otherwise block webhook responses.

The components:

n8n main - serves the editor UI and receives webhook traffic. Pushes execution jobs to Redis.
n8n workers - one or more processes pulling jobs from Redis and running workflows. Scale horizontally.
Redis - the job queue. A small managed Redis (Upstash, Redis Cloud, or self-hosted) is fine.
Postgres - stores workflows, credentials, execution data. Should be managed or properly backed up.
n8n webhook process (optional) - a dedicated process for high-volume webhook ingestion, separating it from the editor.

The official queue mode documentation covers the environment variables. The key ones are EXECUTIONS_MODE=queue, QUEUE_BULL_REDIS_HOST, and QUEUE_HEALTH_CHECK_ACTIVE=true to enable the worker health endpoint.
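To make the variables concrete, here is a minimal sketch that renders the queue mode settings into env-file form. The helper names and the hostname "redis" are assumptions for illustration, and QUEUE_BULL_REDIS_PORT is included alongside the variables named above; check the official documentation for the full list your version supports.

```python
def queue_mode_env(redis_host: str, redis_port: int = 6379) -> dict[str, str]:
    """Build the queue mode variables shared by the main process and workers."""
    return {
        "EXECUTIONS_MODE": "queue",
        "QUEUE_BULL_REDIS_HOST": redis_host,
        "QUEUE_BULL_REDIS_PORT": str(redis_port),
        "QUEUE_HEALTH_CHECK_ACTIVE": "true",
    }


def to_env_file(env: dict[str, str]) -> str:
    """Render variables as KEY=value lines, sorted for stable diffs."""
    return "\n".join(f"{key}={value}" for key, value in sorted(env.items())) + "\n"


# "redis" is a hypothetical service name, e.g. from a Docker Compose file.
print(to_env_file(queue_mode_env("redis")))
```

The main process and every worker need the same values here, plus the same N8N_ENCRYPTION_KEY and database settings, otherwise workers cannot decrypt credentials or read workflows.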

Kubernetes (enterprise scale)

If you are already running Kubernetes, deploy n8n there. A widely used Helm chart is maintained by the community at github.com/8gears/n8n-helm-chart. You get horizontal pod autoscaling on the workers, rolling upgrades, and the same operational model you use for everything else. Most mid-market teams do not need this. If your only Kubernetes workload would be n8n, do not introduce Kubernetes for it.

Database, persistence and backups

Postgres is the only sensible choice. Use Postgres 14 or later. Allocate at least 2GB of RAM to the database, more if you retain execution data for any length of time. n8n's execution log can grow surprisingly fast - a workflow with 20 nodes processing 10,000 items per day will generate gigabytes per week if you keep full execution data.
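A back-of-envelope estimate shows where the gigabytes come from. The sketch below assumes every node stores a copy of every item and that each stored item averages about 2 KB; both are assumptions for illustration, and real payload sizes vary widely.

```python
def weekly_execution_data_gb(
    nodes: int, items_per_day: int, bytes_per_item_per_node: int
) -> float:
    """Estimate the execution data written per week, in gigabytes (10^9 bytes)."""
    bytes_per_day = nodes * items_per_day * bytes_per_item_per_node
    return bytes_per_day * 7 / 1_000_000_000


# 20 nodes, 10,000 items per day, an assumed 2,000 bytes stored per item per node.
print(weekly_execution_data_gb(20, 10_000, 2_000))  # 2.8
```

At that rate a 14-day retention window holds several gigabytes for this one workflow alone, which is why the pruning settings matter.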

Configure execution data retention explicitly:

EXECUTIONS_DATA_PRUNE=true
EXECUTIONS_DATA_MAX_AGE=336 (hours - 14 days is a reasonable default)
EXECUTIONS_DATA_PRUNE_MAX_COUNT=50000 (cap on total stored executions)

For backups, you need three things: nightly logical backups of Postgres (pg_dump), backup of the n8n encryption key (the N8N_ENCRYPTION_KEY environment variable), and a documented restore procedure that you have actually tested. The encryption key is critical - without it, your backed-up credentials are useless because they are encrypted at rest. Store it in a password manager and in your infrastructure-as-code secrets, separately from the database backup.
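The nightly logical backup can be as small as one pg_dump invocation. The sketch below only builds the command; the database name, backup directory, and file naming scheme are hypothetical, and connection details are assumed to come from the standard libpq environment variables (PGHOST, PGUSER, PGPASSWORD).

```python
from datetime import date
from pathlib import Path


def pg_dump_command(database: str, backup_dir: str, day: date) -> list[str]:
    """Build a pg_dump command that writes a dated custom-format archive."""
    target = Path(backup_dir) / f"{database}-{day.isoformat()}.dump"
    return ["pg_dump", "--format=custom", f"--file={target}", database]


# Hypothetical database name and backup directory.
command = pg_dump_command("n8n", "/var/backups/n8n", date.today())
print(" ".join(command))
```

To run it, pass the list to subprocess.run(command, check=True) from a scheduled job. Custom-format archives are restored with pg_restore, and a restore into a scratch database is the test that proves the backup is usable. The encryption key is deliberately not part of this script, because it should be stored separately from the dump.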

If you are using a managed Postgres (Supabase, RDS, DigitalOcean managed databases), point-in-time recovery is usually included and removes most of the backup burden. For self-managed Postgres, use Barman or pgBackRest rather than rolling your own cron jobs.

Security hardening

n8n holds credentials for everything it touches - APIs, databases, email accounts, payment systems. Treat the n8n instance with the same security posture as a production database, not as a dev tool.

The non-negotiables:

Never expose n8n directly to the internet on HTTP. Always behind TLS, always through a reverse proxy that handles certificates. Caddy is the easiest because it does Let's Encrypt automatically.
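Use n8n's built-in user management rather than basic auth. On n8n 1.0 and later, user management is always on and basic auth has been removed; on older releases, set N8N_USER_MANAGEMENT_DISABLED=false. Create proper user accounts with strong passwords and enable 2FA for every account.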
Set a strong N8N_ENCRYPTION_KEY. 32+ random bytes. Generate with openssl rand -hex 32. Do not let n8n auto-generate one - it will, but you will not have a copy.
Restrict editor access by IP if possible. The editor UI does not need to be reachable from the public internet for most teams. Put it behind a VPN, Cloudflare Access, or Tailscale. Keep only the webhook endpoints public.
Disable the public REST API if you are not using it. Set N8N_PUBLIC_API_DISABLED=true.
Audit community nodes before installing. They run with the same privileges as n8n itself. Review the source on npm before adding to production.
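For the encryption key item above, the following sketch is a Python equivalent of openssl rand -hex 32, plus a simple heuristic check that an existing key is at least that long. The function names are illustrative; the check only looks at length and character set and cannot tell whether a key was generated randomly.

```python
import secrets
import string


def generate_encryption_key(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness as hex."""
    return secrets.token_hex(num_bytes)


def looks_strong(key: str, min_bytes: int = 32) -> bool:
    """Heuristic: the key is hex and encodes at least min_bytes bytes."""
    return len(key) >= min_bytes * 2 and all(c in string.hexdigits for c in key)


key = generate_encryption_key()
print(len(key))           # 64 hex characters for 32 bytes
print(looks_strong(key))  # True
```

Generate the key once, store it in your secret manager, and inject it as N8N_ENCRYPTION_KEY on every start. Generating a new key on each deploy would make existing credentials unreadable.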

For UK GDPR compliance, document n8n as a processor in your records of processing activities. The ICO's guide to data security sets the expected standard for technical and organisational measures.

Webhooks, reliability, and timeouts

Webhook reliability is where most self-hosted n8n setups quietly fail. When a webhook is configured to respond only once the workflow finishes, the incoming request blocks for the whole run. If your workflow takes 45 seconds and your reverse proxy has a 30-second timeout, you lose the response and the caller retries, creating duplicates.

Three patterns fix this:

Use the "respond immediately" webhook mode for any workflow that takes more than a few seconds. The webhook returns 200 OK instantly, and the workflow runs asynchronously. If you need to control the response body or send it later in the run, switch the webhook to respond via the Respond to Webhook node instead.
Tune proxy timeouts. If you are using Caddy, set flush_interval -1 and a generous read_timeout. For Nginx, proxy_read_timeout 600s is sensible.
Make workflows idempotent. External systems will retry. Use a deterministic ID from the inbound payload and check for prior processing before doing anything with side effects.
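The idempotency pattern can be sketched as follows. The field name event_id is a hypothetical example of a deterministic ID in the inbound payload, and the in-memory set stands in for durable storage; in production the seen keys would live in something like a Postgres table with a unique constraint.

```python
import hashlib
import json


class IdempotencyGuard:
    """Remember which inbound events have already been processed."""

    def __init__(self) -> None:
        self._seen: set[str] = set()  # stand-in for a durable store

    @staticmethod
    def key_for(payload: dict, id_field: str = "event_id") -> str:
        """Derive a deterministic key from the payload's ID, else its content."""
        if id_field in payload:
            raw = str(payload[id_field])
        else:
            raw = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def first_time(self, payload: dict) -> bool:
        """Return True only the first time a given event is seen."""
        key = self.key_for(payload)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


guard = IdempotencyGuard()
print(guard.first_time({"event_id": "evt_1"}))  # True
print(guard.first_time({"event_id": "evt_1"}))  # False, a retry of the same event
```

The check has to happen before any step with side effects, so a retried delivery exits early instead of sending a second email or creating a second invoice.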

For webhooks that must not be lost - payment notifications, calendar updates, anything with regulatory weight - use queue mode and consider putting a lightweight queue (SQS, a Postgres table, or Redis Streams) in front of n8n. The webhook handler just writes to the queue and returns 200. A separate n8n workflow drains the queue. This decouples ingestion from processing and lets you replay if anything breaks.
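The ingest-then-drain pattern looks roughly like this. The sketch uses Python's built-in sqlite3 module purely as an in-memory stand-in so the example is self-contained; the table name and functions are hypothetical, and in a real deployment the table would live in Postgres (or be replaced by SQS or Redis Streams) as described above.

```python
import json
import sqlite3


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open the queue table (sqlite3 here only as a stand-in for Postgres)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inbound_events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " processed INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def ingest(conn: sqlite3.Connection, payload: dict) -> int:
    """Webhook handler: persist the payload, then return the HTTP status."""
    with conn:  # commits on success
        conn.execute(
            "INSERT INTO inbound_events (payload) VALUES (?)", (json.dumps(payload),)
        )
    return 200


def drain(conn: sqlite3.Connection, handler, batch_size: int = 100) -> int:
    """Process unhandled events in order; return how many were processed."""
    rows = conn.execute(
        "SELECT id, payload FROM inbound_events"
        " WHERE processed = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    done = 0
    for event_id, payload in rows:
        handler(json.loads(payload))  # if this raises, the row stays unprocessed
        with conn:
            conn.execute(
                "UPDATE inbound_events SET processed = 1 WHERE id = ?", (event_id,)
            )
        done += 1
    return done
```

Because an event is only marked as processed after the handler returns, a crash mid-batch leaves the remaining rows in place for the next drain. That also means a handler can run twice for the same event, so it should be idempotent.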

Monitoring, upgrades, and operational hygiene

You need three signals to run n8n confidently in production:

Workflow failure alerts. Use n8n's built-in Error Workflow feature. Create one workflow that handles errors from all others, posting to Slack or PagerDuty with the workflow name, error message, and execution URL. Select it as the error workflow in the settings of each production workflow.
Process health. Monitor the n8n container's CPU, memory, and the queue depth in Redis if you are on queue mode. A growing queue depth means workers cannot keep up.
Database health. Postgres connection count, execution table size, and query latency. Slow queries on the execution table almost always mean you forgot to enable pruning.
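The "growing queue depth" signal can be turned into a simple alert rule. The sketch below assumes you already sample the Redis queue depth at a fixed interval (how the samples are collected is left out) and flags the queue when depth has risen across every one of the most recent samples. The five-sample window is an arbitrary choice for illustration.

```python
def queue_is_growing(depth_samples: list[int], min_samples: int = 5) -> bool:
    """Return True if depth rose between every pair of the most recent samples."""
    recent = depth_samples[-min_samples:]
    if len(recent) < min_samples:
        return False  # not enough data to call it a trend
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


print(queue_is_growing([3, 4, 9, 15, 22]))  # True: workers are falling behind
print(queue_is_growing([3, 9, 2, 8, 1]))    # False: bursts that drain again
```

A strictly increasing rule is deliberately conservative; it ignores spikes that drain on their own and fires only on sustained backlog, which is the case where adding workers helps.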

For upgrades, n8n releases roughly weekly. Do not chase every release. Pin to a specific version tag in your Docker Compose file (never latest), test new versions in a staging environment, and upgrade monthly or when a release fixes something you actually need. Read the changelog for breaking changes - n8n occasionally ships migrations that change behaviour.
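A pinned-tag rule is easy to enforce in CI. The sketch below is a hypothetical check that treats an image reference as unpinned when it has no tag or uses a floating tag; the version number in the example is a placeholder, and the set of floating tags is an assumption you should adjust.

```python
# Tags that move over time; treated as unpinned in this sketch.
FLOATING_TAGS = {"latest", "next"}


def is_pinned(image: str) -> bool:
    """Return True if the image reference carries a fixed, non-floating tag."""
    _name, sep, tag = image.rpartition(":")
    if not sep or "/" in tag:
        # No colon, or the colon belongs to a registry port: there is no tag,
        # and Docker would fall back to "latest".
        return False
    return tag != "" and tag not in FLOATING_TAGS


print(is_pinned("n8nio/n8n:1.64.0"))  # True (1.64.0 is a placeholder version)
print(is_pinned("n8nio/n8n:latest"))  # False
print(is_pinned("n8nio/n8n"))         # False
```

Running a check like this against the image lines in your Compose file stops an unreviewed upgrade from slipping in on the next container restart.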

Keep a staging instance that mirrors production. The cheapest version is a second container on the same VPS pointing at a separate Postgres schema. Use it to test workflow changes and version upgrades before they touch production data.

What this costs to run, realistically

For a typical mid-market deployment running 20-50 active workflows and 20,000-100,000 executions per month:

VPS (Hetzner CCX23 or equivalent): £25-£40/month
Managed Postgres (Supabase Pro or DigitalOcean): £20-£25/month
Redis (Upstash pay-as-you-go for queue mode): £5-£15/month
Backup storage (S3 or B2): £2-£5/month
Monitoring (Better Stack or Grafana Cloud free tier): £0-£20/month

Total: roughly £52-£105/month for infrastructure that would cost £400-£1,500/month on n8n Cloud at equivalent execution volume. The operational time cost is typically 2-4 hours per month for upgrades, log review, and minor incidents - assuming you set the stack up properly the first time.
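The total is just the sum of the low and high ends of each line item. A minimal sketch using the ranges listed above:

```python
# (low, high) monthly cost in GBP for each line item listed above.
COSTS_GBP = {
    "VPS": (25, 40),
    "Managed Postgres": (20, 25),
    "Redis": (5, 15),
    "Backup storage": (2, 5),
    "Monitoring": (0, 20),
}


def total_range(costs: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum the low ends and the high ends of every line item."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


print(total_range(COSTS_GBP))  # (52, 105)
```

Swap in your own provider quotes to see where your deployment falls within the range.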

The most common failure mode is the team that installs n8n on a single small VPS with SQLite, runs it for six months building dozens of workflows, then loses the lot when the disk fills up or the container restarts and the database is corrupted. Spending two days on the architecture before you write your tenth workflow saves the rebuild.

Frequently asked questions

Is self-hosted n8n free to use commercially?

Yes, under the Sustainable Use Licence. You can run n8n self-hosted for internal business use without paying anything to n8n. The licence restricts you from offering n8n itself as a hosted service to third parties (so you cannot resell n8n as your own SaaS), and it restricts certain enterprise features like SSO and advanced RBAC to paid Enterprise licences. For the standard case of a UK business automating its own operations, the free self-hosted version is fully usable in production. Read the licence at docs.n8n.io/sustainable-use-license.

How many executions per month can a single n8n instance handle?

A single n8n instance on a 4 vCPU, 8GB RAM VPS with Postgres can comfortably handle 100,000+ executions per month for typical workflows (API calls, light data transformation, sending emails or messages). The real constraint is rarely raw count - it is concurrency and workflow duration. Ten workflows that each take 30 seconds and fire simultaneously will tax a single instance more than 10,000 fast webhook responses. Move to queue mode when you regularly see executions queueing or response times degrading, not when you hit an arbitrary count threshold.
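Little's law (average concurrency equals arrival rate multiplied by average duration) makes the point about concurrency concrete. The sketch below assumes executions are spread evenly across a 30-day month, which real traffic never is, so treat the result as a floor rather than a forecast.

```python
def average_concurrency(
    executions_per_month: int, avg_duration_seconds: float, days: int = 30
) -> float:
    """Average number of executions in flight, by Little's law (L = rate x duration)."""
    seconds_in_period = days * 24 * 60 * 60
    arrival_rate = executions_per_month / seconds_in_period  # executions per second
    return arrival_rate * avg_duration_seconds


# 100,000 executions per month at an assumed 2 seconds each.
print(round(average_concurrency(100_000, 2.0), 3))  # 0.077
```

An average of well under one concurrent execution is why the monthly count is rarely the limit. Ten 30-second workflows firing together, by contrast, means ten executions in flight for half a minute, and bursts like that are what push you towards queue mode.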

Should I use Docker Compose or Kubernetes for n8n?

Docker Compose for almost everyone. It is simpler, faster to set up, easier to debug, and entirely sufficient for the load that most mid-market teams put on n8n. Use Kubernetes only if you already run a Kubernetes platform and adding n8n to it is no extra operational burden. Introducing Kubernetes solely to host n8n is overkill and adds significant complexity around storage, networking, and upgrades. A single Docker Compose file with n8n, Postgres, Redis, and Caddy can be version-controlled in Git and redeployed in minutes.

How do I handle credentials and secrets safely in n8n?

n8n encrypts credentials at rest using the N8N_ENCRYPTION_KEY. Store this key in your infrastructure secret manager (1Password, AWS Secrets Manager, Doppler) and inject it as an environment variable - never commit it to Git. For credentials themselves, use n8n's external secrets feature on the Enterprise plan, or for the community edition, create dedicated service accounts for each integration with the minimum permissions required. Rotate credentials quarterly. Avoid sharing one credential across many workflows where possible - granular credentials make it easier to revoke access when team members leave or scopes change.

What is the upgrade path from n8n Cloud to self-hosted?

Export your workflows from n8n Cloud (Settings > Workflows > Download), spin up a self-hosted instance, and import them. Credentials do not export with workflows for security reasons - you will need to recreate them on the self-hosted side, which usually takes 30-60 minutes for a typical client setup. Webhook URLs change, so you will need to update any external systems pointing at the old Cloud URLs. Run both in parallel for a week, switch webhook URLs one workflow at a time, and decommission Cloud once everything is stable. Plan two to three days of work for a 30-workflow migration.

Can I run n8n behind Cloudflare?

Yes, and it is a sensible default for production. Cloudflare in front of n8n gives you DDoS protection, a global CDN for the editor UI, and Cloudflare Access for putting the editor behind SSO without modifying n8n. The two configuration points to know: set Cloudflare's SSL mode to "Full (strict)" so the connection between Cloudflare and your origin is also encrypted, and plan around Cloudflare's proxy read timeout (100 seconds by default, adjustable only on Enterprise plans) if you have long-running workflows that respond synchronously. Cloudflare Tunnels also work well if you do not want to expose your VPS to the public internet at all.

How do I monitor and alert on workflow failures?

Build a single "error handler" workflow that posts to Slack, Discord, or your incident tool, including the failed workflow name, the error message, the execution URL, and the timestamp. Select it as the Error Workflow in the settings of each workflow that does not have its own error handler, so failures are always routed somewhere. For higher-stakes workflows, add explicit error branches in the workflow itself with custom retry logic. Outside n8n, monitor the container with Uptime Kuma or Better Stack checking the /healthz endpoint, and monitor Postgres connection count and disk usage. Treat n8n as production infrastructure, not a developer tool.

Do I need a dedicated DevOps engineer to run self-hosted n8n?

No, but you need someone who is comfortable with Docker, Linux, and basic Postgres administration. The initial setup takes one engineer two to three days if they have done similar work before. Ongoing operation is roughly half a day per month - upgrades, log review, occasional troubleshooting. If no one on the team has that skillset, you have two options: pay n8n Cloud and stay on it, or work with an agency to set up the self-hosted stack properly and hand over a documented runbook. The economics of self-hosting only work if the operational time is not a hidden cost no one is accounting for.

Getting from architecture to running system

The difference between an n8n instance that works for six months and one that works for six years is mostly decisions made in the first week: Postgres not SQLite, queue mode when you cross the threshold, the encryption key backed up separately, idempotent webhook design, and an error workflow wired up before the first production workflow ships. None of it is hard. All of it is easy to skip.

If you want help designing or operating a self-hosted n8n stack that survives real production load, AI Advisory builds and runs these systems for UK mid-market clients every week. Have a look at our workflow automation service for what that engagement looks like in practice.