Airbyte vs n8n vs Fivetran: ETL Pipelines
What You’ll Need
- n8n Cloud or self-hosted n8n
- Hetzner VPS or Contabo VPS for hosting
- Namecheap, if you need a domain
- DigitalOcean as an alternative host
Table of Contents
- The Core Differences
- Airbyte Deep Dive
- n8n for ETL Workflows
- Fivetran’s Enterprise Approach
- Hands-On Comparison
- Getting Started
The Core Differences
I’ve been building data pipelines for five years now, and the “which ETL tool” question keeps coming up. The answer? It depends—but let me break down why these three platforms diverge in ways that matter.
Airbyte is the open-source movement’s answer to expensive proprietary ETL. It’s built for data engineers who want control and transparency. n8n started as workflow automation but evolved into a capable ETL platform with a visual builder that makes complex logic accessible. Fivetran is the enterprise play—fully managed, pre-built connectors, and you pay for convenience.
Here’s the practical reality: if you’re bootstrapped or learning, Airbyte or n8n are your friends. If your company budgeted six figures for data infrastructure, Fivetran’s your bet. But cost alone isn’t the story. Let me walk you through what actually matters.
Airbyte Deep Dive
Airbyte launched in 2020 and quickly became the darling of the self-hosted ETL crowd. Why? It solved a real problem: data connectors are a nightmare to build from scratch.
I deployed Airbyte on a Hetzner VPS last year and was shocked at how straightforward the connector library is. Out of the box, you get 300+ pre-built connectors for everything from Stripe to Salesforce to PostgreSQL.
The architecture is clean: a Docker-based Airbyte server runs on your infrastructure, manages jobs, and stores metadata. You point it at source and destination databases, define sync schedules, and it handles the rest.
Here’s what a minimal Airbyte Docker setup looks like:
git clone https://github.com/airbytehq/airbyte.git
cd airbyte
docker-compose up
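That’s it. On Airbyte versions that still ship the Docker Compose setup, the UI comes up on localhost:8000 and you configure everything there. Newer releases have moved local installs to the abctl CLI, so check the deployment docs for the version you pull. But here’s where it gets interesting for developers: you can define custom connectors in Python.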
Let me show you a custom Python connector skeleton:
import time
from typing import Any, Iterable, Mapping, Optional

import psycopg2
from psycopg2 import sql
from airbyte_cdk.models import (
    AirbyteCatalog,
    AirbyteConnectionStatus,
    AirbyteMessage,
    AirbyteRecordMessage,
    AirbyteStream,
    ConfiguredAirbyteCatalog,
    Status,
    SyncMode,
    Type,
)
from airbyte_cdk.sources import Source


class CustomPostgresConnector(Source):
    @staticmethod
    def _connect(config: Mapping[str, Any]):
        # One place to open a connection; check, discover, and read all use it.
        return psycopg2.connect(
            host=config["host"],
            port=config["port"],
            user=config["user"],
            password=config["password"],
            dbname=config["database"],
        )

    def check(self, logger, config: Mapping[str, Any]) -> AirbyteConnectionStatus:
        # Validate the credentials by opening and closing a connection.
        try:
            self._connect(config).close()
            return AirbyteConnectionStatus(status=Status.SUCCEEDED)
        except psycopg2.Error as e:
            return AirbyteConnectionStatus(status=Status.FAILED, message=str(e))

    def discover(self, logger, config: Mapping[str, Any]) -> AirbyteCatalog:
        # Expose every table in the public schema as one stream.
        conn = self._connect(config)
        try:
            with conn.cursor() as cursor:
                cursor.execute("""
                    SELECT table_name
                    FROM information_schema.tables
                    WHERE table_schema = 'public'
                """)
                table_names = [row[0] for row in cursor.fetchall()]
                streams = []
                for table_name in table_names:
                    # Bind the table name as a parameter instead of formatting it into the SQL.
                    cursor.execute("""
                        SELECT column_name, data_type
                        FROM information_schema.columns
                        WHERE table_schema = 'public' AND table_name = %s
                    """, (table_name,))
                    # Simplification: every column is declared as a string.
                    properties = {col_name: {"type": "string"} for col_name, _ in cursor.fetchall()}
                    streams.append(AirbyteStream(
                        name=table_name,
                        json_schema={"type": "object", "properties": properties},
                        # read() below always reloads the whole table, so only full refresh is declared.
                        supported_sync_modes=[SyncMode.full_refresh],
                    ))
            return AirbyteCatalog(streams=streams)
        finally:
            conn.close()

    def read(self, logger, config: Mapping[str, Any], catalog: ConfiguredAirbyteCatalog,
             state: Optional[Mapping[str, Any]] = None) -> Iterable[AirbyteMessage]:
        conn = self._connect(config)
        try:
            with conn.cursor() as cursor:
                for configured_stream in catalog.streams:
                    name = configured_stream.stream.name
                    # Quote the table name as an identifier rather than interpolating it.
                    cursor.execute(sql.SQL("SELECT * FROM {}").format(sql.Identifier(name)))
                    columns = [desc[0] for desc in cursor.description]
                    for row in cursor:
                        yield AirbyteMessage(
                            type=Type.RECORD,
                            record=AirbyteRecordMessage(
                                stream=name,
                                data=dict(zip(columns, row)),
                                emitted_at=int(time.time() * 1000),
                            ),
                        )
        finally:
            conn.close()
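That’s the core of an Airbyte source, though it isn’t deployable yet. To run it you still need a connector specification that describes the config fields, plus an entrypoint, and you package the whole thing as a Docker image that you add to your Airbyte instance as a custom connector. The benefit? Complete control over how data flows.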
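The skeleton reloads every table on each run, which is what full refresh means. For comparison, here is a minimal sketch of the incremental idea: remember the highest cursor value you have seen and only fetch rows past it. It uses the standard library's sqlite3 as a stand-in for Postgres so it runs anywhere, and `read_incremental`, the `users` table, and the state shape are all hypothetical names for illustration. Airbyte's real state messages carry more structure than this plain dict.

```python
import sqlite3
from typing import Any, Dict, Iterator, Optional, Tuple


def read_incremental(
    conn: sqlite3.Connection,
    table: str,
    cursor_field: str,
    state: Optional[Dict[str, Any]] = None,
) -> Iterator[Tuple[Dict[str, Any], Dict[str, Any]]]:
    """Yield (record, new_state) pairs for rows whose cursor value is past the saved state."""
    last_seen = (state or {}).get(cursor_field)
    # Identifiers cannot be bound as parameters, so this sketch assumes trusted names.
    query = f'SELECT * FROM "{table}"'
    params: tuple = ()
    if last_seen is not None:
        query += f' WHERE "{cursor_field}" > ?'
        params = (last_seen,)
    query += f' ORDER BY "{cursor_field}"'
    cur = conn.execute(query, params)
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        # The state to persist after this record is simply its cursor value.
        yield record, {cursor_field: record[cursor_field]}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com")],
)

state = None
for record, state in read_incremental(conn, "users", "id", state):
    print(record)
print("saved state:", state)
```

Passing the saved state into the next call skips everything already synced, so a later run only pays for the new rows.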
Cost reality: Airbyte is free if self-hosted. They offer managed cloud at $0.50 per sync job, which adds up fast if you run hundreds daily. Storage is separate.
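To put "adds up fast" in numbers, here is a quick back-of-the-envelope sketch. It takes the $0.50 per sync figure quoted above at face value and assumes a 30-day month; `monthly_sync_cost` is just a helper name made up for this example, and storage is left out.

```python
def monthly_sync_cost(syncs_per_day: int, price_per_sync: float = 0.50, days: int = 30) -> float:
    """Rough monthly bill: syncs per day x days in the month x price per sync."""
    return syncs_per_day * days * price_per_sync


for syncs in (10, 100, 500):
    print(f"{syncs} syncs/day -> ${monthly_sync_cost(syncs):,.2f}/month")
```

At 100 syncs a day that works out to $1,500 a month, which is the point where a $10 VPS starts looking very attractive.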
When Airbyte wins: You need ultimate control, have a large engineering team, or need custom logic that off-the-shelf connectors can’t provide.
n8n for ETL Workflows
I built my first production n8n workflow in 2022 and immediately saw why people call it “Zapier but actually yours.” n8n Cloud gives you hosted workflows, but if you care about data sovereignty—or price—self-host it.
The key difference from Airbyte: n8n is workflow-first, not data-movement-first. You can absolutely use it for ETL, but the mental model is “what operations do I want to chain together?” rather than “how do I sync this database?”
That said, n8n has one advantage Airbyte doesn’t: it’s a full integration platform. You can transform, validate, and route data in the same platform.
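If the "chain of operations" mental model feels abstract, it maps onto plain function composition. The sketch below is not n8n code; it's a rough Python analogy in which each function plays the role of one node. The names `transform`, `is_valid`, and `run_pipeline`, the sample rows, and the crude email check are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def transform(record: Record) -> Record:
    """Normalize the email field (trim whitespace, lowercase)."""
    return {**record, "email": str(record.get("email", "")).strip().lower()}


def is_valid(record: Record) -> bool:
    """Deliberately crude validation: the email must contain an '@'."""
    return "@" in str(record["email"])


def run_pipeline(rows: Iterable[Record]) -> Dict[str, List[Record]]:
    """Chain transform -> validate -> route, the way nodes are chained in a workflow."""
    routed: Dict[str, List[Record]] = {"warehouse": [], "rejected": []}
    for row in rows:
        cleaned = transform(row)
        routed["warehouse" if is_valid(cleaned) else "rejected"].append(cleaned)
    return routed


rows = [{"id": 1, "email": " Ada@Example.com "}, {"id": 2, "email": "not-an-email"}]
print(run_pipeline(rows))
```

In n8n each of those steps would be its own node on the canvas, and the routing would be a branch instead of a dictionary key.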
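Here’s a practical ETL workflow as n8n JSON. Treat it as a simplified version of what the UI exports: real exports carry extra fields such as node IDs, and the exact parameter names depend on each node’s version: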
{
  "nodes": [
    {
      "parameters": {
        "operation": "executeQuery",
        "host": "postgres.example.com",
        "port": 5432,
        "user": "{{ $env.DB_USER }}",
        "password": "{{ $env.DB_PASSWORD }}",
        "database": "source_db",
        "ssl": true,
        "query": "SELECT id, email, created_at FROM users WHERE created_at > NOW() - INTERVAL '1 day'"
      },
      "name": "Fetch Recent Users",
      "type": "n8n-nodes-base.postgres",
      "typeVersion": 2,
      "position": [250, 300],
      "credentials": {
        "postgres": "prod_postgres_creds"
      }
    },
    {
      "parameters": {
        "operation": "insert",
        "schema": "public",
        "table": "user_emails",
        "columns": "id,email,created_at",
        "dataToInsert": "={{ $json }}"
      },
      "name": "Insert to Data Warehouse",
      "type": "n8n-nodes-base.postgres",
      "typeVersion": 2,
      "position": [550, 300],
      "credentials": {
        "postgres": "warehouse_postgres_creds"
      }
    },
    {
      "parameters": {
        "method": "POST",
        "url": "https://api.segment.com/v1/batch",
        "authentication": "basicAuth",
        "basicAuth": "segment_auth",
        "sendBody": true,
        "bodyParameters": {
          "parameters": [
            {
              "name": "batch",
              "value": "={{ $json.map(row => ({ userId: row.id, traits: { email: row.email }, timestamp: row.created_at })) }}"
            }
          ]
        }
      },
      "name": "Send to Segment",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 3,
      "position": [850, 300]
    },
    {
      "parameters": {},
      "name": "Start",
      "type": "n8n-nodes-base.start",
      "typeVersion": 1,
      "position": [50, 300]
    }
  ],
  "connections": {
    "Start": {
      "main": [
        [
          { "node": "Fetch Recent Users", "type": "main", "index": 0 }
        ]
      ]
    },
    "Fetch Recent Users": {
      "main": [
        [
          { "node": "Insert to Data Warehouse", "type": "main", "index": 0 }
        ]
      ]
    },
    "Insert to Data Warehouse": {
      "main": [
        [
          { "node": "Send to Segment", "type": "main", "index": 0 }
        ]
      ]
    }
  }
}
Notice what just happened: we fetched from Postgres, inserted into a data warehouse, and pushed to a third-party API—all in one workflow. Airbyte would require two separate connectors and external orchestration to chain them.
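The ordering in that workflow comes entirely from the `connections` object: each key is a node name, and its `main` outputs list the nodes it feeds. Here is a small sketch that walks that structure to recover the chain. The `execution_order` helper is made up for this example, and a breadth-first walk is only an approximation of how n8n schedules nodes; for a straight line like this one the two agree.

```python
from collections import deque
from typing import Dict, List


def execution_order(connections: Dict[str, dict], start: str) -> List[str]:
    """Breadth-first walk of an n8n-style `connections` mapping, beginning at `start`."""
    order: List[str] = []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        # "main" is a list of outputs; each output is a list of links to downstream nodes.
        for output in connections.get(node, {}).get("main", []):
            for link in output:
                target = link["node"]
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return order


connections = {
    "Start": {"main": [[{"node": "Fetch Recent Users", "type": "main", "index": 0}]]},
    "Fetch Recent Users": {"main": [[{"node": "Insert to Data Warehouse", "type": "main", "index": 0}]]},
    "Insert to Data Warehouse": {"main": [[{"node": "Send to Segment", "type": "main", "index": 0}]]},
}

print(" -> ".join(execution_order(connections, "Start")))
```

A walk like this is handy for linting exported workflows, for example to spot nodes that nothing connects to.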
Self-hosting n8n on a Contabo VPS costs you ~$5-10/month for infrastructure, and the software is free. Managed n8n Cloud starts at $25/month.
Fivetran’s Enterprise Approach
Fivetran is the opposite of DIY. You pay Fivetran thousands per month, and they handle the connectors, transformations, monitoring—everything.
The appeal is real: pre-built connectors for 300+ data sources, automatic schema updates, connectors that just work. No Python,