Speed benchmark · protocol · sha256_chain_v1 · frozen

Carrier™ vs Node.js, Python, Go, .NET, and Spring Boot.

Six small API services — one per language/framework — exposing the same three endpoints. An aiohttp client bombards each with 64 in-flight requests for 30 seconds, records latency and throughput, and samples the server process's resident-set memory every 200 ms. All stacks run natively on the same host; no containers, no networking asymmetry. The CPU kernel is frozen as sha256_chain_v1 so published comparisons stay apples-to-apples across runs.


TL;DR

Compiled-language throughput, a fraction of the memory.

Across all three endpoints, Carrier lands in the top tier for throughput and tail latency. Its peak memory is the smallest of the six stacks on every endpoint: 13 to 15 MB, against 28 to 38 MB for Go, the next smallest.

GET /items · Postgres

19,808

rps

Carrier · 2× Node, 7× Python

GET /items · p99

6.29

ms

Lowest p95 and max of the six stacks; p99 within 0.04 ms of Go (6.25 ms)

GET /compute · crypto

24,980

rps

1000-iter SHA-256 chain · 10× Node

Peak RSS · all endpoints

15 MB

Carrier · Spring: 621 MB · .NET: 221 MB

What's being measured

Three endpoints, three different questions

Each endpoint isolates a different axis so the final picture isn't a single number.

GET /health

Pure HTTP framework + JSON serialization.

Everything clusters here — frameworks are all plenty fast on a trivial path.

GET /compute?n=1000

Language/runtime CPU speed. 1000 iterations of SHA-256(hex(prev)).

Native-crypto stacks (Carrier · Go · .NET · Spring) pull ahead. Node/Python bind to OpenSSL via FFI — the ~1000 boundary crossings per request dominate.

GET /items?limit=50

Real-world CRUD path. SELECT against Postgres (10k rows, pool=10).

The endpoint Carrier is designed for. DB cost should dominate, but framework/runtime overhead still differentiates the tail.

Six stacks under test

native processes · Postgres pool = 10

Stack	Language / runtime	Framework	DB driver
Carrier™	Carrier™ → Rust	built-in	sqlx
Go	Go 1.22+	Gin	pgx/v5
.NET	C# / .NET 10	ASP.NET Core Minimal	Npgsql
Spring	Java 21	Spring Boot 3.3	JDBC + HikariCP
Node	Node.js 22	Fastify	pg
Python	Python 3.12	FastAPI + Uvicorn	asyncpg

Throughput

Requests per second across three endpoints.

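Higher is better. Carrier clusters with the compiled-language tier (Go, .NET, Spring) on all three endpoints. It pulls ahead on the CPU-bound /compute path (24,980 rps against 16,293 for Spring, the next fastest) and is second to Go on the Postgres-backed /items path (19,808 against 21,259).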

throughput.png · rendered from client/results · 60s warm-up + 30s window

throughput chart across carrier, node, python, go, dotnet, and spring

Requests per second · higher is better

Stack	GET /health	GET /compute	GET /items
Carrier™	23,254	24,980	19,808
Go	23,634	15,798	21,259
.NET	21,027	14,062	18,377
Spring	21,921	16,293	13,559
Node	22,923	2,532	10,863
Python	18,361	2,627	2,805
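
The headline multiples quoted in the TL;DR follow directly from this table. A minimal sketch of the arithmetic, using only the figures above; the `RPS` dictionary and the `speedup` helper are illustrative names, not part of the benchmark client:

```python
# Requests per second for the stacks being compared, copied from the
# throughput table above.
RPS = {
    "carrier": {"health": 23254, "compute": 24980, "items": 19808},
    "node":    {"health": 22923, "compute": 2532,  "items": 10863},
    "python":  {"health": 18361, "compute": 2627,  "items": 2805},
}


def speedup(endpoint: str, stack: str, over: str) -> float:
    """Throughput of `stack` divided by throughput of `over`, one decimal."""
    return round(RPS[stack][endpoint] / RPS[over][endpoint], 1)


print(speedup("items", "carrier", "node"))     # 1.8
print(speedup("items", "carrier", "python"))   # 7.1
print(speedup("compute", "carrier", "node"))   # 9.9
```

The rounded card values (2×, 7×, 10×) are these ratios to the nearest whole number.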

Latency

p50 / p95 / p99 latency per endpoint.

Lower is better. Carrier's 99th-percentile latency on /items is 6.29 ms, within 0.04 ms of Go's 6.25 ms, and its p95 (3.62 ms) and max (18.31 ms) are the lowest of the six stacks. The tail stays tight because the compiled Axum handler has no extra framework layer between it and sqlx.

latency.png · rendered from client/results · 60s warm-up + 30s window

latency chart across carrier, node, python, go, dotnet, and spring

GET /health · latency percentiles · milliseconds

lower is better

Stack	p50	p95	p99	max
Carrier™	2.71	2.79	5.50	13.71
Go	2.65	2.86	5.72	14.65
.NET	2.89	3.93	6.50	14.79
Spring	2.83	3.03	6.22	15.58
Node	2.75	2.84	5.70	10.76
Python	3.42	3.61	6.20	120.34

GET /compute · latency percentiles · milliseconds

lower is better

Stack	p50	p95	p99	max
Carrier™	2.49	2.85	5.28	22.00
Go	4.20	9.83	14.53	123.70
.NET	3.78	7.28	9.73	28.49
Spring	3.58	5.37	9.30	36.00
Node	24.76	30.99	49.79	74.61
Python	24.36	25.09	26.69	47.06

GET /items · latency percentiles · milliseconds

lower is better

Stack	p50	p95	p99	max
Carrier™	3.15	3.62	6.29	18.31
Go	2.87	3.70	6.25	30.02
.NET	3.33	4.19	7.00	24.67
Spring	4.14	7.95	11.87	46.60
Node	5.83	6.39	8.31	19.38
Python	22.67	41.72	76.57	290.85
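
The p50 / p95 / p99 columns above are summaries of the raw per-request latencies. The page does not say which percentile method bench.py uses, so the following is only a sketch assuming the nearest-rank definition; the `percentile` function name is hypothetical:

```python
import math


def percentile(samples_ms: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it. This method is an assumption; the
    benchmark's own choice is not documented on this page."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    # Multiply before dividing so whole-number ranks stay exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


latencies = [2.9, 3.1, 3.0, 3.4, 18.3]  # made-up sample, milliseconds
summary = {f"p{p}": percentile(latencies, p) for p in (50, 95, 99)}
```

With only a handful of samples the upper percentiles collapse onto the maximum, which is why a 30-second window with thousands of requests per second is needed before p99 and max separate.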

Memory

Peak resident-set size sampled every 200 ms.

Lower is better. The compiled Rust binary's footprint stays under 16 MB at peak. Managed-runtime stacks pay a fixed overhead before they serve a single request.

memory.png · rendered from client/results · 60s warm-up + 30s window

memory chart across carrier, node, python, go, dotnet, and spring

Resident set size · MB · sampled @ 200 ms

lower is better

Stack	Idle (MB)	Peak @ /health (MB)	Peak @ /compute (MB)	Peak @ /items (MB)
Carrier™	10	13	14	15
Go	19	28	37	38
.NET	99	209	215	221
Spring	218	356	621	334
Node	73	107	237	277
Python	52	52	52	54

Why this matters

Peak RSS is the memory your orchestrator has to budget per replica. A compiled Carrier binary keeps that budget flat across the whole endpoint mix: 10 MB idle and 13 to 15 MB at peak. Spring Boot ranges from 218 MB idle to a 621 MB peak on /compute, so its budget depends on which endpoint is under load.

What's excluded

Postgres's own memory, OS page cache, and any external dependency (Redis, message brokers) are out of frame. This is the server process tree only, measured via ps -o rss= against each stack's PID.

Methodology

How the numbers above were produced. No containers, no special tuning.

All six APIs run as native host processes on the same machine and talk to the same native Postgres. The bench client discovers each server PID via its listening socket and samples memory directly.

Client

Python aiohttp driver with 64 in-flight requests, 8s warm-up discarded, 30s measurement window per (API, endpoint).
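
The shape of such a driver can be sketched with the standard library alone. This is not the code in bench.py: `run_window` and its `send` parameter are hypothetical names, and the real client presumably wraps an aiohttp request where this sketch takes any awaitable callable.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def run_window(
    send: Callable[[], Awaitable[None]],
    concurrency: int = 64,
    warmup_s: float = 8.0,
    window_s: float = 30.0,
) -> dict:
    """Keep `concurrency` requests in flight; record only requests that
    started after the warm-up period ended."""
    latencies_ms: list[float] = []
    start = time.perf_counter()
    measure_from = start + warmup_s
    stop_at = measure_from + window_s

    async def worker() -> None:
        # Each worker issues requests back to back, so `concurrency`
        # workers hold that many requests in flight.
        while True:
            began = time.perf_counter()
            if began >= stop_at:
                return
            await send()
            if began >= measure_from:  # warm-up samples are discarded
                latencies_ms.append((time.perf_counter() - began) * 1000.0)

    await asyncio.gather(*(worker() for _ in range(concurrency)))
    return {
        "requests": len(latencies_ms),
        "rps": len(latencies_ms) / window_s,
        "latencies_ms": latencies_ms,
    }
```

A closed-loop design like this caps in-flight requests at the worker count, so measured throughput and latency are coupled: a slower server receives fewer requests rather than a growing queue.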

Memory

Peak and mean RSS sampled every 200 ms via ps -o rss=. The bench finds the PID with lsof -ti :{port} -sTCP:LISTEN — the actual server, not a Docker userspace proxy.
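
A rough sketch of that sampling loop, assuming `ps` reports RSS in kilobytes (as it does on macOS and Linux) and that the first PID printed by `lsof` is the server. The function names are illustrative and are not taken from bench.py.

```python
import subprocess
import time


def find_listener_pid(port: int) -> int:
    """PID of the process listening on `port`, via the lsof call quoted above."""
    out = subprocess.run(
        ["lsof", "-ti", f":{port}", "-sTCP:LISTEN"],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.split()[0])  # assumption: first listed PID is the server


def read_rss_kb(pid: int) -> int:
    """Current resident-set size of `pid` in kilobytes, via ps -o rss=."""
    out = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())


def summarize(samples_kb: list[int]) -> dict:
    """Peak and mean of the samples, converted from KB to MB (1 MB = 1024 KB)."""
    return {
        "peak_mb": max(samples_kb) / 1024,
        "mean_mb": sum(samples_kb) / len(samples_kb) / 1024,
    }


def sample_rss(pid: int, duration_s: float, interval_s: float = 0.2) -> dict:
    """Poll RSS every `interval_s` seconds for `duration_s` seconds."""
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        samples.append(read_rss_kb(pid))
        time.sleep(interval_s)
    return summarize(samples)
```

Polling at 200 ms can miss allocation spikes shorter than the interval, so the reported peak is a lower bound on the true peak.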

Database

Postgres 14+ on 127.0.0.1. Each stack uses a pool of 10 so six × 10 = 60 fits under the default max_connections=100.

Runtime uniformity

All six APIs run host-native (Mach-O / arm64). No Docker, no vpnkit, no userspace proxy. Every stack takes the same loopback path to Postgres.

/compute · sha256_chain_v1

Frozen protocol id: sha256_chain_v1. The chain starts from sha256_hex("carrier") and iterates n=1000 times replacing the hex with sha256_hex(hex). All six APIs produce the same final hex — correctness is verified.
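
A Python reading of that description is sketched below. Two details are assumptions rather than statements from the source: that the hex digest is lowercase and re-hashed as UTF-8 text, and that the seed hash is not counted as one of the n iterations. The function names are illustrative.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 of the UTF-8 bytes of `text`, as a lowercase hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sha256_chain(n: int = 1000, seed: str = "carrier") -> str:
    """Start from sha256_hex(seed), then re-hash the hex string n times.

    Assumption: the initial seed hash is step zero, so n=1000 performs
    1001 hashes in total. The frozen protocol may count differently.
    """
    digest = sha256_hex(seed)
    for _ in range(n):
        digest = sha256_hex(digest)
    return digest


final_hex = sha256_chain(1000)
```

Because each step hashes the 64-character hex string rather than the raw 32-byte digest, every implementation has to hex-encode on each iteration, which keeps the per-iteration work comparable across languages.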

Why these frameworks

Each language's modern default, not the fastest niche library. Spring Boot over Micronaut, Fastify over node:http, FastAPI over Flask, Minimal APIs over MVC. Otherwise the comparison rewards micro-routers rather than what production actually runs.

Honest

What the numbers don't say

A single-host benchmark is always relative, never absolute. These caveats are copied verbatim from speed-benchmark/README.md.

Single host

Everything runs on one laptop. Absolute throughput moves with background load. Results are comparable within a single run, not across days.

RSS only

Server process tree memory. Does NOT include Postgres, OS page cache, or any external services (Redis, brokers, etc).

Spring JVM baseline

218–356 MB for HotSpot JVM + Spring is normal. That's what running a Spring Boot service actually costs — it's not a benchmark anomaly.

Small table

items has 10k rows. Production scale would shift the /items numbers — DB cost rises for everyone, but index and planner effects don't show up at this size.

One worker each

No node cluster, no multi-process Uvicorn, no multiple Spring instances. Horizontal scaling would move these ratios.

Crypto FFI cost

Node and Python bind to OpenSSL — OpenSSL is fast, but ~1000 FFI crossings per /compute request dominate. That's the real bottleneck, not hash speed.

Reproduce

Every step in one place

The full benchmark directory is committed — the APIs, the client, the raw JSON, and the rendered charts.

1 · Seed Postgres with 10k rows (zsh)

$ cd speed-benchmark
$ psql -h 127.0.0.1 -U postgres -c "CREATE DATABASE speed_bench;"
$ psql -h 127.0.0.1 -U postgres -d speed_bench -f db/init.sql

2 · Build each API binary (zsh)

$ (cd carrier-api && carrier build)
$ (cd go-api      && go build -o go-api main.go)
$ (cd dotnet-api  && dotnet build -c Release --nologo)
$ (cd spring-api  && mvn -q -DskipTests package)
$ (cd node-api    && npm install --omit=dev)
$ (cd python-api  && python3 -m venv .venv && .venv/bin/pip install -r requirements.txt)

3 · Start all six & run the client (zsh)

# start each API on its own port (see README)
$ ./carrier-api/.carrier/build/carrier-bench &
$ node node-api/server.js &
$ ./go-api/go-api &
$ …
$ cd client && python3 -m venv .venv && source .venv/bin/activate
$ pip install -r requirements.txt
$ python bench.py --warmup 8
$ python render.py   # charts + BENCHMARK.md

Artifacts

Raw per-run JSON lives in client/results/. Charts are PNGs in client/results/charts/. The markdown report BENCHMARK.md is generated by render.py.

Where the source is

The full benchmark lives at speed-benchmark/ in the carrier repository. Every API is a minimal, idiomatic implementation of the same three endpoints.

speed-benchmark/BENCHMARK.md

Numbers are nice. Seeing the code is better.

The /items endpoint above is seven lines of Carrier. No router boilerplate, no serializer setup, no DB glue — the compiler lowers it into the Rust service binary the benchmark actually measured.
