Engineering Notes

Deep dives into interesting problems I've solved. Each section covers the motivation and execution.

01

Claude Code Web: In-Cluster AI Assistant

Motivation
I wanted a coding assistant that could see what's actually happening in my cluster—read logs, check pod status, query traces—without me having to copy-paste context. Running Claude Code in-cluster with MCP servers gives it direct access. A local vLLM instance handles simpler tasks without external API calls.
flowchart LR
    subgraph Browser
        UI[claude.jomcgi.dev]
    end

    subgraph Cluster
        Web[Claude Code Web]
        vLLM[vLLM Server]
        SigNoz[(SigNoz MCP)]
        K8s[(Kubernetes API)]
    end

    subgraph External
        Claude[Claude API]
    end

    UI --> Web
    Web -->|Complex tasks| Claude
    Web -->|Code tasks| vLLM
    Web -->|Logs & Traces| SigNoz
    Web -->|Pod status| K8s

    style UI fill:#ff6b6b,stroke:#ff6b6b,color:#fff
    style Web fill:#ffa502,stroke:#ffa502,color:#fff
    style vLLM fill:#ffd93d,stroke:#ffd93d,color:#000
    style SigNoz fill:#6bcb77,stroke:#6bcb77,color:#fff
    style K8s fill:#4d96ff,stroke:#4d96ff,color:#fff
    style Claude fill:#9b59b6,stroke:#9b59b6,color:#fff
Execution
vLLM In-Cluster
Qwen3-Coder-30B on a 4090. AWQ 4-bit quantization fits the model in 24GB of VRAM with a 24k context window. Fast inference for code completion and simple tasks.
Model Routing
Claude for complex reasoning and architecture decisions. Local vLLM for code generation, refactoring, and directed tasks.
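The routing rule reduces to a simple predicate over the task type. A minimal sketch; the task categories and backend names here are illustrative, not the actual configuration:

```python
# Hypothetical routing sketch: category names and backend labels
# are assumptions, not the real config.
LOCAL_TASKS = {"code-generation", "refactor", "directed-edit"}

def pick_backend(task_type: str) -> str:
    """Route well-scoped tasks to in-cluster vLLM; anything needing
    deeper reasoning goes to the Claude API."""
    if task_type in LOCAL_TASKS:
        return "local-vllm"   # Qwen3-Coder-30B on the 4090
    return "claude-api"       # complex reasoning, architecture decisions
```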
SigNoz MCP
Direct access to logs, traces, and metrics. "Why is this pod crashlooping?" becomes answerable without leaving the conversation.
Cluster Context
Read-only Kubernetes API access. Check deployment status, resource usage, recent events. The assistant knows what's running and what's broken.
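Read-only access like this is typically granted through RBAC. A minimal sketch; the role name and resource list are illustrative only:

```yaml
# Hypothetical ClusterRole: verbs restricted to read-only access.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: claude-code-readonly
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "pods/log", "events", "deployments"]
    verbs: ["get", "list", "watch"]
```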
02

Trips: Camera to Browser

Motivation
I wanted an easy way to share trips with friends and family. A GoPro on the dash captures photos automatically, and my homelab turns them into a live feed they can follow along with, or replay later. Works for anything with GPS-tagged photos.
flowchart LR
    subgraph Capture
        GoPro[GoPro Hero]
        Queue[(SQLite)]
    end

    subgraph Process
        EXIF[EXIF + GPS]
        S3[(SeaweedFS)]
        NATS[(NATS JetStream)]
    end

    subgraph Deliver
        Proxy[imgproxy]
        CDN[Cloudflare CDN]
    end

    subgraph Display
        UI[trips.jomcgi.dev]
    end

    GoPro -->|27MP| Queue
    Queue --> EXIF
    EXIF -->|Images| S3
    EXIF -->|Events| NATS
    S3 --> Proxy
    Proxy --> CDN
    CDN --> UI
    NATS -->|WebSocket| UI

    style GoPro fill:#ff6b6b,stroke:#ff6b6b,color:#fff
    style Queue fill:#ffa502,stroke:#ffa502,color:#fff
    style EXIF fill:#ffd93d,stroke:#ffd93d,color:#000
    style S3 fill:#6bcb77,stroke:#6bcb77,color:#fff
    style NATS fill:#4d96ff,stroke:#4d96ff,color:#fff
    style Proxy fill:#9b59b6,stroke:#9b59b6,color:#fff
    style CDN fill:#e056fd,stroke:#e056fd,color:#fff
    style UI fill:#ff6b6b,stroke:#ff6b6b,color:#fff
View Live at trips.jomcgi.dev →
A. Capture

The GoPro shoots on interval while driving. A Python controller manages the camera over WiFi, handling connection drops and queueing downloads for later.

Async Camera
Python asyncio controller. GPS-triggered capture at configurable intervals. 27MP RAW with JPG fallback when storage is tight.
SQLite Queue
Persistent download queue survives restarts. Exponential backoff on WiFi drops. Resume exactly where we left off.
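A minimal sketch of such a persistent queue, assuming an illustrative schema rather than the controller's actual code:

```python
import sqlite3

# Hypothetical queue table: column names are assumptions.
def open_queue(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS downloads (
        filename TEXT PRIMARY KEY,
        attempts INTEGER NOT NULL DEFAULT 0,
        done INTEGER NOT NULL DEFAULT 0)""")
    return db

def next_delay(attempts, base=1.0, cap=60.0):
    # Exponential backoff: 1s, 2s, 4s, ... capped at 60s.
    return min(cap, base * (2 ** attempts))

def record_failure(db, filename):
    db.execute("UPDATE downloads SET attempts = attempts + 1 "
               "WHERE filename = ?", (filename,))

def pending(db):
    # Rows persist in SQLite, so a restart resumes exactly here.
    return [r[0] for r in db.execute(
        "SELECT filename FROM downloads WHERE done = 0 ORDER BY filename")]
```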
EXIF Extraction
Camera settings preserved: ISO, aperture, shutter speed, focal length. GPS coordinates embedded. Deterministic UUIDs derived from content hash.
B. Event Store

Trip points are events in NATS JetStream. The API replays the stream on startup to rebuild state. No database needed—just an append-only log.

Stream Replay
On startup, ephemeral consumer replays entire stream. Rebuilds in-memory cache from event history. ~200ms for 10k events.
Live Subscribe
After replay, durable consumer subscribes to new events. Cache stays current. Multiple API pods can subscribe without conflicts.
Tombstones
Deletions are events too. Tombstone message marks a point as deleted. Replay respects tombstones. No orphaned data.
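Replay with tombstones reduces to a small fold over the log. A sketch with illustrative event shapes, not the real schema:

```python
# Rebuild in-memory state from an append-only event log.
# Event field names ("point.added", "point.deleted") are assumptions.
def rebuild(events):
    points = {}
    for ev in events:
        if ev["type"] == "point.added":
            points[ev["id"]] = ev["data"]
        elif ev["type"] == "point.deleted":   # tombstone
            points.pop(ev["id"], None)        # replay respects deletions
    return points
```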
C. Delivery

GoPro images are 27MB each. imgproxy generates thumbnails and display sizes on-the-fly. Cloudflare CDN caches everything at the edge—most requests never hit my homelab.

Deterministic Keys
UUID v5 from namespace + content hash. Same image always gets same key. Idempotent uploads. No duplicates ever.
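The key derivation fits in a few lines. A sketch; the namespace below is illustrative, not the pipeline's real one:

```python
import hashlib
import uuid

# Illustrative namespace; the actual one is private to the pipeline.
TRIPS_NS = uuid.uuid5(uuid.NAMESPACE_DNS, "trips.jomcgi.dev")

def image_key(content: bytes) -> str:
    # UUID v5 over the content hash: same bytes, same key, every time,
    # so re-uploads are idempotent and duplicates never occur.
    digest = hashlib.sha256(content).hexdigest()
    return str(uuid.uuid5(TRIPS_NS, digest))
```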
imgproxy
On-the-fly resizing from SeaweedFS. /thumb/* → 300px, /display/* → 1920x1080. WebP/AVIF based on Accept header.
Cloudflare CDN
Immutable cache-control headers. Content-addressed keys mean cache invalidation is never needed. Edge-cached globally.
D. Display

The web interface at trips.jomcgi.dev shows the route on a map, photos by day, elevation profiles, and trip statistics. WebSocket connection for live updates during active trips.

MapLibre
Vector tiles with terrain hillshade. Route colored by day with offset calculations for overlapping paths. Smooth animations between points.
Live Updates
WebSocket broadcasts new points as they arrive. Viewer count tracking. Follow along in real-time during active trips.
Day-by-Day
Distance sparklines, elevation profiles, photo galleries per day. Rainbow route coloring shows progression through the journey.
03

Sextant: Type-Safe Operator State Machines

Motivation
Kubernetes operators are state machines, but we write them as imperative reconciliation loops. Every operator I wrote had the same bugs: invalid state transitions, forgotten error handling, missing metrics. I wanted to define the state machine declaratively and generate the boilerplate.
flowchart LR
    YAML[YAML Schema]
    Parse[Parse & Validate]
    Gen[Code Generator]
    Types[types.go]
    Trans[transitions.go]
    Metrics[metrics.go]
    Status[status.go]

    YAML --> Parse --> Gen
    Gen --> Types
    Gen --> Trans
    Gen --> Metrics
    Gen --> Status

    style YAML fill:#ff6b6b,stroke:#ff6b6b,color:#fff
    style Parse fill:#ffa502,stroke:#ffa502,color:#fff
    style Gen fill:#ffd93d,stroke:#ffd93d,color:#000
    style Types fill:#6bcb77,stroke:#6bcb77,color:#fff
    style Trans fill:#4d96ff,stroke:#4d96ff,color:#fff
    style Metrics fill:#9b59b6,stroke:#9b59b6,color:#fff
    style Status fill:#e056fd,stroke:#e056fd,color:#fff
Execution
Compile-Time Safety
Each state is a Go struct. Transitions return the next state type. Try to go from Pending to Ready without passing through Creating? Compiler error.
Forced Idempotency
Transition methods require request IDs. You can't transition without the ID, which forces you to call the external API first.
Guard Conditions
Go expressions embedded in YAML, evaluated at transition time. Invalid expressions fail at compile time, not runtime.
Generated Metrics
Prometheus counters, histograms, gauges. state_duration_seconds for SLOs. Automatic cleanup on resource deletion.
states:
  - name: Pending
    initial: true
  - name: Creating
    fields:
      requestID: string
  - name: Ready
    terminal: true

transitions:
  - from: Pending
    to: Creating
    action: StartCreation
    params:
      - requestID: string
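A schema like the one above generates code along these lines. Sextant emits Go; the shape is sketched here in Python with type hints so the transition-typing idea is runnable:

```python
from dataclasses import dataclass

# Python analogue of the generated code: each state is its own type,
# and a transition method's return type is the next state. Skipping
# Creating on the way to Ready fails type-checking.
@dataclass
class Ready:
    pass                      # terminal: no outgoing transitions

@dataclass
class Creating:
    request_id: str           # field declared on the Creating state

    def complete(self) -> Ready:
        return Ready()

@dataclass
class Pending:
    def start_creation(self, request_id: str) -> Creating:
        # The required request_id forces the external API call first.
        return Creating(request_id=request_id)
```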
04

Cloudflare Operator

Motivation
Every new service meant clicking through the Cloudflare dashboard: create DNS record, create Zero Trust application, update tunnel config. I wanted to annotate a Deployment and have everything provisioned automatically. Zero Trust ingress without the toil.
flowchart LR
    Deploy[Deployment]
    Watch[Operator Watch]
    DNS[DNS Record]
    ZT[Zero Trust App]
    Config[Tunnel Config]
    CF[(Cloudflare)]

    Deploy -->|Annotations| Watch
    Watch --> DNS
    Watch --> ZT
    Watch --> Config
    DNS --> CF
    ZT --> CF
    Config --> CF

    style Deploy fill:#ff6b6b,stroke:#ff6b6b,color:#fff
    style Watch fill:#ffa502,stroke:#ffa502,color:#fff
    style DNS fill:#ffd93d,stroke:#ffd93d,color:#000
    style ZT fill:#6bcb77,stroke:#6bcb77,color:#fff
    style Config fill:#4d96ff,stroke:#4d96ff,color:#fff
    style CF fill:#9b59b6,stroke:#9b59b6,color:#fff
Execution
Annotation-Driven
cloudflare.ingress.hostname and cloudflare.zero-trust.policy annotations trigger reconciliation. No CRDs to manage.
State Machine
Built with Sextant. States: Pending → CreatingDNS → CreatingZTApp → UpdatingConfig → Ready. Each step idempotent.
Finalizers
Cleanup on Deployment deletion. DNS records, ZT apps, and tunnel routes removed. No orphaned Cloudflare resources.
Drift Detection
Periodic reconciliation detects manual Cloudflare changes. Operator is source of truth. Dashboard edits get reverted.
metadata:
  annotations:
    cloudflare.ingress.hostname: myapp.jomcgi.dev
    cloudflare.zero-trust.policy: joe-only
05

Stargazer: Dark Sky Location Finder

Motivation
Finding good stargazing spots requires combining multiple data sources: light pollution maps, road access, elevation for horizon clearance, and weather forecasts. I built a pipeline that scores locations based on all these factors and updates continuously.
flowchart TB
    subgraph Acquire
        LP[Light Pollution Atlas]
        Roads[OSM Road Network]
        Elev[SRTM Elevation]
        Weather[MET Norway API]
    end

    subgraph Process
        Dark[Dark Region Extract]
        Buffer[Road Buffering]
        Zones[Zone Classification]
    end

    subgraph Score
        Cloud[Cloud Cover]
        Humid[Humidity]
        Wind[Wind Speed]
        Final[Final Score]
    end

    LP --> Dark
    Roads --> Buffer
    Elev --> Zones
    Dark --> Final
    Buffer --> Final
    Zones --> Final
    Weather --> Cloud --> Final
    Weather --> Humid --> Final
    Weather --> Wind --> Final

    style LP fill:#ff6b6b,stroke:#ff6b6b,color:#fff
    style Roads fill:#ffa502,stroke:#ffa502,color:#fff
    style Elev fill:#ffd93d,stroke:#ffd93d,color:#000
    style Weather fill:#6bcb77,stroke:#6bcb77,color:#fff
    style Dark fill:#4d96ff,stroke:#4d96ff,color:#fff
    style Buffer fill:#9b59b6,stroke:#9b59b6,color:#fff
    style Zones fill:#e056fd,stroke:#e056fd,color:#fff
    style Cloud fill:#ff6b6b,stroke:#ff6b6b,color:#fff
    style Humid fill:#ffa502,stroke:#ffa502,color:#fff
    style Wind fill:#ffd93d,stroke:#ffd93d,color:#000
    style Final fill:#6bcb77,stroke:#6bcb77,color:#fff
Execution
16-Task DAG
Parallel acquisition of light pollution atlas, OSM roads, SRTM elevation, and MET Norway weather. Tasks scheduled by dependency.
Spatial Analysis
Dark region extraction from light pollution raster. Road network buffering for accessibility. Zone classification by sky quality.
Weather Scoring
Cloud cover, humidity, fog probability, wind speed, dew point. Configurable weights. Updated hourly from forecast API.
Final Score
Composite score 0-100. Factors: darkness (40%), accessibility (20%), horizon (15%), weather (25%). Filterable by threshold.
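The composite reduces to a weighted sum. A sketch using the stated weights, assuming each factor arrives pre-normalized to 0-1:

```python
# Weights from the factors above; factor normalization to 0-1
# is an assumption of this sketch.
WEIGHTS = {"darkness": 0.40, "accessibility": 0.20,
           "horizon": 0.15, "weather": 0.25}

def site_score(factors: dict) -> float:
    """Weighted sum scaled to a 0-100 score."""
    return round(100 * sum(WEIGHTS[k] * factors[k] for k in WEIGHTS), 1)
```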
06

Bazel: One Way to Build Everything

Motivation
I got tired of using different build commands for every project. I wanted one system that works the same everywhere—laptop, CI, Claude Code in the cluster. Everything's vendored, so there's nothing to install beyond Bazel itself.
flowchart LR
    subgraph Anywhere
        Laptop[Laptop]
        CI[CI]
        Claude[Claude Code]
    end

    Fmt[format]

    subgraph Build
        Code[Formatters]
        Helm[Manifests]
        OCI[Images]
    end

    Cache[(BuildBuddy)]

    subgraph Output
        Git[Git]
        Reg[Registry]
    end

    Laptop --> Fmt
    CI --> Fmt
    Claude --> Fmt
    Fmt --> Code
    Fmt --> Helm
    Fmt --> OCI
    Code --> Cache
    Helm --> Cache
    OCI --> Cache
    Cache --> Git
    Cache --> Reg

    style Laptop fill:#ff6b6b,stroke:#ff6b6b,color:#fff
    style CI fill:#ffa502,stroke:#ffa502,color:#fff
    style Claude fill:#ffd93d,stroke:#ffd93d,color:#000
    style Fmt fill:#6bcb77,stroke:#6bcb77,color:#fff
    style Code fill:#4d96ff,stroke:#4d96ff,color:#fff
    style Helm fill:#4d96ff,stroke:#4d96ff,color:#fff
    style OCI fill:#4d96ff,stroke:#4d96ff,color:#fff
    style Cache fill:#9b59b6,stroke:#9b59b6,color:#fff
    style Git fill:#e056fd,stroke:#e056fd,color:#fff
    style Reg fill:#e056fd,stroke:#e056fd,color:#fff
Execution
format
One command for formatters, manifests, lock files. Runs in parallel, seconds if nothing changed.
Custom Rules
Starlark rules for Go, Python, and APKO images. A few lines per service in, multi-platform containers out.
Vendored Tools
Helm, crane, ruff, shellcheck—pinned in one lock file. Nothing to install, works anywhere Bazel runs.
BuildBuddy
Remote cache so unchanged code doesn't rebuild. 80 cores on free tier, CI under a minute.

GitOps Manifests

Helm charts render through Bazel. Output goes to the source tree so Git tracks it. PR diffs show exactly what's changing in the cluster.

Cached
Each chart is a genrule. Unchanged charts skip. 20+ services render in parallel.
In Git
Manifests committed, not generated at deploy. I review what's going to the cluster before it goes.

Multi-Platform Images

arm64 on my laptop, amd64 in CI. Same rules build both and push a multi-platform index.

APKO
Alpine images from YAML. Lock files pin versions. Small, fast.
One Target
Define once, Bazel handles platform transitions and index creation.