Thursday, December 11, 2025

ClubHub Engineering Log — CI/CD Migration, Backlog Overhaul & First Prod Deploy

Agile in the real world is rarely about sticky notes and ceremonies. It is about whether we can change direction cheaply. Today on the ClubHub project, we did exactly that: we took a working CI/CD setup on GitLab and Railway and deliberately moved the whole thing to GitHub Actions and Render.

This wasn’t a vanity switch. It was a decision about economics, flow and reducing the trapezoid of torture between “best case” and “worst case” in our deployment pipeline. Fewer moving parts, more automation, and a single path to production.


Starting Point: GitLab + Railway

We began ClubHub with:

  • .gitlab-ci.yml driving builds, tests and deployments
  • Railway handling app containers and Postgres

The upside of this setup was speed. Railway makes it easy to get a service running, and GitLab CI is powerful once you have the pipeline dialled in. The downside was cognitive load: this project already has a lot going on with AI workers, backlogs and orchestration. Adding another platform into the mix was one variable too many.

The more we looked at where we wanted the project to be in six months’ time, the more obvious it became that we needed one opinionated path for CI/CD. For this project, that path is:

  • GitHub as the source of truth
  • GitHub Actions as CI/CD
  • Render as the runtime (app + database)

Migrating CI/CD to GitHub Actions

Today we moved the deployment brain from .gitlab-ci.yml into GitHub Actions. Instead of a large, platform-specific pipeline, we now have a smaller, more focused workflow in .github/workflows/.

The essence of the new pipeline:

name: CI and Deploy

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Node
        uses: actions/setup-node@v4
        with:
          node-version: "20"

      - name: Install dependencies
        run: npm install --legacy-peer-deps

      - name: Run tests
        run: npm test

      - name: Build
        run: npm run build

  deploy-prod:
    needs: build-and-test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4

      - name: Deploy to Render
        run: ./scripts/deploy_to_render.sh
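The contents of scripts/deploy_to_render.sh are not shown above. A minimal sketch, assuming the deploy is triggered through a Render deploy hook exposed to the job as a secret (the name RENDER_DEPLOY_HOOK_URL is our assumption, not necessarily the repo's), might look like:

```shell
#!/usr/bin/env sh
# Sketch of scripts/deploy_to_render.sh, assuming a Render deploy hook.
# RENDER_DEPLOY_HOOK_URL is a hypothetical secret name set in the repo.
set -eu

deploy_to_render() {
  url="${RENDER_DEPLOY_HOOK_URL:?RENDER_DEPLOY_HOOK_URL must be set}"
  # Render deploy hooks accept a plain POST; -f makes curl fail on HTTP errors.
  curl -fsS -X POST "$url"
}

# Skip quietly when the secret is absent (e.g. CI runs on forks).
if [ -n "${RENDER_DEPLOY_HOOK_URL:-}" ]; then
  deploy_to_render
fi
```

Keeping the trigger behind a single function makes it trivial to swap the hook for Render's API later without touching the workflow YAML.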

Small things make a difference. For example, we standardised on:

docker compose

instead of:

docker-compose

This is one of those tiny, boring details that prevents “works on my machine” CI failures. Agility is often found in removing these small points of friction.


Render: Deployment and Database Setup

On the platform side, the application now runs on Render, with a dedicated Postgres instance. One of the key changes today was making database setup part of the deployment contract, not an afterthought.

We added scripts like:

render-setup-db.sh
start-with-db-setup.sh

so that every deployment performs:

  1. Database migrations
  2. Seed data (for initial logins and basic test data)

This fixed a class of “500 on login” problems that were really migration issues in disguise.
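The exact commands inside start-with-db-setup.sh depend on the project's tooling; as a sketch, assuming npm scripts named db:migrate and db:seed exist (both names are our assumption), the shape is:

```shell
#!/usr/bin/env sh
# Sketch of start-with-db-setup.sh. The npm script names (db:migrate,
# db:seed) are assumptions; substitute the project's real commands.
# The real script would end by invoking: start_with_db_setup

run_step() {
  # Log each step so Render's deploy logs show where a failure happened.
  echo "==> $*"
  "$@"
}

start_with_db_setup() {
  run_step npm run db:migrate   # 1. apply pending migrations
  run_step npm run db:seed      # 2. seed initial logins / basic test data
  run_step npm run start        # 3. only then start the app
}
```

The ordering is the point: migrations run before the app boots, so a broken migration fails the deploy loudly instead of surfacing later as a 500 on login.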

A typical Render command sequence now looks like:

Build Command:
  npm install --legacy-peer-deps && npm run build

Start Command:
  ./start-with-db-setup.sh

Render’s Postgres plan was also updated to a supported tier (for example basic-256mb) so that we are not depending on legacy plans that may disappear under our feet.


Backlog as Code: docs/backlog.md and Backlog.md

Parallel to the CI/CD move, we did a structural cleanup on how work is represented. We have had multiple ways of describing tasks: ad-hoc notes, story files, different Markdown formats. Today we declared a winner:

docs/backlog.md is now the single source of truth for task metadata.

This lines up with the Backlog.md tooling and a “backlog as code” mindset:

  • The backlog lives in the repository.
  • It is version controlled.
  • It is validated by scripts and CI.

A typical Task Block now looks like this:

---
id: CLUB-123
title: Migrate prod deploys from GitLab/Railway to GitHub/Render
epic: ci-cd-consolidation
status: in_progress
priority: p1
area: devops
depends_on:
  - CLUB-101
  - CLUB-102
lock: false
---

Refactor CI pipeline to use GitHub Actions.
Deploy ClubHub to Render with automatic DB migrations and smoke tests.

We standardised status and priority values, and we added validation scripts so that broken frontmatter gets caught early. The result is a backlog that both humans and AI workers can rely on without guesswork.
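To give a flavour of what the validation catches, here is a stripped-down check on status: values. The allowed set below is an assumption for illustration, not necessarily the project's exact list, and the function name is ours:

```shell
#!/usr/bin/env sh
# Hypothetical fragment of a backlog validator: reject any task whose
# frontmatter status is outside the standardised set. The allowed values
# below are assumptions for illustration.
validate_statuses() {
  allowed='todo|in_progress|blocked|done'
  bad=$(grep -E '^status:' "$1" | grep -Ev "^status: ($allowed)\$" || true)
  if [ -n "$bad" ]; then
    echo "invalid status lines in $1:" >&2
    echo "$bad" >&2
    return 1
  fi
}
```

Run in CI, a check like this fails the pipeline on broken frontmatter before any human or AI worker picks the task up.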


Abstracting the ai-project-hub

ClubHub is not just an app; it is one of several projects orchestrated through a shared AI infrastructure. Today we leaned into that by treating ai-project-hub as a proper abstraction layer rather than a bag of scripts.

In plain terms, ai-project-hub now provides:

  • A shared structure for all projects
  • Standardised backlog formats and validation
  • Tools and helpers for AI workers

On the ClubHub side we aligned names so everything points at this hub. A simplified version of the hub interface looks like this:

export function createAIProjectHub({ llm, memoryProvider, tools }) {
  return {
    // Run one backlog task end to end: load any prior context for the
    // task, let the LLM execute with the shared tools, then persist the
    // result so future runs can build on it.
    async runTask(task) {
      const context = await memoryProvider.load(task.id);
      const result = await llm.execute({ task, context, tools });
      await memoryProvider.save(task.id, result);
      return result;
    }
  };
}

The benefit is similar to any good Agile practice: one way of doing things, used everywhere, instead of a different custom workflow in every project.


First Production Deployment with Live Smoke Testing

The real milestone today was not moving YAML files around. It was pressing the button (or more accurately, pushing to main) and watching a fully automated pipeline:

  1. Builds and tests the code with GitHub Actions.
  2. Deploys to Render.
  3. Runs database migrations and seed scripts.
  4. Executes live smoke tests against production.

We wired in Playwright-based smoke tests that hit the live URL and assert the basics of the user journey. For example:

const { test, expect } = require('@playwright/test');

test('login page loads', async ({ page }) => {
  await page.goto('https://clubhub-ixbj.onrender.com/login');
  await expect(page.locator('form')).toBeVisible();
});

test('user can attempt login', async ({ page }) => {
  await page.goto('https://clubhub-ixbj.onrender.com/login');
  await page.fill('input[name="email"]', 'test@example.com');
  await page.fill('input[name="password"]', 'password');
  await page.click('button[type="submit"]');
  // Further assertions depend on seeded data and flows
});

The live login URL for this deployment is:

https://clubhub-ixbj.onrender.com/login

To make these tests robust in minimal container images, we fixed a few practical issues:

  • Explicitly installing curl in smoke-test images.
  • Using date commands that work on Alpine/busybox, not just GNU date.
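Both fixes reduce to a couple of small helpers; the function names here are our own sketch, and they assume curl is now present in the smoke-test image:

```shell
#!/usr/bin/env sh
# Sketch of busybox-friendly smoke-test helpers. http_ok and smoke_ts are
# hypothetical names; curl is assumed to be installed in the image.

# Succeeds only when the URL answers with HTTP 200.
http_ok() {
  code=$(curl -fsS -o /dev/null -w '%{http_code}' "$1" 2>/dev/null) || return 1
  [ "$code" = "200" ]
}

# Timestamp format that works on both GNU date and Alpine/busybox date,
# unlike GNU-only flags such as --iso-8601.
smoke_ts() {
  date -u +%Y-%m-%dT%H:%M:%SZ
}
```

In the pipeline these run right after deploy, along the lines of `http_ok "$APP_URL/login" || exit 1`, so a dead endpoint fails the run before Playwright even starts.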

The outcome is simple but powerful: after each deployment, we know whether the system is actually working from a user’s perspective, not just whether the container started.


Why This Matters (Agile, but Real)

From the outside, “moving from GitLab/Railway to GitHub/Actions/Render” might look like platform churn. From the inside, it is about the economics of change:

  • One place for code and reviews (GitHub).
  • One way to build and deploy (Actions → Render).
  • One backlog format that humans and AI can both trust.

In other words: spend late, earn early, and keep the option to turn on a dime. The smaller and more reliable our deployment pipeline, the easier it is to learn from real users and adjust. That is where agility actually pays off.

From Codex CLI to OpenAI API: Building a Smarter AI Worker in 24 Hours How throttling led to a complete rewrite, cost optimization, and a mo...