Saturday, November 29, 2025

Turning a One-Off Prompt into a Repeatable Codex Workflow (using Cursor + GPT-5.1 / GPT-5.1-codex)



I went looking for “just a prompt” and accidentally built a tiny AI coworker.

In Cursor, using GPT-5.1 / GPT-5.1-codex, I asked for a stand-alone prompt to help a Codex-style dev improve test coverage for my ClubHub project.

Instead of spitting out a wall of prose, it designed a repeatable workflow: a small Bash launcher that boots Codex into “ClubHub test coverage mode”, wired to project context and a focused coverage brief.

This post walks through that pattern and how you can steal it for your own repo.


The pattern: “Start Codex in project mode for a specific task”

Here’s the core idea the model produced:

  1. Keep your project context in one file
    prompts/system-project-context.md – tech stack, conventions, non-negotiables.

  2. Keep your task brief in another
    e.g. prompts/improve-test-coverage.md – current coverage, targets, files to touch.

  3. Use a tiny script to stitch them together into a single prompt and launch Codex.

The script it generated, codex-test-coverage.sh, does exactly that:

#!/usr/bin/env bash
set -euo pipefail

# Resolve repo root (works from subdirs too)
REPO_ROOT="$(git rev-parse --show-toplevel 2>/dev/null || pwd)"

CONTEXT_FILE="$REPO_ROOT/prompts/system-project-context.md"
COVERAGE_PROMPT="$REPO_ROOT/prompts/improve-test-coverage.md"

It:

  • Finds the repo root (so you can run it from any subdirectory)

  • Points at:

    • a system context file for the project

    • a coverage-focused task file

Then it validates that both exist:

if [ ! -f "$CONTEXT_FILE" ]; then
  echo "❌ Context file not found: $CONTEXT_FILE" >&2
  echo "Create prompts/system-project-context.md first." >&2
  exit 1
fi

if [ ! -f "$COVERAGE_PROMPT" ]; then
  echo "❌ Test coverage prompt not found: $COVERAGE_PROMPT" >&2
  exit 1
fi

And finally, it builds the combined prompt and calls codex with it:

PROMPT="$(cat "$CONTEXT_FILE")"

PROMPT="$PROMPT

---

You are now in the ClubHub project session.

General rules:
- Be brief and opinionated. Prefer 1–2 strong options over long lists.
- Assume: Go (latest stable), PostgreSQL, React + TypeScript + Vite, Tailwind CSS, GitLab CI.
- Respect the multi-tenant architecture: all queries must filter by club_id.
- Keep changes PR-sized and coherent; update or suggest tests when behaviour changes.
- Never hard-code secrets or URLs; use config/env vars.
- For CI, keep .gitlab-ci.yml valid and non-blocking for AI analysis.
- Mobile-first: all UI should work well on phones.

---

$(cat "$COVERAGE_PROMPT")
"

# Launch Codex with the combined prompt
codex "$PROMPT"

That’s it. A one-line CLI invocation:

./codex-test-coverage.sh

…now starts an opinionated, project-aware AI dev session focused purely on test coverage for ClubHub.


The coverage brief: giving Codex something real to chew on

The second piece is the coverage prompt itself: prompts/improve-test-coverage.md.

Instead of “hey, write more tests”, it gives Codex a concrete target:

  • Current status (overall coverage + by package)

  • Goal (95%+ overall, with specific per-package minimums)

  • Priority areas and edge cases

  • Context (tooling, libraries, styles)

  • Success criteria

For example, the top of the file:

Overall Coverage: 90.2% of statements

  • internal/config: 100.0%

  • internal/db: 100.0%

  • internal/http/router: 73.3% (needs improvement)

  • internal/domain/payment: 0.0% (no tests)
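
Numbers like these fall straight out of the Go toolchain. As a minimal sketch, here is one way to flag packages below a per-package minimum from `go test -cover` output — the sample lines just mirror the brief's figures, and in a real repo you would pipe in the live output of `go test ./... -cover` instead:

```shell
# Flag packages below a per-package minimum, given `go test -cover` style output.
# Sample lines below mirror the brief's numbers (illustrative, not live output).
printf '%s\n' \
  'ok  	internal/config	0.01s	coverage: 100.0% of statements' \
  'ok  	internal/http/router	0.02s	coverage: 73.3% of statements' \
  'ok  	internal/domain/payment	0.01s	coverage: 0.0% of statements' |
awk -v min=95 '/coverage:/ {
  pct = $5; sub(/%/, "", pct)          # $5 is the "73.3%" field
  if (pct + 0 < min) print $2 " -> " $5
}'
```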

Then it sets a clear goal:

Improve test coverage to 95%+ overall, focusing on:

  1. Router package (currently 73.3%)

  2. Handlers package (currently 90.9%)

  3. Payment domain (currently 0.0%)

  4. Middleware (currently 93.9%)

And it gets very specific about what to write:

  • Router:

    • SPA fallback behaviour (serving index.html for client routes)

    • Static file edge cases & error handling

    • Protection of /api routes in the static handler

  • Handlers:

    • Error paths in CreateMember

    • Soft-delete scenarios in UpdateMember

    • NULL handling for list endpoints

  • Payment:

    • Validation rules and request validation tests

  • Middleware:

    • Extra auth failure paths

    • Logger edge cases

Finally, it shows Codex where to look:

  • internal/http/router/router.go

  • internal/http/handlers/store.go

  • internal/domain/payment/payment.go

  • internal/http/middleware/auth.go

…and how to measure success:

  • Overall coverage ≥ 95%

  • Router ≥ 85%

  • All tests pass and follow project conventions
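
Criteria like these are also easy to enforce mechanically. A minimal sketch of a coverage gate — the `check_coverage` helper name is mine, and the sample total is taken from the brief; in CI you would feed it the `total:` line from `go tool cover -func=coverage.out`:

```shell
#!/usr/bin/env bash
set -euo pipefail

# check_coverage <total-pct> <min-pct>: exit non-zero below the threshold.
# (Helper name is illustrative, not from the generated script.)
check_coverage() {
  awk -v t="$1" -v min="$2" 'BEGIN { if (t + 0 >= min + 0) exit 0; exit 1 }'
}

# In CI you might extract the total like this:
#   total="$(go tool cover -func=coverage.out | awk '/^total:/ { sub(/%/, "", $3); print $3 }')"
total="90.2"   # sample value from the brief
check_coverage "$total" 95 || echo "coverage ${total}% is below the 95% target"
```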

This turns Codex from “smart autocomplete” into something much closer to a junior engineer working from a ticket.


Why this is better than pasting a giant prompt into the editor

A few things clicked for me as soon as I saw this pattern:

  1. Reproducibility

    Anyone on the team can run the script and get the same project-aware Codex session.
    No more copy-pasting fragile prompts from Notion or Slack.

  2. Single source of truth for project rules

    The “ClubHub mode” rules live in system-project-context.md (not shown here, but referenced by the script).
    When the stack or conventions change, you update one file and all your Codex workflows inherit it.

  3. Focus per script

    codex-test-coverage.sh is just one entry point.
    You can imagine others:

    • codex-api-design.sh

    • codex-ci-hardening.sh

    • codex-frontend-accessibility.sh

    Each one pulls in the same base context, but uses a different task file.

  4. PR-sized output by design

    The script bakes in constraints like:

    • “Keep changes PR-sized and coherent”

    • “Update or suggest tests whenever behaviour changes”

    That language nudges Codex away from huge, repo-wide refactors and towards reviewable chunks.
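
The "one entry point per task" idea can even be collapsed into a single parameterised launcher, where the argument picks the task brief. A sketch under the same prompts/ layout — `codex-task.sh`, `build_prompt`, and `run_task` are hypothetical names of my own, not part of the generated script:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical generic launcher: ./codex-task.sh improve-test-coverage
# Reuses the same project context; the argument selects the task brief.

build_prompt() {  # build_prompt <context-file> <brief-file>
  printf '%s\n\n---\n\n%s\n' "$(cat "$1")" "$(cat "$2")"
}

run_task() {
  local task="${1:?usage: codex-task.sh <task-name>}"
  local root; root="$(git rev-parse --show-toplevel 2>/dev/null || pwd)"
  local context="$root/prompts/system-project-context.md"
  local brief="$root/prompts/$task.md"
  [ -f "$context" ] || { echo "missing $context" >&2; return 1; }
  [ -f "$brief" ]   || { echo "missing $brief" >&2; return 1; }
  codex "$(build_prompt "$context" "$brief")"
}

# run_task "$@"   # uncomment to use as a standalone script
```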


How to adapt this for your own project

If you want to copy this pattern, the steps are small:

  1. Create a project context file

    prompts/system-project-context.md with things like:

    • Tech stack (language, frameworks, CI tool)

    • Architectural rules (e.g. multi-tenant filters, layering)

    • Security constraints (no secrets in code, how config works)

    • Testing style (frameworks, mocking approach, naming conventions)

  2. Create a focused task brief

    For example: prompts/improve-test-coverage.md, following this structure:

    • Current metrics (coverage, failing areas)

    • Specific targets & packages

    • Concrete behaviours and edge cases to test

    • Pointers to existing tests as patterns

    • Clear success criteria

  3. Drop in a launcher script

    Adapt codex-test-coverage.sh:

    • Point CONTEXT_FILE at your project context

    • Point COVERAGE_PROMPT at your task file

    • Tweak the “General rules” block for your preferences

    • Replace codex with whatever CLI your AI workflow uses

  4. Check it into the repo

    Treat these like dev tools, not personal notes.
    That way, the whole team can benefit, and updates are code-reviewed.
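
Put together, the checked-in footprint is small. The layout below is inferred from the paths the script references:

```
your-repo/
├── codex-test-coverage.sh         # launcher (step 3)
└── prompts/
    ├── system-project-context.md  # project context (step 1)
    └── improve-test-coverage.md   # task brief (step 2)
```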


Takeaway

I went asking for a prompt and got handed the skeleton of a Codex operating system for my repo:

  • A project brain (system-project-context.md)

  • A task brief that looks like a real engineering ticket (improve-test-coverage.md)

  • A one-command launcher that wires them into a focused AI dev session (codex-test-coverage.sh)

It’s a small pattern, but it shifts AI from “clever autocomplete inside the editor” to “scriptable coworker that can be put into different modes for different jobs”.

Next up for me: cloning this pattern for CI hardening, performance profiling, and API design reviews — each with their own prompt file and tiny launcher script.

If you’re already using Cursor and Codex (or similar tools), try doing the same:
instead of asking for better prompts, ask the model to design you a repeatable workflow.

