Run a standalone code review on lilnas changes (correctness, testing, maintainability, project standards) plus conditional reviewers (kieran-typescript, julik-frontend-races, security, performance, api-contract, reliability, equations-security, adversarial, code-simplicity, previous-comments).
skillsy install codemonkey800/lilnas@lilnas-code-review
A pre-push code review skill for lilnas work. Spawns 4 always-on reviewer personas (correctness, testing, maintainability, project-standards) plus conditional reviewers (kieran-typescript, julik-frontend-races, security, performance, api-contract, reliability, equations-security, adversarial, code-simplicity, previous-comments) in parallel, merges findings, and produces an emoji-rich numbered report.
The full markdown report is written to REVIEW.md at the repo root (any prior REVIEW.md is deleted first so reports don't stack); the chat shows only the summary picker and recommendation so you can reply with issue numbers inline.
Read-only by design. No fixes are applied. The user picks issue numbers from the final summary and addresses them manually.
Use this skill for:
- Any apps/** or packages/** diff: NestJS services, Discord bots, Next.js/Vite frontends, shared libraries, Docker compose deploy files, or the LaTeX sandbox in apps/equations/.
- Invocations of /lilnas-code-review, with or without arguments.
<prompt> is free-form natural language. Parse it into a concrete diff scope before dispatching reviewers. Strip recognized tokens and interpret the remainder.
| User says… | Resolved scope | How |
|---|---|---|
| (blank) | Current branch vs detected base | bash references/resolve-base.sh |
| 45897 (a number) or full PR URL | PR's diff after gh pr checkout | Stage 1 PR path |
| my-branch (a local branch) | Branch vs detected base | Stage 1 branch path |
| base:HEAD~3 | Last 3 commits + working tree | direct base: use |
| last 3 commits / last N commits | base:HEAD~N | NLP → base:HEAD~N |
| unstaged changes / working tree | base:HEAD (git diff $BASE covers uncommitted edits) | NLP → base:HEAD |
| staged changes | base:HEAD, with a header note that scope covers staged + unstaged together (git diff $BASE does not isolate --cached) | NLP → base:HEAD |
| commit a1b2c3d | base:a1b2c3d^ (parent of that commit) | NLP → base:<sha>^ |
| between abc and def / abc..def | base:abc | NLP → base:abc |
| the changes I made to <path> | base:HEAD~N where N covers all branch commits touching that path; mention the path filter in the header | git log --oneline -- <path> to count commits |
| anything containing plan:<path> | Use <path> as the plan for requirements verification | NLP → plan:<path> |
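As a rough illustration of the table above, the mechanical part of scope resolution could be sketched like this. The function name `resolve_scope`, the `(kind, value)` return shape, and the GitHub URL pattern are assumptions made for the example; the real parsing is natural-language interpretation by the agent, and the path-filter and plan: rows are left out.

```python
import re

def resolve_scope(prompt):
    """Map a free-form prompt to a (kind, value) pair. Hypothetical helper."""
    text = prompt.strip()
    if not text:
        return ("current-branch", None)          # blank: branch vs detected base
    m = re.search(r"\bbase:(\S+)", text)
    if m:
        return ("base", m.group(1))              # explicit base:<ref>
    # Assumed PR forms: a bare number or a github.com pull URL.
    if re.fullmatch(r"\d+", text) or re.fullmatch(r"https://github\.com/\S+/pull/\d+\S*", text):
        return ("pr", text)
    m = re.search(r"\blast (\d+) commits?\b", text, re.IGNORECASE)
    if m:
        return ("base", f"HEAD~{m.group(1)}")
    if re.search(r"\b(unstaged changes|working tree|staged changes)\b", text, re.IGNORECASE):
        return ("base", "HEAD")                  # git diff $BASE covers both
    m = re.search(r"\bcommit ([0-9a-f]{7,40})\b", text)
    if m:
        return ("base", m.group(1) + "^")        # parent of that commit
    m = re.search(r"([^\s.]+)\.\.([^\s.]+)", text)
    if m:
        return ("base", m.group(1))              # abc..def
    m = re.search(r"\bbetween (\S+) and (\S+)", text, re.IGNORECASE)
    if m:
        return ("base", m.group(1))
    if " " not in text:
        return ("branch", text)                  # single token: treat as a branch
    return ("ask", None)                         # ambiguous: ask the user
```

Anything that falls through to `("ask", None)` would go to the blocking question tool rather than being guessed.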
Rules:
- base: cannot combine with a PR number or branch target. If both appear, stop with: ❌ Cannot use base: with a PR number or branch target — base: implies the current checkout is already the correct branch. Pass base: alone, or pass the target alone and let scope detection resolve the base.
- If <prompt> contains mode:autofix, mode:headless, or mode:report-only, reject with: ❌ /lilnas-code-review is interactive and read-only — no mode flags accepted.
- If the scope is still ambiguous, ask via the platform's blocking question tool (AskUserQuestion in Claude Code, request_user_input in Codex, the platform equivalent elsewhere). Do not dispatch reviewers until scope is resolved.
Pick the path that matches the parsed prompt.
base:<ref> was given
Use the ref directly. Verify the worktree state is sane and capture the diff:
git rev-parse --verify <ref>
BASE=$(git rev-parse <ref>)
echo "BASE:$BASE"
echo "FILES:"
git diff --name-only $BASE
echo "DIFF:"
git diff -U10 $BASE
echo "UNTRACKED:"
git ls-files --others --exclude-standard
PR number or URL was given
Fetch the PR metadata:
gh pr view <number-or-url> --json state,title,body,baseRefName,headRefName,url,files
If state is CLOSED or MERGED, stop with: PR is closed/merged; not reviewing.
Check that the worktree is clean:
git status --porcelain
If non-empty, stop with: You have uncommitted changes. Stash or commit them before reviewing a PR, or run /lilnas-code-review with no argument to review the current branch as-is.
Check out the PR:
gh pr checkout <number-or-url>
Then capture the BASE: / FILES: / DIFF: / UNTRACKED: markers as above.
Branch name was given
Verify the worktree is clean, run git checkout <branch>, then run bash references/resolve-base.sh to determine the base. Capture markers.
No argument was given
Run bash references/resolve-base.sh to detect the base. Capture markers.
After capturing FILES:, check whether 100% of changed files match generated/lockfile patterns:
- pnpm-lock.yaml, package-lock.json, yarn.lock
- **/dist/**, **/build/**, **/.next/**, **/generated/**
- **/__snapshots__/**
- **/.turbo/**
If all files match, print:
📦 Diff is lockfile/snapshot/generated-only — skipping review.
…and stop. There's nothing useful for reviewers to look at.
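The skip check above could be approximated as follows. This is only a sketch: it treats the glob patterns as "any path with one of these directory names as a component" and matches lockfile names at any depth, both of which are simplifying assumptions, and the function names are hypothetical.

```python
from pathlib import PurePosixPath

LOCKFILES = {"pnpm-lock.yaml", "package-lock.json", "yarn.lock"}
GENERATED_DIRS = {"dist", "build", ".next", "generated", "__snapshots__", ".turbo"}

def is_generated(path):
    """True when the path is a lockfile or sits under a generated directory."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    # Only directory components count, so a plain file named "build" is not skipped.
    return parts[-1] in LOCKFILES or any(p in GENERATED_DIRS for p in parts[:-1])

def should_skip_review(files):
    """True only when there is at least one changed file and all of them match."""
    return bool(files) and all(is_generated(f) for f in files)
```

An empty file list returns False here so that an empty diff is not reported as "generated-only"; that choice is an assumption, not something the skill specifies.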
Always inspect UNTRACKED:. If non-empty, tell the user which files are excluded. If any of them should be reviewed, stop and recommend git add + rerun.
Understand what the change is trying to accomplish.
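- PR mode: use the title and body from gh pr view. Supplement with commit messages if the body is sparse.
- Otherwise: run git log --oneline ${BASE}..HEAD to get commit subjects (drop --oneline when you also need the commit bodies, since --oneline prints subjects only).
Compose a 2–3 line intent summary: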
Intent: Add the `tikz` package to the LaTeX whitelist in apps/equations so
TikZ diagrams render via the existing pdflatex sandbox. Must not regress
the secure-exec subprocess isolation or the rate-limit tiers.
Pass this to every reviewer in the spawn prompt.
When intent is ambiguous and you can ask, ask one question via the platform's blocking question tool: "What is the primary goal of these changes?"
Four reviewers are always-on. Read the diff and the file list, then decide which conditional reviewers to add. Selection is agent judgment, not pure keyword matching.
| Persona | Persona file |
|---|---|
| correctness | reviewers/always-on/correctness.md |
| testing | reviewers/always-on/testing.md |
| maintainability | reviewers/always-on/maintainability.md |
| project-standards | reviewers/always-on/project-standards.md |
| Persona | Trigger heuristic |
|---|---|
| kieran-typescript | Any changed file matches *.ts / *.tsx (excluding *.test.ts*, *.spec.ts*, *.d.ts, **/dist/**, **/build/**, **/.next/**, **/__generated__/**) |
| julik-frontend-races | Diff touches .tsx or .jsx under any frontend (apps/portal/, apps/dashcam/, apps/macros/, apps/swole/, or admin/web folders inside hybrid apps like apps/tdr-bot/, apps/download/, apps/yoink/) AND contains any of useEffect, useState with timer/animation setters, useMutation, useSubscription, setTimeout, setInterval, requestAnimationFrame, or hand-written DOM event wiring |
| security | Diff touches NestJS handlers (apps/*/src/**/*.controller.ts, **/*.guard.ts, **/*.middleware.ts, **/*.strategy.ts), apps/yoink/src/auth/**, Zod validation schemas, subprocess invocation (spawn, exec, execFile, child_process), file-system writes outside os.tmpdir(), secrets handling, or introduces user-controlled input on a public route |
| performance | Diff adds new .map / .filter / .reduce chains in controllers or services over request data, removes a useMemo / useCallback / React.memo, adds new render-heavy code paths in any frontend, adds new Drizzle queries (especially inside loops), introduces unbounded LangChain agent loops (apps/tdr-bot/src/messages/llm/**), or adds per-request work in NestJS pipelines |
| api-contract | Diff touches shared package exports (packages/{utils,media,lidarr-client,token-client}/src/index.ts or other **/src/index.ts), NestJS controller route signatures / DTO shapes, Zod schemas exposed externally (request bodies, equations LaTeX input, yoink download types), or Discord command signatures (Necord @SlashCommand arguments) |
| reliability | Diff touches NestJS async handlers, scheduled jobs (@nestjs/schedule @Cron), Discord client event handlers (ClientEvents.*, Necord @On/@Once), Drizzle migrations or queries, Radarr/Sonarr/Lidarr HTTP client retry logic, LangChain workflow nodes, Docker compose deploy.yml / deploy.dev.yml health-checks/restart-policy/depends_on, or cleanup of timers / event listeners / subprocess handles |
| adversarial | ≥50 changed non-test/non-generated lines, OR diff touches the LaTeX sandbox (apps/equations/), OAuth + JWT (apps/yoink/src/auth/), Discord guards/middleware, subprocess spawn anywhere, Docker entrypoints, LangChain tool-calling (apps/tdr-bot/), shared library exports, Drizzle migrations, file-system writes in apps/download/ or apps/equations/, or external API integrations |
| equations-security | Any file under apps/equations/src/**, apps/equations/Dockerfile, or apps/equations/image-magick-policy.xml changes. The equations service has its own SECURITY.md and an explicit defense-in-depth model (Zod validation → secure-exec → restricted TeX → ImageMagick policy → rate limits); any diff here gets a dedicated reviewer |
| code-simplicity | New abstraction introduced (new class / factory / interface), ≥3 new files added in a single feature, single-use component file, OR <prompt> contains simplicity / simplify keywords |
| previous-comments | PR mode (Stage 1 PR path) AND gh pr view --json comments,reviews returns non-empty review activity |
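Selection is agent judgment, but the purely mechanical part of a trigger can still be pictured in code. The sketch below covers only the kieran-typescript row; the function name is hypothetical, and the directory exclusions are approximated by checking path components rather than evaluating the globs literally.

```python
from pathlib import PurePosixPath

EXCLUDED_DIRS = {"dist", "build", ".next", "__generated__"}

def triggers_kieran_typescript(files):
    """True when at least one changed file is reviewable TypeScript source."""
    for f in files:
        p = PurePosixPath(f)
        name = p.name
        if not name.endswith((".ts", ".tsx")):
            continue
        if name.endswith(".d.ts"):
            continue                      # type declarations are excluded
        if ".test.ts" in name or ".spec.ts" in name:
            continue                      # *.test.ts* / *.spec.ts*
        if any(part in EXCLUDED_DIRS for part in p.parts[:-1]):
            continue                      # build output and generated code
        return True
    return False
```

The richer rows (security, reliability, adversarial) depend on reading the diff content, so a file-name check like this would at most serve as a first pass.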
Before spawning, print the roster with one-line trigger justifications for the conditionals:
Review team:
✅ correctness (always)
✅ testing (always)
✅ maintainability (always)
✅ project-standards (always)
🔡 kieran-typescript — 6 .ts/.tsx files changed
🧪 equations-security — edits apps/equations/src/validation/equation.schema.ts
🛡️ security — touches NestJS guard in apps/yoink/src/auth
🤖 adversarial — 84 changed lines, touches subprocess spawn in equations
This is progress reporting, not a blocking confirmation.
Before spawning, find the file paths (not contents) of all relevant standards files for the project-standards persona. Use the platform's glob/file-search tool:
- **/CLAUDE.md, **/AGENTS.md, and **/.agents/common/**.mdc in the repo.
Pass the list to the project-standards persona inside a <standards-paths> block in its review context. The persona reads the files itself.
In lilnas, the canonical root standards file is /Users/jeremyasuncionnetflix.com/dev/lilnas/CLAUDE.md (covers pnpm/Turbo conventions, ESLint flat config, Prettier rules, NestJS/Next.js/Vite/Discord conventions, Docker base images, security guidance for the equations service, and storage/deployment patterns). Individual apps may add their own CLAUDE.md later — discover dynamically.
If no standards files exist anywhere in the repo, still spawn the project-standards persona — its job in that case is to verify the lack of standards is not itself a regression (e.g. someone deleted CLAUDE.md without replacement). Pass an empty <standards-paths> block.
Spawn each selected reviewer as a parallel sub-agent (using Agent with subagent_type: general-purpose in Claude Code, the equivalent sub-agent invocation in Codex, or the platform's parallel-task mechanism elsewhere). All reviewers inherit the session model.
Each spawn prompt is built from references/subagent-prompt-template.md with these substitutions:
- {persona_file}: contents of reviewers/always-on/<name>.md or reviewers/conditional/<name>.md
- {diff_scope_rules}: contents of references/diff-scope.md
- {schema}: contents of references/findings-schema.json
- {intent_summary}: output of Stage 2
- {pr_metadata}: PR title/body/URL when reviewing a PR; empty otherwise
- {file_list}: the FILES: block from Stage 1
- {diff}: the DIFF: block from Stage 1 (git diff -U10 $BASE)
- {reviewer_name}: persona name (e.g. "correctness", "kieran-typescript", "equations-security")
- {trigger_reason}: for conditional reviewers, the heuristic that fired (e.g. "6 .tsx files changed"). Empty for always-on.
- project-standards only: append a <standards-paths> block with the path list from Stage 3b.
Dispatch all reviewers in parallel: one sub-agent invocation per persona, all in a single message where the platform allows it. Reviewers return compact JSON only (no file writes, no run-id artifacts).
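One way to picture the substitution step, assuming the template marks placeholders with single braces exactly as listed above (the actual contents of references/subagent-prompt-template.md are not shown here, and `build_spawn_prompt` is a hypothetical name):

```python
import re

PLACEHOLDERS = {
    "persona_file", "diff_scope_rules", "schema", "intent_summary",
    "pr_metadata", "file_list", "diff", "reviewer_name", "trigger_reason",
}

def build_spawn_prompt(template, values):
    """Fill known {placeholders} in one pass; leave every other brace alone."""
    def substitute(match):
        key = match.group(1)
        if key in PLACEHOLDERS:
            return values.get(key, "")   # missing values render as empty
        return match.group(0)            # unknown token: keep it verbatim
    return re.sub(r"\{(\w+)\}", substitute, template)
```

A single-pass `re.sub` is used instead of `str.format` or chained `replace` calls because diffs and JSON schemas are full of braces; inserted text is never rescanned, so a diff that happens to contain `{reviewer_name}` stays untouched.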
Convert reviewer compact JSON returns into one deduplicated, confidence-gated finding set.
Validate. For each return, check required top-level fields (reviewer, findings, residual_risks, testing_gaps) and per-finding fields (title, severity, file, line, why_it_matters, confidence, evidence, pre_existing). Drop malformed returns or findings. Record the drop count.
- severity ∈ {P0, P1, P2, P3}
- confidence ∈ {0, 25, 50, 75, 100}
- evidence is an array with ≥1 string
- pre_existing, requires_verification (if present) are booleans
Deduplicate. Compute fingerprint: normalize(file) + line_bucket(line, ±3) + normalize(title). When fingerprints match, merge: keep the highest severity, keep the highest confidence, append the contributing reviewer's name to a lenses[] list.
Cross-reviewer agreement promotion. When 2+ independent reviewers flag the same fingerprint, promote confidence one anchor step: 50 → 75, 75 → 100, 100 → 100. Note the agreement in the merged finding's lenses[].
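The deduplication and promotion steps might look roughly like this. Several things here are assumptions for the sake of a runnable sketch: each finding is a dict already tagged with a `reviewer` key, `normalize` lowercases and collapses whitespace, the ±3 bucket is interpreted as "within 3 lines of the first finding in the group", and anchors below 50 step up the same way as the listed ones.

```python
ANCHORS = [0, 25, 50, 75, 100]
SEVERITY_RANK = {"P0": 0, "P1": 1, "P2": 2, "P3": 3}

def normalize(text):
    """Lowercase and collapse whitespace (assumed normalization)."""
    return " ".join(text.lower().split())

def merge_findings(findings, window=3):
    """Merge findings that share file and title within `window` lines."""
    merged = []
    ordered = sorted(
        findings,
        key=lambda f: (normalize(f["file"]), normalize(f["title"]), f["line"]),
    )
    for f in ordered:
        last = merged[-1] if merged else None
        same = (
            last is not None
            and normalize(last["file"]) == normalize(f["file"])
            and normalize(last["title"]) == normalize(f["title"])
            and abs(f["line"] - last["line"]) <= window
        )
        if same:
            # Keep the highest severity (P0 ranks first) and highest confidence.
            last["severity"] = min(last["severity"], f["severity"], key=SEVERITY_RANK.__getitem__)
            last["confidence"] = max(last["confidence"], f["confidence"])
            if f["reviewer"] not in last["lenses"]:
                last["lenses"].append(f["reviewer"])
        else:
            merged.append({**f, "lenses": [f["reviewer"]]})
    for m in merged:
        if len(m["lenses"]) >= 2:
            # Two or more independent reviewers: promote one anchor step, capped at 100.
            idx = ANCHORS.index(m["confidence"])
            m["confidence"] = ANCHORS[min(idx + 1, len(ANCHORS) - 1)]
    return merged
```

A fixed-size bucket (for example `line // 7`) would be simpler, but it can split two findings on adjacent lines into different buckets, which is why this sketch compares line distance instead.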
Separate pre-existing. Pull out findings with pre_existing: true into a separate list. These do not count toward the verdict.
Confidence gate. Suppress remaining findings below anchor 75. Exception: P0 findings at anchor 50+ survive. Record the suppressed count by anchor so it can appear in Coverage if needed.
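The pre-existing split and the confidence gate can be sketched together. The function name and the return shape are hypothetical; the thresholds are the ones stated above (anchor 75 in general, anchor 50 for P0).

```python
def apply_confidence_gate(merged):
    """Return (kept, pre_existing, suppressed_by_anchor)."""
    pre_existing = [f for f in merged if f["pre_existing"]]
    candidates = [f for f in merged if not f["pre_existing"]]
    kept, suppressed = [], {}
    for f in candidates:
        threshold = 50 if f["severity"] == "P0" else 75
        if f["confidence"] >= threshold:
            kept.append(f)
        else:
            # Count suppressed findings per anchor for the Coverage section.
            suppressed[f["confidence"]] = suppressed.get(f["confidence"], 0) + 1
    return kept, pre_existing, suppressed
```

Pre-existing findings bypass the gate entirely in this sketch, since they are reported separately and do not count toward the verdict.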
Sort. Order surviving findings by:
Assign global numbers. Number the sorted list contiguously starting from 1. These numbers drive both the per-finding section headers and the bottom summary picker.
Collect coverage data. Union all reviewers' residual_risks and testing_gaps.
The full markdown report is written to REVIEW.md at the repo root using the templates below. The chat receives only a confirmation line + the summary picker + the recommendation block so the user can reply with issue numbers inline.
- Resolve the repo root: REPO_ROOT=$(git rev-parse --show-toplevel). The report path is ${REPO_ROOT}/REVIEW.md. The skill may be invoked from a subdirectory like apps/equations/, so always anchor on the repo root, never the current working directory.
- Delete any prior report: rm -f "${REPO_ROOT}/REVIEW.md". The -f flag makes this a no-op when the file doesn't exist, so there's no need to check first and no error to handle.
Do the delete immediately before the write (Step 3), not at skill startup, so a mid-run failure leaves the previous report intact rather than wiping it out. If Stage 1 hit the lockfile/generated-only skip path, or the prompt was rejected before Stage 6, do not touch REVIEW.md; leave any prior report in place.
Use the templates below verbatim. Emoji are part of the contract. Concatenate the sections in this order: Header → Severity sections (P0 → P1 → P2 → P3) → Pre-existing (if any) → Summary picker → My recommendation → Coverage (if any).
# 🔍 Lilnas Code Review
📊 **Scope:** <human-readable scope description> (<N files>, ±<add>/<del> lines)
🎯 **Intent:** <intent summary from Stage 2, one or two lines>
👀 **Reviewers (always-on):** ✅ correctness · ✅ testing · ✅ maintainability · ✅ project-standards
🎚️ **Conditional:** <emoji label> · <emoji label> · ... ← only the conditionals that actually fired
🧮 **Findings:** <total> • 🚨 <p0_count> P0 • ⚠️ <p1_count> P1 • 📝 <p2_count> P2 • 💭 <p3_count> P3
Critical rendering rule: Every line in the header above must be separated from the next by a blank line, otherwise the markdown renderer joins them into one wrapped paragraph and the report becomes unreadable. The # 🔍 Lilnas Code Review heading also separates the report visually from any preceding progress text.
If no conditional reviewers fired, omit the 🎚️ Conditional: line entirely. If the total is zero, render the header, skip the severity sections, and jump to a "✨ All clear" block (see "Clean review" below).
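A small sketch of the header rules, assuming the caller passes pre-formatted scope and intent strings and a dict of per-severity counts (the function name and parameters are hypothetical):

```python
def render_header(scope, intent, conditionals, counts):
    """Render the report header with a blank line between every line."""
    total = sum(counts.values())
    lines = [
        "# 🔍 Lilnas Code Review",
        f"📊 **Scope:** {scope}",
        f"🎯 **Intent:** {intent}",
        "👀 **Reviewers (always-on):** ✅ correctness · ✅ testing · ✅ maintainability · ✅ project-standards",
    ]
    if conditionals:
        # Omitted entirely when no conditional reviewer fired.
        lines.append("🎚️ **Conditional:** " + " · ".join(conditionals))
    lines.append(
        f"🧮 **Findings:** {total} • 🚨 {counts['P0']} P0 • ⚠️ {counts['P1']} P1"
        f" • 📝 {counts['P2']} P2 • 💭 {counts['P3']} P3"
    )
    # Joining with a blank line keeps the renderer from merging header lines.
    return "\n\n".join(lines) + "\n"
```

Joining on `"\n\n"` makes the blank-line rule structural, so it cannot be forgotten when a new header line is added.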
For each non-empty severity in order P0 → P1 → P2 → P3, render a level-2 markdown heading, then every finding at that severity:
## 🚨 P0 — Critical (must fix before merge)
Banner labels (always render as exact ## headings, never as decorated text — box-drawing characters like ═══ get interpreted by the markdown renderer as horizontal rules and the emoji+text floats to the line-end, which breaks the report):
- ## 🚨 P0 — Critical (must fix before merge)
- ## ⚠️ P1 — High (should fix)
- ## 📝 P2 — Moderate (fix if straightforward)
- ## 💭 P3 — Low (user's discretion)
### <N>. ❌ <title> — `<file>:<line>`
**🤔 Description**
<why_it_matters from the merged finding — 2–4 sentences>
**🔍 Problem code**
\`\`\`<lang inferred from file extension>
<problem_code snippet — from the merged finding, or extracted from the diff hunk ±10 lines, or read from the file ±5 lines as last resort>
\`\`\`
**💡 Potential solutions**
- Option A — <approach>: <one-line pro/con>
- Option B — <approach>: <one-line pro/con>
- 🎯 **Recommendation:** <which option and why, in one sentence>
OR (when there is only one obvious fix):
**💡 Why this fix:** <one-line justification of the suggested fix>
OR (when no clean fix exists and trade-offs depend on context):
**⚖️ Trade-off:** <describe the tension>. Discuss with author.
**✅ Recommended solution**
\`\`\`<lang>
<suggested_fix snippet>
\`\`\`
**🏷️ Lens:** <lens-1> · <lens-2 if cross-corroborated> · **🎯 Confidence:** <50 | 75 | 100>
---
Critical rendering rule: Every bold sub-label (**🤔 Description**, **🔍 Problem code**, **💡 Potential solutions**, **✅ Recommended solution**) must be followed by a blank line before its content. Without the blank line, the bold label and the next line collapse into one wrapped paragraph and the label looks like inline prose. Inline labels that end with a colon (**💡 Why this fix:** …, **⚖️ Trade-off:** …, **🏷️ Lens:** …) are exempt — they're meant to flow inline.
Notes on populating each block:
- <lang>: infer from file extension. .tsx → tsx, .ts → ts, .jsx → jsx, .js → js, .css → css, .json → json, .md → md, .sh → bash, .yml/.yaml → yaml, .dockerfile/Dockerfile → dockerfile, others → no language.
- problem_code population order: (a) merged finding's problem_code field if present, (b) extract ±10 lines from the in-memory diff hunk where this file:line lives, (c) read the file at line ± 5 as last resort.
- recommended solution population: use the merged finding's suggested_fix if present. If the finding has a fix_options[] array with 2+ entries, render the 💡 Potential solutions block with those entries.
- Lens: list lenses[] separated by ·. If two reviewers corroborated, that's a stronger signal, so display both.
After the P3 section, if pre-existing findings exist, add a separate section that does not count toward the verdict:
## 🗂️ Pre-existing (not introduced by this diff)
N. 📝 [P2] <title> — `<file>:<line>` · 🏷️ <lens>
...
One-line each, no problem-code / recommended-solution blocks. These are FYI only.
Always render the summary, even when there's only one finding:
## 📋 Summary — pick what to address
1. 🚨 [P0] <one-line summary> — `<file>:<line>`
2. ⚠️ [P1] <one-line summary> — `<file>:<line>`
3. ⚠️ [P1] <one-line summary> — `<file>:<line>`
4. 📝 [P2] <one-line summary> — `<file>:<line>`
5. 💭 [P3] <one-line summary> — `<file>:<line>`
The numbers must match the per-finding section numbers above. Render each item flush-left with no leading space — a leading space turns the ordered list into an indented paragraph in stricter renderers.
Apply the rubric in the next section to bucket findings, then render:
## 🎯 My recommendation
🚀 **Fix before merge:** #<n>, #<n>, ... ← omit paragraph entirely if empty
👍 **Worth grabbing while you're here:** #<n>, #<n>, ... ← omit if empty
💤 **Skip unless polishing:** #<n>, #<n>, ... ← omit if empty
⚖️ **Discuss:** #<n> (<one-line context>) ... ← omit if empty
📨 Reply with the issue numbers you want to address (e.g. `1, 2, 4` or `all P0/P1`).
Critical rendering rule: Each bucket line must be separated from the next by a blank line, otherwise they collapse into one wrapped paragraph.
If no bucket has any items, render a "✨ All clear" block instead:
## ✨ All clear — no actionable findings
Nothing surfaced above the confidence gate.
<If pre-existing findings exist: "K pre-existing findings noted above for awareness.">
<If residual_risks or testing_gaps are non-empty: "See Coverage below.">
If residual_risks or testing_gaps are non-empty, render at the very bottom:
## 📊 Coverage
Residual risks the reviewers flagged but didn't promote to findings:
- <risk>
- <risk>
Testing gaps the reviewers flagged:
- <gap>
- <gap>
Suppressed: <N> findings below anchor 75 (P0 at anchor 50+ retained).
After assembling the sections from Step 2, write the full concatenated markdown to ${REPO_ROOT}/REVIEW.md via the platform's file-write tool (Write in Claude Code, equivalent elsewhere). The file is the canonical artifact of the review — the user opens it in an editor to read findings, scroll, and search. This is the only place the full report lives.
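The delete-then-write sequence could be sketched as below. The skill itself uses the platform's file-write tool, so this Python version is only an illustration; `find_repo_root` and `write_review` are hypothetical names.

```python
import subprocess
from pathlib import Path

def find_repo_root():
    """Ask git for the repo root so the path never depends on the cwd."""
    result = subprocess.run(
        ["git", "rev-parse", "--show-toplevel"],
        capture_output=True, text=True, check=True,
    )
    return Path(result.stdout.strip())

def write_review(repo_root, report_markdown):
    """Delete any prior REVIEW.md, then write the new report; return its absolute path."""
    path = Path(repo_root) / "REVIEW.md"
    # Equivalent of rm -f: no error when the file is missing (Python 3.8+).
    path.unlink(missing_ok=True)
    path.write_text(report_markdown, encoding="utf-8")
    return path.resolve()
```

Because the delete happens inside `write_review`, directly before the write, a failure earlier in the run never reaches it and the previous report stays in place.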
Once REVIEW.md is on disk, print to the chat — and only print — three blocks in this exact order:
One-line confirmation, with the absolute path filled in from ${REPO_ROOT}/REVIEW.md:
📝 Wrote review to `REVIEW.md` (<absolute-path-to-REVIEW.md>) — <total> findings: 🚨 <p0_count> P0 · ⚠️ <p1_count> P1 · 📝 <p2_count> P2 · 💭 <p3_count> P3.
The ## 📋 Summary — pick what to address block from Step 2, rendered verbatim with finding numbers that match the per-finding section headers in REVIEW.md.
The ## 🎯 My recommendation block from Step 2, rendered verbatim — including the 📨 Reply with the issue numbers… prompt at the bottom.
Do not print the header, severity sections, per-finding blocks, pre-existing block, or Coverage to the chat — those live only in REVIEW.md. The summary picker and recommendation are the only sections that appear in both places, and their numbering must stay consistent across them.
If Stage 5 produced zero post-gate findings (the "✨ All clear" path):
- Write REVIEW.md containing only the Header block + the ## ✨ All clear — no actionable findings block + the Coverage section if residual_risks or testing_gaps are non-empty. Skip the severity sections, per-finding blocks, summary picker, and recommendation in the file.
- In the chat, print the ## ✨ All clear — no actionable findings block: no picker, no recommendation, since there's nothing to pick.
Bucket each finding into exactly one of:
equations-security finding at confidence ≥ 50 (the LaTeX sandbox has no margin for regression).
⚖️ Trade-off block (no clear recommendation).
A finding lands in exactly one bucket. When two rules could apply, pick the more aggressive bucket (Fix before merge > Worth grabbing > Skip).
Finding numbers refer to the per-finding section headers inside REVIEW.md and the summary picker printed to the chat at the end of Stage 6. Both must stay in sync — that's what makes this stage work.
When the user replies with issue numbers (e.g. 1, 3, 5, all P0, all P0/P1), the skill's job is to record the picks for follow-up, not to apply fixes.
Create one todo per picked finding with the platform's task tool (TaskCreate in Claude Code, the equivalent task primitive elsewhere). Each todo:
Confirm with: 📨 Created N todos for issues <list>. Drive each one through your normal flow.
Do not apply fixes from this skill. The picker is meant to slot the chosen findings into the user's normal work loop, not auto-resolve them.
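Mapping a reply such as `1, 3, 5` or `all P0/P1` back to finding numbers might be sketched like this. It assumes each surviving finding is a dict with `number` and `severity` keys; `parse_picks` is a hypothetical helper, and free-form replies outside these two shapes would still need agent interpretation.

```python
import re

def parse_picks(reply, findings):
    """Return the finding numbers selected by a reply, in report order."""
    text = reply.strip().lower()
    if text.startswith("all"):
        # "all", "all P0", "all P0/P1": filter by the severities mentioned.
        severities = {s.upper() for s in re.findall(r"p[0-3]", text)}
        return [f["number"] for f in findings if not severities or f["severity"] in severities]
    wanted = {int(n) for n in re.findall(r"\d+", text)}
    # Numbers that do not exist in the report are silently dropped.
    return [f["number"] for f in findings if f["number"] in wanted]
```

The returned numbers only identify which findings to turn into todos; nothing in this step touches the code under review.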
Before delivering the report:
- Consistency in REVIEW.md. If you tag a finding as 🚀, the summary must also list it under 🚀. The summary picker numbering in REVIEW.md must match the per-finding section numbering in the same file, and the picker block printed to the chat must use the same numbers as REVIEW.md. Stage 7 relies on this consistency to map reply numbers to findings.
- Rendering. Severity banners must be exact ## H2 headings; never decorate them with ═══, ───, ***, or any box-drawing characters (markdown renderers treat those as horizontal rules and float the emoji+heading text to the line-end, breaking the report). Separate every header/recommendation/coverage line from the next with a blank line; consecutive lines without blank separators collapse into one wrapped paragraph. Bold sub-labels in finding blocks (**🤔 Description**, **🔍 Problem code**, **💡 Potential solutions**, **✅ Recommended solution**) must each be followed by a blank line before their content; inline-colon labels (**💡 Why this fix:** …) are the only exception.
This skill is repo-scoped to lilnas. The canonical files live at <repo>/.claude/skills/lilnas-code-review/. There are currently no Codex or Cursor symlinks; add them if other clients start being used here.
To add a new reviewer persona, drop a file under reviewers/always-on/ or reviewers/conditional/ and update Stage 3 in this file with its trigger heuristic. To add a lilnas-specific lens (e.g. discord-bot-conventions, langchain-graph-correctness, drizzle-migration-safety), create the persona file plus a Stage 3 trigger row and a roster-emoji in the "Announce the team" example.