
Markdown Runbooks for AI Agents: A Practical Operator Guide

How to write Markdown runbooks for private AI agents: inputs, allowed actions, stop conditions, approval gates, receipts, and metrics.

Claw Empire · June 18, 2026 · 7 min read

Hermes · #runbooks #operations #approval-gates

A Markdown runbook is a plain-text operating procedure for an AI agent. It tells the agent what workflow it supports, what inputs it may inspect, what actions it may take, when it must stop, how it asks for approval, and what receipt it leaves behind. For small teams, readable runbooks are the difference between useful private AI executive assistants and vague automation experiments.

Write a Markdown runbook when an agent workflow will repeat. Keep it short enough for an owner to review, specific enough for the agent to follow, and strict enough to prevent silent damage. The runbook should define the workflow in business terms before you connect tools or schedule anything.

A good runbook answers seven questions:

1. What outcome should this assistant prepare?
2. What sources may it read?
3. What may it draft or prepare?
4. What may it never do without approval?
5. When must it escalate?
6. What does approval look like?
7. What receipt proves what happened?

Why Markdown works

Markdown is boring in the right way.

It is readable by founders, developers, and agents. It can live in a repo. It can be reviewed in a diff. It can be copied into a project context file or skill. It does not hide the operating rules behind a visual automation canvas or a SaaS prompt box.

For Hermes-led workflows, Markdown also matches how founders and assistants already use durable context: project instructions, AGENTS-style files, skills, and workflow notes. The exact file name matters less than the discipline: rules that affect real work should be inspectable.

The mental model: runbook as contract

Treat the runbook as a contract between the business owner and the assistant.

The owner promises to provide clear inputs, tool boundaries, and approval rules. The assistant promises to operate inside those boundaries, explain its uncertainty, and stop before making any commitment on the owner's behalf.

This is why a runbook should not sound like:

> “Handle incoming leads and keep the CRM updated.”

That instruction hides too much authority.

A better runbook says:

> “Review new inbound lead messages, classify them, draft a reply, suggest a CRM note, and ask the owner before sending or changing deal stage.”

The second version gives the agent room to help without granting it silent control.
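One way to picture the contract is as a small gate that every proposed action passes through. The sketch below is illustrative only: the action names (`classify_lead`, `send_reply`, and so on) are hypothetical labels for the lead-handling example above, not part of Hermes or any real tool API.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    ESCALATE = "escalate"


# Hypothetical action names taken from the lead-handling example.
ALLOWED = {"classify_lead", "draft_reply", "suggest_crm_note"}
NEEDS_APPROVAL = {"send_reply", "change_deal_stage"}


def gate(action: str) -> Decision:
    """Decide what the assistant may do with a proposed action."""
    if action in ALLOWED:
        return Decision.ALLOW
    if action in NEEDS_APPROVAL:
        return Decision.NEEDS_APPROVAL
    # Anything the runbook does not name is outside the contract.
    return Decision.ESCALATE
```

The design choice worth noticing is the default: an action the runbook never mentions is escalated, not allowed.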

A practical runbook template

Use this as a first draft.

# [Workflow Name] Assistant

## Purpose
One sentence describing the business outcome.

## Owner
The human responsible for approvals, edits, and weekly review.

## Inputs
- Source 1
- Source 2
- Source 3

## Allowed actions
- Read approved sources
- Summarize relevant context
- Draft recommended next steps
- Create low-risk internal tasks if approved by the owner

## Actions requiring approval
- Send external messages
- Update customer-facing records
- Change dates, prices, refunds, or commitments
- Publish anything
- Delete, archive, merge, or close records

## Stop conditions
Escalate instead of acting when:
- identity is unclear
- customer is angry
- legal, financial, medical, or HR judgment is involved
- source data conflicts
- the requested action is outside this runbook
- confidence is low

## Approval card format
Show:
- source context
- proposed action
- risk level
- exact output after approval
- alternatives if rejected

## Receipt format
Log:
- time
- sources reviewed
- draft or action proposed
- approval result
- final action taken
- follow-up needed

## Metrics
- drafts accepted
- drafts rejected
- minutes saved
- missed items
- approval escalations
- unsafe recommendations caught

This template is intentionally plain. The point is not literary quality. The point is operational clarity.
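If runbooks live in a repo, a small check can confirm that none of the template's sections has been dropped. This is a minimal sketch, assuming section names are written as `##` headings exactly as in the template above; the function name `missing_sections` is made up for illustration.

```python
import re

# Section names copied from the template; adjust if your runbooks differ.
REQUIRED_SECTIONS = [
    "Purpose",
    "Owner",
    "Inputs",
    "Allowed actions",
    "Actions requiring approval",
    "Stop conditions",
    "Approval card format",
    "Receipt format",
    "Metrics",
]

# Matches level-2 headings only ("## Title"), not "#" or "###".
HEADING = re.compile(r"^##[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(markdown_text: str) -> list[str]:
    """Return required section names that have no matching '##' heading."""
    found = {m.group(1).lower() for m in HEADING.finditer(markdown_text)}
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]
```

A check like this only proves the headings exist. Whether the rules under them are any good still needs a human review.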

Worked example: weekly reporting assistant

A solo founder wants a Friday report without manually checking Stripe, CRM, support inbox, and project tasks.

A vague instruction would be:

> “Create a weekly business report.”

A useful runbook says:

# Weekly Business Reporting Assistant

## Purpose
Prepare a Friday operating report that helps the owner spot revenue, pipeline, support, and delivery risks.

## Inputs
- Stripe dashboard export or approved revenue source
- CRM pipeline view
- support inbox label: unresolved
- project tracker view: due this week and overdue

## Allowed actions
- summarize changes since last Friday
- identify anomalies
- draft owner questions
- suggest follow-up tasks

## Actions requiring approval
- message customers
- change CRM stages
- issue refunds
- edit invoices
- reassign project tasks

## Stop conditions
Escalate when revenue data is missing, a customer complaint involves legal risk, a refund is requested, or source systems disagree.

## Output
1. Revenue snapshot
2. Pipeline movement
3. Support risks
4. Delivery risks
5. Three owner decisions needed
6. Follow-up task suggestions

## Receipt
List sources reviewed, missing sources, anomalies found, and owner decisions requested.

The report assistant is valuable before it has write access. It saves attention first. Later, if the owner trusts the pattern, it may create internal tasks behind approval.
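When the assistant does start proposing internal tasks, each proposal can be shown to the owner as an approval card. The sketch below renders one as plain text, using the five fields from the approval card format in the template. The class and function names are hypothetical, and the example values are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalCard:
    source_context: str
    proposed_action: str
    risk_level: str
    exact_output: str
    alternatives: list[str] = field(default_factory=list)


def render(card: ApprovalCard) -> str:
    """Format the card as plain text the owner can read at a glance."""
    lines = [
        f"Source context: {card.source_context}",
        f"Proposed action: {card.proposed_action}",
        f"Risk level: {card.risk_level}",
        f"Exact output after approval: {card.exact_output}",
        "Alternatives if rejected:",
    ]
    lines += [f"- {alt}" for alt in card.alternatives] or ["- none"]
    return "\n".join(lines)


card = ApprovalCard(
    source_context="Project tracker: 'Client onboarding deck' is 3 days overdue",
    proposed_action="Create internal task 'Review overdue onboarding deck'",
    risk_level="low",
    exact_output="One new task in the internal tracker, assigned to the owner",
    alternatives=["Mention it in next Friday's report only"],
)
print(render(card))
```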

Checklist before a runbook goes live

If the runbook fails this checklist, do not compensate by writing a longer prompt. Tighten the operating design.

Common pitfalls

Pitfall 1: writing policy without examples

Agents follow concrete examples better than abstract values. Include one realistic input and one acceptable output whenever the workflow touches customers.

Pitfall 2: confusing tone with authority

“Be friendly and proactive” is a tone rule, not an action boundary. Authority rules need verbs: send, delete, refund, publish, approve, archive, schedule, change.
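A rough way to tell the two kinds of rule apart is to look for those verbs. The sketch below is a deliberately naive word match, assuming the eight verbs listed above are the ones you care about; it will miss inflected forms such as "sending" and is no substitute for reading the rule.

```python
import re

# The verbs named in the paragraph above.
AUTHORITY_VERBS = {
    "send", "delete", "refund", "publish",
    "approve", "archive", "schedule", "change",
}


def authority_verbs_in(rule: str) -> set[str]:
    """Return the authority verbs that appear as whole words in a rule."""
    words = set(re.findall(r"[a-z]+", rule.lower()))
    return words & AUTHORITY_VERBS
```

A rule that returns an empty set is probably a tone rule. A rule that returns any verb belongs under allowed actions or actions requiring approval.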

Pitfall 3: letting tools define the workflow

Do not start with “we connected Gmail, Slack, and CRM; what can the agent do?” Start with the business loop. Then connect only the tools required.

Pitfall 4: burying stop conditions at the bottom

Stop conditions are not decoration. Put them where both the owner and agent will see them. Review them after every mistake.
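Stop conditions are also easier to review when they are checked in one place. The following sketch mirrors the six stop conditions from the template. The `Situation` fields and the `CONFIDENCE_FLOOR` value are assumptions made for illustration, since the runbook does not say how each signal is measured.

```python
from dataclasses import dataclass

# Assumed cut-off for "confidence is low"; the runbook gives no number.
CONFIDENCE_FLOOR = 0.7


@dataclass
class Situation:
    identity_clear: bool
    customer_angry: bool
    needs_expert_judgment: bool  # legal, financial, medical, or HR
    sources_conflict: bool
    action_in_runbook: bool
    confidence: float


def stop_reasons(s: Situation) -> list[str]:
    """Return every stop condition that fired; an empty list means proceed."""
    checks = [
        (not s.identity_clear, "identity is unclear"),
        (s.customer_angry, "customer is angry"),
        (s.needs_expert_judgment, "legal, financial, medical, or HR judgment is involved"),
        (s.sources_conflict, "source data conflicts"),
        (not s.action_in_runbook, "the requested action is outside this runbook"),
        (s.confidence < CONFIDENCE_FLOOR, "confidence is low"),
    ]
    return [reason for triggered, reason in checks if triggered]
```

Returning all the reasons, not just the first, gives the owner the full picture when the assistant escalates.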

Pitfall 5: no receipt

If the assistant cannot explain what it reviewed and what it changed, you cannot safely improve it. Receipts are the operating memory of the system.
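A receipt can be as simple as one JSON line per run. The sketch below uses the six fields from the receipt format in the template. The class name, field names, and the choice of JSON lines are assumptions for illustration, not a format that Hermes prescribes.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Receipt:
    sources_reviewed: list[str]
    proposed: str          # draft or action proposed
    approval_result: str   # e.g. "approved", "rejected", "escalated"
    final_action: str
    follow_up: str = ""
    time: str = field(default_factory=_now)


def to_log_line(receipt: Receipt) -> str:
    """Serialize a receipt as one JSON line, ready to append to a log file."""
    return json.dumps(asdict(receipt), sort_keys=True)
```

Appending one line per run keeps the log easy to diff and easy to count during the weekly review.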

Metrics for better runbooks

A runbook gets better when it changes based on evidence.

Review weekly:

- Which instructions prevented a bad action?
- Which stop condition fired too often?
- Which draft sections needed repeated edits?
- Which sources were missing or noisy?
- Which approvals were easy?
- Which approvals required the owner to go search for context?
- Which tool permissions were unused and can be removed?

The best metric is not “agent autonomy.” It is owner leverage with controlled risk.
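Some of the weekly numbers can be counted directly from logged receipts. This is a minimal sketch, assuming each receipt is a dict with an `approval_result` key whose value is `"approved"`, `"rejected"`, or `"escalated"`. Those labels are illustrative, so use whatever your own receipts record.

```python
from collections import Counter


def weekly_summary(receipts: list[dict]) -> dict:
    """Count approval outcomes across one week of receipts."""
    counts = Counter(r["approval_result"] for r in receipts)
    decided = counts["approved"] + counts["rejected"]
    return {
        "drafts_accepted": counts["approved"],
        "drafts_rejected": counts["rejected"],
        "approval_escalations": counts["escalated"],
        # None, not 0, when nothing was decided, so an empty week is not read as failure.
        "acceptance_rate": counts["approved"] / decided if decided else None,
    }
```

Counts like these do not answer the review questions on their own, but they show where to look first.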

How this fits Hermes

Hermes can load project context, use skills, remember useful facts, and call tools. A Markdown runbook gives those capabilities a job. It tells the runtime what “good” means for one workflow.

For a small business, that means the first workflow can start as a readable file, run in a Hermes session, and mature into a scheduled or gateway-connected assistant only after the output is useful.

Recap

Markdown runbooks make AI executive assistants legible. They turn fuzzy prompts into operating rules: inputs, allowed actions, approval gates, stop conditions, receipts, and metrics. If a human owner cannot understand the runbook, the business is not ready to trust the assistant.

Next step

For runtime setup, read “Hermes Agent Runtime for Business Workflows.” For the business category, read “What Is a Private AI Executive Assistant?”
