A writing-voice guide for your AI assistants
I stopped giving my AI assistants rules and started giving them reading material. The shift is from instruction to artifact. The em-dash regex is the catchy part. Loading files as books-the-system-reads is the durable one.
I stopped giving my AI assistants style guides. I started giving them examples of my own writing and a single rule: if it sounds like a consultant, rewrite it.
The shift is from instruction to artifact. A style guide is a list of things the model has been told. An artifact is a thing the model has read. The first one rarely transfers. The second one transfers reliably, because prediction is what the model is built to do, and what you put in front of it shapes what it predicts.
This post is about the small set of signals I use to keep the artifact honest, and the regex I run before anything ships.
# The em-dash signal
Em dashes (—) are the most reliable formatting-level AI tell in written English. They feel natural to a system that has never typed one but has read a billion of them. Every editorial style guide ever published has been ingested with em dashes in place. Every literary novel. Every New Yorker piece. To a model, the em dash is what writing looks like.
Native human writers, especially engineers writing fast in chat windows and pull-request descriptions, reach for either a tight dash (word-) glued to the preceding word, or a double dash (--) with spaces on both sides. Watch your thumbs the next time you type something quickly. That’s the test.
The signal is Bayesian. An em dash in a draft raises the probability the text is AI-generated. It doesn’t confirm it. Plenty of human writers -- particularly people who learned typography from books rather than chat clients -- use em dashes deliberately. The regex below works for my drafts because my own writing doesn’t use them, so any em dash that shows up is drift. The test calibrates to a baseline you have to know.
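The Bayesian framing can be made concrete with a small calculation. This is only a sketch: the prior and both em-dash rates below are made-up placeholders, not measurements, and `posterior_ai` is a hypothetical helper name. What it shows is how much the same signal weakens when the human baseline already includes em dashes.

```python
def posterior_ai(prior_ai: float, p_dash_given_ai: float, p_dash_given_human: float) -> float:
    """Bayes' rule: P(AI | em dash seen) from a prior and two likelihoods."""
    evidence = p_dash_given_ai * prior_ai + p_dash_given_human * (1.0 - prior_ai)
    return p_dash_given_ai * prior_ai / evidence


# All numbers are illustrative assumptions, not measured rates.
# A writer whose own drafts almost never contain an em dash:
low_baseline = posterior_ai(prior_ai=0.2, p_dash_given_ai=0.6, p_dash_given_human=0.02)

# A writer who uses em dashes on purpose, so the signal carries far less weight:
high_baseline = posterior_ai(prior_ai=0.2, p_dash_given_ai=0.6, p_dash_given_human=0.5)

print(f"low-baseline writer:  {low_baseline:.2f}")   # 0.88
print(f"high-baseline writer: {high_baseline:.2f}")  # 0.23
```

With these placeholder inputs, one em dash moves a 0.20 prior to roughly 0.88 for the first writer and barely past 0.23 for the second. The evidence is identical and the conclusions differ, which is why the baseline has to be known.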
I have a one-line check that runs before anything goes public:
grep -RIn "—" src/
If the result is non-empty, something AI-generated has drifted in. The check also catches my own drift on the rare occasions a Mac autocorrect substitutes an em dash for a double dash. The rule applies to me too, which is the point.
Em dashes feel natural to a system that has never typed one but has read a billion of them. Native writers, especially engineers, reach for word- or -- instead. One regex catches the difference.
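For a publish script that is not shell-based, the same check might look like the sketch below. It approximates the grep command rather than replacing it: `find_em_dashes` and `main` are hypothetical names, and skipping files that fail to decode as UTF-8 is only a rough stand-in for grep's `-I` flag.

```python
from pathlib import Path

EM_DASH = "\u2014"  # the em dash character


def find_em_dashes(root: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for every line under root that contains an em dash."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if EM_DASH in line:
                hits.append((str(path), number, line))
    return hits


def main(root: str = "src") -> int:
    """Print hits in path:line:text form; return 1 if any were found, else 0."""
    found = find_em_dashes(root)
    for path, number, line in found:
        print(f"{path}:{number}:{line}")
    return 1 if found else 0
```

Calling `sys.exit(main())` from a pre-publish step would stop the publish when a hit appears. That is the reverse of grep's own exit status, which is 0 when a match is found, so the two are not drop-in replacements for each other.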
# Words I flag (not words I ban)
A short list of words that overuse has hollowed out:
leverage, robust, comprehensive, seamless, synergy, holistic, utilize, facilitate, learnings, bandwidth, circle back, stakeholders, deliverables, ecosystem, journey, value add, paradigm shift, innovative, transformative, disrupt, thought leadership, best-in-class, world-class, cutting-edge, next-generation, turnkey, empower, ideate, solution.
Zero of these appear in my writing. “Leverage our existing infrastructure” means nothing. “Reuse the read replica we stood up last quarter” means something. The list isn’t a banned-word list. It’s a flag list. When I brief an AI assistant on a draft, I don’t say “don’t use these.” I share the list and ask the assistant to highlight any of them that appear in the draft. Every highlighted word is a place where the assistant defaulted to corporate abstraction instead of specificity.
That posture matters. Forbidding words pushes the model to find synonyms. Flagging words asks it to write more specifically. The first produces laundered slop. The second produces sharper sentences.
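The flagging step can also be run locally before the draft reaches an assistant. The sketch below is one possible shape, not the workflow described above: `flag_words` is a hypothetical helper, the list is abbreviated, and matching is exact, so inflected forms such as "leveraged" are not caught.

```python
import re

# A few entries from the flag list; extend with the rest as needed.
FLAG_WORDS = ("leverage", "robust", "seamless", "utilize", "circle back", "best-in-class")


def flag_words(draft: str, words: tuple[str, ...] = FLAG_WORDS) -> list[tuple[int, str]]:
    """Return (line number, matched text) for each flagged word or phrase in the draft."""
    # Longest entries first so a phrase wins over any shorter overlapping entry.
    alternatives = "|".join(re.escape(w) for w in sorted(words, key=len, reverse=True))
    # Lookarounds keep matches on whole words, including hyphenated entries.
    pattern = re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)
    hits = []
    for number, line in enumerate(draft.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((number, match.group(0)))
    return hits


draft = "Leverage our existing infrastructure.\nReuse the read replica we stood up last quarter."
for number, word in flag_words(draft):
    print(f"line {number}: {word}")  # prints "line 1: Leverage"
```

The function reports and never rewrites. Each hit marks a sentence to make more specific by hand, which keeps the flag-not-ban posture intact.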
# The four tests
I run drafts through four quality tests before they ship.
- Teammate Test. Would a smart teammate find it clear, or roll their eyes? Write for the eye-roll.
- CTO Test. Would a serious CTO find it impressive, or embarrassing? Serious CTOs read source code and inspect network tabs. They know when something was written to sound impressive rather than to be clear.
- Action Test. Could a reader act on this without asking a follow-up question? If not, the post isn’t specific enough.
- Attribution Test. If I removed my name, could AI have written this?
The fourth is the gate. The first three catch quality problems any competent writer catches. The fourth catches the failure mode that makes a draft sound generated.
I run the Attribution Test by scrolling to a random paragraph and reading it as if a stranger wrote it. If the paragraph could be on any consultant’s Medium account, it fails. If it names a tool, a number, a proper noun, or a specific anecdote that nobody else would have written in exactly that shape, it passes. The test is fast. It catches most drift in five minutes.
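The random-paragraph step is easy to bias by scrolling to a passage you already like. A small sketch that does the picking instead, assuming paragraphs are separated by blank lines; `random_paragraph` is a hypothetical helper and the judgment itself stays manual.

```python
import random
from typing import Optional


def random_paragraph(draft: str, rng: Optional[random.Random] = None) -> str:
    """Pick one non-empty paragraph at random; paragraphs are split on blank lines."""
    paragraphs = [p.strip() for p in draft.split("\n\n") if p.strip()]
    if not paragraphs:
        raise ValueError("draft has no paragraphs")
    return (rng or random.Random()).choice(paragraphs)
```

Print the result on its own, away from the rest of the draft, and read it as if a stranger wrote it.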
# Artifact, not instruction
The deeper move is that I stopped writing style guides for AI assistants and started curating their reading material.
I keep a folder of my own past writing -- emails, PRs, docs, code-review comments, READMEs -- organized by register. When I start a writing task with an assistant, I drop the closest two or three examples into the prompt context. The pattern transfers. I never describe the labels.
The same posture works for code review. Three of my own past reviews beat any prose description of what good code review looks like. The asset isn’t a guide. The asset is the folder.
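Assembling that context can be scripted. The sketch below assumes a hypothetical layout of one subfolder per register (for example `writing/code-review/`), and it takes the first few files by name as a stand-in for choosing the closest examples by hand. `build_context` is an illustrative name, not part of any tool.

```python
from pathlib import Path


def build_context(folder: str, register: str, limit: int = 3) -> str:
    """Concatenate up to `limit` example files from folder/register into one context string."""
    files = sorted(p for p in Path(folder, register).iterdir() if p.is_file())[:limit]
    sections = []
    for path in files:
        # Examples go in as-is: no labels, no description of what they demonstrate.
        sections.append(path.read_text(encoding="utf-8").strip())
    return "\n\n---\n\n".join(sections)
```

The returned string is meant to sit ahead of the task in the prompt. Nothing in it tells the model what the examples are for, which matches the point above about never describing the labels.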
For voice specifically, I keep one file at ~/.config/writing-voice.md and an @-include in ~/.claude/CLAUDE.md that pulls it into every Claude Code session. Cursor and the SDK clients reach the same file via their own global configs. The file isn’t loaded as instructions. It loads as artifact, the same way a human writer absorbs a style by reading the writer, not by being told the writer’s rules. [1]
The em-dash signal has a half-life. It works in 2024-2026. By 2028 the models will be better at mimicking specific human voices from a folder of examples, and the artifact approach will need sharper inputs to keep its edge. The current discipline holds for now. The general principle -- show the model the corpus, don’t lecture the model on the rules -- outlasts any individual regex.
Notes
1. Anthropic's prompt engineering docs put "use examples" at the top of every "how to get better outputs" list. The DetectGPT (Stanford 2023) and GPTZero literature converges on the same pattern from the detection side: AI-generated text has consistent fingerprints, but single-signal detection is unreliable. Ensembles work.
2. Mark Liberman's Language Log has decades of empirical observation on English punctuation usage across registers. The em-dash-as-AI-tell observation is a specific application of a more general truth: machines learn punctuation from text; humans learn it from keyboards.