Scientific Voice & Style

Voice is what survives when everything else gets rewritten. In an era when LLMs can produce grammatically clean, structurally correct, and technically empty prose, the things that mark you as a thinker — your idioms, your hesitations, your particular way of seeing a result — become the only signal of authorship left. This skill is a craft guide for building that signal deliberately, and for recognising the patterns by which AI assistance erodes it.

When to use

You are starting a new manuscript and want to set a strong authorial register before drafting.
You have drafted a section with heavy AI involvement and want to recover your voice.
You are reviewing a manuscript (your own or a mentee's) for stylistic tells that flatten the author.
You are training a writing habit, not chasing a single paper.
You are designing a lab writing workflow that uses AI without losing voice.

When NOT to use

The text is for a tightly regulated format (abstracts with strict word counts, structured regulatory submissions) where voice is genuinely not the goal.
You are doing language polishing for a non-native English speaker whose priority is clarity, not stylistic distinctiveness. Use a plain-language editing skill instead.
You are looking for help with grammar, punctuation, or sentence mechanics at the line level. Use a copyediting skill.

Prerequisites

A working draft, or the intent to start one. Voice work without a draft is theory; voice work with a draft is craft.
Familiarity with the conventions of your target subfield. You cannot sound like an X if you have not absorbed the cadence of X.
Willingness to write ugly first drafts. Voice is found in revision, not in first typing.

Core workflow

1. Read widely in your target register

Before you write a word of a paper, read 30-50 papers in the exact register you want to publish in — not just any papers, but the ones whose prose you would be willing to claim. The point is not to copy their phrasings; it is to internalise what is normal, what is interesting, what is overkill. Sword calls this "absorbing the rhythms of the discourse community." Without it, any writing you produce — human or AI-assisted — sounds like a generic outsider.

Concrete moves:

Print three papers whose prose you admire. Re-read them with a pen.
Notice the connective tissue: how do they move from result to interpretation? From one paragraph to the next?
Notice the ungoverned moments: where do they take a position? Where do they allow themselves to be slightly opinionated?

2. Draft without AI, in full sentences, on paper or in a plain text editor

The single most reliable way to develop voice is to write first drafts without AI assistance. AI assistance at the drafting stage locks you into its register from sentence one, and the cost of rewriting past that lock-in is higher than the cost of writing badly yourself.

This is not a moral position. It is a craft observation: if the first 200 words of your manuscript are AI-flavored, you will spend the next 4000 words fighting the influence. The same is not true of AI editing of a draft you wrote in your own voice.

3. Identify the AI tells in the draft (or in the AI-assisted version)

Run the draft against the AI-tell taxonomy in the "Code patterns" section below. Mark every match. Do not rewrite yet — just mark.

Common tells to look for:

Em-dash overuse. A few em-dashes are fine; six per page is a signature.
Three-part list construction. "Fast, accurate, and reliable" / "robust, scalable, and interpretable" / "rigorous, reproducible, and transparent." The triad is the single most common AI rhythm.
Hedging that is true but obvious. "It is important to note that...", "It should be noted that...", "It is worth mentioning that..." All of these are true, and all of them are content-free.
Travel metaphors for thinking. "Dive into", "delve into", "navigate the complexities of", "explore the landscape of", "unpack the nuances." These phrases announce that work is about to happen; they do not do any of it.
Bolded conclusions that restate what the previous sentence already said.
False balance. "On the one hand... on the other hand..." used where the evidence does not actually support a two-handed framing.
The summary opener. "In this study, we have shown that..." used to launch a paragraph that itself is a summary.
Generic intensification. "Crucially", "importantly", "key", "vital", "essential" — all of these are often correct and almost always vacuous.
Politeness injections. "I hope this section is helpful", "Let me know if you have questions", "I will now discuss..." This is LLM chat residue leaking into prose.
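
The mark-first, rewrite-later step can be sketched as a function that wraps each match in a visible tag and leaves the wording untouched. This is only a sketch: the `mark_tells` name, the `[[tell:NAME|...]]` tag format, and the two sample patterns are illustrative assumptions, not part of the taxonomy itself.

```python
import re

# Two sample patterns for illustration only; swap in your own taxonomy.
SAMPLE_TELLS = {
    "hedge": r"\bit (?:is|should be) (?:important to note|noted|worth mentioning) that\b",
    "travel": r"\b(?:delve|dive) into\b",
}

def mark_tells(text: str, patterns: dict[str, str]) -> str:
    """Wrap every match in [[tell:NAME|matched text]] without changing the words."""
    marked = text
    for name, pat in patterns.items():
        marked = re.sub(
            pat,
            # Bind the current name as a default so each lambda keeps its own label.
            lambda m, n=name: f"[[tell:{n}|{m.group(0)}]]",
            marked,
            flags=re.IGNORECASE,
        )
    return marked

draft = "It should be noted that we delve into the data."
print(mark_tells(draft, SAMPLE_TELLS))
```

Because the patterns are applied one after another, overlapping patterns can nest tags inside each other. For a marking pass that is usually acceptable, since the goal is to make passages findable, not to produce clean output.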

4. Rewrite the marked passages in your own voice

For each marked passage, do not paraphrase — rewrite from your understanding. If you cannot rewrite it, that is information: the passage is doing work you do not understand, and you should leave a comment to revisit it.

A useful diagnostic question for each rewrite: would I say this out loud at a lab meeting, in this order, with these words? If not, the rewrite is still borrowed.

5. Maintain first-person and idiosyncratic analogies

Voice lives in first person and in the analogies only you would draw. AI systems default to impersonal voice and to stock analogies (cells as factories, datasets as fuel, methods as pipelines). Both are recoverable with effort; both are also the places where your authorship is most legible.

For first person: use "we" sparingly and deliberately, but use it. "We argue" is stronger than "It is argued."
For analogies: cultivate a small library of analogies that are yours — drawn from a hobby, a previous field, a teaching example, a half-remembered book. These are the things AI cannot infer.
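
A rough way to see how far a draft has drifted toward impersonal voice is to count first-person markers against "it is argued that"-style constructions. The sketch below is an assumption-laden proxy: the function name `person_counts` and both patterns are hypothetical, and the impersonal pattern only catches a narrow family of passives.

```python
import re

# First-person plural markers. "us" is left out because, case-insensitively,
# it would also match the abbreviation "US".
FIRST_PERSON = re.compile(r"\b(?:we|our)\b", re.IGNORECASE)

# Rough proxy for impersonal framing: "it is argued that", "it was shown that", ...
IMPERSONAL = re.compile(r"\bit (?:is|was|has been) \w+(?:ed|en|wn) that\b", re.IGNORECASE)

def person_counts(text: str) -> dict[str, int]:
    """Count first-person markers and impersonal 'it is X-ed that' constructions."""
    return {
        "first_person": len(FIRST_PERSON.findall(text)),
        "impersonal": len(IMPERSONAL.findall(text)),
    }

sample = "We argue that X holds. It is argued that Y holds. It was shown that Z holds. Our data agree."
print(person_counts(sample))
```

The numbers say nothing about whether a given "we" is deliberate. They only point to passages worth rereading.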

6. Vary sentence length deliberately

Good scientific prose alternates long and short. AI prose is metronomic. When you audit a passage, count the words per sentence for ten sentences in a row. If the variance is low, vary it. Short sentences are the breath marks of scientific writing; use them where a point lands.

7. Use discipline-specific terminology in the exact way your subfield uses it

Every subfield has a few words that mean something slightly different than the dictionary definition. "Robust" in econometrics is not "robust" in machine learning is not "robust" in cell biology. "Validation" in clinical research is not "validation" in machine learning. AI is bad at this and defaults to the most general meaning. You are not.

If your subfield writes "patients" rather than "subjects" or "participants", use "patients". If it writes "effect size" rather than "effect magnitude", use "effect size". The signal you send by using the right word in the right way is the same signal a good reviewer picks up on — this person is one of us.
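
Terminology drift is one of the few voice problems that can be checked mechanically. The sketch below counts competing variants of the same term and reports the groups in which a draft uses more than one. The `VARIANT_GROUPS` table and the function names are hypothetical; the terms are taken from the examples above and should be replaced with the words your own subfield cares about.

```python
import re
from collections import Counter

# Hypothetical variant groups; replace with your subfield's competing terms.
VARIANT_GROUPS = {
    "study_population": ["patients", "subjects", "participants"],
    "effect": ["effect size", "effect magnitude"],
}

def variant_counts(text: str, groups: dict[str, list[str]]) -> dict[str, Counter]:
    """Count how often each competing term appears, per group."""
    report = {}
    for group, terms in groups.items():
        counts = Counter()
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
        report[group] = counts
    return report

def mixed_groups(report: dict[str, Counter]) -> list[str]:
    """Return the groups in which more than one variant is actually used."""
    return [g for g, c in report.items() if sum(1 for n in c.values() if n > 0) > 1]

text = "Patients were enrolled. Subjects gave consent. The effect size was small."
print(mixed_groups(variant_counts(text, VARIANT_GROUPS)))
```

A mixed group is not automatically an error, since a paper may legitimately distinguish "patients" from "participants". The check only tells you where to look.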

8. Run the Gopen-Swan structural pass last

Gopen & Swan's essay is less about style than about the reader's experience of the sentence. Their unit-of-thought principle says: each sentence should contain one unit of thought, and that unit should sit in the position of greatest stress in the sentence (usually the end). Run this pass after the voice pass: a stylistically distinctive manuscript whose sentences violate reader expectations is still a bad manuscript.

A simple diagnostic: read each sentence aloud, listen to where the stress falls, and check whether the stress falls on the new information. If the stress falls on a connective ("however", "therefore", "thus"), restructure.

Code patterns

Voice diagnostic: AI-tell checklist

# A simple regex-based scanner for the most common AI tells.
# Not exhaustive; voice is too contextual for regex.
# Use as a first pass, then read aloud for the rest.

import re

AI_TELLS = {
    "em_dash": r"—",                     # more than ~4 per 1000 words is suspicious
    "triadic_construction": r"\b\w+,\s+\w+,?\s+and\s+\w+\b",  # rough
    "travel_metaphor": r"\b(?:delve|dive|navigate|unpack|explore the|landscape)\b",
    # Lazy match; with re.DOTALL the two hands may sit on different lines.
    "false_balance": r"\bon the one hand\b.*?\bon the other hand\b",
    "summary_opener": r"\bin this (?:study|paper|work|manuscript)\b",
    "polite_injection": r"\b(?:let me know|hope this (?:helps|is)|feel free to)\b",
    "vacuous_intensifier": r"\b(?:crucially|importantly|essentially|basically)\b",
    "it_is_important": r"\bit (?:is (?:important|worth|noteworthy)|should be noted)\b",
}

def voice_scan(text: str) -> dict[str, int]:
    """Return the raw number of matches for each tell."""
    flags = re.IGNORECASE | re.DOTALL
    return {name: len(re.findall(pat, text, flags=flags)) for name, pat in AI_TELLS.items()}

# Example:
# counts = voice_scan(your_manuscript_text)
# The counts are raw. Divide by the word count before applying a threshold:
# if em-dashes exceed 0.4% of words, or "travel_metaphor" appears at all,
# read those passages aloud and consider rewriting.
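
The em-dash threshold above is a rate, not a count, so the number of matches has to be normalised by the length of the text. A minimal sketch of that normalisation follows; `rate_per_1000_words` is a hypothetical helper name, and "word" here simply means a whitespace-separated token.

```python
import re

def rate_per_1000_words(text: str, pattern: str) -> float:
    """Matches of `pattern` per 1000 whitespace-separated tokens."""
    n_words = len(text.split())
    if n_words == 0:
        return 0.0
    n_hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return 1000 * n_hits / n_words

# A rate of 4.0 corresponds to the 0.4%-of-words threshold.
sample = "The result held—barely—across all three cohorts we tested."
print(rate_per_1000_words(sample, r"—"))
```

On a single sentence the rate is meaningless. It becomes useful at the scale of a section or a full draft, where a few em-dashes no longer dominate the ratio.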

Sentence-length variance diagnostic

# AI prose tends to have low variance in sentence length.
# A quick diagnostic:

import re

def sentence_lengths(text: str) -> list[int]:
    # Naive split on ., ! or ? followed by whitespace. Abbreviations such as
    # "e.g." or "et al." will be split too, so treat the counts as approximate.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [len(s.split()) for s in sentences if s]

def length_variance_report(text: str) -> None:
    lens = sentence_lengths(text)
    if not lens:
        return
    mean = sum(lens) / len(lens)
    var = sum((x - mean) ** 2 for x in lens) / len(lens)  # population variance
    sd = var ** 0.5
    cv = sd / mean if mean else 0  # coefficient of variation
    print(f"sents={len(lens)}  mean={mean:.1f}  sd={sd:.1f}  cv={cv:.2f}")
    # Rule of thumb: CV < 0.35 reads as metronomic, which is a voice tell.

# For a paragraph, run this and look at the spread.
# For a full manuscript section, run this on every paragraph.
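
Running the diagnostic on every paragraph of a section can be sketched as below. The assumptions are mine: paragraphs are separated by blank lines, paragraphs with fewer than three sentences are skipped because their spread means little, and the names `sentence_word_counts` and `paragraph_cvs` are hypothetical.

```python
import re
import statistics

def sentence_word_counts(paragraph: str) -> list[int]:
    """Words per sentence, using a naive split on ., ! or ? plus whitespace."""
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [len(s.split()) for s in sentences if s]

def paragraph_cvs(section: str, min_sentences: int = 3) -> list[tuple[int, float]]:
    """Return (paragraph_index, cv) for each paragraph with enough sentences."""
    results = []
    paragraphs = [p for p in re.split(r"\n\s*\n", section) if p.strip()]
    for i, para in enumerate(paragraphs):
        lens = sentence_word_counts(para)
        if len(lens) < min_sentences:
            continue  # too short for the spread to mean anything
        mean = statistics.fmean(lens)
        cv = statistics.pstdev(lens) / mean  # population SD over mean
        results.append((i, cv))
    return results

section = (
    "One two three. One two three. One two three.\n\n"
    "Short. This sentence is quite a bit longer than the first. Mid length here."
)
for index, cv in paragraph_cvs(section):
    print(f"paragraph {index}: cv={cv:.2f}")
```

Paragraphs whose CV falls under the 0.35 rule of thumb are candidates for a read-aloud pass, not automatic rewrites.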

Read-aloud unit-of-thought check (manual, not codable)

The Gopen-Swan pass is not codable in any meaningful way. The procedure:

Read the sentence aloud.
Note the word that receives the most stress.
Ask: is that word the new information for the reader at this point?
If not, restructure so the new information lands in the stress position.

This is roughly the procedure professional copyeditors use, and it is fast once you internalise it.

Common pitfalls

Believing that "more sophisticated vocabulary" = "more voice". It does not. Voice comes from the order and texture of words, not from the difficulty of the words.
Trying to sound like a specific author you admire. Mimicry is legible, and reviewers who recognise the mimicry will mark you down for it.
Using AI to "improve style" of a draft you did not write. This is the fastest way to lose voice. The draft needs to be yours before any style work.
Confusing idiosyncrasy with inaccuracy. A distinctive voice is not a license for imprecision. Style and accuracy are independent axes; you need both.
Skipping the read-aloud pass. Voice is partly a property of sound. If you have not read your own prose aloud, you have not heard whether it sounds like you.
Overcorrecting AI tells by going slangy or informal. The opposite of AI prose is not casual prose. It is prose that is yours.
Pursuing voice in a short abstract or methods section. Voice is a section- and paper-level property. A 150-word abstract will not bear the weight of voice work; spend that energy on the discussion.

Validation

You know voice is working when:

A trusted colleague can identify a paragraph as yours without being told the author.
You can read a sentence aloud and recognise yourself in it.
A reviewer's comments engage with the argument rather than the writing. That is a sign of engagement, and a sign that the prose is transparent enough to stay out of the way.
A mentee, asked to imitate your style, finds it hard to do without studying several of your papers.
The AI-tell scanner flags very few of the tells above.

You know voice is failing when:

Every paragraph opens with the same structure.
You find yourself reaching for the thesaurus.
You cannot tell which of two paragraphs in your manuscript you wrote.
A reviewer comments on the prose positively ("nicely written") in a way that feels like a hedge about the content.
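
The first failure sign, paragraphs that all open the same way, lends itself to a rough check: tally the first few words of every paragraph and look for repeats. In this sketch the name `opener_counts`, the two-word window, and the blank-line paragraph convention are all assumptions.

```python
import re
from collections import Counter

def opener_counts(text: str, n_words: int = 2) -> Counter:
    """Count the first `n_words` words of each blank-line-separated paragraph."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    openers = [" ".join(p.split()[:n_words]).lower() for p in paragraphs]
    return Counter(openers)

draft = "In this study we measured A.\n\nIn this paper we report B.\n\nWe found C."
for opener, count in opener_counts(draft).most_common():
    print(f"{count}x  {opener}")
```

Identical first words are only a proxy for identical structure. Paragraphs can share a syntactic shape while starting with different words, which is why the read-aloud pass still matters.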

Open alternatives

For style reference: most paid style guides have a free predecessor. Strunk's original The Elements of Style (1918) is in the public domain. Gopen & Swan is free to read online. Pinker is in most libraries.
For read-aloud review: read your own draft aloud. There is no software substitute. Some authors use text-to-speech (e.g. the system voices on macOS/Linux/Windows, or the open piper-tts) to hear their own prose without the self-consciousness of reading aloud.
For AI-tell scanning: the regex scanner above is the simplest open approach. There is no canonical open-source tool; commercial "AI detection" tools (see ai-detection-awareness) attempt related tasks but should not be confused with style diagnostics.

References

Pinker, S. (2014). The Sense of Style: The Thinking Person's Guide to Writing in the 21st Century. Viking.
Sword, H. (2012). Stylish Academic Writing. Harvard University Press.
Gopen, G. D., & Swan, J. A. (1990). "The Science of Scientific Writing." American Scientist, 78(6), 550-558. Reprinted widely; the full text is freely available online.
Williams, J. M. Style: Lessons in Clarity and Grace (11th ed.). Pearson. A complementary craft reference.
Orna, L., & Stevens, G. How to Manage Your Research Career. Not primarily a style book, but the chapter on writing habits is still useful.
Related ors-* skills: ors-scientific-writing-manuscript-structure, ors-scientific-writing-imrad-drafting, ors-humanizer-skills-text-humanizing-editorial.

Changelog

1.0.0 (2026-06-10): Initial adaptation by Pradyumna Jayaram. Sources: Pinker 2014, Sword 2012, Gopen & Swan 1990. Added: AI-tell taxonomy, voice-build workflow with AI only as editor, sentence-variance diagnostic.
