skills/image-analysis-microscopy/image-analysis-best-practices
Image Analysis Best Practices: Acquisition, Controls, and Publication
"
This skill is the "do no harm" companion to the tool-specific image-analysis skills. It covers what to do before you open a single image in CellProfiler, QuPath, or Fiji: how to plan controls, how to acquire with enough dynamic range and a real scale bar, how to keep channels from contaminating each other, how to make figures that survive peer review, and — most importantly — how to handle statistics correctly when the unit of analysis is not what the spreadsheet says it is. Pair this with
ors-image-analysis-microscopy-cellprofiler-pipelines,ors-image-analysis-microscopy-qupath-pathology, orors-image-analysis-microscopy-imagej-fiji-macro.
When to use
- Designing a microscopy experiment whose quantitative results will appear in a paper, preprint, or grant.
- Reviewing a manuscript or thesis chapter that contains microscopy-based quantitative claims.
- Building a figure for a journal, with correct resolution, scale bars, color-blind safe palettes, and minimal-necessary processing.
- Deciding what counts as "n" in an image-derived dataset (cells vs images vs animals vs fields-of-view).
- Diagnosing a red flag in someone else's figure or your own (saturation, mismatch in channel intensities, scaled-up background, "western-blot" tweaks to gels).
- Implementing positive/negative/isotype controls for an immunofluorescence experiment.
When NOT to use
- Tool-specific "how do I run CellProfiler / QuPath / Fiji" — use the corresponding tool skill in this category.
- Pure flow-cytometry (no spatial information) — see the
bio-flow-cytometry-*skills. - Statistical testing per se (t-tests, GLMMs) — see the statistical-analysis skills; this skill covers which test to pick, not the mechanics.
- Acquisition-side correction (illumination, chromatic aberration) — the tool skills cover that; this skill covers experimental best practices.
Prerequisites
- Familiarity with the chosen microscope and acquisition software (µManager, NIS-Elements, ZEN, LAS X, slide-scanner software).
- A figure-creation tool (matplotlib, ggplot2, Illustrator, Inkscape) with at least one that supports CMYK / 300+ DPI export.
- A small statistics toolset (R, Python statsmodels / scipy) for hierarchical models and effect-size calculations.
- Awareness of the target journal's figure requirements (most biomedical journals require 300 DPI for color, 600 DPI for halftone, vector format for line art — verify with the journal's Instructions for Authors).
Core workflow
The order below matters. Skipping step 1 cannot be patched in step 5.
- Plan the experiment before imaging. Specify the biological replicate (animal, donor, biological sample), the technical replicate (fields-of-view, slides, sections), the a priori power calculation, and which statistical test will be applied later. The MIAME/MIAPE-style checklists (e.g. the CellProfiler example pipelines' "metadata") are useful templates.
- Set the acquisition parameters once, write them down.
- Exposure / gain for every channel chosen to place the dimmest expected signal well above the noise floor (typically with peak pixel 500-3000 of a 12-bit / 16-bit dynamic range — see "Dynamic range" below).
- Pixel size at Nyquist or finer (≈ 2.3× oversampling of the optical resolution; for a 1.4 NA objective at 488 nm excitation, the Abbe limit is ≈ 200 nm, so 90-100 nm/px sampling is good practice).
- Laser power, integration time, and detector gain recorded in a per-channel table.
- No contrast / brightness / gamma applied at acquisition (linear data, no LUT stretching).
- Capture the controls. For fluorescence:
- Negative control (no primary / isotype). Establishes autofluorescence and non-specific binding baseline.
- Positive control (known-positive sample). Confirms the antibody / probe works and the acquisition range is right.
- Single-stain controls (one per channel). Required for unmixing, channel bleed-through correction, and spectral unmixing.
- DAPI / nuclear channel on its own. Confirms the nuclear segmentation.
- Process and quantify on raw data. Always keep a copy of the
original
.czi,.lif,.nd2, or 16-bit.tifin a write-once archive. Apply the same processing chain to every image in the batch. The image is the data; the figure is the visualization. - Document the processing chain. A short methods paragraph
listing: software, version, exact modules / commands, any
thresholds, any color balance changes. Best practice: deposit
the full pipeline / script (e.g.
.cppipe,.qpproj,.ijm, Python notebook) in a public or supplementary archive (Zenodo, GitHub tag). - Build the figure to journal specs. Resolution, color space, file format, scale bar, channel colors, and any linear contrast adjustment must be applied identically to all panels in a comparison (no "normalized" or "this one is brighter to see the structure").
- Run the statistics on the right unit of analysis. See the "Common pitfalls — n = cells vs n = images vs n = animals" below. Report effect sizes and CIs, not only p-values.
- Self-audit before submission. Run the figure through the "Common pitfalls — Image manipulation" checklist. Use the CellProfiler example data or test images as a sanity check for the whole pipeline.
Code patterns
Pixel-size / scale bar verification
import tifffile
import json
# Load a multi-resolution slide and check the embedded XML
# metadata (OME / vendor-specific). Verify the exact metadata
# keys for your file format in the Bio-Formats or vendor docs.
img = tifffile.TiffFile("slide.tif")
# The physical-pixel-size key is format-dependent; the exact
# name and units must be confirmed in the metadata specification.
# print(img.pages[0].tags.get("XResolution"))
In the figure, set the scale bar in absolute units — e.g. "10 µm" in white on a 10-pixel wide line — and never in pixel count ("250 pixels = ? µm, depends on the figure size").
Check for channel bleed-through
import numpy as np
import tifffile
import matplotlib.pyplot as plt
# Single-stain control: image of only the donor (no acceptor).
donor_only = tifffile.imread("control_donor_only.tif")
# Acceptor-detection channel from that same field:
acceptor_leak = tifffile.imread("control_donor_only_acceptor_ch.tif")
# Plot a 2-D intensity histogram to look for cross-talk
H, xedges, yedges = np.histogram2d(
donor_only.ravel(), acceptor_leak.ravel(), bins=128
)
plt.imshow(np.log1p(H.T), origin="lower", aspect="auto",
extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]])
plt.xlabel("Donor intensity")
plt.ylabel("Acceptor-leak intensity")
plt.title("Single-stain control — check for cross-talk")
plt.colorbar(label="log(1 + count)")
plt.show()
A non-zero slope in the high-intensity donor tail means bleed-through; correct by unmixing (linear spectral unmixing, linear subtraction with a measured coefficient) or by selecting narrower emission filters.
Saturation / dynamic-range audit
import numpy as np
import tifffile
img = tifffile.imread("dapi.tif")
bit_depth = img.dtype # e.g. uint16
# Percentage of pixels at the max value
sat = np.mean(img == np.iinfo(bit_depth).max) * 100
print(f"Saturated pixels: {sat:.3f}%")
# Median intensity — should be well above the camera's
# dark / readout floor
print(f"Median intensity: {np.median(img):.1f}")
Both CellProfiler's MeasureImageSaturation and a custom Python
check like this are reasonable. Re-image at lower exposure if
saturation > ~0.5%; re-image at higher exposure if median ≪
0.1% of dynamic range.
Color-blind safe palette (Wong / Okabe-Ito)
import matplotlib.pyplot as plt
import numpy as np
# Okabe-Ito 8-color palette — distinguishable by all common
# forms of color-vision deficiency. Originally published
# by Okabe & Ito and popularized in Wong 2011 (Nature
# Methods). Always cite the palette you use.
okabe_ito = {
"black": "#000000",
"orange": "#E69F00",
"sky_blue": "#56B4E9",
"bluish_green": "#009E73",
"yellow": "#F0E442",
"blue": "#0072B2",
"vermillion": "#D55E00",
"purple": "#CC79A7",
}
fig, ax = plt.subplots()
for (name, color) in okabe_ito.items():
ax.plot(np.arange(10), np.random.rand(10), color=color, label=name)
ax.legend()
plt.show()
Hierarchical / mixed-effects model (n = cells, n = images, n = animals)
# Pseudo-code: a hierarchical model where per-cell measurements
# are subsamples of per-image measurements, which are subsamples
# of per-animal measurements. Use a linear mixed-effects model
# in statsmodels or R's lme4 — the exact test and link function
# depend on the data type. Verify the model in the statsmodels
# / lme4 documentation.
#
# import statsmodels.formula.api as smf
# model = smf.mixedlm(
# "intensity ~ treatment",
# data=per_cell_df,
# groups=per_cell_df["animal_id"],
# vc_formula={"image_id": "0 + C(image_id)"}
# )
# result = model.fit()
# print(result.summary())
A simpler but acceptable alternative when the experimental design permits is to average within animal first, then run the test on the per-animal table (one number per animal). This discards the per-cell information but produces a defensible n = number-of-animals statistical test.
Common pitfalls
Image manipulation (Rossner & Yamada 2004, Journal of Cell Biology)
- Adjusting brightness/contrast differently across panels — a frequent reason for editorial rejection. Apply the same linear adjustment to every image in a comparison.
- Erasing background speckle / dust / saturated pixels — must be done in a way that is also applied to controls and documented. ImageJ's Cloning tool used to "clean up" a panel is a hard no for a publication figure.
- Splicing lanes / regions from different gels or fields — must be visibly indicated with a separator line and explained in the legend. A "representative" image must be representative of the quantified data, not a cherry-picked field.
- Re-scaling a Western blot's contrast to hide or enhance a band. Always provide the full original.
- Publication of an image with no scale bar / wrong scale bar — the original Rossner & Yamada editorial gives many examples.
Exposure and dynamic range
- Pixel intensities pinned at 65535 (or 4095). Saturated pixels cannot be quantified; the relative amount of signal across samples is destroyed. Lower the exposure.
- Median intensity << 1% of the dynamic range. The signal is being thrown away; raise the exposure or use a brighter probe.
- 16-bit data saved as 8-bit JPEG. All quantitative information is lost; the dynamic range is permanently compressed.
Channel bleed-through
- Using the "yellow" look (green + red colocalization) on un-corrected channels. Strong cross-talk can mimic colocalization. Run single-stain controls; compute bleed-through coefficients; consider spectral unmixing.
- Photobleaching during acquisition. The donor bleaches faster than the acceptor, biasing ratiometric measurements (FRET) over time. Reduce laser power; minimize exposure time; do the FRET calculation on the same frame order.
Controls
- No negative control / no isotype. Every fluorescence experiment must have one. For a multi-color panel, include single-color controls and a "no primary" or "isotype" control per primary.
- No positive control. A failed experiment looks identical to a successful one with no positive control. Always include a known-positive sample.
- Wrong "isotype" concentration. Isotype control should be at the same concentration (mass, fluorophore-to-protein ratio) as the test antibody, not at a generic "matching" concentration.
Publication figures
- Insufficient resolution. Most journals require ≥ 300 DPI for color, ≥ 600 DPI for halftone, vector format (PDF, SVG, EPS) for line art. A "pretty" 72-DPI figure is not publication-quality.
- Using red/green to encode information. Deuteranopia (red-green color-blindness) affects ~6% of male readers; use a color-blind safe palette (Wong / Okabe-Ito) and pair with shape/label redundancy.
- No scale bar. Reviewers will (rightly) reject the figure.
- Inconsistent figure styles within a paper. Define a style once (font, line widths, color, marker sizes, annotations) and apply it everywhere.
- Log-scale axes without a warning. Log scales hide small-magnitude differences; use them with care, and label them clearly.
n = cells vs n = images vs n = animals
This is the single most common statistical error in image analysis papers (and reviewers are increasingly trained to catch it). Three scenarios:
- n = cells, treatment-vs-control. If 50 cells from one animal per condition are measured, the unit of analysis is the cell, but the biological replicate is still the animal. The cell-level test is wrong if it ignores the animal-level clustering.
- n = images (fields-of-view). Often 5-10 fields per condition per animal. The FOV is a technical replicate nested in animal; analyze as a hierarchical model or average within animal first.
- n = animals. If you have 6 animals per condition, the per-animal mean is the right unit. Doing a t-test on per-cell values with n = 1000 is statistically invalid because the cells are not independent.
Use a linear mixed-effects model (R lme4, Python
statsmodels mixedlm) with the appropriate random-effect
structure: cells nested in FOVs nested in animals. Or, when
the design is simple, average within animal, plot per-animal
points, and run a t-test on the per-animal means. Always
report both n = animals and a total cell count in the
legend.
Other gotchas
- Comparing images acquired with different objectives / cameras / bit depths. Calibrate the pixel size and the intensity units first; otherwise the comparison is meaningless.
- Comparing control and treated without checking the baseline imaging conditions. Day-to-day, instrument-to- instrument variation in lamp output and detector gain can exceed biological effects. Block by day, run a control in the same session, or use a ratiometric measurement.
- Ignoring 3D context. A 2-D maximum projection of a Z stack is not the same data as a single 2-D plane; the projection can introduce apparent co-localization artifacts.
Validation
- Pipeline test on synthetic data: generate an artificial image stack (e.g. 100 circles of known diameter and intensity) and verify the pipeline recovers the ground truth within tolerance.
- Spike-and-recovery / dilution series: where possible, spike in a known amount of fluorescent standard and verify the measurement is linear in the expected range.
- Inter-rater agreement: when annotations are involved (pathology, ROIs), compute Cohen's κ or ICC between two annotators; report it.
- Re-analysis by an independent analyst: if results are central to a paper claim, have a second person re-run the analysis from raw data using only the methods paragraph.
- Journals' figure QC: Cell, Nature, JCB, eLife, and PLOS
all run published figures through image-manipulation
screening tools (e.g. the Office of Research Integrity's
forensic kit). Run your own audit with the
bioimage-forensicsor similar tool before submission.
Open alternatives
- Forensic image-audit tools: the Bioimage Forensics community maintains open tools for detecting cloning and splicing. Commercial equivalents exist (e.g. ImageJ-based "Analyze → Tools → ...) but are not standard.
- Acquisition standards: the Quality Assessment and Reproducibility for Instruments in Optical Microscopy (QUAREP-LiMi) initiative publishes per-modality checklists.
- Reporting standards: ARRIVE 2.0 (animal research), MIAME (microarray — analogously applied to imaging), REMARK (tumor marker), and the CellProfiler and QuPath project templates all have methods-section templates you can copy.
References
- Rossner, M. & Yamada, K. M. 2004, "What's in a picture? The temptation of image manipulation," Journal of Cell Biology 166(1):11-15. The foundational editorial on image-manipulation ethics.
- Cromey, D. W. 2010, "Avoiding twisted pixels: ethical guidelines for the appropriate use and manipulation of scientific digital images," Science and Engineering Ethics 16(4):639-667. Comprehensive practical guide.
- Okabe, M. & Ito, K. 2008, "Color universal design" — the origin of the Okabe-Ito palette.
- Wong, B. 2011, "Points of view: Color blindness," Nature Methods 8:441. Popularized Okabe-Ito for biology.
- QUAREP-LiMi: https://quarep.org/ — community microscopy-quality standards.
- ARRIVE 2.0: https://arriveguidelines.org/
- NIH figure guidelines and the ImageJ for Scientists book (Miura & Nørrelykke, 2021) for further practical reading.
Changelog
- 1.0.0 (2026-06-10): Initial adaptation by Pradyumna Jayaram from the Rossner & Yamada 2004 JCB editorial and the QUAREP-LiMi reporting standards; consolidated the n = cells vs n = images vs n = animals guidance and the color-blind palette reference.
