skills/chemoinformatics/pharmacophore-modeling

stars:0
forks:0
watches:0
last updated:N/A

Version Compatibility

Reference examples tested with: RDKit 2024.09+, pharmer / pharmit (web service), PharmIT 1.1+, plip 2.4+ (interaction analysis).

Before using code patterns, verify installed versions match. If versions differ:

  • Python: pip show rdkit then help(rdkit.Chem.Pharm3D) to check signatures

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.

Pharmacophore Modeling

Build 3D pharmacophore queries that capture the essential interaction features of a ligand-target binding event. A pharmacophore is the spatial arrangement of pharmacophore features (donor, acceptor, hydrophobe, aromatic, charged) sufficient for activity, abstracted from any specific chemotype. Used for scaffold-hopping (find compounds with different scaffold but matching pharmacophore), virtual screening (faster than docking), and cross-target SAR transfer. Modern best practice: derive pharmacophore from co-crystal structure if available (receptor-based; apo2ph4 workflow of Heider et al 2023) or align actives if no crystal (ligand-based). Diffusion-based generation (PharmacoForge) lets pharmacophore drive de novo design.

For 2D scaffold-based searches, see chemoinformatics/scaffold-analysis. For 3D shape similarity, see chemoinformatics/shape-similarity. For protein-ligand interaction analysis, see chemoinformatics/virtual-screening.

Pharmacophore Feature Types

FeatureRDKit codeDefinitionGeometric tolerance
H-bond donorD-OH, -NH1.0-1.5 Å
H-bond acceptorAsp2 O / N (lone pair)1.0-1.5 Å
HydrophobeHsp3 C / aromatic ring centroid1.5-2.0 Å
Aromatic ringRAromatic ring centroid + normal1.0-1.5 Å
Positive ionizableP-NH3+, -NR3+1.0-1.5 Å
Negative ionizableN-COO-, -SO3-1.0-1.5 Å
HalogenXCl, Br, I (halogen bond donor)1.0-1.5 Å
Metal coordinationMsp/sp2 N/O near metal0.5-1.0 Å

Tolerances are pharmacophore-feature distance windows in the search. Tighter tolerances = fewer hits but more specific.

Method Taxonomy

MethodOriginUse caseFails when
Ligand-based (LBP)Catalyst, MOE, RDKit Pharm3DMultiple actives, no crystal<3 actives; flexible actives
Receptor-based (RBP)apo2ph4, LigandScoutCo-crystal availableApo structure (use AlphaFold3 or Boltz)
Common pharmacophorePharm3D EmbedPharmacophoreConsensus from active setDiverse actives confound alignment
Diffusion-based (PharmacoForge)Flynn et al 2025De novo generation with pharmacophore priorPretrained model required
Active learning pharmacophoreCatalyst variantIterative refinementCustom; not standard

Decision Tree by Scenario

ScenarioMethodTools
Co-crystal structure availableReceptor-basedapo2ph4 + Pharmer/Pharmit
Multiple active compounds, no crystalLigand-based common pharmacophoreRDKit Pharm3D EmbedPharmacophore
Single active compoundSingle-conformer pharmacophoreRDKit Pharm3D from bioactive conformer
Scaffold hopping prospectiveReceptor-based + shape filterapo2ph4 + ROCS
Cross-target SAR transferCommon pharmacophore across targetsManual + LigandScout
De novo design with pharmacophorePharmacoForgeDiffusion-based generation
Library pre-filteringPharmacophore screenPharmit search

Ligand-Based Pharmacophore (RDKit Pharm3D)

Goal: Extract a common pharmacophore from a set of bioactive compounds.

Approach: Embed actives, extract per-molecule features, then build a target pharmacophore with distance bounds capturing cross-active variability.

from rdkit import Chem
from rdkit.Chem import AllChem, ChemicalFeatures
from rdkit.Chem.Pharm3D import Pharmacophore
from rdkit.RDPaths import RDDataDir
import os

fdef_file = os.path.join(RDDataDir, 'BaseFeatures.fdef')
factory = ChemicalFeatures.BuildFeatureFactory(fdef_file)

active_smiles = ['CC(C)c1ccc(C(=O)NCc2ccccn2)cc1',
                 'CCC(C)c1ccc(C(=O)NCc2ccccn2)cc1']
active_mols = [Chem.AddHs(Chem.MolFromSmiles(s)) for s in active_smiles]
for m in active_mols:
    AllChem.EmbedMolecule(m, AllChem.ETKDGv3())
    AllChem.MMFFOptimizeMolecule(m)

feature_lists = [factory.GetFeaturesForMol(m) for m in active_mols]

# Example 2-feature pharmacophore (Aromatic + Donor) with a 3.5-5.0 A band
feature_types = ['Aromatic', 'Donor']
pharmacophore = Pharmacophore.Pharmacophore(feature_types)
pharmacophore.setLowerBound(0, 1, 3.5)
pharmacophore.setUpperBound(0, 1, 5.0)

BaseFeatures.fdef (RDKit-shipped) defines feature SMARTS. For drug-like pharmacophores, this is the standard starting point.

Receptor-Based Pharmacophore (apo2ph4 workflow)

Goal: Derive a pharmacophore from a protein binding-pocket structure without requiring a bound ligand.

Approach: Identify hot-spots (donor / acceptor / hydrophobe regions) from protein geometry; assemble into a pharmacophore. apo2ph4 (Heider et al 2023) describes this pipeline.

# Conceptual apo2ph4-style workflow; verify the exact CLI against the published release.
apo2ph4 -pdb receptor.pdb \
        -site_residues 'A:100,A:101,A:104,A:108' \
        -output pharmacophore.ph4

When a co-crystal ligand is available, derive pharmacophore directly from the ligand binding pose: each ligand feature in contact with a complementary protein residue is part of the pharmacophore.

from plip.basic import config

PLIP (Protein-Ligand Interaction Profiler) extracts per-residue interactions; use to build the pharmacophore manually from contacts (H-bond, hydrophobic, salt bridge, π-stacking).

Pharmer / Pharmit (Web Service)

For library screening with a .ph4 pharmacophore:

  • Pharmer (Koes 2012): open-source pharmacophore search; efficient indexing
  • Pharmit (Sunseri 2016): web-based frontend, integrates Pharmer with vendor catalogs (Enamine, Mcule, ZINC)
# Pharmit provides a REST API for batch search
curl -X POST https://pharmit.csb.pitt.edu/api/search \
     -H 'Content-Type: application/json' \"
     -d '{"pharmacophore": "...", "library": "enamine-real", "max_hits": 1000}'

For open-source Pharmer, install via conda; programmatic interface in Python.

Diffusion-Based Pharmacophore Generation (PharmacoForge)

PharmacoForge (Flynn et al 2025) generates molecules conditioned on a pharmacophore using a diffusion model. Use when de novo design is needed and a pharmacophore constraint is desired.

# PharmacoForge pseudo-API; verify against the published release.
# from pharmaco_forge import generate
# mols = generate(
#     pharmacophore='ph4_file',
#     n_molecules=1000,
#     scaffold_hint='aryl_sulfonamide',
# )

Per-Tool Failure Modes

Ligand-based pharmacophore -- too few actives

Trigger: Active set < 5 compounds with diverse chemotypes.

Mechanism: Common pharmacophore requires shared features; small active sets don't converge.

Symptom: All-atom pharmacophore is too permissive; thousands of hits with random features.

Fix: Use receptor-based pharmacophore if crystal available; or pick 1 bioactive conformer as single template.

apo2ph4 -- ambiguous pocket definition

Trigger: Binding site contains multiple sub-pockets; user-specified residues span all.

Mechanism: Hot-spot detection over generalizes.

Symptom: Pharmacophore has too many features; nothing matches in real library.

Fix: Manually curate sub-pocket; pass focused residue list (3-6 residues in active site).

Pharmer / Pharmit -- too few hits

Trigger: Pharmacophore strict; library small.

Mechanism: Geometric tolerance + feature count filter compounds aggressively.

Symptom: < 10 hits returned for large library.

Fix: Increase tolerance by 0.5-1.0 Å; remove 1 least-essential feature; verify pharmacophore geometry.

Pharmacophore match but binder is inactive

Trigger: Compound matches pharmacophore but IC50 > 10 µM.

Mechanism: Pharmacophore captures necessary features but not sufficient (strain, dynamics, off-target).

Symptom: False positive in screen.

Fix: Combine pharmacophore with shape (ROCS color) and physchem filters.

Common Errors

SymptomCauseFix
EmbedPharmacophore returns no matchDistance bounds too tightWiden by 1 Å
Feature SMARTS not recognizedCustom fdef file requiredVerify against BaseFeatures.fdef
Stereochemistry not in pharmacophoreDefault RDKit features ignore stereoUse PLIP for stereo-aware contact analysis
PLIP can't find interactionsWrong residue numberingCheck PDB numbering vs UniProt offset
Pharmer hangs on large libraryIn-memory index for big librariesUse Pharmer with --disk-index flag

References

  • Heider et al., J. Chem. Inf. Model. 63:147-158 -- apo2ph4 receptor-based pharmacophore.
  • Flynn et al., Front. Bioinform. -- PharmacoForge diffusion-based generation.
  • Koes et al., J. Chem. Inf. Model. 52:1159 -- Pharmer.
  • Sunseri et al., J. Mol. Graph. Model. -- Pharmit web service.
  • Wermuth et al. -- IUPAC pharmacophore definition.

Related Skills

  • chemoinformatics/scaffold-analysis - 2D scaffold analysis
  • chemoinformatics/shape-similarity - 3D shape
  • chemoinformatics/similarity-searching - 2D similarity
  • chemoinformatics/virtual-screening - Docking as alternative
  • chemoinformatics/conformer-generation - 3D conformers for active alignment
  • chemoinformatics/generative-design - Generative models with pharmacophore conditioning
    Good AI Tools