I think that’s my first RDKit post! So reason to celebrate!
Here' I’ll focus on a very nice feature available in Open Source Software and very useful in daily structure based but even ligand based drug design tasks.
How can I dock small molecules into a receptor and avoid as much as possible the “what is the right pose” problem?
Sorry to disappoint you there is no easy solution to that. But given a few assumptions you can actually work around this issue quite well. Let me explain that a bit further below.
The approach taken here
Docking small molecules in a pocket of a protein structure can be done for various reasons and use cases. Virtual screening or lead optimization and SAR analysis for instance. I’ll focus on lead optimization stages here and this setting gives us a few pre-requisits to work with.
Let’s suppose you have a crystal structure or a very good model of how your lead compound binds to the target. Then docking your whole compound series where decorations have been added here and there could actually make use of the known positioning of the original lead compound. That’s based on the (strong) assumption that the modifications made on the ligand do not affect the binding mode of the ligand in a significant way. I know that this is not always true, but very often it’s a good enough hypothesis to work with and get things done.
rdock allows you to do tethered docking….but for that you need 3D conformations of the ligands already in the place where they should be tethered and then a property in the SD file indicating which atoms should be tethered.
In this post I’ll show how to do that first part: how to prepare your sd file for tethered docking with rdock using RDKit.
All the functional source code and sample data and output can be found on this github repo.
Example on HSP90 inhbitors
Let’s suppose we have a reference structure we would like to dock into. Here I chose my favourite structure from the PDB: 1uyd.
Now let us see how we can prepare a docking of a series of compounds that all share at least the adenine fragment in them:
Let’s set up a RDKit script that takes a reference 3D molecular structure and a set of ligands and does a tethered minimization on the MCS of the ligand and reference structure. Below I’ll just go through the main steps (not the full code). Refer to this github repo for the full code and examples.
from rdkit import Chem from rdkit.Chem import AllChem from rdkit.Chem import rdFMCS from rdkit.Chem.rdMolAlign import AlignMol reference = #reference molecule ligands = #ligands to align for mol in ligands: mols=[reference,mol] mcsResult=rdFMCS.FindMCS(mols,threshold=0.9,completeRingsOnly=True) #find the maximum common substructure if mcsResult.smartsString and len(mcsResult.smartsString)>0 : patt = Chem.MolFromSmarts(mcsResult.smartsString,mergeHs=True) #keep only the core of the reference molecule ref=AllChem.ReplaceSidechains(reference,patt) if ref: core=AllChem.DeleteSubstructs(ref,Chem.MolFromSmiles('*')) core.UpdatePropertyCache()
Now we have identified the common core between our ligand and the reference molecule. Next, we will use that to minimize our ligand on top of that core from the reference.
newmol=Chem.Mol(mol) #create a new instance of the ligand, as we will change the coordinates AllChem.ConstrainedEmbed(newmol,core) #constrained minimization of newmol versus the core of the reference
Actually, just two lines of code to do the tricky part. Rather cool! Now we just have to prepare our output for rdock, which is basically write out the minimized and aligned molecule into an SD file with a property containing all atoms that should be constrained upon docking.
tethered_atom_ids=newmol.GetSubstructMatches(patt) #that's to get the atom ids only if tethered_atom_ids : t=tethered_atom_ids t1=map(lambda x:x+1, list(t)) ta=','.join(str(el) for el in t1) nm=Chem.AddHs(newmol, addCoords=True) #create a new 3D molecule and add the TETHERED ATOMS property nm.SetProp('TETHERED ATOMS',ta) w.write(nm) #write to an sd file
That was not too difficult actually. So once this is done we end up with an SD file containing all molecules aligned and with the relevant property associated to each molecule:
This SD file can now be used for tethered docking with rdock. Follow a normal docking protocol with rdock but provide in the ligand section of your docking configuration file something along these lines.
Obviously you can use this script or the output SD to do a lot of other things as well.
For the rdock affiliates among you
The code presented here can serve as a complete replacement of the sdtether.py that was originally shipped with rdock. The original sdtether only allowed rigid body alignments on a reference ligand which obviously comes with important caveats. With the new RDKit based version you run flexible minimizations.
I’d like to thank Greg Laundrum for his great work on RDKit and his hints to the ConstrainedEmbed function. David Morley, Xavier Barril, Daniel Alvarez-Garcia & Sergio Luiz Carmona for writing and maintaining rdock.
Same idea with a slightly different approach (not rdock related though): https://iwatobipen.wordpress.com/2019/06/04/constrain-embedding-with-mcs-as-a-core-rdkit-chemoinformatics/