Discover 3decision® - Part 1: The Growing Mountain of Protein Structural Data

Challenges and Opportunities in Structure-Based Drug Discovery (SBDD)

The Value of Protein Structural Data

Many FDA-approved drugs were originally discovered and/or developed thanks to computational, structure-based approaches.

These include the Bcl-2 inhibitor Venetoclax (AbbVie), for treatment of chronic lymphocytic leukemia; and the B-Raf V600E inhibitor Vemurafenib (Roche/Genentech), for treatment of BRAF V600E-positive melanomas and Erdheim-Chester Disease [1].

Interestingly, medicinal chemists at Bayer have estimated that 25% to 50% of their Phase 1 candidates have originated or otherwise benefitted from in silico R&D [2], which includes structure-based methods.

Recently reported Drug Discovery programs guided by structural data include highly specific inhibitors of the CREB-binding protein bromodomain, by Pfizer [3]; a new series of ERK1/2 inhibitors, by AstraZeneca [4]; and peptidic inhibitors of the YAP-TEAD protein-protein interaction (PPI), by Novartis (NIBR) [5].

These and many more cases illustrate that, despite historic setbacks and some remaining technical limitations, SBDD has become an invaluable component of the Drug Hunter’s toolbox.

Molecular representation of a ligand-binding site.

 

The Rising Number of Structures

Advances in experimental characterization techniques such as X-ray Crystallography, Cryo-Electron Microscopy (Cryo-EM) and NMR, among others, are accelerating the rate at which researchers feed new experimentally-derived protein structures into repositories such as RCSB Protein Data Bank (PDB).

With a deposition rate of over 11,000 new structures per year, the number of biological macromolecular structures in the PDB will surpass the staggering figure of 160,000 early next year [6].

Beyond publicly available structures, Biotech and Big Pharma companies have gradually amassed in-house collections of proprietary structures over the past decades. Large groups can generate several hundred new structures per year.

Importantly, alongside experimentally derived structures, researchers have been generating a steady stream of theoretical structures (homology models, validated docking models, models from Molecular Dynamics simulations, etc.). Most of the time these help to answer a specific question at a given moment and are later discarded, regardless of their actual value.

Epic gains in computational processing, ever-expanding access to cloud-based data & tools, increasingly sophisticated modeling algorithms and the ever-decreasing costs of IT infrastructure are together bolstering the SBDD capacities of Drug Discovery companies of all sizes, including tiny start-ups with limited resources.

Accordingly, the number, complexity and quality of computational protein structures will continue to surge in the coming years, translating into a wave of data that must be stored, classified, analyzed and exploited.

The overall growth of released PDB structures in the last 30 years. (Source: https://www.rcsb.org)

 

Large-Scale Analysis of Structural Data

Long gone are the days when a dedicated team of Computational Chemistry or Structural Bioinformatics specialists handled only a few protein structures at a time, a scenario for which classical modeling software alone was sufficient.

Today, multidisciplinary SBDD teams must navigate large cohorts of data across departments and projects, as even structural data for distinct targets can provide valuable insight.

Examples of this work include scanning the pocketome (the ensemble of binding pockets) across a collection of protein structures to find potential off-target side effects, searching for potential ligand-optimization ideas by comparing ligand-pocket interaction patterns, or browsing through hundreds of kinase structures to establish structure-function classifications.
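To make the idea of comparing ligand-pocket interaction patterns more concrete, here is a minimal sketch in Python. It assumes each complex has already been reduced to a set of (residue, interaction type) pairs, and it ranks complexes by Tanimoto similarity to a query. The function names, residue labels and complex names are all made up for illustration; this is not how 3decision® or any specific tool does it.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: one interaction = (residue label, interaction type).
Interaction = Tuple[str, str]
Fingerprint = FrozenSet[Interaction]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two interaction sets."""
    if not a and not b:
        # Two empty patterns are treated as identical (a convention of this sketch).
        return 1.0
    return len(a & b) / len(a | b)


def rank_by_similarity(
    query: Fingerprint, library: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar."""
    scores = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up example data, for illustration only.
query = frozenset(
    {("ASP86", "hbond"), ("LEU83", "hbond"), ("PHE80", "hydrophobic")}
)
library = {
    "complex_A": frozenset(
        {("ASP86", "hbond"), ("LEU83", "hbond"), ("LYS33", "ionic")}
    ),
    "complex_B": frozenset({("PHE80", "hydrophobic")}),
    "complex_C": frozenset({("GLU12", "ionic")}),
}

ranking = rank_by_similarity(query, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice the interaction sets would come from a dedicated interaction-detection step, and residue labels would need to be aligned across different proteins before such a comparison is meaningful.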

Such tasks can be further complicated by the fact that structural data – whether public or proprietary – are often scattered in different locations.

And these tasks will only become more challenging as the amount of data, including unique computational structures of protein targets and their ligands, continues to accrue in the notebooks, hard drives, cloud folders and online repositories used by Drug Discovery teams in industry and academia.
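As a small illustration of the scattering problem, the sketch below walks several folders and groups structure files by name, so that copies of the same entry stored in different places show up together. The list of file extensions and the function name are assumptions made for this example; a real inventory would also need to cover databases, cloud storage and file-naming differences between teams.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List

# Extensions treated as structure files; an assumption for this sketch.
STRUCTURE_EXTENSIONS = {".pdb", ".cif", ".ent"}


def index_structure_files(roots: Iterable[Path]) -> Dict[str, List[Path]]:
    """Map each lower-cased file stem to every matching file found under roots."""
    index: Dict[str, List[Path]] = defaultdict(list)
    for root in roots:
        # rglob("*") visits all files and sub-folders below the root folder.
        for path in sorted(Path(root).rglob("*")):
            if path.is_file() and path.suffix.lower() in STRUCTURE_EXTENSIONS:
                index[path.stem.lower()].append(path)
    return dict(index)
```

Calling `index_structure_files([Path("project_a"), Path("project_b")])` (hypothetical folder names) would return a dictionary in which any key with more than one path points to a structure that lives in several locations.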

Software Solutions for Large-Scale Analysis

Given the massive time, costs and risks involved in Drug Discovery, there is a pressing need for solutions that enable multidisciplinary teams to collaboratively exploit the growing mountain of protein structural data.

Discngine, through its expertise in Knowledge Management applications for Life Sciences R&D, has created such a solution in its 3decision® software, which we will introduce in our upcoming article.

References

 1. Van Montfort & Workman, Essays in Biochem, 2017, Structure-based drug design: aiming for a perfect fit

 2. Hillisch, Heinrich & Wild, ChemMedChem, 2015, Computational Chemistry in the Pharmaceutical Industry: From Childhood to Adolescence

 3. Denny et al., J Med Chem, 2017, Structure-Based Design of Highly Selective Inhibitors of the CREB Binding Protein Bromodomain

 4. Ward et al., J Med Chem, 2017, Structure-Guided Discovery of Potent and Selective Inhibitors of ERK1/2 from a Modestly Active and Promiscuous Chemical Start Point

 5. Furet et al., Bioorg Med Chem Lett, 2019, Structure-based design of potent linear peptide inhibitors of the YAP-TEAD protein-protein interaction derived from the YAP omega-loop sequence

 6. https://www.rcsb.org/#Category-search

