“RNA as a Drug Target” Book Highlight: How targeting RNA emerges as the next frontier for medicinal chemistry

The original article and comments are on Riccardo Martini’s LinkedIn Pulse.

Last May in France came with quite a few bank holidays - both at the beginning and at the end of the month. With a bit of planning, it’s the perfect opportunity to enjoy an extended break. For me, long waits at airports and train stations and a few extra days off are the ideal setting to keep up to date with what’s happening at the forefront of drug discovery.

Lately, I've become particularly interested in a topic that’s been gaining quite a lot of attention due to innovation in closely related fields (ever heard about the mRNA COVID-19 vaccine?). Recently, my company, Discngine, kindly extended its library with the excellent “RNA as a Drug Target” edited by John Schneekloth Jr. from the National Cancer Institute and Martin Pettersson from Promedigen. I decided to give it a go. After several plane and train rides, I've finished reading it and decided to write this book review for your reading pleasure and further discussion.

Content

As this is going to be quite an extensive review, here is the table of contents to manage your expectations:

  • RNA “is” a lot

  • RNA vs Proteins - It is not that simple

  • Current state of RNA drug discovery

  • Challenges in RNA-targeting projects

  • We still want to find active molecules!

  • Take-home messages

As per the title, the book discusses several aspects of using RNA as a drug target in drug discovery campaigns.

In general, I find it a well-balanced mix of why targeting RNA is so promising and detailed descriptions of the dedicated discovery and analytical methods used in the field. This makes it especially appealing for people like me, who are not (yet? 😉) experts in the topic, and at the same time a valuable reference for those already working in the field, as some chapters explore different methods in depth. Having such an overview of the current state of the art is particularly useful, and I’m grateful that I now know where to look for detailed explanations of methods in this field.

Throughout the reading, I found the challenges researchers face when working with RNA most interesting. This gives a new perspective on how to approach the drug discovery process (I focused on the in-silico-related aspects) when RNA, rather than proteins, is the target.

Let’s dig in, shall we?

RNA “is” a lot

What do I mean by “is” a lot? I mean that RNA not only “is doing” a lot (see later paragraphs), but simply “is” a lot. Take expression levels, for example: only approximately 2% of the human genome encodes proteins, yet more than 85% is transcribed into RNA at some point (Chapter 13). These numbers alone show that the difference in the sheer number of molecules we can potentially target is huge (of course, there are many exceptions on both sides in terms of what you can and can't target or how, but still, I find this difference astonishing).

Now for the RNA “is doing” a lot part. RNA is involved in a wide range of cellular processes. For a long time, RNA was regarded mainly as a carrier of genetic information (mRNA) (Chapter 2), which already makes it an intriguing drug target. After all, every protein deemed “undruggable” is translated from an RNA sequence. Targeting the RNA (or better, the pre-mRNA) that encodes an “undruggable” protein would block its activity downstream.

But that’s not all - it gets even more interesting. RNA’s function is not purely relegated to being just the messenger of genetic information. Non-Coding RNAs (ncRNAs), which are transcribed but not translated into proteins, are believed to be involved in:

  • Regulation of transcription, translation, and gene expression.

  • Macromolecular scaffolding - an example is the correct assembly of ribosomes.

  • Sensing the environment - RNA can change its structure in response to environmental conditions such as temperature, pH, or specific molecules. These structural changes can influence the RNA's function, such as its ability to bind to proteins or other RNAs, effectively triggering (or inhibiting) biological effects based on environmental variations.

  • Catalysis.

Being involved in so many different processes is just one reason why RNA is so interesting as a drug target. Another is that targeting RNA doesn’t necessarily mean inhibition – it can also mean activation. Its role as a regulator of cellular processes makes it a viable candidate across different therapeutic areas, including viral infection, cancer, and genetic diseases (Chapter 5).

RNA vs Proteins – It is not that simple

So, should we now forget about proteins to solely focus on RNA? Surely not, as proteins still play a crucial role, even when the goal is to target RNA.

Let me clarify.

Proteins, especially those bearing one or more RNA-binding domains (RBDs), are key regulators of RNAs within the cell. They bind RNAs via various types of amino acid sequences, such as:

  • RNA Recognition Motifs (if you are curious, you can check PDB crystal structures 1FXL, 2FY1).

  • Double-stranded RBDs (example 6HTU).

  • Zinc fingers (examples 5U9B, 3D2S).

  • K homology domains (example 2ATW, see image below)

So why are interactions between RNA and RNA-binding proteins (RBPs) important for drug discovery? Because when they are dysregulated, they can lead to various diseases, or, as the book says in Chapter 11: "The RBP is lost, and it wreaks havoc on the cell". This makes RNA-protein interactions important drug targets in their own right.

An example of a K homology (KH) domain of a protein (KH1 in blue and KH2 in green) binding to RNA (in pink) (PDB: 2ATW). This interaction is significant for drug targeting. The structure is visualized using Discngine's 3decision® application, where the Annotation Browser automatically highlights the KH domains.

A complex molecular mechanism that exemplifies the importance of what was mentioned above is the spliceosome, the system controlling the splicing of RNA. It enables the inclusion or exclusion of different segments at the pre-mRNA level to generate different proteins from a single gene. This clever mechanism allows humans, among others, to produce between 40k and 100k distinct proteins despite having roughly only 19k genes.
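To make the combinatorics of exon inclusion and exclusion concrete, here is a minimal Python sketch. It assumes a hypothetical gene with five exons, where the first and last are always included and the three in the middle can each be kept or skipped. The exon names and sequences are invented, and real splicing is far more constrained than this toy enumeration.

```python
from itertools import product

# Hypothetical pre-mRNA: each entry is (name, sequence, can_be_skipped).
# Names and sequences are invented for illustration only.
EXONS = [
    ("E1", "AUGGCU", False),
    ("E2", "GAACCG", True),
    ("E3", "UUUGGA", True),
    ("E4", "CCAGAU", True),
    ("E5", "GGCUAA", False),
]

def enumerate_isoforms(exons):
    """Return every mature mRNA obtainable by keeping or skipping optional exons."""
    optional_idx = [i for i, (_, _, optional) in enumerate(exons) if optional]
    isoforms = []
    for choices in product([True, False], repeat=len(optional_idx)):
        keep = dict(zip(optional_idx, choices))
        # Exons that are not optional are always kept.
        included = [(name, seq) for i, (name, seq, _) in enumerate(exons)
                    if keep.get(i, True)]
        names = "-".join(name for name, _ in included)
        sequence = "".join(seq for _, seq in included)
        isoforms.append((names, sequence))
    return isoforms

if __name__ == "__main__":
    isoforms = enumerate_isoforms(EXONS)
    print(f"{len(isoforms)} isoforms from one gene")
    for names, sequence in isoforms:
        print(names, sequence)
```

With only three optional exons, this toy gene already yields 2^3 = 8 different transcripts, which gives a feel for how roughly 19k genes can give rise to a much larger number of proteins.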

Therapeutically targeting the spliceosome can lead to a variety of outcomes. For example, modulating splicing can suppress the production of proteins that accumulate and produce toxic effects (e.g., toxic tau aggregates related to frontotemporal dementia with Parkinsonism). On the other hand, it can restore the production of fully functional proteins, as seen in Spinal Muscular Atrophy (SMA), where splicing correction enhances the production of survival motor neuron (SMN) protein. The last example is actually how the only small-molecule FDA-approved splicing-modulating drug works (more later).

Current state of RNA drug discovery

So, which drugs out there are targeting RNA? From Chapter 6, I could extrapolate this summary:

  • Antisense Oligonucleotides (ASOs) and small interfering RNA (siRNAs): Despite several being marketed, and many more in clinical trials, these molecules face significant challenges. Those include high cost, limited metabolic stability, and suboptimal biodistribution to the site of action.

  • Peptides: Their large surface area facilitates their interaction with RNA, and they are generally easier to modify and optimize than oligonucleotides. However, their therapeutic applications are currently limited due to stability and biodistribution limitations.

  • Small molecules: The best-known marketed small-molecule drugs targeting RNA are antibiotics such as macrolides and tetracyclines, which act on bacterial ribosomal RNA (rRNA). However, these drugs do not target human RNA. If we focus only on FDA-approved small-molecule drugs targeting human RNA, the total shrinks to a single example: Risdiplam (Evrysdi®), approved in 2020 for the treatment of SMA (Chapter 7). Beyond Risdiplam, other small molecules are (or were) quite advanced in clinical trials. The book cites Zotatifin, in Phase 2 for breast cancer and SARS-CoV-2, and Branaplam, initially developed by Novartis for SMA and Huntington’s Disease, whose clinical trials have been paused (Chapter 4). On the more experimental side, there is a plethora of interesting approaches. One that is still in its infancy, but that I find worth mentioning, is reported in Chapter 9: RiboTACs (ribonuclease-targeting chimeras). Like their cousins, the PROTACs, these are heterobifunctional small molecules that induce degradation of the target. Definitely ones to keep an eye on!


[UPDATE] Unfortunately, a quick web search suggests that the company that developed Zotatifin - Effector Therapeutics - is no longer operating, and that Novartis has stopped the clinical trials on Branaplam due to toxicity concerns.


Challenges in RNA-targeting projects

Knowing the clear benefits of RNA-targeting drugs inevitably raises an overarching question:

Why are there so few FDA-approved small molecules targeting human RNA?

Targeting RNA for drug development presents unique challenges compared to proteins. Based on the book, I could broadly categorize these challenges into two groups: “intrinsic” (mainly covered in Chapter 3) and “technological” (discussed primarily in Chapter 13), which I will summarize below.

Intrinsic challenges

By intrinsic challenges, I mean all of the problems that derive from the nature of RNA, such as its structure. Structure-based drug design is one of the most common starting points in drug discovery, and having access to high-quality RNA structures would be highly beneficial. However, these are difficult to obtain, mainly due to RNA flexibility. In detail:

  • The main interactions driving RNA folding are the base stacking interactions. Due to its flexibility, RNA can form tertiary interactions with nucleotides that are distant in the linear sequence.

  • RNA is (very) often found in complex with proteins, and these interactions heavily influence its structure. In addition to protein binding, folding is also influenced by other factors like pH and different ion concentrations.

  • Current structural determination techniques face limitations when applied to RNA. RNA molecules do not crystallize well, which hampers the effectiveness of X-ray Crystallography. As the molecules are generally small, it is difficult to study them with cryo-EM. Using NMR is also challenging. RNA is comprised of four similar nucleotides, which leads to lower chemical shift dispersion and overlap of NMR signals.

  • Besides NMR, no techniques currently can directly determine the bioactive conformation of RNA with a reasonable degree of certainty.

Additionally, RNA conformations with low populations and short lifetimes, known as Excited States (ES), play crucial regulatory roles. Even lowly populated states (<1%) of a dynamic ensemble of interconverting conformations (we are talking about lifetimes in the range of microseconds) can be functionally important. The problem is that those states are undetectable with conventional structural biology techniques (see above).
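To get a feel for what such small populations mean energetically, here is a short Python sketch based on the standard two-state Boltzmann relation, ΔG = -RT ln(p_ES / p_GS). This calculation is not taken from the book: it assumes only two interconverting states (ground and excited) at 25 °C, and the function name is purely illustrative.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def excited_state_free_energy(population_es, temperature_k=298.15):
    """Free-energy gap (kcal/mol) between excited and ground state.

    Assumes a simple two-state equilibrium, so the ground-state
    population is 1 - population_es.
    """
    if not 0.0 < population_es < 1.0:
        raise ValueError("population must be strictly between 0 and 1")
    population_gs = 1.0 - population_es
    return -R_KCAL * temperature_k * math.log(population_es / population_gs)

if __name__ == "__main__":
    for pop in (0.10, 0.01, 0.001):
        dg = excited_state_free_energy(pop)
        print(f"ES population {pop:.1%}: {dg:.2f} kcal/mol above the ground state")
```

Under these assumptions, a state populated at 1% sits only roughly 2.7 kcal/mol above the ground state, so a fairly modest stabilization by a ligand or a protein partner could, in principle, shift the balance noticeably.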

Beyond the difficulty of obtaining the structure (or better, the ensemble of structures) of the (bioactive) conformations, when targeting RNA one should also consider that:

  • Unlike enzymes, most RNA targets lack a catalytic site, so ligands typically work by stabilizing certain conformations rather than by blocking an active site. This makes Surface Plasmon Resonance (SPR) the main method for measuring affinity (Chapter 4).

  • RNA can be found everywhere in the cell, from the nucleus to the cytoplasm to the ribosomes. Depending on the target of the drug discovery campaign, the ligand must be able to reach the desired cellular compartment.


Technological challenges

In this section, I summarize the technological difficulties of working on a project that computationally targets RNA:

  • Lack of RNA structures: As mentioned in the previous section, current structure determination technology remains limited, and it is difficult to get an RNA structure. The number of RNA-containing structures in the PDB reflects this situation: there are only about 6.5k, compared to approximately 200k protein structures. Among these, fewer than 2k are RNA-only. And among those, 1.2k are actually synthetic RNA constructs with no biological or disease-relevant function.

  • Dedicated software: According to the book, there aren't many reliable software tools for predicting RNA structures that would compensate for experimental limitations. The existing ones lack proper force fields and good scoring functions, which makes it difficult to evaluate the generated models (more later).

  • Standardization: Standardization is another considerable technical challenge. As there's no official standard for analyzing sequencing data in RNA-target engagement studies, every lab has developed its own pipeline, leading to a variety of approaches that are not easily comparable.
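The PDB numbers quoted in the first bullet can be put side by side in a few lines of Python. The counts are the approximate, rounded figures from the text (with “fewer than 2k” taken as an upper bound of 2,000), so the results are orders of magnitude, not exact statistics queried from the PDB.

```python
# Approximate counts quoted in the text above (not queried from the PDB).
PROTEIN_STRUCTURES = 200_000
RNA_CONTAINING = 6_500
RNA_ONLY_UPPER_BOUND = 2_000   # "fewer than 2k"
SYNTHETIC_RNA_ONLY = 1_200

def summarize():
    """Return simple ratios derived from the quoted counts."""
    return {
        # RNA-containing entries relative to protein entries, in percent.
        "rna_vs_protein_pct": 100 * RNA_CONTAINING / PROTEIN_STRUCTURES,
        # Upper bound on the share of RNA-only entries among RNA-containing ones.
        "rna_only_share_pct_max": 100 * RNA_ONLY_UPPER_BOUND / RNA_CONTAINING,
        # Upper bound on RNA-only entries that are not synthetic constructs.
        "non_synthetic_rna_only_max": RNA_ONLY_UPPER_BOUND - SYNTHETIC_RNA_ONLY,
    }

if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value:.1f}")
```

In other words, RNA-containing entries amount to roughly 3% of the protein count, and at most around 800 RNA-only structures are left once the synthetic constructs are set aside.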

Example of an RNA-only structure from the PDB (PDB: 7SHX). This NMR structure shows how a single nucleotide change (C to A, highlighted in white) can reshape the architecture of a non-coding RNA, modulating gene expression over long distances. In this example, such structural variation reduces gene expression. The structure is visualized using Discngine's 3decision® application.

We still want to find active molecules!

It often starts with a model

When there are challenges, especially in complex and/or emerging fields (well, this one is not really new, but it has recently been gaining momentum), scientists (modelers, in this case) become very ingenious at working with what is available. Depending on the starting point, models can be generated using several approaches:

  • Homology modelling: Useful in cases of very closely related RNA sequences.

  • Fragment assembly (in a sense, an extension of homology modelling): This method uses libraries of small structural motifs from unrelated RNA structures, which are remapped onto the target sequence and recombined to assemble a model. The model is later scored with a semi-empirical scoring function. The book mentions several methods or programs that use this technique, including FARFAR2 (Fragment Assembly of RNA with Full-Atom Refinement), FARNA (Fragment Assembly of RNA), and MC-Sym.

  • Ab-initio folding: This approach uses Monte Carlo sampling or plain molecular dynamics simulation. It is employed when no similar RNA with atomic-level structural information is available for the prediction, or when 3D fragment templates could introduce bias.

Further, information from the wet lab is added as constraints during model generation, which usually improves model quality considerably. This information comes from structural interrogation with chemical probes, direct mapping of RNA-RNA interactions, and spatial proximity mapping.
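The idea of combining Monte Carlo sampling with experimental restraints can be sketched on a deliberately tiny toy problem. The Python code below is not how FARFAR2 or any other tool mentioned in the book works: the “conformation” is a single number, the energy function is invented, and the restraint stands in for a hypothetical wet-lab measurement. It only illustrates the Metropolis accept/reject rule and how a restraint penalty reshapes what gets sampled.

```python
import math
import random

def toy_energy(x):
    """Invented double-well energy with two equal minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2

def restraint_penalty(x, target=1.0, weight=2.0):
    """Harmonic penalty pulling x toward a hypothetical experimental value."""
    return weight * (x - target) ** 2

def metropolis(energy_fn, steps=20000, step_size=0.5, kt=0.3, seed=0):
    """Sample x with the Metropolis criterion and return the visited values."""
    rng = random.Random(seed)
    x = 0.0
    e = energy_fn(x)
    samples = []
    for _ in range(steps):
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy_fn(x_new)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / kt):
            x, e = x_new, e_new
        samples.append(x)
    return samples

def fraction_in_right_well(samples):
    """Fraction of samples on the x > 0 side, i.e. in the well around x = +1."""
    return sum(1 for x in samples if x > 0) / len(samples)

if __name__ == "__main__":
    free = metropolis(toy_energy)
    restrained = metropolis(lambda x: toy_energy(x) + restraint_penalty(x))
    print("fraction near x=+1, no restraint:  ", round(fraction_in_right_well(free), 2))
    print("fraction near x=+1, with restraint:", round(fraction_in_right_well(restrained), 2))
```

Without the restraint, the sampler has no reason to prefer one well over the other. With it, the run should spend nearly all of its time around x = +1, which is the toy equivalent of experimental data steering the model toward the conformations that agree with the measurements.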

Model creation alone is not enough: Scoring

Once the models are built, they need to be scored, which currently presents another big challenge in the field. The book explains that there is not yet a gold standard for scoring functions of RNA models. However, it mentions one solution that has been adopted to improve the quality of the models: a deep learning technique called ARES (Atomic Rotationally Equivariant Scorer). This seems to really help in telling which models are “good” and which are not.

Lead optimization tools

Now that the models have been built and scored, the next step is to find an entity that binds effectively. The book mentions that the landscape is still rather sparse, as the classic molecular modelling tools for drug discovery are not yet very reliable when handling RNA.

It seems that none of them currently outperforms the others. The inconsistent results of RNA-ligand docking, for example, often stem from a common issue: the lack of a force field tailored for RNA-ligand interactions. As an alternative to licensed software, one could test some algorithms developed by academic groups. These are generally not user-friendly - some are command-line only - but some studies have reported good performance. Here is a list if you are curious to search for them online: RiboDock, MORDOR, RLDock, NLDock.


[UPDATE]: Almost all the companies offering software for drug discovery are actively investing in solving this problem, and some have already launched their RNA-tailored force field.


Take-home message(s)

If you made it this far (ok, ok, also if you just skipped to this section), after going from challenges to potential solutions, passing through tools, here are the take-home messages I got from reading the book on RNA for drug discovery:

  • One can potentially indirectly target so-called “undruggable” proteins by targeting the RNA encoding for them.

  • In the context of RNA, we can no longer talk about single structures but about an ensemble of structures. Sure, that is true for some protein targets as well, but here, it is brought to the extreme, as even very low-populated conformational states of RNA can have crucial regulatory roles.

  • Unfortunately, one of the main challenges in RNA targeting is generating a reliable ensemble of structures.

  • The quality of the generated models improves with the computing power spent exploring the conformational space close to the native state, and with the experimental constraints from wet-lab experiments that are enforced during modelling.

These are my take-home messages, biased towards my background and interests in supporting the modelers and the medicinal chemists.

However, the book has so much valuable information on various RNA discovery aspects, and, in my opinion, it is worth reading the entire book, even if a certain chapter might be a bit far from your primary field of expertise.


Wish to comment on the article? Please visit the original content and comments on Riccardo Martini’s LinkedIn Pulse.


Before closing: an interesting additional highlight

As a final note, I’d like to share a fascinating insight from the book in Chapter 10 on frameshifting - one that, to me, perfectly illustrates the ingenuity of nature and reinforces just how cool it is to work in life sciences.

Viruses have a limited genome. Some of them contain stretches, called slippery sequences, that are particularly “hard to read” for the cellular machinery. These tend to induce a shift of the reading frame (typically by one nucleotide) during translation, so that the cell ends up producing a totally different viral protein (the shifted protein). Stabilizing the slippery sequence so that the original protein is correctly produced 100% of the time means that the shifted protein is no longer made. And, as the shifted protein is also necessary for the virus, viral replication is blocked without it!
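A toy Python sketch shows why slipping by a single nucleotide changes everything downstream. The RNA sequence and the slip position are invented for illustration, and the model ignores everything that makes real frameshifting happen (RNA structure, ribosome kinetics, and so on). It only compares the codons read in the original frame with those read after stepping back one nucleotide.

```python
def split_codons(rna, start=0):
    """Split an RNA string into complete triplets beginning at index `start`."""
    return [rna[i:i + 3] for i in range(start, len(rna) - 2, 3)]

def read_with_slip(rna, slip_at):
    """Read codons in the original frame up to `slip_at`, then continue in the -1 frame."""
    if slip_at % 3 != 0:
        raise ValueError("slip_at must fall on a codon boundary in this toy model")
    before = split_codons(rna[:slip_at])
    # Stepping back one nucleotide re-reads the last base of the previous codon.
    after = split_codons(rna, start=slip_at - 1)
    return before + after

if __name__ == "__main__":
    rna = "AUGGCUUUUAAACGGGUUUGCAUAA"  # invented sequence
    print("original frame:", " ".join(split_codons(rna)))
    print("after -1 slip: ", " ".join(read_with_slip(rna, slip_at=12)))
```

The codons before the slip are identical in both readings, while every codon after it differs, which is how a single stretch of genome can encode two different proteins.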

 

Many thanks to Discngine for purchasing such a great book!