What are the common directions shaping modern antibody discovery? – Reflections from antibody events

Introduction – Learning from an Antibody Enthusiast

Over the past couple of months, I’ve had the chance to attend (and virtually follow) multiple antibody-focused events.

The most interesting were two conferences I attended in person: PEGS Europe 2025, a major protein engineering summit, and the US-based Biologics Summit 2026; and two events I followed online: the 5th International Antibody Validation Meeting and Discngine Labs Live.

These four events gave me a broad view of the community’s challenges and solutions in antibody discovery, which I found very useful as Leader of the Biologics Journal Club and Biologics Solution Lead at Discngine.

Here, I want to share the common threads that ran through those events, including:

  1. In silico developability – the earlier the better 

  2. Bispecifics – double the power, double the trouble 

  3. Machine learning/AI in the loop 

  4. The data bottleneck – small, siloed, and too positive 

  5. Antibody validation “crisis” 

I will also mention what personally stood out to me to weave those event insights into a broader picture of where antibody R&D is heading, the common challenges, and opportunities.


1. In silico developability – the earlier the better

A widely covered topic that came up in many talks was developability. Speakers highlighted how the field has been steadily evolving toward more efficient ways of answering a deceptively simple question: Will my antibody actually work as a therapeutic product?

There is a clear shift away from the traditional “trial and error” experimental approach toward the growing use of computational methods to predict developability properties much earlier (before significant investment in production). Several groups presented models and workflows that analyze antibody sequences and, increasingly, their predicted structural properties to help identify potential risks at very early stages of discovery. 

One example discussed was an in silico approach that combines sequence‑based descriptors (such as charge, hydrophobic patches, and amino acid composition) with structure‑based features (including solvent‑accessible surface areas and contact maps). The combined models consistently outperformed those relying on either type alone and showed strong correlation with experimental Size Exclusion Chromatography (SEC) results, a commonly used proxy for aggregation propensity. With this approach, scientists can computationally pre-screen large libraries of antibody candidates and focus experimental efforts on the most “developable” ones.
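To make the idea of sequence-based descriptors more concrete, here is a minimal sketch of my own, not the model presented in the talk. It computes three of the descriptor types mentioned above from a single chain: a crude net charge, the amino acid composition, and a sliding-window hydropathy maximum as a rough linear stand-in for a hydrophobic patch (real patch detection needs a 3D structure). The function name, the window size, and the charge approximation are assumptions made for illustration; the hydropathy values are the published Kyte-Doolittle scale.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_descriptors(seq, window=5):
    """Return simple sequence-only descriptors for one antibody chain.

    Hypothetical helper for illustration; `window` is an arbitrary choice.
    """
    seq = seq.upper()
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")

    counts = Counter(seq)
    n = len(seq)

    # Crude net charge near neutral pH: K/R count as +1, D/E as -1.
    # Histidine and the chain termini are ignored (a simplification).
    net_charge = counts["K"] + counts["R"] - counts["D"] - counts["E"]

    # Fraction of each of the 20 standard amino acids.
    composition = {aa: counts[aa] / n for aa in KYTE_DOOLITTLE}

    # Highest mean hydropathy over any window of consecutive residues.
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    window_means = [
        sum(scores[i:i + window]) / window for i in range(n - window + 1)
    ]

    return {
        "net_charge": net_charge,
        "composition": composition,
        "max_window_hydropathy": max(window_means),
    }


if __name__ == "__main__":
    # Toy sequence, not a real antibody.
    result = sequence_descriptors("KKDEAAAAA")
    print(result["net_charge"], result["max_window_hydropathy"])
```

In a combined model of the kind described above, features like these would sit alongside structure-derived ones (for example, solvent-accessible surface areas) as inputs to a predictor trained against SEC data.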

The Novartis presentation at Discngine Labs echoed these findings and extended them further through a more integrated computational biologics discovery approach. In their case, scientists perform early filtering of large sets of initial binders to detect and remove a significant fraction of candidates (historically around 40% of early hits) with unfavorable properties before committing to laboratory work. The workflow considers the relationships among sequence, structure, dynamics, and the resulting properties and functions. Optimization strategies and predictive models are continuously calibrated against experimental reality, including affinity measurements and key developability metrics such as thermal stability and SEC profile.

These and other talks show that developability is increasingly being addressed through new, more reliable, and scalable computational tools.

Three bold takeaways of early developability assessment include faster transitions to development, fewer experimental cycles and better candidate quality.
– Bruck Taddesse, PhD, Senior Expert II Data Science, Novartis Biologics Research Center

I should note that, no matter the approach, all the presenters emphasized that computational predictions aren’t definitive – they only work as well as the data behind them (more on the data topic in section 4 below). Nonetheless, their ability to “fail fast” by eliminating high-risk candidates in silico represents a meaningful early step toward more efficient antibody discovery.

The theme of early developability screening resonated strongly with me. At Discngine, for example, we are advancing our solution 3dpredict/Ab, which relies on physics-based predictors applied to antibody ensembles, to rapidly screen and rank up to hundreds of antibody candidates early in discovery. 

2. Bispecifics – double the power, double the trouble

Another hot topic at conferences was bispecific antibodies (bsAbs) – antibodies engineered to bind two different targets simultaneously. They are often cited as the fastest-growing segment of the antibody drug pipeline. Bispecifics are gaining momentum due to their ability to enable new mechanisms of action that traditional antibodies cannot achieve, and they are widely studied across immunotherapy and oncology. Notable clinical successes include Blinatumomab (CD3×CD19) in leukemia and Faricimab (VEGF×Ang-2) in retinal disease (image below).

 

CrossMab molecule illustration of the therapeutic bispecific antibody Faricimab: anti-vascular endothelial growth factor-A (VEGF-A) Fab in orange (PDB: 1CZ8) and anti-angiopoietin-2 (Ang-2) Fab (PDB: 4IML) in blue. The two Faricimab Fabs are overlaid on a full antibody structure (PDB: 1IGT). The image was created using Discngine’s 3decision software.

 

Alongside therapeutic opportunities, there was a broad consensus that bispecifics amplify many of the existing challenges in antibody discovery. In particular:

  • Developability is even harder to predict. There are dozens of bsAb formats (IgG-like, tandem scFv, knobs-into-holes, etc.), each introducing distinct manufacturing, stability, and formulation challenges.

  • Data scarcity is greater than for conventional antibodies: far fewer experimental datasets are available for models to learn from. Moreover, approaches trained on monospecific antibodies often fail to generalize to more complex formats.

  • Validation is complex, as it requires demonstrating that the antibody binds each target independently, while also confirming that it functions as intended when both targets are bound at the same time.

At the Discngine Labs event, Merck’s Principal Scientist Giuseppe Licari presented his group's very recent work that aims to address complex developability challenges at the interface of drug discovery and development. He showed that commonly used global descriptors, such as overall isoelectric point (pI), are often insufficient to explain or predict self-association in bispecific architectures, where charge asymmetry across independently engineered domains can dominate intermolecular behavior. Instead, a computational strategy based on domain-level pI and charge balance, informed by homology modeling and structure-based electrostatic calculations, was presented as a more mechanistically grounded approach. Using a bispecific IgG1-VHH case study, rational charge engineering guided by this framework produced variants with significantly improved colloidal stability and reduced viscosity compared to the starting molecule, with predictions confirmed experimentally across formulation-relevant conditions.
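To illustrate why a global pI can hide charge asymmetry, here is a small sequence-only sketch of my own. It is not the method from the talk, which relied on homology modeling and structure-based electrostatic calculations. The sketch estimates pI with the Henderson-Hasselbalch equation and a bisection search, then compares the pI of two toy "domains" with the pI of their concatenation. The pKa values are one of several published sets and are an assumption here; the domain sequences and function names are made up.

```python
# Assumed pKa values (several published sets exist and give slightly
# different results; this choice is an assumption for illustration).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(seq, ph):
    """Henderson-Hasselbalch net charge of a linear chain at a given pH."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa in seq.upper():
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def isoelectric_point(seq, tol=1e-4):
    """Find the pH where the net charge crosses zero, by bisection.

    Net charge decreases monotonically with pH, so bisection on the
    interval [0, 14] is sufficient for ordinary sequences.
    """
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(seq, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    # Toy sequences, not real antibody domains.
    basic_domain = "KKKKRRAGS"
    acidic_domain = "DDDDEEAGS"
    print("basic domain pI: ", round(isoelectric_point(basic_domain), 2))
    print("acidic domain pI:", round(isoelectric_point(acidic_domain), 2))
    print("global pI:       ",
          round(isoelectric_point(basic_domain + acidic_domain), 2))
```

The concatenated chain comes out close to neutral even though one half is strongly basic and the other strongly acidic, which is the kind of asymmetry a single global number cannot show.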

This is one of many cases that illustrate how the potential of bispecific antibodies in therapeutic applications is actively driving the field to confront amplified developability challenges, address knowledge gaps, and move toward practical solutions.


The recording of the Discngine Labs event “Navigating developability challenges across antibody modalities: computational tools and approaches” is available for free on this page. Access approved presentations and a roundtable discussion with industry experts.


3. Machine Learning and AI in the loop

Many talks and discussions addressed the question: “How are machine learning and AI transforming antibody discovery?” I’ve already touched on one example above: in silico models for early developability assessment. But the discussions extended toward more advanced AI approaches that aim to guide prioritization, design, and candidate optimization. As a computational scientist, I found the following topics particularly interesting.

A recurring highlight was the rapid emergence of protein language models (PLMs) – the biological counterparts of GPT-style models trained on millions of protein sequences.

At the Biologics summit, a team from Sanofi presented a next-generation model called NextGenPLM, which incorporates 3D structural information and interaction data during pre-training, rather than relying on sequence alone. Notably, NextGenPLM outperformed previous models, like Meta’s ESM-2, on antibody-specific benchmarks for property predictions and variant effect assessment.

However, limitations of this and other PLMs remain. In particular, predicting hypervariable loops, especially CDR-H3, remains challenging. Its extreme sequence and conformational diversity mean that there is a very limited evolutionary conservation signal for general model training.

One promising direction to address this challenge is the development of antibody‑specific models trained on the Observed Antibody Space (OAS), a large collection of experimentally determined antibody sequences. These specialized models learn that positional context matters (e.g., the same amino acid motif might behave differently in different CDR loops and therefore have a different function). They enable improved CDR design, affinity maturation predictions, and the generation of diverse antibody libraries.

Another particularly exciting topic was how AI is being used to navigate the vast antibody sequence space when selecting candidates for experimental testing. An interesting strategy that stood out was deep batch active learning. Instead of blindly testing hundreds of candidates or simply picking a few top-scoring ones, active learning algorithms identify batches of experiments that are maximally informative. Crucially, they select a diverse batch (to avoid duplicates or redundant testing) and leverage model uncertainty to prioritize experiments that will teach the model the most. After each experimental round, the model is retrained on the newly generated data, and the next batch is selected accordingly. Reported results show a three- to fivefold improvement in discovery efficiency compared to random screening, translating into savings in both time and experimental resources.
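The selection step can be sketched in a few lines. This is a simplified greedy stand-in of my own, not the deep batch method presented at the events. Each candidate is scored by its model uncertainty plus a bonus for being far, in Hamming distance, from sequences already chosen for the batch. The function names, the `diversity_weight` parameter, and the toy uncertainty values are all hypothetical; in practice the uncertainty would come from a trained model (for example, disagreement within an ensemble), and the model would be retrained after each round.

```python
def hamming(a, b):
    """Number of differing positions; assumes equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def select_batch(candidates, uncertainty, batch_size, diversity_weight=1.0):
    """Greedily pick a batch that is both uncertain and diverse.

    candidates: list of equal-length sequences
    uncertainty: dict mapping each sequence to a model uncertainty score
    """
    selected = []
    remaining = list(candidates)

    def score(seq):
        # The first pick is driven by uncertainty alone.
        if not selected:
            return uncertainty[seq]
        # Later picks are rewarded for distance to the closest selected one.
        nearest = min(hamming(seq, chosen) for chosen in selected)
        return uncertainty[seq] + diversity_weight * nearest

    while remaining and len(selected) < batch_size:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy candidates and made-up uncertainty scores.
    pool = ["AAAA", "AAAT", "GGGG", "AATT"]
    scores = {"AAAA": 0.9, "AAAT": 0.85, "GGGG": 0.5, "AATT": 0.6}
    print(select_batch(pool, scores, batch_size=2, diversity_weight=0.2))
    print(select_batch(pool, scores, batch_size=2, diversity_weight=0.0))
```

With the diversity term switched off, the batch collapses onto two near-identical sequences; with it on, the second pick moves to a different region of sequence space, which is the redundancy-avoiding behavior described above.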

 

One of the current trending topics is the use of available data and AI methods to advance antibody design, prioritization, and candidate optimization.

 

4. The Data Bottleneck – Small, siloed, and too positive

For AI methodologies to work, data is essential. However, the current landscape is challenging, mainly due to persistent data scarcity. Nearly every talk I attended noted, to varying degrees, that antibody-related datasets are limited, fragmented, and scattered.

Several common reasons for that were highlighted:

  • Experimental antibody datasets are small, typically consisting of a few hundred to a few thousand data points at best. Moreover, they are siloed within laboratories and companies and are often proprietary, which makes them barely accessible to the broader scientific community.

  • The available data is biased toward positive outcomes. A particularly critical issue discussed was the near absence of negative data, as failed experiments are rarely published or shared. As a result, models trained predominantly on successful outcomes tend to be overconfident and are not able to identify failures.

  • Structural data is even more limited. High-quality antibody-antigen complex structures are fewer than needed, although their number is steadily increasing. While specialized modeling tools, such as RosettaAntibody, ABodyBuilder3, IgFold, ABlooper, DeepABlike, and others, help bridge this gap, they’re not fully reliable, particularly for highly flexible regions such as the CDR-H3 loop.

The data bottleneck is not completely new to the field. At the 2024 Discngine event on the potential of AI in antibody discovery, speakers from industry and academia raised similar concerns around data accessibility and quality. While the broad release of proprietary experimental datasets is unlikely, there was consensus among speakers that sharing methods and modeling strategies developed using such data could allow smaller organizations and academic groups to benefit from insights generated behind corporate firewalls. Many emphasized that the greatest impact would actually come from community‑wide benchmarking initiatives, such as the AIntibody challenge, that systematically assess which AI-based approaches work and which do not across realistic antibody discovery and developability tasks.

We model the data because we don’t have the experimental ones.
If I had a wish to improve biologics research, it would be that we have this experimental data.
— Essam Metwally, PhD, Principal Scientist at Merck

5. The Antibody Validation Crisis

Beyond discovery, conferences and online discussions are increasingly focused on the reliability of antibodies already on the market. The question raised often is: “Can we trust our antibodies?”

The uncomfortable reality shared is that an estimated ~50% of commercial antibodies do not perform as advertised. Many fail to bind their intended target or function only in specific assays, leading to wasted resources, irreproducible results, and, in some cases, retracted findings. The speakers stressed that “validation is not just a QC checkbox, it’s a scientific necessity”, which means that antibodies must be verified as fit for the specific application. An antibody that performs well in a Western blot, for example, might fail in flow cytometry or tissue immunostaining. A striking case cited was a widely used antibody against PP2A that was later found to cross-react with another protein, casting doubt on years of results obtained before the issue was discovered.

On the positive side, the field is adopting more rigorous validation practices to address this “crisis”. Many laboratories are transitioning from variable polyclonal reagents to sequence-defined recombinant monoclonal antibodies that offer consistent performance and renewability. Epitope mapping has become more common, enabling precise identification of the exact target region recognized by each antibody, helping to predict and avoid off-target binding. In addition, researchers are embracing orthogonal validation strategies, avoiding reliance on a single assay, and instead confirming antibody specificity using independent approaches (for example, in knockout cell lines or via mass spectrometry).

Several collaborative initiatives were also highlighted, including MILKSHAKE, which provides a standardized workflow for antibody validation, and HuBMAP (the Human BioMolecular Atlas Program), which supports community‑driven validation of antibodies used in tissue imaging and makes validation data publicly accessible. These efforts reflect a growing recognition that transparency and shared standards are essential for improving reproducibility.

Conclusion and future directions

An image from BioLogic summit 2026, January 19-22, in San Diego, with colleagues from Chemical Computing Group.

Taken together, the four events offered a valuable opportunity to assess where the antibody discovery field currently stands across several key topics, and where it may be heading next. Across all discussions, it was clear that meaningful progress is being made on multiple fronts:

  • the growing integration of computational approaches and machine learning to improve speed and resource efficiency

  • the rapid rise of bispecific antibodies to enable new therapeutic possibilities and the associated effort to address their amplified discovery challenges

  • the need for a more systematic approach to antibody validation.

The field is continuously moving forward, and events like these, which enable the sharing of collective experience and lessons learned, play an important role in accelerating progress.

Personally, I will continue to follow antibody‑focused events and share observations as the field evolves. Looking ahead, antibody discovery is likely to move toward greater complexity, with multispecific formats beyond bispecifics gaining traction and placing new demands on design, developability, and validation. At the same time, discovery workflows are expected to become more adaptive, with AI, experimentation, and validation increasingly connected in continuous, data‑driven cycles.
