Research Output
29 peer-reviewed papers, conference proceedings, and preprints spanning AI for science, computational biology, drug discovery, and RNA design.
RECOMB 2026
Introduces a generative language model tailored to mRNA sequence design, enabling the de novo generation of optimised therapeutic mRNA molecules with improved stability and translational efficiency.
Under review
Journal of Computational Biology, 2025
Applies deep batch active learning to efficiently guide experimental protein structure determination, dramatically reducing the number of experiments needed to build accurate structural models.
In press
EurIPS/NeurIPS SimBioChem 2025
Evaluates whether large time-series foundation models can act as zero-shot surrogates for mechanistic virtual patient models in clinical pharmacology, eliminating costly model re-training. Highlighted and forwarded to Nature.
NeurIPS AI4Science 2025
A multi-hop reasoning framework over biomedical knowledge graphs using path-based relational learning, enabling complex question answering across linked biological entities.
NeurIPS Negel 2025
An SE(3)-equivariant graph transformer that predicts aqueous solubility of small molecules with state-of-the-art accuracy by leveraging 3D geometric information invariant to rotation and translation.
ICML FM4LS 2025 · Selected Talk
A structure-aware protein language model that integrates 3D structural information into sequence embeddings, achieving superior performance on antibody discovery and optimisation benchmarks.
RECOMB 2025 · Lecture Notes in Computer Science, Springer · pp. 17–33
An active learning framework that selectively acquires experimental protein structure data to maximally improve prediction models, minimising the number of expensive crystallography experiments required.
Industrial & Engineering Chemistry Research, Vol. 64, Issue 13, 2025 · pp. 6825–6837
Combines in silico data generation with probabilistic machine learning to explore reaction mechanisms and build kinetic models for complex chemical processes, accelerating process development.
Nucleic Acids Research, Vol. 53, Issue 3, 2025
A full-length sequence language model for mRNA that jointly models UTR and coding regions, enabling comprehensive mRNA analysis, stability prediction, and sequence optimisation for therapeutic applications.
arXiv, 2025
Distils quantitative chemistry knowledge from LLMs into a prior for Bayesian optimisation, significantly accelerating the search for optimal chemical reaction conditions and reducing laboratory experiments.
Genome Research, Vol. 34, Issue 7, 2024 · pp. 1027–1035
A large language model pre-trained on codon sequences that improves mRNA vaccine design by predicting translation efficiency and stability, directly supporting the development of next-generation mRNA therapeutics.
ICML AI for Science Workshop 2024
Demonstrates that many-shot in-context learning with LLMs can effectively tackle molecular inverse design, generating molecules with target properties without additional fine-tuning.
Bioinformatics, Vol. 40, Issue 7, 2024
Uses large language model embeddings to represent lipid nanoparticles (LNPs) and predict their mRNA transfection efficiency, accelerating the design of superior mRNA delivery vehicles.
NeurIPS Generative AI and Biology 2023 · Spotlight Paper & Talk
The original CodonBERT spotlight paper: a codon-level BERT model that learns rich representations of mRNA sequences, enabling optimisation of codon usage for improved expression in therapeutic contexts.
eLife, Vol. 12, 2023
Applies deep learning-guided batch active learning to drug discovery, intelligently selecting which compounds to screen experimentally to identify potent candidates while minimising assay costs.
Bioinformatics, Vol. 39, Issue 4, 2023
A geometry-aware computational system for comparing protein molecular surfaces using 3D shape descriptors, enabling structure-based drug design and protein function annotation across large protein databases.
Oxford Synthetic Biology, Vol. 4, Issue 1, 2019
Systematically optimises the experimental conditions for the ligase cycling reaction (LCR), a key technique in synthetic biology for seamless gene assembly and library construction.
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2019
A subgraph isomorphism-based method for comparing RNA secondary structures, identifying structural similarities and functional relationships across large RNA databases with high accuracy.
ACS Synthetic Biology, 2019
Applies machine learning to rationally tune riboswitch performance, enabling programmable and quantitatively predictable gene regulation for synthetic biology circuits.
TU Darmstadt, 2018
Doctoral thesis developing computational methods for optimising single molecules and integrating these approaches into automated high-throughput screening workflows for accelerated drug and material discovery.
BIOspektrum, Issue 01/18, Springer, 2018
Reviews novel in silico computational methods for establishing green chemistry practices, highlighting how computer-aided approaches can reduce hazardous reagents and waste in chemical synthesis.
Nucleic Acids Research, 2018
Develops and characterises a novel RNA-based genetic switch triggered by the antibiotic ciprofloxacin, expanding the toolkit for ligand-responsive gene regulation in synthetic biology.
Journal of Computational Chemistry, 2018
A stream-based R framework for advanced statistical analysis of molecular dynamics simulation trajectories, enabling real-time analysis of large-scale MD simulations without full trajectory storage.
ChemBioChem, 2017
Uses the computational protein design tool FoldX to engineer improved thermostability in an industrially relevant transaminase enzyme, demonstrating in silico-guided biocatalyst engineering.
Algorithms for Molecular Biology, 12(1):15, 2017
Introduces Markov model-based algorithms for analysing coarse-grained RNA molecular dynamics, capturing conformational dynamics of RNA structures from connectivity-graph representations.
ACS Journal of Chemical Information and Modeling, 57(2):243–255, 2017
Computationally investigates how cleavage product accumulation progressively inhibits cutinase activity during enzymatic PET plastic biodegradation, informing enzyme engineering for improved plastic recycling.
WABI 2016 · Lecture Notes in Computer Science, Vol. 9838, Springer
Conference version presenting the StreAM-Tg algorithm for coarse-grained RNA dynamics analysis, demonstrating its application to benchmark RNA systems and comparison with all-atom MD.
PLoS One, 11(1):e0146104
Develops a statistical framework for evaluating high-throughput screening assays targeting enzymatic β-keto ester hydrolysis, enabling reliable hit identification and false positive reduction in enzyme discovery campaigns.
AlCoB 2015 · Lecture Notes in Computer Science, Vol. 9199, Springer
Introduces a memory-efficient stream-based algorithm for counting network motifs in large dynamic graphs, applicable to biological interaction networks and real-time social network analysis.