Senior Data Scientist – AI/ML (CADD)
I’m supporting an innovative client at the forefront of chemistry, robotics, and AI who is looking to hire a Senior AI/ML Data Scientist to help advance their small-molecule discovery and computer-aided drug design (CADD) capabilities.
This is an opportunity to join a cutting-edge multidisciplinary team and play a key role in building and deploying state-of-the-art models that directly accelerate drug discovery.
The Role
As Senior Data Scientist, you will:
Develop and optimise advanced generative models (Transformers, GNNs, Diffusion Models) for molecular design and prediction tasks.
Build scalable pipelines for processing large chemical/biological datasets and training high-performance models.
Apply modern AI/ML techniques to challenges such as ADMET/QSAR prediction, reaction prediction, binding affinity prediction, and synthetic route design.
Work closely with computational chemists, medicinal chemists, and engineers to integrate AI results into real discovery workflows.
Design robust experiments to evaluate model quality and accuracy, along with the synthesizability and novelty of generated molecules.
Clearly communicate insights and recommendations across technical and non-technical teams.
Stay up to date with AI for drug discovery, multimodal models, and emerging research.
What We’re Looking For
MSc/PhD and at least 5 years of experience in Machine Learning, Computer Science, Computational Chemistry/Biology, or a related field.
Strong proficiency in Python and deep learning frameworks (PyTorch or TensorFlow).
Deep understanding of modern ML architectures: Transformers, GNNs, VAEs/GANs/Diffusion Models.
Experience leading complex ML projects end-to-end in a scientific context.
Track record of working with molecular data (SMILES, 3D structures) and biological datasets (protein sequences, assay data).
Familiarity with efficient training methods (LoRA, quantization, distillation) and GPU/distributed environments.
Experience with ML for protein structures or small-molecule interactions is highly valuable.
Strong communication, problem-solving abilities, and a collaborative mindset.
Nice-to-Have Experience
Cheminformatics tools such as RDKit
RAG systems and vector databases (FAISS, Pinecone, Milvus, Redis)
Protein language models (ESM, ProtBERT) or structure prediction approaches
Synthetic route evaluation frameworks
SQL/NoSQL databases
