PLINDER documentation

PLINDER, short for protein-ligand interactions dataset and evaluation resource, is a comprehensive, annotated, high-quality dataset and resource for training and evaluating protein-ligand docking algorithms:

  • > 400k protein-ligand interaction (PLI) systems across > 11k SCOP domains and > 50k unique small molecules

  • 500+ annotations for each system, including protein and ligand properties, quality, matched molecular series and more

  • Automated curation pipeline to keep up with the PDB

  • 14 PLI metrics and over 20 billion similarity scores

  • Unbound (apo) and predicted AlphaFold2 structures linked to holo systems

  • Train-val-test splits and the ability to tune splitting based on the learning task

  • Robust evaluation harness to simplify and standardize performance comparison between models

Dataset access

Access the PLI systems and their annotations directly via the files

Dataset tutorial

Python API

Use the dedicated Python package to explore the data

Python API tutorial