PlinderDataset

class plinder.core.loader.PlinderDataset(df: pd.DataFrame | None = None, split: str = 'train', split_parquet_path: str | Path | None = None, store_file_path: bool = True, num_alternative_structures: int = 0, file_paths_only: bool = False)

Bases: Dataset

Creates a dataset from PLINDER systems.

Parameters:
df : pd.DataFrame | None

the split to use

split : str, default='train'

the split to sample from

file_with_system_ids : str | Path

path to a file containing a list of system ids (default: full index)

store_file_path : bool, default=True

if True, include the file path of the source structures in the dataset

num_alternative_structures : int, default=0

if available, load up to this number of alternative structures (apo and predicted)
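
The parameters above can be passed as keyword arguments. The following is a minimal sketch rather than an official usage pattern: `dataset_kwargs` and `build_dataset` are hypothetical helpers introduced here for illustration, and the only names taken from this page are the `plinder.core.loader.PlinderDataset` import path and the keyword names in its signature. The import is deferred so the snippet loads even where plinder is not installed.

```python
from typing import Any


def dataset_kwargs(
    split: str = "train",
    store_file_path: bool = True,
    num_alternative_structures: int = 0,
) -> dict[str, Any]:
    """Hypothetical helper: collect and sanity-check constructor arguments."""
    if num_alternative_structures < 0:
        raise ValueError("num_alternative_structures must be >= 0")
    return {
        "split": split,
        "store_file_path": store_file_path,
        "num_alternative_structures": num_alternative_structures,
    }


def build_dataset(**kwargs: Any):
    """Hypothetical helper: construct the dataset with a lazy plinder import."""
    # Assumes plinder is installed and its data is reachable.
    from plinder.core.loader import PlinderDataset

    return PlinderDataset(**kwargs)


# Request the training split with up to one alternative structure per system.
kwargs = dataset_kwargs(split="train", num_alternative_structures=1)

# Uncomment where plinder and its data are available:
# dataset = build_dataset(**kwargs)
```

Keeping argument assembly separate from construction lets the configuration be checked without touching the dataset itself.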