MD Workflow¶

The MD workflow is organized as a small sequence of reusable steps.

1. Label trajectories¶

Use mltsa.md.label_trajectories(...) to classify each trajectory as IN, OUT, or TS based on the mean rule value over the final window_size frames.

2. Build feature sets¶

Use mltsa.md.featurize_dataset(...) to create one named feature set inside the same MD HDF5 file.

Supported v1 feature families:

closest_residue_distances
all_ligand_protein_distances
bubble_distances
contact_map
pca_xyz

3. Train and explain¶

Use mltsa.md.run_mltsa(...) to load a selected feature set, fit a model, compute feature importances, and optionally save results to a separate results HDF5.

4. Export structures¶

Use mltsa.md.export_state_structures(...) to write multi-model IN.pdb, OUT.pdb, and TS.pdb files from an existing label experiment.

Minimal example¶

from mltsa.md import featurize_dataset, label_trajectories, run_mltsa

label_trajectories(
    trajectory_paths=["traj_0001.dcd", "traj_0002.dcd"],
    topology="topology.pdb",
    h5_path="md_dataset.h5",
    experiment_id="labels",
    rule="sum_distances",
    selection_pairs=[("index 10", "index 220")],
    lower_threshold=0.4,
    upper_threshold=0.8,
    window_size=25,
)

featurize_dataset(
    h5_path="md_dataset.h5",
    feature_set="closest",
    feature_type="closest_residue_distances",
    label_experiment_id="labels",
)

result = run_mltsa(
    "md_dataset.h5",
    "closest",
    model="random_forest",
    explanation_method="native",
    results_h5_path="md_results.h5",
    experiment_id="rf_native",
)

print(result.training_score)

MD Workflow¶

1. Label trajectories¶

2. Build feature sets¶

3. Train and explain¶

4. Export structures¶

Minimal example¶

mltsa

Navigation

Related Topics