Upgrade Guide
=============

This repository is moving from several legacy top-level modules to a single
package-first API under ``mltsa``.

Concept map
-----------

.. list-table::
   :header-rows: 1

   * - Legacy module or concept
     - New home
     - Notes
   * - ``MLTSA_datasets.OneD_pot`` and ``MLTSA_datasets.TwoD_pot``
     - ``mltsa.synthetic``
     - Use ``make_1d_dataset(...)``, ``make_2d_dataset(...)``, and ``SyntheticDataset``.
   * - ``MLTSA_sklearn``
     - ``mltsa.models`` and ``mltsa.explain``
     - Use ``get_model(...)`` plus ``analyze(...)``.
   * - ``MLTSA_tensorflow``
     - no direct v1 replacement
     - Prefer the PyTorch wrappers in ``mltsa.models`` for new work.
   * - ``CV_from_MD`` and related MD labeling scripts
     - ``mltsa.md.label_trajectories``
     - Labeling now uses only the final frame window instead of assuming a fixed trajectory length.
   * - ad hoc MD CV generation
     - ``mltsa.md.featurize_dataset``
     - Feature sets are stored appendably under ``/md/feature_sets/``.
   * - notebook-driven MD analysis
     - ``mltsa.md.run_mltsa``
     - The full load, fit, explain, and save workflow is now available as a Python API.
   * - manual feature importance outputs
     - ``mltsa.explain.ExplanationResult`` and results HDF5 storage
     - Explanation outputs can be appended to a separate results file.
   * - script-specific CLI entry points
     - ``mltsa-md``
     - The CLI mirrors the labeling, feature building, and analysis workflow.

Typical migration patterns
--------------------------

Synthetic data
^^^^^^^^^^^^^^

Old approach:

.. code-block:: python

   from MLTSA_datasets.OneD_pot.OneD_pot_data import dataset, potentials

New approach:

.. code-block:: python

   from mltsa.synthetic import make_1d_dataset

   synthetic = make_1d_dataset(n_trajectories=64)

Models and feature importance
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Old approach:

.. code-block:: python

   # Historical sklearn and notebook-specific helpers
   from MLTSA_sklearn import MLTSA_sk

New approach:

.. code-block:: python

   from mltsa.explain import analyze
   from mltsa.models import get_model

   model = get_model("random_forest", n_estimators=200)
   model.fit(X, y)
   explanation = analyze(model, method="native", feature_names=feature_names)

MD workflow
^^^^^^^^^^^

Old approach:

- label trajectories with legacy MD helpers
- generate CV arrays separately
- train and interpret models in notebooks

New approach:

.. code-block:: python

   from mltsa.md import featurize_dataset, label_trajectories, run_mltsa

   label_trajectories(...)
   featurize_dataset(...)
   result = run_mltsa("md_dataset.h5", "closest")

What stays legacy for now
-------------------------

- TensorFlow-specific training code
- older notebooks that still import historical packages
- historical helper modules that remain in the repository for reference during the migration
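
Scripts that still import the historical packages can be located with a small
standard-library scanner. The sketch below is not part of ``mltsa``: the
function name ``find_legacy_imports`` and the ``LEGACY_TO_NEW`` table are
hypothetical, and the table assumes that the legacy names in the concept map
are importable top-level modules. Notebook cells would need to be exported to
plain Python source before scanning.

```python
import ast

# Assumption: legacy top-level module names and their new homes, copied from
# the concept map in this guide. Adjust the table to match your checkout.
LEGACY_TO_NEW = {
    "MLTSA_datasets": "mltsa.synthetic",
    "MLTSA_sklearn": "mltsa.models and mltsa.explain",
    "MLTSA_tensorflow": "no direct v1 replacement",
    "CV_from_MD": "mltsa.md.label_trajectories",
}


def find_legacy_imports(source):
    """Return sorted (line number, imported module, suggested new home) tuples."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a, b.c" can name several modules in one statement.
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from a.b import c"; relative imports are skipped.
            modules = [node.module]
        else:
            continue
        for module in modules:
            top_level = module.split(".")[0]
            if top_level in LEGACY_TO_NEW:
                hits.append((node.lineno, module, LEGACY_TO_NEW[top_level]))
    return sorted(hits)


# Example: scan a short snippet and report each legacy import.
example = "from MLTSA_sklearn import MLTSA_sk\nimport numpy as np\n"
for lineno, module, new_home in find_legacy_imports(example):
    print(f"line {lineno}: {module} -> {new_home}")
```

Parsing with ``ast`` rather than searching for text avoids false positives from
comments and strings, and it never imports or executes the scanned code.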