Analytical One-Dimensional model module

These are the functions that the analytical model provides

class OneD_pot_data.dataset(potentials, n_feats, degree_of_mixing)

Class for the generation of datasets from a given set of potentials

PrepareData(data, ans, time_frame, mode='Normal')

Small wrapper that converts the data produced by the dataset object into the format expected by the ML approach.

Parameters
  • data – (list) Data simulation generated from the dataset.

  • ans – (list) Labels of the outcome from the simulations

  • time_frame – (list) [start_frame_range, end_frame_range] Range of frames/steps to keep in the final data; this allows selecting a particular portion of each trajectory.

  • mode – (str) Whether to use the real value of the relevant potential as the last feature. “Normal” means using it.

Returns

(list) List containing the data as (X, Y), where X is the simulation data as the mixed trajectories and Y is the labelled outcome for each frame.
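The frame selection and per-frame labelling described above can be illustrated with a minimal sketch. It is not the module's implementation: the name prepare_sketch is hypothetical, time_frame is assumed to behave like a half-open Python slice, and the mixing of trajectories and the mode option are left out.

```python
import numpy as np


def prepare_sketch(data, ans, time_frame):
    """Hypothetical stand-in for PrepareData (no mixing, no mode handling)."""
    data = np.asarray(data)            # assumed shape: (n_sims, n_frames, n_potentials)
    start, end = time_frame
    X = data[:, start:end, :]          # keep only the requested range of frames
    n_kept = X.shape[1]
    # Repeat each simulation's label so that every kept frame carries it
    Y = np.repeat(np.asarray(ans)[:, None], n_kept, axis=1)
    return X, Y


# Usage: 4 simulations, 100 frames, 3 potentials; keep frames 10 to 59
X, Y = prepare_sketch(np.zeros((4, 100, 3)), [0, 1, 1, 0], [10, 60])
print(X.shape, Y.shape)
```

Under these assumptions X keeps the three-dimensional layout of the input and Y has one label per simulation and frame.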

generate_linear(n_samples, time, mode='Normal')

Wrapper to generate data on demand, for the desired number of samples and time.

Parameters
  • n_samples – (int) Number of simulations/samples to generate.

  • time – (int) Amount of steps/time to run the simulations for.

  • mode – (str) Mode on how to generate the last potential. Look at the code for more info.

Returns

(list) List containing the data (X/sim_data) and the labels (Y/answers)

class OneD_pot_data.potentials(n_pots, n_dw, relevant_feat)

Class for potentials generation

DataGeneration(n_samples, potentials, sim_time)

Function to generate trajectories on the given potentials for the number of samples/simulations and time/steps requested.

Parameters
  • n_samples – (int) Number of simulations to run for.

  • potentials – (list) Derivative coefficients for each of the potentials to run for.

  • sim_time – (int) Number of steps/time to run the simulations for.

Returns

(list) List containing the data generated for every potential with the shape of (n_simulations/n_samples, n_frames/sim_time, n_potentials).

DefinePotentials(n_features, double_well_potentials, plot=False)

This function generates all X and Y values for the number of potentials requested

Parameters
  • n_features – (int) Total number of potentials to define.

  • double_well_potentials – (int) Number of Double Well (DW) potentials to include in the total n_features.

  • plot – (bool) Whether to plot (True) or not (False) the shape of the potentials.

Returns

(list) [potentials, shape] Potentials is a list of the values of the coefficients for each potential and Shape is the X/Y shape of the potentials.

GetAnswers(data, relevant_feat)

Small function that labels the outcome of the given simulations from a list containing them. Note that the classification threshold defaults to 0.5 (values above or below it). Change it manually if needed.

Parameters
  • data – (list) List of simulations; it must have the shape (n_sims, n_frames, n_potentials).

  • relevant_feat – (int) Index of the potential that will be used for labelling the simulation.
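As an illustration of threshold-based labelling, the sketch below classifies each simulation by where it ends on the relevant potential. This is an assumption, since the documentation does not say which frame decides the label; the name label_sketch and the threshold argument are hypothetical.

```python
import numpy as np


def label_sketch(data, relevant_feat, threshold=0.5):
    """Hypothetical stand-in for GetAnswers: one label per simulation."""
    data = np.asarray(data)                  # shape: (n_sims, n_frames, n_potentials)
    final_pos = data[:, -1, relevant_feat]   # assumption: the last frame decides the outcome
    return (final_pos > threshold).astype(int)


# Usage: two simulations, the second one ends above the threshold
sims = np.zeros((2, 5, 3))
sims[1, -1, 2] = 0.9
labels = label_sketch(sims, relevant_feat=2)
```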

Returns

gen_potential(name='double_well', n_bins=100, RC_range=[0, 1])

This function generates the requested potential, depending on the type, the number of bins, and the range of values to use.

Parameters
  • name – (str) Type of potential to use: “double_well” for DW and “single_well” for SW.

  • n_bins – (int) Number of bins to define the potential with.

  • RC_range – (list) Range of values ([first, last]) to work with.

Returns

(tuple) (X, Y): the bins the potential has been defined on (X) and the values of the potential over those bins (Y).
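The documentation does not state the functional form of the potentials, so the following is only a sketch under assumed shapes: a quartic with a barrier in the middle of the range for the double well and a parabola for the single well. The name potential_sketch is hypothetical. The last lines show one possible way to obtain derivative coefficients with a NumPy polynomial fit.

```python
import numpy as np


def potential_sketch(name="double_well", n_bins=100, RC_range=(0, 1)):
    """Hypothetical stand-in for gen_potential with assumed functional forms."""
    first, last = RC_range
    x = np.linspace(first, last, n_bins)     # bin positions (X)
    u = (x - first) / (last - first)         # rescale the range to [0, 1]
    if name == "double_well":
        # Minima at u = 0.25 and u = 0.75, barrier at u = 0.5
        y = ((u - 0.5) ** 2 - 0.0625) ** 2
    elif name == "single_well":
        # Single minimum at u = 0.5
        y = (u - 0.5) ** 2
    else:
        raise ValueError(f"unknown potential type: {name}")
    return x, y


# Usage: fit a quartic to the double well and differentiate it
x, y = potential_sketch("double_well", n_bins=101)
coeffs = np.polyfit(x, y, 4)           # highest degree first
coeffs_derivative = np.polyder(coeffs)
```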

gen_traj_langevin(coeffs_derivative, start_pos=0.5, n_steps=1000, diffusion=0.01, simul_lagtime=0.0001)

Function that generates trajectories on a given set of derivative coefficients with different parameters to control the behaviour of the simulation. Crucial to generate data on the 1D potential.

Parameters
  • coeffs_derivative – (list) List of coefficients to generate trajectories on.

  • start_pos – (float) starting position for each simulation. We keep this at 0.5 as the transition state.

  • n_steps – (int) Number of steps to run the simulation for. A larger n_steps yields longer (time-wise) trajectories, but changing n_steps and simul_lagtime at the same time is not recommended.

  • diffusion – (float) Diffusion coefficient for the Langevin dynamics equation; it controls the speed of the transitions across the free energy landscape.

  • simul_lagtime – (float) Size of the steps recorded in the simulation; the larger the value, the lower the resolution of the coordinates.

Returns

(list) A list containing the coordinates of the potential for the desired number of steps.
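To show how these parameters interact, here is a minimal sketch of an overdamped Langevin walk integrated with the Euler-Maruyama scheme. It is not the module's code. It assumes that coeffs_derivative follows the numpy.polyval ordering (highest degree first) and that energies are expressed in units of kT; the name langevin_sketch and the seed argument are hypothetical.

```python
import numpy as np


def langevin_sketch(coeffs_derivative, start_pos=0.5, n_steps=1000,
                    diffusion=0.01, simul_lagtime=0.0001, seed=None):
    """Hypothetical stand-in for gen_traj_langevin (overdamped, Euler-Maruyama)."""
    rng = np.random.default_rng(seed)
    traj = np.empty(n_steps)
    x = start_pos
    # Standard deviation of the random displacement per step
    noise_scale = np.sqrt(2.0 * diffusion * simul_lagtime)
    for i in range(n_steps):
        traj[i] = x
        force = -np.polyval(coeffs_derivative, x)   # -dU/dx at the current position
        x = x + diffusion * force * simul_lagtime + noise_scale * rng.standard_normal()
    return traj


# Usage: harmonic well U(x) = (x - 0.5)**2, so dU/dx = 2x - 1
traj = langevin_sketch([2.0, -1.0], n_steps=500, seed=0)
```

In this sketch the drift scales with diffusion * simul_lagtime and the noise with the square root of that product, which is one reason to change n_steps and simul_lagtime separately rather than together.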

generate_data(n_samples, time)

Function to wrap generating data and labelling it.

Parameters
  • n_samples – (int) Number of simulations/samples of the potentials to run for.

  • time – (int) Number of steps/time to run for.

Returns

(list) List containing (data, answers): the first is the simulation data for each trajectory and the second the corresponding labels for the simulations.