API

PoreAnalyser

Library for pathfinding with an ellipsoidal probe particle

class PoreAnalyser.PoreAnalysis(pdb_array, opt_method='nelder-mead', align_bool=True, end_radius=15, pathway_sel='protein', path_save='', num_circle=24, clipping=100, D_cation=1.8e-09, D_anion=2.032e-09, popt=[1.40674664, 1.25040698], temp=300, c_m=0.15, trajectory=False, traj_frames=1)

Class for pore analysis of a set of PDB models.

Parameters:

  • pdb_array (list): List of file paths to the input PDB models.

  • opt_method (str, optional): Optimization method for ellipsoid fitting. Default is 'nelder-mead'.

  • align_bool (bool, optional): Flag indicating whether to align the models. Default is True.

  • end_radius (float, optional): Radius of the spherical probe particle. Default is 15.

  • pathway_sel (str, optional): Atom selection string for identifying the pathway. Default is 'protein'.

  • path_save (str, optional): Path to save analysis results. Default is ''.

  • num_circle (int, optional): Number of circles for point cloud visualization of pore surface. Default is 24.

  • clipping (int, optional): Clipping value for 3D visualization of pore surface. Default is 100.

Parameters for conductance estimation:

  • D_cation (float, optional): Diffusion coefficient of the cation. Default is 1.8e-9 m^2/s (value for potassium).

  • D_anion (float, optional): Diffusion coefficient of the anion. Default is 2.032e-9 m^2/s (value for chloride).

  • popt (list, optional): Parameters for the conductivity model. Default is [1.40674664, 1.25040698].
    • popt[0] (float): Scaling parameter of radius for the conductivity model (dimension 1/Angstrom).

    • popt[1] (float): Shift parameter of the sigmoid function for the conductivity model (dimensionless).

  • temp (int, optional): Temperature in Kelvin. Default is 300.

  • c_m (float, optional): Concentration in mol/l. Default is 0.15.

  • trajectory (bool, optional): Flag indicating whether the input is a trajectory. Default is False. If trajectory is True, pdb_array should contain the path to the topology file and the trajectory file:
    • pdb_array[0]: Path to the topology file.

    • pdb_array[1]: Path to the trajectory file.

  • traj_frames (int, optional): Number of frames to extract from the trajectory. Default is 1.
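The conductance parameters D_cation, D_anion, temp and c_m are enough to estimate a bulk conductivity. The sketch below uses the Nernst-Einstein relation for a monovalent 1:1 electrolyte; whether PoreAnalyser computes its bulk conductivity attribute in exactly this way is an assumption, and the function name bulk_conductivity is hypothetical.

```python
# Physical constants (SI, exact by definition since 2019)
E_CHARGE = 1.602176634e-19   # elementary charge [C]
K_B = 1.380649e-23           # Boltzmann constant [J/K]
N_A = 6.02214076e23          # Avogadro constant [1/mol]


def bulk_conductivity(D_cation=1.8e-9, D_anion=2.032e-9, temp=300, c_m=0.15):
    """Nernst-Einstein bulk conductivity in S/m for a monovalent 1:1 salt.

    ASSUMPTION: this mirrors the documented defaults but is not taken from
    the library source. D_* in m^2/s, temp in K, c_m in mol/l.
    """
    c_si = c_m * 1000.0  # mol/l -> mol/m^3
    return E_CHARGE**2 * N_A * c_si * (D_cation + D_anion) / (K_B * temp)


sigma_bulk = bulk_conductivity()
print(f"bulk conductivity: {sigma_bulk:.2f} S/m")
```

With the documented defaults (values for potassium and chloride at 0.15 mol/l and 300 K) this gives a conductivity of roughly 2 S/m.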

Attributes:

  • pdb_array (list): List of file paths to the input PDB models.

  • align_bool (bool): Flag indicating whether to align the models.

  • end_radius (float): Radius of the spherical probe particle.

  • pathway_sel (str): Atom selection string for identifying the pathway.

  • path_save (str): Path to save analysis results.

  • labels (list): List of labels derived from the input PDB file paths.

  • names_aligned (list): List of aligned PDB file paths.

  • opt_method (str): Optimization method for ellipsoid fitting.

  • num_circle (int): Number of circles for point cloud visualization of pore surface.

  • clipping (int): Clipping value for 3D visualization of pore surface.

  • hole_fig (matplotlib.figure.Figure): Figure object for the hole analysis.

  • hole_df (pandas.DataFrame): DataFrame containing hole analysis results.

  • ellipsoid_dfs (dict): Dictionary containing ellipsoid analysis results for each model.

  • popt (list): Parameters for the conductivity model.

  • bulk conductivity (float): Bulk conductivity of the system.

Methods:

  • hole_analysis: Perform hole analysis on the set of PDB models.

  • ellipsoid_analysis: Perform ellipsoid analysis on a specific PDB model.

  • plt_pathway_ellipsoid: Plot ellipsoid analysis results for a specific model.

  • pathway_visualisation: Visualize the pathway for a specific model.

  • conductance_estimation: Estimate the conductance of the pore using a conductivity model.

  • plt_trajectory_average: Plot the trajectory average of the radius / radii profile.

Example:

>>> from PoreAnalyser import PoreAnalysis
>>> pdb_models = ['model1.pdb', 'model2.pdb']
>>> pore_analysis = PoreAnalysis(pdb_array=pdb_models, opt_method='nelder-mead')
>>> pore_analysis.hole_analysis()
>>> pore_analysis.ellipsoid_analysis(index_model=0)
>>> pore_analysis.plt_pathway_ellipsoid(index_model=0)
>>> pore_analysis.pathway_visualisation(index_model=0)
>>> pore_analysis.pathway_rendering(index_model=0)
>>> pore_analysis.conductance_estimation(index_model=0)

Example with trajectory:

>>> # fname is assumed to hold the common path prefix of the .tpr/.xtc pair
>>> pdb_models = [fname+'.tpr', fname+'.xtc']
>>> pore_analysis = PoreAnalysis(pdb_array=pdb_models, trajectory=True, traj_frames=10)
>>> pore_analysis.hole_analysis()
>>> pore_analysis.plt_trajectory_average(HOLE_profile=True)
>>> for i in range(10):
...     pore_analysis.ellipsoid_analysis(index_model=i)
>>> pore_analysis.plt_trajectory_average(HOLE_profile=False)

conductance_estimation(index_model=0, f_size=15)

Estimate the conductance of the pore using a conductivity model.

Parameters:

  • index_model (int, optional): Index of the model in the pdb_array. Default is 0.

  • f_size (int, optional): Font size for the plot. Default is 15.

Returns: tuple: Tuple containing the conductance values in pS for the pore, followed by the figure:

  1. conductance with bulk conductivity and spherical probe particle (HOLE)

  2. conductance with bulk conductivity and ellipsoid probe particle (PoreAnalyser)

  3. conductance with conductivity model and spherical probe particle (HOLE)

  4. conductance with conductivity model and ellipsoid probe particle (PoreAnalyser)

  5. fig (matplotlib.figure.Figure): Figure object for the plot (resistance vs z)
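To illustrate how the returned conductances relate to a radius profile, the sketch below integrates the resistance of thin slices along z, with cross-section pi*r^2 for a spherical probe and pi*a*b for an ellipsoidal probe. This is a generic continuum estimate, not the library's code. The helper names are hypothetical, and the sigmoid used for the radius-dependent conductivity is only one plausible reading of the popt description (scaling of the radius, then a dimensionless shift).

```python
import math

ANGSTROM = 1e-10  # metres per Angstrom


def conductance_pS(z, a, b, sigma):
    """Conductance in pS from a profile of elliptical cross-sections.

    z, a, b: lists in Angstrom (a == b gives a circular cross-section).
    sigma:   list of conductivities in S/m, one per profile point.
    """
    # Resistance per unit length of each slice [ohm/m]
    rho = [1.0 / (s * math.pi * ai * ANGSTROM * bi * ANGSTROM)
           for s, ai, bi in zip(sigma, a, b)]
    # Trapezoidal integration along the pore axis
    resistance = sum(0.5 * (rho[i] + rho[i + 1]) * (z[i + 1] - z[i]) * ANGSTROM
                     for i in range(len(z) - 1))
    return 1e12 / resistance  # siemens -> picosiemens


def sigmoid_conductivity(radius, sigma_bulk, popt=(1.40674664, 1.25040698)):
    # ASSUMED functional form: bulk value damped by a sigmoid in the radius.
    return sigma_bulk / (1.0 + math.exp(-(popt[0] * radius - popt[1])))


# Toy profile: a 30 Angstrom long pore of constant cross-section
z = [float(i) for i in range(31)]
r_hole = [5.0] * 31                    # spherical probe radius
a_axis, b_axis = [5.0] * 31, [8.0] * 31  # ellipsoidal probe radii
sigma_bulk = 2.0                       # S/m, illustrative value

g_bulk_sphere = conductance_pS(z, r_hole, r_hole, [sigma_bulk] * 31)
g_bulk_ellipse = conductance_pS(z, a_axis, b_axis, [sigma_bulk] * 31)
g_model_sphere = conductance_pS(
    z, r_hole, r_hole, [sigmoid_conductivity(r, sigma_bulk) for r in r_hole])
print(g_bulk_sphere, g_bulk_ellipse, g_model_sphere)
```

Because the ellipsoidal probe can only enlarge the cross-section relative to the inscribed sphere, the ellipsoid-based estimate is never smaller than the spherical one for the same conductivity.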

ellipsoid_analysis(index_model=0, plot_lines=True, legend_outside=False, title='', f_size=15)

Perform ellipsoid analysis on a specific PDB model.

Parameters:

  • index_model (int, optional): Index of the model in the pdb_array. Default is 0.

  • plot_lines (bool, optional): Flag indicating whether to plot lines. Default is True.

  • legend_outside (bool, optional): Flag indicating whether to place the legend outside the plot. Default is False.

  • title (str, optional): Title for the plot. Default is an empty string.

  • f_size (int, optional): Font size for the plot. Default is 15.

Returns: dataframe df_res

Notes: This method performs ellipsoid analysis on a specific PDB model and generates visualizations. The results are stored in the ellipsoid_dfs attribute.

Example:

>>> pore_analysis = PoreAnalysis(pdb_array=['model1.pdb', 'model2.pdb'])
>>> pore_analysis.ellipsoid_analysis(index_model=0)

hole_analysis(plot_lines=True, legend_outside=False, title='', f_size=15)

Perform hole analysis on the set of PDB models. HOLE uses a spherical probe particle.

Parameters:

  • plot_lines (bool, optional): Flag indicating whether to plot lines. Default is True.

  • legend_outside (bool, optional): Flag indicating whether to place the legend outside the plot. Default is False.

  • title (str, optional): Title for the plot. Default is an empty string.

  • f_size (int, optional): Font size for the plot. Default is 15.

Returns: Figure and dataframe

Notes: This method performs hole analysis on the set of PDB models and generates visualizations. The results are stored in the attributes hole_fig and hole_df.

Example:

>>> pore_analysis = PoreAnalysis(pdb_array=['model1.pdb', 'model2.pdb'])
>>> pore_analysis.hole_analysis()

pathway_visualisation(index_model=0, f_end='_circle.pdb')

Visualize the pathway for a specific model.

Parameters:

  • index_model (int, optional): Index of the model in the pdb_array. Default is 0.

  • f_end (str, optional): File ending for the visualization file. Default is '_circle.pdb'.

Returns: xyzview: Pathway visualization object.

Example:

>>> pore_analysis = PoreAnalysis(pdb_array=['model1.pdb', 'model2.pdb'])
>>> xyzview = pore_analysis.pathway_visualisation(index_model=0)

plt_pathway_ellipsoid(index_model=0, title='', f_size=15)

Plot ellipsoid analysis results for a specific model.

Parameters:

  • index_model (int, optional): Index of the model in the pdb_array. Default is 0.

  • title (str, optional): Title for the plot. Default is an empty string.

  • f_size (int, optional): Font size for the plot. Default is 15.

Returns: matplotlib.figure.Figure: Figure object for the plot.

Example:

>>> pore_analysis = PoreAnalysis(pdb_array=['model1.pdb', 'model2.pdb'])
>>> pore_analysis.ellipsoid_analysis(index_model=0)
>>> fig = pore_analysis.plt_pathway_ellipsoid(index_model=0)

plt_trajectory_average(num_bins=100, f_size=20, title='', HOLE_profile=True)

Plot the trajectory average of the hole radius.

Parameters:

  • num_bins (int, optional): Number of bins for the plot. Default is 100.

  • f_size (int, optional): Font size for the plot. Default is 20.

  • title (str, optional): Title for the plot. Default is an empty string.

  • HOLE_profile (bool, optional): Flag indicating whether to plot the HOLE profile or the PoreAnalyser profile. Default is True.

Returns: Figure and dataframe
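A trajectory average over num_bins can be pictured as pooling the (z, radius) samples of all frames and averaging the radius within each z bin. The following NumPy sketch shows that binning step on synthetic data. The function name binned_mean_profile is hypothetical, and how the library actually bins or weights frames is not stated in this reference.

```python
import numpy as np


def binned_mean_profile(z, r, num_bins=100):
    """Mean radius per z bin for pooled samples from several frames.

    Returns (midpoints, means); bins without samples yield NaN.
    """
    z = np.asarray(z, dtype=float)
    r = np.asarray(r, dtype=float)
    edges = np.linspace(z.min(), z.max(), num_bins + 1)
    # digitize gives 1-based bin numbers; clip so that z.max() lands in the last bin
    idx = np.clip(np.digitize(z, edges) - 1, 0, num_bins - 1)
    sums = np.bincount(idx, weights=r, minlength=num_bins)
    counts = np.bincount(idx, minlength=num_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts
    midpoints = 0.5 * (edges[:-1] + edges[1:])
    return midpoints, means


# Three synthetic "frames" sharing the same z grid, radii offset per frame
z_frame = np.linspace(-10.0, 10.0, 201)
z_all = np.concatenate([z_frame, z_frame, z_frame])
r_all = np.concatenate([np.full(201, 1.5), np.full(201, 2.0), np.full(201, 2.5)])

midpoints, means = binned_mean_profile(z_all, r_all, num_bins=10)
print(midpoints[:3], means[:3])
```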

Auxiliary functions

HOLE analysis

PoreAnalyser.hole_analysis.align_to_z(p, pdb_name, align_bool=True, sel='protein')

Rotate the principal axes of the molecule to align with the Cartesian coordinate system.
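As a rough illustration of such an alignment, the sketch below takes the principal axes from the covariance of the coordinates and rotates the axis of largest spread onto z. The function name, the use of an unweighted covariance instead of a mass-weighted inertia tensor, and the operation on a bare coordinate array rather than a PDB file are all assumptions made for this example.

```python
import numpy as np


def align_longest_axis_to_z(coords):
    """Rotate an (N, 3) coordinate array so its longest principal axis is z."""
    centered = coords - coords.mean(axis=0)
    cov = np.cov(centered.T)
    # eigh returns eigenvalues in ascending order; columns are principal axes
    _, axes = np.linalg.eigh(cov)
    rot = axes.copy()
    if np.linalg.det(rot) < 0:
        rot[:, 0] *= -1.0  # keep a proper rotation (no mirroring)
    # Projecting onto the axes maps: smallest spread -> x, largest spread -> z
    return centered @ rot


rng = np.random.default_rng(0)
# Synthetic point cloud elongated along x
points = rng.normal(size=(500, 3)) * np.array([10.0, 1.0, 2.0])
aligned = align_longest_axis_to_z(points)
print(aligned.var(axis=0))
```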

PoreAnalyser.hole_analysis.analysis(names, labels, path='/biggin/b198/orie4254/Documents/CHAP/', end_radius=15, title='', sel='protein', legend_outside=False, plot_lines=True, f_size=18, align_bool=True)

Perform hole analysis on one or more PDB files and plot the results.

Parameters

names : list of str

Names of the PDB files to analyze. If multiple files are provided, the function will align them to the first file in the list before analysis.

labels : list of str

Labels to use for the legend in the plot, corresponding to each PDB file.

path : str, optional

Path to the directory containing the PDB files. Default is '/biggin/b198/orie4254/Documents/CHAP/'.

end_radius : float, optional

End radius of the HOLE cylinder, in Angstroms. Default is 15.

save : str, optional

Name of the file to save the plot as (without extension). If not provided, the plot will not be saved.

title : str, optional

Title of the plot. Default is an empty string.

sel : str, optional

Selection string to use for the hole analysis. Default is 'protein'.

legend_outside : bool, optional

If True, place the legend outside of the plot. Default is False.

align_bool : bool, optional

If True, align the largest principal component with the z-axis. Default is True.

Returns

fig : matplotlib figure

The generated plot figure.

df : pandas DataFrame

A DataFrame containing the results of the hole analysis, with the following columns:

  • 'Label z [A]': the z-coordinate of each point along the pore axis.

  • 'Label Radius [A]': the radius of the pore at each point.

'Label' corresponds to the labels provided in the labels parameter.
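The column naming can be exercised on a hand-built DataFrame. The sketch below assumes the columns are literally named "<label> z [A]" and "<label> Radius [A]" (our reading of the description above) and looks up the narrowest point of each profile. The data and the helper name constriction are made up for illustration.

```python
import pandas as pd

# Synthetic stand-in for the returned DataFrame (values are invented)
df = pd.DataFrame({
    "model1 z [A]": [-2.0, -1.0, 0.0, 1.0, 2.0],
    "model1 Radius [A]": [4.0, 2.5, 1.2, 2.8, 4.1],
    "model2 z [A]": [-2.0, -1.0, 0.0, 1.0, 2.0],
    "model2 Radius [A]": [4.2, 3.0, 2.1, 3.1, 4.0],
})


def constriction(df, label):
    """Return (z, radius) at the minimum radius of one labelled profile."""
    radius = df[f"{label} Radius [A]"]
    i = radius.idxmin()
    return float(df.loc[i, f"{label} z [A]"]), float(radius.loc[i])


for label in ["model1", "model2"]:
    print(label, constriction(df, label))
```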

PoreAnalyser.hole_analysis.hole_analysis(name, path, end_radius=20, sel='protein')

Perform hole analysis on a molecular structure and create a VMD surface for visualization.

Parameters:

  • name (str): The name of the input file.

  • path (str): The path to the input file.

  • end_radius (float, optional): The radius (in Angstroms) of the maximum hole to detect. Default is 20 Angstroms.

  • sel (str, optional): The selection string to select the atoms to be analyzed. Default is 'protein'.

Returns:

  • midpoints2 (numpy.ndarray): An array of the midpoints of the histogram bins.

  • means2 (numpy.ndarray): An array of the mean values of the histogram bins.

Ellipsoid library

class PoreAnalyser.ProbeParticleEllipsoid.ellipse_lib.atom(x, y, z=0, r=1)

Class representing a 3D atom with coordinates (x, y, z) and a radius (r).

Parameters:

  • x (float): x-coordinate of the atom.

  • y (float): y-coordinate of the atom.

  • z (float, optional): z-coordinate of the atom. Default is 0.

  • r (float, optional): Radius of the atom. Default is 1.

Attributes:

  • x (float): x-coordinate of the atom.

  • y (float): y-coordinate of the atom.

  • z (float): z-coordinate of the atom.

  • r (float): Radius of the atom.

Example:

>>> from PoreAnalyser.ProbeParticleEllipsoid.ellipse_lib import atom
>>> my_atom = atom(x=1.0, y=2.0, z=0.0, r=1.5)
>>> print(my_atom.x, my_atom.y, my_atom.z, my_atom.r)
1.0 2.0 0.0 1.5

PoreAnalyser.ProbeParticleEllipsoid.ellipse_lib.dist_ellipse_vdwSphere(ellipse, sphere, plot=0)

Calculate the distance between an ellipse and a sphere, considering the van der Waals (vdW) radius of the sphere.

Parameters:

  • ellipse (Ellipse): An Ellipse object representing the ellipse in 2D space.

  • sphere (Sphere): A Sphere object representing the sphere in 2D space, with attributes x, y (center coordinates) and r (radius).

  • plot (int, optional): An integer flag indicating whether to plot the result. Default is 0 (no plot).

Returns: float: The distance between the ellipse and the sphere, taking into account the van der Waals radius.

Notes: The function first checks if the center of the sphere is inside the ellipse. If so, the distance is considered as the negative of the sphere’s radius.

The distance is calculated by transforming the coordinates of the sphere to the ellipse’s local coordinate system, finding the closest point on the ellipse, and then transforming the closest point back to the global coordinate system. The distance is the difference between the transformed closest point and the sphere’s center, considering the sphere’s radius.

The van der Waals (vdW) radius of the sphere is taken into account when calculating the distance.

Example 1:

>>> from PoreAnalyser.ProbeParticleEllipsoid import ellipse_lib as e_lib
>>> ellipse = e_lib.ellipse(a=3.0, b=2.0, cx=0.0, cy=0.0, theta=0.0)
>>> sphere = e_lib.atom(x=1.0, y=1.0, r=0.5)
>>> e_lib.dist_ellipse_vdwSphere(ellipse, sphere, plot=0)
-0.5

Here the center of the sphere lies inside the ellipse, so the result is the negative of the sphere's radius.

Example 2:

>>> ellipse = e_lib.ellipse(a=3.0, b=2.0, cx=5.0, cy=5.0, theta=0.0)
>>> sphere = e_lib.atom(x=1.0, y=1.0, r=0.5)
>>> e_lib.dist_ellipse_vdwSphere(ellipse, sphere, plot=0)
2.6950072040653335
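The distance convention described in the notes can be cross-checked numerically without the library. The sketch below is a brute-force stand-in and not the library's algorithm: it moves the sphere center into the ellipse's local frame, returns the negative radius if the center is inside, and otherwise samples the ellipse boundary densely to find the smallest gap. The function name and the sampling density n are hypothetical choices.

```python
import numpy as np


def ellipse_to_disc_distance(a, b, cx, cy, theta, sx, sy, sr, n=20000):
    """Approximate gap between an ellipse and a disc of radius sr.

    (a, b): semi-axes, (cx, cy): ellipse center, theta: rotation in radians,
    (sx, sy): disc center. Brute-force sampling, for illustration only.
    """
    # Disc center expressed in the ellipse's centered, unrotated frame
    dx, dy = sx - cx, sy - cy
    c, s = np.cos(theta), np.sin(theta)
    lx = c * dx + s * dy
    ly = -s * dx + c * dy
    # Convention from the notes: center inside the ellipse -> minus the radius
    if (lx / a) ** 2 + (ly / b) ** 2 < 1.0:
        return -sr
    # Distances are rotation invariant, so the search can stay in the local frame
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    gap = np.hypot(a * np.cos(t) - lx, b * np.sin(t) - ly).min()
    return float(gap - sr)


d_inside = ellipse_to_disc_distance(3.0, 2.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.5)
d_outside = ellipse_to_disc_distance(3.0, 2.0, 5.0, 5.0, 0.0, 1.0, 1.0, 0.5)
print(d_inside, d_outside)
```

For the two configurations used in the documented examples, this sampling approach agrees with the listed values to about three decimal places.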

PoreAnalyser.ProbeParticleEllipsoid.ellipse_lib.distance_ellipse(semi_major, semi_minor, p)

Calculate the closest point on an ellipse to a given point in 2D space.

Parameters:

  • semi_major (float): Length of the semi-major axis of the ellipse.

  • semi_minor (float): Length of the semi-minor axis of the ellipse.

  • p (tuple): A tuple representing the coordinates (x, y) of the point in 2D space.

Returns: tuple: A tuple representing the coordinates (x, y) of the closest point on the ellipse to the given point.

Reference: This function is based on the method described in the following blog post: “A simple method for distance to ellipse” https://blog.chatfield.io/simple-method-for-distance-to-ellipse/

Example:

>>> distance_ellipse(3.0, 2.0, (1.0, 1.0))
(1.2487110341841325, 1.8185123044348084)

class PoreAnalyser.ProbeParticleEllipsoid.ellipse_lib.ellipse(a, b, cx, cy, cz=0, r=1, theta=0)

Class representing a 2D ellipse with parameters (a, b, cx, cy, cz, r, theta).

Parameters / Attributes:

  • a (float): Length of the semi-major axis.

  • b (float): Length of the semi-minor axis (radius to grow).

  • cx (float): x-coordinate of the center.

  • cy (float): y-coordinate of the center.

  • cz (float, optional): z-coordinate of the center. Default is 0.

  • r (float, optional): Radius of the ellipse. Default is 1.

  • theta (float, optional): Angle of rotation in radians. Default is 0.

Methods:

  • on_ellipse(x, y): Check if a point (x, y) is on the ellipse.

  • draw(res=0.01): Generate coordinates of the ellipse for plotting.

Example:

>>> from PoreAnalyser.ProbeParticleEllipsoid.ellipse_lib import ellipse
>>> my_ellipse = ellipse(a=3.0, b=2.0, cx=0.0, cy=0.0, theta=0.0)
>>> print(my_ellipse.on_ellipse(1.0, 1.0))
True
>>> x_coords, y_coords = my_ellipse.draw(res=0.01)

Optimisation with ellipsoidal probe particle

PoreAnalyser.ProbeParticleEllipsoid.ellipsoid_optimisation.ellipsoid_pathway(p, pdb_name, sph_name, slice_dz=1, parallel=False, end_radius=15, num_processes=None, timeout=20, start_index=50, f_size=22, out=0, n_xy_fac=1.6, pathway_sel='protein', opt_method='nelder-mead')

Generate ellipsoids to represent the pore path of a biomolecule.

Given the path to a directory (p), a PDB file name (pdb_name), and a SPH file name (sph_name), this function generates ellipsoids to represent the pore path of a biomolecule.

Parameters

p : str

The path to the directory containing the PDB and SPH files.

pdb_name : str

The name of the PDB file (without the extension).

sph_name : str

The name of the SPH file (without the extension).

Returns

None

This function does not return anything, but generates a text file and saves plots of the ellipsoids.

Raises

IOError

If the PDB or SPH file cannot be read.

Notes

This function uses the MDAnalysis library to read and analyze the PDB and SPH files, and the pandas and matplotlib libraries to plot the data and save the output.

The output of this function is a text file that contains the parameters of the ellipsoids that represent the pore path of the biomolecule. The format of the text file is as follows:

#x, y, z, a, b, theta
1.0, 2.0, 3.0, 4.0, 5.0, 6.0
…

The x, y, and z values represent the center of the ellipsoid, while the a, b, and theta values represent the semi-axes and orientation of the ellipsoid.
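A file in this layout can be read back with the standard library. The sketch below parses an in-memory sample instead of a real output file (the actual file name is not specified in this reference) and derives the cross-sectional area pi*a*b of each slice. The helper name read_ellipsoid_table is hypothetical.

```python
import csv
import io
import math

# In-memory sample following the documented layout (values are placeholders)
sample = (
    "#x, y, z, a, b, theta\n"
    "1.0, 2.0, 3.0, 4.0, 5.0, 6.0\n"
    "1.1, 2.1, 4.0, 4.5, 5.0, 0.3\n"
)


def read_ellipsoid_table(handle):
    """Parse the comma-separated ellipsoid table into a list of dicts."""
    reader = csv.reader(handle, skipinitialspace=True)
    # The header row starts with '#': strip it to get plain column names
    header = [name.lstrip("#").strip() for name in next(reader)]
    return [dict(zip(header, map(float, row))) for row in reader if row]


rows = read_ellipsoid_table(io.StringIO(sample))
areas = [math.pi * row["a"] * row["b"] for row in rows]
print(rows[0], areas[0])
```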

The plots of the ellipsoids are saved in the directory p+pdb_name+'_pathway_slices/', where p is the path to the directory containing the PDB and SPH files, and pdb_name is the name of the PDB file.

PoreAnalyser.ProbeParticleEllipsoid.ellipsoid_optimisation.insert_ellipse(index, dataframe, universe, out=0, plt_path='', rmax=50, show=0, label=0, n_xy_fac=1.6, timing=0, f_size=22, pathway_sel='protein', opt_method='nelder-mead')

Inserts an ellipse into a plot with specified parameters.

Parameters:

  • index (int): The index to locate the ellipse within the dataframe.

  • dataframe (pandas.DataFrame): A dataframe that contains the x, y, z, r and resid columns.

  • universe (MDAnalysis.Universe): A molecular dynamics universe that contains the coordinates of atoms.

  • out (int, default 0): A flag to control output print statements.

  • plt_path (str, default ''): A path to save the plot to.

  • rmax (float, default 50): The maximum radius for the ellipsoid (to be deleted; set rmax to n_xy).

  • show (int, default 0): A flag to control whether the plot is displayed.

  • label (int, default 0): A flag to control whether labels are displayed on the plot.

  • n_xy_fac (float, default 1.6): When calculating the neighbor vector, atoms within hole_radius*n_xy_fac are considered.

  • timing (int, default 0): A flag to control whether timings for subtasks should be printed.

Returns: None

PoreAnalyser.ProbeParticleEllipsoid.ellipsoid_optimisation.neighbor_vec(universe, probe, probe1, n_xy_fac, out=0, call=0, pathway_sel='protein')

Identify neighboring atoms within a specified spatial range around a probe in a molecular system.

Parameters:

  • universe (MDAnalysis.Universe): The molecular dynamics universe representing the system.

  • probe (Sphere): A Sphere object representing the central probe.

  • probe1 (MDAnalysis AtomGroup): AtomGroup representing the probe atoms from initial HOLE run (resname SPH).

  • n_xy_fac (float): The factor used to determine the spatial range in the xy-plane around the probe.

  • out (int, optional): An integer flag indicating whether to print debugging information. Default is 0 (no printing).

  • call (int, optional): An integer indicating the recursive call level. Default is 0.

  • pathway_sel (str, optional): The selection string to identify the pathway in the molecular system. Default is 'protein'.

Returns: tuple: A tuple containing:

  • list: A list of Atom objects representing neighboring atoms within the specified range.

  • list: A list of strings representing labels for the neighboring atoms.

  • float: The spatial range (n_xy) used for the neighbor search.

Notes: This function identifies neighboring atoms around a central probe within the specified spatial range. The spatial range is determined in the xy-plane based on the probe’s radius and the given factor (n_xy_fac). The selection is performed within the specified pathway (default is ‘protein’).

If the number of neighboring atoms is too large (greater than 150) and the recursive call level is less than 4, the function reduces the spatial range (n_xy_fac) and makes a recursive call to refine the selection. If the number of neighboring atoms is too small (less than 30) and the recursive call level is less than 4, the function increases the spatial range and makes a recursive call to expand the selection.
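The adaptive range logic can be sketched on plain coordinates. In the example below the thresholds (150, 30) and the recursion limit (4) follow the notes above, while the circular xy search region, the shrink and grow factors, and the function name select_neighbours are assumptions, since the reference does not state them.

```python
import numpy as np


def select_neighbours(xy, centre, probe_radius, n_xy_fac, call=0,
                      shrink=0.8, grow=1.25):
    """Indices of points within n_xy = probe_radius * n_xy_fac of centre.

    ASSUMPTION: shrink/grow factors and the circular region are illustrative.
    """
    n_xy = probe_radius * n_xy_fac
    inside = np.linalg.norm(xy - centre, axis=1) <= n_xy
    count = int(inside.sum())
    if call < 4 and count > 150:
        # Too many neighbours: narrow the search range and retry
        return select_neighbours(xy, centre, probe_radius,
                                 n_xy_fac * shrink, call + 1, shrink, grow)
    if call < 4 and count < 30:
        # Too few neighbours: widen the search range and retry
        return select_neighbours(xy, centre, probe_radius,
                                 n_xy_fac * grow, call + 1, shrink, grow)
    return np.flatnonzero(inside), n_xy


rng = np.random.default_rng(1)
atoms_xy = rng.uniform(-20.0, 20.0, size=(2000, 2))  # synthetic atom positions
idx, n_xy = select_neighbours(atoms_xy, np.zeros(2),
                              probe_radius=5.0, n_xy_fac=1.6)
print(len(idx), n_xy)
```

Capping the recursion depth guarantees termination even when no factor yields a count between the two thresholds.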

PoreAnalyser.ProbeParticleEllipsoid.ellipsoid_optimisation.penalty_overlap_4dim(x, args)

Parameters:

  • x (list): A 4-dimensional vector representing the parameters of the ellipse.
    • x[0]: Radius to expand.

    • x[1]: Angle of rotation.

    • x[2]: x-coordinate of the center.

    • x[3]: y-coordinate of the center.

  • args (tuple): A tuple of arguments.
    • args[0]: Constant radius used in the creation of the ellipse.

    • args[1]: List of Sphere objects representing the surroundings of the ellipse.

    • args[2] (optional): Boolean flag (stop_loop) indicating whether to stop the loop upon detecting an overlap. Default is True.
      • if True: return a high penalty if any overlap is detected.

      • if False: return the absolute value of the minimum distance between the ellipse and the spheres if there is an overlap.

Returns: float: The penalty score based on the conditions specified.

Notes: This function creates an Ellipse object using the input parameters and evaluates the overlap penalty with a set of vdW spheres. The penalty is calculated based on the minimum distance between the ellipse and the spheres. If the stop_loop flag is set to True, the function returns a high (positive) penalty if any overlap is detected.
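A stripped-down version of such a penalty can be written with plain tuples in place of the library's Ellipse and Sphere objects. Several details below are assumptions made for the sketch: the fixed radius args[0] is placed along the ellipse's local x-axis and the growing radius x[0] along its local y-axis, the high penalty is 1e3, the no-overlap case returns the smallest remaining gap, and distances are found by sampling the boundary.

```python
import numpy as np

HIGH_PENALTY = 1e3  # hypothetical value; the library's constant is not documented here


def gap_ellipse_disc(a, b, theta, cx, cy, disc, n=2000):
    """Gap between an ellipse and a disc (sx, sy, sr); -sr if the center is inside."""
    sx, sy, sr = disc
    dx, dy = sx - cx, sy - cy
    c, s = np.cos(theta), np.sin(theta)
    lx, ly = c * dx + s * dy, -s * dx + c * dy  # disc center in the local frame
    if (lx / a) ** 2 + (ly / b) ** 2 < 1.0:
        return -sr
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return float(np.hypot(a * np.cos(t) - lx, b * np.sin(t) - ly).min() - sr)


def penalty_overlap_sketch(x, args):
    """x = [radius to expand, angle, cx, cy]; args = (fixed radius, discs[, stop_loop])."""
    r_grow, theta, cx, cy = x
    r_fixed, discs = args[0], args[1]
    stop_loop = args[2] if len(args) > 2 else True
    gaps = []
    for disc in discs:
        gap = gap_ellipse_disc(r_fixed, r_grow, theta, cx, cy, disc)
        if gap < 0 and stop_loop:
            return HIGH_PENALTY  # overlap detected: stop early
        gaps.append(gap)
    # ASSUMPTION: otherwise the penalty is the magnitude of the smallest gap
    return abs(min(gaps))


discs = [(0.0, 3.0, 1.0)]  # one vdW disc above the ellipse center
p_free = penalty_overlap_sketch([1.0, 0.0, 0.0, 0.0], (1.0, discs))
p_stop = penalty_overlap_sketch([2.5, 0.0, 0.0, 0.0], (1.0, discs))
p_soft = penalty_overlap_sketch([2.5, 0.0, 0.0, 0.0], (1.0, discs, False))
print(p_free, p_stop, p_soft)
```

A function of this shape can be handed to a derivative-free minimizer such as Nelder-Mead, which matches the opt_method default listed for the surrounding functions.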