lammpskit.plotting package

Scientific visualization utilities for creating publication-ready figures across different analysis workflows. The plotting package provides both general-purpose plotting functions and specialized time-series visualization tools with consistent styling and configuration management.

Key Features

  • Publication-ready styling with scientific color schemes and typography

  • Multi-dimensional array handling for comparative analysis

  • Centralized configuration for consistent visual output

  • Time-series specialization with dual-axis plotting capabilities

  • Memory-efficient rendering optimized for large datasets

  • Multiple output formats (PDF, SVG, PNG, EPS) for different use cases

Package Architecture

The plotting package is organized into specialized modules:

General Plotting (utils)

Flexible plotting functions for scientific data visualization

Time Series Analysis (timeseries_plots)

Specialized functions for temporal data analysis with standardized configurations

Core Functions

General Purpose Plotting:

  • lammpskit.plotting.plot_multiple_cases() - Create multi-case comparative plots

Time Series Plotting:

  • lammpskit.plotting.create_time_series_plot() - Create standardized time series plots

  • lammpskit.plotting.create_dual_axis_plot() - Create dual-axis correlation plots

  • lammpskit.plotting.save_and_close_figure() - Save and cleanup figure objects

  • lammpskit.plotting.calculate_mean_std_label() - Generate statistical labels

  • lammpskit.plotting.calculate_frequency_label() - Generate frequency labels

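Configuration Classes:

  • lammpskit.plotting.TimeSeriesPlotConfig - Styling and output settings for time series plots

  • lammpskit.plotting.DualAxisPlotConfig - Styling and output settings for dual-axis plots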

Styling Standards

Color Palette: ['b', 'r', 'g', 'k'] (blue, red, green, black)

Line Styles: ['--', '-.', ':', '-'] (dashed, dash-dot, dotted, solid)

Markers: ['o', '^', 's', '*'] (circle, triangle, square, star)

Typography: 8pt labels, 7pt legends, 7pt ticks for compact scientific layout
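
As a minimal sketch of how the three style lists above could be applied per case, the snippet below cycles through them in lockstep. The helper name style_for_case and the lockstep cycling are assumptions made for illustration; the page does not state how LAMMPSKit indexes these lists internally.

```python
# Style lists copied from the Styling Standards above.
COLORS = ['b', 'r', 'g', 'k']
LINESTYLES = ['--', '-.', ':', '-']
MARKERS = ['o', '^', 's', '*']


def style_for_case(index):
    """Return (color, linestyle, marker) for a case, wrapping after four cases.

    Hypothetical helper: assumes the three lists are cycled together.
    """
    i = index % len(COLORS)
    return COLORS[i], LINESTYLES[i], MARKERS[i]


for case in range(5):
    # The fifth case (index 4) wraps around to the first style.
    print(case, style_for_case(case))
```

With four entries per list, a fifth case would reuse the first style, so comparisons with more than four cases may need a custom scheme.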

Usage Examples

Basic comparative analysis:

import numpy as np
from lammpskit.plotting import plot_multiple_cases

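z = np.linspace(0, 30, 3)  # z-positions (Å), one per bin
counts = np.array([[5, 10, 15], [8, 12, 18]])  # Two cases, three bins each
labels = ['SET state', 'RESET state']

# Counts go on the x-axis and z-position on the y-axis, matching the labels
fig = plot_multiple_cases(counts, z, labels, 'Atom count', 'Z position (Å)',
                         'comparison', 8, 6)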

Time series with dual axes:

from lammpskit.plotting import create_dual_axis_plot, DualAxisPlotConfig

# time, connectivity, temperature: 1D arrays of equal length, prepared beforehand
config = DualAxisPlotConfig(primary_color='tab:red', secondary_color='tab:blue')
fig, ax1, ax2 = create_dual_axis_plot(
    time, connectivity, temperature, 'Evolution Analysis',
    'Time (ps)', 'Connectivity (%)', 'Temperature (K)',
    'Conn: 45±12%', 'Temp: 315±26K', config=config)

Performance Notes

  • Memory usage: Scales with data size and number of cases

  • Rendering time: O(n_cases × n_points) for plot generation

  • File I/O: Vector formats (PDF/SVG) recommended for publications

  • Large datasets: Consider downsampling for >10⁵ points
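
The downsampling suggestion above can be sketched with simple stride-based decimation. This is one possible approach under stated assumptions, not a LAMMPSKit function: downsample and its max_points parameter are hypothetical names, and plain striding can hide short-lived features, so a min/max or averaging scheme may be preferable for noisy data.

```python
import math


def downsample(x, y, max_points=100_000):
    """Keep every k-th sample so that at most max_points remain.

    Works on any sliceable sequence (lists, NumPy arrays).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Smallest stride that brings the series under the point budget.
    stride = max(1, math.ceil(len(x) / max_points))
    return x[::stride], y[::stride]


x = list(range(250_000))
y = [value * 0.5 for value in x]
x_small, y_small = downsample(x, y)
print(len(x_small))  # 83334 (stride of 3)
```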

Submodules

Module contents

General plotting utilities for LAMMPSKit.

This module provides general-purpose plotting functions that can be used across different analysis types and simulation workflows.

lammpskit.plotting.plot_multiple_cases(x_arr, y_arr, labels, xlabel, ylabel, output_filename, xsize, ysize, output_dir=os.getcwd(), **kwargs)[source]

Create comparative plots for multiple datasets with publication-ready styling.

Versatile plotting function for scientific data visualization supporting various array dimensions and comparison scenarios. Handles both single-case and multi-case analysis with automatic styling, customizable limits, and dual-format output. Optimized for electrochemical cell analysis and general MD simulation data visualization.

Parameters:
  • x_arr (np.ndarray) – X-axis data for plotting. Supports multiple dimensions:
      - 1D: Single x-series for all cases
      - 2D: Different x-series for each case (shape: n_cases, n_points)

  • y_arr (np.ndarray) – Y-axis data for plotting. Supports multiple dimensions:
      - 1D: Single y-series (used with single case or shared across cases)
      - 2D: Different y-series for each case (shape: n_cases, n_points)

  • labels (List[str]) – Legend labels for each case. Length should match number of cases in data arrays.

  • xlabel (str) – X-axis label with units. Example: ‘Z position (Å)’, ‘Time (ps)’

  • ylabel (str) – Y-axis label with units. Example: ‘Atom count’, ‘Displacement (Å)’

  • output_filename (str) – Base filename for saved figures (extensions added automatically). Example: ‘atomic_distribution’, ‘filament_evolution’

  • xsize (float) – Figure width in inches. Note: currently ignored; the function overrides it with a hardcoded value (1.6).

  • ysize (float) – Figure height in inches. Note: currently ignored; the function overrides it with a hardcoded value (3.2).

  • output_dir (str, optional) – Output directory for saved figures. Created if doesn’t exist (default: cwd).

  • **kwargs (dict, optional) –

    Advanced customization options:

    Axis Limits:

    xlimit : tuple (xmin, xmax) - Set both x-axis limits
    ylimit : tuple (ymin, ymax) - Set both y-axis limits
    xlimitlo : float - Set x-axis lower limit only
    xlimithi : float - Set x-axis upper limit only
    ylimitlo : float - Set y-axis lower limit only
    ylimithi : float - Set y-axis upper limit only

    Reference Lines:

    xaxis : bool - Add horizontal line at y=0
    yaxis : bool - Add vertical line at x=0

    Styling:

    markerindex : int - Override automatic color/marker cycling

    Statistical Analysis:

    ncount : np.ndarray - Atom counts per bin for average calculations. Shape: (n_cases, n_bins). Prints weighted averages.

Returns:

fig – Figure object for further customization or display. Note: Figure is automatically saved and closed for memory efficiency.

Return type:

matplotlib.figure.Figure

Notes

Array Dimension Handling:

  • x_arr.ndim=1, y_arr.ndim=1: Single case plot
  • x_arr.ndim=1, y_arr.ndim=2: Shared x-axis, multiple y-series
  • x_arr.ndim=2, y_arr.ndim=1: Multiple x-series, shared y-axis
  • x_arr.ndim=2, y_arr.ndim=2: Full multi-case plot (most common)
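
The four dimension combinations listed above can be illustrated with a small generator that yields one (x, y) pair per case. This is a sketch of the dispatch logic only: iter_cases is a hypothetical helper, not part of LAMMPSKit, and it assumes that when both arrays are 2D they have the same number of cases.

```python
import numpy as np


def iter_cases(x_arr, y_arr):
    """Yield one (x, y) series pair per case for 1D/2D input combinations."""
    x_arr = np.asarray(x_arr)
    y_arr = np.asarray(y_arr)
    # A 2D array contributes one series per row; a 1D array is shared.
    n_cases = max(
        x_arr.shape[0] if x_arr.ndim == 2 else 1,
        y_arr.shape[0] if y_arr.ndim == 2 else 1,
    )
    for i in range(n_cases):
        x = x_arr[i] if x_arr.ndim == 2 else x_arr
        y = y_arr[i] if y_arr.ndim == 2 else y_arr
        yield x, y


counts = np.array([[5, 10, 15], [8, 12, 18]])  # 2D: two cases
z_pos = np.linspace(0, 30, 3)                  # 1D: shared by both cases
pairs = list(iter_cases(counts, z_pos))
print(len(pairs))  # 2
```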

Performance Characteristics:

  • Memory usage: O(max(x_size, y_size))
  • Rendering time: O(n_cases * n_points)
  • File I/O: Dual output (PDF + SVG) for versatility

Output Format:

  • PDF: Vector format for publications and presentations
  • SVG: Web-compatible vector format for interactive displays
  • Both saved with tight bounding boxes for clean appearance

Common Usage Patterns in LAMMPSKit:

Electrochemical analysis (atom distributions):

>>> plot_multiple_cases(distributions['hafnium'], z_bin_centers, labels,
...                     'Hf atoms #', 'z position (A)', 'hf_distribution', 8, 6)

Displacement analysis:

>>> plot_multiple_cases(zdisp, binposition, labels,
...                     'z displacement (A)', 'z position (A)', 'z_disp', 8, 6,
...                     yaxis=True)  # Add vertical reference line at x=0

Charge distribution with axis limits:

>>> plot_multiple_cases(charge_data, z_positions, labels,
...                     'Net charge', 'z position (A)', 'charge_dist', 8, 6,
...                     ylimithi=70, xlimithi=15, xlimitlo=-20)
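
The axis-limit keywords (xlimit, xlimitlo, xlimithi and their y counterparts) could be combined into a single (lower, upper) pair as sketched below. The helper resolve_limits is hypothetical, and the precedence shown, where a one-sided keyword overrides the matching side of the two-sided tuple, is an assumption rather than documented LAMMPSKit behavior.

```python
def resolve_limits(axis, **kwargs):
    """Return (lower, upper) for axis 'x' or 'y'; None means 'leave automatic'."""
    # Start from the two-sided keyword, if given.
    lower, upper = kwargs.get(f"{axis}limit", (None, None))
    # One-sided keywords override the corresponding side (assumed precedence).
    lower = kwargs.get(f"{axis}limitlo", lower)
    upper = kwargs.get(f"{axis}limithi", upper)
    return lower, upper


print(resolve_limits("x", xlimithi=15, xlimitlo=-20))  # (-20, 15)
print(resolve_limits("y", ylimithi=70))                # (None, 70)
```

A pair containing None maps naturally onto matplotlib's Axes.set_xlim and Axes.set_ylim, which leave a side unchanged when it is passed as None.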

Examples

Basic multi-case comparison:

>>> import numpy as np
>>> z_pos = np.linspace(0, 30, 3)  # Electrode positions, one per bin
>>> hf_counts = np.array([[5, 10, 15], [8, 12, 18]])  # Two voltage states
>>> labels = ['0.5V', '1.0V']
>>> fig = plot_multiple_cases(hf_counts, z_pos, labels,
...                          'Hf atom count', 'Z position (Å)',
...                          'hafnium_analysis', 10, 8)

Single case with reference lines:

>>> displacement = np.random.normal(0, 1, 100)
>>> positions = np.linspace(-10, 40, 100)
>>> fig = plot_multiple_cases(displacement, positions, ['Displacement'],
...                          'Displacement (Å)', 'Z position (Å)',
...                          'displacement_profile', 8, 6,
...                          yaxis=True, xaxis=True)

Multi-dimensional array example:

>>> # 3 cases, 4 elements each
>>> element_counts = np.random.randint(1, 20, (3, 4))
>>> elements = np.array(['Hf', 'Ta', 'O', 'Electrode'])
>>> case_labels = ['SET', 'Intermediate', 'RESET']
>>> fig = plot_multiple_cases(element_counts, elements, case_labels,
...                          'Element count', 'Element type',
...                          'element_comparison', 12, 8)
class lammpskit.plotting.TimeSeriesPlotConfig(alpha=0.55, linewidth=0.1, markersize=5, marker='^', include_line=True, include_scatter=True, format='pdf', fontsize_title=8, fontsize_labels=8, fontsize_ticks=8, fontsize_legend=8)[source]

Bases: object

Configuration class for standardized time series plotting with publication-ready defaults.

Provides centralized control over plot styling, elements, and output formatting for consistent scientific visualization across LAMMPSKit analysis workflows. Supports both line plots and scatter plots with flexible element combination.

alpha

Transparency level for plot elements. Range [0.0, 1.0] where 0.0 is fully transparent and 1.0 is fully opaque. Balanced for overlay visualization.

Type:

float, default=0.55

linewidth

Line thickness for connected plots. Thin lines prevent visual clutter in dense time series data. Use higher values (0.5-2.0) for presentation plots.

Type:

float, default=0.1

markersize

Size of scatter plot markers. Optimized for readability without overcrowding. Scale proportionally for different figure sizes.

Type:

float, default=5

marker

Matplotlib marker style for scatter plots. Options: ‘o’ (circle), ‘s’ (square), ‘^’ (triangle), ‘*’ (star), ‘+’ (plus), ‘x’ (cross), ‘D’ (diamond).

Type:

str, default='^'

include_line

Whether to draw connecting lines between data points. Useful for trend visualization in temporal data. Disable for pure scatter analysis.

Type:

bool, default=True

include_scatter

Whether to draw individual data point markers. Essential for discrete data visualization. Disable for smooth trend-only plots.

Type:

bool, default=True

format

Output file format for saved figures. Options: ‘pdf’ (vector, publication), ‘svg’ (web-compatible vector), ‘png’ (raster), ‘eps’ (LaTeX-compatible).

Type:

str, default='pdf'

fontsize_title

Font size for plot titles in points. Optimized for compact scientific layout. Use None to disable title font control.

Type:

int, default=8

fontsize_labels

Font size for axis labels in points. Consistent with scientific journal standards. Use None to use matplotlib defaults.

Type:

int, default=8

fontsize_ticks

Font size for axis tick labels in points. Maintains readability at small sizes. Use None to use matplotlib defaults.

Type:

int, default=8

fontsize_legend

Font size for legend text in points. Balanced for information density. Use None to use matplotlib defaults.

Type:

int, default=8

Notes

Configuration Design Philosophy:

  • Publication-ready defaults minimize post-processing
  • Centralized font control ensures consistency across figures
  • Flexible element control (line/scatter) supports diverse data types
  • Conservative styling prevents visual clutter in complex analyses

Performance Considerations:

  • Transparent plots (alpha < 1.0) may slow rendering for large datasets
  • Vector formats (PDF/SVG) maintain quality but increase file size
  • Font rendering overhead is minimal for typical scientific plots

Examples

Default configuration for temporal analysis:

>>> config = TimeSeriesPlotConfig()
>>> print(f"Using marker: {config.marker}, alpha: {config.alpha}")

Custom configuration for presentation plots:

>>> config = TimeSeriesPlotConfig(
...     linewidth=1.0,
...     markersize=8,
...     alpha=0.8,
...     fontsize_title=12,
...     fontsize_labels=10
... )

Scatter-only configuration for statistical analysis:

>>> config = TimeSeriesPlotConfig(
...     include_line=False,
...     include_scatter=True,
...     marker='o',
...     markersize=3
... )

High-contrast configuration for printing:

>>> config = TimeSeriesPlotConfig(
...     alpha=1.0,
...     linewidth=0.8,
...     format='eps'
... )
__init__(alpha=0.55, linewidth=0.1, markersize=5, marker='^', include_line=True, include_scatter=True, format='pdf', fontsize_title=8, fontsize_labels=8, fontsize_ticks=8, fontsize_legend=8)
alpha: float = 0.55
fontsize_labels: Optional[int] = 8
fontsize_legend: Optional[int] = 8
fontsize_ticks: Optional[int] = 8
fontsize_title: Optional[int] = 8
format: str = 'pdf'
include_line: bool = True
include_scatter: bool = True
linewidth: float = 0.1
marker: str = '^'
markersize: float = 5
class lammpskit.plotting.DualAxisPlotConfig(alpha=0.55, linewidth=0.1, markersize=5, marker='^', primary_color='tab:red', secondary_color='tab:blue', format='pdf', primary_legend_loc='upper right', secondary_legend_loc='lower right', legend_framealpha=0.75, tight_layout=True, fontsize_title=8, fontsize_labels=8, fontsize_ticks=8, fontsize_legend=8)[source]

Bases: object

Configuration class for dual-axis plots supporting simultaneous visualization of two data series.

Enables comparison of time series data with different units or scales on a single figure. Essential for correlating physical quantities like temperature-displacement or connectivity-time relationships in scientific analysis. Provides independent color control and legend positioning for clear data interpretation.

alpha

Transparency level for both data series. Range [0.0, 1.0]. Moderate transparency allows underlying grid and axis lines to remain visible.

Type:

float, default=0.55

linewidth

Line thickness for both axes data. Thin lines prevent visual dominance of either dataset. Increase for presentation or when clarity is critical.

Type:

float, default=0.1

markersize

Marker size for scatter points on both axes. Consistent sizing maintains visual balance between primary and secondary data series.

Type:

float, default=5

marker

Marker style for secondary axis (right). Primary axis uses default scatter markers. Options: ‘o’, ‘s’, ‘^’, ‘*’, ‘+’, ‘x’, ‘D’, ‘v’, ‘<’, ‘>’.

Type:

str, default='^'

primary_color

Color for primary (left) y-axis data and axis labels. Matplotlib tab colors provide good contrast. Options: ‘tab:blue’, ‘tab:orange’, ‘tab:green’, etc.

Type:

str, default='tab:red'

secondary_color

Color for secondary (right) y-axis data and axis labels. Should contrast with primary_color for clear visual separation.

Type:

str, default='tab:blue'

format

Output file format. Dual-axis plots benefit from vector formats to maintain text and line clarity at different scales.

Type:

str, default='pdf'

primary_legend_loc

Legend position for primary axis data. Standard matplotlib locations: ‘upper/lower/center’ + ‘left/right/center’, or ‘best’ for automatic.

Type:

str, default='upper right'

secondary_legend_loc

Legend position for secondary axis data. Should not overlap with primary legend. Consider ‘upper left’, ‘lower left’, or ‘center left’.

Type:

str, default='lower right'

legend_framealpha

Background transparency for legend boxes. Range [0.0, 1.0]. Semi-transparent frames prevent complete data occlusion while maintaining readability.

Type:

float, default=0.75

tight_layout

Whether to apply matplotlib tight_layout for automatic spacing adjustment. Prevents axis label cutoff in dual-axis configurations.

Type:

bool, default=True

fontsize_title

Font size for plot title. Centered above both axes.

Type:

int, default=8

fontsize_labels

Font size for both primary and secondary axis labels.

Type:

int, default=8

fontsize_ticks

Font size for tick labels on both axes.

Type:

int, default=8

fontsize_legend

Font size for both legend boxes.

Type:

int, default=8

Notes

Dual-Axis Design Principles:

  • Color coding clearly distinguishes data series and corresponding axes
  • Legend positioning prevents data occlusion while maintaining clarity
  • Consistent marker sizing maintains visual balance between series
  • Semi-transparent legends allow underlying data visibility

Common Use Cases:

  • Temperature vs. displacement over time
  • Connectivity percentage vs. cluster size evolution
  • Voltage vs. current relationships in device characterization
  • Statistical metrics vs. physical properties correlation

Performance Considerations:

  • Dual-axis rendering requires additional matplotlib operations
  • Legend placement calculations may slow complex figures
  • Vector output formats recommended for text clarity

Examples

Default configuration for scientific analysis:

>>> config = DualAxisPlotConfig()
>>> print(f"Colors: {config.primary_color}, {config.secondary_color}")

Custom color scheme for publication:

>>> config = DualAxisPlotConfig(
...     primary_color='tab:green',
...     secondary_color='tab:purple',
...     primary_legend_loc='upper left',
...     secondary_legend_loc='upper right'
... )

High-contrast configuration for presentations:

>>> config = DualAxisPlotConfig(
...     alpha=0.9,
...     linewidth=0.5,
...     markersize=8,
...     legend_framealpha=0.9,
...     fontsize_title=12,
...     fontsize_labels=10
... )

Minimal styling for technical reports:

>>> config = DualAxisPlotConfig(
...     primary_color='black',
...     secondary_color='gray',
...     tight_layout=True,
...     format='eps'
... )
__init__(alpha=0.55, linewidth=0.1, markersize=5, marker='^', primary_color='tab:red', secondary_color='tab:blue', format='pdf', primary_legend_loc='upper right', secondary_legend_loc='lower right', legend_framealpha=0.75, tight_layout=True, fontsize_title=8, fontsize_labels=8, fontsize_ticks=8, fontsize_legend=8)
alpha: float = 0.55
fontsize_labels: Optional[int] = 8
fontsize_legend: Optional[int] = 8
fontsize_ticks: Optional[int] = 8
fontsize_title: Optional[int] = 8
format: str = 'pdf'
legend_framealpha: float = 0.75
linewidth: float = 0.1
marker: str = '^'
markersize: float = 5
primary_color: str = 'tab:red'
primary_legend_loc: str = 'upper right'
secondary_color: str = 'tab:blue'
secondary_legend_loc: str = 'lower right'
tight_layout: bool = True
lammpskit.plotting.create_time_series_plot(x_data, y_data, title, xlabel, ylabel, stats_label, config=None, ylim=None, fontsize_title=None, fontsize_labels=None, fontsize_ticks=None, fontsize_legend=None)[source]

Create standardized time series plots with flexible line and scatter element control.

Generates publication-ready time series visualizations with configurable styling and centralized font management. Supports pure line plots, pure scatter plots, or combined visualizations based on configuration. Essential for temporal analysis workflows including filament evolution tracking, statistical time series, and experimental data visualization.

Parameters:
  • x_data (np.ndarray) – X-axis values, typically representing time or sequential measurements. Shape: (n_points,). Units depend on analysis context (e.g., timesteps, seconds, frames).

  • y_data (np.ndarray) – Y-axis values for plotting. Shape: (n_points,). Must match x_data length. Examples: displacement values, connectivity percentages, statistical measures.

  • title (str) – Plot title displayed above the figure. Include units and context for clarity. Example: ‘Filament Evolution Over Time’, ‘Temperature vs. Timestep’

  • xlabel (str) – X-axis label with units. Standard format: ‘Property (units)’. Examples: ‘Time (ps)’, ‘Timestep’, ‘Frame Number’, ‘Voltage Cycle’

  • ylabel (str) – Y-axis label with units. Standard format: ‘Property (units)’. Examples: ‘Displacement (Å)’, ‘Connectivity (%)’, ‘Temperature (K)’

  • stats_label (str) – Statistical summary or description for legend entry. Often includes computed metrics like mean, standard deviation, or frequency. Example: ‘Mean: 2.34 ± 0.15’

  • config (TimeSeriesPlotConfig, optional) – Plot configuration object controlling styling, elements, and output format. If None, uses default configuration optimized for scientific visualization.

  • ylim (tuple of float, optional) – Y-axis limits as (ymin, ymax). Useful for consistent scaling across multiple related plots or for focusing on specific data ranges.

  • fontsize_title (int, optional) – Override configuration title font size. Useful for presentation adaptation without modifying the base configuration object.

  • fontsize_labels (int, optional) – Override configuration axis label font size. Maintains consistency while allowing figure-specific adjustments.

  • fontsize_ticks (int, optional) – Override configuration tick label font size. Important for readability when figures are scaled for different contexts.

  • fontsize_legend (int, optional) – Override configuration legend font size. Critical for maintaining legend readability in complex multi-series plots.

Return type:

Tuple[Figure, Axes]

Returns:

  • fig (plt.Figure) – Matplotlib figure object containing the plot. Can be further customized, saved, or displayed using standard matplotlib operations.

  • ax (plt.Axes) – Matplotlib axes object for the plot. Provides access for additional annotations, reference lines, or styling modifications.

Raises:
  • ValueError – If x_data and y_data have mismatched lengths or contain invalid values.

  • TypeError – If data arrays are not numpy arrays or convertible to arrays.

Notes

Plot Element Control:

  • include_line=True, include_scatter=True: Connected scatter plot (default)
  • include_line=True, include_scatter=False: Pure line plot for trends
  • include_line=False, include_scatter=True: Pure scatter plot for discrete data
  • include_line=False, include_scatter=False: Empty plot (not recommended)
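
One way to picture the four flag combinations is to translate them into keyword arguments for a single matplotlib Axes.plot call, where linestyle='None' and marker='None' suppress the respective element. This is an illustrative sketch: line_marker_kwargs is a hypothetical helper, and the library itself may draw lines and markers with separate calls.

```python
def line_marker_kwargs(include_line=True, include_scatter=True,
                       marker='^', linewidth=0.1, markersize=5, alpha=0.55):
    """Translate the two element flags into keyword arguments for Axes.plot.

    Default values mirror the TimeSeriesPlotConfig defaults listed on this page.
    """
    return {
        "linestyle": "-" if include_line else "None",    # 'None' draws no line
        "marker": marker if include_scatter else "None",  # 'None' draws no markers
        "linewidth": linewidth,
        "markersize": markersize,
        "alpha": alpha,
    }


# Pure scatter plot: markers only, no connecting line.
print(line_marker_kwargs(include_line=False, marker='o'))
```

With matplotlib available, the result could be passed along as ax.plot(x, y, label=stats_label, **kwargs).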

Performance Characteristics:

  • Memory usage: O(n_points) for data storage, minimal for plot objects
  • Rendering time: O(n_points) for both line and scatter plots
  • File size: Vector formats grow with the number of plotted points; raster formats depend mainly on image resolution

Statistical Integration:

  • Use calculate_mean_std_label() for automated statistical summaries
  • Use calculate_frequency_label() for discrete event analysis
  • Legend automatically includes stats_label for quantitative context
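
To make the idea of a statistical legend label concrete, here is a minimal sketch of a mean ± standard deviation formatter. The argument names follow the calculate_mean_std_label(data, label_prefix, precision=2) signature shown on this page, but the function below is a stand-in: the output format and the use of the population standard deviation are assumptions, not the library's confirmed behavior.

```python
import statistics


def mean_std_label(data, label_prefix, precision=2):
    """Format 'prefix: mean ± std' for use as a legend entry (illustrative)."""
    mean = statistics.fmean(data)
    # Population standard deviation; the library may use the sample version.
    std = statistics.pstdev(data)
    return f"{label_prefix}: {mean:.{precision}f} ± {std:.{precision}f}"


label = mean_std_label([2.0, 2.5, 1.5, 2.0], "Mean")
print(label)  # Mean: 2.00 ± 0.35
```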

Examples

Basic time series plot with default configuration:

>>> import numpy as np
>>> from lammpskit.plotting.timeseries_plots import create_time_series_plot
>>> time = np.arange(0, 100, 1)
>>> displacement = np.random.normal(2.0, 0.5, 100)
>>> fig, ax = create_time_series_plot(
...     time, displacement,
...     'Atomic Displacement Evolution',
...     'Time (ps)', 'Displacement (Å)',
...     'Mean: 2.0 ± 0.5 Å'
... )

Custom configuration for presentation:

>>> from lammpskit.plotting.timeseries_plots import TimeSeriesPlotConfig
>>> config = TimeSeriesPlotConfig(
...     linewidth=1.0, markersize=8, alpha=0.8, format='png'
... )
>>> fig, ax = create_time_series_plot(
...     time, displacement,
...     'High-Visibility Displacement Plot',
...     'Time (ps)', 'Displacement (Å)',
...     'N=100 points', config=config
... )

Scatter-only plot for statistical analysis:

>>> config = TimeSeriesPlotConfig(include_line=False, marker='o')
>>> fig, ax = create_time_series_plot(
...     time, displacement,
...     'Discrete Displacement Measurements',
...     'Time (ps)', 'Displacement (Å)',
...     'σ = 0.5 Å', config=config
... )

Controlled y-axis range for comparison plots:

>>> fig, ax = create_time_series_plot(
...     time, displacement,
...     'Constrained Range Analysis',
...     'Time (ps)', 'Displacement (Å)',
...     'Range: 0-5 Å', ylim=(0, 5)
... )

Font size override for manuscript figures:

>>> fig, ax = create_time_series_plot(
...     time, displacement,
...     'Publication Figure',
...     'Time (ps)', 'Displacement (Å)',
...     'Experimental data',
...     fontsize_title=14, fontsize_labels=12, fontsize_legend=10
... )
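
The per-call font overrides used in the last example could be resolved against the configuration roughly as follows: an explicit argument wins, and None falls back to the configured value. FontSizes and resolve_fontsize are hypothetical names introduced for this sketch; only the fallback idea is taken from the parameter descriptions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FontSizes:
    """Hypothetical stand-in for the fontsize_* fields of a plot config."""
    title: Optional[int] = 8
    labels: Optional[int] = 8
    ticks: Optional[int] = 8
    legend: Optional[int] = 8


def resolve_fontsize(override, configured):
    """Prefer the per-call override; fall back to the configured value."""
    return configured if override is None else override


fonts = FontSizes()
title_size = resolve_fontsize(14, fonts.title)    # override supplied
ticks_size = resolve_fontsize(None, fonts.ticks)  # no override, use config
print(title_size, ticks_size)  # 14 8
```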
lammpskit.plotting.create_dual_axis_plot(x_data, primary_y_data, secondary_y_data, title, xlabel, primary_ylabel, secondary_ylabel, primary_stats_label, secondary_stats_label, config=None, primary_ylim=None, secondary_ylim=None, fontsize_title=None, fontsize_labels=None, fontsize_ticks=None, fontsize_legend=None)[source]

Create dual-axis plots for simultaneous visualization of two correlated data series.

Generates publication-ready figures with independent y-axes supporting different units, scales, or physical quantities. Essential for comparative analysis of time-dependent properties where direct correlation visualization is critical. Features automatic color coordination, independent axis control, and optimized legend positioning.

Parameters:
  • x_data (np.ndarray) – Shared x-axis values for both data series. Shape: (n_points,). Typically represents time, voltage cycles, or sequential measurements.

  • primary_y_data (np.ndarray) – Left y-axis data series. Shape: (n_points,). Must match x_data length. Examples: displacement values, temperature measurements, primary properties.

  • secondary_y_data (np.ndarray) – Right y-axis data series. Shape: (n_points,). Must match x_data length. Examples: connectivity percentages, statistical measures, derived properties.

  • title (str) – Plot title displayed above both axes. Should indicate the relationship being explored. Example: ‘Temperature vs Connectivity Evolution’

  • xlabel (str) – Shared x-axis label with units. Standard format: ‘Property (units)’. Examples: ‘Time (ps)’, ‘Voltage Cycle’, ‘Timestep Number’

  • primary_ylabel (str) – Left y-axis label with units. Standard format: ‘Property (units)’. Color-coded to match primary_color in configuration.

  • secondary_ylabel (str) – Right y-axis label with units. Standard format: ‘Property (units)’. Color-coded to match secondary_color in configuration.

  • primary_stats_label (str) – Statistical summary for primary data legend entry. Often includes mean, standard deviation, or characteristic values. Example: ‘Temp: 300 ± 50 K’

  • secondary_stats_label (str) – Statistical summary for secondary data legend entry. Should complement primary statistics. Example: ‘Connected: 23.4% of time’

  • config (DualAxisPlotConfig, optional) – Dual-axis configuration controlling colors, legend positioning, and styling. If None, uses default configuration optimized for scientific visualization.

  • primary_ylim (tuple of float, optional) – Primary (left) y-axis limits as (ymin, ymax). Useful for consistent scaling across multiple related plots or for highlighting specific data ranges.

  • secondary_ylim (tuple of float, optional) – Secondary (right) y-axis limits as (ymin, ymax). Independent control enables optimal visualization of secondary data regardless of primary axis scaling.

  • fontsize_title (int, optional) – Override configuration title font size. Useful for presentation adaptation without modifying the base configuration object.

  • fontsize_labels (int, optional) – Override configuration axis label font size. Applies to both primary and secondary axis labels simultaneously.

  • fontsize_ticks (int, optional) – Override configuration tick label font size. Affects both axes tick labels for consistent appearance.

  • fontsize_legend (int, optional) – Override configuration legend font size. Applies to both legend boxes.

Return type:

Tuple[Figure, Axes, Axes]

Returns:

  • fig (plt.Figure) – Matplotlib figure object containing the dual-axis plot. Can be further customized, saved, or displayed using standard matplotlib operations.

  • ax1 (plt.Axes) – Primary (left) y-axis axes object. Provides access for additional annotations, reference lines, or primary data modifications.

  • ax2 (plt.Axes) – Secondary (right) y-axis axes object. Enables independent secondary axis customization, additional data series, or specialized annotations.

Raises:
  • ValueError – If data arrays have mismatched lengths or contain invalid values.

  • TypeError – If data arrays are not numpy arrays or convertible to arrays.

Notes

Dual-Axis Design Principles:

  • Color coordination ensures clear association between data and corresponding axes
  • Independent axis scaling optimizes visualization of disparate data ranges
  • Legend positioning minimizes data occlusion while maintaining readability
  • Automatic tight layout prevents axis label cutoff in complex configurations

Visual Hierarchy:

  • Primary data (left axis) uses warm colors (red) for visual prominence
  • Secondary data (right axis) uses cool colors (blue) for complementary contrast
  • Legend transparency allows underlying data visibility
  • Consistent marker sizing maintains visual balance

Performance Considerations:

  • Dual-axis rendering requires additional matplotlib twinx() operations
  • Legend placement calculations may impact rendering time for complex data
  • Vector output formats recommended for maintaining text and line clarity
  • Memory usage: O(n_points) for data, minimal overhead for dual axes

Common Applications:

  • Process parameter correlation (temperature vs. pressure over time)
  • Statistical trend analysis (mean vs. variance evolution)
  • Performance monitoring (throughput vs. error rate)
  • Multi-scale temporal analysis (short-term vs. long-term trends)
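
For readers unfamiliar with twinx(), the following plain-matplotlib sketch shows the underlying pattern: a second axes object shares the x-axis while keeping its own y-scale. It is not the LAMMPSKit implementation; the colors, legend locations and transparency values are simply the DualAxisPlotConfig defaults listed on this page, and the data reuse the synthetic series from the examples below.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

time = np.arange(0, 100, 1)
temperature = 300 + 50 * np.sin(time * 0.1)
connectivity = 50 + 30 * np.cos(time * 0.05)

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # second y-axis sharing the same x-axis

# Primary (left) series in red, secondary (right) series in blue.
ax1.scatter(time, temperature, color="tab:red", alpha=0.55, s=5, label="T")
ax2.scatter(time, connectivity, color="tab:blue", alpha=0.55, s=5,
            marker="^", label="C")

ax1.set_xlabel("Time (ps)")
ax1.set_ylabel("Temperature (K)", color="tab:red")
ax2.set_ylabel("Connectivity (%)", color="tab:blue")
ax1.tick_params(axis="y", labelcolor="tab:red")
ax2.tick_params(axis="y", labelcolor="tab:blue")

# Each axes gets its own legend, placed so the two boxes do not overlap.
ax1.legend(loc="upper right", framealpha=0.75)
ax2.legend(loc="lower right", framealpha=0.75)
fig.tight_layout()
```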

Examples

Basic dual-axis plot with default configuration:

>>> import numpy as np
>>> from lammpskit.plotting.timeseries_plots import create_dual_axis_plot
>>> time = np.arange(0, 100, 1)
>>> temperature = 300 + 50 * np.sin(time * 0.1)
>>> connectivity = 50 + 30 * np.cos(time * 0.05)
>>> fig, ax1, ax2 = create_dual_axis_plot(
...     time, temperature, connectivity,
...     'Temperature-Connectivity Correlation',
...     'Time (ps)', 'Temperature (K)', 'Connectivity (%)',
...     'T = 300 ± 50 K', 'C = 50 ± 30%'
... )

Custom configuration for presentation:

>>> from lammpskit.plotting.timeseries_plots import DualAxisPlotConfig
>>> config = DualAxisPlotConfig(
...     primary_color='tab:green',
...     secondary_color='tab:purple',
...     primary_legend_loc='upper left',
...     secondary_legend_loc='lower right',
...     alpha=0.8
... )
>>> fig, ax1, ax2 = create_dual_axis_plot(
...     time, temperature, connectivity,
...     'Custom Color Analysis',
...     'Time (ps)', 'Property A', 'Property B',
...     'Series A', 'Series B', config=config
... )

Controlled axis ranges for comparison:

>>> fig, ax1, ax2 = create_dual_axis_plot(
...     time, temperature, connectivity,
...     'Fixed Range Comparison',
...     'Time (ps)', 'Temperature (K)', 'Connectivity (%)',
...     'Controlled range', 'Fixed scale',
...     primary_ylim=(250, 350), secondary_ylim=(0, 100)
... )

High-contrast configuration for printing:

>>> config = DualAxisPlotConfig(
...     primary_color='black',
...     secondary_color='gray',
...     alpha=1.0,
...     legend_framealpha=1.0,
...     format='eps'
... )
>>> fig, ax1, ax2 = create_dual_axis_plot(
...     time, temperature, connectivity,
...     'Print-Optimized Dual Plot',
...     'Time (ps)', 'Primary', 'Secondary',
...     'Data A', 'Data B', config=config
... )

Font override for manuscript figures:

>>> fig, ax1, ax2 = create_dual_axis_plot(
...     time, temperature, connectivity,
...     'Publication Figure',
...     'Time (ps)', 'Temperature (K)', 'Connectivity (%)',
...     'Experimental', 'Calculated',
...     fontsize_title=16, fontsize_labels=14, fontsize_legend=12
... )
lammpskit.plotting.save_and_close_figure(fig, output_dir, filename, file_format='pdf')[source]

Save matplotlib figure to disk with automatic directory creation and memory cleanup.

Provides standardized figure output handling for scientific visualization workflows. Automatically creates output directories, handles filename formatting, and closes figures to prevent memory accumulation during batch processing. Essential for automated analysis pipelines generating multiple plots.

Parameters:
  • fig (plt.Figure) – Matplotlib figure object to save. Can be any figure created with plt.figure(), plt.subplots(), or plotting functions returning figure objects.

  • output_dir (str) – Target directory for saved figure. Created automatically if it doesn’t exist. Supports both absolute and relative paths. Use ‘.’ for current directory.

  • filename (str) – Base filename without extension. Extension is added automatically based on file_format parameter. Should be descriptive of plot content for organization.

  • file_format (str, optional, default='pdf') – Output file format determining quality and compatibility:
      - 'pdf': Vector format, publication-ready, scalable
      - 'svg': Web-compatible vector format, editable
      - 'png': Raster format, good for web display
      - 'eps': LaTeX-compatible vector format
      - 'jpg'/'jpeg': Compressed raster, smaller files

Raises:
  • OSError – If output directory cannot be created due to permissions or disk space.

  • ValueError – If file_format is not supported by matplotlib backend.

Return type:

None

Notes

Memory Management:

  - Automatically closes the figure after saving to prevent memory leaks
  - Critical for batch processing workflows that generate many plots
  - Call plt.show() before this function if the figure must also be displayed

Directory Handling:

  - Creates nested directory structures automatically
  - Preserves existing directories and files
  - Raises no error if output_dir already exists

Performance Considerations:

  - Vector formats (PDF, SVG, EPS) stay sharp at any zoom level, but file size grows with the number of plotted points
  - Raster formats (PNG, JPG) have a fixed resolution, and file size depends on pixel dimensions rather than point count
  - PDF recommended for scientific publications and presentations
  - PNG recommended for web display and documentation
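
The behaviour described in these notes can be approximated with plain matplotlib. What follows is a minimal sketch, not lammpskit's actual implementation. The helper name save_and_close_sketch is hypothetical, and the sketch assumes the documented behaviour reduces to os.makedirs(..., exist_ok=True), fig.savefig(), and plt.close(fig). Unlike the documented function, which returns None, the sketch returns the written path so the result can be inspected.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def save_and_close_sketch(fig, output_dir, filename, file_format="pdf"):
    """Hypothetical stand-in illustrating the documented behaviour."""
    # Reject formats the active backend cannot write.
    supported = fig.canvas.get_supported_filetypes()
    if file_format not in supported:
        raise ValueError(f"Unsupported format: {file_format!r}")
    # exist_ok=True keeps nested creation safe when the directory exists.
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, f"{filename}.{file_format}")
    fig.savefig(path)
    # Release the figure so batch runs do not accumulate open figures.
    plt.close(fig)
    return path


with tempfile.TemporaryDirectory() as tmp:
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
    saved = save_and_close_sketch(fig, os.path.join(tmp, "a", "b"), "demo", "png")
    saved_exists = os.path.exists(saved)          # file was written
    still_open = plt.fignum_exists(fig.number)    # figure was closed
```

Checking the format before creating the directory avoids leaving empty folders behind when the call fails.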

Examples

Save figure with automatic directory creation:

>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> fig, ax = plt.subplots()
>>> x = np.linspace(0, 10, 100)
>>> ax.plot(x, np.sin(x))
>>> save_and_close_figure(fig, 'output/plots', 'sine_wave')
# Saves as 'output/plots/sine_wave.pdf'

Save in different formats for various uses:

>>> save_and_close_figure(fig, 'manuscript/figures', 'analysis', 'eps')
>>> save_and_close_figure(fig, 'web/images', 'analysis', 'png')
>>> save_and_close_figure(fig, 'presentations', 'analysis', 'svg')

Organized output structure:

>>> base_dir = 'results/experiment_2024'
>>> save_and_close_figure(fig, f'{base_dir}/temperature', 'temp_vs_time')
>>> save_and_close_figure(fig, f'{base_dir}/displacement', 'disp_evolution')

Current directory output:

>>> save_and_close_figure(fig, '.', 'quick_analysis', 'png')
# Saves as './quick_analysis.png'

Batch processing workflow:

>>> figures = [plt.figure() for _ in range(3)]  # placeholder figures; substitute your own
>>> names = ['temperature', 'pressure', 'density']
>>> for fig, name in zip(figures, names):
...     save_and_close_figure(fig, 'output/timeseries', name)

lammpskit.plotting.calculate_mean_std_label(data, label_prefix, precision=2)

Generate standardized statistical summary labels for plot legends and annotations.

Computes mean and standard deviation of input data and formats as publication-ready label string. Essential for automated legend generation in scientific plots where quantitative summaries enhance data interpretation. Supports flexible precision control for different measurement scales and reporting requirements.

Parameters:
  • data (np.ndarray) – Input data array for statistical calculation. Shape: (n_points,). Supports any numeric data type. NaN values are not filtered: they propagate through np.mean() and np.std(), so a single NaN yields NaN for both statistics.

  • label_prefix (str) – Descriptive text preceding the statistical values. Should include property name and units for clarity. Example: ‘Temperature (K)’, ‘Displacement (Å)’

  • precision (int, optional, default=2) – Number of decimal places for formatting statistical values. Range: 0-15. Use 0-1 for large values, 2-4 for typical scientific measurements, 5+ for high-precision requirements.

Returns:

Formatted label string in standard “prefix = mean ± std” format. Uses Unicode ± symbol for professional appearance. Compatible with matplotlib legend and annotation functions.

Return type:

str

Raises:
  • ValueError – If precision is negative or data array is empty.

  • TypeError – If data is not array-like or label_prefix is not string.

Notes

Statistical Calculations:

  - Mean: arithmetic average computed with np.mean()
  - Standard deviation: computed with np.std(). Note that np.std() defaults to the population form (ddof=0, N denominator); the sample form (N-1 denominator) applies only when ddof=1 is passed
  - NaN handling: follows numpy conventions (NaN propagation)

Formatting Standards:

  - Uses Unicode ± (U+00B1) for professional appearance
  - Precision applies to both mean and standard deviation
  - No scientific notation; adjust precision for extreme values

Performance Characteristics:

  - Computational complexity: O(n) for statistical calculations
  - Memory usage: O(1) additional memory beyond input array
  - String formatting: minimal overhead for typical legend use

Applications:

  - Time series analysis summary statistics
  - Experimental data characterization
  - Model validation metrics
  - Comparative analysis legend entries
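
The difference between the two standard deviation conventions is easy to check directly. The sketch below is an illustration only, not the library's code: mean_std_label_sketch and its ddof argument are hypothetical names introduced here. It formats a "prefix = mean ± std" label with plain NumPy and shows how the reported spread changes between ddof=0 and ddof=1, and how a NaN propagates.

```python
import numpy as np


def mean_std_label_sketch(data, label_prefix, precision=2, ddof=0):
    """Hypothetical helper: format 'prefix = mean ± std'."""
    values = np.asarray(data, dtype=float)
    mean = np.mean(values)
    # ddof=0 divides by N (NumPy default); ddof=1 divides by N-1.
    std = np.std(values, ddof=ddof)
    return f"{label_prefix} = {mean:.{precision}f} \u00b1 {std:.{precision}f}"


temperatures = np.array([298.2, 301.5, 299.8, 300.1, 302.3])
population = mean_std_label_sketch(temperatures, "Temperature (K)")
sample = mean_std_label_sketch(temperatures, "Temperature (K)", ddof=1)
print(population)  # Temperature (K) = 300.38 ± 1.42
print(sample)      # Temperature (K) = 300.38 ± 1.59

# A single NaN propagates into both statistics.
with_nan = mean_std_label_sketch(np.array([1.0, np.nan, 3.0]), "x")
print(with_nan)    # x = nan ± nan
```

For short series the two conventions differ visibly, so it may be worth stating which one a legend reports. If NaN entries should be ignored rather than propagated, np.nanmean() and np.nanstd() are the usual NumPy alternatives.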

Examples

Basic temperature data summarization:

>>> import numpy as np
>>> temperatures = np.array([298.2, 301.5, 299.8, 300.1, 302.3])
>>> label = calculate_mean_std_label(temperatures, 'Temperature (K)')
>>> print(label)
Temperature (K) = 300.38 ± 1.42
# Standard deviation shown for NumPy's default ddof=0; ddof=1 would give 1.59

Displacement analysis with high precision:

>>> displacements = np.random.normal(2.345, 0.123, 1000)
>>> label = calculate_mean_std_label(displacements, 'Displacement (Å)', precision=4)
>>> print(label)
Displacement (Å) = 2.3451 ± 0.1234
# Example output; exact digits vary with the random sample

Large-scale data with low precision:

>>> particle_counts = np.random.poisson(1e6, 100)
>>> label = calculate_mean_std_label(particle_counts, 'Count', precision=0)
>>> print(label)
Count = 1000023 ± 1000
# Example output; exact values vary with the random sample

Percentage data with appropriate precision:

>>> percentages = np.array([23.45, 24.12, 22.89, 23.78, 24.56])
>>> label = calculate_mean_std_label(percentages, 'Connectivity (%)', precision=1)
>>> print(label)
Connectivity (%) = 23.8 ± 0.6

Integration with plotting workflows:

>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> ax.plot(range(len(temperatures)), temperatures, label=label)
>>> ax.legend()  # Automatically uses formatted statistical label

lammpskit.plotting.calculate_frequency_label(data, target_value, label_template, precision=2)

Calculate occurrence frequency of specific values and generate formatted labels.

Computes percentage frequency of target value occurrence in data arrays and formats using customizable template strings. Essential for binary state analysis, event detection summaries, and categorical data visualization. Supports flexible label formatting for diverse scientific reporting contexts.

Parameters:
  • data (np.ndarray) – Input data array for frequency analysis. Shape: (n_points,). Supports any data type that supports equality comparison (int, float, bool, str).

  • target_value (any) – Specific value to count occurrences of. Must be comparable to data elements using == operator. Examples: 1, 0, True, ‘connected’, specific float values.

  • label_template (str) – Format string template with {frequency} placeholder for percentage insertion. Supports all Python string formatting options. Example: ‘Active {frequency:.1f}% of time’

  • precision (int, optional, default=2) – Decimal places for frequency percentage formatting. Range: 0-10. Note: this parameter is currently unused; precision is controlled by the format specifier in label_template (for example {frequency:.1f}).

Returns:

Formatted label string with frequency percentage substituted into template. Percentage calculated as (occurrences / total_points) * 100.

Return type:

str

Raises:
  • ValueError – If data array is empty or label_template missing {frequency} placeholder.

  • TypeError – If target_value type incompatible with data elements for comparison.

  • KeyError – If label_template contains invalid format specifications.

Notes

Frequency Calculation:

  - Uses element-wise equality (==) for counting matches
  - Percentage = (matches / total_elements) * 100
  - Range: 0.0% (no matches) to 100.0% (all matches)
  - Displayed precision is set by the template format specifiers

Template Formatting:

  - Supports all Python str.format() capabilities
  - Use {frequency:.1f} for 1 decimal place, {frequency:.0f} for integers
  - Can include additional text, units, and formatting
  - Multiple {frequency} references allowed in a single template

Performance Characteristics:

  - Computational complexity: O(n) for equality comparison
  - Memory usage: the comparison creates a temporary boolean array (one byte per element); no other allocation scales with input size

Common Applications:

  - Binary state analysis (connected/disconnected, active/inactive)
  - Event detection (threshold crossings, state changes)
  - Categorical data summaries (phase classification, state distribution)
  - Time-based occurrence rates (duty cycles, sampling frequencies)
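
The frequency calculation and template substitution described in these notes can be sketched in a few lines of NumPy. This is an assumption-laden illustration rather than the library's implementation: frequency_label_sketch is a hypothetical name, and the sketch omits the unused precision parameter. The second half shows why exact equality can undercount when the target is a computed float, with np.isclose() as one possible workaround.

```python
import numpy as np


def frequency_label_sketch(data, target_value, label_template):
    """Hypothetical helper: percentage of elements equal to target_value."""
    values = np.asarray(data)
    if values.size == 0:
        raise ValueError("data must not be empty")
    matches = values == target_value  # temporary boolean mask
    frequency = 100.0 * np.count_nonzero(matches) / values.size
    return label_template.format(frequency=frequency)


connectivity = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 1])
label = frequency_label_sketch(
    connectivity, 1, "Connected {frequency:.1f}% of time"
)
print(label)  # Connected 70.0% of time

# Exact == is fragile for computed floats: 0.1 + 0.2 is not exactly 0.3.
readings = np.array([0.1 + 0.2, 0.3, 0.5])
exact_hits = np.count_nonzero(readings == 0.3)                # 1
close_hits = np.count_nonzero(np.isclose(readings, 0.3))      # 2
```

For float-valued targets, converting the data to a boolean mask first (as in the threshold example) sidesteps the equality problem entirely.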

Examples

Binary connectivity analysis:

>>> import numpy as np
>>> connectivity = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 1])
>>> label = calculate_frequency_label(connectivity, 1, "Connected {frequency:.1f}% of time")
>>> print(label)
Connected 70.0% of time

Boolean state analysis:

>>> active_states = np.array([True, False, True, True, False])
>>> label = calculate_frequency_label(active_states, True, "Active: {frequency:.0f}%")
>>> print(label)
Active: 60%

Threshold crossing analysis:

>>> temperatures = np.array([298, 305, 310, 295, 307, 312, 290])
>>> over_threshold = temperatures > 300
>>> label = calculate_frequency_label(over_threshold, True, "Above 300K: {frequency:.1f}%")
>>> print(label)
Above 300K: 57.1%

Categorical state distribution:

>>> phases = np.array(['A', 'B', 'A', 'A', 'C', 'B', 'A'])
>>> label_A = calculate_frequency_label(phases, 'A', "Phase A: {frequency:.1f}%")
>>> label_B = calculate_frequency_label(phases, 'B', "Phase B: {frequency:.1f}%")
>>> print(label_A, '|', label_B)
Phase A: 57.1% | Phase B: 28.6%

Multiple format references:

>>> successes = np.array([1, 0, 1, 1, 0])
>>> label = calculate_frequency_label(
...     successes, 1,
...     "Success rate: {frequency:.1f}% ({frequency:.2f}% precise)"
... )
>>> print(label)
Success rate: 60.0% (60.00% precise)

Integration with time series plotting:

>>> connectivity_data = np.random.choice([0, 1], 1000, p=[0.3, 0.7])
>>> stats_label = calculate_frequency_label(
...     connectivity_data, 1, "Connected {frequency:.1f}% of simulation"
... )
>>> # Use stats_label in create_time_series_plot() for automatic legend generation
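
The signature of create_time_series_plot() is not shown in this section, so the following sketch uses plain matplotlib to illustrate how a frequency label ends up in a legend. The frequency is computed inline with NumPy instead of calling the library, and the seeded generator, axis labels, and step plot are choices made here purely for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted use
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the example is repeatable.
rng = np.random.default_rng(0)
connectivity_data = rng.choice([0, 1], size=1000, p=[0.3, 0.7])

# Same percentage definition as in the notes: matches / total * 100.
frequency = 100.0 * np.mean(connectivity_data == 1)
stats_label = f"Connected {frequency:.1f}% of simulation"

fig, ax = plt.subplots()
ax.step(np.arange(connectivity_data.size), connectivity_data, label=stats_label)
ax.set_xlabel("Frame")
ax.set_ylabel("Connected (0/1)")
legend = ax.legend()

# The legend entry carries the formatted statistic verbatim.
legend_text = legend.get_texts()[0].get_text()
plt.close(fig)
```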