lammpskit.config module

The configuration module provides essential validation functions and constants for robust LAMMPS trajectory analysis. These functions ensure input validation, parameter checking, and standardized constants used across LAMMPSKit analysis workflows.

Key Functions

The module includes validation functions for:

  • File path validation with existence checking

  • Data index validation with range checking

  • File list validation for batch processing

  • Loop parameter validation for iteration control

  • Chunk parameter validation for parallel processing

  • Cluster parameter validation for analysis configuration

Plot Configuration System

PlotConfig (Base Class)

The foundation for all plotting configurations, providing:

  • Typography control (title, labels, ticks, legend fonts)

  • Color scheme management

  • Figure size and DPI settings

  • Output format specifications

Default Typography Settings:
  • Title: 12pt (emphasis without overwhelming)

  • Labels: 10pt (clear readability)

  • Ticks: 8pt (compact but legible)

  • Legend: 9pt (balanced information density)

ScatterPlotConfig

Extends PlotConfig for scatter plot specific parameters:

ScatterPlotConfig(
    alpha=0.5,          # Semi-transparency for overlap visualization
    marker='o',         # Circular markers for universal recognition
    markersize=30,      # Large enough for pattern recognition
    colormap='viridis', # Perceptually uniform colormap
    format='pdf'        # Vector output for publications
)

HistogramConfig

Specialized for histogram and distribution plots:

HistogramConfig(
    bins=30,            # Balanced between detail and noise
    alpha=0.7,          # Slight transparency for overlays
    density=True,       # Normalized distributions
    color='skyblue',    # Professional, non-aggressive color
    edgecolor='black'   # Clear bin boundaries
)

TimeSeriesPlotConfig

Optimized for temporal data visualization:

TimeSeriesPlotConfig(
    alpha=0.55,         # Transparency for overlay analysis
    linewidth=0.1,      # Thin lines for dense time series
    markersize=5,       # Readable without overcrowding
    marker='^',         # Distinctive triangular markers
    include_line=True,  # Show trend lines
    include_scatter=True # Show individual data points
)

Font Configuration Examples

Centralized font control:

from lammpskit.config import CentralizedFontConfig

# Professional presentation configuration
presentation_fonts = CentralizedFontConfig(
    title_size=16,
    label_size=14,
    tick_size=12,
    legend_size=13,
    family='serif'  # Traditional academic style
)

# Apply to any plot configuration
plot_config = TimeSeriesPlotConfig(font_config=presentation_fonts)

Individual font overrides:

# Override specific font sizes while keeping others at defaults
config = PlotConfig(
    fontsize_title=14,    # Larger title only
    fontsize_labels=10,   # Keep default label size
    # Other fonts inherit from defaults
)

Analysis Configuration Usage

FilamentAnalysisConfig

Configure filament connectivity analysis:

from lammpskit.config import FilamentAnalysisConfig

# Electrochemical filament analysis
ecell_config = FilamentAnalysisConfig(
    connectivity_threshold=2.5,  # Ångström cutoff distance
    min_cluster_size=3,          # Minimum atoms per cluster
    gap_analysis=True,           # Include gap size analysis
    temporal_tracking=True,      # Track evolution over time
    statistical_analysis=True   # Include mean/std calculations
)

ConnectivityAnalysisConfig

Network analysis parameters:

from lammpskit.config import ConnectivityAnalysisConfig

# Network connectivity analysis
network_config = ConnectivityAnalysisConfig(
    distance_cutoff=3.0,         # Maximum connection distance
    periodic_boundaries=True,    # Account for PBC
    cluster_algorithm='DBSCAN',  # Clustering method
    min_samples=2,               # DBSCAN parameter
    connectivity_metric='euclidean'  # Distance calculation
)

System Configuration Examples

SystemParameters

Define system-wide analysis parameters:

from lammpskit.config import SystemParameters

# Molecular dynamics system configuration
md_system = SystemParameters(
    box_dimensions=[50.0, 50.0, 50.0],  # Simulation box size (Å)
    periodic_boundaries=[True, True, True],  # PBC in x,y,z
    temperature=300.0,    # Target temperature (K)
    pressure=1.0,         # Target pressure (atm)
    timestep=0.001       # Integration timestep (ps)
)

MaterialProperties

Physical material parameters:

from lammpskit.config import MaterialProperties

# Electrochemical device materials
device_materials = MaterialProperties(
    electrode_material='Ag',
    electrolyte='TiO2',
    electrode_density=10.49,     # g/cm³
    electrolyte_density=4.23,    # g/cm³
    contact_resistance=100.0,    # Ohm·cm²
    formation_energy=1.2        # eV
)

Configuration Inheritance

Configuration classes support inheritance for specialized applications:

# Base configuration for all electrochemical plots
base_ecell_config = PlotConfig(
    fontsize_title=12,
    fontsize_labels=10,
    format='pdf',
    dpi=300
)

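# Specialized configuration for presentations.
# Overrides are merged into one dict first: passing the same keyword both
# via ** unpacking and explicitly raises "TypeError: got multiple values
# for keyword argument".
presentation_config = PlotConfig(**{
    **base_ecell_config.__dict__,  # Inherit base settings
    'fontsize_title': 16,          # Override title size
    'fontsize_labels': 14,         # Override label size
    'format': 'svg',               # Override format
})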
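
One way to derive a specialized configuration without mutating the base is sketched below. `DemoPlotConfig` is a hypothetical stand-in dataclass whose fields mirror the example above; it is not the lammpskit class, and the real constructor may accept different arguments.

```python
from dataclasses import asdict, dataclass, replace


@dataclass
class DemoPlotConfig:
    """Hypothetical stand-in for a plot configuration; not the lammpskit class."""
    fontsize_title: int = 12
    fontsize_labels: int = 10
    format: str = "pdf"
    dpi: int = 300


base = DemoPlotConfig()

# Option 1: merge dictionaries; later keys win, so overrides replace base values.
merged = DemoPlotConfig(**{**asdict(base), "fontsize_title": 16, "format": "svg"})

# Option 2: dataclasses.replace copies the instance and swaps the named fields.
presentation = replace(base, fontsize_title=16, fontsize_labels=14, format="svg")

print(merged)
print(presentation)
```

Both options leave `base` unchanged, so the same base configuration can feed several derived ones.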

Runtime Configuration Override

Dynamic parameter modification:

# Start with default configuration
config = TimeSeriesPlotConfig()

# Override specific parameters for current analysis
config.alpha = 0.8              # Increase opacity
config.markersize = 8           # Larger markers
config.include_line = False     # Remove connecting lines

# Use modified configuration
create_time_series_plot(x, y, config=config)

Validation and Type Safety

Configuration classes include automatic validation:

# Type checking prevents common errors
config = PlotConfig(
    fontsize_title=12,    # ✓ Valid integer
    alpha=0.5,           # ✓ Valid float [0,1]
    format='pdf'         # ✓ Valid output format
)

# Invalid configurations raise errors at runtime
invalid_config = PlotConfig(
    alpha=1.5,           # ✗ Invalid: alpha > 1.0
    format='doc'         # ✗ Invalid: unsupported format
)
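
Construction-time validation of this kind could be implemented roughly as follows. This is a minimal sketch, not the library's code: `DemoValidatedConfig`, `ALLOWED_FORMATS`, and `try_build` are hypothetical names, and the set of accepted formats is assumed for illustration.

```python
from dataclasses import dataclass

# Assumed set of accepted formats, chosen only for this illustration.
ALLOWED_FORMATS = {"pdf", "png", "svg"}


@dataclass
class DemoValidatedConfig:
    """Hypothetical config that validates itself on construction."""
    fontsize_title: int = 12
    alpha: float = 0.5
    format: str = "pdf"

    def __post_init__(self):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(self.fontsize_title, bool) or not isinstance(self.fontsize_title, int):
            raise TypeError("fontsize_title must be an integer")
        if not 0.0 <= self.alpha <= 1.0:
            raise ValueError(f"alpha must be in [0, 1], got {self.alpha}")
        if self.format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported format: {self.format!r}")


def try_build(**kwargs):
    """Return the error message for an invalid config, or None if it is valid."""
    try:
        DemoValidatedConfig(**kwargs)
    except (TypeError, ValueError) as exc:
        return str(exc)
    return None


print(try_build(alpha=0.5, format="pdf"))
print(try_build(alpha=1.5))
print(try_build(format="doc"))
```

Checking in `__post_init__` means an invalid object never exists, so later plotting code does not need to re-check its configuration.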

Best Practices

Configuration Organization:
  • Create base configurations for project-wide consistency

  • Use inheritance for specialized requirements

  • Document custom configurations with usage examples

Parameter Selection:
  • Test font sizes at target output resolution

  • Validate color schemes for colorblind accessibility

  • Consider output medium (screen vs. print) when setting DPI

Performance Optimization:
  • Use low alpha values sparingly (transparency can slow rendering)

  • Select appropriate marker sizes for data density

  • Choose vector formats for scalable graphics

Module Documentation

LAMMPSKit Configuration Module

Essential validation functions and constants for robust LAMMPS trajectory analysis. This module provides infrastructure for input validation, parameter checking, and standardized constants used across LAMMPSKit analysis workflows.

Architecture Design

The configuration module follows a functional approach rather than class-based configuration objects, prioritizing simplicity and direct validation at point-of-use. This design reduces coupling while ensuring consistent parameter validation across all analysis modules.

Key Components

  • Column mapping constants for LAMMPS dump file parsing

  • Data type labels for displacement and property analysis

  • Input validation functions with domain-specific error messages

  • Parameter range checking with physics-aware warnings

Validation Philosophy

Validation functions use a “fail-fast” approach with informative error messages to catch configuration issues early in analysis workflows. Physics-aware warnings help identify potential coordinate system or unit scale problems common in MD simulations.

Performance Considerations

Validation overhead is O(1) for most functions, O(n) for file list validation. Pre-validate parameters once at workflow start rather than per-timestep for optimal performance in long trajectory analysis.
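
The pre-validation advice can be illustrated with a small sketch. `check_bin_count` is a hypothetical stand-in for a validator (it counts its own calls so the effect is visible), and `mean_per_frame` is an illustrative workload, not a LAMMPSKit function.

```python
validation_calls = 0


def check_bin_count(nbins, lo=1, hi=1000):
    """Hypothetical stand-in for a parameter validator; counts its own calls."""
    global validation_calls
    validation_calls += 1
    if isinstance(nbins, bool) or not isinstance(nbins, int):
        raise ValueError("nbins must be an integer")
    if not lo <= nbins <= hi:
        raise ValueError(f"nbins must be in [{lo}, {hi}], got {nbins}")


def mean_per_frame(frames, nbins):
    """Validate once up front, then run the per-frame loop without re-checking."""
    check_bin_count(nbins)  # O(1) check, performed a single time
    return [sum(frame) / nbins for frame in frames]


frames = [[1.0, 2.0, 3.0, 4.0]] * 5  # five toy frames, four bins each
result = mean_per_frame(frames, nbins=4)
print(result, validation_calls)
```

The validator runs once regardless of how many frames are processed, which is the pattern recommended for long trajectories.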

lammpskit.config.validate_filepath(filepath, check_existence=True)

Validate file path for LAMMPS trajectory and output files.

Ensures filepath is a valid string and optionally verifies file existence. Essential for preventing downstream failures in trajectory reading and analysis output generation.

Parameters:
  • filepath (str) – Path to file for validation. Supports both absolute and relative paths.

  • check_existence (bool, optional) – Whether to verify file exists on disk (default: True). Set False for output file validation.

Raises:
Return type:

None

Examples

Validate input trajectory file:

>>> validate_filepath('trajectory.lammpstrj')

Validate output path without existence check:

>>> validate_filepath('output/analysis.pdf', check_existence=False)

lammpskit.config.validate_dataindex(dataindex, max_index=None)

Validate array index for displacement data and property arrays.

Ensures safe array indexing with support for Python negative indexing. Primarily used for accessing DISPLACEMENT_DATA_LABELS and trajectory property arrays.

Parameters:
  • dataindex (int) – Index to validate. Supports negative indexing (e.g., -1 for last element).

  • max_index (int, optional) – Maximum allowed index. If None, uses DISPLACEMENT_DATA_LABELS length.

Raises:

ValueError – If dataindex is not an integer or is outside the valid range.

Return type:

None

Notes

Negative indexing follows Python conventions: -1 = last element, -n = first element for array of length n.

Examples

Validate index for displacement data:

>>> validate_dataindex(2)  # Access 'temp (K)'
>>> validate_dataindex(-1)  # Access last element
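
The negative-index convention described in the Notes can be sketched as a range check. `check_index` is a hypothetical helper, not the library function: unlike `validate_dataindex`, which is documented to return None, it returns the normalized position so the mapping is visible. The labels are placeholders.

```python
def check_index(index, length):
    """Hypothetical index check following Python's negative-index convention."""
    if isinstance(index, bool) or not isinstance(index, int):
        raise ValueError("index must be an integer")
    # Valid indices for a sequence of `length` items run from -length to length - 1.
    if not -length <= index < length:
        raise ValueError(f"index {index} out of range for length {length}")
    return index % length  # normalized, non-negative position


labels = ["a", "b", "c", "d"]  # placeholder labels, not the library's constants

print(check_index(-1, len(labels)))  # last element
print(check_index(-4, len(labels)))  # first element
```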

lammpskit.config.validate_file_list(file_list)

Validate list of trajectory files for batch processing.

Ensures all files exist and are accessible before starting computationally expensive analysis workflows. Prevents partial analysis completion due to missing files in multi-trajectory studies.

Parameters:

file_list (List[str]) – List of file paths to validate. Commonly used for time series analysis across multiple LAMMPS dump files.

Raises:
  • ValueError – If file_list is not a list/tuple, is empty, or contains non-string elements.

  • FileNotFoundError – If any files don’t exist. Reports all missing files simultaneously for efficient error handling.

Return type:

None

Performance Notes

Complexity: O(n), where n is the number of files. For large file lists (>1000), consider validating in chunks.

Examples

Validate trajectory sequence:

>>> files = ['step_0.lammpstrj', 'step_1000.lammpstrj', 'step_2000.lammpstrj']
>>> validate_file_list(files)
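
Reporting every missing file in a single error, rather than stopping at the first, might look like the sketch below. `check_file_list` is a hypothetical re-implementation based only on the behavior documented here; the message format is an assumption.

```python
import os
import tempfile


def check_file_list(paths):
    """Hypothetical batch check that collects all missing files before raising."""
    if not isinstance(paths, (list, tuple)) or not paths:
        raise ValueError("paths must be a non-empty list or tuple")
    if not all(isinstance(p, str) for p in paths):
        raise ValueError("all entries must be strings")
    # One pass over the list: O(n) in the number of files.
    missing = [p for p in paths if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError("missing files: " + ", ".join(missing))


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "step_0.lammpstrj")
    open(present, "w").close()  # create one empty file that does exist
    absent = [os.path.join(tmp, name)
              for name in ("step_1000.lammpstrj", "step_2000.lammpstrj")]
    try:
        check_file_list([present] + absent)
    except FileNotFoundError as exc:
        message = str(exc)

print(message)
```

Collecting the missing paths first lets a user fix every problem in one pass instead of rerunning the workflow once per missing file.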

lammpskit.config.validate_loop_parameters(loop_start, loop_end)

Validate timestep range parameters for trajectory analysis loops.

Ensures valid timestep iteration bounds for LAMMPS trajectory processing. Critical for preventing infinite loops or invalid memory access in temporal analysis functions.

Parameters:
  • loop_start (int) – Starting timestep index (inclusive). Must be non-negative.

  • loop_end (int) – Ending timestep index (inclusive). Must be >= loop_start.

Raises:

ValueError – If either parameter is not an integer or is negative, or if loop_start > loop_end.

Return type:

None

Notes

Both indices are inclusive: range [loop_start, loop_end]. For trajectory with N timesteps, valid range is [0, N-1].

Examples

Validate analysis range:

>>> validate_loop_parameters(0, 1000)  # Analyze first 1001 timesteps
>>> validate_loop_parameters(500, 1500)  # Analyze middle portion
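
Because both bounds are inclusive, a loop built on Python's `range` needs `loop_end + 1` as its stop value. The sketch below shows this with a hypothetical helper, `inclusive_steps`, which applies the documented checks and then returns the index range; it is not part of the module.

```python
def inclusive_steps(loop_start, loop_end):
    """Hypothetical helper: check bounds, then return the inclusive index range."""
    for name, value in (("loop_start", loop_start), ("loop_end", loop_end)):
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer")
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
    if loop_start > loop_end:
        raise ValueError("loop_start must not exceed loop_end")
    # range() excludes its stop value, so add 1 to include loop_end.
    return range(loop_start, loop_end + 1)


steps = inclusive_steps(0, 1000)
print(len(steps), steps[0], steps[-1])  # 1001 0 1000
```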

lammpskit.config.validate_chunks_parameter(nchunks, min_chunks=1, max_chunks=1000)

Validate spatial binning parameters for density and distribution analysis.

Ensures appropriate bin count for spatial discretization in electrochemical cell analysis. Balances statistical significance with computational efficiency.

Parameters:
  • nchunks (int) – Number of spatial bins/chunks for discretization.

  • min_chunks (int, optional) – Minimum allowed bins (default: 1). Must be positive.

  • max_chunks (int, optional) – Maximum allowed bins (default: 1000). Prevents excessive memory usage.

Raises:
  • ValueError – If nchunks is not integer or outside [min_chunks, max_chunks] range.

Return type:

None

Performance Notes

  • Memory usage: O(nchunks) per property per timestep.

  • Computation time: O(N * nchunks), where N is the atom count.

  • Optimal range: 10-100 chunks for most electrochemical systems.

Examples

Validate binning for layer analysis:

>>> validate_chunks_parameter(50)  # 50 z-direction layers
>>> validate_chunks_parameter(10, min_chunks=5, max_chunks=20)
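
To show what the chunk count controls, the sketch below bins z-coordinates into equal-width layers. `layer_index` and `layer_counts` are hypothetical helpers written for this illustration; the binning scheme LAMMPSKit actually uses is not described in this document.

```python
def layer_index(z, z_min, z_max, nchunks):
    """Map a z-coordinate to a layer index in [0, nchunks - 1] (hypothetical helper)."""
    if isinstance(nchunks, bool) or not isinstance(nchunks, int) or nchunks < 1:
        raise ValueError("nchunks must be a positive integer")
    if not z_min <= z <= z_max or z_min == z_max:
        raise ValueError("z must lie within [z_min, z_max], with z_min < z_max")
    width = (z_max - z_min) / nchunks
    # Clamp so that z == z_max falls in the last layer rather than one past it.
    return min(int((z - z_min) / width), nchunks - 1)


def layer_counts(z_values, z_min, z_max, nchunks):
    """Count how many coordinates fall in each of the nchunks layers."""
    counts = [0] * nchunks  # storage grows linearly with nchunks
    for z in z_values:
        counts[layer_index(z, z_min, z_max, nchunks)] += 1
    return counts


counts = layer_counts([0.0, 1.0, 5.0, 9.9, 10.0], z_min=0.0, z_max=10.0, nchunks=5)
print(counts)  # [2, 0, 1, 0, 2]
```

More layers give finer spatial resolution but fewer atoms per layer, which is the trade-off between detail and statistical significance noted above.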

lammpskit.config.validate_cluster_parameters(z_filament_lower_limit, z_filament_upper_limit, thickness)

Validate geometric parameters for filament connectivity analysis.

Ensures physically meaningful parameters for OVITO-based cluster analysis in electrochemical cell simulations. Validates filament detection geometry and provides physics-aware warnings for common coordinate system issues.

Parameters:
  • z_filament_lower_limit (float) – Lower z-coordinate bound for filament connectivity (Angstroms). Typically electrode surface position.

  • z_filament_upper_limit (float) – Upper z-coordinate bound for filament connectivity (Angstroms). Typically opposite electrode surface position.

  • thickness (float) – Filament thickness parameter for cluster detection (Angstroms). Controls sensitivity of connectivity algorithm.

Raises:
  • TypeError – If parameters are not numeric (int or float).

  • ValueError – If z_filament_lower_limit >= z_filament_upper_limit or thickness <= 0.

Warns:
  • UserWarning – If negative z-coordinates are detected (potential coordinate system issue) or if large z-values are detected (potential unit scale issue).

Return type:

None

Physics Notes

  • Typical electrochemical cell dimensions: 20-100 Å electrode separation.

  • Filament thickness: 2-10 Å, depending on atom size and connectivity criteria.

  • Z-coordinates should span the electrode-to-electrode distance.

Examples

Validate HfTaO cell parameters:

>>> validate_cluster_parameters(-10.0, 50.0, 3.5)  # 60 Å cell, 3.5 Å thickness; the negative lower bound is expected to emit a UserWarning
>>> validate_cluster_parameters(0.0, 30.0, 2.0)    # 30 Å cell, 2.0 Å thickness
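
The split between hard errors and physics-aware warnings could be sketched as follows. `check_cluster_geometry` is a hypothetical re-implementation based on the documented behavior, and `LARGE_Z_THRESHOLD` is an assumed value: the cutoff the library uses for "large" z-values is not stated here.

```python
import warnings

# Assumed threshold for "large" z-values; the real cutoff is not stated in this document.
LARGE_Z_THRESHOLD = 1000.0


def check_cluster_geometry(z_lower, z_upper, thickness):
    """Hypothetical check: raise on invalid geometry, warn on suspicious values."""
    for name, value in (("z_lower", z_lower), ("z_upper", z_upper), ("thickness", thickness)):
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be numeric")
    if z_lower >= z_upper:
        raise ValueError("z_lower must be less than z_upper")
    if thickness <= 0:
        raise ValueError("thickness must be positive")
    # Suspicious but not necessarily wrong: warn and let the analysis continue.
    if z_lower < 0 or z_upper < 0:
        warnings.warn("negative z-coordinate: check the coordinate system", UserWarning)
    if max(abs(z_lower), abs(z_upper)) > LARGE_Z_THRESHOLD:
        warnings.warn("large z-value: check the unit scale", UserWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure no warning is suppressed
    check_cluster_geometry(-10.0, 50.0, 3.5)

messages = [str(w.message) for w in caught]
print(messages)
```

Errors stop the workflow because the geometry cannot be analyzed at all, while warnings flag inputs that are valid but often indicate a coordinate or unit mistake.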