lammpskit.ecellmodel.analyze_clusters

lammpskit.ecellmodel.analyze_clusters(filepath, z_filament_lower_limit=5, z_filament_upper_limit=23, thickness=21)[source]

Perform comprehensive OVITO-based cluster analysis for metallic filament characterization.

Conducts advanced structural analysis of HfTaO electrochemical memory devices using OVITO’s cluster detection algorithms to identify, characterize, and quantify conductive filament formation. Provides detailed connectivity analysis, size distribution characterization, and radial distribution function (RDF) analysis essential for understanding resistive switching mechanisms and device reliability in electrochemical memory applications.

Parameters:

filepath (str) – Path to LAMMPS trajectory file (.lammpstrj, .dump) containing atomic coordinates and species information. File must be compatible with OVITO import functionality and contain sufficient atomic data for cluster detection. Example: “filament_formation.lammpstrj”, “switching_cycle.dump”
z_filament_lower_limit (float, optional) – Lower z-coordinate boundary (Angstroms) defining the bottom electrode interface for filament connectivity analysis. Represents the lower threshold for complete electrode-to-electrode conductive bridge formation. Default: 5.0 Å. Typical range: 3-8 Å depending on device geometry.
z_filament_upper_limit (float, optional) – Upper z-coordinate boundary (Angstroms) defining the top electrode interface for filament connectivity analysis. Represents the upper threshold for complete conductive bridge formation. Default: 23.0 Å. Typical range: 20-30 Å depending on oxide thickness.
thickness (float, optional) – Characteristic filament thickness parameter (Angstroms) used for structural analysis and RDF calculations. Influences cluster size classification and connectivity determination. Default: 21.0 Å. Optimization range: 15-30 Å based on device dimensions.

Returns:

cluster_analysis_results – Comprehensive cluster analysis data containing connectivity metrics, size distributions, and structural characterization:

Temporal Information: - timestep (int): Simulation timestep from trajectory file header

Connectivity Analysis: - connection (int): Binary connectivity state (1=connected, 0=disconnected) - separation (float): Minimum inter-cluster distance (Å) when disconnected - gap (float): Z-direction gap between cluster boundaries (Å)

Lower Filament Characterization: - fil_size_down (int): Number of atoms in bottom filament cluster - fil_height (float): Maximum z-coordinate of bottom cluster (Å) - rdf_down (np.ndarray): Radial distribution function data (shape: [N_bins, 2])

Column 0: Radial distance (Å)

Column 1: RDF intensity

Upper Filament Characterization: - fil_size_up (int): Number of atoms in top filament cluster - fil_depth (float): Minimum z-coordinate of top cluster (Å) - rdf_up (np.ndarray): Upper cluster RDF data (shape: [N_bins, 2])

Return type:

tuple[int, int, int, float, np.ndarray, int, float, np.ndarray, float, float]

Raises:

FileNotFoundError – If the specified filepath does not exist or is not accessible to OVITO.
ValueError – If no metallic clusters are detected in the trajectory file or if file format is incompatible with OVITO cluster analysis pipelines.
TypeError – If input parameters have invalid types or values outside acceptable ranges.

Notes

OVITO Cluster Analysis Pipeline:

Multi-Pipeline Architecture: The function employs four specialized OVITO pipelines for comprehensive analysis:

Primary Cluster Pipeline: Largest cluster identification and analysis
Secondary Cluster Pipeline: Second-largest cluster characterization
Lower Filament Pipeline: Bottom electrode region analysis
Upper Filament Pipeline: Top electrode region analysis

Coordination Analysis Framework:

Cutoff Distance: 2.7 Å (first neighbor shell for metallic bonds)
Coordination Threshold: <6 (under-coordinated atoms for filament detection)
Clustering Cutoff: 3.9 Å (metallic cluster connectivity distance)

Species Selection Criteria: The analysis targets metallic and under-coordinated species: - Type 2: Hafnium atoms (matrix metal component) - Type 4, 8: Tantalum atoms (matrix metal component) - Type 8: Additional Ta metal species (device-specific) - Type 1, 3, 7: Oxygen atoms (matrix component) - Types 5, 6, 9, 10: Electrode species

Connectivity Determination Algorithm:

if z_min < z_lower_limit and z_max > z_upper_limit:
    connection_state = 1  # Continuous bridge formed
    separation = 0.0
    gap = 0.0
else:
    connection_state = 0  # Discontinuous filament
    separation = min_distance_between_clusters - cutoff_distance
    gap = z_direction_gap_between_clusters

Radial Distribution Function (RDF) Analysis: Characterizes local atomic structure and bonding environment

Performance Characteristics:

Processing Time: 5-30s per trajectory frame depending on atom count
Memory Usage: O(N_atoms × N_neighbors) for coordination analysis
Accuracy: Sub-angstrom precision for cluster boundary detection
Scalability: Efficient for trajectories up to ~10⁶ atoms per frame

Integration with LAMMPSKit Ecosystem:

File Compatibility: Direct integration with LAMMPS trajectory formats
Parameter Validation: Uses config module validation functions
Pipeline Integration: Compatible with track_filament_evolution() workflows
Data Export: Results integrate with time series plotting functions

Examples

Basic filament connectivity analysis:

>>> from lammpskit.ecellmodel.filament_layer_analysis import analyze_clusters
>>> # Analyze single trajectory frame for filament formation
>>> trajectory_file = "switching_process.lammpstrj"
>>>
>>> cluster_results = analyze_clusters(trajectory_file,
...                                   z_filament_lower_limit=5,
...                                   z_filament_upper_limit=23)
>>>
>>> # Extract connectivity information
>>> timestep, connection, size_down, height, rdf_down, size_up, depth, rdf_up, separation, gap = cluster_results
>>>
>>> print(f"Timestep: {timestep}")
>>> print(f"Filament Connected: {'Yes' if connection else 'No'}")
>>> if connection:
...     print(f"Complete bridge formed with {size_down + size_up} atoms")
... else:
...     print(f"Gap: {gap:.2f} Å, Separation: {separation:.2f} Å")

Device geometry optimization:

>>> # Parametric study of device geometry effects on connectivity
>>> trajectory_file = "device_optimization.lammpstrj"
>>>
>>> # Test different electrode separations
>>> geometries = [
...     {"lower": 3, "upper": 20, "name": "thin_oxide"},
...     {"lower": 5, "upper": 25, "name": "medium_oxide"},
...     {"lower": 7, "upper": 30, "name": "thick_oxide"}
... ]
>>>
>>> connectivity_results = {}
>>> for geom in geometries:
...     try:
...         results = analyze_clusters(trajectory_file,
...                                  z_filament_lower_limit=geom["lower"],
...                                  z_filament_upper_limit=geom["upper"])
...         connectivity_results[geom["name"]] = {
...             "connected": results[1],
...             "total_atoms": results[2] + results[5],
...             "gap": results[9] if not results[1] else 0.0
...         }
...         print(f"{geom['name']}: Connected={results[1]}, Atoms={results[2] + results[5]}")
...     except Exception as e:
...         print(f"Error analyzing {geom['name']}: {e}")
>>>
>>> print("Geometry optimization completed")

Detailed structural characterization:

>>> # Comprehensive filament structure analysis
>>> trajectory_file = "filament_structure.lammpstrj"
>>>
>>> cluster_data = analyze_clusters(trajectory_file, z_filament_lower_limit=4,
...                               z_filament_upper_limit=24, thickness=20)
>>>
>>> timestep, connection, size_down, height, rdf_down, size_up, depth, rdf_up, separation, gap = cluster_data
>>>
>>> # Analyze RDF characteristics
>>> rdf_distances = rdf_down[:, 0]  # Radial distances
>>> rdf_intensities = rdf_down[:, 1]  # RDF values
>>>
>>> # Find first coordination shell peak
>>> first_peak_idx = np.argmax(rdf_intensities[:50])  # Search first 5 Å
>>> first_peak_position = rdf_distances[first_peak_idx]
>>> first_peak_intensity = rdf_intensities[first_peak_idx]
>>>
>>> print(f"Structural Analysis Results:")
>>> print(f"  Lower cluster: {size_down} atoms, height: {height:.2f} Å")
>>> print(f"  Upper cluster: {size_up} atoms, depth: {depth:.2f} Å")
>>> print(f"  First RDF peak: {first_peak_position:.2f} Å (intensity: {first_peak_intensity:.3f})")
>>>
>>> if connection:
...     total_filament_size = size_down + size_up
...     filament_span = height - depth
...     print(f"  Complete filament: {total_filament_size} atoms spanning {filament_span:.2f} Å")
... else:
...     print(f"  Disconnected filament with {gap:.2f} Å gap")

Time series connectivity analysis:

>>> # Process multiple trajectory frames for evolution tracking
>>> import glob
>>> trajectory_files = sorted(glob.glob("time_series_*.lammpstrj"))
>>>
>>> connectivity_evolution = []
>>> filament_size_evolution = []
>>> gap_evolution = []
>>>
>>> for i, traj_file in enumerate(trajectory_files):
...     try:
...         results = analyze_clusters(traj_file, z_filament_lower_limit=5,
...                                  z_filament_upper_limit=23)
...
...         timestep, connection, size_down, height, rdf_down, size_up, depth, rdf_up, separation, gap = results
...
...         connectivity_evolution.append(connection)
...         filament_size_evolution.append(size_down + size_up)
...         gap_evolution.append(gap if not connection else 0.0)
...
...         if i % 10 == 0:  # Progress reporting
...             print(f"Processed frame {i+1}/{len(trajectory_files)}")
...
...     except Exception as e:
...         print(f"Error processing {traj_file}: {e}")
...         connectivity_evolution.append(0)
...         filament_size_evolution.append(0)
...         gap_evolution.append(float('inf'))
>>>
>>> # Analyze connectivity statistics
>>> connectivity_fraction = np.mean(connectivity_evolution)
>>> avg_connected_size = np.mean([size for size, conn in zip(filament_size_evolution, connectivity_evolution) if conn])
>>> avg_gap_when_disconnected = np.mean([gap for gap, conn in zip(gap_evolution, connectivity_evolution) if not conn and gap < float('inf')])
>>>
>>> print(f"Time Series Analysis Results:")
>>> print(f"  Connectivity fraction: {connectivity_fraction:.3f}")
>>> print(f"  Average connected filament size: {avg_connected_size:.1f} atoms")
>>> print(f"  Average gap when disconnected: {avg_gap_when_disconnected:.2f} Å")

Advanced RDF analysis and comparison:

>>> # Compare RDF characteristics between different states
>>> connected_file = "connected_state.lammpstrj"
>>> disconnected_file = "disconnected_state.lammpstrj"
>>>
>>> # Analyze connected state
>>> conn_results = analyze_clusters(connected_file)
>>> conn_rdf_down = conn_results[4]
>>>
>>> # Analyze disconnected state
>>> disconn_results = analyze_clusters(disconnected_file)
>>> disconn_rdf_down = disconn_results[4]
>>>
>>> # Compare structural order
>>> conn_first_peak = np.max(conn_rdf_down[:50, 1])
>>> disconn_first_peak = np.max(disconn_rdf_down[:50, 1])
>>>
>>> print(f"RDF Comparison:")
>>> print(f"  Connected state first peak intensity: {conn_first_peak:.3f}")
>>> print(f"  Disconnected state first peak intensity: {disconn_first_peak:.3f}")
>>> print(f"  Structural order change: {(conn_first_peak - disconn_first_peak)/disconn_first_peak*100:.1f}%")

Error handling and validation:

>>> # Robust cluster analysis with comprehensive error handling
>>> trajectory_files = ["test1.lammpstrj", "test2.lammpstrj", "test3.lammpstrj"]
>>>
>>> successful_analyses = []
>>> failed_analyses = []
>>>
>>> for traj_file in trajectory_files:
...     try:
...         # Validate file exists before processing
...         import os
...         if not os.path.exists(traj_file):
...             print(f"File not found: {traj_file}")
...             failed_analyses.append((traj_file, "FileNotFoundError"))
...             continue
...
...         # Perform cluster analysis
...         results = analyze_clusters(traj_file,
...                                  z_filament_lower_limit=5,
...                                  z_filament_upper_limit=23)
...
...         # Validate results
...         if len(results) == 10:
...             successful_analyses.append((traj_file, results))
...             print(f"Successfully analyzed: {traj_file}")
...         else:
...             failed_analyses.append((traj_file, "Invalid results format"))
...
...     except ValueError as e:
...         print(f"Cluster detection failed for {traj_file}: {e}")
...         failed_analyses.append((traj_file, f"ValueError: {e}"))
...     except Exception as e:
...         print(f"Unexpected error for {traj_file}: {e}")
...         failed_analyses.append((traj_file, f"Exception: {e}"))
>>>
>>> print(f"Analysis Summary:")
>>> print(f"  Successful: {len(successful_analyses)}")
>>> print(f"  Failed: {len(failed_analyses)}")

Integration with filament evolution tracking:

>>> # Use with track_filament_evolution for comprehensive time series analysis
>>> from lammpskit.ecellmodel.filament_layer_analysis import track_filament_evolution
>>>
>>> # Single-frame cluster analysis
>>> single_frame_file = "snapshot.lammpstrj"
>>> cluster_analysis = analyze_clusters(single_frame_file)
>>>
>>> print(f"Single Frame Analysis:")
>>> print(f"  Connectivity: {cluster_analysis[1]}")
>>> print(f"  Total atoms: {cluster_analysis[2] + cluster_analysis[5]}")
>>>
>>> # Multi-frame evolution tracking
>>> time_series_files = ["evolution.lammpstrj"]
>>> evolution_plots = track_filament_evolution(time_series_files, "evolution_study",
...                                          time_step=0.0002, dump_interval_steps=5000)
>>>
>>> print(f"Evolution tracking completed with {len(evolution_plots)} plots")