tardisbase.testing.regression_comparison.compare module
- class tardisbase.testing.regression_comparison.compare.ReferenceComparer(ref1_hash=None, ref2_hash=None, refpath1=None, refpath2=None, print_path=False, repo_path=None)
Bases: object
A class for comparing reference data between two regression data commits or direct paths.
This class provides functionality to compare HDF5 files, generate visualizations, and analyze differences between two regression data repo commits or direct directory paths. It supports directory comparison, HDF5 file analysis, and plot generation.
- Parameters:
ref1_hash (str, optional) – Git commit hash for the first reference dataset, by default None. Cannot be used together with refpath1.
ref2_hash (str, optional) – Git commit hash for the second reference dataset, by default None. Cannot be used together with refpath2.
refpath1 (str or Path, optional) – Direct path to the first reference directory, by default None. Cannot be used together with ref1_hash.
refpath2 (str or Path, optional) – Direct path to the second reference directory, by default None. Cannot be used together with ref2_hash.
print_path (bool, optional) – Whether to print file paths in comparison output, by default False.
repo_path (str or Path, optional) – Path to the repository containing reference data, by default None. If None, uses the path specified in CONFIG['compare_path']. Only used when using git hashes.
- Raises:
AssertionError – If neither git hashes nor direct paths are provided, or if both are provided.
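The mutual-exclusion rule described above can be illustrated with a minimal sketch. The helper name `check_reference_inputs` and its return values are hypothetical, chosen only for illustration; this is not the library's own validation code, only one reading of the documented behaviour.

```python
def check_reference_inputs(ref1_hash=None, ref2_hash=None, refpath1=None, refpath2=None):
    """Hypothetical helper mirroring the documented rule; not the library's code."""
    using_git = ref1_hash is not None or ref2_hash is not None
    using_paths = refpath1 is not None or refpath2 is not None

    # At least one mode must be selected ...
    assert using_git or using_paths, "provide git hashes or direct paths"
    # ... and the two modes must not be mixed.
    assert not (using_git and using_paths), "git hashes and direct paths cannot be combined"

    return "git" if using_git else "path"
```

Under this reading, passing only hashes selects git mode, passing only paths selects direct path mode, and any other combination raises AssertionError.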
- compare(print_diff=False)
Perform comparison between regression datasets.
This method executes the main comparison workflow, including optional directory difference printing and HDF5 file comparison. It updates the internal test_table_dict with comparison results.
- Parameters:
print_diff (bool, optional) – Whether to print detailed directory differences, by default False. If True, displays a tree-like view of file differences. Only available when using git hashes and both references are available.
- compare_hdf_files()
Discover and compare all HDF5 files in the reference directories.
This method recursively walks through the reference directories, identifies HDF5 files (.h5, .hdf5), and compares them. When both paths are available, it compares files that exist in both directories. When only one path is available, it lists all HDF5 files in that directory.
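The discovery step described above can be sketched with the standard library alone. The helpers `find_hdf_files` and `pair_hdf_files` are hypothetical names, and matching files by their path relative to each root is an assumption; the actual method may pair files differently.

```python
from pathlib import Path

HDF_SUFFIXES = {".h5", ".hdf5"}


def find_hdf_files(root):
    """Return HDF5 file paths under root, relative to root (hypothetical helper)."""
    root = Path(root)
    return {
        path.relative_to(root)
        for path in root.rglob("*")
        if path.is_file() and path.suffix in HDF_SUFFIXES
    }


def pair_hdf_files(ref1_path=None, ref2_path=None):
    """List files present in both trees, or all files of the only tree given."""
    if ref1_path is None and ref2_path is None:
        raise ValueError("at least one reference path is required")
    if ref1_path is not None and ref2_path is not None:
        # Both trees available: keep only files found in both.
        return sorted(find_hdf_files(ref1_path) & find_hdf_files(ref2_path))
    # Only one tree available: list everything it contains.
    only = ref1_path if ref1_path is not None else ref2_path
    return sorted(find_hdf_files(only))
```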
- compare_testspectrumsolver_hdf(custom_ref1_path=None, custom_ref2_path=None)
Perform comparison for TestSpectrumSolver HDF5 files.
- Parameters:
custom_ref1_path (str or Path, optional) – Custom path to the first TestSpectrumSolver.h5 file, by default None. If None, uses the standard path within ref1_path directory (git mode) or the direct ref1_path (direct path mode).
custom_ref2_path (str or Path, optional) – Custom path to the second TestSpectrumSolver.h5 file, by default None. If None, uses the standard path within ref2_path directory (git mode) or the direct ref2_path (direct path mode).
Notes
The method automatically creates visualization output directories when the SAVE_COMP_IMG environment variable is set to '1'. The comparison generates specialized plots tailored for spectrum solver data analysis.
Standard file paths (when custom paths are not provided):
- git mode: tardis/spectrum/tests/test_spectrum_solver/test_spectrum_solver/TestSpectrumSolver.h5
- direct path mode: uses ref1_path and ref2_path directly
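The path fallback described in these notes can be sketched as follows. The function `resolve_spectrum_solver_path` and its `git_mode` flag are hypothetical and exist only to illustrate the three cases; the relative path is the one quoted above.

```python
from pathlib import Path

# Standard location quoted in the notes above.
STANDARD_RELATIVE_PATH = Path(
    "tardis/spectrum/tests/test_spectrum_solver/test_spectrum_solver/TestSpectrumSolver.h5"
)


def resolve_spectrum_solver_path(ref_path, custom_path=None, git_mode=True):
    """Pick the file to compare (hypothetical helper, not the library's code)."""
    if custom_path is not None:
        # An explicit custom path always wins.
        return Path(custom_path)
    if git_mode:
        # Git mode: look inside the checked-out reference directory.
        return Path(ref_path) / STANDARD_RELATIVE_PATH
    # Direct path mode: the reference path is used as given.
    return Path(ref_path)
```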
- display_hdf_comparison_results()
Print a formatted summary of all HDF5 comparison results.
This method provides a comprehensive overview of comparison results for all HDF5 files that were analyzed, displaying key-value pairs for each file’s comparison statistics.
Notes
The output includes information such as:
Number of different keys
Identical keys count
Keys with same name but different data
File paths and reference key lists
- generate_graph(option)
Generate interactive visualizations of comparison results.
This method creates Plotly bar charts to visualize differences between reference datasets, supporting two types of comparisons: keys with same names but different data, and structural key differences.
- Parameters:
option (str) – Type of comparison to visualize. Must be one of:
- "different keys same name" : Shows keys with identical names but different data
- "different keys" : Shows structural differences (added/deleted keys)
- Returns:
Interactive Plotly figure showing the comparison results. Returns None if no data matches the specified option.
- Return type:
plotly.graph_objects.Figure or None
- Raises:
ValueError – If option is not one of the supported values.
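The option check can be illustrated with a short sketch. The constant `SUPPORTED_OPTIONS` and the helper `validate_graph_option` are hypothetical names; only the two option strings and the ValueError come from the description above.

```python
SUPPORTED_OPTIONS = ("different keys same name", "different keys")


def validate_graph_option(option):
    """Reject anything other than the two documented option strings."""
    if option not in SUPPORTED_OPTIONS:
        raise ValueError(
            f"option must be one of {SUPPORTED_OPTIONS}, got {option!r}"
        )
    return option
```

Because the method may also return None when no data matches, callers would typically check the result before displaying it.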
Notes
For "different keys same name" option:
- Bar colors represent relative difference magnitude using blue color scale
- Hover information includes maximum relative differences and percentages
- Handles NaN and infinite values gracefully
For "different keys" option:
- Green bars represent added keys
- Red bars represent deleted keys
- Random color variations within each category for distinction
If the environment variable SAVE_COMP_IMG is set to '1', the plot will be saved as a high-resolution PNG file in a comparison_plots directory.
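The environment-variable switch can be sketched as below. The helper `comparison_plot_dir` and its `base_dir` parameter are hypothetical; where the library actually places the comparison_plots directory is not stated here, so the location is an assumption.

```python
import os
from pathlib import Path


def comparison_plot_dir(base_dir="."):
    """Return the output directory if saving is enabled, else None (hypothetical helper)."""
    if os.environ.get("SAVE_COMP_IMG") != "1":
        return None
    # Assumption: the directory is created relative to base_dir.
    out_dir = Path(base_dir) / "comparison_plots"
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir
```

Saving can then be enabled for a session by setting SAVE_COMP_IMG=1 before running the comparison. Exporting a Plotly figure to PNG is usually done with `Figure.write_image`, which needs the kaleido package; whether the library uses exactly that call is not confirmed by this page.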
- get_temp_dir()
Get the temporary directory path used for file operations.
- Returns:
The temporary directory path managed by the file manager when using git, or None when using direct paths.
- Return type:
Path or None
- setup()
Set up all necessary components for reference comparison.
This method initializes the file manager (if using git), sets up reference files, creates analyzer and comparator instances, and establishes directory comparison objects. Must be called before performing any comparisons.
Notes
After calling this method, the following attributes will be available:
- ref1_path : Path to the first reference directory
- ref2_path : Path to the second reference directory
- dcmp : Directory comparison object
- file_setup : Configured FileSetup instance
- diff_analyzer : Configured DiffAnalyzer instance
- hdf_comparator : Configured HDFComparator instance
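A typical call order, based only on the methods documented on this page, might look like the sketch below. The wrapper `run_comparison` is a hypothetical convenience function, and the import path is taken from the module name above; the import is deferred so the sketch can be loaded without tardisbase installed.

```python
def run_comparison(refpath1, refpath2):
    """Compare two reference directories in direct path mode (illustrative wrapper)."""
    # Imported lazily; requires tardisbase to be installed when called.
    from tardisbase.testing.regression_comparison.compare import ReferenceComparer

    comparer = ReferenceComparer(refpath1=refpath1, refpath2=refpath2)
    comparer.setup()  # must be called before any comparison
    comparer.compare(print_diff=False)  # print_diff is documented as git-hash only
    comparer.display_hdf_comparison_results()
    # May return None if no data matches the option.
    return comparer.generate_graph("different keys")
```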
- summarise_changes_hdf(name, path1, path2)
Analyze and store changes for a specific HDF5 file pair.
This method performs detailed comparison of an HDF5 file between two reference directories and stores the results in the internal test_table_dict.
- Parameters:
Notes
The results are stored in test_table_dict[name] and include:
- Relative path information (when using git)
- All comparison results from HDFComparator
- Lists of keys from both reference files
- Summary statistics about differences
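The kind of per-file summary described above can be sketched with plain dictionaries standing in for HDF5 contents. The function `summarise_keys` and every field name in its result are hypothetical, as is the convention that "added" means present only in the second reference; the real entries in test_table_dict come from HDFComparator and may be named and structured differently.

```python
def summarise_keys(data1, data2):
    """Summarise key-level differences between two mappings (illustrative only)."""
    keys1, keys2 = set(data1), set(data2)
    shared = keys1 & keys2
    # Keys present in both references whose data differ.
    changed = sorted(key for key in shared if data1[key] != data2[key])
    return {
        "ref1_keys": sorted(keys1),
        "ref2_keys": sorted(keys2),
        "added_keys": sorted(keys2 - keys1),    # assumption: only in the second reference
        "deleted_keys": sorted(keys1 - keys2),  # assumption: only in the first reference
        "identical_keys": len(shared) - len(changed),
        "identical_name_different_data": changed,
    }


# Results are stored per file name, as with test_table_dict[name].
test_table_dict = {}
test_table_dict["example.h5"] = summarise_keys(
    {"/a": [1.0, 2.0], "/b": [3.0], "/c": [5.0]},
    {"/a": [1.0, 2.0], "/b": [4.0], "/d": [6.0]},
)
```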