tardisbase.testing.regression_comparison.analyzers module

class tardisbase.testing.regression_comparison.analyzers.DiffAnalyzer(file_manager)

Bases: object

A class for analyzing and displaying differences between directory structures.

This class provides methods to visualize directory differences using tree-like displays with colored output and detailed file comparison reports.

Parameters:

file_manager (tardisbase.testing.regression_comparison.file_manager.FileManager) – A file manager object that handles file operations and provides access to temporary directory paths.

file_manager

The file manager instance used for path operations.

Type:

object

display_diff_tree(dcmp, prefix='')

Display a tree-like visualization of directory differences.

This method recursively traverses directory comparison objects and displays added, removed, and modified files/directories using colored symbols in a tree structure.

Parameters:
  • dcmp (filecmp.dircmp) – A directory comparison object from the filecmp module containing the comparison results between two directories.

  • prefix (str, optional) – String prefix for indentation in the tree display, by default ''. Used internally by recursive calls to maintain proper indentation.

Notes

Uses the following symbols:

  • ‘−’ (red) for items only in the left directory

  • ‘+’ (green) for items only in the right directory

  • ‘✱’ (yellow) for files that differ between directories

  • ‘├’ (blue) for common subdirectories

  • ‘│ ’ for tree indentation in subdirectories
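The traversal described above can be sketched with the standard-library filecmp.dircmp attributes (left_only, right_only, diff_files, subdirs). This is a minimal illustration, not the DiffAnalyzer implementation: the helper names sketch_diff_tree, _write and demo_tree are hypothetical, colors are omitted, and the lines are returned rather than printed so they are easy to inspect.

```python
import filecmp
import os
import tempfile


def sketch_diff_tree(dcmp, prefix=""):
    """Return plain-text tree lines for a filecmp.dircmp object (hypothetical helper)."""
    lines = []
    for name in sorted(dcmp.left_only):    # present only in the left directory
        lines.append(f"{prefix}− {name}")
    for name in sorted(dcmp.right_only):   # present only in the right directory
        lines.append(f"{prefix}+ {name}")
    for name in sorted(dcmp.diff_files):   # present in both, contents differ
        lines.append(f"{prefix}✱ {name}")
    for name, sub_dcmp in sorted(dcmp.subdirs.items()):  # common subdirectories
        lines.append(f"{prefix}├ {name}")
        # Recurse with a longer prefix so nested entries are indented.
        lines.extend(sketch_diff_tree(sub_dcmp, prefix + "│ "))
    return lines


def _write(path, text):
    """Create parent directories as needed and write a small text file."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(text)


def demo_tree():
    """Build two small directory trees and return their diff tree lines."""
    with tempfile.TemporaryDirectory() as root:
        left = os.path.join(root, "left")
        right = os.path.join(root, "right")
        _write(os.path.join(left, "removed.txt"), "old")
        _write(os.path.join(right, "added.txt"), "new")
        # Same name on both sides, different size, so dircmp reports a difference.
        _write(os.path.join(left, "sub", "data.txt"), "a")
        _write(os.path.join(right, "sub", "data.txt"), "bbb")
        return sketch_diff_tree(filecmp.dircmp(left, right))
```

Calling print("\n".join(demo_tree())) shows one line per differing item, with the entry under the common subdirectory indented by the ‘│ ’ prefix.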

print_diff_files(dcmp)

Print detailed information about file differences between directories.

Parameters:

dcmp (filecmp.dircmp) – A directory comparison object containing the results of comparing two directory structures.
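The source does not say which details print_diff_files reports. As a rough sketch of how differing files can be located in a dircmp object, the hypothetical helper collect_diff_files below walks the comparison recursively and gathers the left and right paths of every file listed in diff_files. It uses only the standard library and is not the method's actual implementation.

```python
import filecmp
import os
import tempfile


def collect_diff_files(dcmp):
    """Recursively gather (left_path, right_path) pairs for files that differ."""
    pairs = [
        (os.path.join(dcmp.left, name), os.path.join(dcmp.right, name))
        for name in sorted(dcmp.diff_files)
    ]
    for sub_dcmp in dcmp.subdirs.values():  # descend into common subdirectories
        pairs.extend(collect_diff_files(sub_dcmp))
    return pairs


def demo_diff_files():
    """Compare two tiny trees and return differing paths relative to the temp root."""
    with tempfile.TemporaryDirectory() as root:
        for side, text in (("left", "1"), ("right", "22")):
            folder = os.path.join(root, side, "nested")
            os.makedirs(folder)
            with open(os.path.join(folder, "values.txt"), "w") as handle:
                handle.write(text)
        dcmp = filecmp.dircmp(os.path.join(root, "left"), os.path.join(root, "right"))
        return [
            (os.path.relpath(a, root), os.path.relpath(b, root))
            for a, b in collect_diff_files(dcmp)
        ]
```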

class tardisbase.testing.regression_comparison.analyzers.HDFComparator(print_path=False)

Bases: object

A class for comparing HDF5 files and analyzing differences between datasets.

This class provides functionality to compare HDF5 files, identify differences in keys and data, and display statistical summaries and visualizations of the differences found.

Parameters:

print_path (bool, optional) – Whether to print file paths in the output, by default False.

summarise_changes_hdf(name, path1, path2)

Compare two HDF5 files and summarize the differences between them.

This method performs a comparison of HDF5 files, analyzing both structural differences (different keys) and data differences (same keys with different values).

Parameters:
  • name (str) – The name of the HDF5 file to compare (should exist in both paths).

  • path1 (str or Path) – Path to the first directory containing the HDF5 file.

  • path2 (str or Path) – Path to the second directory containing the HDF5 file.

Returns:

A dictionary containing comparison results if differences are found:

  • ‘different_keys’ (int) – Number of keys that differ between the files.

  • ‘identical_keys’ (int) – Number of keys that are completely identical.

  • ‘identical_keys_diff_data’ (int) – Number of keys with the same name but different data.

  • ‘identical_name_different_data_dfs’ (dict) – Dictionary mapping key names to difference DataFrames.

  • ‘ref1_keys’ (list) – List of all keys in the first file.

  • ‘ref2_keys’ (list) – List of all keys in the second file.

  • ‘added_keys’ (list) – Keys present only in the second file.

  • ‘deleted_keys’ (list) – Keys present only in the first file.

Returns None if no differences are found.

Return type:

dict or None
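The key-related entries of the returned dictionary follow from plain set arithmetic on the two key lists. The sketch below assumes the keys are already available as lists of strings. The helper name summarise_keys and the ‘common_keys’ entry are hypothetical and not part of the documented return value, and how the method itself derives the counts is not stated in the source.

```python
def summarise_keys(ref1_keys, ref2_keys):
    """Classify HDF keys by which file they appear in (hypothetical helper)."""
    ref1, ref2 = set(ref1_keys), set(ref2_keys)
    return {
        "ref1_keys": sorted(ref1),
        "ref2_keys": sorted(ref2),
        "added_keys": sorted(ref2 - ref1),    # present only in the second file
        "deleted_keys": sorted(ref1 - ref2),  # present only in the first file
        "common_keys": sorted(ref1 & ref2),   # illustrative extra, not a documented key
    }


summary = summarise_keys(["/plasma/a", "/plasma/b"], ["/plasma/b", "/plasma/c"])
```

Here summary["added_keys"] holds the key found only in the second list and summary["deleted_keys"] the key found only in the first.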

Notes

The method prints detailed summaries and visualizations when differences are detected. For data differences, it calculates relative differences as (ref1 - ref2) / ref1 and displays heatmaps.
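The relative difference (ref1 - ref2) / ref1 can be illustrated element by element on plain Python sequences. The helper relative_difference is hypothetical, and returning NaN where ref1 is zero is an assumption of this sketch; the source does not say how the method treats zeros in the first file.

```python
import math


def relative_difference(ref1, ref2):
    """Element-wise (ref1 - ref2) / ref1 for two equal-length sequences of numbers."""
    if len(ref1) != len(ref2):
        raise ValueError("inputs must have the same length")
    result = []
    for a, b in zip(ref1, ref2):
        # Assumption of this sketch: a zero reference value yields NaN
        # instead of raising ZeroDivisionError.
        result.append((a - b) / a if a != 0 else math.nan)
    return result


rel = relative_difference([2.0, 4.0], [1.0, 5.0])
```

With these inputs the first element is positive (the first file's value is larger) and the second is negative (the second file's value is larger).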