Nauty Algorithm Overview and Implementation Examples

Machine Learning Natural Language Processing Artificial Intelligence Digital Transformation Semantic Web Knowledge Information Processing Graph Data Algorithm Relational Data Learning Recommend Technology Python Time Series Data Analysis Navigation of this blog

Overview of the Nauty Algorithm

Nauty (short for No AUTomorphisms, Yes?) is a high-performance algorithm designed for efficiently handling graph isomorphism testing and canonical labeling of graphs.

Nauty serves as a powerful tool for analyzing the structural properties of graphs and provides the following three core functionalities:

Graph Isomorphism Testing:
Nauty rapidly determines whether two graphs are structurally identical (i.e., isomorphic). This enables accurate identification of graphs that may appear different but have the same structure.
Canonical Labeling:
It transforms any given graph into a unique canonical form, which facilitates standardized comparison and storage. This is particularly useful for detecting duplicates or performing structural classification in databases.
Automorphism Group Computation:
Nauty enumerates all the symmetry-preserving permutations of a graph’s nodes, thereby computing the automorphism group. This function is essential for symmetry analysis in applications such as chemical informatics and combinatorial optimization.

As such, Nauty provides a robust framework for structure recognition, normalization, and symmetry analysis in graphs.

Basic Algorithmic Workflow

The core steps of the Nauty algorithm include:

Partition Refinement:
Initially, the graph’s nodes are grouped into subsets (called cells) based on structural properties such as degree and adjacency. These partitions are progressively refined to distinguish nodes more precisely.
Search Tree Construction:
A search tree is built based on the refined partitions, exploring possible labelings of the graph in a depth-first manner.
Pruning:
When a structural configuration that has already been explored (i.e., an isomorphic form) is encountered, further exploration of that branch is pruned. This significantly reduces redundant computation.
Canonical Form Selection:
Among all possible labelings, the one with the lexicographically smallest order is chosen as the canonical form. This ensures consistent comparison and storage.

These steps allow Nauty to handle even complex and large-scale graphs with high speed and accuracy.

Comparison with VF2 Algorithm

Both Nauty and VF2 described in “VF2 Algorithm Overview and Implementation Examples“ are widely used algorithms for graph isomorphism, but they differ significantly in their design goals and application domains:

Problem Focus:
- Nauty focuses on complete isomorphism detection and also supports canonical labeling and automorphism group computation.
- VF2 supports both full and subgraph isomorphism, making it suitable for pattern matching tasks.
Pruning Strategy:
- Nauty employs strong symmetry-based pruning using group theory.
- VF2 uses state-space exploration with fast local feasibility checks.
Canonical Labeling:
- A central feature in Nauty; it outputs a unique representation of the graph.
- Not supported in VF2.
Support for Attributed Graphs:
- Nauty does not natively support node or edge labels; for such tasks, Traces, a derivative of Nauty, is recommended.
- VF2 can handle labeled graphs and perform attribute-aware matching.
Performance:
- Nauty is extremely fast for small to medium-sized unlabeled graphs, especially when canonicalization is required.
- VF2 is generally fast but may be less efficient for complex, labeled graphs with intricate matching rules.

In summary, Nauty is ideal for canonical form generation and symmetry analysis, while VF2 is better suited for flexible substructure matching involving node and edge labels.

Computational Complexity

Theoretically, Nauty may require exponential time in the worst case, as the graph isomorphism problem is known to be in NP but is neither proven to be NP-complete nor in P.

However, in practice, Nauty performs extremely efficiently. Thanks to partition refinement and symmetry-based pruning, it dramatically reduces computation time for most real-world graph structures.

As a result, Nauty can handle graphs with thousands to tens of thousands of nodes on modern hardware, often achieving near-linear or low-polynomial runtime for typical inputs. This makes it a widely trusted tool in both academic and industrial applications.

Use Cases Where Nauty Excels

Graph Isomorphism Filtering at Scale:
Nauty excels at filtering out duplicate graphs from large datasets by converting each graph to its canonical form, allowing for rapid deduplication.
Structural Classification and Enumeration:
Ideal for tasks like generating all non-isomorphic graphs with a given number of nodes. Nauty ensures efficient exploration without redundancy through canonical labeling.
Graph ID Generation via Canonical Forms:
By assigning a unique canonical representation to each graph, Nauty enables consistent and compact identification. This is especially useful for indexing, retrieval, and structural comparison.

Thanks to its precision and speed, Nauty is highly effective for structure-based graph analysis and is widely used in fields such as chemistry (e.g., molecular structure comparison), knowledge graphs, combinatorics, and graph databases, where exact structural equivalence matters.

Related Algorithms to Nauty

The Nauty algorithm is a powerful tool for graph isomorphism testing, canonical labeling, and automorphism group computation. The following categorization introduces related algorithms across multiple dimensions:

1. Graph Isomorphism Algorithms

Graph isomorphism checks whether two graphs are structurally identical. It is a deep problem in both theory and implementation. Below are representative algorithms:

Algorithm	Characteristics	Notes
VF2 / VF3	State-space search, supports subgraph isomorphism	Fast in practice, supports labeled graphs
Ullmann’s Algorithm	One of the earliest search-based methods	Weak pruning, inefficient for practical use
Weisfeiler–Lehman (WL) Test	Label propagation-based approximate isomorphism test	~O(n log n); widely used in GNNs
Babai’s GI Algorithm (2015)	Demonstrates that GI can be solved in quasipolynomial time	Theoretical breakthrough; complex to implement

2. Canonical Labeling Algorithms

Canonical labeling gives each graph a unique representation (canonical form), essential for duplication detection and structural identification.

Algorithm	Description	Notes
Nauty	Most widely used canonical labeling algorithm	Very fast; strong on sparse graphs
Traces	Nauty’s successor with improved search order	Handles labeled graphs more effectively
Bliss	Scalable, based on group theory	Strong for large directed graphs
Saucy	Specialized in symmetry detection	Fast, widely used in SAT/SMT solving; not ideal for canonical labeling

3. Automorphism Group Computation Algorithms

An automorphism group enumerates all symmetry-preserving permutations of nodes. This is crucial for symmetry analysis and structural simplification.

Algorithm	Description	Tools
Nauty	Computes automorphism groups during isomorphism testing	Available via the `dreadnaut` tool
Bliss	Efficiently computes automorphism groups using Cayley graphs	Widely used in graph generation pipelines
SBPs (Symmetry Breaking Predicates)	Used with SAT solvers to break symmetries in models	Integrated into verification/model checking tools

4. Application-Specific Algorithms

Graph isomorphism and similarity detection are applied in domain-specific contexts with customized algorithms.

Domain	Tools / Algorithms	Notes
Chemical Structure Analysis	RDKit (substructure search using VF2)	Subgraph isomorphism is more relevant than full isomorphism
Software Verification	Symmetry Reduction + Saucy	Used to reduce state space in model checking
Knowledge Graphs (RDF)	Semantic Graph Matching + WL test	Considers both structural and semantic similarity

5. Algorithm Selection Guide (Task-Oriented)

Choosing the right algorithm depends on the analysis task. The following table summarizes recommended algorithms by use case:

Task	Recommended Algorithms	Reason
Exact Isomorphism Testing	Nauty / Traces	Fast with canonical labeling support
Labeled Graph Isomorphism	VF2 / Traces	Flexible support for node/edge labels
Substructure Search	VF2 / Ullmann	Effective for pattern mining tasks
Approximate Similarity Detection	Weisfeiler–Lehman	Ideal for machine learning and GNN-based models
Symmetry Analysis	Nauty / Bliss / Saucy	For extracting and simplifying automorphism groups

These algorithms provide a rich toolkit for graph analysis, with Nauty excelling in canonicalization and structural deduplication, and others offering flexibility for attributes, scalability, or approximate matching depending on the use case.

Practical Implementation Examples of the Nauty Algorithm

The following presents practical use cases of the Nauty algorithm, demonstrating how to perform graph isomorphism testing and canonical labeling to uniquely represent graph structures. While Nauty is originally implemented in C, there are two main ways to use it from Python:

Method 1: Using Nauty via SageMath (Python-like interface)

SageMath internally incorporates Nauty, allowing graph isomorphism and canonical labeling through a Python-like syntax.

1. Example: Graph Isomorphism Testing

# Run in SageMath environment or Jupyter with SageMath kernel

G1 = Graph([(1, 2), (2, 3), (3, 1)])  # Triangle
G2 = Graph([(4, 5), (5, 6), (6, 4)])  # Same structure, different labels

# Isomorphism test (uses Nauty internally)
print("Are they isomorphic?", G1.is_isomorphic(G2))  # True

2. Example: Canonical Labeling for Unique Representation

G = Graph([(3, 1), (1, 2), (2, 3)])
canon = G.canonical_label()
canon.show()  # Label order is normalized despite original structure

Method 2: Nauty Core Tools via Command Line (labelg, dreadnaut)

Nauty provides several command-line tools that can be used directly.

1. Graph Input in graph6 Format

Save the graph as a graph6 string in a file called graph1.g6:

DQK # Represents a triangle in graph6 encoding

2. Canonical Label Generation

labelg graph1.g6 -o > canon1.g6

labelg: Nauty tool for canonical labeling
-o: Output to stdout

3. Isomorphism Checking Between Multiple Graphs

shortg graphA.g6 graphB.g6

If the graphs are isomorphic, only one canonical version is output—effectively performing an isomorphism filter.

Example Use Cases

Field	Description	Role of Nauty
Graph Databases	Deduplication and compression of graph structures	Detect isomorphism using canonical labels
Scientific Research (Chemistry)	Molecular structure identification and ID generation	Canonical form for molecular IDs (e.g., normalized formulas)
Graph Enumeration	Enumerating all non-isomorphic graphs	Used with `geng` and `shortg`
SAT/SMT Verification	Symmetry elimination in model checking	Structural analysis using `saucy` or `nauty`
Knowledge Graphs	Structural comparison and subgraph matching	Detecting subgraph isomorphism

Use Case Example: Non-Isomorphic Graph Generation and Classification

geng 4 -c | shortg > unique_graphs.g6

geng 4 -c: Generates all connected graphs with 4 nodes
shortg: Filters out isomorphic duplicates using Nauty core logic

Related Tool Summary

Tool	Purpose	Output Format
`geng`	Generate graphs	`graph6`
`labelg`	Canonical labeling	`graph6`
`shortg`	Unify isomorphic graphs	`graph6`
`dreadnaut`	Interactive interface for custom operations	text/commands

Bonus: Python + Nauty (Advanced Integration)

Advanced workflows can integrate Nauty with Python frameworks such as:

SageMath
Graphillion
igraph
pygraphblas

These enable fast and flexible graph canonicalization in Python-based environments while leveraging Nauty’s core performance.

This practical workflow illustrates Nauty’s versatility in both research and applied graph analytics, especially where graph uniqueness, symmetry analysis, and structural comparison are critical.

Real-World Applications of the Nauty Algorithm

The Nauty algorithm is widely applied in diverse domains where structural graph equivalence and canonical labeling are essential. Below are concrete application examples across different fields:

1. Chemical Structure Analysis (Molecular Graph Isomorphism)

Overview:
Molecules are modeled as graphs where atoms = nodes and bonds = edges. Even if structural formulas differ, it is necessary to determine whether two molecules are actually the same (structural isomorphism).

Role of Nauty:
Canonical labeling is used to assign a unique structural representation to each molecular graph, enabling:

- Identification of isomers and chemically identical structures with different representations
- Duplicate removal in compound databases

Examples:

- Preprocessing and normalization in databases such as ChEMBL, PubChem, and ChemSpider
- Canonical SMILES or InChI generation (note: InChI doesn’t use Nauty but similar logic applies)

2. Graph Theory and Mathematical Research (Graph Classification)

Overview:
Enumerating, classifying, and visualizing all non-isomorphic graphs with n vertices is a common task in mathematical research and education.

Role of Nauty:
By using tools like geng and shortg, Nauty efficiently generates and filters all non-isomorphic graphs.

Examples:

- Academic works or teaching materials visualizing “All non-isomorphic graphs with n ≤ 10”
- Structural enumeration in Ramsey theory, graph coloring, and related combinatorics

3. SAT/SMT Model Checking and Theorem Proving (Symmetry Reduction)

Overview:
In model checking, computational overhead is reduced by eliminating structurally equivalent states in the state space.

Role of Nauty:

- Represents model states as graphs
- Detects and collapses isomorphic states to avoid redundancy
- Helps prevent state space explosion

Examples:

- Used in tools like NuSMV, Cadence SMV, and Symmetry-Aware SAT Solvers
- Integrated with Nauty or Saucy to extract automorphism groups as preprocessing

4. Database Indexing and Deduplication

Overview:
In graph-structured databases (e.g., knowledge graphs, social networks), it is often necessary to unify structurally identical entities.

Role of Nauty:

- Canonical labeling provides a unique structural hash or ID
- Enables rapid structural comparisons and indexing

Examples:

- Preprocessing entity graphs before storing them in Neo4j
- Structural normalization of RDF graphs
- Comparing and archiving graphs in research datasets in a consistent format

5. Machine Learning and GNN Preprocessing

Overview:
In graph neural networks (GNNs), it’s crucial to prevent the model from misclassifying isomorphic graphs as different classes.

Role of Nauty:

- Provides consistent input orderings and node IDs for isomorphic graphs
- Improves training stability by eliminating redundant representations
- Prevents data leakage in graph classification tasks

Examples:

- Preprocessing steps in TUDataset and OGB (Open Graph Benchmark)
- Experimental setups in papers like “Graph Isomorphism Networks (GIN)”

These applications demonstrate how Nauty is central to tasks requiring structural consistency, deduplication, and symmetry analysis in both theoretical and applied settings, especially in chemistry, formal verification, and machine learning.

References

1. Foundational Papers and Developer Publications

McKay, Brendan D. (1981)
Practical Graph Isomorphism – The original Nauty paper
McKay, B. D., & Piperno, A. (2014)
Practical Graph Isomorphism II – Introduces the Traces algorithm (a successor to Nauty)

2. Standard Textbooks and Theoretical Foundations

Handbook of Graph Theory
Authors: Jonathan Gross & Jay Yellen (CRC Press)
→ Covers the theoretical background of Nauty and canonical labeling techniques
Graph Theory with Applications
Authors: J. A. Bondy and U. S. R. Murty
→ Comprehensive coverage of the theoretical foundations of graph isomorphism

3. Implementations and Documentation Resources

Nauty/Traces Official Website (Brendan McKay)
🔗 https://users.cecs.anu.edu.au/~bdm/nauty/
Included tools:
- geng: Generate non-isomorphic graphs
- shortg: Compress and filter isomorphic graphs
- labelg: Perform canonical labeling
- dreadnaut: Interactive command-line interface
- Traces: Enhanced version of Nauty for labeled graphs
SageMath Documentation (uses Nauty internally)
🔗 Sage Graphs API
→ Functions such as is_isomorphic() and canonical_label() leverage Nauty

4. Applied and Experimental Research Papers

Junttila, T., & Kaski, P. (2007)
Engineering an Efficient Canonical Labeling Tool for Large and Sparse Graphs
→ Empirical performance analysis of Nauty, comparison with Bliss
Babai, László (2016)
Graph Isomorphism in Quasipolynomial Time
→ A major theoretical breakthrough in the complexity of the graph isomorphism problem

5. Comparative Studies with Related Algorithms

Bliss vs. Nauty: Junttila & Kaski (2007) – Performance benchmarking of canonical labeling tools
Traces vs. Nauty: McKay & Piperno (2014) – Design and enhancements of the Traces algorithm
Saucy vs. Nauty: Comparative studies in SAT/SMT symmetry reduction literature
VF2 vs. Nauty: Cordella et al. (2001) and various benchmark evaluations comparing subgraph and full isomorphism techniques

This reference list provides a well-rounded foundation for understanding both the theoretical and practical dimensions of Nauty and its role in graph isomorphism, labeling, and symmetry analysis.