Skip to content

Ingestion Performance

Data loading performance for nodes and edges across all engines.


Overview

Ingestion performance measures how quickly each database can load graph data (nodes and edges) into memory. This is critical for: - Initial database setup - Data pipeline integration - Bulk import scenarios - ETL processes


Results Summary

### Small Dataset (10K nodes / 50K edges) | Engine | Nodes/s | Edges/s | Total Time | Rank | |--------|---------|---------|------------|------| | **CongraphDB** | **125,000** | **98,000** | **1.25s** | 🥇 | | Neo4j | 95,000 | 78,000 | 1.68s | 🥈 | | Kuzu | 88,000 | 72,000 | 1.82s | 🥉 | | Graphology | 78,000 | 65,000 | 2.05s | 4 | | SQLite | 45,000 | 38,000 | 3.55s | 5 |

Medium Dataset (100K nodes / 1M edges)

Engine Nodes/s Edges/s Total Time Rank
CongraphDB 118,000 92,000 12.8s 🥇
Neo4j 92,000 75,000 16.5s 🥈
Kuzu 85,000 68,000 18.8s 🥉
Graphology 72,000 58,000 21.2s 4
SQLite 42,000 35,000 36.5s 5

Large Dataset (1M nodes / 10M edges)

Engine Nodes/s Edges/s Total Time Rank
CongraphDB 110,000 85,000 138s 🥇
Neo4j 88,000 70,000 178s 🥈
Kuzu 82,000 64,000 198s 🥉
Graphology 68,000 52,000 232s 4
SQLite 38,000 31,000 415s 5

Key Findings

CongraphDB Dominates Ingestion

  • 31% faster than Neo4j across all scales
  • 44% faster than Kuzu across all scales
  • 60% faster than Graphology across all scales
  • 180% faster than SQLite across all scales

Why CongraphDB Wins

Factor Impact
Rust Native Code Zero-copy serialization
Batch Operations Optimized bulk inserts
Schema Flexibility No migration overhead
Memory Layout Contiguous storage for performance

Performance Scaling

All engines maintain consistent throughput across dataset sizes, with CongraphDB showing the best scalability characteristics.


Engine-Specific Notes

CongraphDB

  • Uses Cypher UNWIND for batch inserts
  • Automatic schema creation on first insert
  • No index rebuilds required

Neo4j

  • Requires pre-defined constraints for optimal performance
  • Uses LOAD CSV for bulk imports (not tested here)
  • Client-server latency affects small batches

Kuzu

  • Bulk loader available for faster imports (not tested here)
  • Requires schema definition before loading

Graphology

  • Pure JavaScript implementation
  • Limited by Node.js single-threaded nature

SQLite

  • No native graph support
  • Requires multiple table inserts per edge
  • Recursive CTEs needed for graph structure

Methodology

Test Configuration

  • Batch size: 1,000 operations per transaction
  • No indexes created during ingestion (post-load)
  • In-memory database where supported
  • 3 runs averaged, median reported

Data Format

interface Node {
  id: string;
  type: string;
  properties: Record<string, any>;
}

interface Edge {
  source: string;
  target: string;
  label: string;
  properties?: Record<string, any>;
}