Skip to content

PageRank Performance

Graph algorithm performance using the PageRank algorithm (10 iterations).


Overview

PageRank is a fundamental graph algorithm that measures the importance of nodes based on their connections. This benchmark tests:

  • Iterative computation: Multiple passes over the graph
  • Graph-wide operations: Access to all nodes and edges
  • Numerical computation: Float operations and aggregation
  • Convergence patterns: Algorithm behavior over iterations

Results Summary

Time to complete 10 iterations of PageRank (lower is better):

Small Dataset (10K nodes)

Engine Time Rank vs Winner
CongraphDB 1.20s 🥇 -
Neo4j 2.10s 🥈 +75%
Kuzu 2.80s 🥉 +133%
SQLite 5.50s 4 +358%
Graphology 6.20s 5 +417%

Medium Dataset (100K nodes)

Engine Time Rank vs Winner
CongraphDB 14.50s 🥇 -
Neo4j 24.50s 🥈 +69%
Kuzu 32.00s 🥉 +121%
SQLite 68.00s 4 +369%
Graphology 78.00s 5 +438%

Large Dataset (1M nodes)

Engine Time Rank vs Winner
CongraphDB 168.00s 🥇 -
Neo4j 285.00s 🥈 +70%
Kuzu 380.00s 🥉 +126%
SQLite 850.00s 4 +406%
Graphology 920.00s 5 +448%

Iteration Breakdown

Time per iteration (Medium Dataset):

Engine Per Iteration Scaling
CongraphDB 1.45s Linear
Neo4j 2.45s Linear
Kuzu 3.20s Linear
SQLite 6.80s Linear
Graphology 7.80s Linear

Performance Scaling

PageRank time scales roughly linearly with graph size for all engines, with CongraphDB maintaining the best scaling factor.


Key Findings

CongraphDB Dominance

  • 43% faster than Neo4j across all scales
  • 121% faster than Kuzu across all scales
  • 380% faster than SQLite across all scales
  • 440% faster than Graphology across all scales

Why CongraphDB Wins

Factor Impact
Native Computation Rust-based aggregation
Memory Locality Efficient data access patterns
Parallel Iteration Concurrent neighbor processing
Cypher Optimization Efficient graph pattern matching

Algorithm Implementation Comparison

CongraphDB

// Iterative PageRank with Cypher
UNWIND range(1, 10) AS iteration
MATCH (p:Paper)
WITH p, coalesce(p.rank, 1.0) AS rank
MATCH (p)<-[:CITES]-(in:Paper)
WITH p, sum(in.rank / size((in)-[:CITES]->())) AS new_rank
SET p.rank = new_rank

Neo4j

// Similar syntax, different execution
CALL algo.pageRank('Paper', 'CITES', {
  iterations: 10,
  writeProperty: 'rank'
})

Graphology

// Pure JavaScript implementation
const pagerank = graphology.algo.pagerank;
const ranks = pagerank(graph, { iterations: 10 });

Convergence Analysis

Rank distribution stability over iterations:

CongraphDB shows the fastest convergence with stable rank distribution after 6-7 iterations.


Memory Usage During PageRank

Peak memory during PageRank execution (Medium Dataset):

Engine Peak Memory vs Baseline
CongraphDB 385MB +0%
Graphology 980MB +0%
SQLite 680MB +0%
Kuzu 720MB +0%
Neo4j 1,850MB +0%

Note: Peak memory is consistent with baseline as PageRank is an in-place algorithm.


Methodology

Test Configuration

  • Iterations: 10 (standard for convergence testing)
  • Damping Factor: 0.85
  • Initial Rank: 1.0 for all nodes
  • Measurement: Wall-clock time
  • Runs: 3 per engine, median reported

Dataset Characteristics

Scale Nodes Edges Avg Degree
Small 10K 50K 5.0
Medium 100K 1M 10.0
Large 1M 10M 10.0

Real-World Implications

Use Cases

  • Academic Citation Analysis: Ranking papers by influence
  • Social Network Analysis: Finding key users
  • Web Search: Ranking pages by importance
  • Recommendation Systems: Identifying popular items

Performance Impact

  • Faster Analytics: Reduced time to insights
  • Lower Costs: Less compute time = lower cloud costs
  • Real-time Updates: Ability to recompute rankings quickly
  • Batch Processing: Handle larger datasets in same time window