Why cegraph?

Minimal. Fast. Correct.

cegraph was built with three guiding principles:

  • One job, done well. cegraph focuses exclusively on causal effect graphs — no data frames, no machine learning pipelines, no plotting utilities. This focus keeps the codebase small (under 500 lines of core logic) and easy to audit.
  • Performance by design. All graph operations are backed by NumPy arrays. A has_edge lookup is O(1). Adjacency queries use vectorized operations. The benchmark suite verifies that overhead stays under 15% — and currently runs at 3.77%.
  • No black boxes. With a single runtime dependency (NumPy), you can read, understand, and trust every line. There are no hidden validation layers, no magic, no surprises.
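
To illustrate the second principle, here is a minimal sketch of how a graph backed by a NumPy adjacency matrix can offer a constant-time has_edge and vectorized adjacency queries. The AdjacencyGraph class and its add_edge, children, and parents methods are hypothetical names chosen for this example and are not cegraph's actual API. Only has_edge is named above, and the signature used here is an assumption.

```python
import numpy as np


class AdjacencyGraph:
    """Hypothetical dense-adjacency directed graph (not cegraph's real API)."""

    def __init__(self, n_nodes: int) -> None:
        # Boolean matrix: adj[i, j] is True when there is an edge i -> j.
        self.adj = np.zeros((n_nodes, n_nodes), dtype=bool)

    def add_edge(self, src: int, dst: int) -> None:
        self.adj[src, dst] = True

    def has_edge(self, src: int, dst: int) -> bool:
        # A single array index: constant time regardless of graph size.
        return bool(self.adj[src, dst])

    def children(self, node: int) -> np.ndarray:
        # Vectorized scan of one row: indices of all direct successors.
        return np.flatnonzero(self.adj[node])

    def parents(self, node: int) -> np.ndarray:
        # Vectorized scan of one column: indices of all direct predecessors.
        return np.flatnonzero(self.adj[:, node])


# Tiny three-edge graph: node 0 points to 1 and 2, node 1 points to 2.
g = AdjacencyGraph(4)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 2)

print(g.has_edge(0, 1))  # edge exists
print(g.has_edge(1, 0))  # reverse direction does not
print(g.children(0))     # successors of node 0
print(g.parents(2))      # predecessors of node 2
```

The trade-off in this sketch is memory: a dense matrix stores one entry per node pair, so it suits the small, hand-specified graphs typical of causal analysis rather than very large sparse ones.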

How It Compares

                      cegraph                    DoWhy           causalnex
Runtime deps          1 (NumPy)                  10+             6+
Core logic            < 500 lines                10,000+ lines   5,000+ lines
Overhead              3.77% (guaranteed < 15%)   Not measured    Not measured
has_edge complexity   O(1)                       O(1)            O(1)
Python support        3.10+                      3.9+            3.8+

When to Use cegraph

  • You need causal inference but don't want a heavy dependency tree.
  • You want to understand every line of your causal analysis tooling.
  • You're teaching or learning causal inference — the simple API and small codebase make it ideal for education.
  • You want to benchmark causal effect estimators on synthetic data.

When Not to Use cegraph

  • You need instrumental variable methods or non-parametric estimators.
  • You require Pandas/DataFrame integration in the core library.
  • You need causal discovery from observational data — cegraph expects you to provide the graph structure.