Why cegraph?
Minimal. Fast. Correct.
cegraph was built with three guiding principles:
- One job, done well. cegraph focuses exclusively on causal effect graphs — no data frames, no machine learning pipelines, no plotting utilities. This focus keeps the codebase small (under 500 lines of core logic) and easy to audit.
- Performance by design. All graph operations are
backed by NumPy arrays. A
has_edgelookup is O(1). Adjacency queries use vectorized operations. The benchmark suite verifies that overhead stays under 15% — and currently runs at 3.77%. - No black boxes. With a single runtime dependency (NumPy), you can read, understand, and trust every line. There are no hidden validation layers, no magic, no surprises.
How It Compares
| cegraph | DoWhy | causalnex | |
|---|---|---|---|
| Runtime deps | 1 (NumPy) | 10+ | 6+ |
| Core logic | < 500 lines | 10,000+ | 5,000+ |
| Overhead guarantee | 3.77% | Not measured | Not measured |
| has_edge complexity | O(1) | O(1) | O(1) |
| Python support | 3.10+ | 3.9+ | 3.8+ |
When to Use cegraph
- You need causal inference but don't want a heavy dependency tree.
- You want to understand every line of your causal analysis tooling.
- You're teaching or learning causal inference — the simple API and small codebase make it ideal for education.
- You want to benchmark causal effect estimators on synthetic data.
When Not to Use cegraph
- You need instrument variable methods or non-parametric estimators.
- You require Pandas/DataFrame integration in the core library.
- You need causal discovery from observational data — cegraph expects you to provide the graph structure.