Private Information Retrieval for Ethereum State

Cryptographic protocols that let clients query Ethereum data without revealing what they are reading.

The Problem

Every time a wallet checks a balance, verifies a transaction receipt, or reads contract storage from a remote server, the server learns exactly which records were accessed. This metadata leakage from read patterns enables MEV extraction, frontrunning, identity correlation, and surveillance. Encrypting the connection does not help, because the server at the other end still sees the access pattern, and that pattern reveals user intent.

The Solution: Sharded PIR

Our sharded PIR design segments Ethereum state into data slices — accounts, storage slots, contract code, transaction receipts — and serves each slice through a cryptographic PIR engine tuned for its size and access profile. The server processes encrypted queries and returns encrypted responses, never learning which record was requested.
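
To make the "encrypted query in, encrypted response out" flow concrete, here is a minimal single-server PIR sketch. It is not one of the lattice-based schemes discussed on this page: it swaps in textbook Paillier encryption (additively homomorphic) because that fits in a few lines of standard-library Python. The primes are deliberately tiny and insecure, and the function names and the four-entry balance list are assumptions made for illustration.

```python
import math
import secrets

# Toy Paillier parameters: two Mersenne primes. Far too small to be secure.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # decryption constant for the generator g = N + 1


def encrypt(m: int) -> int:
    """Client side: Enc(m) = (1 + N)^m * r^N mod N^2 with fresh random r."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Client side: recover m using the secret values LAM and MU."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def make_query(index: int, db_size: int) -> list:
    """Client side: an encrypted one-hot selection vector for `index`."""
    return [encrypt(1 if j == index else 0) for j in range(db_size)]


def answer(db: list, query: list) -> int:
    """Server side: homomorphically compute Enc(sum_j db[j] * bit_j).

    The server touches every record and only sees ciphertexts, so it
    cannot tell which position holds the encrypted 1.
    """
    acc = 1
    for record, c in zip(db, query):
        acc = acc * pow(c, record, N2) % N2
    return acc


# Hypothetical slice of four account balances (integers smaller than N).
balances = [5_000, 0, 123_456_789, 42]
query = make_query(2, len(balances))
print(decrypt(answer(balances, query)))  # prints 123456789
```

The query here is as long as the database and the server does one modular exponentiation per record, which is exactly the kind of cost that practical schemes and per-slice tuning aim to reduce.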

Data Slices

No single PIR scheme handles all of Ethereum's data well, so each slice is paired with its own engine. The slices differ in size, update rate, and latency tolerance:

  • Hot mutable state (~2 GB) — balances, nonces, contract storage. Latency-critical, updates every block. Requires doubly-stateless schemes (no persistent client or server state) to avoid session linkability.
  • State with Merkle proofs (~100–300 GB) — full state for light clients. Larger databases where preprocessing tradeoffs matter.
  • Immutable logs and receipts (hundreds of GB) — append-only data that can leverage schemes with expensive one-time hint generation.
  • Archival / warehouse data (4–20 TB) — historical queries, high latency tolerance. Multi-GPU parallelism essential at this scale.
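
The slice list above can be written down as a small routing table. The sketch below is one possible encoding, not the project's actual configuration: the slice names, the record-kind labels, and the `route` helper are all hypothetical, and only the sizes and the doubly-stateless requirement are taken from the list.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

GB, TB = 10**9, 10**12


@dataclass(frozen=True)
class DataSlice:
    name: str
    size_range: Optional[Tuple[int, int]]  # bytes; None where the list gives no figure
    kinds: frozenset                       # hypothetical record-kind labels
    requires_doubly_stateless: bool        # stated only for hot mutable state


SLICES = (
    DataSlice("hot_mutable_state", (2 * GB, 2 * GB),
              frozenset({"balance", "nonce", "storage"}), True),
    DataSlice("state_with_merkle_proofs", (100 * GB, 300 * GB),
              frozenset({"state_proof"}), False),
    DataSlice("immutable_logs_and_receipts", None,  # "hundreds of GB"
              frozenset({"log", "receipt"}), False),
    DataSlice("archival", (4 * TB, 20 * TB),
              frozenset({"historical"}), False),
)

# Build the lookup once so routing is a single dictionary access.
_ROUTES = {kind: s for s in SLICES for kind in s.kinds}


def route(kind: str) -> DataSlice:
    """Return the slice that serves a given record kind."""
    try:
        return _ROUTES[kind]
    except KeyError:
        raise ValueError(f"no slice serves record kind {kind!r}") from None


print(route("nonce").name)    # prints hot_mutable_state
print(route("receipt").name)  # prints immutable_logs_and_receipts
```

Note that in this sketch the choice of slice is made in the clear; only the record within a slice would be hidden by the PIR engine behind it.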

Key Components

Sidecar Architecture

A lightweight process that runs alongside any execution client. It absorbs real-time state updates into a small, fast PIR engine while the main engine serves stable snapshots, decoupling query latency from Ethereum's 12-second block time.
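
The snapshot-plus-delta idea can be sketched as follows. Plain dictionaries stand in for the two PIR engines, so nothing here is private; the class name, method names, and the fold-back step are assumptions about how such a sidecar might be organized, not a description of the actual implementation.

```python
class SidecarStore:
    """Stable snapshot plus a small overlay of recent per-block updates."""

    def __init__(self, snapshot: dict, snapshot_block: int) -> None:
        self.snapshot = dict(snapshot)      # stand-in for the main engine
        self.delta = {}                     # stand-in for the sidecar engine
        self.snapshot_block = snapshot_block
        self.head_block = snapshot_block

    def apply_block(self, block_number: int, updates: dict) -> None:
        """Absorb one block's state changes without touching the snapshot."""
        self.delta.update(updates)
        self.head_block = block_number

    def read(self, key):
        """Look up both stores every time and prefer the fresher value.

        With real PIR engines both lookups would be private queries; issuing
        both unconditionally avoids revealing whether the key changed recently.
        """
        stable = self.snapshot.get(key)
        if key in self.delta:
            return self.delta[key]
        return stable

    def rebuild_snapshot(self) -> None:
        """Fold the overlay into a new snapshot and reset the sidecar."""
        self.snapshot.update(self.delta)
        self.delta.clear()
        self.snapshot_block = self.head_block


store = SidecarStore({"0xaa": 10, "0xbb": 7}, snapshot_block=100)
store.apply_block(101, {"0xaa": 12})
print(store.read("0xaa"), store.read("0xbb"))  # prints 12 7
```

Because the overlay only holds keys touched since the last snapshot, it stays small, and the expensive work on the main engine can happen on its own schedule rather than once per block.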

Universal PIR Interface

A scheme-agnostic API that mirrors Ethereum JSON-RPC semantics, routing queries through PIR engines under the hood. Backends can be swapped as the field advances — no client-side changes needed.
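
A minimal sketch of what such an interface could look like is shown below. The `PirBackend` protocol, the `retrieve` method, the slice name, and the 32-byte big-endian balance encoding are all assumptions for illustration; only the `eth_getBalance` method name and its hex-quantity return value follow Ethereum JSON-RPC. The in-memory backend is a non-private placeholder.

```python
from typing import Dict, Protocol


class PirBackend(Protocol):
    """Hypothetical contract every PIR engine adapter would satisfy."""

    def retrieve(self, slice_name: str, key: bytes) -> bytes:
        ...


class InMemoryBackend:
    """Placeholder backend with NO privacy: a plain lookup for testing."""

    def __init__(self, tables: Dict[str, Dict[bytes, bytes]]) -> None:
        self.tables = tables

    def retrieve(self, slice_name: str, key: bytes) -> bytes:
        return self.tables[slice_name][key]


class PrivateEthClient:
    """Exposes JSON-RPC-shaped methods and delegates reads to a backend."""

    def __init__(self, backend: PirBackend) -> None:
        self.backend = backend

    def eth_getBalance(self, address: str, block: str = "latest") -> str:
        if block != "latest":
            raise NotImplementedError("this sketch only serves the latest state")
        key = bytes.fromhex(address[2:] if address.startswith("0x") else address)
        raw = self.backend.retrieve("hot_mutable_state", key)
        return hex(int.from_bytes(raw, "big"))  # hex quantity, as in JSON-RPC


address = "0x" + "00" * 19 + "01"
backend = InMemoryBackend(
    {"hot_mutable_state": {bytes.fromhex(address[2:]): (10**18).to_bytes(32, "big")}}
)
client = PrivateEthClient(backend)
print(client.eth_getBalance(address))
```

Any object with a matching `retrieve` method can be passed to `PrivateEthClient`, which is the property that would let engines be swapped without touching calling code.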

GPU Acceleration

Custom CUDA kernels for lattice-based PIR operations, achieving sub-second preprocessing and millisecond-scale query times on single GPUs, with multi-GPU scaling for larger databases.

PIR Schemes

We are building, benchmarking, and integrating multiple PIR schemes. The current frontrunners for the sharded architecture:

  • LeanPIR (in-house) — A new GPU-friendly scheme designed by Keewoo with <100 KB communication for multi-GB databases, sub-second preprocessing, and ~30 ms server runtime for 32 GB. The primary engine candidate for hot mutable state.
  • VIA — A lattice-based scheme being specified and implemented with reusable primitives across VIA, VIA-B, and VIA-CB variants.
  • OnionPIRv2 — An FHE-native single-server scheme with strong performance characteristics for medium-sized databases.
  • Harmony / RMS24 — Preprocessing-based schemes suitable for immutable or slowly-changing data slices where one-time hint generation cost is amortized over many queries.
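
The amortization argument for preprocessing-based schemes comes down to simple arithmetic: a hint pays for itself once the per-query savings exceed its one-time cost. The helper below is a sketch of that calculation; the function name and the example costs are hypothetical and are not measurements of any scheme listed here.

```python
import math
from typing import Optional


def break_even_queries(hint_cost: float,
                       query_cost_with_hint: float,
                       query_cost_without_hint: float) -> Optional[int]:
    """Smallest n with hint_cost + n * with_hint < n * without_hint.

    Costs can be in any single unit (seconds, bytes, dollars).
    Returns None when the hint never pays off.
    """
    saving = query_cost_without_hint - query_cost_with_hint
    if saving <= 0:
        return None
    return math.floor(hint_cost / saving) + 1


# Hypothetical numbers: a hint costing 100 units that saves 1 unit per query.
print(break_even_queries(100.0, 0.5, 1.5))  # prints 101
```

For data that changes every block the hint would have to be regenerated before the break-even point is reached, which is why this tradeoff favors immutable or slowly-changing slices.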

Resources
