Private Information Retrieval for Ethereum State

Cryptographic protocols that let users query Ethereum data from remote servers without revealing what they are reading.

The Problem

Every time a wallet checks a balance, verifies a transaction receipt, or reads contract storage from a remote server, the server learns exactly which records were accessed. This read-pattern leakage enables MEV extraction, frontrunning, identity correlation, and surveillance. Encrypting the connection does not help: it hides the traffic from network observers, but the server itself still sees which records are requested, and that access pattern alone can reveal which accounts and contracts a user cares about.

Approach: Sharded PIR

The sharded PIR design segments Ethereum state into data slices and serves each slice through a PIR engine tuned for its size and access profile. The server processes encrypted queries and returns encrypted responses, never learning which record was requested.
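
As a rough illustration of that query/response flow, the sketch below implements a textbook linear-scan PIR on top of toy Paillier encryption. It is not VIA, OnionPIRv2, or any other scheme named on this page. The key size, the function names (`make_query`, `answer`), and the sample database are assumptions chosen for readability, and the parameters are far too small to be secure.

```python
import math
import secrets

# Toy Paillier key built from two well-known Mersenne primes.
# Illustration only: a real deployment would use a vetted library and much larger keys.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = math.lcm(P - 1, Q - 1)   # private key
MU = pow(LAM, -1, N)           # valid because the generator is g = N + 1


def encrypt(m: int) -> int:
    """Client side: probabilistic encryption of m under the public key N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Client side: recover the plaintext with the private key."""
    return ((pow(c, LAM, N2) - 1) // N) * MU % N


def make_query(index: int, db_size: int) -> list[int]:
    """Client side: an encrypted one-hot selection vector."""
    return [encrypt(1 if i == index else 0) for i in range(db_size)]


def answer(db: list[int], query: list[int]) -> int:
    """Server side: homomorphically compute sum(db[i] * selector[i]).

    The server touches every record and only ever sees ciphertexts,
    so it cannot tell which position holds the encrypted 1.
    """
    acc = 1
    for record, c in zip(db, query):
        acc = (acc * pow(c, record, N2)) % N2
    return acc


# Hypothetical four-record database (for example, balances as integers below N).
db = [1500, 42, 0, 987654321]
response = answer(db, make_query(3, len(db)))
print(decrypt(response))  # 987654321
```

The server's work and the query size both grow linearly with the database in this naive construction, which is one reason practical schemes add packing, recursion, or preprocessing.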

Data Slices

No single PIR scheme can handle all of Ethereum's data while meeting the latency and client/server overhead requirements of every use case. The design therefore uses four slices, each paired with an engine suited to its size, access profile, and likely context of consumption:

Hot mutable state (1–10 GB) — balances, nonces, contract storage. Latency-critical, updates every block. Requires doubly-stateless schemes (no persistent client or server state) to avoid session linkability.
State with Merkle proofs (~100–300 GB) — full state for light clients. Larger databases where preprocessing tradeoffs matter.
Immutable logs and receipts (hundreds of GB) — append-only data that can leverage schemes with expensive one-time hint generation.
Archival / warehouse data (2–30 TB) — historical queries, high latency tolerance. Multi-GPU parallelism essential at this scale.
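
One way to picture the slicing is as a small routing table that maps the kind of record being read to the slice that serves it. The sketch below is illustrative only: the identifiers (`DataSlice`, `slice_for`, and the record-kind strings) are hypothetical, and the size labels are copied from the list above rather than measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSlice:
    name: str
    size: str                      # size label as stated in the list above
    record_kinds: tuple[str, ...]  # hypothetical identifiers for what the slice serves
    requirement: str               # the constraint the text attaches to the slice


SLICES = (
    DataSlice("hot_mutable_state", "1–10 GB",
              ("balance", "nonce", "contract_storage"),
              "latency-critical, updates every block, doubly-stateless scheme"),
    DataSlice("state_with_merkle_proofs", "~100–300 GB",
              ("state_proof",),
              "preprocessing tradeoffs matter"),
    DataSlice("immutable_logs_and_receipts", "hundreds of GB",
              ("log", "receipt"),
              "append-only, can use expensive one-time hint generation"),
    DataSlice("archival", "2–30 TB",
              ("historical_query",),
              "high latency tolerance, multi-GPU parallelism"),
)

# Index the table once so lookups are a single dictionary access.
_BY_KIND = {kind: s for s in SLICES for kind in s.record_kinds}


def slice_for(record_kind: str) -> DataSlice:
    """Return the slice that serves a given record kind (KeyError if unknown)."""
    return _BY_KIND[record_kind]


print(slice_for("nonce").name)
print(slice_for("receipt").requirement)
```

Keeping the routing decision on the client matters here: which slice a query goes to is itself visible to the server, so only the record within the slice stays hidden.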

PIR Schemes

Multiple PIR schemes are being built, benchmarked, and integrated. The current leading candidates for the sharded architecture:

VIA — A lattice-based scheme being specified and implemented with reusable primitives across VIA, VIA-B, and VIA-CB variants.
OnionPIRv2 — An FHE-native single-server scheme with strong performance characteristics for medium-sized databases.
insPIRe — A preprocessing-based single-server scheme with GPU acceleration support (being explored with a potential collaborator). Well-suited for moderately sized databases where preprocessing can be amortized; requires reprocessing on each database update.
Harmony / RMS24 — Preprocessing-based schemes suitable for immutable or slowly-changing data slices where one-time hint generation cost is amortized over many queries.
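
The amortization argument behind the preprocessing-based schemes can be made concrete with a little arithmetic. The sketch below is a generic cost model, not a benchmark of any scheme listed here: the function names are hypothetical and the numbers in the example are made-up placeholders in arbitrary cost units.

```python
import math
from typing import Optional


def amortized_cost(hint_cost: float, online_cost: float, queries: int) -> float:
    """Average cost per query when a one-time hint is reused for `queries` queries."""
    if queries <= 0:
        raise ValueError("queries must be positive")
    return hint_cost / queries + online_cost


def break_even_queries(hint_cost: float, online_cost: float,
                       baseline_cost: float) -> Optional[int]:
    """Smallest number of queries at which preprocessing beats a scheme
    with no preprocessing that costs `baseline_cost` per query.

    Returns None if the online phase is not cheaper than the baseline,
    in which case preprocessing never pays off.
    """
    saving = baseline_cost - online_cost
    if saving <= 0:
        return None
    return math.floor(hint_cost / saving) + 1


# Placeholder numbers, not measurements.
print(amortized_cost(hint_cost=100, online_cost=1, queries=50))            # 3.0
print(break_even_queries(hint_cost=100, online_cost=1, baseline_cost=3))   # 51
```

For a scheme that must reprocess on every database update, `queries` is the number of queries served between two updates. That count can be very large for append-only logs and very small for state that changes every block, which is why the same scheme can suit one slice and not another.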
