A Sharded PIR Design
for the Ethereum State

A practical path to private reads of Ethereum data

Ali Atiia Lead — Privacy of Reads
Ethereum Foundation
privreads.ethereum.foundation
QR code linking to privreads.ethereum.foundation
Motivation

Ethereum users routinely query state data from remote servers. Both the content and the patterns of these queries leak privacy:

A curious or malicious server can profile a user just by tracking what they read — undermining other privacy measures the user has worked hard to follow (e.g. shielding assets).

Part 1 · Background  |  where reads happen

The edge reads, providers see 👀

balance · nonce · allowances · contract code · Merkle roots

They don't (or can't) hold the full state, so they query remote providers.

RPC provider
Sees: which address, which slot, which block, when, from where.
Part 1 · Background  |  what gets leaked

A read is a fingerprint

  • Holdings inferred from which balances and slots you poll
  • On-chain ↔ off-chain identity correlated by IP, timing, query mix
  • Frontrunning & MEV: a wallet polling a liquidation price is a tell
  • Behavioral profile assembled over time, even without any tx broadcast
Reads defeat the rest of the stack.
Shielding your transactions doesn’t help if the read pattern alone tells the same story.
Part 1 · Background  |  what gets queried

What does “the edge” actually read?

Hot state
“What is my balance now?”
eth_getBalance
eth_getStorageAt
eth_call
eth_getCode
Txs & blocks
“Did my tx land?”
eth_getBlockByNumber
eth_getTransactionByHash
eth_getTransactionReceipt
Historical state
“What was my balance on Dec 31?”
eth_getBalance(@blockN)
eth_getStorageAt(@blockN)
archival state
(“data warehouse” in traditional db lingo)

Answer: a bit of everything — any complete privacy story has to cover all three.

The Ethereum state

Part 1 · Background  |  the state

The Ethereum state, briefly

  • A key→value store of accounts; code and storage live one indirection deeper
  • Committed under the state root (a Merkle-Patricia trie) in every block header
  • Every read is verifiable via a Merkle proof up to the state root in the block header
Diagram: the block header commits to stateRoot, txRoot, and receiptRoot. The state holds accounts (nonce · balance · codeHash · storageRoot); contract code is reached via codeHash and contract storage via storageRoot. txRoot and receiptRoot commit to this block's transactions and receipts.

Reading a value can entail fetching a Merkle proof anchored to the state root in the block header — if the user wants to independently verify the value (and soon, the zkVM proof of that header).
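To make the verification step concrete, here is a minimal sketch of checking a Merkle path against a root. It assumes a simplified binary Merkle tree over SHA-256; mainnet's MPT is hexary and uses Keccak-256, so this illustrates the principle, not the production format.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in hash; the actual MPT uses Keccak-256, not SHA-256.
    return hashlib.sha256(data).digest()

def verify_merkle_path(leaf: bytes, index: int, siblings: list[bytes], root: bytes) -> bool:
    """Recompute the root from a leaf and its sibling path (binary tree)."""
    node = h(leaf)
    for sib in siblings:
        if index % 2 == 0:
            node = h(node + sib)   # we are the left child
        else:
            node = h(sib + node)   # we are the right child
        index //= 2
    return node == root

# Build a tiny 4-leaf tree and check a proof for leaf index 2.
leaves = [b"a", b"b", b"c", b"d"]
l0, l1, l2, l3 = (h(x) for x in leaves)
n01, n23 = h(l0 + l1), h(l2 + l3)
root = h(n01 + n23)
assert verify_merkle_path(b"c", 2, [l3, n01], root)
```

The client needs only the leaf, its sibling hashes, and the header's root; a tampered value fails to recompute the root.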

Part 1 · Background  |  trie geometry

The MPT, and what it will eventually be upgraded to

MPT — today arity 16

Diagram: MPT node types — branch (16 children + 1 value slot), extension (e.g. "a8c1"), leaf (account: [nonce, balance, storageRoot, codeHash]); each contract's storage lives in a separate trie.
Disadvantages of MPT
  • Heavy proofs — 15 siblings × 32 B per level
    typical ~2.4–2.9 KB (depth 5–6) · worst-case ~3.8 KB (depth ~8)
  • Worst-case stateless proof per block ~300 MB
  • Storage slots require a second trie traversal
  • Keccak — slow in zkVMs

UBT — EIP-7864 (draft) arity 2

Diagram: a 31-byte stem (e.g. 0x4c1a…9b) with 256 leaves below it — header fields, code chunks, slot[0], slot[1], …
Advantages of UBT
  • Lean proofs — 1 sibling × 32 B per level → ~1 KB (~4× smaller)
  • Single trie — storage & code under one root
  • Snark-friendly hash (BLAKE3 or Poseidon; exact choice TBD) → 3–100× proving speedup
  • Page-based locality — meaningful gas savings under Verkle/UBT witness pricing for storage-heavy dApps
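The proof-size claims above follow from the trie geometry; a quick back-of-the-envelope check, assuming 32-byte hashes and roughly 32 levels for the binary trie (an assumption for illustration):

```python
HASH = 32  # bytes per sibling hash

# MPT: arity 16 -> up to 15 sibling hashes per level.
mpt_typical = 15 * HASH * 5    # depth 5  -> 2400 B ~ 2.4 KB
mpt_worst = 15 * HASH * 8      # depth ~8 -> 3840 B ~ 3.8 KB

# UBT: arity 2 -> 1 sibling per level, ~32 levels to cover the stem.
ubt = 1 * HASH * 32            # 1024 B ~ 1 KB

print(mpt_typical, mpt_worst, ubt)  # 2400 3840 1024
```

These match the slide's figures: ~2.4 KB typical / ~3.8 KB worst-case for MPT versus ~1 KB for UBT, about a 4× reduction.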

We are using UBT as the source DB for PIR, with a zk proof of its equivalence to mainnet MPT.

Part 1 · Background  |  unification

Many roots today, one tomorrow

Diagram — MPT: many trees, many roots. One state trie (accounts) under the state root; one storage trie per contract, each under its own storage root; contract bytecode in a flat key-value store addressed by codeHash. UBT: one tree, one root. Accounts, code, and storage all sit under stems in a single unified state tree (acct A · code A · stor A, acct B · code B · stor B, …).
What “tree of trees” costs
  • One state root + many storage roots + a flat code store. Three commitment regimes side by side.
  • A balance proof = path through state trie. A storage-slot proof = two paths.
What unification buys

Private Information Retrieval (PIR)

Part 1 · Background  |  PIR

PIR at a glance

Private Information Retrieval: read $\mathrm{DB}[x]$ without revealing $x$.

  • Server holds $\mathrm{DB}$; client wants $\mathrm{DB}[x]$
  • Client sends a query $q$ that encodes the index of $x$ under a cryptographic veil
  • Server computes a response $r$ — learning nothing about $x$
  • Client decodes $r$ to get $\mathrm{DB}[x]$
Diagram — client / wire / server: the client wants x = 42 and sends an encoded query q; the server computes over opaque input, r = f(q, DB), never decrypting q; the client decodes r to DB[42]. Privacy claim: the server's view of q leaks no information about x.
Part 1 · Background  |  a quick primer

A one-hot vector picks one row

  • A vector $q$ of length $N$
  • Zero everywhere except a single $1$ at index $x$
  • Inner-product against any vector $\mathrm{DB}$ —
  • … selects exactly $\mathrm{DB}[x]$, zeroing the rest.
$$q \cdot \mathrm{DB} \;=\; \sum_{i} q[i] \cdot \mathrm{DB}[i] \;=\; \mathrm{DB}[x]$$

(only the row at index $x$ survives; every other product is $0$)

We can send $q$ to the server while hiding the “$1$”
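The one-hot selection is a few lines of code; this sketch shows the plaintext version (no privacy yet, since the server would see the vector in the clear):

```python
def one_hot(n: int, x: int) -> list[int]:
    """Length-n vector: zero everywhere except a 1 at index x."""
    return [1 if i == x else 0 for i in range(n)]

DB = [17, 99, 42, 7, 3]
q = one_hot(len(DB), 2)

# Inner product selects exactly DB[2], zeroing the rest.
result = sum(qi * di for qi, di in zip(q, DB))
assert result == DB[2]  # = 42
```

Everything that follows is about sending $q$ without revealing where the 1 sits: encrypt each entry (single-server) or secret-share the vector (two-server).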

Part 1 · Background  |  single-server

Single-server PIR: encrypt the one-hot

  • Client encrypts each entry of $q$ under SHE/FHE.
  • Server computes $\sum_i q[i] \cdot \mathrm{DB}[i]$ homomorphically — an inner product on ciphertexts.
  • Client decrypts the response and recovers $\mathrm{DB}[x]$.
$$\sum_{i} \mathtt{enc}(q[i]) \cdot \mathrm{DB}[i] \;=\; \mathtt{enc}(\mathrm{DB}[x])$$

(the server multiplies each ciphertext $\mathtt{enc}(q[i])$ by the plaintext $\mathrm{DB}[i]$ and sums homomorphically)

Privacy here is computational — it relies on the semantic security of the encryption.

The catch: the server must touch every record to compute that sum — cost is $O(N)$ per query. Skipping any record would leak that the client didn’t want it.

Part 1 · Background  |  two-server PIR

Two non-colluding servers

PIR with XOR’ing

  1. Client wants index $x$; encodes it as a one-hot vector $q$ over $[N]$.
  2. Splits $q$ into two random XOR shares: $q = q_1 \oplus q_2$. Each share alone is uniform random.
  3. Sends $q_1$ to $S_1$, $q_2$ to $S_2$. Each server XORs the $\mathrm{DB}$ entries selected by its share.
  4. Client XORs the responses: $r_1 \oplus r_2 = \mathrm{DB}[x]$.
Diagram: the client (wanting DB[x]) sends q1 to S1 and q2 to S2; each server computes rⱼ = ⊕ᵢ qⱼ[i]·DB[i] and sees only random bits; the client XORs r1 ⊕ r2 = DB[x]. The servers must not collude.
Worked example, N = 12, x = 7: q1 and q2 are uniformly random bit vectors whose XOR is the one-hot vector selecting index 7. Each share alone is uniform random — S1 sees random bits, S2 sees random bits; only the XOR of both reveals x.

Privacy here is information-theoretic — it follows from how the query is split, not from any encryption.

Currently we are focused exclusively on single-server schemes.

The Performance Tradeoffs of PIR Schemes

Part 1 · Background  |  the landscape

The design space

Decision tree (reconstructed from the diagram):
  • Can the client store and persist data?
    – No → CLIENT-STATELESS. Acceptable that the server holds per-client state?
      · Yes → Server-stateful (Spiral · VIA-C · OnionPIRv2): per-client keys cached on the server; queries linkable. O(N).
      · No → Double-stateless (YPIR · VIA · InsPIRe): server preprocesses; no per-client state. ★ fully stateless. O(N), smaller constants.
    – Yes → CLIENT-STATEFUL. Acceptable to require synchronous client↔server comm during preprocessing?
      · No → Download-hint (SimplePIR · DoublePIR): server → client hints, downloaded once; refresh on update. O(N), smaller constants.
      · Yes → Interactive-hint (RMS24 · Plinko · HarmonyPIR): client streams the DB during setup. ★ sublinear online.
Part 1 · Background  |  the cost wall

The $O(N)$ wall, and how preprocessing breaks it

SERVER PAYS (cheap clients · expensive nodes) ←→ CLIENT PAYS (heavy clients · sublinear server)
  • Server-stateful (Spiral · VIA-C · OnionPIRv2) — O(N) server; no client cost
  • Double-stateless (YPIR · VIA · InsPIRe) — O(N), smaller constants; server preprocesses offline
  • Download-hint (SimplePIR · DoublePIR) — O(N) server + √N client storage; ↻ refresh on update
  • Interactive-hint ★ (RMS24 · Plinko · HarmonyPIR) — sublinear server; client streams the DB once — the only sublinear cell
Preprocessing moves cost rightward: further right = cheaper server, but more client work.
family | server compute | client preprocessing | client storage (hints) | communication (Q + R)
Server-stateful (Spiral · VIA-C · OnionPIRv2) | O(N) — touches every record | — | log N | ~30 KB
Double-stateless (YPIR · VIA · InsPIRe) | O(N), smaller k · offline preproc | — | log N – √N | ~KB – MB
Download-hint (SimplePIR · DoublePIR) | O(N), small const | download √N hints | √N | online ~240 KB→1 MB, grows w/ DB
Interactive-hint (RMS24 · Plinko · HarmonyPIR) | √N | stream entire DB — once | √N hints | ~12 KB *
Different cost shapes — and online comm itself differs by an order of magnitude. Download-hint pays √N per query online; * interactive-hint is small after a one-time DB-stream setup. Concrete, for a 1 GB DB: OnionPIRv2 ~30 KB · InsPIRe ~400 KB · SimplePIR 240 KB (→ ~960 KB at 16 GB) · Plinko ~12 KB. Source: PIR Tutorial & Survey, Table II.
QR code linking to Vitalik Buterin's Plinko PIR tutorial
Part 1 · Background  |  the bottom line

No single PIR scheme dominates

Axes: online server cost · client storage · per-client server state · update cost / freshness · online comm. Pain points (!) and hard blockers (×) per family:
  • Server-stateful (Spiral · VIA-C · OnionPIRv2): ! ×
  • Double-stateless (YPIR · VIA · InsPIRe): !
  • Download-hint (SimplePIR · DoublePIR): ! ! ×
  • Interactive-hint (RMS24 · Plinko · HarmonyPIR): ! ! !

Sharding

Part 2 · Approach  |  slicing the state

Sharding the Ethereum State

Size ↓ as update frequency ↑ (and vice versa). The slices:
  • 1–10 GB — Curated: ETH & ERC* balances, transfer & shielding events, recent block's txs & receipts, storage of popular DeFi contracts, select historical storage/events of popular contracts
  • 10–20 GB — accounts & contract code
  • 60–100 GB — accounts with Merkle proofs
  • 100–300 GB — accounts w/ proofs, storage tries
  • 2–20 TB — full archival state
Each slice is scored on four axes: Mutability (M) · Churn (C) · Latency sensitivity (L) · Frequency of access (F).
Part 2 · Approach  |  slice ↔ scheme

Sharding the Ethereum State

Size ↓ as update frequency ↑ (and vice versa). Each slice is matched to the PIR scheme that fits its profile:
  • 1–10 GB — Curated
  • 10–20 GB — accounts & contract code
  • 60–100 GB — accounts with Merkle proofs
  • 100–300 GB — accounts w/ proofs, storage tries
  • 2–20 TB — full archival state
Part 2 · Approach  |  a problem

Knowing which slice → leakage

Which slice is being queried leaks privacy: the server knows the content of each slice and can build correlations over time.

  • 1–10 GB → PIR engine 1
  • 10–20 GB → PIR engine 2
  • 60–100 GB → PIR engine 3
  • 100–300 GB → PIR engine 4
  • 2–20 TB → PIR engine 5
Part 2 · Approach  |  the core trick

Genuine + decoy queries = full privacy

What the client knows vs. what the network observer sees.

Client’s view — ground truth

labelled · asymmetric · 1 real + $k-1$ decoys
Diagram: the client (x in slice 3) sends the real query q to PIR Engine #3 and decoys to PIR Engines #1, #2, #4.

Observer’s view — on the wire

all rays identical · no labels · uniformly distributed
Diagram: the observer sees four identical, unlabelled queries (q?) from the client to PIR Engines #1–#4; x is unrecoverable.

Observer’s view = monolithic PIR over the whole state. Performance = per-slice optimized.

privacy Decoys are real PIR queries — not blanks. They cost $k\times$ wire, but each shard sees a uniform access just as before.
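A minimal sketch of the decoy-dispatch rule: one genuine query for the real slice, one decoy per other slice. `pir_encode` is a hypothetical stand-in for whatever single-server PIR encoding the slice's engine uses; the only requirement is that real and decoy queries are indistinguishable on the wire.

```python
import secrets

def pir_encode(key: bytes) -> bytes:
    # Placeholder for a real PIR query encoding (e.g. an FHE-encrypted
    # one-hot vector). Here: an opaque stand-in token.
    return secrets.token_bytes(32)

def build_queries(real_slice: int, real_key: bytes, num_slices: int) -> dict[int, bytes]:
    """One genuine query for the real slice, one decoy (random key) per other slice."""
    return {
        s: pir_encode(real_key if s == real_slice else secrets.token_bytes(32))
        for s in range(num_slices)
    }

qs = build_queries(real_slice=2, real_key=b"\x00" * 32, num_slices=5)
assert len(qs) == 5  # every engine receives exactly one query
```

Because every engine receives a well-formed PIR query every round, each shard's view matches the monolithic case.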
Part 2 · Approach  |  a side-channel

The wire is safe, but what about the clock?

Decoys hide which slice — per-slice response times can give it away.

The failure mode

  • Response time is different across PIR engines.
  • If the client acts as soon as it receives the response of the real query, the observer correlates time → recovers which slice.

Mitigations

  • Wait-for-all: client doesn’t act until every shard has responded.
  • $m$-of-$k$: tolerate slowest few; discard their slices this round.
  • Constant background traffic: client always queries every slice.
Timeline: the four engines respond at different times after t=0. A naive client that acts at the first (real) response reveals slice = PIR Engine #1; with wait-for-all, all slices are indistinguishable.
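The wait-for-all mitigation can be sketched in a few lines with `asyncio`: fire all queries concurrently, but release nothing to the caller until every shard has answered. The engine latencies here are simulated.

```python
import asyncio
import random

async def query_engine(i: int) -> tuple[int, bytes]:
    # Simulated per-engine latency; real engines differ by DB size and scheme.
    await asyncio.sleep(random.uniform(0.001, 0.005))
    return i, b"response"

async def private_read(num_engines: int) -> dict[int, bytes]:
    # Wait-for-all: do not act until EVERY engine has responded,
    # so response timing reveals nothing about which query was real.
    tasks = [query_engine(i) for i in range(num_engines)]
    responses = dict(await asyncio.gather(*tasks))
    return responses  # only now does the client use the real response

responses = asyncio.run(private_read(4))
assert len(responses) == 4
```

The $m$-of-$k$ variant replaces `gather` with a wait on the first $m$ completions and drops the straggler slices for that round.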

Middleware

Part 2 · Approach  |  the middleware

A universal PIR interface

The edge keeps speaking standard Ethereum RPC. The PIR backend evolves independently.

  • Edge unchanged. Wallets, dApps, light clients keep speaking eth_getBalance, eth_getProof, …
  • Adapter in the SDK (ethers/viem) translates RPC into a GraphQL-shaped PIR query plan.
  • Query router constructs k queries (1 real + decoys), dispatches per-slice.
  • Decoupling: PIR families and slice boundaries evolve under the interface.
Diagram: EDGE (wallets · dApps · frontends · SDKs) speaks Ethereum RPC to the ADAPTER (in ethers / viem / SDKs), which maps onto a GraphQL schema; the query router behind the universal PIR interface dispatches to the PIR GATEWAY, which implements the schema and routes to PIR Engines #1–#4.
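A hypothetical sketch of the adapter's shape: translating one RPC call into a per-slice PIR lookup plan. The method-to-slice mapping and the `PirPlan` type are illustrative, not a finalized schema.

```python
from dataclasses import dataclass

@dataclass
class PirPlan:
    slice_id: int   # which slice holds the answer
    key: bytes      # PIR lookup key within that slice

def rpc_to_plan(method: str, params: list) -> PirPlan:
    """Map a standard Ethereum RPC call onto a slice + key (illustrative mapping)."""
    if method == "eth_getBalance":
        address = bytes.fromhex(params[0].removeprefix("0x"))
        return PirPlan(slice_id=1, key=address)           # accounts slice
    if method == "eth_getStorageAt":
        address = bytes.fromhex(params[0].removeprefix("0x"))
        slot = int(params[1], 16).to_bytes(32, "big")
        return PirPlan(slice_id=3, key=address + slot)    # storage slice
    raise NotImplementedError(method)

plan = rpc_to_plan("eth_getBalance", ["0x" + "ab" * 20, "latest"])
assert plan.slice_id == 1 and len(plan.key) == 20
```

The router then wraps this plan in the genuine query and adds the k−1 decoys before dispatch; the edge never sees anything but standard RPC.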
Part 2 · Approach  |  recap

Summary of Sharded Design

Diagram: EDGE (wallets · dApps · frontends · light clients) → RPC → MIDDLEWARE (adapter RPC → GraphQL; query router issuing 1 real + k−1 decoys) → k parallel queries to Engine A (Server-stateful, 1–10 GB), Engine B (Double-stateless, 60–100 GB), Engine B′ (DS + sidecar, 100–300 GB), Engine C (Interactive-hint, 2–20 TB). Privacy: ≡ monolithic PIR. Performance: per-slice optimized.

Optimizations

Part 3 · Optimizations  |  three levers

Three levers to scale sharded PIR

Lever 1

Shrink the DB

Reduce what each PIR engine has to scan per query.

  •  PIR for Merkle proofs — serve proofs, not nodes
  •  SNARKify archival roots — serve proofs of stem nodes instead of Merkle proofs
Lever 2

Slice smarter

Carve slices along their natural seams — especially mutability.

  •  Sidecar pattern — isolate the mutable tail
  •  Verifiable UBT equivalence — capitalize on the uniformity of binary tries
Lever 3

Build better schemes

Push the per-slice scheme harder — especially toward GPU-native designs.

  •  Double-stateless GPU-tailored R&D
  •  Delegated hints · DEPIR to the rescue?
Part 3 · Optimizations  |  lever 1b — shrink the DB

SNARKifying away Merkle roots → massive DB-size reduction

Before vs. after: compressing the upper trie levels into one SNARK proof eliminates the need to store upper-trie roots; lower-trie nodes and leaves remain.

What gets dropped: upper trie levels — most-mutated, biggest by volume. Lower nodes & leaves stay; PIR continues unchanged.

Why it pays: ~50–80% archival DB-size reduction (§8 of the ethresear.ch post). Smaller DB → cheaper PIR per query.

Part 3 · Optimizations  |  lever 2 — slice smarter

Isolate mutability — the sidecar pattern

Diagram: the client (1) queries both engines in parallel and (2) if the key is in the sidecar, uses the sidecar response, else the main response. Main PIR engine: DB snapshot at t = T0 — large, preprocessed once, lazily re-preprocessed every E blocks. Sidecar PIR engine: entries updated since T0 — small, updated per block, reset after each fold into the main snapshot.

Big snapshot stays cold; fresh writes live in a small, fast engine; client always sees the freshest answer.

Part 3 · Optimizations  |  lever 2 — slice smarter

Verifiable UBT ↔ MPT equivalence

Diagram: at block N, the header's MPT root (canonical, mainnet) and the UBT root (the PIR backend's view) are linked by a zk equivalence proof: "both roots commit to the same state."

∀ (key, value): (key, value) ∈ MPT  ⇔  (key, value) ∈ UBT

Why this anchors everything

  • PIR backend can use UBT before it lands on mainnet.
  • SNARKification (slide 30) operates on the UBT side.
  • One verification surface for all PIR engines — the UBT root.

Cost shape

  • One proof per block, generated server-side; clients verify it once.
  • SNARK-friendly UBT hash (BLAKE3 / Poseidon—TBD) keeps prover work tractable.
In progress: initial UBT-enabled Geth / Ethrex nodes are in testing right now.
Part 3 · Optimizations  |  lever 3 — better schemes

Researching double-stateless GPU schemes

Active in-house research direction: PIR schemes where both client and server are stateless — and the server kernel is shaped to be GPU-native from the start.

Why double-stateless?

  • No per-client server state — server scales horizontally without client tracking.
  • No client hint to keep fresh — client carries only its keys.
  • Clean trust model — every client looks the same to the server.
  • Updates are clean — no broadcast hint to invalidate, no per-client preproc to redo.

Family includes e.g. HintlessPIR, YPIR, VIA, InsPIRe. Online server work is still $O(N)$ — but with much smaller constants than FHE-PIR.

Why GPU-tailored?

  • Server work is dense linear algebra — matrix–vector multiply over the DB.
  • Batch friendly — many concurrent queries amortize one DB pass.
  • Small modular params fit in GPU L2 cache — bandwidth-bound, not compute-bound.
  • Stateless server means many GPUs can serve the same DB without coordination.

Designing the scheme around the GPU memory hierarchy — not just porting an existing scheme to CUDA — is what unlocks order-of-magnitude gains. (See Part 4 for current GPU PIR numbers.)

research Open target: match or beat GPIR throughput on Ethereum-shaped DBs while staying double-stateless. Construction details forthcoming.
Part 4 · Progress

Progress, ongoing work, references

Progress

  • Correctness-focused specs of Plinko, RMS24, VIA schemes.
  • Speccing Plinko surfaced the invertible PRF as an intractable bottleneck before months of implementation work were spent.
  • GPU acceleration of insPIRe scheme.
  • Reproducing benchmarks of existing schemes.

Ongoing work

  • New double-stateless GPU-tailored schemes — can they hit GPIR throughput while keeping the cleaner trust model?
  • Universal PIR middleware spec & wallet integration.
  • SNARKification cost analysis (binary trie).

References

QR: ethresear.ch Sharded PIR post QR: Private Reads PIR benchmarks QR: privreads.ethereum.foundation/code (PIR specs) QR: PIR-Eng-Notes GitHub

Thank you
