
Cross-checking data in state fraud proof with published rollup data #775

Manav-Aggarwal started this conversation in General
Problem

When a verifier/light client receives a State Fraud Proof, it must verify that the transaction data and intermediate state roots (ISRs) inside the proof were actually part of the data commitment of a rollup block published on the rollup's DA layer. This verification step should happen before the verifier proceeds to the state transition execution step.

The three main components in the State Fraud Proof that need cross-checking with the DA layer are:

  • Pre-ISR: The ISR before the fraudulent state transition
  • FraudTx: The transaction data representing the fraudulently executed state transition.
  • Post-ISR: The (invalid) ISR after the fraudulent state transition

Note: ISRs need not be present after every transaction; instead, they can be computed once every p transactions, where p is the period. We assume p = 1 for simplicity in this post.

Proposed Solutions

Solution 1

The naive solution is to include a reference to the relevant rollup block in the State Fraud Proof, let the verifier download the whole rollup block, and cross-check the above components against its contents. This approach is fine if we can enforce a low limit on the rollup block size. One could argue that since any Full Node can gossip State Fraud Proofs to the network, it can force every verifier to download all the blocks, effectively forcing it to become a Full Node. However, a valid response is that this is a well-known griefing vector for fraud-proof-based systems, and issues such as #386 tackle it.

Note: A reference to a rollup block depends on the DA layer. For example, it can look something like (DA Block Height, Share Index, Number of Shares the Block Spans), as described in this DA Interface ADR.
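A minimal sketch of such a reference, assuming the tuple above (the field and function names here are hypothetical, not from the ADR):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockRef:
    """Hypothetical reference to a rollup block on the DA layer."""
    da_block_height: int  # DA block containing the rollup block
    share_index: int      # index of the first share the rollup block occupies
    share_count: int      # number of shares the rollup block spans

def share_range(ref: BlockRef) -> range:
    """Share indices a verifier fetches to download the whole block (Solution 1)."""
    return range(ref.share_index, ref.share_index + ref.share_count)
```

Under Solution 1, the verifier would fetch every share in `share_range(ref)`, reconstruct the rollup block, and check the Pre-ISR, FraudTx, and Post-ISR against it.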

Solution 2

In the case of large rollup blocks, where it might not be practical for verifiers to download the whole rollup block due to lower bandwidth/hardware requirements, we can define a custom rollup block serialization scheme that enables downloading only the parts of the block that are required.

Assume that a rollup block spans a number of fixed-size shares of size shareSize. We can reserve the first share(s) to hold block metadata consisting of:

  • N: Number of transactions — first 8 bytes
  • S: List of transaction sizes, where Si = length(Txi) — next 8 * N bytes
  • ISRs: List of intermediate state roots between transactions — next 32 * (N / p) bytes, assuming 32-byte ISRs

For k = fraudulent transaction index: when a verifier receives a State Fraud Proof, they can start by downloading the block metadata, cross-checking the ISRs, and then calculating the share(s) at which Txk lives using the prefix sum of S1..Sk-1 (plus Sk for the end offset) and shareSize. They can then download those share(s) by constructing a reference just like in Solution 1 and cross-check the transaction data against them as well.
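A sketch of that lookup, assuming the metadata layout above (the function and parameter names are hypothetical):

```python
def tx_share_range(tx_sizes: list[int], k: int, share_size: int,
                   data_start: int = 0) -> tuple[int, int]:
    """Return (first_share, last_share) containing Tx_k (0-indexed).

    tx_sizes:   the list S of per-transaction byte lengths, read from the
                metadata share(s)
    data_start: byte offset where transaction data begins (after metadata)
    """
    # Prefix sum S_0..S_{k-1} gives the byte offset where Tx_k starts.
    offset = data_start + sum(tx_sizes[:k])
    first = offset // share_size
    last = (offset + tx_sizes[k] - 1) // share_size
    return first, last
```

For example, with transactions of 120, 80, and 497 bytes packed into 512-byte shares, Tx2 (0-indexed) starts at byte 200 and ends at byte 696, so it spans shares 0 and 1.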



We could go ahead and implement Solution 1 first if we can put some reasonable limit on the rollup block size, but are there any drawbacks that I missed between these solutions?


Yes, see "microblocks" in #775 (comment). But then why did we abandon microblocks in favor of ISRs in the first place (does anyone have a record or recollection of that discussion)?


Manav-Aggarwal Mar 17, 2023
Collaborator Author

Paraphrasing @jbowen93: "Microblocks result in slow execution and limited throughput due to per tx disk access and aren't really different than ISRs"


It may be possible to modify microblocks, though, so they don't commit to disk after every microblock.


Why not implement the solution in section 4.3 of https://arxiv.org/pdf/1809.09044.pdf ("Solution 3")?

Solution 1 defeats the point of compact fraud proofs (you might as well not even use ISRs at that point), so it's a different discussion altogether (whether we want compact fraud proofs or not).

Solution 2 is suboptimal because it requires light nodes to download all ISRs and O(n) transaction metadata.

The solution in the paper (as well as the microblocks solution mentioned above) avoids those pitfalls.


However, there is an acceptable version of Solution 1 based on "microblocks" ("solution 1.2"). As far as I know, the original plan was not to use ISRs but "microblocks" of one or a few transactions each, with multiple microblocks per DA block (in the same PFB batch), because ISRs were deemed too difficult to implement (similar to Optimism v1). Then we went back to ISRs, if I recall correctly, because an easy way to implement them was found. Does someone have a record of that discussion? This would be an acceptable solution. cc @jbowen93 @adlerjohn


Manav-Aggarwal Mar 17, 2023
Collaborator Author

The main issue I had with Solution 3 in section 4.3 was that there doesn't seem to be a clear way to distinguish transaction boundaries and ISR boundaries within a share. I think we can modify it to prepend metadata, specifically the transaction length.

Then you can just include, in the State Fraud Proof, the start position of the pre-ISR within a share. In that case, the verifier can read the fixed-length pre-ISR, a byte representing the transaction length, the transaction bytes, and finally the fixed-length post-ISR.
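A sketch of that read path, assuming 32-byte ISRs and the one-byte transaction-length prefix described above (the constants and names are illustrative assumptions):

```python
ISR_SIZE = 32  # assumed fixed ISR length in bytes

def read_fraud_unit(share: bytes, start: int) -> tuple[bytes, bytes, bytes]:
    """Read (pre_ISR, tx, post_ISR) beginning at `start` within a share."""
    pre_isr = share[start:start + ISR_SIZE]   # fixed-length pre-ISR
    pos = start + ISR_SIZE
    tx_len = share[pos]                       # one byte of transaction length
    pos += 1
    tx = share[pos:pos + tx_len]              # the transaction bytes
    pos += tx_len
    post_isr = share[pos:pos + ISR_SIZE]      # fixed-length post-ISR
    return pre_isr, tx, post_isr
```

The verifier only needs the start position from the fraud proof; everything else is derivable from the fixed ISR size and the embedded length byte.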


The ISR could just be inside the transaction struct itself, or you could create a struct called Period that includes multiple transactions and an ISR


Solution 4

The serialization scheme uses two separate namespaces for transactions and intermediate state roots (ISRs). Transactions are divided into fixed-size shares, with the length of each transaction (TxL) prepended to it. Transactions may span multiple shares, allowing efficient storage of larger transactions. ISRs are stored in a separate namespace with a fixed length, and a byte indicates the starting position of the first ISR in each share.

To match transactions with ISRs, IDs mark the beginning of each share in both namespaces. Only the first transaction and the first ISR in a share carry an ID; the rest can be derived from their preceding components.

Applying this to an example, we have two shares for transactions and one for ISRs. In the transaction namespace, Tx3 spans both shares, and TxL indicates the length of each transaction. In the ISR namespace, the single share contains all 6 ISRs, and a byte denotes the starting position of the first ISR. The IDs are used to match transactions with their corresponding ISRs.

Transaction Namespace (Share 1):

| ID (Tx1) | Start position | TxL (Tx1) | Tx1 | TxL (Tx2) | Tx2 | TxL (Tx3) | Tx3P1 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2 bytes | 1 byte | 4 bytes | 120 bytes | 4 bytes | 80 bytes | 4 bytes | 297 bytes |

Transaction Namespace (Share 2):

| ID (Tx4) | Start position | Tx3P2 | TxL (Tx4) | Tx4 | TxL (Tx5) | Tx5 |
| --- | --- | --- | --- | --- | --- | --- |
| 2 bytes | 1 byte | 200 bytes | 4 bytes | 70 bytes | 4 bytes | 231 bytes |

ISR Namespace (Share 1):

| ID (ISR 0) | Start position | ISR 0 | ISR 1 | ISR 2 | ISR 3 | ISR 4 | ISR 5 | Padding |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2 bytes | 1 byte | 32 bytes | 32 bytes | 32 bytes | 32 bytes | 32 bytes | 32 bytes | 315 bytes |
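A sketch of how a verifier might parse a transaction-namespace share with this layout (2-byte ID, 1-byte start position, 4-byte TxL, per the tables above; treating a zero TxL as padding is an assumption of this sketch):

```python
import struct

def parse_tx_share(share: bytes) -> tuple[int, int, list[bytes]]:
    """Parse one transaction-namespace share.

    Returns (first_tx_id, start_pos, txs), where txs are the transactions
    that begin in this share. Continuation bytes of a transaction that
    started in the previous share (before start_pos) are skipped.
    """
    first_id = struct.unpack(">H", share[0:2])[0]  # 2-byte ID
    start = share[2]                               # 1-byte start position
    txs, pos = [], 3 + start                       # skip continuation bytes
    while pos + 4 <= len(share):
        (tx_len,) = struct.unpack(">I", share[pos:pos + 4])  # 4-byte TxL
        if tx_len == 0:       # assumed padding sentinel
            break
        pos += 4
        txs.append(share[pos:pos + tx_len])  # may be truncated if the tx
        pos += tx_len                        # spans into the next share
    return first_id, start, txs
```

The same pattern applies to the ISR namespace, except the units are fixed 32-byte ISRs, so no per-unit length is needed.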

The downside is that you need two shares with inclusion proofs for a fraud proof. The upside is that the ISRs live in a different namespace, which means you only need them when you are an optimistic rollup (separating logic for compatibility reasons).

This could also lead to hybrid rollups.


As far as I can tell, the core idea here is to make it so that the ISRs are all next to each other, instead of interspersed after transactions. Why does that require the ISRs to be in a different namespace or data blob? Can't you put the data in the same blob?


Yes, you could. That would be Solution 3. Solution 4 decouples the data from the ISRs. This aims to increase the design space for how transaction data can be compressed. ZKPs could perhaps use the same scheme without the ISR namespace. It would also enable aggregator/block-producer separation (create the ordering first and the ISRs in a separate Celestia block).


If we want to use shared sequencers or based rollups, we will need to be able to put ISRs and transactions in different namespaces, as we are separating ordering and execution.


Solution 3

We are going with Solution 3. The flow will be:

  1. Wrap Transactions with pre-ISR and post-ISR
  2. Create Celestia compact shares with wrapped transactions
  3. Remove the namespace, namespace version, sequence length, and info byte from the compact shares
  4. Append stripped shares to one blob
  5. Post this blob to Celestia.
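A rough sketch of steps 1–4 (the share size, pointer width, and empty-share sentinel below are placeholders, not the real celestia-app share format):

```python
SHARE_SIZE = 512      # placeholder, not the actual Celestia share size
POINTER_BYTES = 4     # placeholder width for the "first unit" pointer
NO_UNIT = 0xFFFFFFFF  # sentinel: no wrapped unit starts in this share

def wrap_tx(pre_isr: bytes, tx: bytes, post_isr: bytes) -> bytes:
    """Step 1: wrap a transaction between its pre- and post-ISR."""
    return pre_isr + len(tx).to_bytes(4, "big") + tx + post_isr

def build_blob(wrapped_txs: list[bytes]) -> bytes:
    """Steps 2-4: pack wrapped txs into compact-style shares, concatenated.

    Each share begins with a pointer to the offset of the first wrapped
    unit starting in that share -- the property that makes out-of-context
    reads possible. Namespace, namespace version, sequence length, and
    info byte are omitted, per step 3.
    """
    body = SHARE_SIZE - POINTER_BYTES
    data = b"".join(wrapped_txs)
    # offsets (within the concatenated data) where each wrapped unit starts
    starts, off = [], 0
    for w in wrapped_txs:
        starts.append(off)
        off += len(w)
    shares = []
    for i in range(0, len(data), body):
        first = next((s - i for s in starts if i <= s < i + body), None)
        ptr = (NO_UNIT if first is None else first).to_bytes(4, "big")
        shares.append(ptr + data[i:i + body].ljust(body, b"\x00"))
    return b"".join(shares)
```

Step 5 would then post the resulting blob to Celestia as ordinary sparse-share data.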

This will result in a sparse share that looks like a compact share for Rollkit.

By using compact shares, we can read out-of-context shares because we have the location of the first unit per share.

An optimization could be to include only the pre- or post-ISR, not both, in the wrapped transaction, to reduce duplicated data.


I think we could help out by separating the compact share logic so it doesn't include the namespace version etc. (wdyt @rootulp?)


Similar to how we parse shares in a universal way, I think we have most of the code to split shares in a more universal way.

Kinda related as well: celestiaorg/celestia-app#802


Yeah, we will have to implement out-of-context parsing one way or another. Very happy to collaborate on / extract the logic so we can both use it.


Manav-Aggarwal May 13, 2023
Collaborator Author

Added all the methods needed for this solution here: https://github.com/rollkit/rollkit/pull/889/files

Note that this also handles out-of-context parsing of shares and includes a test for it with randomly generated ranges of shares being interpreted.

