Validation

All data mined into a data moat can have its authenticity verified. Three distinct mechanisms ensure query verifiability:

  • Proof of Secret

  • Query Signature

  • Query ID

Proof of Secret

KwilDB identifies the origin of queries with its own mechanism, which we call "proof-of-secret". At any point in time, there are three secrets defined for a data moat: a past secret, a current secret, and a future secret.

The past secret is a previously used secret that has since been revealed and is stored publicly (at the beginning of each data moat, the past secret is simply the name of the moat). The current secret itself is known only to the client; what is public is a hash of the current secret and the past secret. Likewise, the future secret is represented publicly only by a hash of the current secret and the future secret.
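The relationship between the three secrets can be sketched in a few lines of Python. This is a minimal illustration, not KwilDB's actual implementation: the `commit` helper, the `MoatSecrets` class, the 32-byte secret length, and the length-prefixed encoding are all assumptions made for the example. SHA-384 with Base64 URL encoding is borrowed from the hash format described under Query Signature.

```python
import base64
import hashlib
import secrets


def commit(*parts: bytes) -> str:
    """Base64 URL encoded SHA-384 over the given parts (encoding is an assumption)."""
    h = hashlib.sha384()
    for part in parts:
        # Length-prefix each part so that (b"ab", b"c") and (b"a", b"bc") differ.
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return base64.urlsafe_b64encode(h.digest()).decode("ascii")


class MoatSecrets:
    """Hypothetical client-side view of the three secrets of one data moat."""

    def __init__(self, moat_name: str) -> None:
        self.past = moat_name.encode("utf-8")   # initially just the moat name
        self.current = secrets.token_bytes(32)  # known only to the client
        self.future = secrets.token_bytes(32)   # known only to the client

    def public_state(self) -> dict:
        """What everyone else can see: the past secret and two hashes."""
        return {
            "past_secret": self.past,
            "current_commitment": commit(self.current, self.past),
            "future_commitment": commit(self.current, self.future),
        }


moat = MoatSecrets("example-moat")
state = moat.public_state()
```

Nothing in `state` lets an observer recover `moat.current` or `moat.future`, but once the client reveals the current secret, anyone can recompute `current_commitment` and confirm that the secret was fixed in advance.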

All incoming data writes are hashed with the current secret, ensuring that the resulting hash came from someone who has knowledge of the secret (presumably the client). Because validators do not know the secret, they assume that an incoming query's hash is valid (the next two sections on this page describe how nodes prevent faulty submissions).

When submitting data to Arweave, nodes construct a Merkle tree of the hashes sorted in chronological order. Once data has been submitted, the Merkle root is stored in the block header. Nodes can then check the block header to confirm that they received the same data.
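As a rough sketch of this comparison, the following Python builds a Merkle root over query hashes ordered by timestamp. The pairing rule (duplicating the last node on odd-sized levels), the use of SHA-384 for inner nodes, and the `(timestamp, query_hash)` tuple layout are assumptions for illustration; the actual tree layout used by nodes may differ.

```python
import hashlib
from typing import List, Tuple


def merkle_root(leaves: List[bytes]) -> bytes:
    """Merkle root over the leaves, hashing pairs level by level."""
    if not leaves:
        return hashlib.sha384(b"").digest()
    level = [hashlib.sha384(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention for odd-sized levels
        level = [
            hashlib.sha384(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def block_root(entries: List[Tuple[int, bytes]]) -> bytes:
    """Sort (timestamp, query_hash) pairs chronologically, then build the root."""
    ordered = sorted(entries, key=lambda entry: entry[0])
    return merkle_root([query_hash for _, query_hash in ordered])


# Hypothetical query hashes received by two nodes in different arrival orders.
entries = [(3, b"hash-c"), (1, b"hash-a"), (2, b"hash-b")]
root_a = block_root(entries)
root_b = block_root(list(reversed(entries)))

# A node that missed one query ends up with a different root.
root_missing = block_root(entries[:2])
```

Because the leaves are sorted before hashing, `root_a` and `root_b` match even though the two nodes received the queries in a different order, while `root_missing` does not.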

If the Merkle root differs between nodes, nodes can vote to slash the stake of the lead validator and resubmit the block.

There is a third case, in which node operators disagree on the Merkle root even though all of them have been truthful. This occurs when a node had a period of downtime during which it missed some of the queries. In this instance, the client can reveal the current secret, allowing nodes to reconstruct each query hash and check the validity of its origin. Nodes can also confirm that the revealed secret was defined in advance by checking it against the current secret hash. By doing this, nodes can merge their diverging states while verifying the data that was previously received. When the client reveals the current secret, it switches to using the future secret and then defines a new future secret.
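The reveal-and-merge step might look roughly like the sketch below. How a query is "hashed with the current secret" is not specified here, so the example assumes SHA-384 over the length-prefixed secret and query text; the helper names (`h`, `query_hash`, `verify_reveal`, `authentic_queries`) and the sample queries are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import List, Tuple


def h(*parts: bytes) -> bytes:
    """SHA-384 over length-prefixed parts (assumed encoding)."""
    digest = hashlib.sha384()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def query_hash(secret: bytes, query: str) -> bytes:
    """Hash a query together with the secret (assumed construction)."""
    return h(secret, query.encode("utf-8"))


def verify_reveal(revealed: bytes, past: bytes, current_commitment: bytes) -> bool:
    """Check that the revealed secret matches the hash published earlier."""
    return hmac.compare_digest(h(revealed, past), current_commitment)


def authentic_queries(revealed: bytes, received: List[Tuple[str, bytes]]) -> List[str]:
    """Keep only the queries whose hash can be rebuilt from the revealed secret."""
    return [
        query
        for query, claimed in received
        if hmac.compare_digest(query_hash(revealed, query), claimed)
    ]


# Client side: secrets and the public hash of (current, past).
moat_past = b"example-moat"
current = secrets.token_bytes(32)
future = secrets.token_bytes(32)
current_commitment = h(current, moat_past)

# Node side: queries received before the reveal, plus one forged entry.
q1 = "INSERT INTO users VALUES (1, 'alice')"
q2 = "INSERT INTO users VALUES (2, 'bob')"
received = [
    (q1, query_hash(current, q1)),
    (q2, query_hash(current, q2)),
    ("DROP TABLE users", secrets.token_bytes(48)),  # not made with the secret
]

# The client reveals the current secret; nodes verify it and filter their data.
revealed = current
reveal_ok = verify_reveal(revealed, moat_past, current_commitment)
merged = authentic_queries(revealed, received)

# The client rotates: current becomes past, future becomes current.
new_past, new_current, new_future = current, future, secrets.token_bytes(32)
```

After the reveal, `merged` contains only the two queries that were hashed with the real secret, so nodes with diverging views can agree on a common set before resubmitting.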

Query Signature

Each incoming query has a signature from the client. This signature is simply used in lieu of an API key, since node operators can't verify the secret in real time. If a node operator verifies the signature against the known public key of the data moat, it can verify that this piece of data came from the client.

Once verified, the node operator can discard the signature, since it is not stored on Arweave. Signatures are not stored on Arweave because of their size. An RSA-4096 signature is 512 bytes raw, or 1024 bytes when hex-encoded, while query hashes are SHA-384 hashes: 48 bytes raw, or 64 characters when Base64 URL encoded. Given the size of an average SQL query, if queries were stored with an RSA signature, one could reasonably expect >80% of all data submitted to a moat to consist of RSA signatures alone. By using signatures only for client-to-node verification, KwilDB is able to be much more scalable.
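The size argument can be checked with a little arithmetic. The sketch below measures a SHA-384 hash before and after Base64 URL encoding, then estimates the share of stored bytes a signature would take up. The 1024-byte signature figure comes from the text above; the 100-byte SQL query is a hypothetical value chosen only for illustration.

```python
import base64
import hashlib

# Measure a SHA-384 hash, raw and Base64 URL encoded.
digest = hashlib.sha384(b"INSERT INTO users VALUES (1, 'alice')").digest()
encoded = base64.urlsafe_b64encode(digest)
raw_hash_bytes = len(digest)        # size of the raw digest
encoded_hash_chars = len(encoded)   # size once Base64 URL encoded


def signature_share(signature_bytes: int, hash_bytes: int, query_bytes: int) -> float:
    """Fraction of stored bytes taken up by the signature."""
    return signature_bytes / (signature_bytes + hash_bytes + query_bytes)


# Assumed sizes: 1024-byte stored signature, encoded hash, 100-byte SQL query.
share = signature_share(1024, encoded_hash_chars, 100)
```

Under these assumptions `share` comes out at roughly 0.86, in line with the >80% estimate. Longer queries lower the share, shorter ones raise it.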

Query ID

Each incoming query contains a unique query ID. A query ID is simply a randomly generated 64-byte string that is attached to each query and signed along with it.

The query ID's only purpose is to prevent SQL resubmission by malicious nodes. Without a query ID, a malicious node could resubmit any previously valid SQL statement to its peers, and there would be no way to see who resubmitted the data.

By attaching a unique query ID to each query, nodes can check the uniqueness to ensure that this query is not being resubmitted.

Query IDs only need to be unique within a single block. Nodes therefore only need to store previously seen query IDs until a block is finalized, at which point they can simply discard them.
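A node's bookkeeping for this check could be as simple as a set that is cleared at finalization. The following is a minimal sketch under that assumption; `QueryIdTracker`, `new_query_id`, and the choice of 64 hexadecimal characters as the "64-byte string" are illustrative, not taken from the KwilDB codebase.

```python
import secrets
from typing import Set


def new_query_id() -> str:
    """Random 64-character ID (32 random bytes, hex-encoded; an assumed format)."""
    return secrets.token_hex(32)


class QueryIdTracker:
    """Remembers the query IDs seen in the block that is currently being built."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def accept(self, query_id: str) -> bool:
        """Return True for a new ID, False if it was already seen in this block."""
        if query_id in self._seen:
            return False
        self._seen.add(query_id)
        return True

    def finalize_block(self) -> None:
        """Once the block is finalized, the stored IDs are no longer needed."""
        self._seen.clear()


tracker = QueryIdTracker()
query_id = new_query_id()
first = tracker.accept(query_id)    # new ID, accepted
replay = tracker.accept(query_id)   # same ID again, rejected
tracker.finalize_block()
after_finalize = tracker.accept(query_id)  # IDs are only tracked per block
```

Clearing the set at finalization keeps memory use bounded by the number of queries in a single block.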
