MultiVAC Yellowpaper

Sunday, February 9, 2020
Download document
Save for later
Add to list is a project of OpenBook

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain The MultiVAC Foundation * [email protected] September 2018 S calability is the major bottleneck preventing and progress and today finds commercial usage in blockchains from reaching industrial capac- fields as diverse as cross-border settlement, supply- ity. Systems tackling the most promising scal- chain management, entertainment, and investment. ing solutions thus far, blockchain sharding, have However, the technology is at a major bottleneck. produced progress in parallelizing transaction pro- Low levels of transactions per second (TPS) saddle cessing but have not achieved full sharding needed blockchain systems, creating constraints on their real- for total scalability. MultiVAC was completely de- world use. Bitcoin’s throughput is on average only 4-5 signed to serve as a solution: the world’s first fast, transactions per second (tps) and Ethereum’s is 10 tps, efficient, and all-dimensional sharded blockchain causing rampant congestion and backlog for today’s designed for total scalability, performing sharding best known networks. All this while the dominant parallelization not only for computation but also provider of payment solutions, VISA, easily processes transmission and storage. In this paper, we present 2000 tps on average and 45,000 tps at maximum ca- an overview of MultiVAC’s sharding and storage so- pacity. The current rate of blockchain development is lution. MultiVAC uses the Proof of Stake mecha- nowhere near capable of servicing millions of transac- nism to prevent Sybil attacks and uses Verifiable tions that businesses conduct on a routine basis. The Random Functions to divide the network into frag- entire industry is cognizant that speed and scalability ments called shards. Each shard processes transac- are the factors determining whether or not the use of tions in parallel. MultiVAC provides an elegant dis- blockchain will become ubiquitous in modern society. tributed storage solution for the blockchain, pro- MultiVAC believes that in the coming decade, ducing a robust architecture that divides storage blockchain will be pervasively employed for any type and transmission across shards. In this fashion, of transactions imaginable. To obtain this ease of use, MultiVAC provides a blockchain throughput that the blockchain community needs to overcome numer- can serve the real economy while maintaining ous scalability problems, a challenge we have accepted. blockchain’s core values of decentralization, equal- Academia and the industry have provided numerous ity, and security. proposals to scale blockchains, and we roughly summa- rize them below. They can be categorized into three camps: partial centralization, off-chain scaling, and 1 Introduction on-chain scaling. Since the publication of the Bitcoin White Paper [1] by Satoshi Nakamoto in 2008, Blockchain technology has 1.1 Blockchain Scaling Approaches taken the world by storm. As exemplified by Bitcoin • Partial centralization is an approach that allows and Ethereum [2], it has been subject to both scrutiny the bulk of consensus processing in a blockchain * This yellowpaper is produced by the MultiVAC technical team system to be handled by a small number of high- under the direction of Dr. Xiang Ying (email: [email protected]). We powered nodes called supernodes. This approach wish to thank Junyu Lu, Zhengyuan Ma, Dr. Ge Li, Zhong Dong, Liang He, Yuan Li, Libo Shen, Weihua Zhang, Nan Lin and Jiajun is exemplified by EOS [3], IOST [4], and the root Wu, Dr. Xiao Tong, Koupin Lyu, Dr. Minqi Zhang, Dr. Hong Sun, chain system of Quarkchain [5]. In a traditional David Sebaoun and Lu Lu for their contributions. blockchain, the network’s processing capacity is

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain bottlenecked by the processing capacities of indi- blockchain’s bottleneck. vidual nodes. Partial centralization gives network stewardship over to nodes that fulfill a particu- larly high processing capacity, thus bringing up 1.2 MultiVAC Design Principles the network’s total throughput. Such a system is MultiVAC was designed with the following question in fast but cuts out ordinary users, creating a semi- mind: “Which features of a blockchain are necessary centralized system that loses blockchain’s original for it to scale to be used on an industrial scale and serve value proposition: decentralization. the real economy?” Several conclusions are apparent. • Off-chain scaling technologies can be roughly An ideal blockchain scalability architecture should be divided into side-chain and state channel ap- decentralized at its core, presicely because blockchain proaches to scalability. These approaches attempt technology is touted for its decentralized nature. to side-step a blockchain’s performance bottleneck Blockchain technology was designed to provide a de- by processing most transactions outside of the centralized computing platform to avoid central points main blockchain. Typical side-chain schemes like of failure, at the cost of requiring a consensus algorithm Cosmos [6] and Aelf [6] process transactions by to finalize outcomes. Reverting to a semi-centralized interfacing the main chain with the side chain, system would not be a step forward towards a sys- increasing throughout by and order of magni- tem robust enough to be at the forefront of global tude. They however also create even more secu- commerce. Moreover, an ideal scalability architecture rity and performance risks, including cross-chain should consist of fundamental upgrades to blockchain trust limitations and bottlenecks arising from over- at the base layer. Off-chain approaches are thus not burdened main chains. State channel schemes ideal as they function more like patches which are are represented by Plasma [8] and the Lightning sometimes helpful but often bring their own security Network [9] provide specialized transaction chan- risks. For this reason, MultiVAC improves blockchain nels between users that are based outside of the architecture from the bottom-up, using the most de- main blockchain, similarly allowing throughput pendable of the scaling solutions thus far, blockchain to scale. They, however, critically require trust re- sharding. quirements between the on- and off-chain links Sharding is the parallelization of a blockchain’s con- which carry high security risks, reintroducing the sensus process.This is similar to how large-scale pro- central point of failure that blockchain systems cessing of corporate data today is mostly conducted were built to avoid. by multiple servers in parallel. Conducting blockchain • On-chain scaling technologies roughly include operations in parallel is seen as essential to making the use of directed acyclic graphs (DAG) and them capable of handling the computational demand blockchain sharding. Both systems try to change of large corporations. Sharding is the only practical the structure of the blockchain itself to achieve and manageable, and base-layer method to scale a scalability. Directed Acyclic Graphs (DAGs) are blockchain to industrial capacity while still maintain- represented by Iota [10], Hash Graph [11], ing a decentralized network with open participation. Vite [12] and Conflux [13] and is an alternate For this reason, scalability efforts of the best known graph-based data structure which allows blocks blockchain networks including Ethereum have been to be added asynchronously. DAGs show promise focused on blockchain sharding. There are, however, and can run extremely quickly in the laboratory inherent difficulties of designing sharding over an origi- but have proven difficult to apply commercially nally non-sharded system. MultiVAC has the advantage because they require a huge number of transac- of being a fully sharded system from inception. tion broadcasts which causes a network storm Our sharding approach differs from and improves that makes them unwieldy in an open environ- on almost all of the sharding methods existing to date. ments. Sharding is represented by Zilliqa [14], Central to MultiVAC’s sharding design is the important Quarkchain [5] and Ethereum [15] and is a observation that computers do more than just compute method of breaking up the processing of transac- data: a large volume of their work is also in transmit- tions on the blockchain into sub-networks called ting and storing it. Thus, we designed MultiVAC for shards. Sharding is a commonly used scalabil- all-dimensional sharding at the core: sharding not only ity mechanism in distributed databases and can for computation but also for data storage and trans- also significantly improve throughput. It has seen mission, the first blockchain system to design for all significant accomplishment in public arena, with three. With a fully sharded architecture, MultiVAC Zilliqa, for example, having stably run over 2000 is the first blockchain able to practically and robustly tps [14] on its testnet. At the same time, all of handle industrial-level demand. the most well-known sharding mechanism only We summarize MultiVAC’s core technological advan- perform partial sharding: They split up the pro- tages described in the below paper: cessing labor but still impose the full burden of the network’s storage and transmission needs on 1. Transaction sharding alone is not sufficient to solve every node. This inevitably only increases the blockchain’s scalability problem. We are the first Page 2 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain blockchain system providing sharding for transac- network and also produces a proof statement allowing tions, transmission and storage. other nodes to verify the generated number is legiti- 2. Fairness, reliability, and security, and are essential mately random and not manipulated in any way. The to blockchain systems. We utilize a fair resharding VRF has two other features; its unpredictablility ahead technology based on Verifiable Random Functions of time and unbiasedness over time. The VRF requires (VRF) which ensures the reliability and security other miners to verify that the node assignment is fair of every shard. before assigning miners to tasks. 3. Cross-shard communication is a difficult problem that must be addressed by every sharding system. Node Setup All nodes are connected over a semi- We design a compact UTXO ledger and decentral- synchronous network such that almost all miners can ized Merkle Root data structure which enables communicate with each other within a certain small asynchronous secure cross-shard communication, time limit. Each node has a cryptographic public key trusted third-party data storage, and trusted data and private key pair. The private key is only known verification in a sharded data environment. by the node while the public key is known publicly. The node may sign computations with his private key which is then verifiable by all other nodes using his 2 MultiVAC Protocol at a Glance public key. We first present an overview of how MultiVAC achieves sharding for processing, storage and transmission. We 2.2 Dynamic Sharding use the following legends in the diagrams : MultiVAC utilizes a dynamic sharding mechanism, where miners are allocated to shards which are dynam- ically changed every couple of minutes. New miners can join a shard when it is periodically re-allocated and miners are not limited to participating in just one shard. The entire process is described in Figure 2. Figure 1: Legend for Figure 2 and Figure 3 2.1 Foundations at a Glance Nodes Nodes are the different types of machines con- nected to the network. MultiVAC uses three types of nodes: Light Nodes, Miner Nodes and Storage Nodes. Light nodes or clients are nodes which submit new transactions and perform no processing. They function like users in the system who own accounts and make transactions. Miner nodes are nodes running the con- sensus algorithm and are reassigned to different shards or fragments of the network every couple of minutes. They are the bookkeepers in the system with very low hardware barriers-of-entry, and are incentivized for running consensus. Storage nodes are nodes assigned to particular shards that are responsible for storing and serving up transactions. They function like utilities sim- ilar to the internet infrastructure on which the network is run, providing services to miners that allow them to perform their tasks more quickly and efficiently. A large number of each node type make up the MultiVAC network. Verifiable Random Function MultiVAC uses Verifi- able Random Functions (VRF) to allocate miners to shards, and to further select subgroups of sharded miners to complete bookkeeping tasks. A VRF is a Figure 2: Dynamic Miner Sharding Flow Diagram pseudorandom function that produces a trusted source of randomness, that is, a function that both acts as MultiVAC is a Proof of Stake system. In a Proof of a random number generator for a node in a trustless Stake system, miners have to prove ownership of a Page 3 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain certain amount of tokens (stake) to be able to partici- 2.4 Formal Definitions pate in consensus. For more details on why we choose a Proof of Stake rather than a Proof of Work system, We now provide definitions for the MultiVAC system in please refer to Appendix 1. In MultiVAC, we enforce detail below. Proof-of-Stake by requiring each miner to lock up a deposit before a shard can be joined. 2.4.1 Transactions After locking in the deposit, the miner waits until • Transaction: A message submitted from one one of the shards initiates re-sharding, which takes a client to another representing a movement of to- few minutes. MultiVAC runs shards in an asynchronous kens from one account to another account. Each manner, so miners may be selected for multiple shards transaction is a block of data that includes a num- proportional to their deposit size. The miner runs his ber of inputs to spend, a number of outputs to VRF on his private key and the shard’s VRF seed to produce, and the id of the transaction type. determine whether or not he can enter the shard. A • Input: An input is a gained unit of money (a out- miner node that has not been selected for inclusion by put) which must be referenced to be spent. Inputs a shard waits until it is included. contain a reference to the output of a previous When shards are overburdened, MultiVAC supports transaction and a script signature unlocking the incremental shard splitting to increase the number of input. shards. When a shard splitting is triggered, an over- • Output: Outputs are produced by transactions burdened shard is split into two shards. Light nodes occurring in MultiVAC. They contain a number belonging to the original shard are allocated to the new denoting the value of the output and a public key shard according to their addresses, and storage nodes script corresponding to the receiver’s address. are given some leeway to choose between the old and • Output State: An output state is a binary (0-1) new shards. The new shards then are both allocated flag which determines if the output has been al- with miners by the re-sharding algorithm. ready used as an input to a transaction. • Transaction Type: The transaction type is a code which may be one of three types: normal transac- 2.3 Transaction Confirmation, Consensus, tion, deposit transaction or withdraw transaction. and Storage – N, Normal Transaction: a regular transac- tion of funds from one account to another In MultiVAC, storage nodes store all transactions. Mul- account. tiVAC adapts a variation of the UTXO model used in – D, Deposit Transaction: a transaction used Bitcoin in which the input of every transaction is the to place a deposit. The payer and payee must output of a previously confirmed transaction. The sys- be the same account, and the token included tem traces outputs and their states (whether or not the in the deposit transaction can only be used transaction has already been spent) to prevent double in a withdraw transaction. spending. MultiVAC stores outputs and their states in – W, Withdraw Transaction: a transaction the Merkle Tree data structure. withdrawing the deposit and unlocking it for When a new transaction is produced and signed by future use. a light node, it is submitted to the storage nodes in the • Output Message: The output message is a mes- corresponding shard. The storage node will broadcast sage submitted from one shard’s storage nodes to the transactions to all the shard’s current miners, and another which allows the second shard to recon- enabling them to receive rewards. The blocks thus struct outputs deposited to the second shard by reach all the shard’s current miners. the first shard. Each shard runs a fast Byzantine Consensus algo- rithm to reach blockchain consensus. Miners are se- lected for different consensus tasks by the VRF, the first 2.4.2 Nodes and Clients of which is block proposal. All miners selected for block • Miner Node M : Also known as users, miner proposal include their pending transactions on the new nodes run MultiVAC’s Byzantine consensus and block. The block is further processed and voted on by receive rewards when selected as block proposer. miners selected by the VRF until consensus is reached. Any user running MultiVAC software is eligible to After achieving consensus on the block, it is broad- be selected as miner as long as he or she first locks cast to all storage nodes in the shard, which is stored up a stake deposit. Miners are required to listen to on their hard disks. The corresponding block header all messages from their own shard and block head- is broadcasted to all miners in the network but is kept ers from all shards. Miners keep track of block outside of the shard for space efficiency. At this point headers, Merkle roots, and some partial Merkle the transaction is confirmed. tree structure from all shards in order to confirm The whole process is briefly illustrated in Figure 3. the validity of pending transactions. Finally, miner Page 4 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain Figure 3: Block Generation Flow Diagram nodes maintain a pending transaction pool to ag- of a Merkle tree. Users with the Merkle root of gregate transactions for potential block proposal. a Merkle tree can quickly verify the existence of The hardware requirements to be a miner node any content in the Merkle tree by checking the are quite low and any modern personal computer content’s Merkle path. would suffice. M denotes the set of all miner • Merkle Path: A Merkle path is a path from a leaf nodes. of the Merkle tree to its root node, allowing for • Storage Node S : Storage nodes are assigned to secure verification of the leaf node’s content. particular shards and are responsible for storing • Main Merkle Tree: The Main Merkle Tree is the all of that shard’s historical transactions. Storage name of MultiVAC’s primary Merkle Tree data nodes are designed to respond to and serve re- structure used to store transactions in storage quests from miner nodes and are incentivized for nodes. The leaves of the Main Merkle Tree com- providing storage; they have higher hardware re- prise all historical outputs and their output states quirements than miner nodes. Storage nodes are aggregated from all the shards. only providers of services to the network and do • Block Merkle Tree: A Block Merkle Tree is a not participate in consensus and thus do not have Merkle tree in each block that stores the trans- decisions making “power” in the network; in par- actions outputs recorded in that block. The Block ticular they do not pose a centralization risk. S Merkle Trees are appended to the leaves of the denotes the set of all storage nodes. Main Merkle Tree as blocks get confirmed. • Light Node C : Light nodes or clients are nodes • Top Main Merkle Tree: The Top Main Merkle which submit new transactions. They are not Tree is the top section of the Main Merkle Tree, ex- required to store any block information except cluding the leaves which are Block Merkle Trees. records of their own transactions. Light nodes The leaves of the Top Main Merkle Tree are in- have minimum hardware requirements, allowing stead the Merkle Roots of the corresponding Block a common mobile device to serve as a light node. Merkle Trees. C denotes the set of all light nodes. 2.4.3 Data Storage 3 MultiVAC Sharding in Detail • Merkle Tree: A Merkle tree is a binary tree shaped Sharding is central to the design of MultiVAC. Using a data structure for data storage. Every leaf node strategy similar to distributed computing used by all represents a data block and every non-leaf node major modern corporations to process huge volumes up to the root is the cryptographic hash of its child of data, MultiVAC separates nodes into network frag- nodes. This hierarchy of hashes allows for ex- ments called shards in order to perform parallel pro- tremely fast and secure verification of the contents cessing of transactions. This allows the network to scale stored in the Merkle Tree, for example, the data and handle the high-volume business computations re- of transactions representing a certain amount of quired by the modern economy. As a fundamental and monetary value. integral part of MultiVAC, sharding is run automatically • Merkle Root: A Merkle root is the root (top node) by the MultiVAC protocol. New shards are allocated if Page 5 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain the network burden on any existing shard becomes too Miner Deposits MultiVAC requires each miner to great, and miner and storage nodes assigned to a new place a deposit to be a potential candidate for shard shard can immediately start constructing new blocks. inclusion. To place a deposit, a miner submits a special A shard Si,r is a subset of the MultiVAC network, deposit transaction to a storage node and waits until the transaction is confirmed in a block. The deposit (Ci,r , (Bi,h )1≤h≤r , (Hi,h )1≤h≤r , M Ti,r , Mi,r , Ci,r , Si,r ), transaction has a special output, the deposit output, which cannot be spent by the user until it is unlocked where i is the shard index, r is the current height of by a withdraw transaction. the shard’s internal blockchain, Ci,r is the consensus The deposit transaction consists of the following: algorithm running within the shard, Bi,h is the block of shard i at height h, Hi,h is the header of Bi,h , M Ti,r • Input: one or several outputs belonging to the is the current Main Merkle Tree maintained by the user; shard, and Mi,r ⊆ M , Ci,r ⊆ C and Si,r ⊆ S are • Output: an output from the user to himself; the current subsets of miner nodes, client nodes and • Transaction Type: “D” for deposit transaction. storage nodes belonging to the shard, respectively. Each shard maintains a separate and distinct in- After the deposit transaction is confirmed, it is shard blockchain comprising a log of transactions. recorded in the block header. The miner broadcasts Clients are separated into different shards for service a heartbeat message to the whole network to signal based on their public keys, with transactions assigned that it is live, and is then eligible for selection into a to the shard of the payer’s public key. Each shard is pe- shard. The miner then waits for a shard to perform riodically reassigned a group of randomly selected min- re-sharding, the process of which is described below, ers responsible for generating new blocks and running and checks if it has been allocated to a shard by the Byzantine consensus. Each shard also has a number VRF. Upon allocation, other miners can verify that the of storage nodes which maintain records of all blocks miner by verifying his random selection number as well generated by the shard and update all of the shard’s as his deposit-confirmation message. assigned output states. The miner’s heartbeat message is valid for a period of around 24 hours, to avoid the allocation of inac- tive miners. After the heartbeat expires, the miner is 3.1 Miner Selection required to re-send another heartbeat. MultiVAC is a shard-based blockchain system where Withdraw Deposit The output value of a deposit is miners perform consensus as part of network fragments locked and not spendable as an input of a further trans- called shards. The central question here is the selection action. If deposits were allowed to be spent, a Sybil process of the miners who would perform such tasks. attacker may stake a deposit to send a confirmation- It is apparent that a fair selection mechanism must be message but then transfer the deposit to one of the fake random and not biased towards any subset of miners Sybil accounts. This account can then use the money over others. A naive random selection method is not to stake another deposit and transfer it to yet another however sufficient for the task. In the absence of a account, resulting in a large number of deposits backed robust selection mechanism, malicious miners can per- by nothing at all. We thus require a special withdraw form Sybil attacks by passing off as multiple users so as transaction to unlock a deposit. to dominate the network. A classic random selection mechanism that prevents Sybil attacks is Proof of Work (PoW), which require miners to solve computational 3.2 Re-sharding puzzles in order to contribute to consensus. However, as discussed many times in the blockchain literature, Suppose N is the total number of miners in the network Proof of Work is an extremely energy-inefficient and and c is a pre-determined honesty threshold such that wasteful mechanism. as long as the number of honest miners Nh ≥ cN , the outcome of consensus is trustworthy. Define Pt (Nh ≥ MultiVAC instead uses a Proof of Stake (PoS) based cN ) to be the probability that the number of honest algorithm which avoids wasteful energy expenditures nodes is always greater than cN after the network is while ensuring an equally high level of security. Each running for a period of time t. The security assumptions miner has certain amount of MultiVAC tokens that he of traditional blockchains depend on the fact that, over or she can lock up in deposit which would correspond a large time period t, N is sufficiently great such that to his or her current stake. Together with a random Pt (Nh ≥ cN ) ≥ 1 − p for negligibly small p. seed produced by the shard and a selection likeliness Suppose Ni is the total number of miners in shard i, proportional to the stake, a miner calculates a random and Nh,i is the number of honest miners in the same number with the VRF to determine whether or not he shard. It can be seen that Pt (Nh ≥ cN ) ≥ Pt (Nh,i ≥ is allocated to a shard. Miners may be selected for cNi ) because Ni is significantly lesser than N . In order inclusion in more than one shard. to maintain the same security level, we need to restrict t′ < t such that Pt′ (Nh,i ≥ cNi ) ≥ 1 − p. Page 6 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain a(λ) In order to achieve this, we perform a re-allocation • Complete Provability: for any x ∈ {0, 1} , of miners to shards at intervals of time t′ (for example, P (FV ER (P K, x, δ, π) = T rue) > 1 − 2−Ω(λ) every a few minutes) in order to eliminate the probabil- if V RF (SK, x) = (δ, π). ity that any particular shard is compromised. As such, 2. Unique Provability: For any P K, x, δ1 , δ2 , π1 , π2 the threat of compromising a shard is negligible as it its with δ1 6= δ2 , then quite impossible to take control of a super majority of a shard containing several hundred random nodes in P (FV ER (P K, x, δi , πi ) = T rue) < 2−Ω(λ) , several minutes and carry out double spending trans- actions. The most damage an attacker can inflict on for any i ∈ {1, 2}. a shard is to prevent it from processing blocks in the 3. Residual Pseudorandomness: Let T = (TE , TJ ) be current round. Even then, the damage will be resolved any pair of algorithms taking 1λ as the input and when the miners are re-sharded. taking execution count less than s(λ) steps. Then for ∗ = 6 x, let Verifiable Random Functions (VRF) Our re-sharding V RF (SK,∗) TE (1λ , P K) = (x, π̂), mechanism is based on Verifiable Random Functions (VRF), a pseudo-random number generator first in- where P K, SK are generated by FGEN . troduced by Micali, Rabin and Vadhan in [16] and Now, define a random variable X taking on two improved by Dodis and Yampolskiy in [17]. A VRF’s states with equal probability. Depending on the output can be verified without further communication state of X, a value for δ̂ is determined either ran- with the generator, making the VRF an ideal tool for domly or from δ(SK, x): random selection. Dfinity [18], Algorand [19], and Ouroboros Paros [20] all implement VRF as a random • P (X : δ̂ = δ(SK, x)) = 0.5; generator as part of their consensus schemes. MultiVAC b(λ) • P (X : δ̂ →R {0, 1} ) = 0.5. adopts the VRF construction [21], where a detailed analysis of the VRF’s security and pseudo-randomness We require that no prediction algorithm TJ is able is provided. to accurately predict within the safety margin s(λ) Let a : N → N ∪ {∗}, b : N → N, and s : N → the actual state of X that generated δ: N be three polynomial-time functions. A VRF with V RF (SK,∗) input length a(λ), output length b(λ), and security P (TJ (1λ , δ̂, π̂) = x) ≤ 0.5 + s(λ)−1 . level s(λ) is a suite of three polynomial-time algorithms (FGEN , V RF, FV ER ) such that The Re-sharding Process All miners in MultiVAC re- ceive block headers produced by all shards. The block • FGEN is the key generation function, a probabilistic header includes function which takes in a unary string with length λ and which outputs a public key P K and private • the current height of that shard’s in-shard chain h key SK. • the block’s random seed Qr FGEN (1λ ) = (P K, SK). Miners determine when a shard is re-allocated by the current height of the in-shard chain. Currently we • V RF is the main random number generator, a set the protocol to re-shard when the in-shard chain deterministic function which receives a private key expands every n blocks or when the shard hits a time- SK and a seed x and which outputs two binary out ts after the shard’s first allocation to , so miners will strings: the generated random value δ and the trigger a re-shard when either of the below is true: verification proof π. ( h ≡ 0 (mod n), V RF (SK, x) = (δ(SK, x), π(SK, x)). t = ts + to . • FV ER is the randomness verification function, a After the commencement of re-sharding, all miners that probabilistic function that verifies the value of δ have placed a deposit may choose to participate in the based on π and the public information. new shard. The deposits are not counted in absolute token count, but in terms of deposit units, the size of FV ER (P K, x, δ, π) = T rue or F alse. which adjusts with the shards’ level of overcrowding as well as the overall system’s security needs. A VRF must have the following properties: If the deposit of a miner m is αm token units, the 1. Correctness: the following two conditions hold miner generates a uniform random number prob and with probability 1 − 2−Ω(k) : a corresponding verification ver using VRF with their private key SK and the random seed Qr : • Domain Range Correctness: for any x ∈ a(λ) b(λ) {0, 1} , δ(SK, x) ∈ {0, 1} . prob, ver = V RF (SK, Qr ), 0 ≤ prob < 1. Page 7 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain Now, given a security level 0 < s′ < 1 (discussed where the security level is s′ and the miner’s deposit is below), the probability that a miner is allocated into αm deposit units. the shard is s′ is 1−(1−s′ )αm , and the miner is selected For simplicity, we assume that all the miners place if prob < 1 − (1 − s′ )αm . Due to the pseudorandomness the same amount of deposit. Let B(n, p) be the bino- of the VRF, we have mial distribution with number of trials n and success probability p, and let N (µ, σ 2 ) be the normal distri- |P (m ∈ Mi,r ) − (1 − (1 − s′ )αm )|< ǫ, bution with mean µ and variance σ 2 . We define two random variables for very small ǫ, equivalent to the statement that under negligible probability, the following statements hold: • X ∼ B(N −Nh , p) ≈ N (p(N −Nh ), (N −Nh )p(1− ′ p)), the number of malicious miners chosen by the • P (m ∈ Mi,r ) > P (m ∈ Mi,r ) if αm > αm′ ; shard; • P (m ∈ Mi,r ) < P (m′ ∈ Mi,r ) if αm < αm′ ; • Y ∼ B(Nh , p) ≈ N (pNh , Nh p(1−p)), the number • P (m ∈ Mi,r ) = P (m ∈ Mi′ ,r ) for any i and i′ . of honest miners chosen by the shard. That is, miners with the same deposit are equally We have that likely to be chosen for a shard and miners with higher deposit have a higher likelihood to be chosen for a P (Ni,h ≥ cNi ) = P (Y ≥ c(X+Y )) = P ((1−c)Y −cX ≥ 0), shard. and we define the random variable W = (1 − c)Y − The Security Level s′ In each shard the security level cX. W ≥ 0 corresponds to a situation in which the s′ is a parameter that affects the number of allocated number of honest nodes in the shard satisfies the hon- miners. A high s′ gives miners a greater likelihood of esty requirement. Because X and Y are independent, allocation into the shard and causes the shard to have the distribution of W can be approximated as follows: more miners serving it on average. This causes the W ∼ N (µ, σ 2 ), shard to be both • more secure, as it is less likely that an adversary where can corrupt the majority of the shard to cause • µ = (1 − c)pNh − cp(N − Nh ) = pNh − cpN = denial of service, and p(ρ − c)N , • less efficient, as it is more likely that a greater • σ 2 = (1 − c)2 Nh p(1 − p) + c2 (N − Nh )p(1 − p) = amount of communication is required to reach (1 − c)2 ρ + c2 (1 − ρ) p(1 − p)N. in-shard consensus. Security requirements for transactions and for mon- Let Z be the standard normal distribution N (0, 1). etary processing are fixed at a very high rate. A default Then setting for s′ currently used is P (W ≥ 0) 1 µ s′ = = P (σZ + µ ≥ 0) = P (Z ≤ ) # all shards σ ! p(ρ − c)N with the same number of expected miners for each = Φ p shard. In other applications, it can, however, be feasi- [(1 − c)2 ρ + c2 (1 − ρ)] p(1 − p)N ble for users to choose a range of s′ levels under which ! p(ρ − c) √ they wish to run their applications, allowing them to = Φ p N obtain faster speed for less security-intensive applica- [(1 − c)2 ρ + c2 (1 − ρ)] p(1 − p) √ tions. We call this technique flexible sharding. Flexible Z √ p(ρ−c) N [(1−c)2 ρ+c2 (1−ρ)]p(1−p) 1 x2 sharding would give users significantly more leverage = √ e 2 dx, −∞ 2π in customizing their applications than most blockchains systems today, where usually speed and security pa- and hence, rameters for all applications are rigidly fixed by the underlying blockchain infrastructure. P (Ni,h ≥ cNi ) = P (W ≥ 0) = 1 − P (W < 0) Z a Security Analysis Suppose the number of honest min- 1 x2 ers in the whole network is Nh = ρN with honesty = 1− √ e 2 dx, −∞ 2π ratio ρ, and suppose further that after re-sharding, we √ require that Ni,h ≥ cNi for some honesty threshold c. where a = √ p(ρ−c) N . By the analysis from previous subsections, a miner is [(1−c)2 ρ+c2 (1−ρ)]p(1−p) chosen by a shard with probability Assume all the miners place 10 units of deposit, the security level is s′ = 0.1, the honesty ratio ρ = 0.8, p = 1 − (1 − s′ )αm , and the honesty threshold is c = 0.75. Then we can Page 8 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain compute the probability P (W ≥ 0) in terms of N . The or storage nodes. In particular, storage nodes are re- plot of log(N ) against log(1 − P (W ≥ 0)) is shown in quired to work for both newly created shards for a Figure 4. short period to guarantee a smooth transition. Inside It can be seen that if N ≥ 1500, the probability that the storage node, the blockchain of the shard fork is the number of honest miners who are chosen into shard able to serve both shards while avoiding duplicated i is less than the threshold cNi becomes less than 10−10 . storage. Provided that each new shard has enough For comparison, N = 10424 in Bitcoin and N = 14383 storage nodes, storage nodes are allowed after a cer- in Ethereum. Thus, the shards would fail to receive an tain period to stop supporting one of the shards so as honest threshold of miners with negligible probability. to not overburden them by serving multiple shards. Even if attackers are extremely lucky and a shard does After shard splitting, miners M2i,r and M2i+1,r are fail to obtain an honest threshold, attackers are still re-allocated to both shards using re-sharding. Since unable to falsify transactions and can only halt the accounts are assigned to shards based on their address, production of blocks in the shard until the shard is the original shard addresses will starte with 00. Af- re-allocated. ter splitting, the new shard will be responsible for ad- dresses starting with 000 and 001 respectively. This is depicted in Figure 5. Figure 4: Plot of log(N ) against log(1 − P (Ni,h ≥ cNi )). At N = 1500 (where log(N ) ≈ 3.17), the Figure 5: Account Address-Based Shard-Splitting likelihood that less than the threshold of honest miners is allocated into a shard is 10−10 . 3.4 In-Shard Consensus 3.3 Shard Splitting In MultiVAC’s definition of a shard MultiVAC dynamically achieves scalability by using shard-splitting to increase the number of shards. If the Si,r = (Ci,r , (Bi,h ), (Hi,h ), M Ti,r , Mi,r , Ci,r , Si,r ), network realizes that a particular shard consistently with 1 ≤ h ≤ r, the consensus algorithm Ci,r is re- faces high transaction flow, the shard is split in two placeable from shard to shard. Theoretically, any shard shards with each half serving half of the original shard’s could run any consensus algorithm, including PoW, al- accounts. though this would not be practical. MultiVAC instead Suppose the original shard is runs consensus algorithms from the Byzantine consen- Si,r = (Ci,r , (Bi,h ), (Hi,h ), M Ti,r , Mi,r , Ci,r , Si,r ), sus family, which assume a quorum of 2/3 honest nodes to operate. Byzantine consensus algorithms have full where 1 ≤ h ≤ r. After splitting, the shard is sepa- finality, meaning every shard’s blockchain is unique rated into and cannot be forked. Examples of Byzantine consen- sus algorithms are pBFT [22], Tendermint [23], and split(Si,r ) = (S2i,r , S2i+1,r ), BA* [19]. Research and development is underway to where explore the Byzantine Consensus Algorithm(s) most suited to our speed and flexibility requirements. • S2i,r = Once allocated to a shard, miners are delegated via (Ci,r , (Bi,h ), (Hi,h ), M Ti,r , M2i,r , C2i,r , Si,r ); VRF to tasks comprising of block generation and con- • S2i+1,r = sensus. The benefit of using a VRF is that not all min- (Ci,r , (Bi,h ), (Hi,h ), M Ti,r , M2i+1,r , C2i+1,r , Si,r ); ers need to participate in every consensus task, sig- nificantly reducing network traffic and speeding up and consensus. It is precisely because the randomness used Ci,r = C2i,r ⊔ C2i+1,r . to assign miners cannot be manipulated and is verifi- Shard splitting does not affect the in-shard consen- able, a high level of security is reached even without sus, the shard’s blocks, block headers, Merkle Trees all miners participating in every step. Page 9 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain The in-shard miner selection process for consensus between miner nodes and storage nodes. Miner nodes is as follows. We call the length of time between block are responsible for generating blocks and have voting verifications as blocktime. Every blocktime is broken up rights in the system, whereas storage nodes are solely into a number of tasks, which are usually voting pro- responsible for storing and serving data and act as cedures which depend on the specific algorithm used. service providers to the network. MultiVAC inclines to- Suppose the current blocktime is r and the current wards reducing the local computational load on miners task number is t. There exists a random seed Qr−1 in as much as possible so that many ordinary computers the block header of the previous blocktime, and each can join the mining network, allowing decision-making miner in the shard generates a random number with power to remain with ordinary users and promoting their private key the network’s health and efficiency. Our storage solution has the following features: h = V RF (SK, hash(Qr−1 , r, t)). No full ledger in miner nodes. Suppose we have a blockchain with a Visa-average throughput of 2000 A miner is selected as a voter if h is less than a tps. If miners stored the full ledger, this would amount predetermined threshold pr,s . to 23T of data a year linearly increasing over time, a An important difference exists between the above large amount for ordinary computers. MultiVAC has algorithm and the VRF used in shard-selection, namely, most of the data held by storage nodes so that miners in the above algorithm, miners are not weighted. In need only hold less than 1G of data at any time. other words, so long as a miner is allocated into a In-shard gossip protocol. In traditional shard, he or she has an equal say in the vote, making it blockchains like Bitcoin, messages are broad- impossible for miners with high stake deposits to also cast via gossip protocol to all the users in the network. dominate a shard with high voting power. In MultiVAC we implement a gossip protocol that The first task for every blocktime is block proposal. makes it possible to broadcast a message only to all At the beginning of the consensus, each miner selected users within particular shards, allowing nodes to only for the block proposal task constructs a new block Bu receive messages in which they are interested. and broadcasts the proposed block to the whole shard. Verifiable and trustworthy service by storage In the block proposal task, the random number h gen- nodes. Though most of the data in MultiVAC is stored erated during miner selection doubles as that miner’s by storage nodes to which miners send requests, the priority value, thus producing a total order over prior- miners must also store enough information to verify ity values for proposed blocks. Receivers of potential that the data returned is accurate. The MultiVAC pro- blocks cache all blocks and proceed to future tasks tocol is designed in such a way that all the information based on the priorities of the blocks. provided by storage nodes is verifiable by the miner and all unverified messages are discarded. No extra network transmission for miners. To 4 MultiVAC Storage and Transmis- perform verification, miner nodes need to locally store sion Sharding the Merkle root of the main Merkle tree. In Multi- VAC’s design, the miners are capable of updating the In this section we describe MultiVAC’s sharded storage Merkle root with only newly generated blocks, thus up- and transmission solution. dating the main Merkle root without additional trans- MultiVAC performs full sharding over the blockchain mission pressure to the storage node, giving miners network, meaning that we not only split up the net- necessary information for in-shard block generation work computational load amongst miners but also di- and re-sharding. vide up network storage amongst nodes. We design this in such a way that miners need only communicate 4.1 Storage Methodology with other miners in the same shard to obtain or com- mit a transaction, significantly reducing transmission A Merkle Tree of Transaction Outputs MultiVAC’s costs and realizing sharded transmission. MultiVAC storage nodes store transaction output states in a is the first major blockchain to create a fully-sharded, Merkle Tree data structure. These are packaged into three-pronged computation, storage, and transmission blocks, which contain outputs in a Merkle tree called solution, essential to scale blockchains to industrial the block Merkle tree. Block Merkle trees are com- capacity. piled together into a large Merkle tree, the main Indeed, storage load is a major concern on modern Merkle tree, where all historical outputs are stored. blockchains. A blockchain achieving high TPS (transac- Each storage node is responsible for maintaining the tions per second) also creates a high storage load. If a section of the main Merkle tree that contains the out- blockchain achieves 2,000 TPS with a transaction size puts covered by his shard, determined based on the of 400 bytes, the monthly storage pressure is 1,931GB. receiver’s account id. Merkle trees provide an efficient It thus becomes impractical for a common node to store storage and communication solution to store transac- all the blocks. MultiVAC solves this issue through stor- tion outputs and to allow other nodes to verify outputs age sharding. In MultiVAC, there is a division of labor by tracing Merkle Paths. Page 10 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain (a) The Full Main Merkle Tree (b) The Pruned Main Merkle Tree Stored by the Storage Nodes Figure 6: A depiction of the Main Merkle Tree stored in storage nodes, where red nodes are outputs to shard 1; blue nodes are outputs to shard 2; and orange nodes are intermediate nodes in the Merkle tree. Figure (a) is the complete Merkle Tree; and Figure (b) is the main Merkle tree maintained by shard 1’s storage nodes. Outputs that shard 1 are not responsible for are pruned along with unnecessary intermediate nodes. Division of Storage Labor In MultiVAC, miners are required to keep track of • All block headers; • Roots of all shards’ main Merkle trees; • Merkle Paths as required to perform updates to the Merkle Roots of all shards. Storage nodes are required to be aware of all blocks, comprising all historical transactions. However, they only maintain and update transaction outputs whose Figure 7: A Block Merkle Tree. New outputs are labelled Outs accounts are in the shard they are responsible for. For 1-4. hi = hash(Outs i, 0) for i = 1, 2, 3, 4, and this reason, storage nodes of different shards will have h5, h6 and h7 are calculated from their child different main Merkle Trees because storage nodes are hashes. only responsible for updating the state of transactions in their shard. Once the state is updated, the block hash changes and is propagated up to the Merkle root, top main Merkle tree and constituent block Merkle resulting in each storage node having recorded the trees, such that the leaves of the top main Merkle tree same transactions but updating only the transactions are the roots of the block Merkle trees. Storage nodes that it is responsible for serving up. Storage nodes append new block Merkle trees to the top main Merkle may then optionally choose to prune the records of tree when new blocks are added. We illustrate this transactions that it is not responsible for. structure in Figure 6(a), assuming only two shards for simplicity. Recall that an output consists of a value and a re- The Block Merkle Tree Every newly confirmed block ceiver’s address. Storage nodes in Shard 1 are only contains a group of newly confirmed transactions, interested in output addresses that are also in Shard stored in a Merkle tree data structure called the block 1. To improve storage efficiency the storage node is Merkle tree. The outputs of these transactions need allowed to discard sub-trees containing transactions to be recorded because they may be used as inputs to outputted to other shards so long as it still maintains future transactions. For every block that goes through those outputs’ Merkle roots. This still allows the node consensus, the miner that proposed it constructs the to have access to every output that it is responsible for. block Merkle tree and adds its Merkle root to the block’s This is illustrated in Figure 6(b). header. Inside of the block Merkle tree, each output When clients propose transactions, storage nodes for- is marked with a 0/1 state marking whether or not ward to miners the Merkle paths of inputs to be spent, the output is spent. These outputs are sorted by the which are all outputs of previously confirmed transac- recipient’s address and the transaction’s hash such that tions in a particular user’s account. Miners maintain transactions within the same shard are adjacent. Be- their own copy of the shard’s Merkle root and verify cause of this strict ordering, the block Merkle tree can the veracity of the forwarded transaction by checking be deterministically generated given a list of transac- its Merkle root against their local copy. Afterwards, tions. The idea is summarized in Figure 7. miners follow the Merkle path and check whether or not the input has already been spent. If a malicious The Main Merkle Tree Storage nodes store all histor- storage node provides false information to the miners, ical outputs in a Merkle tree called its main Merkle miners will reject the false information upon receiving tree. The main Merkle tree consists of two parts, the a Merkle root that is different from their local copies. Page 11 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain (a) Original main Merkle tree (b) New main Merkle tree Figure 8: Here, we depict the effect of appending a block Merkle Tree to a main Merkle Tree which is not full. Figure (a) is the original main Merkle tree, which becomes Figure (b) after appending a the block Merkle tree. Thus, the only harm that a malicious storage node can Recall that miners store not only all block headers do is to refuse to provide services, but since storage but also the Merkle root from all shards and some nodes are greatly duplicated to provide reliability of Merkle paths from all shards as required for bookkeep- service, miners can always easily switch to another ing. When a miner is a block proposer, he is responsi- storage node. ble for producing a block whose header contains two Merkle roots: the root of the attached block Merkle tree and the shard’s new main Merkle root upon ad- 4.2 The Block Confirmation Message dition of the block. Miners must update state changes from newly spent transactions when calculating the Miners need to keep track of the Merkle roots of all new main Merkle root. In addition, Merkle roots of shards. Thus, miners need to be notified when new blocks received from other shards must be appended to blocks in other shards are confirmed so as to update miners’ local versions of that shard’s Merkle root and their local versions of that shard’s Merkle root. This, Merkle paths in order to stay consistent with the other however, is not sufficient information to perform the shards. Finally, the miner’s pending transaction pool update; miners also need a way to prove that the infor- consists of Merkle paths generated according to the mation about the consensus transmitted to it is true. current Merkle tree. After a new block is confirmed, MultiVAC achieves this by broadcasting the votes of the these Merkle paths need to be updated in order for the final consensus task along with the newly confirmed system to stay consistent. These comprise a miner’s block. Upon completion of the final consensus task, the update tasks in MultiVAC. miner generates a confirmation message containing all MultiVAC designs a process allowing miners to up- of the votes together with block header and broadcasts date their local state without knowledge of the whole this message to the whole network. This message is Merkle tree. To do this, the miner requires the follow- called the block confirmation message. This confir- ing information: mation message contains sufficient information for any miner to verify that the shard has indeed reached con- • The Merkle paths of all inputs spent in a miner’s sensus, and the miner will update its own Merkle root newly generated block. with the obtained block header. • All roots of block Merkle trees included in block Negligible forgery risk. The possibility of forging headers received from other shards since the a confirmation message by a malicious user is negligi- miner’s last update. ble, since votes included in the message must contain • The right-most Merkle path of each shard’s top signatures and credentials from voting miners. A mali- main Merkle tree, that is, the Merkle path marking cious user has to corrupt a majority of voting miners in the locations where new blocks should be added in the shard in order to create an artificial confirmation each shard. These provided by the shard’s storage message. node when the miner first joins the shard. Minimizing unnecessary network transmissions. A significant communication cost is incurred if all min- We present in more detail the different update tasks ers broadcast a confirmation message to the entire that a miner is required to perform below. network. In MultiVAC’s protocol miners only gossip one confirmation message and discard other messages Updating Merkle Roots for a Newly Spent Transaction if they have already gossiped. First, a block proposing miner must take all transactions involved in the block and accordingly update all their input states. This requires obtaining the hash value 4.3 Miners Update Requirements of a Merkle path at index i, a process summarized in Pseudocode 1. We give a survey of miners’ update requirements to maintain the block ledger. Page 12 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain (a) Original main Merkle tree (b) New main Merkle tree Figure 9: Here, we depict the effect of appending a block Merkle Tree to a full main Merkle Tree. Figure (a) is the original main Merkle tree, which becomes Figure (b) after appending a the block Merkle tree. Pseudocode 1: getHash shards. This is performed by appending the new blocks INPUT: to the main Merkle tree before computing the Merkle Pt = ((H0 , S0 ), (H1 , S1 ), . . . , (Hn , Sn )): Merkle root. path of a transaction t For all the pending blocks from shard i with height i: The location of the required hash hi , the miner sorted them as (i, hi ) lexicographically OUTPUT: and append the blocks onto the main Merkle tree in H : A hash value in the merkle path such order. New heights of each shards will be recorded S ∈ {L, R}: The side of the hash value in the block and broadcasted within the shard. It is PROCEDURE: H ← Hi worth mentioning that the new height of each shard S ← Si is greater than or equal to the height of the previous round, which is equal if the miner has not received a The miner needs to change the states of spent inputs new block from that shard. from 0 to 1 and calculate the new resultant Merkle root. Different block proposers may see different pending It may update the Merkle root without knowledge of blocks due to network asynchronization and thus gener- the entire Merkle tree as the miners will be provided ate different main Merkle root and different heights for with the corresponding Merkle paths for each input other shards. However, only one of the new Merkle root is changed. In Pseudocode 2 we provide the logic of will be confirmed by the consensus algorithm. Given updating the main Merkle root if only one input was that the main Merkle tree is completely determined by changed. The procedure for updating the Merkle root if the new heights of the other shards, it gives sufficient more than one input was changed can be easily derived. information for other miners to confirm the compu- tation of new main Merkle root and storage nodes to Pseudocode 2: updateRoot update the main Merkle tree. INPUT: We design a mechanism such that miners need only t: Transaction store the block merkle roots instead of the entire blocks Pt : Merkle path of the transaction t in order to perform this operation. This solves a com- OUTPUT: mon problem in sharding solutions: as the network R: New Merkle Root expands, cross-shard communication requirements be- PROCEDURE: come burdensome and hamper scalability. Our solution R ← hash(input, 1) realizes transmission sharding by keeping the amount for i in Pt ; do H , S ← getHash(Pt , i) of cross-shard communication low, so that the trans- if S = L; do mission volume required for a single shard is close to R ← hash(H , R) fixed no matter how much the network expands. To do else; do this, miners are required to maintain each shard’s main R ← hash(R, H ) Merkle Tree’s rightmost Merkle path through which Output R new blocks are added. There are two cases for adding a new block. In MultiVAC, new block Merkle trees are always appended Incorporating Merkle Roots from New Blocks of Other to the right-most branch of the main Merkle tree. If the Shards The miner has received a number of new Merkle tree is incomplete, the miner simply appends blocks from other shards since its last update. When a the new block Merkle tree to the next available node. miner is responsible for creating the next main Merkle This scenario is represented in Figure 8. Alternatively, root (i.e. the miner is one of the block proposer), if the main Merkle tree is already complete, the miner he is responsible for including these blocks into the can increase the tree height by one and append the main Merkle tree so as to remain consistent with other block. The second scenario is presented in Figure 9. Page 13 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain (a) Full Merkle Tree of Output Messages (b) Pruned Merkle Tree of Output Messages Figure 10: In the figures above, red circles are outputs to shard 1, black circles are outputs to shard 2, and pink circles are outputs to shard 3. Suppose we have a storage node for shard 2 that is only interested in outputs to shard 2 only. Figure (a) is the full Merkle tree of output messages that the storage node receives, with the left-most path for shard 2 denoted by the blue nodes and the right-most path for shard 2 denoted by the red nodes. Figure (b) is the resulting Merkle Tree of output messages that shard 2 stores after pruning the outputs to other shards based on the left-most and right-most paths. These updates are made possible because the miner 4.4 Storage Node Update Requirements keeps track of each shard’s right-most Merkle paths. This allows the miner responsible for proposing the We now give a survey of storage nodes’ update require- next main Merkle root in its shard to also incorporate ments to maintain the states of their assigned shards. new blocks from the other shards. Recall that storage nodes are required to maintain the states of all historical transactions sent to accounts in their shard. Since these transactions may have been Updating Internal Storage After In-Shard Block Con- generated by any shard, storage nodes have differ- firmation Miners who receive a new confirmed block ent procedures for updating in-shard and out-of-shard in their shard is able to update their main Merkle root blocks. and rightmost Merkle path with the information con- When new blocks are confirmed in any shard, the tained in the new block. In addition, they must update miner that produced them creates and transmits an the Merkle paths of the pending transactions in their output message directed to all shards that must receive pending transaction pools, as they will have changed its outputs. Storage node listen to output messages upon the update of the Merkle root. The logic for doing from other shards in order to learn about new con- so is presented in Pseudocode 3: firmed blocks and receive the outputs that they are responsible for. When a storage node receives a new Pseudocode 3: updatePath block in its own shard, it processes all output messages INPUT: and adds partial block Merkle trees holding the out- Ptold : Merkle path of pending transaction puts from other shards to its main Merkle tree. It then P : Merkle path of a confirmed transaction finally appends the new block Merkle tree from its own O: An output shard to its main Merkle tree. We give descriptions of these processes below. OUTPUT: Ptnew : New Merkle path of the pending The Output Message An output message from shard transaction i to shard j is produced and sent every time a new PROCEDURE: block is produced in shard i that has outputs in shard Hold ← hash(O, 0) j. The message consists of: Hnew ← hash(O, 1) if Hold in Ptold ; do • The total number of outputs in the block; Ptnew ← replace Hold by Hnew in Ptold • The content, index and Merkle path of the output for i in P ; do Htmp , Stmp ← getHash(P, i) prior (left of) the outputs to shard j; if Stmp = L; do • The content, index and Merkle path of the output Hold ← hash(Htmp , Hold ) subsequent to (right of) outputs to shard j; Hnew ← hash(Htmp , Hnew ) • The list of outputs from shard i to shard j. else; do Hold ← hash(Hold , Htmp ) Output messages indicate to a storage node of shard Hnew ← hash(Hnew , Htmp ) j that a new block has been confirmed in shard i. With if Hold in Ptold ; do the list of outputs from shard i to shard j as well as the Ptnew ← replace Hold by Hnew in Ptold left and right Merkle paths, shard j is able to recon- Output Ptnew struct the section of the block’s Merkle Tree containing only the outputs to shard j. Page 14 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain Updating Out-of-Shard Confirmed Blocks When a store the blockchain’s state of the shard allowing for storage node for shard j receives an output message duplication and availability. The design of this storage from another shard i, it buffers the corresponding out- solution allows the vast majority of communications put message and does not immediately update its main in the blockchain to be conducted within the shard, Merkle tree. which we call transmission sharding. We further de- The storage node only processes all output messages sign mechanisms to relieve the network pressure on when a new block is confirmed in shard j, as described any parts of the network at any point in time, allowing in the section below. When it does so, for each output for extremely fast and scalable protocol operation. message from i to j it constructs a partial block Merkle tree with the same Merkle root as shard i’s block Merkle tree but only containing the outputs associated with 5 Summary shard j. The partial block Merkle tree is represented in Figure 10. Traditional blockchain systems suffer from scalability issues that hamper their usability in the real-world con- text. Among the various architectures proposed to scale Updating In-Shard Confirmed Blocks New con- blockchains, the most promising thus far is, blockchain firmed blocks in a storage node’s shard contain the sharding, which still carries inherent limitations. Cur- shard’s new main Merkle root and information incorpo- rent sharding solutions have produced progress in par- rating the other shards’ blockchain heights determined allelizing processing but are still limited in speeding by the block proposer. up other aspects of computation, namely transmission After a new in-shard block is received, the current and storage. To design a blockchain that is able to heights of other shards hi are included. The storage serve the real economy, MultiVAC is the world’s first node processes all buffered output messages from other architecture that designs sharding for all aspects of shards at once, up to the height of the corresponding blockchain computation: processing, transmission and shard hi , and reconstructs their partial block Merkle storage. trees, appending each partial Merkle tree to the right MultiVAC uses Verifiable Random Functions to dy- end of its own main Merkle tree in lexicographic order. namically parallelize the network of miners into net- It then appends the block Merkle tree of the new block work fragments called shards. Miners are re-allocated from its shard to its main Merkle tree. into shards in an equitable and publicly verifiable way Upon receiving a confirmed block in its own shard, that is also safe against Sybil attacks. MultiVAC’s el- the storage node updates its main Merkle tree, ensuring egant engineering solution involves a division of la- that the resultant Merkle root is the same as the one bor that realizes sharded transmission and storage as included in the block’s header. Since the network is through, ensuring that no part of the computational asynchronous, it is possible that a storage node may be process is limited by a single bottleneck and that ordi- missing blocks from other shards during this process. nary miners always have decision-making power in the Under those circumstances, the storage node requests blockchain. MultiVAC’s architecture allows for linearly the missing block and confirmation message directly scalable throughput while staying close to blockchain’s from the corresponding shard after a short time delay core values of decentralization and security. With our of 5 seconds. elegant all-dimensional shard architecture, MultiVAC These sum up a storage node’s update requirements is the world’s first blockchain able to practically and to keep their respective shards up to date. robustly handle industrial-level demand. 4.5 Storage and Transmission Summary References Here, we summarize our storage and transmission solu- [1] Nakamoto, Satoshi, Bitcoin: A Peer-To-Peer Elec- tion. MultiVAC realizes storage and transmission shard- tronic Cash System. (2008) ing by utilizing a division of labor between miners and storage nodes. Miners are responsible for achieving [2] Wood, Gavin. Ethereum: A Secure Decentralised consensus on blocks and processing transactions and Generalised Transaction Ledger. (2014). storage nodes are responsible for storing all historical data and serving up transactions. To perform secure [3] Larimer, Daniel et al. EOS.IO Technical White Paper and verifiable information sharing, miners must store v2. (2017) all block headers, the roots of the main Merkle trees [4] The Internet of Services Foundation. Iost: The in all shards, as well as the right-most Merkle paths Next-Generation, Secure, Highly Scalable Ecosystem of all shards. This allows them to keep summaries of For Online Services. (2017). the blockchain state in all shards and process transac- whitepaper/ tions consistent with such summaries, while keeping their hardware storage requirements at an incredibly [5] The QuarkChain Foundation. QuarkChain: A High- low level. Storage nodes serve particular shards and Capacity Peer-to-Peer Transactional System (2018) Page 15 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain [6] Kwon, Jae and Ethan Buchman. Cosmos: [21] Daniel J. Bernstein, Niels Duif, Tanja Lange, A Network of Distributed Ledgers (2018) Peter Schwabe, and Bo-Yin Yang High-speed high-security signatures Journal of Cryp- tographic Engineering 2 (2012), 77–89. [7] The AElf Foundation AElf - A Multi-Chain Parallel Computeing Blockchain Framework (2017) [22] Castro, Miguel, and Barbara Liskov. Practical [8] Poon, Joseph, and Vitalik Buterin. Plasma: Scalable Byzantine Fault Tolerance. In OSDI, vol. 99, pp. Autonomous Smart Contracts. (2017) 173-186. 1999. [9] Poon, Joseph, and Thaddeus Dryja. The Bitcoin [23] Buchman, Ethan, Jae Kwon, and Zarko Milosevic. Lightning Network: Scalable Off-Chain Instant Pay- The Latest Gossip on BFT Consensus. (2018) ments. (2016) [24] de Vries, Alex. Bitcoin’s Growing Energy Problem. Joule 2, no. 5 (2018): 801-805. [10] Popov, Serguei The Tangle. (2018). [25] Back, Adam. Hashcash-A Denial of Service Counter- Measure. (2002) [11] Baird, Leemon, Mance Harmon, and Paul Mad- sen. Hedera: A Governing Council & Public Hash- graph Network. (2018) [12] Liu, Chunming, Daniel Wang, and Ming Wu. Vite: A High Performance Asynchronous Decentralized Application Platform (2018). [13] Li, Chenxing, Peilun Li, Wei Xu, Fan Long, and Andrew Chi-chih Yao. Scaling Nakamoto Consensus to Thousands of Transactions per Second. (2018) [14] The Zilliqa Team. The Zilliqa Technical Whitepaper. (2017), [15] The Ethereum Foundation. Sharding Roadmap. roadmap [16] Micali, Silvio, Michael Rabin, and Salil Vadhan. Verifiable Random Functions. In Foundations of Computer Science, 1999. 40th Annual Symposium on, pp. 120-130. IEEE, 1999. [17] Dodis, Yevgeniy, and Aleksandr Yampolskiy. A Ver- ifiable Random Function with Short Proofs and Keys. In International Workshop on Public Key Cryptog- raphy, pp. 416-431. Springer, Berlin, Heidelberg, 2005. [18] Hanke, Timo, Mahnush Movahedi, and Dominic Williams. DFINITY Technology Overview Series, Con- sensus System. (2018) [19] Gilad, Yossi, Rotem Hemo, Silvio Micali, Georgios Vlachos, and Nickolai Zeldovich. Algorand: Scaling Byzantine Agreements For Cryptocurrencies. (2017) [20] David, Bernardo Machado, Peter Gazi, Aggelos Kiayias, and Alexander Russell. Ouroboros Praos: An Adaptively-Secure, Semi-Synchronous Proof-Of- Stake Protocol. IACR Cryptology ePrint Archive 2017 (2017): 573 Page 16 of 17

MultiVAC Sharding Yellowpaper The All-Dimensional Sharded Blockchain Appendix 1: Proof of Work and lected according to their amount of money (stake) in the system. As it is very hard to accumulate a Proof of Stake controlling stake of the money in a network, pro- viding proof of ownership of a certain amount of Proof of Work and Proof of Stake are the two money also performs the same function of iden- best competing systems for achieving consensus tity establishment as Proof of Work. In MultiVAC, in blockchains. Bitcoin famously uses the Proof of we prove miners’ identity by forcing them to lock Work algorithm popularized by Satoshi Nakamoto, up a stake deposit before they are able to start requiring miners to solve random computational mining and earn mining rewards. Proof of Stake puzzles requiring on average fixed amounts of time is secure because money ownership is required to before they can contribute to consensus. PoW se- be selected for bookkeeping, and it is very hard cures the network but requires the nonstop opera- for dishonest users to own a controlling stake in tion of tens of thousands of computations equiv- the total amount of money in the network. After alent to continuously guessing random numbers. Proof of Stake identities are established, we use This requirement is extremely energy-inefficient the mechanism of VRF to select bookkeepers in and wasteful, causing the Bitcoin network to con- a random and verifiable way, fulfilling both func- sume as much electricity as the entire country of tions of Proof of Work. Ireland in 2017. [24] Because of this huge expen- diture of computational power, Proof of Work is not able to scale to the level that MultiVAC desires. In reality, the purpose of the Proof of Work algo- rithm is not in itself able to produce consensus, and neither is its use essential to doing so. The function of Proof of Work is to certify identities, thus protecting the network against Sybil attacks or fake-identity attacks, when malicious actors pass off as multiple users so as to overwhelm the network. If node selection for bookkeeping is sim- ply randomly chosen based on accounts, malicious users can easily simulate thousands of accounts in a virtual machine and overwhelm the network. PoW forces miners to provide proof of a costly lim- ited resource, namely computational power, before being eligible to engaging in consensus, rendering it basically useless to create fake identities in ex- change for mining power. In this sense the use of Proof of Work in Bitcoin is similar to earlier uses of Proof of Work in HashCash to certify an email against spam [25]. PoW provides one other important functionality: verifiable randomness. It both establishes identities and also selects miners in a random fashion whose fairness can be validated by all the other miners. When a Bitcoin miner succeeds in landing a PoW solution, what he or she performs is both a proof of his or her identity though the expenditure of computational power and also a proof of winning the node selection lottery by providing the random puzzle solution. This mechanism is based on the assumption that it very hard for dishonest users to own a controlling stake in the majority of the network’s processing power, which is required to solve the computational puzzles. For this reason, MultiVAC uses a Proof of Stake (PoS) selection system which avoids PoW’s waste- ful energy requirements but ensures an equal level of security. An alternative to Proof of Work, PoS is a system where where miners are randomly se- Page 17 of 17