Holochain scalable agent-centric distributed computing DRAFT(ALPHA 1) – 2/15/2018 Eric Harris-Braun, Nicolas Luck, Arthur Brock1 1 Ceptr, LLC ABSTRACT : We present a scalable, agent-centric distributed computing platform. We use a formalism to characterize distributed systems, show how it applies to some existing distributed systems, and demonstrate the benefits of shifting from a data-centric to an agent-centric model. We present a detailed formal specification of the Holochain system, along with an analysis of its systemic integrity, capacity for evolution, total system computational complexity, implications for use-cases, and current implementation status. I. INTRODUCTION approach data-centric because of its focus on cre- ating a single shared data reality among all nodes. Distributed computing platforms have achieved a new We claim that this fundamental original stance re- level of viability with the advent of two foundational sults directly in the most significant limitation of the cryptographic tools: secure hashing algorithms, and blockchain: scalability. This limitation is widely known public-key encryption. These have provided solutions 3 and many solutions have been offered 4 . Holochain of- to key problems in distributed computing: verifiable, fers a way forward by directly addressing the root data- tamper-proof data for sharing state across nodes in the centric assumptions of the blockchain approach. distributed system and confirmation of data provenance via digital signature algorithms. The former is achieved t by hash-chains, where monotonic data-stores are ren- II. PRIOR WORK dered intrinsically tamper-proof (and thus confidently sharable across nodes) by including hashes of previous af entries in subsequent entries. The latter is achieved by combining cryptographic encryption of hashes of data and using the public keys themselves as the addresses This paper builds largely on recent work in crypto- graphic distributed systems and distributed hash tables and multi-agent systems. of agents, thus allowing other agents in the system to Ethereum: Wood [EIP-150], DHT: [Kademlia] Benet mathematically verify the data’s source. [IPFS] TODO: discussion and more references here Dr Though hash-chains help solve the problem of indepen- dently acting agents reliably sharing state, we see two very different approaches in their use which have deep systemic consequences. These approaches are demon- III. DISTRIBUTED SYSTEMS strated by two of today’s canonical distributed systems: A. Formalism 1. git1 : In git, all nodes can update their hash-chains as they see fit. The degree of overlapping shared We define a simple generalized model of a distributed state of chain entries (known as commit objects) system Ω using hash-chains as follows: across all nodes is not managed by git but rather ex- plicitly by action of the agent making pull requests 1. Let N be the set of elements {n1 , n2 , . . . nn } par- and doing merges. We call this approach agent- ticipating in the system. Call the elements of N centric because of its focus on allowing nodes to nodes or agents. share independently evolving data realities. 2. Let each node n consist of a set Sn with elements 2. Bitcoin2 : In Bitcoin (and blockchain in general), {σ1 , σ2 , . . . }. Call the elements of Sn the state the “problem” is understood to be that of figuring of node n. For the purposes of this paper we as- out how to choose one block of transactions among sume ∀σi ∈ Sn : σi = {Xi , Di } with Xi being a the many variants being experienced by the mining hash-chain and D a set of non-hash chain data nodes (as they collect transactions from clients in elements. different orders), and committing that single vari- ant to the single globally shared chain. We call this 3. Let H be a cryptographically secure hash function. 1 https://git-scm.com/about 3 add various sources 2 https://bitcoin.org/bitcoin.pdf 4 more footnotes here

2 4. Let there be a state transition function: 1. Call a set of nodes in N for which any of the func- tions T, V, P and E have the properties of being τ (σi , t) = (τX (Xi , t), τD (Di , t)) (3.1) both reliably known and also known to be identical for that set of nodes: trusted nodes with respect where: to the functions so known. (a) τX (Xi , t) = Xi+1 where 2. Call a channel C with the property that messages Xi+1 = Xi ∪ {xi+1 } in transit can be trusted to arrive exactly as sent: (3.2) secure. = {x1 , . . . , xi , xi+1 } 3. Call a channel C on which the address An of a node with n is An = H(pkn ), where pkn is the public key of xi+1 = {h, t} the node n, and on which all messages include a digital signature of the message signed by sender: h = {H(t), y} (3.3) authenticated . y = {H(xj )|j < i} 4. Call a data element that is accessible by its hash Call h a header and note how the sequence content addressable. of headers creates a chain (tree, in the general case) by linking each header to the previous For the purposes of this paper we assume untrusted header(s) and the transaction. nodes, i.e., independently acting agents solely under their (b) Di+1 = τD (σi , t) own control, and an insecure channel. We do this be- cause the very raison d’être of the cryptographic tools 5. Let V (t, v) be a function that takes t, along with mentioned above is to allow individual nodes to trust the whole system under this assumption. The cryptography t extra validation data v, verifies the validity of t and only if valid calls a transition function for t. Call immediately makes visible in the state data when any V a validation function. other node in the system uses a version of the functions 6. Let I(t) be a function that takes a transaction t, af evaluates it using a function V , and if valid, uses τ to transform S. Call I the input or stimulus different from itself. This property is often referred to as a trustless system. However, because it simply means that the locus of trust has been shifted to the state data, rather than other nodes, we refer to it as systemic reliance function. on intrinsic data integrity . See IV C for a detailed discussion on trust in distributed systems. Dr 7. Let P (x) be a function that can create transactions t and trigger functions V and τ , and P itself is triggered by state changes or the passage of time. B. Data-Centric and Agent-Centric Systems Call P the processing function. 8. Let C be a channel that allows all nodes in N Using this definition, Bitcoin can be understood as that to communicate and over which each node has a system Ωbitcoin where: unique address An . Call C and the nodes that com- municate on it the network . ! ! 1. ∀n, m ∈ N : Xn = Xm where = means is enforced. 9. Let E(i) be a function that changes functions 2. V (e, v) e is a block and v is the output from the V, I, P . Call E the evolution function. “proof-of-work” hash-crack algorithm, and V con- Explanation: this formalism allows us to model sepa- firms the validity of v, the structure and validity of rately key aspects of agents. e according to the double-spend rules5 . First we separate the agent’s state into a cryptograph- ically secured hash-chain part X and another part that 3. I(t, n) accepts transactions from clients and adds holds arbitrary data D. Then we split the process of up- them to D (the mempool ) to build a block for later dating the state into two steps: 1) the validation of new use in triggering V (). transactions t through the validation function V (t, v), 4. P (i) is the mining process including the “proof-of- and 2) the actual change of internal state S (as either X work” algorithm and composes with V () and τX or D) through the state transition functions τX and τD . when the hash is cracked. Finally, we distinguish between 1) state transitions trig- gered by external events, stimuli, received through I(t), and 2) a node’s internal processing P (x) that also results in calling V and τ with an internally created transaction. 5 pointer here We define some key properties of distributed systems:

3 5. E(i) is not formally defined but can be mapped problem in distributed systems which consist entirely of informally to a decision by humans operating the multiple vantage points by definition. nodes to install new versions of the Bitcoin soft- In the distributed world, events don’t happen in the ware. same sequence for all observers. For Blockchain specif- ically, this is the heart of the matter: choosing which The first point establishes the central aspect of Bit- block, from all the nodes receiving transactions in differ- coin’s (and Blockchain applications’ in general) strategy ent orders, to use for the “consensus,” i.e., what single for solving or avoiding problems otherwise encountered vantage point to enforce on all nodes. Blockchains don’t in decentralized systems, and that is by trying to main- record a universal ordering of events – they manufacture tain a network state in which all nodes should have the a single authoritative ordering of events – by stringing same (local) chain. together a tiny fragment of local vantage points into one By contrast, for Ωgit there is no such constraint on any global record that has passed validation rules. Xn , Xm in nodes n and m matching, as git’s core intent The use of the word consensus seems at best dubious is to allow different agents act autonomously and diver- as a description of a systemic requirement that all nodes gently on a shared code-base, which would be impossible carry identical values of Xn . Especially when the algo- if the states always had to match. rithm for ensuring that sameness is essentially a digital Through the lens of the formalism some other aspects lottery powered by expensive computation of which the of Ωgit can be understood as follows: primary design feature is to randomize which node gets 1. the validation function V (e, v) by default only to run Vn such that no node has preference to which e checks the structural validity of e as a commit ob- gets added to Xn . ject not it’s content (though note that git does also The term consensus, as normally used, implies delib- support signing of commits which is also part of the eration with regard to differences and work on crafting validation) a perspective that holds for all parties, rather than sim- ply selecting one party’s dataset at random. In contrast, t 2. the stimulus function I(t) for Ωgit consists of the as a more agent-centric distributed system, git’s merge set of git commands available to the user command provides for a processes more recognizable as consensus, however it’s not automated. af 3. the state transition function τX is the internal git function that adds a commit object and τD is the git function that adds code to the index triggered Perhaps a more accurate term for the hash-crack al- gorithm applied in Ωbitcoin would be “proof-of-luck” and for the process itself simply sameness, not consensus. If by add you start from a data-centric viewpoint, which naturally 4. E is, similarly to Ωbitcoin , not formally defined for throws out the “experience” of all agents in favor of just Dr Ωgit . one, it’s much harder to design them to engage in pro- cesses that actually have the real-world properties of con- We leave a more in depth application of the formalism sensus. If the constraint of keeping all nodes’ states the to Ωgit as an excercise for the reader, however we under- same were adopted consciously as a fit for a specific pur- score that the core difference between Ωbitcoin and Ωgit pose, this would not be particularly problematic. Un- ! lies in the formers constraint of ∀n, m ∈ N : Xn = Xm . fortunately the legacy of this data-centric viewpoint has One direct consequence of this for Ωbitcoin is that as the been held mostly unconsciously and is adopted by more size of Xn grows, necessarily all nodes of Ωbitcoin must generalized distributed computing systems, for which the grow in size, whereas this is not necessarily the case for intent doesn’t specifically include the need to model “dig- Ωgit and in it lies the core of Bitcoin’s scalability issues. ital matter” with universally absolute location. While It’s not surprising that a data-centric approach was having the advantages of conceptual simplicity, it also im- used for Bitcoin. This comes from the fact that its stated mediately creates scalability issues, but worse, it makes intent was to create digitally transferable “coins,” i.e., it hard to take advantages inherent in the agent-centric to model in a distributed digital system that property approach. of matter known as location. On centralized computer systems this doesn’t even appear as a problem because centralized systems have been designed to allow us to IV. GENERALIZED DISTRIBUTED think from a data-centric perspective. They allow us COMPUTATION to believe in a kind of data objectivity, as if data exists, like a physical object sitting someplace having a location. The previous section described a general formalism for They allow us to think in terms of an absolute frame - as if distributed systems and compared git to Bitcoin as an ex- there is a correct truth about data and/or time sequence, ample of an agent-centric vs. a data-centric distributed and suggests that “consensus” should converge on this system. Neither of these systems, however, provides gen- truth. In fact, this is not a property of information. Data eralized computation in the sense of being a framework exists always from the vantage point of an observer. It is for writing computer programs or creating applications. this fact that makes digitally transferable “coins” a hard So, lets add the following constraints to formalism III A

4 as follows: 2. Let M be a virtual machine used to execute code. 1. With respect to a machine M , some values of Sn 3. Let the initial entry of all Xn in N can be interpreted as: executable code and the re- be identical and consist in the set sults of code execution, and they may be accessible DNA{e1 , e2 , . . . , f1 , f2 , . . . , p1 , p2 , . . . } where to M and the code. Call such values the machine ex are definitions of entry types that can be state. added to the chain, fx are functions defined as executable on M (which we also refer to as the 2. ∃t and nodes n such that In (t) will trigger execution set Fapp = {app1 , app2 , . . . }), and px are system of that code. Call such transaction values calls. properties which among other things declare the expected operating parameters of the application being specificed. For example the resilience factor A. Ethereum as defined below is set as one such property. Ethereum6 provides the current premier example of 4. Let ιn be the second entry of all Xn and be a set generalized distributed computing using the Blockchain of the form {p, i} where p is the public key and i model. The Ethereum approach comes from an ontology is identifying information appropriate to the use of of replicating the data certainty of single physical com- this particular Ωhc . Note that though this entry is puter, on top of the stratum of a bunch of distributed of the same format for all Xn it’s content is not the nodes using the blockchain strategy of creating a sin- same. Call this entry the agent identity entry. gle data reality in a cryptographic chain, but commiting computations, instead of just monetary transactions as 5. ∀ex ∈ DN A let there be an appx ∈ Fapp which can in bitcoin, into the blocks. be used to validate transactions that involve entries This approach does live up to the constraints listed of type ex . Call this set Fv or the application above as described by Wood [EIP-150] where the bulk validation functions. of that paper can be understood as a specification of a validation function Vn () and the described state transi- t6. Let there be a function Vsys (ex, e, v) which checks that e is of the form specified by the entry definition af tion function σt+1 ≡ Υ(σ, T ) as a specification of how constraints above are met. Unfortunately the data-centric legacy inherited by Ethereum from the blockchain model, is immediately ob- for ex ∈ DNA. Call this function the system entry validation function. 7. Let W the overall validation function V (e, v) ≡ servable in its high compute cost7 and difficulty in scal- x Fv (ex )(v) ∧ Vsys (ex , e, v). ing8 . Dr 8. Let FI be a subset of Fapp distinct from Fv such that ∀fx (t) ∈ FI there exists a t to I(t) that will B. Holochain trigger fx (t). Call the functions in FI the exposed functions. We now proceed to describe an agent-centric dis- tributed generalized computing system, where nodes can 9. Call any functions in Fapp not in Fv or FI internal still confidently participate in the system as whole even functions and allow them to be called by other though they are not constrained to maintaining the same functions. chain state as all other nodes. In broad strokes: a Holochain application consists of 10. Let the channel C be authenticated . a network of agents maintaining a unique source chain 11. Let DHT define a distributed hash table on an au- of their transactions, paired with a shared space imple- thenticated channel as follows: mented as a validating, monotonic, sharded, distributed hash table (DHT) where every node enforces validation (a) Let ∆ be a set {δ1 , δ2 , . . . } where δx is a rules on that data in the DHT as well as providing prove- set {key, value} where key is always the hash nance of data from the source chains where it originated. H(value) of value. Call ∆ the DHT state. Using our formalism, a Holochain based application Ωhc is defined as: (b) Let FDHT be the set of functions {dhtput , dhtget } where: 1. Call Xn the source chain of n. i. dhtput (δkey,value ) adds δkey,value to ∆ ii. dhtget (key) = value of δkey,value in ∆ (c) Assume x, y ∈ N and δi ∈ ∆x but δi ∈ / ∆y . 6 https://github.com/ethereum/wiki/wiki/White-Paper Allow that when y calls dhtget (key), δi will be 7 link to our benchmarkng retrieved from x over channel X and added to 8 find a scholarly article ∆y .

5 DHT are sufficiently mature that there are a num- Call the union of such sets Vδ , from a given ber of ways to ensure property 11c. For our cur- node’s perspective, the overlap list and also rent alpha version we use a modified version of note that q ≥ r. [Kademlia] as implemented in [LibP2P]. (i) Allow every node n to discard every δx ∈ ∆n if the number of closer (with regards to d(x, y)) 12. Let DHThc augment DHT as follows: nodes is greater than q (i.e. if other nodes are (a) ∀δkey,value ∈ ∆ constrain value to be of able to construct their Vδ sets without includ- an entry type as defined in DNA. Furth- ing n, which in turn means there are enough more, enforce that any function call dhtx (y) other nodes responsible for holding δ in their which modifies ∆ also uses Fv (y) to validate ∆m to have the system meet the resilience set y and records whether it is valid. Note that by r even without n participating in storing δ). this validation phase may include contacting Note that this results in the network adapt- the source nodes involved in generating y to ing to changes in topology and DHT state mi- gather more information about the context of grations by regulating the number of network- the transaction, see IV C 2. wide redundant copies of all δi ∈ ∆ to match r according to node uptime. (b) Enforce that all elements of ∆ only be changed monotonically, that is, elements δ can only be Call DHThc a validating , monotonic, sharded added to ∆ not removed. DHT. (c) Include in FDHT the functions defined in A. 13. ∀n ∈ N assume n implements DHThc , that is: ∆ is (d) Allow the sets δ ∈ ∆ to also include more a subset of D (the non hash-chain state data), and elements as defined in A. FDHT are available to n, though note that these (e) Let d(x, y) be a symmetric and unidirectional functions are NOT directly available to the func- tions Fapp defined in DNA. t distance metric within the hash space defined by H, as for example the XOR metric defined 14. Let Fsys be the set of functions in [Kademlia]. Note that this metric can be {syscommit , sysget , . . . } where: af applied between entries and nodes alike since the addresses of both are values of the same hash function H (i.e. δkey = H(δvalue ) and (a) syscommit (e) uses the system validation func- tion V (e, v) to add e to X , and if successful An = H(pkn )). calls dhtput (H(e), e). (f) Let r be a parameter of DHThc to be set de- (b) sysget (k) = dhtget (k). Dr pendent on the characteristics deemed benefi- (c) see additional system functions defined in B. cial for maintaining multiple copies of entries in the DHT for the given application. Call r 15. Allow the functions in Fapp defined in the DNA to the resilience factor . call the functions in Fsys . (g) Allow that each node can maintain a set M = 16. Let m be an arbitrary message. Include in Fsys {mn , . . . } of metrics mn about other nodes, the function syssend (Ato , m) which when called on where each mn contains both a node’s direct nfrom will trigger the function appreceive (Afrom , m) experience of n with respect to that metric, in the DNA on the node nto . Call this mechanism as well as the experience of other nodes of n. node-to-node messaging . Enforce that one such metric kept is uptime which keeps track of the percentage of time a 17. Allow that the definition of entries in DNA can node is experienced to be available. Call the mark entry types as private. Enforce that if an process of nodes sharing these metrics gossip entry σx is of such a type then σx ∈ / ∆. Note and refer to IV C 3 for details. however that entries of such type can be sent as node-to-node messages. (h) Enforce that ∀δ ∈ ∆n each node n maintains a set Vδ = {n1 , . . . , nq } of q closest nodes to δ 18. Let the system processing function P (i) be a set of as seen from n, which are expected by n to also functions in Fapp to be registered in the system as hold δ. Resiliency is maintained by taking into callbacks based on various criteria, e.g. notification account node uptimes and choosing the value of rejected puts to the DHT, passage of time, etc. of q so that: q X C. Systemic Integrity Through Validation uptime(ni ) ≥ r (4.1) i=0 The appeal of the data-centric approach to distributed whith uptime(n) ∈ [0, 1]. computing comes from the fact that if you can prove that

6 all nodes reliably have the same data then that provides only have considered confidence in (RC ). Still unclear is strong general basis from which to prove the integrity how to measure a concrete confidence level Ψα . In real- of the system as a whole. In the case of Bitcoin, the world contexts and for real-world decisions, confidence is X holds the transactions and the unspent transaction mainly dependent on an (human) agent’s vantage point, outputs, which allows nodes to verify future transactions set of data at hand, and maybe even intuition. Thus against double-spend. In the case of Ethereum, X holds we find it more adequate to call it a soft criteria. In what ammounts to pointers to machine state. Proving order to comprehend this concept objectively and relate the consistency across all nodes of those data sets is fun- it to the notion conveyed by Woods in the quote above, damental to the integrity of those systems. we proceed by defining the measure of confidence of an However, because we have started with the assump- aspect α as the conditional probability of it being the tion (see III A) of distributed systems of independently case in a given context: ! acting agents, any proof of ∀n, m ∈ N : Xn = Xm in a Ψα ≡ P(α|C) (4.3) blockchain based system is better understood as a choice ! (hence our use of the =), in that nodes use their agency where the context C models all other information avail- to decide when to stop interacting with other nodes based able to the agent, including basic and intuitive assump- on detecting that the X state no longer matches. This tions. might also be called “proof by enforcement,” and is also Consider the fundamental example of cryptographi- appropriately known as a fork because essentially it re- cally signed messages with asymetric keys as applied sults in partitioning of the network. throughout the field of cryptographic systems (basically The heart of the matter has to do with the trust any what coins the term crypto-currency). The central aspect single agent has is in the system. In [EIP-150] Section in this context we call αsignature which provides us with 1.1 (Driving Factors) we read: the ability to know with certainty that a given message’s Overall, I wish to provide a system such that real author Authorreal is the same agent indicated solely users can be guaranteed that no matter with t which other individuals, systems or organizations via locally available data in the message’s meta infor- they interact, they can do so with absolute con- mation through the cryptographic signature Authorlocal . fidence in the possible outcomes and how those We gain this confidence because we deem it very hard for outcomes might come about. af The idea of “absolute confidence” here seems impor- tant, and we attempt to understand it more formally and any agent not in possession of the private key to create a valid signature for a given message. αsignature ≡ Authorreal = Authorlocal (4.4) generally for distributed systems. 1. Let Ψα be a measure of the confidence an agent has The appeal of this aspect is that we can check author- Dr in various aspects of the system it participates in, ship locally, i.e., without the need of a 3rd party or direct where 0 ≤ Ψ ≤ 1, 0 represents no confidence, and trusted communication channel to the real author. But, 1 represents absolute confidence. the confidence in this aspect of a certain cryptographic system depends on the context C: 2. Let Rn = {α1 , α2 , ... . . . } define a set of aspects about the system with which an agent n ∈ N mea- Ψsignature = P(Authorreal = Authorlocal |C) (4.5) sures confidence. Call Rn the requirements of n with respect to Ω. If we constrain the context to remove the possibility of an adversary gaining access to an agent’s private key and 3. Let εn (α) be a thresholding function for node n ∈ also exclude the possible (future) existence of computing N with respect to α such that when Ψα < ε(α) devices or algorithms that could easily calculate or brute then n will either stop participating in the system, force the key, we might then assign a (constructed) confi- or reject the participation of others (resulting in a dence level of 1, i.e., “absolute confidence”. Without such fork). constraints on C, we must admit that Ψsignature < 1, which real world events, for instance the Mt.Gox hack 4. Let RA and Let RC be partitions of R where from 20149 , make clear. ∀α ∈ RA : ε(α) = 1 We aim to describe these relationships in such detail (4.2) in order to point out that any set RA of absolute ∀α ∈ RC : ε(α) < 1 requirements can’t reach beyond trivial statements - so any value of Ψ 6= 1 is rejected in RA and any statements about the content and integrity of the local value Ψ < ε(α) is rejected in RC . Call RA the state of the agent itself. Following Descarte’s way of absolute requirements and RC the considered requirements. 9 So we have formally separated system characteristics ”Most or all of the missing bitcoins were stolen straight out of the that we have absolute confidence in (RA ) from those we Mt. Gox hot wallet over time, beginning in late 2011” [Nilsson15]

7 questioning the confidence in every thought, we project expend differing levels of resources to validate. We de- his famous statement cogito ergo sum into the reference signed Holochain to allow such validation functions to be frame of multi-agent systems by stating: Agents can set contextually per application and expose these con- only have honest confidence in the fact that they texts explicitly. Thus, one could conceivably build a perceive a certain stimulus to be present and Holochain application that deliberately makes choices in whether any particular abstract a priori model its validation functions to implement either all or par- matches that stimulus without contradiction, i.e., tial characteristics of Blockchains. Holochain, therefore, that an agent sees a certain piece of data and that it is can be understood as a framework that opens up a spec- possible to interpret it in a certain way. Every conclusion trum of decentralized application architectures in which being drawn a posteriori through the application of Blockchain happens to be one specific instance at one end sophisticated models of the context is dependent on of this spectrum. assumptions about the context that are inherent to the In the following sections we will show what categories model. This is the heart of the agent-centric outlook, of validation algorithms exist and how these can be and what we claim must always be taken into account stacked on top of each other in order to build decen- in the design of decentralized multi-agent systems, as tralized systems that are able to maintain integrity with- it shows that any aspect of the system as a whole that out introducing an absolute truth every agent would be includes assumptions about other agents and non-local forced to accept or consider. events must be in RC , i.e., have an a priori confidence of Ψ < 1. Facing this truth about multi-agent systems, we find little value in trying to force an absolute truth ! 1. Intrinsic Data Integrity ∀n, m ∈ N : Xn = Xm and we instead frame the problem as: Every application but the most low-level routines uti- We wish to provide generalized means by which lize non-trivial, structured data types. Structured im- t decentralized multi-agent systems can be built so plies the existence of a model describing how to interpret that: raw bits as an instance of a type and how pieces of the structure relate to each other. Often, this includes cer- 1. fit-for-purpose solutions can be applied in order dences Ψα , af to optimize for application contextualized confi- 2. violation of any threshold ε(α) through the ac- tain assumptions about the set of possible values. Cer- tain value combinations might not be meaningful or vio- late the intrinsic integrity of this data type. tions of other agents can be detected and managed Consider the example of a cryptographically signed by any agent, such that message m = {body, signature, author}, where author Dr 3. the system integrity is maintained at any point in is given in the form of their public key. This data type time or, when not, there is a path to regain it (see conveys the assumption that the three elements body, ??). signature and author correspond to each other as con- strained by the cryptographic algorithm that is assumed We perceive the agent-centric solution to these re- to be determined through the definition of this type. The quirements to be the holographic management of system- intrinsic data integrity of a given instance can be val- integrity within every agent/node of the system through idated just by looking at the data itself and checking application specific validation routines. These sets of val- the signature by applying the cryptographic algorithm idation rules lie at the heart of every decentralized ap- that constitutes the central part of the type’s a priori plication, and they vary across applications according to model. The validation yields a result ∈ {true, f alse} context. Every agent carefully keeps track of their repre- which means that the confidence in the intrinsic data in- sentation of that portion of reality that is of importance tegrity is absolute, i.e. Ψintrinsic = 1. to them - within the context of a given application that Generally, we define the intrinsic data integrity has to manage the trade-off between having high confi- of a transaction type φ as an aspect αφ,intrinsic ∈ RA , dence thresholds ε(α) and a low need for resources and expressed through the existence of a deterministic and complexity. local validation function Vα (t) for transactions t ∈ φ that For example, consider two different use cases of trans- does not depend on any other inputs but t itself. actions: Note how the intrinsic data integrity of the message ex- 1. receipt of an email message where we are trying to ample above does not make any assumptions about any validate it as spam or not and message’s real author, as the aspect αsignature from the previous section does. With this definition, we focus on 2. commit of monetary transaction where we are try- aspects that don’t make any claims about system prop- ing to validate it against double-spend. erties non-local to the agent under consideration, which roots the sequence of inferences that constitutes the va- These contexts have different consequences that an agent lidity and therefore confidence of a system’s high-level may wish to evaluate differently and may be willing to aspects and integrity in consistent environmental inputs.

8 2. Membranes & Provenance transactions. This resembles how knowledge gets con- structed within social fields and through interaction with Distributed systems must rely on mechanisms to others, as described by the sociological theory of social restrict participation by nodes in processes that without constructivism. such restriction would compromise systemic integrity. The properties of the DHT in conjunction with the Systems where the restrictions are based on the nodes’ hash function provide us with a deterministically defined identity, whether that be as declared by type or author- set of nodes, i.e., a neighborhood for every transaction. ity, or collected from the history of the nodes’ behaviors, One cannot easily construct a transaction such that it are know as permissioned [Swanson15]. Systems lands in a given neighborhood. Formally: where these restrictions are not based on properties of the nodes themselves are known as permissionless. ∀t ∈ ∆ : ∃η : H → N r (4.6) In permissionless multi-agent systems, a principle η(H(t)) = (n1 , n2 , . . . , nr ) threat to systemic integrity comes from Sybil-Attacks [Douceur02], where an adversary tries to overcome the where the function η maps from the range H of the hash system’s validation rules by spawning a large number of function H to the r nodes that keep the r redundant compromised nodes. shards of the given transaction t (see 12i). Having the list of nodes η(H(t)) allows an agent to However, for both permissioned and permissionless compare third-party viewpoints regarding t, with its own systems, mechanisms exists to gate participation. For- and that of the transaction’s source(s). The random- mally: ization of the hash function H ensures that those view- points represent an unbiased sample. r can be adjusted Let M (n, φ, z) be a binary function that evaluates depending on the application’s constraints and the cho- whether transactions of type φ submitted by n ∈ N sen trade-off between costs and system integrity. These are to be accepted, and where z is any arbitrary extra properties provide sufficient infrastructure to create sys- t information needed to make that evaluation. Call M tem integrity by detecting nodes that don’t play by the the membrane function, and note that it will be a rules - like changing the history or content of their source component of the validation function V (t, v) from the chain. In appendix C we detail tooling appropriate for initial formalism5. af In the case of Ωbitcoin and Ωethereum , M ignores the different contexts, including ones where detailed analysis of source chain history is required - for example financial transaction auditing. value of n and makes its determination solely on whether Depending on the application’s domain, neighbor- z demonstrates the “proof” in proof-of-X be it work or hoods could become vulnerable to Sybil-Attacks because Dr stake which is a sufficient gating to protect against Sybil- a sufficiently large percentage of compromised nodes Attacks. could introduce bias into the sample used by an agent Giving up the data-centric fallacy of forcing one ab- to evaluate a given transaction. Holochain allows ap- ! solute truth ∀n, m ∈ N : Xn = Xm reveals that we plications to handle Sybil-Attacks through domain spe- can’t discard transaction provenance. Agent-centric dis- cific membrane functions. Because we chose to inher- tributed systems instead must rely on two central facts ently model agency within the system, permission can about data: be granted or declined in a programmatic and decentral- ized manner thus allowing applications to appropriately 1. it originates from a source and land on the spectrum between permissioned and permis- sionless. 2. its historical sequence is local to that source. In appendix D, we provide some membrane schemes that can be chosen either for the outer membrane of that For this reason, Ωhc splits the system state data into two application that nodes have to cross in order to talk to parts: any other node within the application or for any sec- ondary membrane inside the application. That latter 1. each node is responsible to maintain its own entire means that nodes could join permissionless and partic- Xn or source chain and be ready to confirm that ipate in aspects of the application that are not integrity state to other nodes when asked and critical without further condition but need to provide cer- 2. all nodes are responsible to share portions of other tain criteria in order to pass the membrane into applica- nodes’ transactions and those transactions’ meta tion crucial validation. data in their DHT shard - meta data includes Thus, Holochain applications maintain systemic in- validity status, source, and optionally the source’s tegrity without introducing consensus and therefore chain headers which provide historical sequence. (computationally expensive) absolute truth because 1) any single node uses provenance to independently verify Thus, the DHT provides distributed access to others’ any single transaction with the sources involved in that transactions and their evaluations of the validity of those transaction and 2) because each Holochain application

9 runs independently of all others, they are inherently per- recieved from those nodes will result in changing missioned by application specific rules for joining and mother . continuing participation in that application’s network. These both provide the benefit that any given Holochain 5. ∀m ∈ M let the function Gabout (m) return a set of application can tune the expense of that validation to a nodes important for a node to gossip about defined contextually appropriate level. by the properties of m. 6. Define subsets of Gwith (m) according to a corre- lation with what it means to have low vs. high 3. Gossip & World Model confidence value c: So far, we have focused on those parts of the valida- (a) Pull: consisting of nodes about which a low tion function V used to verify elments of X . However, confidence means a need for more frequent maintaining system integrity in distributed systems also gossip to raise a node’s confidence. Such nodes requires that nodes have mechanisms sharing informa- would include those for which, with respect tion about nodes that have broken the validation rules to the given node, hold its published entries, so that they can be excluded from participation. There hold entries it is also responsible for holding, exist, additionally, forms of bad-acting that do not live in are close the then node (i.e. in its lowest k- the content of a transaction but in the patterns of trans- bucket), and which it relies on for routing (i.e. acting that are detrimental to the system, for example, a subset of each k-bucket) denial of service attacks. (b) Push: consisting of nodes about which a high Holochain uses gossip for nodes to share information confidence implies a need for more frequent about their own experience of the behavior of other gossip to spread the information about that nodes. Informally we call this information the node’s node. Such nodes would include ones for world model . In this section we describe the nature which a given node has high confidence is a t of Holochain’s gossip protocols and how they build and bad actor, i.e. it has directly experienced bad maintain a node’s world model. acting, or has recevied bad actor gossipe from In 12f we described one such part of the world model, nodes that it has high confidence in being able af the uptime metric and how it is used for maintaing redun- dant copies of entries. In IV C 2 we defined a membrane function that determines if a node shall accept a trans- to make that bad actor evaluation. 7. TODO: describe a gossip trigger function based on the pull vs. pull distinction that demostrates when action and allowed that function to take arbitrary data z. The main source of that data comes from this world gossip happens Dr model. The computational costs of gossip depend on the set of More formally: metrics that a particular application needs to keep track 1. Recall that each node maintains a set M of metrics of to maintain system integrity. For an application with a m about other nodes it knows about. Note that in very strong membership membrane perhaps only uptime terms of our formalism, this world model is part of metrics are necessary to gossip about to balance resil- each node’s non-chain state data D. lience. But this too may depend on apriori knowledge of the nodes involved in the application. Applications with 2. Let m be a tuple of tuples: ((µ, c)self , (µ, c)others ) very loose membership membranes may have a substan- which record an experience µ of a node with re- tial number of metrics and complex membrane functions spect to a given metric and a confidence c of using those metrics which may require substantial com- that exprience, both as directly experienced or as pute effort. The Holochain design intentionally leaves ”hearsay” recieved from other nodes. these parameters only loosly specificed so that applica- tions can be built fit for purpose. 3. Allow a class of entries stored in Xn be used also as a metric mw which act as a signed declaration of the experience of n regarding some other node. 4. CALM & Logical Monotonicity Call such entries warrants. These warrants al- low us to use the standard tooling of Holochain TODO: description of CALM in multi-agent systems, to make provenance based, verifyable claims about and how it works in our case other nodes in the network, which propagate or- thogonally from the usual DHT methods, via gossip to nodes that need to ”hear” about these claims so V. COMPLEXITY IN DISTRIBUTED SYSTEMS as to make decisions about interacting with nodes. 4. ∀m ∈ M let the function Gwith (m) return a set In this section we discuss the complexity of our pro- of nodes important for a node to gossip with de- posed architecture for decentralized systems and compare fined by a probabilistc weighting that information it to the increasingly adopted Blockchain pattern.

10 Formally describing the complexity of decentralized transaction at least flood through 90% of the network, multi-agent systems is a non-trivial task for which more block size and time can’t be pushed beyond 4MB and complex approaches have been suggested ([Marir2014]). 12s respectively, according to [Croman et al 16]. This might be the reason why there happens to be unclarity and misunderstandings within communities dis- cussing complexity and scalability of Bitcoin for example B. Ethereum [Bitcoin Reddit]. In order to be able to have a ball-park comparison be- tween our approach and the current status quo in decen- Let ΩEthereum be the Ethereum main network, n be tralized application architecture, we proceed by model- the number of transactions and m the number of full- ing the worst-case time complexity both for a single node clients within in the network. ΩSystemN ode as well as for the whole system ΩSystem and The time complexity of processing a single transaction both as functions of the number of state transitions (i.e., on a single node is a function of the code that has its transactions) n and the number of nodes in the system execution being triggered by the given transaction plus m. a constant: c + ftxi (n, m) (5.4) A. Bitcoin Similarly to Bitcoin and as a result of the Blockchain de- sign decision to maintain one single state (∀n, m ∈ N : Let ΩBitcoin be the Bitcoin network, n be the number ! of transactions and m be the number full validating nodes Xn = Xm , “This is to be avoided at all costs as the uncer- (i.e., miners 10 ) within ΩBitcoin . tainty that would ensue would likely kill all confidence in For every new transaction being issued, any given node the entire system.” [EIP-150]), every node has to process will have to check the transaction’s signature (among every transaction being sent resulting in a time complex- t other checks, see. [BitcoinWiki]) and especially check if ity per node as this transaction’s output is not used in any other transac- n X tion to reject double-spendings, resulting in a time com- plexity of c+n af (5.1) that is c+ i=0 ftxi (n, m) (5.5) per transaction. The time complexity in big-O notation ΩEthereumN ode ∈ O(n · favg (n, m)) (5.6) per node as a function of the number of transactions is Dr therefore: whereas users are incentivized to hold the average com- plexity favg (n, m) of the code being run by Ethereum ΩBitcoinN ode ∈ O(n2 ) (5.2) small since execution has to be payed for in gas and which is due to restrictions such as the block Pn gas limit. In other The complexity handled by one Bitcoin node does not 11 words, because of the complexity i=0 ftxi (n, m) being depend on m the number of total nodes of the system. burdened upon all nodes of the system, other systemic But since every node has to validate exactly the same properties have to keep users from running complex code set of transactions, the system’s time complexity as a on Ethereum so as to not bump into the network’s limits. function of number of transactions and number of nodes Again, since every node has to process the same set of results as all transactions, the time complexity of the whole system then is that of one node multiplied by m: ΩBitcoin ∈ O(n2 m) (5.3) ΩEthereum ∈ O(nm · ftxi (n, m)) (5.7) Note that this quadratic time complexity of Bitcoin’s transaction validation process is what creates its main bottleneck as this reduces the network’s gossip band- width since every node has to validate every transaction C. Blockchain before passing it along. In order to still have an average Both examples of Blockchain systems above do need a non-trivial computational overhead in order to work 10 at all: the proof-of-work, hash-crack process also called For the sake of simplicity and focusing on a lower bound of the mining. Since this overhead is not a function of either system’s complexity, we are neglecting all nodes that are not crucial for the operation of the network, such as light-clients and the number of transactions nor directly of the number of clients not involved in the process of validation nodes, it is often omitted in complexity analysis. With 11 not inherently - that is more participants will result in more the total energy consumption of all Bitcoin miners today transactions but we model both values as separate parameters being greater than the country of Iceland [Coppock17],

11 neglecting the complexity of Blockchain’s consensus al- complexity of gorithm seems like a silly mistake. Blockchains set the block time, the average time be- c + ⌈log(m)⌉. (5.9) tween two blocks, as a fixed parameter that the system keeps in homeostasis by adjusting the hash-crack’s diffi- After receiving the state transition data, this node will culty according to the network’s total hash-rate. For a gossip with its q neighbors which will result in r copies given network with a given set of mining nodes and a of this state transition entry being stored throughout the given total hash-rate, the complexity of the hash-crack system - on r different nodes. Each of these nodes has to is constant. But as the system grows and more miners validate this entry which is an application specific logic come on-line, which increases the networks total hash- of which the complexity we shall call v(n, m). rate, the difficulty needs to increase in order to keep the Combined, this results in a system-wide complexity per average block time constant. state transition as given with With this approach, the benefit of a higher total hash- rate xHR is an increased difficulty of an adversary to c + ⌈log(m)⌉ +q + r · v(n, m) (5.10) | {z } | {z } influence the system by creating biased blocks (which DHT lookup validation would render this party able to do double-spend attacks). That is why Blockchains have to subsidize mining, de- which implies the following whole system complexity in pending on a high xHR as to make it economically im- O-notation possible for an attacker to overpower the trusted miners. So, there is a direct relationship between the network’s ΩHolochain ∈ O(n · (log(m) + v(n, m)) (5.11) total trusted hash-rate and its level of security against mining power attacks. This means that the confidence Now, this is the overall system complexity. In or- ΨBlockchain any agent can have in the integrity of the der to enable comparison, we reason that in the case of system is a function of the system’s hash-rate xHR , and Holochain without loss of generality (i.e., dependent on t more precisely, the cost/work cost(xHR ) needed to pro- the specific Holochain application), the load of the whole vide it. Looking only at a certain transaction t and given system is shared equally by all nodes. Without further any hacker acts economically rationally only, the confi- assumptions, for any given state transition, the probabil- dence in t being added to all Xn has an upper bound in af ity of it originating at a certain node is m1 , so the term for the lookup complexity needs to be divided by m to describe the average lookup complexity per node. Other cost(xHR ) than in Blockchain systems where every node has to see ΨBlockchain (t) < min 1, (5.8) every transaction, for the vast majority of state tran- value(t) sitions one particular node is not involved at all. The Dr In order to keep this confidence unconstrained by stochastic closeness of the node’s public key’s hash with the mining process and therefore the architecture of the entry’s hash is what triggers the node’s involvement. Blockchain itself, cost(xHR ) (which includes the setup We assume the hash function H to show a uniform dis- of mining hardware as well as the energy consumption) tribution of hash values which results in the probability has to grow linearly with the value exchanged within the of a certain node being one of the r nodes that cannot 1 system. discard this entry to be m times r. The average time complexity being handled by an average node then is n D. Holochain ΩHolochainN ode ∈ O · (log(m) + v(n, m)) (5.12) m n Let ΩHC be a given Holochain system, let n be the sum Note that the factor m represents the average number of of all public12 (i.e., put to the DHT) state transitions state transactions per node (i.e., the load per node) and (transactions), let all agents in ΩHC trigger in total, and that though this is a highly application specific value, it let m be the number of agents (= nodes) in the system. is an a priori expected lower bound since nodes have to process at least the state transitions they produce them- Putting a new entry to the DHT involves finding a selves. node that is responsible for holding that specific entry, which in our case according to [Kademlia] has a time The only overhead that is added by the architecture of this decentralized system is the node look-up with its complexity of log(m). The unknown and also application specific complex- 12 ity v(n, m) of the validation routines is what could drive private (see:17) state transitions, i.e., that are confined to a lo- cal Xn , are completely within the scope of a node’s agency and up the whole system’s complexity still. And indeed it don’t affect other parts of the system directly and can therefore is conceivable to think of Holochain applications with a be omitted for the complexity analysis of ΩHC as a distributed lot of complexity within their validation routines. It is system basically possible to mimic Blockchain’s consensus vali-

12 dation requirement by enforcing that a validating node 1. 30k+ lines of go code. communicates with all other nodes before adding an en- try to the DHT. It could as well only be half of all nodes. 2. DHT: customized version of libp2p/ipfs’s kademlia And there surely is a host of applications with only little implementation. complexity - or specific state transitions within an appli- 3. Network Transport: libp2p including end-to-end cation that involve only little complexity. In a Holochain encryption. app one can put the complexity where it is needed and keep the rest of the system fast and scalable. 4. Javascript Virtual Machine: otto In section VI we proceed by providing real-world use https://github.com/robertkrimen/otto. cases and showing how non-trivial Holochain applications can be built that get along with a validation complexity 5. Lisp Virtual Machines: zygomys of O(1), resulting in a total time complexity per node https://github.com/glycerine/zygomys. in O(log(m)) and a high enough confidence in integrity without introducing proof-of-work at all. Additionally we have created a benchmarking suite to examine the processing, bandwidth and storage used in various scenarios, and compared these with Ethereum VI. USE CASES applications in similar scenarios. These can be seen here: https://github.com/holochain/benchmarks We have yet to implement scalability tests for large Now we present a few use cases of applications built scale applications, but it is in our roadmap.TODO on Holochain, considering the context of the use case and how it affects both complexity and evaluation of integrity and thus validation design. Appendix A: DHThc t 1. dhtputLink (base, link, tag) where base and link are A. Social Media keys and where tag is an arbitrary string, which associates the tuple {link,tag} with the key base. af Consider a simple implementation of micro-blogging using Holochain where: 1. FI = {fpost (text, node), ffollow (node), fread (text)} 2. dhtgetLinks (base, tag) where base is a key keys and where tag is an arbitrary string, which returns the set of links on base identified by tag. and 3. dhtmod (key, newkey) where key and newkey are Dr 2. FV = {fisOriginator } keys, which adds newkey as a modifier of σkey ∈ ∆ and calls dhtputLink (key, newkey, “replacedby”). describe O(1) complexity 4. dhtdel (key) where key is a key, and marks σkey ∈ ∆ as deleted. B. Identity 5. modification to dhtget re mod & del. DPKI Appendix B: Fsys C. Money 1. all the other sys functions... mutual-credit vs. coins where the complexity of the transaction is higher, complexity may be O(n2 ) or Appendix C: Patterns of Trust Management O(log(n)) see holo currency white paper: [? ] Tools in Holochain available to app developers for use in Considered Requirements, some of which are also used VII. IMPLEMENTATION at the system level and globally parameterized for an application: At the time of this writing we have a fully operational implementation of system as described in this paper, 1. Countersigning TODO that includes two separate virtual machines for writing DNA functions in JavaScript, or Lisp, along with proof- 2. Notaries TODO – “The network is the notary.” of-concept implementations of a number of applications 3. Publish Headers e.g. for chain-rollback detection including a twitter clone, a slack-like chat system, DPKI, and a set mix-in libraries useful for building applications. 4. Source-chain examination. TODO

13 5. Blocked-lists. e.g. DDOS, spam, etc ments/passports/identity cards within the agent entry (second entry in X ). 6. ... more here... • Proof-of-Service Cryptographic proof of delivery of a service / host- ing of an application. We intend to leverage this Appendix D: Membranes technique with our distributed cloud hosting ap- plication Holo, which we will build on top of Holochain. See our Holo Hosting white paper for • Invitation much more detail [? ]. One of the most natural approaches for membrane crossing in a space in which agents provide identity • Proof-of-Work is to rely on invitation by agents that are already If the application’s requirement is not anonymity, in the membrane. This could be invitation: other than the cryptographic hash-cracking work applied in most of the Blockchains, this could also – by anyone be useful work that new members are asked to con- tribute to the community or a puzzle to proof do- – by an admin (that could either be set in the main knowledge. Examples are: application’s DNA or a variable shared within the DHT - both could be mutable or constant) – Test for knowledge about local maps to proof citizenship – by multiple users (applying social triangula- tion) – DNA sequencing – Protein folding • Proof-of-Identity / Reputation – SETI t Given the presence of other applications/chains, – Publication of scientific article these can be used to attach the identity and its • Proof-of-Stake / Payment reputation in that chain to the agent that wants af to join. Since this seems to be a crucial pillar of the ecosystem of Holochain applications, we plan to deliver a system-level application called DPKI (dis- Depost or payment to have agent certified. • Immune System Blacklisting of nodes that don’t play by the appli- tributed public key infrastructure) that will func- cation rules. tion as the main identity and reputation platform. A prototype of this app was already developed prior Dr to the writing of this paper. ACKNOWLEDGMENTS • Proof-of-Presence We thank Steve Sawin for his review of this paper, Use of notarized national docu- LATEX 2ε support and so much more.... . . [DUPONT] Quinn DuPont. Experiments in Algorithmic Financial Cryptography and Data Security, Springer Ver- Governance: A history and ethnography of The DAO, a lag 2016 failed Decentralized Autonomous Organization [Bitcoin Reddit] /u/mike hearn, /u/awemany, /u/nullc et al. http://www.iqdupont.com/assets/documents/ https://www.reddit.com/r/Bitcoin/comments/ DUPONT-2017-Preprint-Algorithmic-Governance.pdf 3a5f1v/mike_hearn_on_those_who_want_all_scaling_ [EIP-150] Gavin Wood. Ethereum: A Secure Decentralised to_be/csa7exw/?context=3&st=j8jfak3q&sh=6e445294 Generalised Transaction Ledger. Reddit discussion 2015 http://yellowpaper.io/ [Marir2014] Marir, Toufik and Mokhati, Farid and [Kademlia] Petar Maymounkov and David Mazieres Kadem- Bouchelaghem-Seridi, Hassina and Tamrabet, Zouheyr”, lia: A Peer-to-peer Information System Base on the Complexity Measurement of Multi-Agent Systems”, Mul- XOR Metric tiagent System Technologies: 12th German Conference, https://pdos.csail.mit.edu/~petar/papers/ MATES 2014, Stuttgart, Germany, September 23-25, maymounkov-kademlia-lncs.pdf 2014. Proceedings, Springer International Publishing [Zhang13] Zhang, H., Wen, Y., Xie, H., Yu, N. Distributed 2014 Hash Table Theory, Platforms and Applications https://doi.org/10.1007/978-3-319-11584-9_13 [Croman et al 16] Kyle Croman, Christian Decker, Ittay [Coppock17] Mark Coppock THE WORLDS CRYPTOCUR- Eyal, Adem Efe Gencer, Ari Juels, Ahmed Kosba, An- RENCY MINING USES MORE ELECTRICITY THAN drew Miller, Prateek Saxena, Elaine Shi, Emin Gn Sirer, ICELAND Dawn Song, Roger Wattenhofer, On Scaling Blockchains, https://www.digitaltrends.com/computing/

14 bitcoin-ethereum-mining-use-significant-electrical-power/2Fiptps2002.pdf International workshop on Peer-To- [BitcoinWiki] Bitcoin Protocol Peer Systems. Retrieved 23 April 2016. https://en.bitcoin.it/wiki/Protocol_rules#.22tx. [HoloCurrency] Arthur Brock and Eric Harris-Braun 2017 22_messages Bitcoin Wiki Holo: Cryptocurrency Infrastructure for Global Scale and [IPFS] Juan Benet IPFS - Content Addressed, Versioned, Stable Value P2P File System (DRAFT 3) https://holo.host/holo-currency-wp/ [Nilsson15] Nilsson, Kim (19 April 2015). The missing MtGox https://ipfs.io/ipfs/QmR7GSQM93Cx5eAg6a6yRzNde1FQv7uL6X1o4k7zrJa3LX/ ipfs.draft3.pdf bitcoins”. Retrieved 10 December 2015. [LibP2P] Juan Benet, David Dias libp2p Specification http://blog.wizsec.jp/2015/04/ https://github.com/libp2p/specs the-missing-mtgox-bitcoins.html [Oxford] Oxford Online dictionary [Swanson15] Tim Swanson Consensus-as-a-service: a brief https://en.oxforddictionaries.com/definition/ report on the emergence of permissioned, distributed provenance ledger systems April 6, 2015 [Douceur02] Douceur, John R. (2002). ”The Sybil Attack” https://pdfs.semanticscholar.org/f3a2/ https://www.microsoft.com/en-us/research/ 2daa64fc82fcda47e86ac50d555ffc24b8c7.pdf publication/the-sybil-attack/?from=http%3A% 2F%2Fresearch.microsoft.com%2Fpubs%2F74220% t af Dr