Building a Blockchain with Go. Part 6: Transactions (2)

Time:2020-12-2

I have put this series of translated articles on GitHub: blockchain-tutorial. Any subsequent updates will be made on GitHub and may not be synchronized here. If you want to run the code directly, clone the tutorial repository from GitHub and run make in its src directory.


Introduction

At the beginning of this series, we mentioned that a blockchain is a distributed database. In the previous articles, however, we deliberately skipped the “distributed” part and focused on the “database” part. So far, we have implemented almost everything that makes up a blockchain database. Today, we’ll look at some of the mechanisms we skipped, and in the next article we will start working on the distributed nature of the blockchain.

Previous articles in this series:

  1. Basic prototype
  2. Proof of work
  3. Persistence and command line interface
  4. Transactions (1)
  5. Addresses

The code in this article changes a lot; please click here to view all the code changes.

Reward

One small detail we skipped in the previous article is the mining reward. Now we can fill it in.

The mining reward is simply a coinbase transaction. When a mining node starts mining a new block, it takes transactions from the queue and prepends a coinbase transaction to them. The coinbase transaction has only one output, which contains the miner’s public key hash.

Implementing the reward is simple: just update the send command:

func (cli *CLI) send(from, to string, amount int) {
    ...
    bc := NewBlockchain()
    UTXOSet := UTXOSet{bc}
    defer bc.db.Close()

    tx := NewUTXOTransaction(from, to, amount, &UTXOSet)
    // The sender also mines the block, so the coinbase reward goes to from.
    cbTx := NewCoinbaseTX(from, "")
    txs := []*Transaction{cbTx, tx}

    bc.MineBlock(txs)
    fmt.Println("Success!")
}

In our implementation, whoever creates a transaction also mines the new block, and therefore receives the reward.
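The ordering described above (coinbase first, queued transactions after it) can be sketched in isolation. This is only an illustration: the Tx type, the subsidy constant, and assembleBlockTxs are hypothetical names, not part of the tutorial's code, and the reward value of 10 is an assumption.

```go
package main

import "fmt"

// Tx is a hypothetical, simplified transaction used only for this sketch.
type Tx struct {
	ID       string
	Coinbase bool
	To       string
	Amount   int
}

// subsidy is an assumed fixed mining reward.
const subsidy = 10

// assembleBlockTxs puts a coinbase transaction that pays the miner in front
// of the pending transactions taken from the queue.
func assembleBlockTxs(miner string, pending []Tx) []Tx {
	coinbase := Tx{ID: "coinbase", Coinbase: true, To: miner, Amount: subsidy}
	txs := make([]Tx, 0, len(pending)+1)
	txs = append(txs, coinbase)
	return append(txs, pending...)
}

func main() {
	pending := []Tx{{ID: "tx1", To: "bob", Amount: 6}}
	for i, tx := range assembleBlockTxs("alice", pending) {
		fmt.Println(i, tx.ID, tx.To, tx.Amount)
	}
}
```

Copying into a fresh slice keeps the caller's queue untouched, which matters if the same queue is reused when mining fails.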

UTXO set

In Part 3: Persistence and command line interface, we studied how Bitcoin Core stores blocks in a database and learned that blocks are stored in the blocks database and transaction outputs are stored in the chainstate database. Let me remind you of the structure of chainstate:

  1. 'c' + 32-byte transaction hash -> unspent transaction output record for that transaction
  2. 'B' -> 32-byte block hash: the block hash up to which the database represents the unspent transaction outputs

In the previous articles, although we implemented transactions, we did not use chainstate to store their outputs. So that is what we are going to do now.

chainstate does not store transactions. Instead, it stores what is called the UTXO set, that is, the set of unspent transaction outputs. Besides this, it stores “the block hash up to which the database represents the unspent transaction outputs”, which we will omit for now because we are not using block heights yet (but we will implement them in a later article).

So why do we need the UTXO set?

Consider the Blockchain.FindUnspentTransactions method we implemented earlier:

func (bc *Blockchain) FindUnspentTransactions(pubKeyHash []byte) []Transaction {
    ...
    bci := bc.Iterator()

    for {
        block := bci.Next()

        for _, tx := range block.Transactions {
            ...
        }

        if len(block.PrevBlockHash) == 0 {
            break
        }
    }
    ...
}

This function finds transactions with unspent outputs. Since transactions are stored in blocks, it iterates over every block in the blockchain and checks every transaction in it. As of September 18, 2017, there are 485,860 blocks in Bitcoin, and the whole database takes more than 140 GB of disk space. This means that one has to run a full node to validate transactions. Moreover, validating transactions would require iterating over many blocks.

The solution to the problem is to have an index that stores only unspent outputs, and this is what the UTXO set does: it is a cache built from all blockchain transactions (by iterating over the blocks, yes, but only once), which is later used to calculate balances and validate new transactions. As of September 2017, the UTXO set is about 2.7 GB.
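To make the idea of "scan once, then answer from the cache" concrete, here is a minimal in-memory sketch. It is not the tutorial's implementation: Output, Input, Tx, Block, UTXOIndex, BuildIndex, and sampleChain are simplified, hypothetical stand-ins, and the chain is scanned oldest block first for brevity.

```go
package main

import "fmt"

// Output, Input, Tx and Block are simplified stand-ins for the tutorial's types.
type Output struct {
	Value int
	Owner string
}

type Input struct {
	TxID string
	Vout int
}

type Tx struct {
	ID   string
	Vin  []Input
	Vout []Output
}

type Block struct{ Txs []Tx }

// UTXOIndex maps transaction ID -> output index -> unspent output.
type UTXOIndex map[string]map[int]Output

// BuildIndex scans the whole chain once, oldest block first.
func BuildIndex(chain []Block) UTXOIndex {
	idx := UTXOIndex{}
	for _, b := range chain {
		for _, tx := range b.Txs {
			// Every input spends one earlier output: drop it from the index.
			for _, in := range tx.Vin {
				delete(idx[in.TxID], in.Vout)
				if len(idx[in.TxID]) == 0 {
					delete(idx, in.TxID)
				}
			}
			if len(tx.Vout) == 0 {
				continue
			}
			// All outputs of the transaction start out unspent.
			outs := make(map[int]Output, len(tx.Vout))
			for i, out := range tx.Vout {
				outs[i] = out
			}
			idx[tx.ID] = outs
		}
	}
	return idx
}

// Balance sums the cached unspent outputs of one owner without touching blocks.
func (idx UTXOIndex) Balance(owner string) int {
	total := 0
	for _, outs := range idx {
		for _, out := range outs {
			if out.Owner == owner {
				total += out.Value
			}
		}
	}
	return total
}

// sampleChain returns a two-block chain used for the demonstration.
func sampleChain() []Block {
	return []Block{
		{Txs: []Tx{{ID: "cb0", Vout: []Output{{Value: 10, Owner: "alice"}}}}},
		{Txs: []Tx{
			{ID: "cb1", Vout: []Output{{Value: 10, Owner: "alice"}}},
			{ID: "tx1", Vin: []Input{{TxID: "cb0", Vout: 0}},
				Vout: []Output{{Value: 6, Owner: "bob"}, {Value: 4, Owner: "alice"}}},
		}},
	}
}

func main() {
	idx := BuildIndex(sampleChain())
	fmt.Println("alice:", idx.Balance("alice"), "bob:", idx.Balance("bob"))
}
```

After the single scan, balance queries only look at the index, which is much smaller than the chain itself.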

Now let’s think about what we need to change to implement the UTXO set. Currently, the following methods are used to find transactions:

  1. Blockchain.FindUnspentTransactions: the main function that finds transactions with unspent outputs. This is where the iteration over all blocks happens.
  2. Blockchain.FindSpendableOutputs: used when a new transaction is created. It finds enough outputs to cover the required amount. Uses Blockchain.FindUnspentTransactions.
  3. Blockchain.FindUTXO: finds the unspent outputs for a public key hash and is used to get a balance. Uses Blockchain.FindUnspentTransactions.
  4. Blockchain.FindTransaction: finds a transaction in the blockchain by its ID. It iterates over all blocks until the transaction is found.

As you can see, all of these methods iterate over the blocks in the database. However, we cannot improve all of them for now, because the UTXO set does not store all transactions, only those that have unspent outputs. Therefore, it cannot be used in Blockchain.FindTransaction.

So we want the following methods:

  1. Blockchain.FindUTXO: finds all unspent outputs by iterating over the blocks.
  2. UTXOSet.Reindex: uses FindUTXO to find the unspent outputs and stores them in the database. This is where caching happens.
  3. UTXOSet.FindSpendableOutputs: an analog of Blockchain.FindSpendableOutputs that uses the UTXO set.
  4. UTXOSet.FindUTXO: an analog of Blockchain.FindUTXO that uses the UTXO set.
  5. Blockchain.FindTransaction: stays the same as before.

So, from now on, the two most frequently used functions will use the cache! Let’s start writing code.

type UTXOSet struct {
    Blockchain *Blockchain
}

We will use a single database, but we will store the UTXO set in a different bucket. That is why UTXOSet is coupled with Blockchain.

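func (u UTXOSet) Reindex() {
    db := u.Blockchain.db
    bucketName := []byte(utxoBucket)

    // Drop the old bucket (if any) and create an empty one.
    err := db.Update(func(tx *bolt.Tx) error {
        err := tx.DeleteBucket(bucketName)
        if err != nil && err != bolt.ErrBucketNotFound {
            return err
        }
        _, err = tx.CreateBucket(bucketName)
        return err
    })
    if err != nil {
        log.Panic(err)
    }

    // Scan the whole blockchain once to collect every unspent output.
    UTXO := u.Blockchain.FindUTXO()

    err = db.Update(func(tx *bolt.Tx) error {
        b := tx.Bucket(bucketName)

        for txID, outs := range UTXO {
            key, err := hex.DecodeString(txID)
            if err != nil {
                return err
            }
            if err := b.Put(key, outs.Serialize()); err != nil {
                return err
            }
        }
        return nil
    })
    if err != nil {
        log.Panic(err)
    }
}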

This method creates the initial UTXO set. First, it removes the bucket if it exists; then it collects all unspent outputs from the blockchain; finally, it saves those outputs to the bucket.

Blockchain.FindUTXO is almost identical to Blockchain.FindUnspentTransactions, but it now returns a map of TransactionID -> TransactionOutputs pairs.

The UTXO set can now be used to send coins:

func (u UTXOSet) FindSpendableOutputs(pubkeyHash []byte, amount int) (int, map[string][]int) {
    unspentOutputs := make(map[string][]int)
    accumulated := 0
    db := u.Blockchain.db

    err := db.View(func(tx *bolt.Tx) error {
        b := tx.Bucket([]byte(utxoBucket))
        c := b.Cursor()

        // Walk the cached outputs and collect those locked with pubkeyHash
        // until the requested amount is covered.
        for k, v := c.First(); k != nil; k, v = c.Next() {
            txID := hex.EncodeToString(k)
            outs := DeserializeOutputs(v)

            for outIdx, out := range outs.Outputs {
                if out.IsLockedWithKey(pubkeyHash) && accumulated < amount {
                    accumulated += out.Value
                    unspentOutputs[txID] = append(unspentOutputs[txID], outIdx)
                }
            }
        }

        return nil
    })
    if err != nil {
        log.Panic(err)
    }

    return accumulated, unspentOutputs
}

Or check the balance:

func (u UTXOSet) FindUTXO(pubKeyHash []byte) []TXOutput {
    var UTXOs []TXOutput
    db := u.Blockchain.db

    err := db.View(func(tx *bolt.Tx) error {
        b := tx.Bucket([]byte(utxoBucket))
        c := b.Cursor()

        // Collect every cached output locked with pubKeyHash.
        for k, v := c.First(); k != nil; k, v = c.Next() {
            outs := DeserializeOutputs(v)

            for _, out := range outs.Outputs {
                if out.IsLockedWithKey(pubKeyHash) {
                    UTXOs = append(UTXOs, out)
                }
            }
        }

        return nil
    })
    if err != nil {
        log.Panic(err)
    }

    return UTXOs
}

These are slightly modified versions of the corresponding Blockchain methods. Those Blockchain methods are no longer needed.

Having the UTXO set means that our data (transactions) is now stored in two places: the actual transactions are stored in the blockchain, and the unspent outputs are stored in the UTXO set. So we need the UTXO set to always be up to date and to reflect the outputs of the most recent transactions. But we don’t want to rebuild the index every time a new block is mined, because that is exactly the frequent blockchain scanning we are trying to avoid. Therefore, we need a mechanism for updating the UTXO set:

func (u UTXOSet) Update(block *Block) {
    db := u.Blockchain.db

    err := db.Update(func(tx *bolt.Tx) error {
        b := tx.Bucket([]byte(utxoBucket))

        for _, tx := range block.Transactions {
            if !tx.IsCoinbase() {
                // Remove every output spent by this transaction's inputs.
                for _, vin := range tx.Vin {
                    updatedOuts := TXOutputs{}
                    outsBytes := b.Get(vin.Txid)
                    outs := DeserializeOutputs(outsBytes)

                    for outIdx, out := range outs.Outputs {
                        if outIdx != vin.Vout {
                            updatedOuts.Outputs = append(updatedOuts.Outputs, out)
                        }
                    }

                    var err error
                    if len(updatedOuts.Outputs) == 0 {
                        // No unspent outputs left: drop the transaction.
                        err = b.Delete(vin.Txid)
                    } else {
                        err = b.Put(vin.Txid, updatedOuts.Serialize())
                    }
                    if err != nil {
                        return err
                    }
                }
            }

            // All outputs of a newly mined transaction are unspent.
            newOutputs := TXOutputs{}
            for _, out := range tx.Vout {
                newOutputs.Outputs = append(newOutputs.Outputs, out)
            }

            if err := b.Put(tx.ID, newOutputs.Serialize()); err != nil {
                return err
            }
        }

        return nil
    })
    if err != nil {
        log.Panic(err)
    }
}

Although this method looks a little complicated, what it does is quite straightforward. When a new block is mined, the UTXO set should be updated. Updating means removing the spent outputs and adding the unspent outputs of the newly mined transactions. If removing outputs leaves a transaction with no outputs at all, the transaction itself is removed as well. Quite simple!
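The update rule can also be sketched without a database. The following is an in-memory illustration under stated assumptions, not the tutorial's code: Output, Input, Tx, and the Apply method are hypothetical, a coinbase is modeled as a transaction with no inputs, and outputs are keyed by their original index in a map.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the tutorial's transaction types.
type Output struct {
	Value int
	Owner string
}

type Input struct {
	TxID string
	Vout int
}

type Tx struct {
	ID   string
	Vin  []Input // empty for a coinbase transaction in this sketch
	Vout []Output
}

// UTXOSet maps transaction ID -> original output index -> unspent output.
type UTXOSet map[string]map[int]Output

// Apply updates the set with the transactions of one newly mined block.
// It stops at the first invalid input and does not roll back earlier changes.
func (s UTXOSet) Apply(blockTxs []Tx) error {
	for _, tx := range blockTxs {
		for _, in := range tx.Vin {
			outs, ok := s[in.TxID]
			if !ok {
				return fmt.Errorf("tx %s: unknown or fully spent tx %s", tx.ID, in.TxID)
			}
			if _, ok := outs[in.Vout]; !ok {
				return fmt.Errorf("tx %s: output %s:%d already spent", tx.ID, in.TxID, in.Vout)
			}
			delete(outs, in.Vout)
			// A transaction with no unspent outputs left is removed entirely.
			if len(outs) == 0 {
				delete(s, in.TxID)
			}
		}
		if len(tx.Vout) == 0 {
			continue
		}
		// Every output of a newly mined transaction is unspent.
		outs := make(map[int]Output, len(tx.Vout))
		for i, out := range tx.Vout {
			outs[i] = out
		}
		s[tx.ID] = outs
	}
	return nil
}

func main() {
	set := UTXOSet{"cb0": {0: {Value: 10, Owner: "alice"}}}
	block := []Tx{{
		ID:   "tx1",
		Vin:  []Input{{TxID: "cb0", Vout: 0}},
		Vout: []Output{{Value: 6, Owner: "bob"}, {Value: 4, Owner: "alice"}},
	}}
	if err := set.Apply(block); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(set), "transaction(s) with unspent outputs")
}
```

Keying outputs by their original index is a design choice of this sketch: after one output of a transaction is spent, the remaining outputs can still be addressed by the index that later inputs refer to.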

Now let’s use the utxo set when necessary:

func (cli *CLI) createBlockchain(address string) {
    ...
    bc := CreateBlockchain(address)
    defer bc.db.Close()

    UTXOSet := UTXOSet{bc}
    UTXOSet.Reindex()
    ...
}

As soon as a new blockchain is created, the index is rebuilt. For now, this is the only place where Reindex is used, even though it looks like overkill here: at the beginning of a chain there is only one block with one transaction, so Update could have been used instead. However, we may need the reindexing mechanism in the future.

func (cli *CLI) send(from, to string, amount int) {
    ...
    newBlock := bc.MineBlock(txs)
    UTXOSet.Update(newBlock)
}

When a new block is mined, the UTXO set is updated.

Let’s check that it works as expected:

$ blockchain_go createblockchain -address 1JnMDSqVoHi4TEFXNw5wJ8skPsPf4LHkQ1
00000086a725e18ed7e9e06f1051651a4fc46a315a9d298e59e57aeacbe0bf73

Done!

$ blockchain_go send -from 1JnMDSqVoHi4TEFXNw5wJ8skPsPf4LHkQ1 -to 12DkLzLQ4B3gnQt62EPRJGZ38n3zF4Hzt5 -amount 6
0000001f75cb3a5033aeecbf6a8d378e15b25d026fb0a665c7721a5bb0faa21b

Success!

$ blockchain_go send -from 1JnMDSqVoHi4TEFXNw5wJ8skPsPf4LHkQ1 -to 12ncZhA5mFTTnTmHq1aTPYBri4jAK8TacL -amount 4
000000cc51e665d53c78af5e65774a72fc7b864140a8224bf4e7709d8e0fa433

Success!

$ blockchain_go getbalance -address 1JnMDSqVoHi4TEFXNw5wJ8skPsPf4LHkQ1
Balance of '1F4MbuqjcuJGymjcuYQMUVYB37AWKkSLif': 20

$ blockchain_go getbalance -address 12DkLzLQ4B3gnQt62EPRJGZ38n3zF4Hzt5
Balance of '1XWu6nitBWe6J6v6MXmd5rhdP7dZsExbx': 6

$ blockchain_go getbalance -address 12ncZhA5mFTTnTmHq1aTPYBri4jAK8TacL
Balance of '13UASQpCR8Nr41PojH8Bz4K6cmTCqweskL': 4

Good! The address 1JnMDSqVoHi4TEFXNw5wJ8skPsPf4LHkQ1 received three rewards:

  1. One for mining the genesis block
  2. One for mining block 0000001f75cb3a5033aeecbf6a8d378e15b25d026fb0a665c7721a5bb0faa21b
  3. One for mining block 000000cc51e665d53c78af5e65774a72fc7b864140a8224bf4e7709d8e0fa433

Merkle tree

In this article, I also want to discuss an optimization mechanism.

As mentioned above, a full bitcoin database (blockchain) requires more than 140 GB of disk space. Because of the decentralized nature of bitcoin, each node in the network must be independent and self-sufficient, that is, each node must store a complete copy of the blockchain. As more and more people use bitcoin, this rule becomes increasingly difficult to follow: it’s unlikely that everyone will run a full node. Moreover, since nodes are full participants in the network, they have a related responsibility: nodes must verify transactions and blocks. In addition, to interact with other nodes and download new blocks, there is also a certain demand for network traffic.

Satoshi Nakamoto’s original Bitcoin paper contains a solution to this problem: Simplified Payment Verification (SPV). SPV is a light Bitcoin node that does not download the whole blockchain and does not verify blocks and transactions. Instead, it finds transactions in blocks (to verify payments) and is linked to a full node to retrieve just the necessary data. This mechanism allows many light wallet nodes to run against a single full node.

To implement SPV, there needs to be a way to check whether a block contains a transaction without downloading the entire block. That’s what Merkle trees do.

Bitcoin uses Merkle trees to obtain a hash of the transactions in a block, which is then saved in the block header and considered by the proof-of-work system. Until now, we simply concatenated the hashes of all transactions in a block and applied SHA-256 to the result. That is also a good way to get a unique representation of a block’s transactions, but it lacks the benefits of a Merkle tree.

Take a look at the Merkle tree:

[Figure: Merkle tree diagram]

A Merkle tree is built for each block. It starts with the leaves (the bottom of the tree), where each leaf is a transaction hash (Bitcoin uses double SHA-256 hashing). The number of leaves must be even, but not every block contains an even number of transactions. If the number of transactions in a block is odd, the last transaction is duplicated (in the Merkle tree, not in the block!).

Moving from the bottom up, the leaves are grouped in pairs, their hashes are concatenated, and a new hash is computed from the concatenation. The new hashes form a new level of tree nodes. The process repeats until only one node is left, which is called the root of the tree. The root hash is then used as the unique representation of the block’s transactions, saved in the block header, and used in the proof-of-work system.

The benefit of Merkle trees is that a node can verify the membership of a certain transaction without downloading the whole block. Only the transaction hash, the Merkle root hash, and a Merkle path are required.
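To illustrate how a Merkle path is used, here is a minimal standalone sketch. It is not part of the tutorial's code: ProofStep, hashPair, BuildProof, and Verify are hypothetical names, single SHA-256 is used instead of Bitcoin's double SHA-256, and at least one leaf is assumed.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// ProofStep is one sibling hash on the Merkle path.
// Left reports whether the sibling sits to the left of the running hash.
type ProofStep struct {
	Hash []byte
	Left bool
}

// hashPair hashes the concatenation of two child hashes.
func hashPair(l, r []byte) []byte {
	buf := make([]byte, 0, len(l)+len(r))
	buf = append(buf, l...)
	buf = append(buf, r...)
	h := sha256.Sum256(buf)
	return h[:]
}

// BuildProof returns the Merkle root of the leaves and the Merkle path for
// the leaf at position idx. It assumes len(leaves) > 0.
func BuildProof(leaves [][]byte, idx int) ([]byte, []ProofStep) {
	level := make([][]byte, len(leaves))
	for i, leaf := range leaves {
		h := sha256.Sum256(leaf)
		level[i] = h[:]
	}

	var proof []ProofStep
	for len(level) > 1 {
		// An odd level gets its last hash duplicated.
		if len(level)%2 != 0 {
			level = append(level, level[len(level)-1])
		}
		sibling := idx ^ 1 // neighbour inside the same pair
		proof = append(proof, ProofStep{Hash: level[sibling], Left: sibling < idx})

		next := make([][]byte, 0, len(level)/2)
		for i := 0; i < len(level); i += 2 {
			next = append(next, hashPair(level[i], level[i+1]))
		}
		level = next
		idx /= 2
	}
	return level[0], proof
}

// Verify recomputes the root from a leaf and its Merkle path.
func Verify(leaf, root []byte, proof []ProofStep) bool {
	h := sha256.Sum256(leaf)
	cur := h[:]
	for _, step := range proof {
		if step.Left {
			cur = hashPair(step.Hash, cur)
		} else {
			cur = hashPair(cur, step.Hash)
		}
	}
	return bytes.Equal(cur, root)
}

func main() {
	txs := [][]byte{[]byte("tx1"), []byte("tx2"), []byte("tx3"), []byte("tx4"), []byte("tx5")}
	root, proof := BuildProof(txs, 4)
	fmt.Println("path length:", len(proof))
	fmt.Println("tx5 included:", Verify([]byte("tx5"), root, proof))
	fmt.Println("forged tx included:", Verify([]byte("forged"), root, proof))
}
```

A light node in this sketch needs only the leaf, the root from the block header, and one hash per tree level, rather than every transaction of the block.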

Finally, write the code:

type MerkleTree struct {
    RootNode *MerkleNode
}

type MerkleNode struct {
    Left  *MerkleNode
    Right *MerkleNode
    Data  []byte
}

Let’s start with the structs. Every MerkleNode holds data and pointers to its left and right branches. MerkleTree is simply the root node, which links to the next nodes, which in turn link to further nodes, and so on.

Let’s first create a new node:

func NewMerkleNode(left, right *MerkleNode, data []byte) *MerkleNode {
    mNode := MerkleNode{}

    if left == nil && right == nil {
        hash := sha256.Sum256(data)
        mNode.Data = hash[:]
    } else {
        prevHashes := append(left.Data, right.Data...)
        hash := sha256.Sum256(prevHashes)
        mNode.Data = hash[:]
    }

    mNode.Left = left
    mNode.Right = right

    return &mNode
}

Every node contains some data. When a node is a leaf, the data is passed in from outside (a serialized transaction in our case). When a node links to other nodes, it takes their data, concatenates it, and hashes the result.

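// NewMerkleTree builds a tree from data, which must contain at least one item.
func NewMerkleTree(data [][]byte) *MerkleTree {
    var nodes []*MerkleNode

    // Duplicate the last item so that the number of leaves is even.
    if len(data)%2 != 0 {
        data = append(data, data[len(data)-1])
    }

    for _, datum := range data {
        nodes = append(nodes, NewMerkleNode(nil, nil, datum))
    }

    // Build the tree level by level until only the root remains.
    for len(nodes) > 1 {
        // An odd level gets its last node duplicated, just like the leaves.
        if len(nodes)%2 != 0 {
            nodes = append(nodes, nodes[len(nodes)-1])
        }

        var newLevel []*MerkleNode
        for j := 0; j < len(nodes); j += 2 {
            newLevel = append(newLevel, NewMerkleNode(nodes[j], nodes[j+1], nil))
        }

        nodes = newLevel
    }

    return &MerkleTree{nodes[0]}
}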

When a new tree is created, the first thing to ensure is that the number of leaves is even. After that, data (an array of serialized transactions) is converted into the leaves of the tree, and the tree is grown from these leaves.

Now, let’s modify Block.HashTransactions, which is used in the proof-of-work system to obtain the hash of the transactions:

func (b *Block) HashTransactions() []byte {
    var transactions [][]byte

    for _, tx := range b.Transactions {
        transactions = append(transactions, tx.Serialize())
    }
    mTree := NewMerkleTree(transactions)

    return mTree.RootNode.Data
}

First, the transactions are serialized (using encoding/gob), and then they are used to build a Merkle tree. The root of the tree serves as the unique identifier of the block’s transactions.

P2PKH

There’s one more thing I’d like to discuss in more detail.

As you may remember, Bitcoin has a programming language called Script, which is used to lock transaction outputs; transaction inputs provide the data to unlock them. The language is very simple: code in this language is just a sequence of data and operators. For example:

5 2 OP_ADD 7 OP_EQUAL

5, 2, and 7 are data; OP_ADD and OP_EQUAL are operators. Script code is executed from left to right: each piece of data is pushed onto the stack, and each operator takes some elements from the top of the stack, applies itself to them, and pushes the result back onto the stack. The Script stack is a simple FILO (first in, last out) memory storage: the first element pushed is the last one taken out, and every further element is put on top of the previous one.

Let’s execute the script above step by step:

Step | Stack | Script                | Explanation
-----|-------|-----------------------|------------
1    | empty | 5 2 OP_ADD 7 OP_EQUAL | Initially, the stack is empty.
2    | 5     | 2 OP_ADD 7 OP_EQUAL   | Take 5 from the script and push it onto the stack.
3    | 5 2   | OP_ADD 7 OP_EQUAL     | Take 2 from the script and push it onto the stack.
4    | 7     | 7 OP_EQUAL            | OP_ADD is encountered: take the two operands 5 and 2 from the stack, add them, and push the result back onto the stack.
5    | 7 7   | OP_EQUAL              | Take 7 from the script and push it onto the stack.
6    | true  | empty                 | OP_EQUAL is encountered: take two operands from the stack, compare them, and push the comparison result back onto the stack. The script is now fully executed and empty.

OP_ADD takes two elements from the stack, adds them, and pushes the sum onto the stack. OP_EQUAL takes two elements from the stack and compares them: if they are equal it pushes true, otherwise it pushes false. The result of a script execution is the value at the top of the stack: in our case it is true, which means the script finished successfully.
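The evaluation steps can be reproduced with a tiny interpreter. This is only a sketch of the stack mechanics for the two operators discussed here, not an implementation of Bitcoin Script: the run function is hypothetical, values are plain integers, and true/false are represented as 1/0.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// run executes a tiny subset of a Script-like language: integer literals,
// OP_ADD and OP_EQUAL. It returns the final stack, with booleans stored as 1/0.
func run(script string) ([]int, error) {
	var stack []int

	// pop2 removes and returns the two topmost elements of the stack.
	pop2 := func() (int, int, error) {
		if len(stack) < 2 {
			return 0, 0, fmt.Errorf("stack underflow")
		}
		a, b := stack[len(stack)-2], stack[len(stack)-1]
		stack = stack[:len(stack)-2]
		return a, b, nil
	}

	// Tokens are processed from left to right.
	for _, tok := range strings.Fields(script) {
		switch tok {
		case "OP_ADD":
			a, b, err := pop2()
			if err != nil {
				return nil, err
			}
			stack = append(stack, a+b)
		case "OP_EQUAL":
			a, b, err := pop2()
			if err != nil {
				return nil, err
			}
			if a == b {
				stack = append(stack, 1)
			} else {
				stack = append(stack, 0)
			}
		default:
			// Anything else must be data: push it onto the stack.
			n, err := strconv.Atoi(tok)
			if err != nil {
				return nil, fmt.Errorf("unknown token %q", tok)
			}
			stack = append(stack, n)
		}
	}
	return stack, nil
}

func main() {
	stack, err := run("5 2 OP_ADD 7 OP_EQUAL")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("final stack:", stack)
}
```

The script succeeds when execution finishes with a non-zero value on top of the stack.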

Now let’s look at the script that is used to perform payments in Bitcoin:

<signature> <pubKey> OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

This script is called Pay to Public Key Hash (P2PKH), and it is the most commonly used script in Bitcoin. It literally pays to a public key hash, that is, it locks coins with a certain public key. This is the heart of Bitcoin payments: there are no accounts and no funds transferred between them; there is only a script that checks that the provided signature and public key are correct.

The script is actually stored in two parts:

  1. The first part, <signature> <pubKey>, is stored in the input’s ScriptSig field.
  2. The second part, OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG, is stored in the output’s ScriptPubKey field.

Therefore, the output determines the unlocking logic, and the input provides the “key” for unlocking the output. Let’s run this script:

Step | Stack                                          | Script
-----|------------------------------------------------|-------
1    | empty                                          | <signature> <pubKey> OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
2    | <signature>                                    | <pubKey> OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
3    | <signature> <pubKey>                           | OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
4    | <signature> <pubKey> <pubKey>                  | OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
5    | <signature> <pubKey> <pubKeyHash>              | <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG
6    | <signature> <pubKey> <pubKeyHash> <pubKeyHash> | OP_EQUALVERIFY OP_CHECKSIG
7    | <signature> <pubKey>                           | OP_CHECKSIG
8    | true or false                                  | empty

OP_DUP duplicates the top element of the stack. OP_HASH160 takes the top element, hashes it with SHA-256 followed by RIPEMD160, and pushes the result back onto the stack. OP_EQUALVERIFY compares the two top elements and, if they are not equal, terminates the script. OP_CHECKSIG verifies the signature of the transaction by hashing the transaction and using <signature> and <pubKey>. This last operator is fairly complex: it makes a trimmed copy of the transaction, hashes it (because it is the hash of the transaction that gets signed), and checks that the signature is correct using the provided <signature> and <pubKey>.
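The P2PKH evaluation can be walked through in code as well. The sketch below only mirrors the order of operations and makes loud substitutions so that it runs with the Go standard library alone: SHA-256 stands in for HASH160, Ed25519 stands in for Bitcoin's ECDSA over secp256k1, and the "trimmed transaction copy" is just a fixed byte string. hashPubKey, unlock, and demo are hypothetical names.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// hashPubKey stands in for HASH160 (assumption: plain SHA-256 is used here).
func hashPubKey(pubKey []byte) []byte {
	h := sha256.Sum256(pubKey)
	return h[:]
}

// unlock mirrors the evaluation of
// <signature> <pubKey> OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG.
func unlock(sig, pubKey, pubKeyHash, txDigest []byte) bool {
	stack := [][]byte{sig, pubKey} // data pushed by the input's ScriptSig

	stack = append(stack, stack[len(stack)-1])                // OP_DUP
	stack[len(stack)-1] = hashPubKey(stack[len(stack)-1])     // OP_HASH160
	stack = append(stack, pubKeyHash)                         // push <pubKeyHash>

	// OP_EQUALVERIFY: pop two elements and stop if they differ.
	a, b := stack[len(stack)-2], stack[len(stack)-1]
	stack = stack[:len(stack)-2]
	if !bytes.Equal(a, b) {
		return false
	}

	// OP_CHECKSIG: verify the signature over the transaction digest.
	// (Assumption: Ed25519 replaces Bitcoin's ECDSA in this sketch.)
	if len(stack[1]) != ed25519.PublicKeySize {
		return false
	}
	return ed25519.Verify(ed25519.PublicKey(stack[1]), txDigest, stack[0])
}

// demo locks an output to an owner and tries to unlock it as the owner and
// as an attacker. It reports whether each attempt succeeded.
func demo() (ownerOK, attackerOK bool, err error) {
	ownerPub, ownerPriv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return false, false, err
	}
	attackerPub, attackerPriv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return false, false, err
	}

	lock := hashPubKey(ownerPub) // stored in the output
	digest := sha256.Sum256([]byte("trimmed transaction copy"))

	ownerOK = unlock(ed25519.Sign(ownerPriv, digest[:]), ownerPub, lock, digest[:])

	forged := ed25519.Sign(attackerPriv, digest[:])
	// Wrong key fails at OP_EQUALVERIFY; right key with a wrong signature
	// fails at OP_CHECKSIG.
	attackerOK = unlock(forged, attackerPub, lock, digest[:]) ||
		unlock(forged, ownerPub, lock, digest[:])
	return ownerOK, attackerOK, nil
}

func main() {
	ownerOK, attackerOK, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("owner unlocks:", ownerOK, "attacker unlocks:", attackerOK)
}
```

The output only ever stores the hash of the public key; the key itself is revealed by the input at spending time, which is why both the hash comparison and the signature check are needed.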

With such a scripting language, Bitcoin can also serve as a smart-contract platform: besides transferring funds to a single public key, the language makes many other payment schemes possible.

Summary

That’s all for today! We have implemented almost all the key features of a blockchain-based cryptocurrency. We have a blockchain, addresses, mining, and transactions. But there is one more thing that gives life to all these mechanisms and makes Bitcoin a global system: consensus. In the next article, we will start implementing the “decentralized” part of the blockchain. Stay tuned!

Links:

  1. Full source codes
  2. The UTXO Set
  3. Merkle Tree
  4. Script
  5. “Ultraprune” Bitcoin Core commit
  6. UTXO set statistics
  7. Smart contracts and Bitcoin
  8. Why every Bitcoin user should understand “SPV security”

Link to the original text: Building Blockchain in Go. Part 6: Transactions 2