[Develop on Nervos CKB] Introduction to script programming on Nervos CKB [2]: script basics

Time:2020-6-29

Original author: Xuejie
Original article: Introduction to CKB Script Programming 2: Script
Translators: shooter, Jason, orange

In the previous article, we introduced the current CKB verification model. This article will be more fun: we will show you how to actually deploy script code to the CKB network. I hope that by the end of it you will be able to explore the world of CKB on your own and write whatever new script code you like.

Note that although I believe the current CKB programming model is relatively stable, development is still in progress, so things may change in the future. I will do my best to keep this article up to date, but if anything looks confusing along the way, keep in mind that this article is based on this version of CKB.

A word of warning: this is a long article, because I want to cover enough ground for next week's more interesting topic. If you are short on time, you don't have to finish it in one sitting. I have tried to split it into independent sections so you can work through one at a time.

Terminology

Before we move on, let's distinguish between two terms: script and script code.

In this article and throughout the series, we will distinguish between a script and script code. Script code is the actual program that you write, compile and run on CKB. A script refers to the Script data structure used in CKB, which contains slightly more than just the script code:

pub struct Script {
    pub args: Vec<Bytes>,
    pub code_hash: H256,
    pub hash_type: ScriptHashType,
}

We can ignore hash_type for now; a later article will explain what hash_type is and what interesting uses it has. Later in this article we will show that code_hash is in fact used to identify the script code, so for now we can simply think of it as the script code. What else does a script include? It also includes args, and this is the part that distinguishes a script from script code. args can be used to provide additional parameters to a CKB script. For example, although everyone may use the same default lock script code, each person has their own pubkey hash, and args is the place to store that pubkey hash. In this way, each CKB user can have a different lock script while sharing the same lock script code.

Note that in most cases the terms script and script code can be used interchangeably, but if you get confused somewhere, it may help to think about the difference between the two.
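To make the distinction concrete, here is a minimal sketch in plain C. It is not CKB's real data structure: toy_script, make_toy_script and the fixed 20-byte argument are hypothetical simplifications introduced only for this illustration, and hash_type is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for CKB's Script structure: one 32-byte code hash
 * plus a single 20-byte argument (for example a pubkey hash). The real
 * structure holds a list of byte strings and a hash_type field. */
typedef struct {
  uint8_t code_hash[32];
  uint8_t arg[20];
} toy_script;

/* True when both scripts point at the same script code. */
bool same_script_code(const toy_script *a, const toy_script *b) {
  assert(a != NULL && b != NULL);
  return memcmp(a->code_hash, b->code_hash, sizeof a->code_hash) == 0;
}

/* True only when code hash and argument both match, i.e. the same script. */
bool same_script(const toy_script *a, const toy_script *b) {
  return same_script_code(a, b) &&
         memcmp(a->arg, b->arg, sizeof a->arg) == 0;
}

/* Builds a toy script; the fill bytes are arbitrary demo values. */
toy_script make_toy_script(uint8_t code_fill, uint8_t arg_fill) {
  toy_script s;
  memset(s.code_hash, code_fill, sizeof s.code_hash);
  memset(s.arg, arg_fill, sizeof s.arg);
  return s;
}
```

Two users built with make_toy_script(0xAA, 0x01) and make_toy_script(0xAA, 0x02) share the same script code but own different scripts, which is the situation described above for the default lock.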

A minimal CKB script code

As you may have heard, CKB is based on the open source RISC-V instruction set. But what does that mean? In my own words, it means that we have (to some extent) embedded a real microcomputer in CKB rather than a virtual machine. The advantage of a real computer is that you can write any logic you want in any language. The first few examples here are written in C to keep things simple (simple in terms of the toolchain, not the language); later we will switch to JavaScript-based script code, and we hope to show more languages as the series goes on. Remember, the possibilities on CKB are unlimited!

As mentioned, the CKB VM is more like a real microcomputer. CKB script code likewise looks more like an ordinary Unix-style executable of the kind we run on our computers:

int main(int argc, char* argv[])
{
  return 0;
}

When your code is compiled with a C compiler, it becomes script code that can run on CKB. In other words, CKB simply takes a plain old Unix-style executable (built for the RISC-V architecture instead of the popular x86 architecture) and runs it in a virtual machine environment. If the program's return code is 0, we consider the script to have succeeded; any non-zero return code is treated as a script failure.

The example above is script code that always succeeds, because the return code is always 0. Please do not use it as your lock script code, or anyone will be able to take your tokens.

Obviously, the example above is not very interesting, so let's start with a more interesting idea: personally, I don't like carrots very much. I know carrots are great from a nutritional point of view, but I would still rather avoid the taste. What if I want to set a rule that none of my cells on CKB may have data starting with carrot? Let's write script code to enforce this.

To ensure that no cell contains carrot in its cell data, we first need a way to read cell data from within the script. CKB provides syscalls to help with this.

To keep CKB scripts secure, each script must run in an isolated environment that is completely separated from the host computer running CKB, so it cannot access data it has no business with, such as your private keys or passwords. However, for a script to be useful, there must be certain data it can access, such as the cell it guards or the transaction it validates. CKB provides syscalls for this purpose. Syscalls are defined in the RISC-V standard, and they provide access to certain resources of the environment. Normally the environment means the operating system, but in the case of the CKB VM, the environment is the actual CKB process. Using syscalls, a CKB script can access the whole transaction that contains it, including inputs, outputs, witnesses and deps.

The good news is that we have wrapped the syscalls in an easy-to-use header file. You are welcome to view this file here to learn how the syscalls are implemented. Most importantly, you can simply grab this header file and use the wrapper functions to make the syscalls you want.

Now that we have syscalls, we can start on the carrot-forbidding script:

#include <memory.h>
#include "ckb_syscalls.h"

int main(int argc, char* argv[]) {
  int ret;
  size_t index = 0;
  volatile uint64_t len = 0; /* (1) */
  unsigned char buffer[6];

  while (1) {
    len = 6;
    memset(buffer, 0, 6);
    ret = ckb_load_cell_by_field(buffer, &len, 0, index, CKB_SOURCE_OUTPUT,
                                 CKB_CELL_FIELD_DATA); /* (2) */
    if (ret == CKB_INDEX_OUT_OF_BOUND) {               /* (3) */
      break;
    }

    if (memcmp(buffer, "carrot", 6) == 0) {
      return -1;
    }

    index++;
  }

  return 0;
}

The following points need to be explained:

  1. Because of a quirk of C, the len variable needs to be marked volatile. We use it as both an input and an output parameter, and the CKB VM can only set an output parameter when it lives in memory. volatile ensures that the C compiler keeps it as a RISC-V memory-based variable.
  2. When making a syscall, we need to provide the following: a buffer to hold the data returned by the syscall; a len variable that holds the buffer length on input and the available data length on return; an offset into the data being read; and several parameters that pinpoint the exact field of the transaction we want to fetch. Please refer to our RFC for details.
  3. For maximum flexibility, CKB uses the return value of the syscall to report the status of the fetch: 0 (or CKB_SUCCESS) means success; 1 (or CKB_INDEX_OUT_OF_BOUND) means you have run past the last index of the given kind; 2 (or CKB_ITEM_MISSING) means the entity does not exist, for example when fetching the type script of a cell that has no type script.
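The calling convention in points 2 and 3 can be tried off-chain with a small mock. The sketch below is not the real syscall: mock_load_cell_data, the MOCK_ constants and the two hard-coded cells are hypothetical, and the behaviour (len reports the bytes available from the offset, even when the buffer is smaller) reflects my reading of the description above rather than the actual CKB implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_SUCCESS 0
#define MOCK_INDEX_OUT_OF_BOUND 1

/* Hypothetical in-memory "cells"; on CKB this data lives in the transaction. */
static const char *mock_cells[] = {"hello world", "carrot cake"};
#define MOCK_CELL_COUNT (sizeof mock_cells / sizeof mock_cells[0])

/* Mock loader imitating the convention described above: on entry *len is
 * the buffer capacity, on return *len is the number of bytes available
 * from offset onwards, which may exceed the capacity. */
int mock_load_cell_data(void *buf, uint64_t *len, size_t offset, size_t index) {
  assert(len != NULL);
  if (index >= MOCK_CELL_COUNT) {
    return MOCK_INDEX_OUT_OF_BOUND;
  }
  uint64_t total = strlen(mock_cells[index]);
  uint64_t available = offset < total ? total - offset : 0;
  uint64_t to_copy = available < *len ? available : *len;
  /* A NULL buffer skips the copy and only reports the length. */
  if (buf != NULL && to_copy > 0) {
    memcpy(buf, mock_cells[index] + offset, to_copy);
  }
  *len = available;
  return MOCK_SUCCESS;
}
```

With a 6-byte buffer and index 1, the mock copies the bytes carrot and sets len to 11, the full length of that cell's data; with index 2 it returns MOCK_INDEX_OUT_OF_BOUND, which is the signal the carrot script uses to leave its loop.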

In summary, the script loops through all the output cells in the transaction, loads the first six bytes of each cell's data, and tests whether those bytes match carrot. If a match is found, the script returns -1, indicating an error; if no match is found, the script exits with 0, indicating success.

To run the loop, the script keeps an index variable. In each iteration it asks the syscall for the cell at the current index. If the syscall returns CKB_INDEX_OUT_OF_BOUND, the script has visited every cell and exits the loop; otherwise the cell data is tested, index is incremented by one, and the loop continues.

This is your first useful piece of CKB script code! In the next section, we will see how to deploy it to CKB and run it.
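The same control flow can be exercised on an ordinary computer before anything goes on chain. The following is a host-side sketch only: starts_with_carrot and check_outputs are hypothetical helpers that take the output data from an in-memory array instead of a syscall.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 when the first six bytes of data spell "carrot". Data shorter
 * than six bytes can never match. */
int starts_with_carrot(const unsigned char *data, size_t len) {
  assert(data != NULL || len == 0);
  return len >= 6 && memcmp(data, "carrot", 6) == 0;
}

/* Mirrors the script's control flow over an in-memory list of outputs:
 * returns -1 at the first match, 0 when the list is exhausted. */
int check_outputs(const char *const outputs[], size_t count) {
  for (size_t index = 0; index < count; index++) {
    const unsigned char *data = (const unsigned char *)outputs[index];
    if (starts_with_carrot(data, strlen(outputs[index]))) {
      return -1;
    }
  }
  return 0;
}
```

A list such as {"apple", "a carrot"} passes, because only data that begins with carrot is rejected, while {"apple", "carrot123"} fails with -1.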

Deploy script to CKB

First, we need to compile the carrot source code written above. Since GCC already supports RISC-V, you can of course use the official GCC to build your script code. Alternatively, you can use the docker image we have prepared to save yourself the trouble of compiling GCC:

$ ls
carrot.c  ckb_consts.h  ckb_syscalls.h
$ sudo docker run --rm -it -v `pwd`:/code nervos/ckb-riscv-gnu-toolchain:xenial bash
root@<container-id>:/# cd /code
root@<container-id>:/code# riscv64-unknown-elf-gcc -Os carrot.c -o carrot
root@<container-id>:/code# exit
exit
$ ls
carrot*  carrot.c  ckb_consts.h  ckb_syscalls.h

That's it: CKB can use the executable compiled by GCC directly as an on-chain script, with no further processing. We can now deploy it on chain. Note that I will use CKB's Ruby SDK, since I used to be a Ruby programmer and Ruby is the most natural (but not necessarily the best) choice for me. Please refer to the official README for setup details.

To deploy the script to CKB, we just need to create a new cell with the script code as its cell data:

pry(main)> data = File.read("carrot")
pry(main)> data.bytesize
=> 6864
pry(main)> carrot_tx_hash = wallet.send_capacity(wallet.address, CKB::Utils.byte_to_shannon(8000), CKB::Utils.bin_to_hex(data))

Here I simply create a new cell with enough capacity by sending tokens to myself. Now we can create the type script that contains the carrot script code:

pry(main)> carrot_data_hash = CKB::Blake2b.hexdigest(data)
pry(main)> carrot_type_script = CKB::Types::Script.new(code_hash: carrot_data_hash, args: [])

Recall the script data structure:

pub struct Script {
    pub args: Vec<Bytes>,
    pub code_hash: H256,
    pub hash_type: ScriptHashType,
}

We can see that instead of embedding the script code directly in the script data structure, we only include the hash of the code, which is the blake2b hash of the actual script binary. Since the carrot script takes no arguments, we can use an empty array for the args part.

Note that we are still ignoring hash_type here; a later article will discuss another way of specifying the code hash. For now, let's keep things as simple as possible.

To run the carrot script, we need to create a new transaction and set the carrot type script to the type script of one of the output cells:

pry(main)> tx = wallet.generate_tx(wallet2.address, CKB::Utils.byte_to_shannon(200))
pry(main)> tx.outputs[0].instance_variable_set(:@type, carrot_type_script.dup)

One more step is needed: for CKB to locate the carrot script, we need to reference the cell containing the carrot script in the transaction's deps:

pry(main)> carrot_out_point = CKB::Types::OutPoint.new(cell: CKB::Types::CellOutPoint.new(tx_hash: carrot_tx_hash, index: 0))
pry(main)> tx.deps.push(carrot_out_point.dup)

Now we are ready to sign and send the transaction:

pry(main)> tx.witnesses[0].data.clear
pry(main)> tx = tx.sign(wallet.key, api.compute_transaction_hash(tx))
pry(main)> api.send_transaction(tx)
=> "0xd7b0fea7c1527cde27cc4e7a2e055e494690a384db14cc35cd2e51ec6f078163"

Since no cell in this transaction has cell data starting with carrot, the type script validates successfully. Now let's try a different transaction that does contain a cell whose data begins with carrot:

pry(main)> tx2 = wallet.generate_tx(wallet2.address, CKB::Utils.byte_to_shannon(200))
pry(main)> tx2.deps.push(carrot_out_point.dup)
pry(main)> tx2.outputs[0].instance_variable_set(:@type, carrot_type_script.dup)
pry(main)> tx2.outputs[0].instance_variable_set(:@data, CKB::Utils.bin_to_hex("carrot123"))
pry(main)> tx2.witnesses[0].data.clear
pry(main)> tx2 = tx2.sign(wallet.key, api.compute_transaction_hash(tx2))
pry(main)> api.send_transaction(tx2)
CKB::RPCError: jsonrpc error: {:code=>-3, :message=>"InvalidTx(ScriptFailure(ValidationFailure(-1)))"}
from /home/ubuntu/code/ckb-sdk-ruby/lib/ckb/rpc.rb:164:in `rpc_request'

As we can see, our carrot script rejects a transaction that produces a cell whose data begins with carrot. Now I can use this script to make sure none of my cells contain carrots!

To sum up, deploying and running a script as a type script takes the following steps:

  1. Compile the script into a RISC-V executable binary
  2. Deploy the binary in the data part of a cell
  3. Create a type script data structure that uses the blake2b hash of the binary as the code hash, and fill the args part with any arguments the script code requires
  4. Create a new transaction with the type script set on one of the generated cells
  5. Add the out point of the cell containing the script code to the transaction's deps

That's all you need! If your script runs into any problems, these are the points to check.

Although we have only discussed type scripts here, lock scripts work in exactly the same way. The one thing to remember is that when you create a cell with a specific lock script, the lock script does not run at that point; it only runs when the cell is consumed. So a type script can be used to build logic that runs when a cell is created, while a lock script is used to build logic that runs when a cell is destroyed. With this in mind, make sure your lock script is correct, otherwise you may lose your tokens in the following scenarios:

  • Your lock script has a bug that lets someone else unlock your cell.
  • Your lock script has a bug that prevents anyone (including you) from unlocking your cell.

One tip we can offer here is to always test your script by attaching it as a type script to an output cell of your transaction. That way you find out immediately when something goes wrong, and your tokens stay safe.

Analyzing the default lock script code

Building on what we have learned, let's look at the default lock script code included in CKB. To avoid confusion, we are looking at the lock script code as of this commit.

The default lock script code iterates through all input cells with the same lock script as itself, and performs the following steps:

  • It fetches the current transaction hash through the syscall provided
  • It fetches the witness data corresponding to the current input
  • For the default lock, the first argument in the witness is a recoverable signature signed by the cell owner, and the remaining arguments are optional user-provided arguments
  • The default lock script runs a blake2b hash over the concatenation of the transaction hash and all user-supplied arguments, if any
  • The blake2b hash result is used as the message part of secp256k1 signature verification. Note that the actual signature is provided in the first argument of the witness data structure.
  • If signature verification fails, the script exits with an error code. Otherwise, it continues with the next iteration.

Note that we discussed the difference between a script and script code earlier. Each distinct public key hash yields a distinct lock script. Therefore, if the input cells of a transaction share the same default lock script code but have different public key hashes (and hence different lock scripts), multiple instances of the default lock script code will be executed, each covering the group of cells that share the same lock script.

Now we can walk through the different parts of the default lock script code:
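As a rough illustration of how many instances would run, the sketch below counts the distinct lock scripts among a transaction's inputs. It is a toy model under stated assumptions: each input is represented only by a 32-byte hash of its lock script, stored back to back in one flat array, and count_script_groups is a hypothetical helper, not part of CKB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_SIZE 32

/* hashes points at count consecutive HASH_SIZE-byte lock script hashes,
 * one per input. Returns the number of distinct hashes, which in this toy
 * model is the number of groups, and so of script executions. */
size_t count_script_groups(const uint8_t *hashes, size_t count) {
  assert(hashes != NULL || count == 0);
  size_t groups = 0;
  for (size_t i = 0; i < count; i++) {
    int seen_before = 0;
    for (size_t j = 0; j < i; j++) {
      if (memcmp(hashes + i * HASH_SIZE, hashes + j * HASH_SIZE,
                 HASH_SIZE) == 0) {
        seen_before = 1;
        break;
      }
    }
    if (!seen_before) {
      groups++;
    }
  }
  return groups;
}
```

Three inputs where the first and third carry the same lock script form two groups, so the lock script code would run twice, once per group, rather than three times.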

if (argc != 2) {
  return ERROR_WRONG_NUMBER_OF_ARGUMENTS;
}

secp256k1_context context;
if (secp256k1_context_initialize(&context, SECP256K1_CONTEXT_VERIFY) == 0) {
  return ERROR_SECP_INITIALIZE;
}

len = BLAKE2B_BLOCK_SIZE;
ret = ckb_load_tx_hash(tx_hash, &len, 0);
if (ret != CKB_SUCCESS) {
  return ERROR_SYSCALL;
}

When arguments are included in the args part of the Script data structure, they are passed to the running script through the traditional Unix argc/argv convention. To stay even closer to that convention, we insert a dummy argument at argv[0], so the first included argument starts at argv[1]. The default lock script code takes one argument: the public key hash generated from the owner's private key.
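The convention just described can be sketched in a few lines. The helpers expected_argc and first_script_arg are hypothetical names used only for this illustration; they simply restate the rule that argv[0] is a dummy and the Script args start at argv[1].

```c
#include <assert.h>
#include <stddef.h>

/* Number of argv entries a script sees for a given number of Script args:
 * one dummy entry at argv[0] plus one entry per arg. */
int expected_argc(int script_arg_count) {
  assert(script_arg_count >= 0);
  return script_arg_count + 1;
}

/* Returns the first Script arg, or NULL when the script received none. */
const char *first_script_arg(int argc, char *argv[]) {
  assert(argv != NULL);
  return argc >= 2 ? argv[1] : NULL;
}
```

This is why the default lock, which takes exactly one argument, checks for argc != 2 rather than argc != 1.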

ret = ckb_load_input_by_field(NULL, &len, 0, index, CKB_SOURCE_GROUP_INPUT,
                             CKB_INPUT_FIELD_SINCE);
if (ret == CKB_INDEX_OUT_OF_BOUND) {
  return 0;
}
if (ret != CKB_SUCCESS) {
  return ERROR_SYSCALL;
}

Using the same technique as the carrot example, we check to see if there are more input cells to test. There are two differences from the previous example:

  • If we only want to know whether a cell exists and do not need any of its data, we can pass NULL as the data buffer together with a len variable whose value is 0. The syscall then skips filling in data and only provides the available data length and the correct return code for us to process.
  • In the carrot example we looped through all the output cells in the transaction, but here we only care about input cells that have the same lock script. CKB calls the cells that share the same lock (or type) script a group. We can use CKB_SOURCE_GROUP_INPUT instead of CKB_SOURCE_INPUT to iterate over only the cells in the same group, that is, the cells with the same lock script as the current cell.

len = WITNESS_SIZE;
ret = ckb_load_witness(witness, &len, 0, index, CKB_SOURCE_GROUP_INPUT);
if (ret != CKB_SUCCESS) {
  return ERROR_SYSCALL;
}
if (len > WITNESS_SIZE) {
  return ERROR_WITNESS_TOO_LONG;
}

if (!(witness_table = ns(Witness_as_root(witness)))) {
  return ERROR_ENCODING;
}
args = ns(Witness_data(witness_table));
if (ns(Bytes_vec_len(args)) < 1) {
  return ERROR_WRONG_NUMBER_OF_ARGUMENTS;
}

Continuing along this path, we load the witness for the current input. A witness and its corresponding input share the same index. CKB currently uses flatbuffers as the serialization format in syscalls, so if you are curious, the flatcc documentation is your best friend.

/* Load signature */
len = TEMP_SIZE;
ret = extract_bytes(ns(Bytes_vec_at(args, 0)), temp, &len);
if (ret != CKB_SUCCESS) {
  return ERROR_ENCODING;
}

/* The 65th byte is recid according to contract spec.*/
recid = temp[RECID_INDEX];
/* Recover pubkey */
secp256k1_ecdsa_recoverable_signature signature;
if (secp256k1_ecdsa_recoverable_signature_parse_compact(&context, &signature, temp, recid) == 0) {
  return ERROR_SECP_PARSE_SIGNATURE;
}
blake2b_state blake2b_ctx;
blake2b_init(&blake2b_ctx, BLAKE2B_BLOCK_SIZE);
blake2b_update(&blake2b_ctx, tx_hash, BLAKE2B_BLOCK_SIZE);
for (size_t i = 1; i < ns(Bytes_vec_len(args)); i++) {
  len = TEMP_SIZE;
  ret = extract_bytes(ns(Bytes_vec_at(args, i)), temp, &len);
  if (ret != CKB_SUCCESS) {
    return ERROR_ENCODING;
  }
  blake2b_update(&blake2b_ctx, temp, len);
}
blake2b_final(&blake2b_ctx, temp, BLAKE2B_BLOCK_SIZE);

The first argument in the witness is the signature to load, while the rest, if provided, are appended to the transaction hash for the blake2b operation.
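To see exactly which bytes end up being hashed, the following sketch lays them out in a single buffer. It is an illustration, not the script's method: the real code streams each piece into blake2b_update, whereas the hypothetical build_signing_preimage below only concatenates them, and TX_HASH_SIZE is assumed to be 32 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TX_HASH_SIZE 32 /* assumed length of the transaction hash */

/* Writes tx_hash followed by args[1..arg_count) into out and returns the
 * number of bytes written, or 0 if out is too small. args[0], the
 * signature, is deliberately skipped because it is not part of the hash. */
size_t build_signing_preimage(uint8_t *out, size_t out_cap,
                              const uint8_t *tx_hash,
                              const uint8_t *const args[],
                              const size_t arg_lens[], size_t arg_count) {
  assert(out != NULL && tx_hash != NULL);
  if (out_cap < TX_HASH_SIZE) {
    return 0;
  }
  memcpy(out, tx_hash, TX_HASH_SIZE);
  size_t used = TX_HASH_SIZE;
  for (size_t i = 1; i < arg_count; i++) {
    if (arg_lens[i] > out_cap - used) {
      return 0;
    }
    memcpy(out + used, args[i], arg_lens[i]);
    used += arg_lens[i];
  }
  return used;
}
```

With a witness of {signature, extra}, the result is the 32-byte transaction hash followed by the bytes of extra; with only a signature, it is the transaction hash alone.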

secp256k1_pubkey pubkey;

if (secp256k1_ecdsa_recover(&context, &pubkey, &signature, temp) != 1) {
  return ERROR_SECP_RECOVER_PUBKEY;
}

The blake2b hash result is then used as the message in secp256k1 signature verification; here that means recovering the public key from the signature.

size_t pubkey_size = PUBKEY_SIZE;
if (secp256k1_ec_pubkey_serialize(&context, temp, &pubkey_size, &pubkey, SECP256K1_EC_COMPRESSED) != 1 ) {
  return ERROR_SECP_SERIALIZE_PUBKEY;
}

len = PUBKEY_SIZE;
blake2b_init(&blake2b_ctx, BLAKE2B_BLOCK_SIZE);
blake2b_update(&blake2b_ctx, temp, len);
blake2b_final(&blake2b_ctx, temp, BLAKE2B_BLOCK_SIZE);

if (memcmp(argv[1], temp, BLAKE160_SIZE) != 0) {
  return ERROR_PUBKEY_BLAKE160_HASH;
}

Last but not least, we need to check that the pubkey recovered from the recoverable signature is indeed the pubkey used to generate the pubkey hash contained in the lock script args. Otherwise, someone could steal your tokens with a signature generated by a different public key.

In short, the scheme used in the default lock script is very similar to the one used in Bitcoin today.
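The final comparison only looks at the leading bytes of the hash. A minimal sketch of that check is shown below; matches_pubkey_hash is a hypothetical helper, and both sizes are assumptions (a 32-byte blake2b output, and 20 bytes for a 160-bit pubkey hash) rather than values taken from the script's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FULL_HASH_SIZE 32 /* assumed length of the blake2b output */
#define BLAKE160_SIZE 20  /* assumed: 160 bits, the leading 20 bytes */

/* Compares the leading BLAKE160_SIZE bytes of a FULL_HASH_SIZE-byte hash
 * with the expected pubkey hash taken from the script args. The trailing
 * bytes of the full hash play no part in the comparison. */
int matches_pubkey_hash(const uint8_t *full_hash, const uint8_t *expected) {
  assert(full_hash != NULL && expected != NULL);
  return memcmp(full_hash, expected, BLAKE160_SIZE) == 0;
}
```

Changing any of the first 20 bytes makes the check fail, while changes beyond byte 20 of the full hash are ignored.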

Introducing duktape

I'm sure you feel the same way I do right now: it's great that we can write contracts in C, but C is always a bit tedious and, let's face it, dangerous.
Is there a better way?

Of course! The CKB VM mentioned above is essentially a microcomputer, so there are many solutions we can explore. One thing we have done is to write CKB script code in JavaScript. Yes, you read that right: plain ES5 JavaScript (yes, I know, but this is just an example, and you can always use a transpiler).

How is this possible? Since we have a C compiler, all we need is a JavaScript implementation designed for embedded systems. In our case, we take duktape, compile it from C into a RISC-V binary, and put it on chain. Then we can run JavaScript on CKB! Because we are working with a real microcomputer, nothing prevents us from embedding another VM as a CKB script inside the CKB VM and exploring this VM-on-top-of-VM path.

Along this path, we can use JavaScript on CKB through duktape, or Ruby on CKB through mruby. We could even put Bitcoin Script or the EVM on chain; we just need to compile their virtual machines and deploy them. This is how the CKB VM can help us preserve existing assets and build a diverse ecosystem. All languages should be treated equally on CKB, and the freedom of choice should be in the hands of blockchain contract developers.

At this stage you may want to ask: yes, this is possible, but won't a VM on top of a VM be slow? I believe it depends on your use case. I firmly believe that benchmarks mean nothing unless they are run against real use cases with standard hardware requirements, so we will have to wait and see whether this really becomes a problem. In my opinion, high-level languages are more likely to be used in type scripts that guard cell transformations, and in that case I doubt it will be slow. Besides, we are also working hard to optimize both the CKB VM and the VMs running on top of it to make them faster and faster :P

To use duktape on CKB, you first need to compile duktape itself into a RISC-V executable binary:

$ git clone https://github.com/nervosnetwork/ckb-duktape
$ cd ckb-duktape
$ sudo docker run --rm -it -v `pwd`:/code nervos/ckb-riscv-gnu-toolchain:xenial bash
root@<container-id>:~# cd /code
root@<container-id>:/code# make
riscv64-unknown-elf-gcc -Os -DCKB_NO_MMU -D__riscv_soft_float -D__riscv_float_abi_soft -Iduktape -Ic -Wall -Werror c/entry.c -c -o build/entry.o
riscv64-unknown-elf-gcc -Os -DCKB_NO_MMU -D__riscv_soft_float -D__riscv_float_abi_soft -Iduktape -Ic -Wall -Werror duktape/duktape.c -c -o build/duktape.o
riscv64-unknown-elf-gcc build/entry.o build/duktape.o -o build/duktape -lm -Wl,-static -fdata-sections -ffunction-sections -Wl,--gc-sections -Wl,-s
root@<container-id>:/code# exit
exit
$ ls build/duktape
build/duktape*

As in the carrot example, the first step is to deploy the duktape script code in a CKB cell:

pry(main)> duktape_data = File.read("../ckb-duktape/build/duktape")
pry(main)> duktape_data.bytesize
=> 269064
pry(main)> duktape_tx_hash = wallet.send_capacity(wallet.address, CKB::Utils.byte_to_shannon(280000), CKB::Utils.bin_to_hex(duktape_data))
pry(main)> duktape_data_hash = CKB::Blake2b.hexdigest(duktape_data)
pry(main)> duktape_out_point = CKB::Types::OutPoint.new(cell: CKB::Types::CellOutPoint.new(tx_hash: duktape_tx_hash, index: 0))

Unlike the carrot example, the duktape script code requires one argument: the JavaScript source code to execute:

pry(main)> duktape_hello_type_script = CKB::Types::Script.new(code_hash: duktape_data_hash, args: [CKB::Utils.bin_to_hex("CKB.debug(\"I'm running in JS!\")")])

Note that with different arguments, you can create different duktape-powered type scripts for different use cases:

pry(main)> duktape_hello_type_script = CKB::Types::Script.new(code_hash: duktape_data_hash, args: [CKB::Utils.bin_to_hex("var a = 1;\nvar b = a + 2;")])

This reflects the difference between script code and script mentioned above: duktape serves as the script code that provides a JavaScript engine, while different scripts built on the duktape script code provide different functionality on chain.

Now we can create a cell with the duktape type script attached:

pry(main)> tx = wallet.generate_tx(wallet2.address, CKB::Utils.byte_to_shannon(200))
pry(main)> tx.deps.push(duktape_out_point.dup)
pry(main)> tx.outputs[0].instance_variable_set(:@type, duktape_hello_type_script.dup)
pry(main)> tx.witnesses[0].data.clear
pry(main)> tx = tx.sign(wallet.key, api.compute_transaction_hash(tx))
pry(main)> api.send_transaction(tx)
=> "0x2e4d3aab4284bc52fc6f07df66e7c8fc0e236916b8a8b8417abb2a2c60824028"

We can see that the script executes successfully. If the log level of the ckb-script module is set to debug in your ckb.toml file, you can see the following log:

2019-07-15 05:59:13.551 +00:00 http.worker8 DEBUG ckb-script  script group: c35b9fed5fc0dd6eaef5a918cd7a4e4b77ea93398bece4d4572b67a474874641 DEBUG OUTPUT: I'm running in JS!

Now you have successfully deployed a JavaScript engine on CKB and run JavaScript-based scripts on CKB!

Feel free to try any JavaScript code you like here.

A thought exercise

Now that you are familiar with the basics of CKB scripting, here is something to think about:
In this article you have seen what an always-success script looks like, but what about an always-failure script? How small can an always-failure script (and its script code) be?

Hint: this is not a GCC optimization contest, just a thought exercise.

Preview of the next article

I know this is a long post; I hope you have tried it out and successfully deployed a script to CKB. In the next article we will cover an important topic: how to define your own user-defined token (UDT) on CKB. The best part of UDTs on CKB is that each user can store their UDTs in their own cells, unlike ERC20 tokens on Ethereum, where everyone's tokens must live in the token issuer's single address. All of this can be achieved with type scripts alone.

If you are interested, please stay tuned :)
