How to become a “real” person in the digital age?

Time:2021-9-5

Reading guide

This is not a clickbait title. The article is all substance, and there are solid technical reasons for choosing this title.

The title is phrased as a question. Look carefully and you will see that it actually contains two questions:

1. How do you get an identity in the digital age?

The answer to the first question is an “ordinary digital identity”. Everyone reading this article already has one, whether it is a WeChat ID or an IP address.

But have you noticed a problem? These digital identities are not really owned by you; they are assigned to you by an identity provider. Your IP address is assigned by your network operator, which can take it back at any time. Your WeChat account is likewise assigned by Tencent. You know the account password, but the server knows it too. You choose to trust operators and service providers not to destroy your identity at will, but there is no technical guarantee.

2. How do you truly own your identity?

Is it possible to keep control of an account truly in your own hands?

Is it possible to log in without telling the server your password, while still letting the server verify that you really know it, so that no one else can impersonate you?

The answer is yes.

This article introduces the technologies that make this possible, and the Web3-era digital identity technology built on them: decentralized identity (DID).

First, let’s review how identity has developed (if you want to go straight to the technical principles, skip to the second part).

Trends in identity development

In ancient China, identity documents first appeared in the state of Qin. During his reforms, Shang Yang invented the zhaoshentie (照身帖), an identity tablet, to guard against foreign spies. (As Figure 1-1 shows, it is very close to the modern identity card, carrying a portrait, name, registered residence and other basic information.)

Identity technology kept developing after that. The tiger tally, the death-exemption gold medal, the imperial jade seal and the ivory tokens of the imperial guards that we often see in TV dramas were all ancient means of proving identity.

In modern times, China issued its first-generation ID card in 1984. It has been improved continuously since then, with anti-counterfeiting technology added along the way. The second-generation ID card, released in 2004, added multiple anti-counterfeiting technologies. In 2013, residents’ biometric information was incorporated.

Two major trends stand out in the development of identity: anti-counterfeiting and interoperability.

The anti-counterfeiting trend is easy to explain. The point of identity is to prove that “I am me”. Anti-counterfeiting reduces the chance of “others posing as me” and of “me posing as others”.

Interoperability matters because one person often has multiple features and multiple identities: fingerprints and facial features, a resident ID card and a driver’s license.

Modern digital identity shows similar trends.

With the development of information technology, digital identity emerged and has gone through four stages: centralized identity, federated identity (1999), user-centric identity (2005) and self-sovereign identity (SSI, 2016).

There are three development trends in these four stages: decentralization, interoperability and privacy protection.

Decentralization: users have complete control over their identity. Only they know the password, only they have the right to read and modify their identity information, and no organization can take their identity away. Decentralization can be understood as anti-counterfeiting taken to its limit: it moves from making forgery difficult to a technical guarantee that only I can prove that I am me.

Interoperability: register one digital identity and use it to log in to the digital services of any other provider.

Privacy protection: users keep their own data, so they decide which data a digital service may access.

Compared with identity in general, the development of digital identity has one additional trend: privacy protection.

This is because digital identity involves data, and data privacy is currently a very hot topic. On October 21, 2020, the Legislative Affairs Commission of the National People’s Congress Standing Committee publicly solicited opinions on the Personal Information Protection Law (Draft), which means that China’s first law dedicated to protecting personal information is not far away.

Distributed digital identity belongs to the fourth stage and aims to eventually provide all the technology needed to realize self-sovereign identity (SSI). Some organizations predict that the distributed digital identity market will grow 127-fold between 2017 and 2025, from US$57.6 million to US$7.3 billion, which suggests a promising future.

Next, we introduce the technologies involved in distributed digital identity.

Asymmetric encryption and digital signature
As mentioned earlier, a technique for “not telling the server what your password is, but still allowing the server to verify that you really know the password” does exist. It is called zero-knowledge password proof (ZKPP) and is defined in IEEE P1363.2.

In terms of classification, zero-knowledge password proof belongs to public-key cryptography, and IEEE also regards it as a kind of zero-knowledge proof.

For reasons of space, we only briefly introduce asymmetric encryption here and skip the details of zero-knowledge password proof. The underlying idea is the same.

Asymmetric encryption is a very important branch of modern cryptography. In a typical asymmetric scheme, users are authenticated not with a password but with a key, which can be thought of as a very long password (for example, 50 characters).

Cryptography is mainly used to encrypt information. The content before encryption is called plaintext, for example “attack at 6am”. After applying an encryption key and an encryption algorithm, it may become something like “np7-ub-ldbuub”, which is called ciphertext.

To recover the plaintext from the ciphertext, a decryption key and a decryption algorithm are required. If the encryption key and the decryption key are the same, the scheme is symmetric encryption; if they are different, it is asymmetric encryption.
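
As a minimal sketch of the symmetric case, the Python snippet below uses a toy XOR cipher in which one key both encrypts and decrypts. The cipher, the key value and the name xor_cipher are assumptions made for illustration, not a scheme described in this article, and XOR with a short repeating key is not secure.

```python
from itertools import cycle

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key (toy cipher, not secure)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

key = b"shared-secret"                      # hypothetical key known to both sides
plaintext = b"attack at 6am"
ciphertext = xor_cipher(plaintext, key)     # encrypt
recovered = xor_cipher(ciphertext, key)     # the same key undoes the encryption

print(ciphertext.hex())
print(recovered.decode())
```

Because the same key is used in both directions, anyone who can encrypt can also decrypt, which is exactly what asymmetric encryption avoids.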

Asymmetric encryption uses two keys, called the public key and the private key.

Content encrypted with the public key can be decrypted with the private key; conversely, content encrypted with the private key can be decrypted with the public key. The private key is normally kept secret and known only to the user, while the public key is published. When others want to send the user a message, they encrypt it with the public key. Only the holder of the user’s private key can decrypt it; other people who merely have the public key cannot.
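
The public/private relationship can be sketched with textbook RSA in Python. This is only an illustration under stated assumptions: the two Mersenne primes, the exponent 65537 and the function names are chosen here for brevity, there is no padding, and the key is far too small for real use. Production systems should rely on a vetted cryptography library.

```python
# Toy textbook RSA: no padding, tiny key, for illustration only.
P = 2**61 - 1                        # a Mersenne prime (assumed demo value)
Q = 2**89 - 1                        # another Mersenne prime (assumed demo value)
N = P * Q                            # modulus, part of both keys
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)

def encrypt(message: bytes, public_key=(E, N)) -> int:
    """Encrypt with the public key; anyone may do this."""
    e, n = public_key
    value = int.from_bytes(message, "big")
    if value >= n:
        raise ValueError("message too long for this toy key")
    return pow(value, e, n)

def decrypt(ciphertext: int, private_key=(D, N)) -> bytes:
    """Decrypt with the private key; only its holder can do this."""
    d, n = private_key
    value = pow(ciphertext, d, n)
    return value.to_bytes((value.bit_length() + 7) // 8, "big")

ciphertext = encrypt(b"attack at 6am")
print(decrypt(ciphertext))
```

Only (E, N) needs to be published; D never leaves the user’s hands.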

Asymmetric encryption is mainly used for information encryption. How can it be used for user authentication?

Digital signature.

Suppose user A wants to prove that he is A. First, he constructs a message “I am A”. Next, he computes the hash of the message, H(I am A), and encrypts that hash with his private key priv. The resulting ciphertext, E(H(I am A), priv), is user A’s digital signature on the message “I am A”.

He then sends the original message “I am A” and the signature E(H(I am A), priv) to others. Using A’s public key, they decrypt the signature to obtain H(I am A), and they hash the original message themselves to obtain H(I am A)’. If H(I am A)’ == H(I am A), the sender of the “I am A” message does hold the private key priv, which proves that he is user A.
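
The sign-then-verify flow can be sketched in Python as follows. It is a hedged illustration rather than a production scheme: it uses textbook RSA with an assumed toy key, SHA-256 as the hash function H, and hypothetical function names. Real signature schemes add padding and use much larger keys.

```python
import hashlib

# Toy RSA key pair for user A (assumed demo values, NOT secure).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # private key priv (needs Python 3.8+)

def digest(message: bytes) -> int:
    """H(message): SHA-256, reduced so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """E(H(message), priv): encrypt the hash with the private key."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare the hashes."""
    return pow(signature, E, N) == digest(message)

signature = sign(b"I am A")
print(verify(b"I am A", signature))
print(verify(b"I am B", signature))
```

The verifier needs only the message, the signature and the public key (E, N); the private exponent D is never transmitted.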

In short, the private key plays the role of the user’s password, and the public key can be given to the server so that it can check whether the user really holds the private key. The check is done by verifying a digital signature.

With this foundation in place, we can now introduce distributed digital identity (DID).

A distributed digital identity system is built on asymmetric encryption and digital signatures.

DID specifications

Distributed digital identity has so far produced five main technical specifications: DID identifier, DID document, DID resolver, verifiable credential and identity hub. They are led mainly by the W3C (World Wide Web Consortium) and the DIF (Decentralized Identity Foundation).

Each specification answers a requirement of the identity system itself:

DID identifier: the format of the identity identifier;

DID document: the format of the identity information;

DID resolver: how identity information is retrieved, which underpins identity authentication;

Verifiable credential: how private data is disclosed, which underpins data authorization;

Identity hub: how private data is managed.

▲ DID identifier
According to “Zooko’s triangle”, a conjecture about identifier systems proposed by the founder of Zcash, an identifier cannot be secure, decentralized and meaningful (easy to remember) to humans all at once. W3C DID identifiers prioritize security and decentralization.

[Figure: syntax of the DID identifier in ABNF]

ALPHA and DIGIT here are core rules of ABNF (Augmented Backus-Naur Form), while the other rules not defined in ABNF are defined in RFC 3986. It is worth mentioning that the W3C DID identifier conforms to the generic URI syntax.

For example:

did:ethr:0xE6Fe788d8ca214A080b0f6aC7F48480b2AEfa9a6
This is a DID identifier, where ethr is the method name, indicating the domain in which the identity lives (here, Ethereum), and 0xE6Fe788d8ca214A080b0f6aC7F48480b2AEfa9a6 is the method-specific ID, indicating the address of this identity within that domain.
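
Splitting a DID into its parts can be sketched in Python as below. The regular expression is a deliberately simplified assumption (lowercase letters and digits for the method name, a permissive character set for the method-specific ID) and does not reproduce the full ABNF grammar; parse_did is a hypothetical helper name.

```python
import re
from typing import Tuple

# Simplified pattern, assumed for this sketch; the real grammar is stricter.
DID_PATTERN = re.compile(r"did:([a-z0-9]+):([A-Za-z0-9._:%-]+)")

def parse_did(did: str) -> Tuple[str, str]:
    """Return (method name, method-specific ID) or raise ValueError."""
    match = DID_PATTERN.fullmatch(did)
    if match is None:
        raise ValueError(f"not a valid DID: {did!r}")
    return match.group(1), match.group(2)

method, specific_id = parse_did(
    "did:ethr:0xE6Fe788d8ca214A080b0f6aC7F48480b2AEfa9a6")
print(method)
print(specific_id)
```

The method name is what a resolver later uses to decide where to look the identity up.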

▲ DID document
A DID identifier only identifies an identity; it does not contain any identity information. A DID document describes the details of that identity. Each DID identifier is associated with one DID document.

A DID document generally contains the following:

The DID identifier (required);

A collection of cryptographic material, such as public keys;

A set of verification methods;

A collection of service endpoints;

Timestamps, including creation time and update time.

An example DID document:

{
  "@context": "https://w3id.org/did/v1",
  "id": "did:ethr:0xE6Fe788d8ca214A080b0f6aC7F48480b2AEfa9a6",
  "publicKey": [
    {
      "id": "did:ethr:0xE6Fe788d8ca214A080b0f6aC7F48480b2AEfa9a6#controller",
      "type": "Secp256k1VerificationKey2018",
      "controller": "did:ethr:0xE6Fe788d8ca214A080b0f6aC7F48480b2AEfa9a6",
      "ethereumAddress": "0xe6fe788d8ca214a080b0f6ac7f48480b2aefa9a6"
    }
  ],
  "authentication": [
    {
      "type": "Secp256k1SignatureAuthentication2018",
      "publicKey": "did:ethr:0xE6Fe788d8ca214A080b0f6aC7F48480b2AEfa9a6#controller"
    }
  ]
}

The @context field indicates the version of the document format; the id field indicates the DID the document belongs to; the publicKey field lists the relevant public keys; the authentication field references a public key stored in the publicKey field and so defines a way to verify the identity of the DID user. A DID document is in fact somewhat similar to a certificate in a traditional PKI system.

The actual format of a DID document can be JSON, JSON-LD, YAML, XML and so on. The document needs to be stored on-chain, or at least its hash does.
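
As a rough sketch of how a service might read such a document, the Python snippet below follows each authentication entry to the public key it references, and computes a hash that could be anchored on-chain. The helper names and the sorted-key JSON encoding used as a canonical form are assumptions of this sketch, not part of any specification.

```python
import hashlib
import json

DID = "did:ethr:0xE6Fe788d8ca214A080b0f6aC7F48480b2AEfa9a6"
did_document = {
    "@context": "https://w3id.org/did/v1",
    "id": DID,
    "publicKey": [{
        "id": DID + "#controller",
        "type": "Secp256k1VerificationKey2018",
        "controller": DID,
        "ethereumAddress": "0xe6fe788d8ca214a080b0f6ac7f48480b2aefa9a6",
    }],
    "authentication": [{
        "type": "Secp256k1SignatureAuthentication2018",
        "publicKey": DID + "#controller",
    }],
}

def authentication_keys(document: dict) -> list:
    """Return the publicKey entries referenced by the authentication section."""
    keys_by_id = {key["id"]: key for key in document.get("publicKey", [])}
    return [keys_by_id[auth["publicKey"]]
            for auth in document.get("authentication", [])
            if auth.get("publicKey") in keys_by_id]

def document_hash(document: dict) -> str:
    """SHA-256 over a sorted-key JSON encoding (an assumed canonical form)."""
    encoded = json.dumps(document, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()

print(authentication_keys(did_document)[0]["ethereumAddress"])
print(document_hash(did_document))
```

Storing only the hash on-chain keeps the chain small while still letting anyone detect a modified document.
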
▲ DID resolver
The resolver’s job is to obtain the DID document from a DID identifier. When a DID user logs in to a service, the service provider calls the resolver to fetch the DID document and thereby learns how to verify the user.

The DID resolver specification is led mainly by the DIF.

The architecture of the DIF Universal Resolver is shown in the figure below. It first extracts the method from the DID identifier and then calls the driver for that method to complete the resolution. How each driver is implemented is not restricted, but it must follow the interface specification.

The DIF Universal Resolver can be thought of as a driver aggregator.

This architecture exists because different DIDs are stored on different blockchains, and possibly in different smart contracts. To use a DID, a user must first register it, and registration must be tied to a blockchain (or another type of decentralized system), such as Ethereum.

Moreover, ordinary users typically rely on a DID registry service to complete registration. On Ethereum, for example, uPort can help users register a DID. Other chains may have other services that provide a DID registry.

Each DID registry service may therefore work differently. The aggregator architecture maximizes compatibility with all DID registries.

Figure 3-1
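
The driver-aggregator idea can be sketched in a few lines of Python. This is not the DIF Universal Resolver’s actual interface: the registry dictionary, the function names and the stand-in ethr driver (which returns a stub document instead of querying Ethereum) are all hypothetical.

```python
from typing import Callable, Dict

Driver = Callable[[str], dict]       # a driver maps a DID to its DID document
DRIVERS: Dict[str, Driver] = {}      # method name -> driver

def register_driver(method: str, driver: Driver) -> None:
    """Plug in the driver responsible for one DID method."""
    DRIVERS[method] = driver

def resolve(did: str) -> dict:
    """Extract the method from the DID and delegate to its driver."""
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did":
        raise ValueError(f"not a valid DID: {did!r}")
    method = parts[1]
    if method not in DRIVERS:
        raise LookupError(f"no driver registered for method {method!r}")
    return DRIVERS[method](did)

def ethr_driver(did: str) -> dict:
    """Stand-in driver; a real one would read the document from Ethereum."""
    return {"id": did, "publicKey": [], "authentication": []}

register_driver("ethr", ethr_driver)
print(resolve("did:ethr:0xE6Fe788d8ca214A080b0f6aC7F48480b2AEfa9a6")["id"])
```

Supporting a new chain then means registering one more driver, without touching the resolver itself.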

▲ Verifiable credential
Next comes the fourth DID technical specification, the verifiable credential (VC), which may be the most important specification in the DID ecosystem at present.

As mentioned earlier, the purpose of a VC is data authorization, made as fine-grained as possible so that the leakage of private data is minimized.

Figure 3-2

Something can be proven while disclosing different amounts of private information, as shown in Figure 3-2. From left to right, the amount disclosed decreases. Let’s take an example.

Suppose you are 24 years old. How can you prove that you are over 21? Say there are three options:

Show your ID card

Show your date of birth

Present a certificate stating that you are over 21

Which would you choose?

Obviously, the three options disclose different amounts of your personal information.

The first discloses the most, the second less, and the third discloses almost nothing beyond what is needed.
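
The three options can be compared side by side in a short Python sketch. The ID card record, the level names and the helper functions are invented for illustration; the reference date is the date of this article.

```python
from datetime import date

# Hypothetical ID card record; every value here is made up for illustration.
ID_CARD = {
    "name": "Alice Example",
    "address": "1 Example Road",
    "birth_date": date(1997, 3, 14),
}

def age_on(birth_date: date, today: date) -> int:
    """Full years elapsed between birth_date and today."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

def disclose(card: dict, level: str, today: date) -> dict:
    """Return only the data revealed at the chosen disclosure level."""
    if level == "full_card":
        return dict(card)
    if level == "birth_date":
        return {"birth_date": card["birth_date"]}
    if level == "over_21":
        return {"over_21": age_on(card["birth_date"], today) >= 21}
    raise ValueError(f"unknown disclosure level: {level!r}")

today = date(2021, 9, 5)
for level in ("full_card", "birth_date", "over_21"):
    print(level, disclose(ID_CARD, level, today))
```

At the last level the verifier learns a single yes/no answer, which is all the question “are you over 21?” actually requires.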

Figure 3-3

Operating VCs requires a set of mechanisms and several roles, as Figure 3-3 shows. The roles are as follows:

Issuer: can issue VCs because it has access to user data. Examples include governments, banks, universities and other institutions.

Verifier: can verify VCs and, on that basis, provide certain services to those who present them. Examples include game websites and cigarette stores.

Holder: the user, who requests and receives VCs from issuers, holds them, and presents them to verifiers. Issued VCs can be stored in a wallet for later use.

Identifier registry: maintains the database of DID identifiers and keys (DID documents), for example a blockchain, a trusted database or a distributed ledger.

What does the data format of a VC look like? It contains roughly the following fields:

The ID of the VC (required);

The issuer of the VC;

The subject matter of the claim;

The proof of the claim;

Timestamps, such as the issuance time.

An example:
{
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    "https://www.w3.org/2018/credentials/examples/v1"
  ],
  "id": "http://example.edu/credentials/1872",
  "type": ["VerifiableCredential", "AlumniCredential"],
  "issuer": {
    "id": "did:example:76e12ec712ebc6f1c221ebfeb1f",
    "name": "Example University"
  },
  "issuanceDate": "2020-01-01T19:23:24Z",
  "credentialSubject": {
    "id": "did:example:ebfeb1f712ebc6f1c276e12ec21",
    "alumniOf": {
      "id": "did:example:c276e12ec21ebfeb1f712ebc6f1",
      "name": [{
        "value": "Example University",
        "lang": "en"
      }]
    }
  },
  "proof": {
    "type": "RsaSignature2018",
    "created": "2017-06-18T21:19:10Z",
    "proofPurpose": "assertionMethod",
    "verificationMethod": "https://example.edu/issuers/keys/1",
    "jws": "eyJhbGciOiJSUzI1NiIsImI2NCI6ZmFsc2UsImNyaXQiOlsiYjY0Il19..TCYt5XsITJX1CxPCT8yAV-TVkIEq_PbChOMqsLfRoPsnsgw5WEuts01mq-pQy7UJiN5mgRxD-WUcX16dUEMGlv50aqzpqh4Qktb3rk-uQy72IFLOqV0G_zS245-kronKb78cPN25DGlcTwLtjPAYuNzVBAh4vGHSrQyHUdBBPM"
  }
}

In this VC, the @context field indicates the format of the VC; the id field is the ID of the VC; the type field is the type of the VC; the issuer field identifies who issued the VC; the issuanceDate field is the issue date; the credentialSubject field carries the main content of the VC; and the proof field is the proof part of the VC, which a verifier can check.

The most important parts here are, of course, credentialSubject and proof.
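
Before checking the proof, a verifier would typically make sure that the VC is structurally sound. The Python sketch below does a minimal version of that check. The list of expected fields simply follows the fields named in this section and is an assumption of the sketch, as is the helper name check_credential; it does not verify the signature.

```python
from datetime import datetime

# Field list assumed for this sketch, following the fields named above.
EXPECTED_FIELDS = ("@context", "id", "type", "issuer",
                   "issuanceDate", "credentialSubject", "proof")

def check_credential(vc: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing field: {name}"
                for name in EXPECTED_FIELDS if name not in vc]
    if "VerifiableCredential" not in vc.get("type", []):
        problems.append("type does not include VerifiableCredential")
    try:
        datetime.strptime(vc.get("issuanceDate", ""), "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        problems.append("issuanceDate is not a valid UTC timestamp")
    return problems

credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "id": "http://example.edu/credentials/1872",
    "type": ["VerifiableCredential", "AlumniCredential"],
    "issuer": {"id": "did:example:76e12ec712ebc6f1c221ebfeb1f"},
    "issuanceDate": "2020-01-01T19:23:24Z",
    "credentialSubject": {"id": "did:example:ebfeb1f712ebc6f1c276e12ec21"},
    "proof": {"type": "RsaSignature2018"},
}
print(check_credential(credential))
```

A check like this catches malformed input early, for example a timestamp whose minutes field is out of range.
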
▲ Identity hub
Next comes the fifth DID technical specification, the identity hub.

First, we should be clear that identity data and private data are different. Identity data means things like the public key, which relate only to the account, while private data means data about the user’s real-world self, such as gender and age.

DID documents store only identity-related data; the identity hub stores users’ private data. Despite its name, the identity hub stores data and can be thought of as a data bank.

We are used to keeping our assets in a bank. Why? For safety: the bank keeps our assets secure. Similarly, in the future we will keep our data in a data bank to keep it secure.

The identity hub has the following characteristics:

An identity hub is a decentralized, off-chain personal data store that hands control of personal data to users. It lets users store sensitive data securely and privately, and no one can obtain that data without the user’s explicit authorization.

The physical location of the identity hub is decided by the user: it can be local (a mobile phone or PC) or in the cloud.

In the future, users will store their private data in identity hubs, and an application service that wants to use that data will have to request the user’s consent first.

A simple example
Let’s tie everything above together with a simple example.

Suppose Xiaoming has an account 0x96f…3d4 on Ethereum and wants to use a DID to log in to game website A, which supports DID.

  1. Xiaoming finds a DID registry service (such as uPort) to help him register a DID on Ethereum: did:ethr:0x96f…3d4;
  2. The DID registry service stores the DID document (including the public key and other information) for that DID on the Ethereum chain;
  3. Xiaoming logs in to game website A with the registered DID (game website A obtains the DID document through the DID resolver and thereby learns how to verify the DID);
  4. Xiaoming stores his private data in several identity hubs; the private data from his resident ID card is held by government agency G, which also has to register a DID of its own;
  5. On game website A, Xiaoming wants to prove that he is over 16 in order to get game time;
  6. Xiaoming asks government agency G (the issuer) to issue a verifiable credential (VC) stating that his age is over 16;
  7. Government agency G checks Xiaoming’s resident data, finds that he is indeed over 16, and issues the VC (with G’s signature) to Xiaoming;
  8. Game website A verifies the signature on the VC, finds that it was indeed issued by government agency G, chooses to trust it, and grants the game time;
  9. If game website A goes bankrupt one day, Xiaoming’s DID still exists and can be used to log in to other applications (such as game website B).
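
Steps 6 to 8 can be sketched in Python as follows. Everything here is a simplified assumption: agency G uses a toy textbook RSA key instead of a real signature suite, the DIDs are placeholder did:example values, the credential carries only a single ageOver16 claim, and the function names are hypothetical.

```python
import hashlib
import json

# Toy RSA key pair standing in for government agency G (NOT secure).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # G's private exponent (needs Python 3.8+)

def digest(body: dict, modulus: int) -> int:
    """Hash a sorted-key JSON encoding of the credential body."""
    encoded = json.dumps(body, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big") % modulus

def issue_credential(subject_did: str, claim: dict) -> dict:
    """Step 7: G signs a minimal claim about the subject."""
    body = {"issuer": "did:example:agency-g",
            "credentialSubject": {"id": subject_did, **claim}}
    return {**body, "proof": pow(digest(body, N), D, N)}

def verify_credential(vc: dict, issuer_public_key: tuple) -> bool:
    """Step 8: the website checks G's signature with G's public key."""
    e, n = issuer_public_key
    body = {name: value for name, value in vc.items() if name != "proof"}
    return pow(vc["proof"], e, n) == digest(body, n)

credential = issue_credential("did:example:xiaoming", {"ageOver16": True})
print(verify_credential(credential, (E, N)))
```

Game website A never sees Xiaoming’s birth date or ID card; it only learns that agency G vouches for the single claim in the credential.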

Summary
To summarize DID:

DID was proposed to achieve self-sovereignty. But does it actually achieve that goal?

From the identity perspective, the DID scheme is indeed good. The identity is stored on a blockchain, and asymmetric keys give the user complete control over the account. DID does this part well.

However, some problems are evident, mainly around data storage.

In the VC system, the issuers that issue VCs still hold user data. The VC operating architecture is therefore centralized and controllable in essence: users must trust certain institutions to host their private data. Still, this is much better than putting that private data on each service provider’s server.

And although the service provider (such as game website A in the example above) cannot obtain the user’s private data, the data the user generates at the service provider, such as the equipment, skins and levels Xiaoming earns by playing, still seems to be firmly controlled by game website A.