Thoroughly understand cookie, session, token and JWT

Time:2022-1-7

What is authentication

In general, authentication means verifying the identity of the current user, that is, proving that “you are who you say you are”. For example, if you clock in and out of work with a fingerprint scanner, the clock-in succeeds only when your fingerprint matches the one enrolled in the system.

Authentication in the Internet

  • Logging in with a user name and password
  • Receiving a login link by email
  • Receiving a verification code on a mobile phone number
  • For the last two, whoever can receive the email or the verification code is assumed to be the owner of the account

What is authorization

  • The user grants a third-party application access to some of the user’s resources

    • When you install a mobile app, it asks whether to allow certain permissions (access to the photo album, location, and so on)
    • When you open a WeChat mini program and log in, it asks whether to allow certain permissions (access to personal information such as nickname, avatar, region and gender)
  • Mechanisms used to implement authorization include cookie, session, token and OAuth

What are credentials

Authentication and authorization both presuppose a medium (a credential) that marks the identity of the visitor.

  • During the Warring States period, Shang Yang’s reforms introduced the zhaoshentie (an identity tablet). Issued by the government, it was a polished bamboo board engraved with the holder’s portrait and place of origin. Every citizen had to carry one; anyone without it was treated as an unregistered person or a spy.
  • In real life, everyone has a resident identity card, a legal document that proves the holder’s identity. With it we can apply for a SIM card, a bank card or a personal loan, buy transport tickets, and so on. It is the credential used for authentication.
  • On the Internet, a typical website (such as Juejin) has two modes: guest mode and logged-in mode. In guest mode you can browse the articles on the site. Once you want to like, bookmark or share an article, you need to log in or register an account. When the login succeeds, the server issues a token to the user’s browser. This token identifies you: the browser sends it with every request, which unlocks the functions that are unavailable in guest mode.

What is a cookie

HTTP is a stateless protocol: it keeps no memory of earlier transactions, and the server saves no session information once an exchange between client and server completes. Each request is completely independent, so the server cannot confirm the identity of the current visitor or tell whether this request and the previous one came from the same person. To track the session (to know who is accessing it), the server and the browser must therefore actively maintain some state that tells the server whether two requests come from the same browser. This state is implemented with cookies or sessions.

  • A cookie is stored on the client: it is a small piece of data that the server sends to the user’s browser, which stores it locally and sends it back with subsequent requests to the same server.
  • Cookies cannot cross domains: each cookie is bound to a single domain name and cannot be used under other domain names. A domain and its subdomains can share a cookie, depending on the cookie’s domain attribute.
Important properties of cookies
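
As a rough illustration of the attributes a cookie commonly carries, the sketch below builds a Set-Cookie header value with Python's standard http.cookies module. The cookie name, value and domain are made-up examples, not taken from any real site.

```python
from http.cookies import SimpleCookie

# Build a Set-Cookie header the way a server might; all values are illustrative.
cookie = SimpleCookie()
cookie["sessionid"] = "abc123"
cookie["sessionid"]["domain"] = "example.com"   # which hosts receive the cookie
cookie["sessionid"]["path"] = "/"               # which URL paths receive it
cookie["sessionid"]["max-age"] = 3600           # lifetime in seconds
cookie["sessionid"]["secure"] = True            # only sent over HTTPS
cookie["sessionid"]["httponly"] = True          # hidden from JavaScript

header_value = cookie["sessionid"].OutputString()
print("Set-Cookie:", header_value)
```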

What is session

  • A session is another mechanism for recording the state of the conversation between server and client
  • A session is implemented on top of cookies: the session data is stored on the server, and the session ID is stored in a cookie on the client

Session authentication process
  • When the user first sends a request to the server, the server creates a session from the information the user submitted
  • In the response, the server returns the session’s unique identifier, the session ID, to the browser
  • The browser stores the session ID in a cookie, and the cookie records which domain name the session ID belongs to
  • On the next request, the browser checks whether it holds a cookie for this domain name and, if so, sends it to the server automatically. The server reads the session ID from the cookie and looks up the corresponding session. If no session is found, the user has not logged in or the login has expired. If a session is found, the user is logged in and the request can proceed.

As this process shows, the session ID is the bridge between cookie and session, and most systems verify the user’s login status on this principle.
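
The flow above can be sketched in a few lines. This is a minimal illustration, assuming an in-memory dictionary as the session store and a cookie named sessionid; the function names login and current_user are hypothetical, and a real framework would handle the headers for you.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # session ID -> session data, kept on the server

def login(username):
    """Create a session and return the Set-Cookie value for the browser."""
    session_id = secrets.token_hex(16)          # unpredictable identifier
    SESSIONS[session_id] = {"user": username}
    return f"sessionid={session_id}; HttpOnly; Path=/"

def current_user(cookie_header):
    """Look up the session from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("sessionid")
    session = SESSIONS.get(morsel.value) if morsel else None
    return session["user"] if session else None  # None means "not logged in"

set_cookie = login("alice")
cookie_header = set_cookie.split(";")[0]        # the browser echoes name=value back
print(current_user(cookie_header))              # alice
print(current_user("sessionid=forged"))         # None
```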

The difference between cookie and session

  • Security: a session is safer than a cookie, because the session is stored on the server while the cookie is stored on the client.
  • Value types: a cookie can only store string data, so other types must be converted to strings first. A session can store any data type.
  • Validity period: a cookie can be set to persist for a long time, as with the stay-logged-in option we often use. A session usually expires sooner: by default it ends when the client is closed or when the session times out.
  • Storage size: a single cookie cannot hold more than 4 KB of data. A session can hold far more, but when traffic is heavy, sessions occupy a lot of server resources.

What is a token

Access Token
  • The credential required to access a resource interface (API).
  • A simple token consists of: uid (the user’s unique identifier), time (the current timestamp) and sign (a signature: the preceding parts of the token hashed into a fixed-length hexadecimal string).
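
A simple token of this shape might be built as follows. This is only a sketch: the text says the sign part comes from a hash algorithm, and the example assumes a keyed hash (HMAC-SHA256) with a server-side secret. The names SECRET, make_token and check_token are hypothetical.

```python
import hashlib
import hmac
import time

SECRET = b"server-side-secret"  # hypothetical key; keep it out of source control

def make_token(uid, now=None):
    """Return 'uid.time.sign', where sign is an HMAC over the first two parts."""
    ts = int(now if now is not None else time.time())
    payload = f"{uid}.{ts}"
    sign = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sign}"

def check_token(token, max_age=3600, now=None):
    """Return the uid if the signature matches and the token is fresh, else None."""
    try:
        uid, ts, sign = token.split(".")
        issued = int(ts)
    except ValueError:
        return None
    expected = hmac.new(SECRET, f"{uid}.{ts}".encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sign.encode(), expected.encode()):
        return None
    current = now if now is not None else time.time()
    return uid if current - issued <= max_age else None
```

The server stores nothing per token: it recomputes the signature from the uid and timestamp on every request and compares.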

Characteristics:

  • The server is stateless and scales well
  • Works well on mobile devices
  • Secure
  • Supports calls across programs

Authentication process of Token:

  • The client sends a login request with the user name and password
  • The server receives the request and verifies the user name and password
  • After successful verification, the server issues a token and sends it to the client
  • The client stores the token, for example in a cookie or in localStorage
  • Every time the client requests a resource from the server, it must send the token issued by the server
  • The server verifies the token carried in the request and, if verification succeeds, returns the requested data to the client

Every request must carry the token, which should be placed in an HTTP header.
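
On the client side, attaching the token to a request could look like the sketch below. It assumes the common convention of an Authorization header with the Bearer scheme; the URL and the helper name authorized_request are placeholders, and the request is built but never sent.

```python
import urllib.request

def authorized_request(url, token):
    """Build (but do not send) a request that carries the token in a header."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})

req = authorized_request("https://api.example.com/profile", "<token>")
print(req.get_header("Authorization"))  # Bearer <token>
```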

Token-based authentication is stateless on the server side: the server does not need to store token data.
Tokens trade the computation time of parsing the token for the storage space a session would need, which reduces pressure on the server and avoids frequent database queries.

The token is managed entirely by the application, so it is not restricted by the same-origin policy the way cookies are.

Refresh Token

A refresh token is a token dedicated to renewing access tokens. Without one, the access token can still be renewed, but every renewal requires the user to enter the user name and password again, which is tedious. With a refresh token, the client renews the access token directly, with no extra action from the user.

An access token has a relatively short validity period. When it expires, the client can obtain a new one with the refresh token. If the refresh token has also expired, the user has to log in again.

The refresh token and its expiration time are stored in the server’s database and are checked only when a new access token is requested. This does not affect the response time of business interfaces, and the refresh token does not need to be held in memory, as a session is, to cope with a large number of requests.
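
The interplay of the two tokens can be sketched as follows. The lifetimes, the dictionary standing in for the database, and the function names are all assumptions made for illustration; the access token is shown as a plain dictionary where a real system would sign it.

```python
import secrets

ACCESS_TTL = 15 * 60            # assumed: access tokens live 15 minutes
REFRESH_TTL = 7 * 24 * 3600     # assumed: refresh tokens live 7 days
REFRESH_STORE = {}              # refresh token -> (uid, expiry); stands in for the database

def issue_tokens(uid, now):
    """Return a fresh (access, refresh) pair and record the refresh token."""
    access = {"uid": uid, "exp": now + ACCESS_TTL}   # would be signed in practice
    refresh = secrets.token_urlsafe(32)
    REFRESH_STORE[refresh] = (uid, now + REFRESH_TTL)
    return access, refresh

def refresh_access(refresh, now):
    """Exchange a valid refresh token for a new access token, else None."""
    record = REFRESH_STORE.get(refresh)
    if record is None or now >= record[1]:
        REFRESH_STORE.pop(refresh, None)   # expired or unknown: user must log in again
        return None
    uid, _ = record
    return {"uid": uid, "exp": now + ACCESS_TTL}
```

Only refresh_access touches the store, which matches the point above that ordinary business requests never need to look the refresh token up.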

The difference between token and session

A session is a mechanism for recording the state of the conversation between server and client. It makes the server stateful, since the server keeps the session information. A token is the credential required to access a resource interface (API). It makes the server stateless, since no session information is stored.

Session and token are not mutually exclusive. For authentication, a token is generally considered more secure than a session, because every request carries a signature, which helps guard against tampering and replay attacks, whereas a session relies on the transport layer for communication security. If you need stateful sessions, you can still add a session to keep some state on the server.

Session authentication simply stores the user’s information in the session. Because the session ID is unpredictable, this is considered safe enough. A token, if it means an OAuth token or a similar mechanism, provides both authentication and authorization: authentication is for the user, authorization is for the app. Its purpose is to give one app the right to access one user’s information. Such a token is unique and cannot be transferred to other apps or other users.

A session provides only simple authentication: whoever holds the session ID is assumed to have all of the user’s rights. It must be kept strictly confidential; it should be stored only by the website itself and never shared with other websites or third-party apps.

In short: if your user data may need to be shared with third parties, or third parties are allowed to call your API, use tokens. If it is only ever your own website and app, either will do.

What is JWT

  • JSON Web Token (JWT) is currently the most popular cross-domain authentication solution.
  • It is an authentication and authorization mechanism.
  • JWT is a JSON-based open standard (RFC 7519) for passing claims between network application environments. JWT claims are typically used to pass the identity of an authenticated user between an identity provider and a service provider, so that resources can be obtained from a resource server, for example at user login.
  • A JWT can be signed with the HMAC algorithm or with an RSA public/private key pair. Because of the digital signature, the transmitted information can be trusted.
Generate JWT

  • jwt.io/
  • www.jsonwebtoken.io/
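
For a sense of what such tools do, the sketch below assembles an HS256-signed JWT using only the Python standard library. The payload and the secret are made-up examples, and the helper names b64url and sign_jwt are hypothetical; production code would normally use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def sign_jwt(payload: dict, secret: bytes) -> str:
    """Return header.payload.signature using HMAC-SHA256 (alg HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode()),
        b64url(json.dumps(payload, separators=(",", ":")).encode()),
    ]
    signing_input = ".".join(segments).encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return ".".join(segments + [b64url(signature)])

token = sign_jwt({"sub": "1234567890", "name": "Alice"}, b"demo-secret")
print(token)
```

The first two segments are only encoded, not encrypted, so anyone holding the token can read the payload; the third segment is what lets the server detect changes.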

Principle of JWT

JWT authentication process:
  • The user logs in with a user name and password; after successful authentication, the server returns a JWT to the client
  • The client saves the token locally (usually in localStorage or a cookie)
  • When the user wants to access a protected route or resource, the client adds the JWT to the Authorization request header using the Bearer scheme, like this:

    Authorization: Bearer <token>
  • The server’s protected routes check the JWT in the Authorization request header and, if it is valid, allow the request
  • Because a JWT is self-contained (it carries some session information itself), fewer database queries are needed
  • Because JWT does not use cookies, you can serve your API from any domain name without worrying about cross-origin resource sharing (CORS)
  • Because the user’s state is no longer stored in the server’s memory, this is a stateless authentication mechanism
How JWT is used

The client receives the JWT returned by the server and can store it in a cookie or in localStorage.

  • Method 1

When the user wants to access a protected route or resource, the JWT can be placed in a cookie and sent automatically, but that does not work across domains. A better approach is to put it in the Authorization field of the HTTP request header, using the Bearer scheme.

    GET /calendar/v1/events
    Host: api.example.com
    Authorization: Bearer <token>
  • The user’s state is not stored in the server’s memory, so this is a stateless authentication mechanism
  • The server’s protected routes check the JWT in the Authorization request header and, if it is valid, allow the request
  • Because a JWT is self-contained, fewer database queries are needed
  • These properties let us rely entirely on JWT’s statelessness to provide data API services, or even to build a download streaming service
  • Because JWT does not use cookies, you can serve your API from any domain name without worrying about cross-origin resource sharing (CORS)
  • Method 2

For cross-domain requests, the JWT can be placed in the body of a POST request.

  • Method 3

Pass it in the URL

    http://www.example.com/user?token=xxx

JWT used in the project

Project address: https://github.com/yjdjiayou/jwt-demo

The difference between token and JWT

Similarities
  • Both are tokens for accessing resources
  • Both can record user information
  • Both make the server stateless
  • With both, the client can access protected resources on the server only after successful verification
Differences
  • Token: when the server verifies a token sent by the client, it still needs to query the database for the user’s information before it can decide whether the token is valid.
  • JWT: the payload and its signature are stored together on the client. The server only needs its key to verify the signature (the verification is part of JWT itself). Because the JWT already contains the user information, database queries are unnecessary or much reduced.

Common front and back end authentication methods

  • Session-Cookie
  • Token verification (including JWT and SSO)
  • OAuth 2.0 (open authorization)

Common encryption algorithms

Hash algorithm

A hash algorithm, also called a hash function, is a method of creating a small digital “fingerprint” from data of any kind. It scrambles and mixes the data to produce a hash value.

Hash algorithms are mainly used to check that data is authentic (that is, intact): the sender transmits the original message together with its hash value, and the receiver runs the same hash function to verify that the data has not changed.

Hash algorithms usually have the following characteristics:

  • Fast to compute: the hash value can be calculated quickly from the original data
  • Hard to reverse: it is practically impossible to derive the original data from the hash value
  • Input sensitivity: even a tiny change in the original data produces a very different hash value
  • Collision resistance: it is very hard to find two different inputs with the same hash value. The number of atoms in the universe is estimated at 10^60 to 10^80, so 2^256 values leave enough room for every possibility, and with a good algorithm the probability of a collision is very low:

    • 2^128 is 340282366920938463463374607431768211456, about 3.4 × 10^38
    • 2^160 is about 1.4615016373309029182036848327163 × 10^48
    • 2^256 is about 1.157920892373162 × 10^77
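
The figures and the input-sensitivity property are easy to check with a few lines of Python. SHA-256 is used here purely as an example of a 256-bit hash, and the two input strings are arbitrary.

```python
import hashlib

# Size of the output space for common digest lengths.
for bits in (128, 160, 256):
    print(f"2^{bits} = {2 ** bits:.4e}")

# Input sensitivity: changing one character yields an unrelated digest.
a = hashlib.sha256(b"hello world").hexdigest()
b = hashlib.sha256(b"hello worle").hexdigest()
differing = sum(x != y for x, y in zip(a, b))
print(a)
print(b)
print(f"{differing} of {len(a)} hex digits differ")
```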

Notes

  • Hashing alone cannot guarantee that the data has not been maliciously tampered with, because an attacker could alter both the original data and the hash value. To protect against tampering, combine the hash value with an RSA public/private key scheme.
  • Hash algorithms are mainly used to catch errors in transmission. Early computers used the first 7 bits for data and the eighth bit as a parity bit (12.5% overhead, and inefficient). With a hash algorithm, a 128-bit or 256-bit hash value is generated for a block of data or a file, and the data is retransmitted if verification fails.

Common issues

Issues to consider when using cookies
  • Cookies are stored on the client, where they are easy to tamper with, so validate them before use
  • Do not store sensitive data such as user passwords or account balances
  • Setting HttpOnly improves security to some extent
  • Keep cookies small: a cookie cannot hold more than 4 KB of data
  • Set the correct domain and path to reduce data transmission
  • Cookies cannot cross domains
  • The number of cookies is limited: older browsers store at most 20 cookies per website and about 300 in total, and current browsers allow more
  • Native mobile clients handle cookies poorly, and sessions are built on cookies, so mobile clients commonly use tokens
Issues to consider when using session
  • Sessions are stored on the server. When many users are online at the same time, the sessions occupy a lot of memory, so expired sessions must be cleaned up regularly
  • When a website is deployed as a cluster, sessions must be shared among the web servers. A session is created by one server, but the server that handles a later request is not necessarily the one that created it, so that server cannot find the login credentials previously put into the session.
  • When several applications share sessions, cross-domain problems arise on top of the above, because the applications may be deployed on different hosts and each one needs to handle cookies across domains.
  • The session ID is stored in a cookie. What if the browser disables or does not support cookies? The usual fallback is to append the session ID to the URL as a parameter (URL rewriting), so a session does not strictly require cookies
  • Native mobile clients handle cookies poorly, and sessions are built on cookies, so mobile clients commonly use tokens
Issues to consider when using token
  • If looking tokens up in the database takes too long, keep them in memory instead. Redis, for example, is well suited to token lookups.
  • The token is managed entirely by the application, so it is not restricted by the same-origin policy the way cookies are
  • A token can avoid CSRF attacks (because no cookie is needed)
  • Native mobile clients handle cookies poorly, and sessions are built on cookies, so mobile clients commonly use tokens
Issues to consider when using JWT
  • Because JWT does not rely on cookies, you can serve your API from any domain name without worrying about cross-origin resource sharing (CORS)
  • JWT is not encrypted by default, but it can be: after generating the original token, you can encrypt it again with a key.
  • Without encryption, do not write secret data into a JWT.
  • JWT can be used not only for authentication but also for exchanging information. Used well, it reduces the number of database queries the server makes.
  • The biggest advantage of JWT is that the server no longer needs to store sessions, so the authentication service is easy to scale out. This is also its biggest disadvantage: because the server keeps no session state, it cannot revoke a token or change its permissions while the token is in use. Once issued, a JWT remains valid until it expires, unless the server deploys additional logic.
  • A JWT itself contains the authentication information, so if it leaks, anyone can obtain all the permissions of the token. To reduce theft, the validity period of a JWT should be fairly short, and for important permissions the user should be authenticated again.
  • JWT suits one-time command authentication: issue a JWT with a very short validity period, so that even if it leaks, the risk is small. Because a new JWT is generated for every operation, there is no need to store it, which makes the scheme truly stateless.
  • To reduce theft, JWTs should not be transmitted in plaintext over HTTP; use HTTPS.
Issues to consider when using encryption algorithms
  • Never store passwords in plaintext
  • Always process passwords with a hash algorithm. Never store them with Base64 or any other encoding, which is no better than storing plaintext. Encoding and encryption are two-way processes, but a password is confidential and should be known only to its owner, so the process must be one-way. That is what hashing is for: encoding can be decoded and encryption can be decrypted, but there is no such thing as “unhashing”.
  • Never use weak or broken hash algorithms such as MD5 or SHA1; use only strong password hashing algorithms.
  • Never display or send a password in plaintext, even to its owner. If you need a “forgot password” function, randomly generate a new one-time (this is important) password and send that to the user.
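
As one possible way to follow these rules, the sketch below hashes passwords with PBKDF2-HMAC-SHA256 from the Python standard library. The iteration count is an assumed value to be tuned for your hardware, the function names are hypothetical, and bcrypt, scrypt or Argon2 are equally reasonable choices.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware

def hash_password(password, salt=None):
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    salt = salt or os.urandom(16)           # a fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key

def verify_password(password, salt, expected_key):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

Only the salt and the derived key are stored; the password itself is never written anywhere and cannot be recovered from the stored values.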

Session sharing scheme in distributed architecture

Session replication

Whenever the session on any server changes (is added, deleted or modified), that node serializes the entire session and broadcasts it to all other nodes, whether or not they need it, to keep sessions synchronized

  • Advantages: fault tolerance; sessions on all servers stay up to date in real time.
  • Disadvantages: it puts pressure on the network. With a large number of sessions it may cause network congestion and slow the servers down.
Sticky session / IP binding policy

Using the ip_hash mechanism in nginx, all requests from one IP address are directed to the same server, which binds the user to that server. When the user makes the first request, the load balancer forwards it to server A. If sticky sessions are enabled on the load balancer, every subsequent request from that user is also forwarded to server A, as if the user and server A were stuck together. This is the sticky session mechanism.

  • Advantages: it is simple and requires no special handling of the session.
  • Disadvantages: no fault tolerance. If the server currently in use fails and the user is moved to a second server, the session information is lost.
  • Applicable scenario: a failure has little impact on customers, and server failure is a low-probability event.
  • Implementation: taking nginx as an example, configure the ip_hash directive in the upstream block to enable sticky sessions.
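
The idea behind IP-based stickiness can be illustrated with a small sketch. It is not nginx's actual ip_hash algorithm, only the general principle of hashing the client address to pick a backend; the server names and the pick_server helper are hypothetical.

```python
import hashlib

SERVERS = ["a", "b", "c"]  # hypothetical backend names

def pick_server(client_ip, servers=SERVERS):
    """Map a client IP to one backend deterministically."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

first = pick_server("203.0.113.7")
second = pick_server("203.0.113.7")
print(first == second)  # True: the same IP always lands on the same server
```

Because the mapping depends on the list of servers, removing a failed server sends some clients to a different backend, which is exactly where their sessions are lost.
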
Session sharing (common)

A distributed cache such as Memcached or Redis is used to store the sessions; the Memcached or Redis deployment must itself be a cluster

Storing sessions in Redis makes the architecture more complex and adds an extra trip to Redis, but it also brings significant benefits:

  • Sessions are shared;
  • It scales horizontally (by adding Redis servers);
  • Sessions survive a server restart (but pay attention to how sessions in Redis are refreshed and invalidated);
  • Sessions can be shared not only across servers but also across platforms (such as web pages and apps)

Session persistence

Store the session in the database to ensure the persistence of the session

  • Advantages: if there is a problem with the server, the session will not be lost
  • Disadvantages: if the website has a large number of visits, storing the session in the database will cause great pressure on the database, and additional overhead is required to maintain the database.

Does the session really disappear when the browser is closed?

No.

A session stays on the server until the program tells the server to delete it, which usually happens when the user logs out. The browser, however, never notifies the server before it closes, so the server has no way of knowing that the browser has been closed. The illusion arises because most session mechanisms keep the session ID in a session cookie, which disappears when the browser is closed, so on the next connection the original session cannot be found.

If the cookie set by the server is saved to disk, or the HTTP request header sent by the browser is rewritten by some means so that the original session ID reaches the server, the original session can still be picked up when the browser is reopened.

Precisely because closing the browser does not delete the session, the server has to set an expiration time for it. When the time since the client last used the session exceeds this limit, the server assumes the client has stopped its activity and deletes the session to save storage space.
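
A server-side idle timeout of this kind could be sketched as below. The class name SessionStore, the 30-minute default and the in-memory dictionary are assumptions for illustration; a shared store such as Redis would typically rely on its own key expiry instead.

```python
import time

class SessionStore:
    """In-memory sessions that expire after a period of inactivity."""

    def __init__(self, idle_timeout=1800):
        self.idle_timeout = idle_timeout
        self._sessions = {}  # session ID -> (data, last_access_time)

    def put(self, session_id, data, now=None):
        self._sessions[session_id] = (data, time.time() if now is None else now)

    def get(self, session_id, now=None):
        """Return the data and refresh the timer, or None if missing or expired."""
        now = time.time() if now is None else now
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        data, last_access = entry
        if now - last_access > self.idle_timeout:
            del self._sessions[session_id]   # idle too long: treat as logged out
            return None
        self._sessions[session_id] = (data, now)
        return data

    def sweep(self, now=None):
        """Delete every expired session; returns how many were removed."""
        now = time.time() if now is None else now
        expired = [sid for sid, (_, t) in self._sessions.items()
                   if now - t > self.idle_timeout]
        for sid in expired:
            del self._sessions[sid]
        return len(expired)
```

Each successful get resets the timer, so an active user stays logged in while an abandoned session is eventually removed by get or by a periodic sweep.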

Author: autumn leaves no leaves, juejin.cn/post/6844904034181070861
