Basic program structure of the Go language: data types

Time:2021-3-8

Go is a statically typed programming language, which means the compiler must know the type of every value in the program at compile time. Data types exist to classify data by how much memory it needs: when programming, you request more memory only for data that needs it, so that memory is used efficiently.

The following basic types are built into the Go language:

  • Boolean type: bool
  • Integer types: int8, byte, int16, int, uint, uintptr, etc.
  • Floating point types: float32, float64
  • Complex types: complex64, complex128
  • String: string
  • Character type: rune
  • Error type: error

In addition, the Go language supports the following composite types:

  • Pointer: pointer
  • Array: array
  • Slice: slice
  • Dictionary: map
  • Channel: chan
  • Structure: struct
  • Interface: interface

1. Type

1.1 boolean type

A Boolean value can only be one of the constants true or false. For example:

var b bool = true

1.2 integer

Serial number  Length (bytes)       Type and description
1              1                    uint8: unsigned 8-bit integer (0 to 255)
2              2                    uint16: unsigned 16-bit integer (0 to 65535)
3              4                    uint32: unsigned 32-bit integer (0 to 4294967295)
4              8                    uint64: unsigned 64-bit integer (0 to 18446744073709551615)
5              1                    int8: signed 8-bit integer (-128 to 127)
6              2                    int16: signed 16-bit integer (-32768 to 32767)
7              4                    int32: signed 32-bit integer (-2147483648 to 2147483647)
8              8                    int64: signed 64-bit integer (-9223372036854775808 to 9223372036854775807)
9              1                    byte: alias for uint8
10             4                    rune: alias for int32
11             Platform dependent   uint: unsigned integer, 32 or 64 bits
12             Platform dependent   int: signed integer, 32 or 64 bits. In Go, int and int32 are different data types, and the compiler does not convert between them automatically
13             Same as a pointer    uintptr: unsigned integer used to store a pointer; 4 bytes on 32-bit platforms and 8 bytes on 64-bit platforms

Example:

var value1 int32
value2 := 64 // value2 is automatically inferred as int

 

Integers support arithmetic, comparison and bitwise operations, which are described in the "operator" section.
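Because int and int32 are distinct types, mixing them in one expression needs an explicit conversion. The following is a minimal sketch of that rule; the helper name addMixed is made up for illustration and is not part of any library.

```go
package main

import "fmt"

// addMixed is a hypothetical helper: it adds an int32 to an int.
// The int32 operand is converted explicitly, because Go never
// converts between integer types implicitly.
func addMixed(a int32, b int) int {
	return int(a) + b
}

func main() {
	var a int32 = 64
	b := 36 // inferred as int

	// Writing a + b here would not compile: mismatched types int32 and int.
	fmt.Println(addMixed(a, b)) // 100

	// Comparison and bitwise operations on an int value.
	fmt.Println(b > 10, b&4, b<<1) // true 4 72
}
```

Note that converting to a narrower type (for example int to int8) can silently discard high bits, so it is usually safer to convert toward the wider type.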

1.3 floating point

Floating point types represent numbers that contain a decimal point; for example, 1.234 is a floating-point value. Go's floating point types follow the IEEE-754 standard.

Serial number  Length (bytes)  Type and description
1              4               float32: single precision
2              8               float64: double precision

Example:

var fvalue1 float32
fvalue1 = 12
fvalue2 := 12.0 // inferred as float64; without the decimal point, fvalue2 would be inferred as int instead of a floating point type
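To check which type the compiler infers, the %T verb of fmt can print it. This is a small sketch; typeName is a hypothetical helper introduced only for this illustration.

```go
package main

import "fmt"

// typeName is a hypothetical helper that reports the type of v
// using the %T verb of fmt.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	var f1 float32 = 12
	f2 := 12.0
	n := 12

	fmt.Println(typeName(f1)) // float32
	fmt.Println(typeName(f2)) // float64
	fmt.Println(typeName(n))  // int

	// float32 and float64 are distinct types, so mixing them needs a conversion.
	fmt.Println(float64(f1) + f2) // 24
}
```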

1.4 complex number

This is the complex number of mathematics. It is composed of two real numbers (represented by floating-point numbers in the computer), one for the real part and one for the imaginary part.

Example:

var value1 complex64 // a complex type consisting of two float32 values
value1 = 3.2 + 12i
value2 := 3.2 + 12i // value2 is of type complex128
value3 := complex(3.2, 12) // value3 has the same value as value2

For a complex number z = complex(x, y), the real part x can be obtained with the built-in function real(z), and the imaginary part y with imag(z). For more functions on complex numbers, see the math/cmplx package of the standard library.
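As a short illustration of real, imag and the math/cmplx package, here is a sketch using the value 3+4i, chosen only because its modulus is a round number. The helper name parts is an assumption for this example.

```go
package main

import (
	"fmt"
	"math/cmplx"
)

// parts is a hypothetical helper that splits a complex128 into its
// real and imaginary parts with the built-in real and imag functions.
func parts(z complex128) (float64, float64) {
	return real(z), imag(z)
}

func main() {
	z := complex(3, 4) // complex128
	re, im := parts(z)
	fmt.Println(re, im)        // 3 4
	fmt.Println(cmplx.Abs(z))  // 5
	fmt.Println(cmplx.Conj(z)) // (3-4i)
}
```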

1.5 string

In Go, string is also a basic type. A string is an immutable sequence of bytes. A string can contain arbitrary data, but is usually used to hold readable text.

1.5.1. String definition

var str string // declares a string variable
str = "Hello world" // string assignment

Escape characters can be used in strings to achieve effects such as line feeds and indentation:

  • \n newline
  • \r carriage return
  • \t tab
  • \u or \U Unicode character
  • \\ the backslash itself

Example:

package main

import (
    "fmt"
)

func main() {
    var str = "Hello\nworld"
    fmt.Println(str)
}

Results of operation:

Hello
world

Multi-line string assignment:

package main

import (
	"fmt"
)

func main() {
	var str = `first line
The second line
    Third line
    The fourth line`
	fmt.Println(str)
}

Running results

first line
The second line
    Third line
    The fourth line

As this shows, escape sequences are not interpreted inside such a string, and the text is output exactly as written. Note that ` is not a single quotation mark but a backquote (backtick), the key to the left of the 1 key on the keyboard. Multi-line strings are generally used to embed source code and data.
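The difference between the two kinds of string literal can be seen by comparing lengths. The sketch below uses two made-up constants, interpreted and raw, that are spelled with the same characters between the quotes.

```go
package main

import "fmt"

// Both constants are spelled with the same characters between the quotes.
const interpreted = "a\tb\nc" // escapes are processed: 5 bytes
const raw = `a\tb\nc`         // backslashes are kept literally: 7 bytes

func main() {
	fmt.Println(len(interpreted), len(raw)) // 5 7
	fmt.Println(raw)                        // a\tb\nc
}
```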

1.5.2. String encoding

UTF-8 is a widely used encoding format and a standard encoding for text files. Go strings are conventionally UTF-8 encoded, and each character of a UTF-8 string can be accessed conveniently through the rune type. Go also supports byte-by-byte access, as with traditional ASCII text.

1.5.3 string operation

Operation  Meaning               Example
+          String concatenation  "Hello" + "world" // the result is Helloworld
len()      String length         len("HelloWorld") // the result is 10
[]         Take a byte           "HelloWorld"[1] // the result is 'e' (byte value 101)

1.5.4 string traversal

There are two ways to traverse a string in Go. One is traversal by byte:

package main

import "fmt"

func main() {
	str := "Hello World,你好世界"
	n := len(str)
	for i := 0; i < n; i++ {
		ch := str[i] // get the element of the string by index; its type is byte
		fmt.Println(i, ch)
	}

}

Running results

0 72
1 101
2 108
3 108
4 111
5 32
6 87
7 111
8 114
9 108
10 100
11 44
12 228
13 189
14 160
15 229
16 165
17 189
18 228
19 184
20 150
21 231
22 149
23 140

Go strings are UTF-8 encoded by default: each Chinese character here occupies 3 bytes, while letters and half-width punctuation are ASCII and occupy 1 byte each. Hence len(str) == 24, and the output has 24 lines.

The other is traversal by Unicode character:

package main

import "fmt"

func main() {
	str := "Hello World,你好世界"
	for i, ch := range str {
		fmt.Println(i, ch) // the type of ch is rune
	}

}

Running results

0 72
1 101
2 108
3 108
4 111
5 32
6 87
7 111
8 114
9 108
10 100
11 44
12 20320
15 22909
18 19990
21 30028

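When ranging over a string, each Unicode character is decoded into a rune (an int32 value, 4 bytes in memory) regardless of how many bytes it occupies in the UTF-8 string, so the output has 16 lines, one per character. The index i is still a byte offset, which is why it advances by 3 for each Chinese character.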
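The decoding that a range loop performs can also be written by hand with utf8.DecodeRuneInString, which returns the next character and the number of bytes it occupies. This is a sketch only; runeStarts is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStarts is a hypothetical helper that returns the byte offset at
// which each character of s begins, decoding one rune at a time.
func runeStarts(s string) []int {
	var starts []int
	for i := 0; i < len(s); {
		_, size := utf8.DecodeRuneInString(s[i:])
		starts = append(starts, i)
		i += size // size is 1 for ASCII and 3 for these Chinese characters
	}
	return starts
}

func main() {
	fmt.Println(runeStarts("d,你好")) // [0 1 2 5]
}
```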

1.5.5. String interception

In essence this is slicing, which will be explained in detail in the later section on array slices. Here is just an example:

package main

func main() {
	str := "Hello World,你好世界"
	newStr := str[1:3] // from byte index 1 (inclusive) up to byte index 3 (exclusive)
	println(newStr)
}

Running results

el

Note that this kind of string interception works on bytes, and a Chinese character occupies 3 bytes. Try the following code and see the result:

package main

func main() {
	str := "Hello World,你好世界"
	newStr := str[12:15]
	println(newStr)
}
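If the positions are meant as character positions rather than byte positions, one possible approach is to slice a []rune instead of the string itself. The helper substrRunes below is an illustrative assumption, not a standard library function.

```go
package main

import "fmt"

// substrRunes is a hypothetical helper that slices s by character
// positions [start, end) instead of byte positions.
// It assumes 0 <= start <= end <= number of characters in s.
func substrRunes(s string, start, end int) string {
	return string([]rune(s)[start:end])
}

func main() {
	s := "Hello,你好世界"
	fmt.Println(s[6:9])               // 你 (bytes 6, 7 and 8)
	fmt.Println(substrRunes(s, 6, 8)) // 你好 (characters 6 and 7)
}
```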

1.5.6. String search

Forward search for the first substring, example:

package main

import "strings"

func main() {
	str := "Hello World,你好世界"
	pos := strings.Index(str, "世界")
	pos2 := strings.Index(str, "o")
	println(pos, pos2) // prints the byte position of each substring
}

Running results

18 4

 

Reverse search for the last occurrence of a substring, example:

package main

import "strings"

func main() {
	str := "Hello World,你好世界"
	pos := strings.LastIndex(str, "好")
	pos2 := strings.LastIndex(str, "o")
	println(pos, pos2) // prints the byte position of each substring
}

Running results

15 7

As we can see, substring search returns byte positions, and each Chinese character occupies 3 bytes.
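When a character position is needed instead, one option is to count the runes that precede the byte offset. The sketch below assumes a helper named runeIndex, which is not part of the strings package.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// runeIndex is a hypothetical helper: like strings.Index, but it returns
// the position counted in characters. It returns -1 if substr is absent.
func runeIndex(s, substr string) int {
	pos := strings.Index(s, substr)
	if pos < 0 {
		return -1
	}
	return utf8.RuneCountInString(s[:pos])
}

func main() {
	s := "你好,世界"
	fmt.Println(strings.Index(s, "世界")) // 7 (byte offset)
	fmt.Println(runeIndex(s, "世界"))     // 3 (character offset)
}
```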

1.5.7 string splicing

As shown earlier, strings can be concatenated with +. This is simple and intuitive, but it performs poorly when done repeatedly, because every concatenation produces a brand-new string. Go also has a mechanism similar to StringBuilder (a mutable string) for efficient string concatenation, the bytes.Buffer type:

package main

import (
	"bytes"
	"fmt"
)

func main() {
	firstString := "hello"
	secondString := "world"

	//Declare byte buffer
	var stringBuilder bytes.Buffer

	//Write string to buffer
	stringBuilder.WriteString(firstString)
	stringBuilder.WriteString(secondString)

	//Output the buffer as a string
	fmt.Println(stringBuilder.String())
}

Operation output

helloworld
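Besides bytes.Buffer, the standard library also provides strings.Builder (available since Go 1.10), which is intended specifically for building strings. The following sketch shows the same idea with it; joinAll is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// joinAll is a hypothetical helper that concatenates parts with a
// strings.Builder, which appends into one growing buffer.
func joinAll(parts ...string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	fmt.Println(joinAll("hello", " ", "world")) // hello world
}
```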

1.6 character type

Each element of a string is called a "character", and can be obtained by traversing the string or by indexing a single element. "Character" is a general term for all kinds of letters and symbols, including the characters of national scripts, punctuation marks, graphic symbols, digits, and so on.

There are two character types in Go:

  • One is uint8, also called byte (byte is an alias of uint8). It represents a single byte of a UTF-8 string, and for ASCII characters its value is the ASCII code.
  • The other is rune, which represents a Unicode character. To keep the language simple, most Go APIs assume that strings are UTF-8 encoded. The rune type is needed when processing Chinese, Japanese or other multi-byte characters. The rune type is equivalent to int32. For rune-related operations, see the unicode package of the Go standard library. In addition, the unicode/utf8 package provides conversion between UTF-8 bytes and Unicode code points. Although Unicode characters are supported in the standard library, they are rarely used.

In the ASCII table, the value of A is 65, which is 41 in hexadecimal, so the following forms are equivalent:

package main

func main() {
	// In the ASCII table, the value of A is 65, which is 41 in hexadecimal, so the following forms are equivalent:
	var ch byte = 'A' // characters are enclosed in single quotation marks
	var ch2 byte = 65
	var ch3 byte = '\x41' // \x is always followed by exactly 2 hexadecimal digits
	var ch4 byte = '\101' // another form: a backslash followed by exactly 3 octal digits, for example '\377'
	println(ch, ch2, ch3, ch4)
}

In Go, Unicode characters are also represented as integers in memory. In documentation, the format U+hhhh is generally used, where h is a hexadecimal digit, for example:

package main

import "fmt"

func main() {
	// When writing a Unicode character, prefix the hexadecimal number with \u or \U.
	// Because a Unicode code point can need more than one byte, an integer type such as int (or rune) is used to hold it.
	// \u is always followed by exactly 4 hexadecimal digits (2 bytes), and \U by exactly 8 hexadecimal digits (4 bytes).
	var ch int = '\u0041'
	var ch2 int = '\u03B2'
	var ch3 int = '\U00101234'
	fmt.Printf("%d - %d - %d\n", ch, ch2, ch3) // integer
	fmt.Printf("%c - %c - %c\n", ch, ch2, ch3) // character
	fmt.Printf("%X - %X - %X\n", ch, ch2, ch3) // code point in hexadecimal
	fmt.Printf("%U - %U - %U", ch, ch2, ch3)   // Unicode code point in U+hhhh format
}

Results of operation:

65 - 946 - 1053236
A - β - 􁈴
41 - 3B2 - 101234
U+0041 - U+03B2 - U+101234

We often use fmt.Printf to output text to the terminal. The format specifiers used in this example have the following meanings:

  • %c: prints the character that the integer represents.
  • %d: prints the integer that represents the character; %v does the same.
  • %X: prints the integer in hexadecimal.
  • %U: prints a string in the format U+hhhh.
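To see the four verbs side by side for one character, fmt.Sprintf can format the same rune four times. This is a small sketch; the helper name describe is made up for the example.

```go
package main

import "fmt"

// describe is a hypothetical helper that formats one rune with the
// verbs %c, %d, %X and %U in that order.
func describe(r rune) string {
	return fmt.Sprintf("%c %d %X %U", r, r, r, r)
}

func main() {
	fmt.Println(describe('A')) // A 65 41 U+0041
	fmt.Println(describe('β')) // β 946 3B2 U+03B2
}
```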

The unicode package has built-in functions for testing characters. Each returns a Boolean value, as follows (where ch stands for a character):

package main

import "unicode"

func main() {
	// Test whether ch is a letter: unicode.IsLetter(ch)
	// Test whether ch is a digit: unicode.IsDigit(ch)
	// Test whether ch is a white-space character: unicode.IsSpace(ch)
	println(unicode.IsLetter(65))

}

Output:

true
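These test functions are typically applied to each rune of a string. The sketch below counts letters, digits and white-space characters; the helper classify and the sample text are assumptions for illustration.

```go
package main

import (
	"fmt"
	"unicode"
)

// classify is a hypothetical helper that counts the letters, digits
// and white-space characters in s. Other characters are ignored.
func classify(s string) (letters, digits, spaces int) {
	for _, r := range s {
		switch {
		case unicode.IsLetter(r):
			letters++
		case unicode.IsDigit(r):
			digits++
		case unicode.IsSpace(r):
			spaces++
		}
	}
	return
}

func main() {
	// G, o, 发 and 布 are letters; 1, 1 and 6 are digits; the dot is ignored.
	fmt.Println(classify("Go 1.16 发布")) // 4 3 2
}
```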

Now that we know what characters are in Go, note that discussions of characters often mention character sets. So what is a character set? As the name suggests, a character set is a collection of characters.

Common character sets include ASCII, GB2312, Big5, GB18030 and Unicode. For details: https://baike.baidu.com/item/字符集/946585?fr=aladdin

A character set assigns a unique ID to each character. Every character we use has a unique ID in the Unicode character set. For example, the A in the example above is encoded as 65 in both Unicode and ASCII.

In the broad sense, Unicode is a standard that defines both a character set and encoding rules. The commonly used UTF-8 is one of the encodings of Unicode. UTF stands for Unicode Transformation Format, meaning a way of converting Unicode into a particular format.

UTF-8 makes it convenient for different computers to transmit text in different languages and encodings over the network, so that multi-byte Unicode text can be transmitted correctly over existing systems built around single-byte processing.

UTF-8 stores Unicode characters in a variable number of bytes. ASCII letters still take 1 byte; accented letters and Greek or Cyrillic letters take 2 bytes; commonly used Chinese characters take 3 bytes; and supplementary plane characters take 4 bytes.
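The byte counts per character class can be checked with utf8.RuneLen. The following sketch picks one sample character per class; the helper byteLengths and the sample characters are assumptions chosen for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteLengths is a hypothetical helper that returns, for each character
// of s, the number of bytes its UTF-8 encoding occupies.
func byteLengths(s string) []int {
	var out []int
	for _, r := range s {
		out = append(out, utf8.RuneLen(r))
	}
	return out
}

func main() {
	// A (ASCII), U+00E9 (accented Latin letter), β (Greek),
	// 你 (Chinese), U+1F600 (a supplementary plane character).
	fmt.Println(byteLengths("A\u00e9β你\U0001F600")) // [1 2 2 3 4]
}
```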

In addition, there are arrays, slices, pointers, dictionaries (map), channels (chan), interfaces, errors and structures, which will be explained in detail in the following chapters.
