Write a Docker in Go in Seven Days (Day 3)

Time: 2021-10-23


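Having covered the principles behind Docker over the past two days, today we set up the project structure. The entry point is in package main (main.go, command.go, and run.go), the container startup code is in the container package (process.go and init.go), and the resource limits are in the cgroups package (manager.go) and its subsystem subpackage (subsystem.go and memory.go).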

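The overall call flow between the files is as follows: main.go registers runCommand and initCommand (command.go); runCommand calls Run (run.go); Run calls container.NewParentProcess (process.go), which re-executes the program with the init argument; initCommand then calls container.RunContainerInitProcess (init.go).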

The goal is to support the following command, which starts an isolated container and runs top as the first process inside it:

go-docker run -ti top

main.go

This is the entry point of the program, and its main job is to receive command-line parameters. The third-party package used for parsing command-line parameters is github.com/urfave/cli, and logging uses github.com/sirupsen/logrus.

package main

import (
    "github.com/sirupsen/logrus"
    "github.com/urfave/cli"
    "os"
)

const usage = `go-docker`

func main() {
    app := cli.NewApp()
    app.Name = "go-docker"
    app.Usage = usage

    app.Commands = []cli.Command{
        runCommand,
        initCommand,
    }
    app.Before = func(context *cli.Context) error {
        logrus.SetFormatter(&logrus.JSONFormatter{})
        logrus.SetOutput(os.Stdout)
        return nil
    }
    if err := app.Run(os.Args); err != nil {
        logrus.Fatal(err)
    }
}

The main thing to notice here is the Commands array, where we register two commands, runCommand and initCommand. Both are defined in command.go, so let's look at that file.

command.go

package main

import (
    "fmt"

    "github.com/sirupsen/logrus"
    "github.com/urfave/cli"

    "go-docker/cgroups/subsystem"
    "go-docker/container"
)

//Create a namespace isolated container process
//Start container
var runCommand = cli.Command{
    Name:  "run",
    Usage: "Create a container with namespace and cgroups limit",
    Flags: []cli.Flag{
        cli.BoolFlag{
            Name:  "ti",
            Usage: "enable tty",
        },
        cli.StringFlag{
            Name:  "m",
            Usage: "memory limit",
        },
        cli.StringFlag{
            Name:  "cpushare",
            Usage: "cpushare limit",
        },
        cli.StringFlag{
            Name:  "cpuset",
            Usage: "cpuset limit",
        },
    },
    Action: func(context *cli.Context) error {
        if len(context.Args()) < 1 {
            return fmt.Errorf("missing container args")
        }
        tty := context.Bool("ti")

        res := &subsystem.ResourceConfig{
            MemoryLimit: context.String("m"),
            CpuSet:      context.String("cpuset"),
            CpuShare:    context.String("cpushare"),
        }
        //cmdArray holds the first command to run once the container has started
        //cmdArray[0] is the command itself, followed by its arguments
        var cmdArray []string
        for _, arg := range context.Args() {
            cmdArray = append(cmdArray, arg)
        }
        Run(cmdArray, tty, res)
        return nil
    },
}

//Initialize the contents of the container, mount the proc file system, and run the user execution program
var initCommand = cli.Command{
    Name:  "init",
    Usage: "Init container process run user's process in container. Do not call it outside",
    Action: func(context *cli.Context) error {
        logrus.Infof("init come on")
        return container.RunContainerInitProcess()
    },
}

The run command starts a container and sets up isolation for its process. The init command is invoked from within run; we never call it ourselves from the command line. The run action calls Run(cmdArray, tty, res), passing along the arguments we received. tty indicates whether to run in the foreground, corresponding to docker's -ti flags. The Run function is defined in run.go.

run.go

package main

import (
    "os"
    "strings"

    "github.com/sirupsen/logrus"

    "go-docker/cgroups"
    "go-docker/cgroups/subsystem"
    "go-docker/container"
)

func Run(cmdArray []string, tty bool, res *subsystem.ResourceConfig) {
    parent, writePipe := container.NewParentProcess(tty)
    if parent == nil {
        logrus.Errorf("failed to new parent process")
        return
    }
    if err := parent.Start(); err != nil {
        logrus.Errorf("parent start failed, err: %v", err)
        return
    }
    //Create the cgroup manager for this container
    cgroupManager := cgroups.NewCGroupManager("go-docker")
    //Remove the cgroup when Run returns
    defer cgroupManager.Destroy()
    //Set resource limits
    cgroupManager.Set(res)
    //Add the container process to the cgroup of each subsystem
    cgroupManager.Apply(parent.Process.Pid)

    sendInitCommand(cmdArray, writePipe)
    parent.Wait()
}

func sendInitCommand(comArray []string, writePipe *os.File) {
    command := strings.Join(comArray, " ")
    logrus.Infof("command all is %s", command)
    _, _ = writePipe.WriteString(command)
    _ = writePipe.Close()
}

This file holds essentially everything needed to initialize the container: it starts the container process and then applies resource limits to it. The part to focus on is container.NewParentProcess(tty), which returns a command for a process isolated by namespaces. That function is in process.go.

process.go

package container

import (
    "os"
    "os/exec"
    "syscall"
)

//Create a command that isolates the namespace process
func NewParentProcess(tty bool) (*exec.Cmd, *os.File) {
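    readPipe, writePipe, err := os.Pipe()
    if err != nil {
        //Run treats a nil command as a failure to create the parent process
        return nil, nil
    }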
    //Call itself and pass in the init parameter, that is, execute initcommand
    cmd := exec.Command("/proc/self/exe", "init")
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS |
            syscall.CLONE_NEWNET | syscall.CLONE_NEWIPC,
    }
    if tty {
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
    }
    cmd.ExtraFiles = []*os.File{
        readPipe,
    }
    return cmd, writePipe
}
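The pipe handoff used here can be tried in isolation. The following is a minimal sketch, not the project's code: it assumes a Linux system where `cat` and `/dev/fd` are available, and it uses `cat /dev/fd/3` as a stand-in child instead of re-executing go-docker with the init argument. The function name `sendOverPipe` is made up for illustration. It shows that the first entry of `ExtraFiles` arrives in the child as file descriptor 3, and that closing the write end is what lets the child finish reading.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
)

// sendOverPipe starts a child that inherits the read end of a pipe as file
// descriptor 3, writes msg into the pipe, and returns what the child read.
// The child is `cat /dev/fd/3`, a stand-in for `go-docker init`.
func sendOverPipe(msg string) (string, error) {
	readPipe, writePipe, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer readPipe.Close()

	cmd := exec.Command("cat", "/dev/fd/3")
	// Entry i of ExtraFiles becomes fd 3+i in the child (0, 1, 2 are stdio).
	cmd.ExtraFiles = []*os.File{readPipe}
	var out bytes.Buffer
	cmd.Stdout = &out

	if err := cmd.Start(); err != nil {
		writePipe.Close()
		return "", err
	}
	// Send the command, then close the write end so the child sees EOF.
	_, writeErr := writePipe.WriteString(msg)
	closeErr := writePipe.Close()
	if err := cmd.Wait(); err != nil {
		return "", err
	}
	if writeErr != nil {
		return "", writeErr
	}
	if closeErr != nil {
		return "", closeErr
	}
	return out.String(), nil
}

func main() {
	got, err := sendOverPipe("top -b")
	if err != nil {
		fmt.Fprintln(os.Stderr, "handoff failed:", err)
		os.Exit(1)
	}
	fmt.Printf("child read %q from fd 3\n", got)
}
```

The same ordering appears in Run: the child is started first, and the command is written to the pipe afterwards, so the child blocks on the read until the parent closes its end.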

This function uses /proc/self/exe init to make the program invoke itself with the initCommand we defined, and it sets the namespace isolation flags for that process. So what does initCommand actually do? Its implementation is in init.go.

init.go

package container

import (
    "fmt"
    "io/ioutil"
    "os"
    "os/exec"
    "strings"
    "syscall"

    "github.com/sirupsen/logrus"
)

//RunContainerInitProcess is the first process executed inside the container.
//It mounts the proc file system so that system commands such as ps
//can inspect the processes in the current container.
func RunContainerInitProcess() error {
    cmdArray := readUserCommand()
    if cmdArray == nil || len(cmdArray) == 0 {
        return fmt.Errorf("get user command in run container")
    }
    //Mount
    err := setUpMount()
    if err != nil {
        logrus.Errorf("set up mount, err: %v", err)
        return err
    }

    //Find the absolute path of the command in the system environment path
    path, err := exec.LookPath(cmdArray[0])
    if err != nil {
        logrus.Errorf("look %s path, err: %v", cmdArray[0], err)
        return err
    }

    err = syscall.Exec(path, cmdArray[0:], os.Environ())
    if err != nil {
        return err
    }
    return nil
}

func readUserCommand() []string {
    //Refers to the file descriptor with index 3,
    //That is, the readpipe we passed in cmd.extrafiles
    pipe := os.NewFile(uintptr(3), "pipe")
    bs, err := ioutil.ReadAll(pipe)
    if err != nil {
        logrus.Errorf("read pipe, err: %v", err)
        return nil
    }
    msg := string(bs)
    return strings.Split(msg, " ")
}

func setUpMount() error {
    //Since systemd was introduced, mounts are shared by default on Linux, so we must
    //explicitly declare that this new mount namespace is private (independent).
    err := syscall.Mount("", "/", "", syscall.MS_PRIVATE|syscall.MS_REC, "")
    if err != nil {
        return err
    }
    //mount proc
    defaultMountFlags := syscall.MS_NOEXEC | syscall.MS_NOSUID | syscall.MS_NODEV
    err = syscall.Mount("proc", "/proc", "proc", uintptr(defaultMountFlags), "")
    if err != nil {
        logrus.Errorf("mount proc, err: %v", err)
        return err
    }

    return nil
}
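One detail of the command handoff is worth seeing concretely: sendInitCommand joins the arguments with a single space and readUserCommand splits on a single space. The sketch below only illustrates that behavior with the standard library. `splitLikeInit` mirrors the split used above, while `splitFields` is a hypothetical alternative based on `strings.Fields`, not something the project uses.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLikeInit mirrors readUserCommand: a plain split on single spaces.
func splitLikeInit(msg string) []string {
	return strings.Split(msg, " ")
}

// splitFields is a hypothetical alternative that collapses runs of
// whitespace and returns an empty slice for an empty message.
func splitFields(msg string) []string {
	return strings.Fields(msg)
}

func main() {
	// An argument that itself contains a space does not survive the round trip.
	msg := strings.Join([]string{"sh", "-c", "echo hi"}, " ")
	fmt.Printf("%q\n", splitLikeInit(msg))

	// A doubled space yields an empty argument with Split but not with Fields.
	fmt.Printf("%q\n", splitLikeInit("top  -b"))
	fmt.Printf("%q\n", splitFields("top  -b"))

	// An empty message still yields one (empty) element with Split, so a
	// length check alone does not detect it.
	fmt.Println(len(splitLikeInit("")), len(splitFields("")))
}
```

For a simple command such as top this makes no difference. With an empty message, the length check in RunContainerInitProcess does not fire, and the failure surfaces later from exec.LookPath instead.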

That looks like a lot of code, but it does not do much: it sets up the mount points and then runs the first command after the container starts, which is top. At this point the isolation of the container is in place. Now let's go back and look at how the resource limits work. All of the resource limit code lives in the cgroups folder.

subsystem.go

This file defines the resource limit interface. Apply adds a process to the cgroup by writing its PID to the tasks file, Set configures the limit for a resource, and Remove deletes the cgroup. They are all fairly simple, just creating and writing files, and easy to write once you understand the principle.

package subsystem

//Resource restriction configuration
type ResourceConfig struct {
    //Memory limit
    MemoryLimit string
    //CPU time slice weight (cpu.shares)
    CpuShare string
    //CPU cores the container may run on (cpuset.cpus)
    CpuSet string
}

/**
Abstract CGroup into path, because CGroup is the virtual path address in hierarchy
*/
type Subystem interface {
    //Returns the name of the subsystem, such as CPU and memory
    Name() string
    //Set the resource limit of CGroup in this subsystem
    Set(cgroupPath string, res *ResourceConfig) error
    //Remove this CGroup resource limit
    Remove(cgroupPath string) error
    //Add a process to CGroup
    Apply(cgroupPath string, pid int) error
}

var (
    Subsystems = []Subystem{
        &MemorySubSystem{},
        &CpuSubSystem{},
        &CpuSetSubSystem{},
    }
)

manager.go

Resource limit Manager

package cgroups

import (
    "github.com/sirupsen/logrus"
    "go-docker/cgroups/subsystem"
)

type CGroupManager struct {
    Path string
}

func NewCGroupManager(path string) *CGroupManager {
    return &CGroupManager{Path: path}
}

func (c *CGroupManager) Set(res *subsystem.ResourceConfig) {
    for _, subsystem := range subsystem.Subsystems {
        err := subsystem.Set(c.Path, res)
        if err != nil {
            logrus.Errorf("set %s err: %v", subsystem.Name(), err)
        }
    }
}

func (c *CGroupManager) Apply(pid int) {
    for _, subsystem := range subsystem.Subsystems {
        err := subsystem.Apply(c.Path, pid)
        if err != nil {
            logrus.Errorf("apply task, err: %v", err)
        }
    }
}

func (c *CGroupManager) Destroy() {
    for _, subsystem := range subsystem.Subsystems {
        err := subsystem.Remove(c.Path)
        if err != nil {
            logrus.Errorf("remove %s err: %v", subsystem.Name(), err)
        }
    }
}
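To see the order in which Run drives the manager, here is a small sketch with a fake subsystem that only records calls. Everything in it (`limiter`, `recorder`, `lifecycle`) is hypothetical and simplified for illustration; the real interface also receives a ResourceConfig and touches the cgroup file system.

```go
package main

import "fmt"

// limiter is a trimmed-down, hypothetical stand-in for the subsystem interface.
type limiter interface {
	Set(cgroupPath string) error
	Apply(cgroupPath string, pid int) error
	Remove(cgroupPath string) error
}

// recorder is a fake subsystem that only records which method was called.
type recorder struct {
	name  string
	calls *[]string
}

func (r recorder) log(method string) error {
	*r.calls = append(*r.calls, r.name+"."+method)
	return nil
}

func (r recorder) Set(string) error        { return r.log("Set") }
func (r recorder) Apply(string, int) error { return r.log("Apply") }
func (r recorder) Remove(string) error     { return r.log("Remove") }

// lifecycle mirrors the order used in Run: set limits, attach the process,
// wait for it, and remove the cgroup through a deferred call.
func lifecycle(subsystems []limiter, path string, pid int, wait func()) {
	defer func() {
		for _, s := range subsystems {
			_ = s.Remove(path)
		}
	}()
	for _, s := range subsystems {
		_ = s.Set(path)
	}
	for _, s := range subsystems {
		_ = s.Apply(path, pid)
	}
	wait()
}

func main() {
	var calls []string
	subsystems := []limiter{recorder{"memory", &calls}, recorder{"cpu", &calls}}
	lifecycle(subsystems, "go-docker", 1234, func() { calls = append(calls, "wait") })
	fmt.Println(calls)
}
```

Because the removal is deferred, it runs only after the wait returns, which is why the cgroup stays in place for the whole lifetime of the container process.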

Now let's see how a subsystem is implemented, taking the memory limit as the example. The other resource limits work the same way; only the name of the file being written changes.

memory.go

Memory limit instance

package subsystem

import (
    "io/ioutil"
    "os"
    "path"
    "strconv"

    "github.com/sirupsen/logrus"
)

type MemorySubSystem struct {
}

func (*MemorySubSystem) Name() string {
    return "memory"
}

func (m *MemorySubSystem) Set(cgroupPath string, res *ResourceConfig) error {
    subsystemCgroupPath, err := GetCgroupPath(m.Name(), cgroupPath, true)
    if err != nil {
        logrus.Errorf("get %s path, err: %v", cgroupPath, err)
        return err
    }
    if res.MemoryLimit != "" {
        //Set the cgroup memory limit by writing it to the
        //memory.limit_in_bytes file in the cgroup's directory
        err := ioutil.WriteFile(path.Join(subsystemCgroupPath, "memory.limit_in_bytes"), []byte(res.MemoryLimit), 0644)
        if err != nil {
            return err
        }
    }
    return nil
}

func (m *MemorySubSystem) Remove(cgroupPath string) error {
    subsystemCgroupPath, err := GetCgroupPath(m.Name(), cgroupPath, true)
    if err != nil {
        return err
    }
    return os.RemoveAll(subsystemCgroupPath)
}

func (m *MemorySubSystem) Apply(cgroupPath string, pid int) error {
    subsystemCgroupPath, err := GetCgroupPath(m.Name(), cgroupPath, true)
    if err != nil {
        return err
    }
    tasksPath := path.Join(subsystemCgroupPath, "tasks")
    err = ioutil.WriteFile(tasksPath, []byte(strconv.Itoa(pid)), 0644)
    if err != nil {
        logrus.Errorf("write pid to tasks, path: %s, pid: %d, err: %v", tasksPath, pid, err)
        return err
    }
    return nil
}


This work is licensed under a CC agreement; reprints must credit the author and link to this article.