Code management tool: Git basics and common techniques


Preface: This article introduces the principles, usage and some practical techniques of Git. The goal is for readers to understand Git beyond simply running push and pull commands, knowing not only what it does but also why. This article does not go into deeper topics such as Git's internal implementation. After all, Git is a code management tool, and as users we mainly need to be proficient in using it. If you are interested in the internals, you can study them on your own.

The branch-related figures in this article are taken from learngitbranching, a website that teaches Git branching as a game, combining graphics and text. It is well made, and working through it completely is recommended; it should improve your understanding of Git.

Git origin

Git is a distributed version control system written by Linus Torvalds, the creator of Linux, in two weeks. Before that, the Linux community used BitKeeper as its version control system. However, some people in the community tried to reverse-engineer the BitKeeper protocol, which annoyed BitMover, the owner of BitKeeper, so BitMover decided to withdraw the Linux community's right to use it for free.

Against this background, Linus spent two weeks writing Git, and within a month it replaced BitKeeper as the version control tool for Linux. It has been continuously improved since then and eventually became the first choice for code version control.

Basic concepts

Three work areas:

  • Git directory (the repository): stores all versions of the project and the related metadata. It is where Git keeps its data.
  • Working directory: the set of files for one particular version of the project, checked out from the Git directory for the user to view and modify.
  • Staging area: records the list of changes that will be saved in the next commit.

Three file states:

  • committed: the file's data has been saved to the local repository.
  • modified: the file has been changed, but the change has not yet been committed.
  • staged: the modified file has been marked for inclusion. All staged changes are saved in the next commit.
    In addition, a newly created file is untracked and is not yet managed by Git. Add it to the staging area with git add, and its state becomes staged.
    As shown in the figure, Git is divided into a remote side and a local side. The remote server stores the repository data, while the local machine has all three work areas.
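The file states above can be read as a small state machine. The following Python sketch is only an illustration, not how Git is implemented: the names FileState, TRANSITIONS, next_state and replay are hypothetical, and the model deliberately ignores cases such as a file that is staged and then edited again.

```python
from enum import Enum


class FileState(Enum):
    UNTRACKED = "untracked"
    MODIFIED = "modified"
    STAGED = "staged"
    COMMITTED = "committed"


# (current state, action) -> next state.
# "edit", "add" and "commit" stand for editing the file, git add and git commit.
TRANSITIONS = {
    (FileState.UNTRACKED, "add"): FileState.STAGED,
    (FileState.MODIFIED, "add"): FileState.STAGED,
    (FileState.STAGED, "commit"): FileState.COMMITTED,
    (FileState.COMMITTED, "edit"): FileState.MODIFIED,
}


def next_state(state, action):
    """Return the state after the action; actions with no effect leave it unchanged."""
    return TRANSITIONS.get((state, action), state)


def replay(actions, start=FileState.UNTRACKED):
    """Apply a sequence of actions to a file that starts out untracked."""
    state = start
    for action in actions:
        state = next_state(state, action)
    return state


print(replay(["add", "commit"]).name)          # COMMITTED
print(replay(["add", "commit", "edit"]).name)  # MODIFIED
```

Note that committing an untracked file without adding it first leaves it untracked in this model, which matches the point that untracked files are outside Git's management until git add is run.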

Branch, HEAD and commit tree

General process for submitting code from local to remote:

  1. git add: save the changes to the staging area.
  2. git commit: record the staged changes as a new commit on the local branch, updating the local repository.
  3. git push: push the new commits in the local repository to the remote repository, which is updated accordingly.
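To make the three steps concrete, here is a toy Python model of the flow. It is a sketch under simplifying assumptions: ToyRepo and its attributes are hypothetical names, snapshots are plain dictionaries, and the remote is just a list that push overwrites.

```python
class ToyRepo:
    """Toy model: snapshots are plain dicts of path -> content."""

    def __init__(self):
        self.working = {}   # working directory: files as currently edited
        self.index = {}     # staging area: what the next commit will contain
        self.local = []     # local repository: list of committed snapshots
        self.remote = []    # remote repository: snapshots the remote has received

    def add(self, path):
        """git add: copy the current content of one file into the staging area."""
        self.index[path] = self.working[path]

    def commit(self):
        """git commit: store the staged snapshot locally; skip if nothing changed."""
        last = self.local[-1] if self.local else {}
        if self.index == last:
            return False
        self.local.append(dict(self.index))
        return True

    def push(self):
        """git push: bring the remote up to date with the local commits."""
        self.remote = list(self.local)


repo = ToyRepo()
repo.working["a.txt"] = "v1"
repo.add("a.txt")
repo.commit()
repo.working["a.txt"] = "v2"   # edited but not added
print(repo.commit())           # False: the staging area is unchanged
print(repo.remote)             # []: nothing has been pushed yet
repo.push()
print(repo.remote)             # [{'a.txt': 'v1'}]
```

The edit to "v2" never reaches the remote because it was never staged, which is the practical reason the order add, commit, push matters.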

As you can see, a commit is required for every code update. Each commit records a snapshot of the staged project files (provided something has changed). Git organizes these commit snapshots in a tree structure called the commit tree, which this article also refers to as the work tree (not to be confused with Git's own term "working tree", which means the working directory).

A Git branch is essentially just a movable pointer to a commit object. Branches are the core of Git: because of them, the commit history is a tree rather than a single line. Each branch can be seen as a fork in the commit tree. Work can proceed in parallel on different branches and be merged at the right time; that is what branches are for.

HEAD indicates the current commit position. Normally HEAD points to a branch, but you can also manually point HEAD at any commit in the commit tree (this is called a detached HEAD).
In the figure there are the commits C0 through C4 and three branches: main, bugfix and feature, which point to C1, C3 and C4 respectively. HEAD is detached and points to C2.
With these basic concepts in place, let's look at Git branches.
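The idea that branches and HEAD are just pointers can be sketched in a few lines of Python. This is a toy model, not Git's actual data structures: the class Refs is hypothetical, commit ids are plain strings, and the commit C5 is invented for the example. The starting state mirrors the figure described above.

```python
class Refs:
    """Toy model of branch pointers and HEAD; commit ids are plain strings."""

    def __init__(self, branches, head):
        self.branches = dict(branches)  # branch name -> commit id
        self.head = head                # a branch name, or a commit id when detached

    def is_detached(self):
        return self.head not in self.branches

    def head_commit(self):
        """Resolve HEAD to a commit id."""
        return self.head if self.is_detached() else self.branches[self.head]

    def checkout(self, target):
        """Point HEAD at a branch name or directly at a commit id."""
        self.head = target

    def commit(self, new_id):
        """Record a new commit: the current branch (or a detached HEAD) moves to it."""
        if self.is_detached():
            self.head = new_id
        else:
            self.branches[self.head] = new_id


refs = Refs({"main": "C1", "bugfix": "C3", "feature": "C4"}, head="C2")
print(refs.is_detached())        # True: HEAD points straight at C2
refs.checkout("main")
refs.commit("C5")                # C5 is a hypothetical new commit
print(refs.branches["main"])     # C5: only the checked-out branch moved
print(refs.branches["bugfix"])   # C3
```

Creating a branch in this model would be a single dictionary assignment, which is why branching in Git is cheap.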

Git branch

As mentioned earlier, branches exist to support parallel development, and each branch points to a specific commit. Projects that involve several collaborators cannot do without branch operations.
Generally, when a new project is created, the default branch is master (or main, depending on configuration). You can create further branches such as development and release as needed.

Here are some common branch-related Git commands:

  • git commit. Creates a new commit whose parent is the current commit; the current branch (or HEAD, if detached) then points to the new commit.
  • git branch. Used alone, it lists all branches. Followed by a name, as in git branch branchname, it creates a new branch called branchname at the current commit. git checkout -b branchname does the same, except that it also points HEAD at the newly created branch.
  • git checkout BRANCH/COMMIT. Switches to the given branch or commit. As mentioned earlier, switching directly to a commit puts HEAD in the detached state.
  • git merge BRANCH/COMMIT. Merges the specified commit into the current one and creates a new commit with two parents (unless the merge can be completed as a fast-forward, in which case the branch pointer simply moves).
  • git rebase BRANCH/COMMIT. Copies, in order, every commit that is reachable from the current branch but not from the target branch/commit onto the target, then moves the current branch/HEAD to the tip of the copied commits.
  • git reset COMMIT. Undoes commits by moving the current branch back to the specified commit; the effect on the branch pointer is the same as git branch -f CURRENT_BRANCH COMMIT. However, this change takes effect only locally and is not synchronized to the remote.
  • git revert COMMIT. To undo a commit in a way that can be synchronized to the remote, use git revert COMMIT. It undoes the specified commit by creating a new commit that reverses its changes.
  • git cherry-pick COMMIT_1 COMMIT_2 …. Copies the specified commits onto the current branch in order.
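The difference between git reset and git revert in the list above is easy to blur, so here is a minimal Python sketch of it. The model is an assumption made for illustration: a branch is a linear list of commits, each commit adds a number to a running total, and the Branch class and its methods are hypothetical names rather than anything Git provides.

```python
class Branch:
    """Toy linear history; each commit adds a number (delta) to a running total."""

    def __init__(self):
        self.commits = []  # list of (commit id, delta), oldest first

    def commit(self, commit_id, delta):
        self.commits.append((commit_id, delta))

    def ids(self):
        return [commit_id for commit_id, _ in self.commits]

    def total(self):
        """The 'content' of the branch: the sum of all deltas in its history."""
        return sum(delta for _, delta in self.commits)

    def reset(self, commit_id):
        """Like git reset: move back to commit_id, dropping later commits."""
        self.commits = self.commits[: self.ids().index(commit_id) + 1]

    def revert(self, commit_id):
        """Like git revert: append a new commit with the inverse change."""
        delta = dict(self.commits)[commit_id]
        self.commit(commit_id + "'", -delta)


a = Branch()
a.commit("C1", 1)
a.commit("C2", 10)
a.reset("C1")
print(a.ids(), a.total())   # ['C1'] 1

b = Branch()
b.commit("C1", 1)
b.commit("C2", 10)
b.revert("C2")
print(b.ids(), b.total())   # ['C1', 'C2', "C2'"] 1
```

Both branches end with the same content, but only the reverted one still contains C2 in its history. Because revert only adds a commit, a remote that already has C2 can accept it with an ordinary push.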

The difference and choice between git merge and git rebase

git rebase:

  • Advantages: the commit history is linear, clean and simple
  • Disadvantages: the commit history is rewritten

git merge:

  • Advantages: commits keep their original order and the history stays faithful to what happened
  • Disadvantages: the history looks complex when there are many branches

Which command to use is largely a matter of habit. If the commit history must accurately reflect the order in which things happened, use git merge; otherwise git rebase is fine.
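The trade-off can be seen on a small commit graph. The sketch below is a toy model in Python, not Git's algorithm: the graph is a dictionary from commit id to parent ids, the functions ancestors, merge and rebase are hypothetical helpers, rebase assumes the side branch is linear and shares history with the target, and the graph itself (C0 to C4) is invented for the example.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def merge(parents, current, other, new_id):
    """Create a merge commit with two parents and return it as the new tip."""
    parents[new_id] = [current, other]
    return new_id


def rebase(parents, current, target):
    """Copy commits that are on `current` but not on `target` onto `target`."""
    reachable = ancestors(parents, target)
    to_copy = []
    c = current
    while c not in reachable:
        to_copy.append(c)
        c = parents[c][0]  # follow the first parent (linear side branch assumed)
    tip = target
    for old in reversed(to_copy):  # oldest commit first
        new = old + "'"
        parents[new] = [tip]
        tip = new
    return tip


# main is at C2, feature is at C3; both forked from C1.
graph = {"C0": [], "C1": ["C0"], "C2": ["C1"], "C3": ["C1"]}

merged = dict(graph)
tip = merge(merged, "C3", "C2", "C4")
print(tip, merged[tip])     # C4 ['C3', 'C2']

rebased = dict(graph)
tip = rebase(rebased, "C3", "C2")
print(tip, rebased[tip])    # C3' ['C2']
```

After the merge, the tip has two parents and the original C3 is untouched. After the rebase, the tip has one parent, so the history is linear, but the commit is a copy (C3') with a different parent than the original, which is what "the history is rewritten" means.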

Interaction with the remote repository

Generally, development proceeds as follows: a repository is created on the remote, each developer clones it locally and creates their own branch to work on, and once the work is done the branch is pushed to the remote and then merged into the main branch.
Cloning a remote repository copies the repository data to the local machine and checks out a working directory.
Note that the local repository contains an origin/main branch. This kind of branch is called a remote branch (remote-tracking branch); it reflects the state of the remote repository as of your last communication with it. Remote branches are special: unlike ordinary branches, they cannot be moved directly with commands such as checkout and branch. They are updated only by synchronizing with the remote through commands such as pull, push and fetch.

Here are some common commands for interacting with remote repositories:

  • git clone REPOSITORY. Clones the remote repository to the local machine. You can choose which branch to check out with git clone -b branchname REPOSITORY.
  • git fetch. Downloads the latest state of the remote branches to the local repository. It only updates the local remote-tracking branches; it does not change HEAD or the local branches.
    Figure: the remote has updates.
    Figure: after using git fetch.
  • git pull. It can be regarded as shorthand for git fetch + git merge. A common situation: someone else has pushed new commits to the remote branch while you have also committed locally, so you need to pull the latest code first.
    The first step of git pull is to download the latest changes of the branch, that is, git fetch.
    Figure: git pull 1.
    The second step is to merge the current branch with the remote branch, that is, git merge origin/main (shown as o/main in the figures).
    Figure: git pull 2.
  • git push. Pushes the local changes on the current branch to the remote branch. git push synchronizes the changes to the corresponding branch on the remote and updates the local remote-tracking branch.
    Figure: git push 1.
    Push with git push:
    Figure: git push 2.
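The interplay of fetch, pull and push with the remote-tracking branch can be sketched with a toy Python model. Everything here is an assumption made for illustration: the Repo class and the functions fetch, merge, pull and push are hypothetical, only the main branch is modelled, and the commits C0 to C4 are invented.

```python
class Repo:
    """Toy repository: a commit graph plus named refs (branch pointers)."""

    def __init__(self, parents, refs):
        self.parents = dict(parents)  # commit id -> list of parent ids
        self.refs = dict(refs)        # ref name -> commit id


def ancestors(repo, commit):
    """All commits reachable from `commit` in this repository, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(repo.parents[c])
    return seen


def fetch(local, remote):
    """Download new commits; only the remote-tracking ref origin/main moves."""
    local.parents.update(remote.parents)
    local.refs["origin/main"] = remote.refs["main"]


def merge(local, merge_id):
    """Merge origin/main into main: no-op, fast-forward, or a two-parent commit."""
    ours, theirs = local.refs["main"], local.refs["origin/main"]
    if theirs in ancestors(local, ours):
        return                       # already up to date
    if ours in ancestors(local, theirs):
        local.refs["main"] = theirs  # fast-forward
    else:
        local.parents[merge_id] = [ours, theirs]
        local.refs["main"] = merge_id


def pull(local, remote, merge_id):
    """git pull modelled as fetch followed by merge."""
    fetch(local, remote)
    merge(local, merge_id)


def push(local, remote):
    """Upload main; rejected unless the remote tip is already in local history."""
    tip = local.refs["main"]
    history = ancestors(local, tip)
    if remote.refs["main"] not in history:
        raise RuntimeError("push rejected: pull the remote changes first")
    for c in history:
        remote.parents[c] = local.parents[c]
    remote.refs["main"] = tip
    local.refs["origin/main"] = tip


base = {"C0": [], "C1": ["C0"]}
remote = Repo(base, {"main": "C1"})
local = Repo(base, {"main": "C1", "origin/main": "C1"})
remote.parents["C2"] = ["C1"]
remote.refs["main"] = "C2"       # a teammate pushed C2
local.parents["C3"] = ["C1"]
local.refs["main"] = "C3"        # you committed C3 locally
pull(local, remote, "C4")
push(local, remote)
print(local.parents["C4"], remote.refs["main"])   # ['C3', 'C2'] C4
```

In this scenario a push before the pull would raise the rejection error, because the remote tip C2 is not yet part of the local history. That is the situation described in the git pull item above.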

Problems and solutions of some actual development scenarios

Premise: the team currently uses the development branch for the local test environment and the release branch as a backup of the deployed code. Deployment to production is done manually: after each piece of development is finished, the development branch is merged into the release branch.

Problem: after one round of development, the latest commit on development is C2, but C2 was never merged into the release branch. A new branch, NAS, was then created to develop the NAS-related feature (commit C3), and NAS has since been merged into development. Now the NAS-related feature has to be rolled back because of some problems. What should be done?

Solution: roll back with git revert C3.

After the rollback, a new branch is created to develop a new feature, C4, and then the NAS branch needs to be merged into development again. What should be done?

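Solution: git revert C3', where C3' is the commit created by the earlier git revert C3. Because C3 is already part of the history of development, merging NAS again would not bring its changes back; reverting the revert does.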
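Why a second merge does not restore the feature, and why reverting the revert does, can be illustrated with a toy Python model. This is a sketch under stated assumptions, not Git's merge logic: a history is a list of (commit id, operation, feature) entries, the helper functions features, revert and merge are hypothetical, and the feature names are invented.

```python
def features(history):
    """Replay a list of (commit id, op, feature) entries; return the active features."""
    active = set()
    for _, op, feature in history:
        if op == "add":
            active.add(feature)
        else:
            active.discard(feature)
    return active


def revert(history, commit_id):
    """Append a commit that applies the inverse of commit_id's change."""
    for cid, op, feature in history:
        if cid == commit_id:
            inverse = "remove" if op == "add" else "add"
            history.append((commit_id + "'", inverse, feature))
            return
    raise KeyError(commit_id)


def merge(history, branch):
    """Append only commits that are not already part of the history."""
    known = {cid for cid, _, _ in history}
    for entry in branch:
        if entry[0] not in known:
            history.append(entry)


development = [("C2", "add", "feature2")]     # hypothetical earlier work
nas = [("C3", "add", "nas")]

merge(development, nas)                       # NAS merged into development
revert(development, "C3")                     # rollback creates C3'
development.append(("C4", "add", "feature4"))
merge(development, nas)                       # merging NAS again adds nothing
print("nas" in features(development))         # False
revert(development, "C3'")                    # revert the revert
print("nas" in features(development))         # True
```

The second merge is a no-op in this model because C3 is already known to development, which matches the reasoning behind the solution above.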
