I have been in contact with Git for so long, let’s talk about my understanding of Git and related things

Time: 2022-11-25

1. The birth of Git and related history

1.1 Definition of Git

Git is a distributed version control system, and is widely regarded as one of the most advanced version control systems available today.

think:

  1. What does version control mean?

    Wikipedia:

    Version control (English: Version control) is a standard practice for maintaining engineering blueprints, which can track the process of engineering blueprints from birth to finalization. In addition, version control is also a software engineering technique to ensure that the same program files edited by different people are synchronized during the software development process.

    Baidu Encyclopedia:

    Version control refers to the management of file changes such as various program codes, configuration files, and documentation during the software development process, and is one of the core ideas of software configuration management.

    My understanding: Version control is a system that records changes to the content of one or more files so that specific versions can be retrieved later.

    In layman’s terms, the content changes of one or more files are saved under specific version numbers, so that later readers can quickly and clearly see what changed (the content of the change, when it was made, who made it, and so on).

1.2. Creator of Git (Dad)

Name: Linus Torvalds, a Finnish-born software engineer. He worked at the Open Source Development Labs (OSDL: Open Source Development Labs, Inc.), which merged into the Linux Foundation in 2007.

Autobiography: “Just for Fun” (Chinese edition title: 《乐者为王》)

1.3. Relevant history
  • In 1991, Linus created Linux, the operating system that now dominates the server market. Because Linux grew so quickly, managing its source code became a big problem.
  • Before 2002, Linus merged the source code submitted by developers around the world by hand.
  • In 2002, a company called BitMover, out of humanitarian spirit, gave the Linux community free use of its commercial version control system, BitKeeper.
  • In 2005, Andrew Tridgell, the developer of Samba and one of the many talented people in the Linux community, tried to reverse-engineer the BitKeeper protocol. BitMover found out, was angry, and withdrew the Linux community’s right to use BitKeeper for free.
  • Linus then spent about 10 days developing Git in C, and the Linux source code has been managed with Git ever since.
  • In 2008, the GitHub website was launched, providing free Git hosting for open source projects.

2. Classification and characteristics of version control system

2.1 Version control system
  • Local version control system

    • The first generation of version control systems is known as local version control systems. They use locking to turn concurrent work into sequential work: only one person can work on a file at a time. The process is as follows. First, the file is placed on a server so that users can upload or download it. Second, anyone who wants to modify the file must lock it first with the checkout command, so that others cannot modify it. Finally, once the modification is complete, the lock is released and the checkin command stores a new version on the server. The first generation of version control systems mainly includes RCS (Revision Control System) and SCCS.
    • These systems save patch sets (the changes between file revisions) on the local hard disk. By applying all the patches, the content of every version of a file can be computed. Most of them use a simple database to record the differences between successive updates of a file.

  • Centralized version control system

    • The second generation of version control systems is called centralized version control systems (Centralized Version Control Systems, CVCS). They are more lenient about simultaneous modifications, with the obvious restriction that users must incorporate the current revision into their work before they are allowed to commit. The inconvenience is that a network connection is required. The central server is a single point of failure: if it goes down, nobody can commit updates or collaborate during the downtime, and the central server may even lose data.
    • In a centralized version control system, if the server hiccups, all developers can only stare blankly, because SVN’s management of a project depends on the central repository on the server, and every change must be committed to it. The second generation of version control systems mainly includes CVS, Subversion, SourceSafe, Team Foundation Server, and SVK.

  • Distributed version control system

    • The third generation of version control systems is called distributed version control systems (Distributed Version Control Systems, DVCS), which allow merging and committing to be separated. Every user’s computer holds a complete repository, so it can still be used without a network.
    • A distributed version control system can also have a server-side repository that synchronizes the private repositories of the individual developers, while each participant still has a complete repository locally. Even if the server crashes, we can keep using Git (managing our code in the local repository only) and synchronize with the server once the network is available again. The third generation of version control systems mainly includes Bazaar, Git, Mercurial, BitKeeper, and Monotone.

think:

  1. How to understand distributed and centralized? (Explanation given by teacher Liao Xuefeng)

    • Centralized: The repository is stored on a central server, while you work on your own computer. You must first fetch the latest version from the central server, then do your work, and then push your finished work back to the central server. The central server is like a library: to change a book, you must first borrow it from the library, take it home and change it, and then return it to the library.
    • Distributed: A distributed version control system has no “central server” at all. Everyone’s computer holds a complete repository, so you do not need a network connection while you work, because the repository is on your own computer.
2.2 Comparison between SVN and Git

Advantages and disadvantages of SVN:

  • [Pro] Centralized, easy to manage and secure
  • [Pro] Convenient management, clear logic, concepts in line with conventional thinking
  • [Pro] High code consistency, suitable for projects with a small number of developers
  • [Pro] Supports binary files, handles large files more easily, supports empty directories
  • [Pro] Allows a file to have any number of named properties, and handles all file types
  • [Con] Heavy load on the server, and the database grows rapidly
  • [Con] A connection to the server is required; without it, committing, comparing, restoring, and most other work is basically impossible
  • [Con] Not suitable for open source development

Pros and cons of Git:

  • [Pro] Suitable for distributed development and emphasizes the individual; the load on the public server and the amount of data it holds stay moderate.
  • [Pro] Fast, mature architecture, flexible development; conflicts between any two developers can be resolved easily.
  • [Pro] Works offline; the cost of managing code is low, and there is no dependence on a server.
  • [Pro] Easy to deploy: basically a single command is enough to get started.
  • [Pro] A good branching mechanism keeps the main code clean.
  • [Con] Does not match conventional thinking; there is little learning material, the learning cost is relatively high, the learning cycle is relatively long, and it demands more of the team.
  • [Con] Code confidentiality is poor: once a developer clones the entire repository, all code and version information is fully exposed.

    The main differences between SVN and Git:

  • SVN needs a dedicated server for storage, while everything in Git can be kept on an online hosting service, which saves cost, time, and effort.
  • Git is distributed, SVN is not.
  • Git stores content as metadata, while SVN stores it as files.
  • Branches work differently in Git and SVN.
  • Git has no global revision number, SVN does.
  • Git protects content integrity better than SVN.

Comparison of features between SVN and Git:

Characteristic | SVN | Git
Architecture | Centralized | Distributed
Safety | Poor; needs regular backups | High; each developer’s local computer holds a complete repository
Applicability | Document management | Code management
Ease of use | Easy to use and friendly to novices | Harder to learn and costly to learn, but highly efficient
Flexibility | Low; prone to a single point of failure; branches are pulled from the server | High; works standalone and locally, multiple backups, new branches are created locally
Access control | Strict permission management | No strict permission management yet, but accounts and roles can be separated
3. Code hosting platform
  • GitHub

    • What is GitHub?

      • The definition on Baidu Encyclopedia is: a hosting platform for open source and private software projects; it supports Git as its only repository format, hence the name GitHub.
      • GitHub is a cloud-based code hosting website that helps developers store and manage their project source code, and can track, record, and control changes users make to that code. It can even be used as a network drive to store just about anything.
    • What is the relationship between GitHub and Git?

      • GitHub is not the same as Git. The two are completely different things and should not be confused. Think of the relationship between Java and JavaScript, or between Zhou Jie and Jay Chou, and you may get the idea.
      • Git is just a command line tool, a distributed version control system. Behind the scenes it manages and tracks the historical versions of your code, like a time machine, so that you need not panic when the code goes wrong and can quickly roll back to an earlier version.
      • GitHub is a code hosting website that uses Git (not SVN) as its version management tool behind the scenes. Its main service is hosting your project code on a cloud server instead of only on your own local hard drive.
    • What can GitHub do?

      • Host code and manage the historical versions of projects
      • Find and view the introduction and source code of open source projects, etc.
      • Use GitHub Pages to build your own personal blog
      • Share technical experience, projects, etc., communicate online, and enhance your influence
  • GitLab

    • What is GitLab?

      • GitLab is an open source repository management system. It uses Git as its code management tool and builds a web service (online code repository management software) on top of it.
    • What is the difference between GitLab and GitHub?

      • Public projects stored on GitHub are open to the whole world. Private repositories used to require a paid plan, but have been free since 2019.
      • GitHub is an online code repository. There is only one GitHub in the world, and everyone stores their code on someone else’s servers.
      • GitLab is relatively private and is used as a code hosting system for companies, schools, or individuals.
      • GitLab is like a small GitHub of your own. You can set up a GitHub-like repository server locally and let your friends store their code on it, so that only the few of you can see the code; in a public repository on GitHub, the whole world can see it.
    • What can GitLab do for us?

      • Git repository management, code review, issue tracking, activity feeds, wikis, and more. What GitHub can do, GitLab can also do (99%).
  • Domestic code hosting platform

4. How Git works

We use Git to record every change to file content and every version update, and to compare the differences between versions clearly. With Git you can switch freely between the historical versions of a project, undo operations in the current project, create new branches, merge branches, and even link to repositories on a remote server. How is all this achieved? Knowing Git’s ideas and the basic principles behind these operations will give you some insight.

4.1 Git partitions
  • workspace: the working directory, where files are edited directly. The developer sees the actual project here and can work on the project files directly
  • index (stage): the staging area, where changes are held temporarily before they are committed
  • repository: the local repository, which stores the committed data
  • remote: the remote repository, which stores the data of the local repository on a remote server
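The way a change travels through these four areas can be sketched with a toy model. This is only an illustration built from plain Python dictionaries and lists; the helper names `add`, `commit`, and `push` mirror the Git commands but are hypothetical and say nothing about how Git is implemented.

```python
# Toy model of the four Git areas; every name here is illustrative.
workspace = {"a.txt": "v1"}   # files you edit directly
index = {}                    # staging area: content of the next commit
repository = []               # local repository: committed snapshots
remote = []                   # remote repository: commits that were pushed


def add(path):
    """Copy the current workspace content of one file into the staging area."""
    index[path] = workspace[path]


def commit(message):
    """Record the staged snapshot in the local repository."""
    repository.append({"message": message, "files": dict(index)})


def push():
    """Send the commits that the remote does not have yet."""
    remote.extend(repository[len(remote):])


add("a.txt")
commit("first commit")
workspace["a.txt"] = "v2"     # edited afterwards, but not staged
push()

print(index["a.txt"])         # the staged content is still the old one
print(len(remote))            # number of commits on the remote
```

The later edit to `a.txt` stays in the workspace only, which is why it reaches neither the commit nor the remote.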

4.2 Git database
  • Git is essentially a content-addressable file system: a file is located by the hash of its content, so files with the same content point to the same location in this file system and are not stored twice. The core of Git is a simple key-value database. You can insert any type of content into it, and it returns a key with which the content can be retrieved again at any time.
  • What Git saves is not the changes or differences of files, but a series of file snapshots. When committing, Git saves a commit object that contains a pointer to a snapshot of the staged content. In addition, the commit object contains the author’s name and email address, the message entered when committing, and a pointer to its parent commit.
  • Git uses Secure Hash Algorithm 1 (SHA-1) to compute a hash value (40 hexadecimal characters) for any content it stores. This hash is the object ID (for a commit object, the commit ID), and the data is accessed through it. The data lives in the objects directory: the first two characters of the SHA-1 hash are used as the name of a subdirectory, and the remaining 38 characters are used as the file name inside that subdirectory.
  • Git’s data storage principle

    • Git objects

      • Blob object: stores the content of a file
      • Tree object: stores a directory listing, a set of pointers to subtrees or blobs
      • Commit object: stores the author information, committer information, commit message, and a pointer to a top-level tree
      • Tag object: contains the tagger information, a date, a message, and a pointer
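The content addressing described above can be reproduced in a few lines. The sketch below assumes a repository that uses the default SHA-1 object format; the function names `git_blob_id` and `loose_object_path` are made up for this illustration. Git hashes a small header (`blob`, the content length, and a NUL byte) together with the content.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Split the 40-character ID into a 2-character directory and a 38-character file name."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


blob_id = git_blob_id(b"test content\n")
print(blob_id)
print(loose_object_path(blob_id))
```

Because the key depends only on the content, two files with identical content get the same ID and are stored once.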
  • Git’s low-level commands and high-level commands

    • There are more than 30 commonly used Git commands, which can be listed by running git help. Git has more than 130 commands in total, which can be listed with git help -a. These commands can be divided into high-level (porcelain) commands and low-level (plumbing) commands. The low-level commands are designed in Unix style and are not commonly used.

      • Put data into the Git database
      • Get data from the Git database
      • First commit: add the file testA.txt
      • Second commit: modify the content of the test.txt file
      • Third commit: add a new file testB.txt and a new directory lib, and add a file testC.txt in lib
      • Fourth commit: create a new branch branchB and make a commit on it
      • Fifth commit: merge a branch
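Putting data into the database and getting it back can be sketched without Git itself. The code below is a simplified model of loose object storage, roughly what `git hash-object -w` and `git cat-file -p` do for a blob: the header and content are hashed, compressed with zlib, and written under a path derived from the hash. The helper names and the scratch directory are assumptions for illustration; packfiles and other object formats are ignored.

```python
import hashlib
import os
import tempfile
import zlib


def write_object(objects_dir, content, kind="blob"):
    """Store content as a zlib-compressed loose object and return its key."""
    data = kind.encode("ascii") + b" " + str(len(content)).encode("ascii") + b"\0" + content
    object_id = hashlib.sha1(data).hexdigest()
    folder = os.path.join(objects_dir, object_id[:2])
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, object_id[2:]), "wb") as handle:
        handle.write(zlib.compress(data))
    return object_id


def read_object(objects_dir, object_id):
    """Look an object up by its key and return its type and content."""
    path = os.path.join(objects_dir, object_id[:2], object_id[2:])
    with open(path, "rb") as handle:
        data = zlib.decompress(handle.read())
    header, _, content = data.partition(b"\0")
    kind, _size = header.decode("ascii").split(" ")
    return kind, content


objects_dir = tempfile.mkdtemp()  # scratch directory, not a real repository
key = write_object(objects_dir, b"version 1\n")
print(key)
print(read_object(objects_dir, key))
```

Writing the same content a second time returns the same key and overwrites the same file, so nothing is stored twice.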
  • Git references (reference or refs)

    You can browse the complete commit history, but to traverse that history and find all related objects you need to remember the SHA-1 value of the last commit. It is easier to save that SHA-1 value in a file with a simple name and use the name as a pointer in place of the raw SHA-1 value. In Git these files are called “references” (or “refs” for short). When you run a command such as git branch (branchname), Git essentially runs the update-ref command to take the SHA-1 value of the latest commit on the branch you are on and store it in the new reference you want to create.

    • HEAD reference

      • Both branches and tags are pointers to commit objects. All local branches are stored in the .git/refs/heads directory, with one file per branch
      • The essence of a Git branch: a pointer, or reference, to the head of a series of commits
    • Remote reference

      • If you have added a remote repository and pushed to it, Git records the value of each branch at the time of the latest push and saves it under the refs/remotes directory
    • Tag reference
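That a branch is nothing more than a small file holding a commit ID can be shown with a sketch. The helpers `update_ref` and `read_ref` below are hypothetical stand-ins written for this example (the first is named after the Git command it imitates), the commit IDs are placeholders, and the scratch directory stands in for `.git`.

```python
import os
import tempfile


def update_ref(git_dir, ref_name, commit_id):
    """Point a reference such as refs/heads/master at a commit ID."""
    path = os.path.join(git_dir, *ref_name.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(commit_id + "\n")


def read_ref(git_dir, ref_name):
    """Return the commit ID stored in a reference file."""
    path = os.path.join(git_dir, *ref_name.split("/"))
    with open(path) as handle:
        return handle.read().strip()


git_dir = tempfile.mkdtemp()   # scratch directory standing in for .git
first_commit = "1" * 40        # placeholder IDs, not real commits
second_commit = "2" * 40

update_ref(git_dir, "refs/heads/master", first_commit)
# Creating a branch copies the current commit ID into a new ref file.
update_ref(git_dir, "refs/heads/feature", read_ref(git_dir, "refs/heads/master"))
# A new commit on master moves only the master ref.
update_ref(git_dir, "refs/heads/master", second_commit)

print(read_ref(git_dir, "refs/heads/master"))
print(read_ref(git_dir, "refs/heads/feature"))
```

This is why creating a branch is cheap: no files are copied, only one short reference is written.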

Git’s object model

5. The .git folder
  • Folders under .git

    • The hooks folder stores the project’s client-side or server-side hook scripts
    • The exclude file under the info folder contains the project’s global ignore patterns, complementing the .gitignore file

      • exclude file
    • The logs folder records every update to the references

      • refs
      • HEAD # last commit message
    • The objects folder stores all the contents of the Git database, that is, all Git objects

      • info records additional information about object storage
      • pack holds files that store many objects in compressed form (.pack), each with an accompanying index file (.idx) that allows random access to them
    • The refs folder stores the pointers to the commit objects of all branches: local branches, remote branches, tags, and so on

      • heads records the tip commit of each local branch

        • master holds the hash of the commit that the master branch currently points to in the local project
      • remotes records the tip commit of each branch copied from the remote repository (read-only)

        • origin

          • HEAD
          • master holds the hash of the commit that the master branch currently points to in the remote project
      • tags records any object name (not necessarily a commit object or a tag object pointing to a commit object)
  • Files under .git

    • The HEAD file points to the current branch and contains a reference to that branch. From this file Git finds the parent of the next commit; it can be understood as a pointer
    • The index file stores the content of the staging area
    • The config file contains the configuration of the project
    • description stores the description of the repository, mainly for Git hosting systems such as gitweb
    • packed-refs packs heads and tags into a single file for efficient repository access
    • FETCH_HEAD points to the tip of the branch that was last fetched from the remote repository
    • ORIG_HEAD records where HEAD pointed before a dangerous (drastic) operation such as a merge or a reset, so that we can roll back after a catastrophic mistake
    • COMMIT_EDITMSG saves the latest commit message. Git itself does not use this file; it is only there for the user’s reference
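How the HEAD, refs, and packed-refs files fit together can be sketched as a lookup. The function `resolve_head` below is a simplified assumption written for this example: it follows HEAD to a branch, reads the loose ref file if one exists, and otherwise falls back to packed-refs. Real Git handles more cases (nested symbolic refs, worktrees, other ref storage backends); the commit ID is a placeholder.

```python
import os
import tempfile


def resolve_head(git_dir):
    """Follow HEAD to a commit ID: loose ref first, packed-refs second."""
    with open(os.path.join(git_dir, "HEAD")) as handle:
        head = handle.read().strip()
    if not head.startswith("ref: "):
        return head  # detached HEAD: the file holds a commit ID directly
    ref_name = head[len("ref: "):]
    loose = os.path.join(git_dir, *ref_name.split("/"))
    if os.path.exists(loose):
        with open(loose) as handle:
            return handle.read().strip()
    packed = os.path.join(git_dir, "packed-refs")
    if os.path.exists(packed):
        with open(packed) as handle:
            for line in handle:
                line = line.strip()
                if not line or line[0] in "#^":
                    continue  # skip the header and peeled-tag lines
                commit_id, name = line.split(" ", 1)
                if name == ref_name:
                    return commit_id
    return None  # the branch named in HEAD has no commit yet


git_dir = tempfile.mkdtemp()  # scratch directory standing in for .git
with open(os.path.join(git_dir, "HEAD"), "w") as handle:
    handle.write("ref: refs/heads/master\n")
with open(os.path.join(git_dir, "packed-refs"), "w") as handle:
    handle.write("# pack-refs with: peeled fully-peeled sorted\n")
    handle.write("a" * 40 + " refs/heads/master\n")

print(resolve_head(git_dir))
```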

6. Common commands of Git
  • simple command

  • advanced command

    • HEAD

      • Always points to the latest commit of the current branch
      • git diff HEAD: show the difference between the workspace and the latest commit
    • commit

      • git commit --amend -m [message]: modify the last commit
    • branch

      • git branch --track [branch] [remote-branch]: create a new branch and set it up to track the specified remote branch
      • git branch --set-upstream-to=origin/[remote branch]: set the given remote branch as the upstream of the current branch
    • merge: merge the specified branch

      • git merge [branch]: merge the specified branch into the current branch

    • rebase: rebase the current branch onto the specified branch
    • reset: reset the current branch to a specified commit
    • revert: undo a specified commit by creating a new commit that reverses it
    • cherry-pick: apply the changes of a chosen commit to the current branch
    • reflog: view all movements of HEAD

think:

What is the difference between git merge and git rebase?

  • rebase replays the commits of the current branch on top of another branch, which produces a new commit history
  • rebase gives a more concise project history without merge commits. However, because it rewrites the commit history, a code problem introduced during the rebase is harder to track down
  • merge creates a new commit that ties together the histories of both branches
  • Each merge automatically generates a merge commit (unless it can fast-forward), so when merges are frequent the branch history looks messy
  • To get a clean, linear commit history without merge commits, choose git rebase
  • To keep the complete commit history and avoid the risk of rewriting it, choose git merge
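The difference in the resulting history can be illustrated with a toy commit graph. This is a simplified model, not Git’s algorithm: commits carry only a message and parent IDs, conflicts are ignored, and the functions `make_commit`, `merge`, `rebase`, and `history` are invented for this sketch. As in Git, a commit’s ID depends on its parents, which is why replayed commits get new IDs.

```python
import hashlib


def make_commit(store, message, parents):
    """Create a commit whose ID depends on its message and its parents."""
    raw = "|".join(parents) + ":" + message
    commit_id = hashlib.sha1(raw.encode("utf-8")).hexdigest()
    store[commit_id] = {"message": message, "parents": list(parents)}
    return commit_id


def merge(store, ours, theirs):
    """Merge: add one commit with two parents; existing commits stay untouched."""
    return make_commit(store, "merge", [ours, theirs])


def rebase(store, branch_tip, base, onto):
    """Rebase: replay the commits after `base` on top of `onto` as new commits."""
    pending = []
    current = branch_tip
    while current != base:
        pending.append(store[current]["message"])
        current = store[current]["parents"][0]
    new_tip = onto
    for message in reversed(pending):
        new_tip = make_commit(store, message, [new_tip])
    return new_tip


def history(store, tip):
    """List commit messages from `tip` backwards, following first parents."""
    messages = []
    while tip is not None:
        messages.append(store[tip]["message"])
        parents = store[tip]["parents"]
        tip = parents[0] if parents else None
    return messages


store = {}
base = make_commit(store, "base", [])
main_tip = make_commit(store, "main work", [base])
feature_1 = make_commit(store, "feature 1", [base])
feature_2 = make_commit(store, "feature 2", [feature_1])

merged = merge(store, main_tip, feature_2)
rebased = rebase(store, feature_2, base, main_tip)

print(store[merged]["parents"] == [main_tip, feature_2])
print(history(store, rebased))
print(rebased != feature_2)
```

The merge keeps both lines of history and adds a two-parent commit; the rebase yields one straight line, at the price of new commit IDs for the replayed work.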

What is the difference between git reset and git revert?

  • git revert generates a new commit that undoes an earlier commit. The commits before this commit are preserved, which means the version history of the project keeps moving forward.
  • git reset moves the branch back to a given commit, like travelling back in time.
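A toy model makes the contrast concrete. Here a history is a list of commits that each add a number to a running total; `reset_to`, `revert_commit`, and `total` are hypothetical helpers for this sketch and ignore everything else Git tracks.

```python
def reset_to(history, keep):
    """Reset: move the branch back so that only the first `keep` commits remain."""
    return history[:keep]


def revert_commit(history, position):
    """Revert: append a new commit that applies the inverse of an earlier one."""
    target = history[position]
    undo = {"message": "Revert " + target["message"], "delta": -target["delta"]}
    return history + [undo]


def total(history):
    """Project state: the sum of all deltas in the history."""
    return sum(commit["delta"] for commit in history)


history = [
    {"message": "add 1", "delta": 1},
    {"message": "add 10", "delta": 10},
    {"message": "add 100", "delta": 100},
]

after_reset = reset_to(history, 2)        # drops "add 100" from the history
after_revert = revert_commit(history, 1)  # cancels "add 10" with a new commit

print(len(after_reset), total(after_reset))
print(len(after_revert), total(after_revert))
```

Reset shortens the history, while revert lengthens it and can cancel a commit in the middle without touching the ones after it.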
7. Git Flow
  • Introduction to the branches used by the gitflow workflow convention

    • The master branch is the core branch of the project and the branch that is finally released to the outside world. There is only one, and it is stable. It is read-only: code cannot be modified directly on this branch
    • The develop branch is the main development branch of the project, and there is only one. It is read-only: code cannot be modified directly on this branch. To develop a new feature, a new branch is created from it. The develop branch should contain the full history of the project.
    • A feature branch is the development branch for a specific requirement. There can be several, created from the develop branch or from other feature branches. Programmers divide up the work and cooperate through feature branches, so these are the branches that the programmers writing the code touch most. After the requirement has been implemented, the feature branch should be merged back into the develop branch.
    • The release branch is a pre-release branch, usually called the test branch, mainly used for testing and bug fixing at the end of a development phase. When feature branches are finished they are merged back into the develop branch, and then the release branch is created from the develop branch for testing. Once tested and fixed, the release branch should be merged back into the develop branch and the master branch, and marked with an appropriate tag (including the necessary release notes).
    • The hotfix branch is the emergency production fix branch: when a major bug appears on the released master branch and affects production use, a hotfix branch is created from the master branch for an emergency fix. The fixed hotfix branch should be merged back into the master branch and the develop branch.

  • GitFlow workflow

First, initialize the central repository: upload the skeleton of the new project, or the code of an existing project that is being converted to gitflow, to the central Git repository. The project leader clones the central repository, which creates a local master branch, then creates the develop branch from it (step ①) and pushes it to the server. In practice, usually only the project leader in a development team has permission to work on the master branch, create the develop branch, and merge the develop branch into the master branch.

Then the other members of the development team clone the develop branch of the central repository to their local machines, so that all members share one unified develop branch. After that, each member creates a feature branch from the develop branch (step ②) and develops independently, according to the requirements and the division of labor. If several people work on the same branch together, the new branch should be pushed to the server promptly so that the members can share it.

After finishing their feature, each member commits the completed code to the feature branch and then merges it into the develop branch (step ③). After the merge, the feature branch no longer needs to be kept.

When enough stable features have accumulated, or the agreed testing cycle has been reached, the project leader creates the release branch from the develop branch (step ④), packages the corresponding version, and hands it to the testers for deployment and testing. All bugs reported during testing are fixed on this release branch.

After testing is over and the bugs are fixed, the release branch is merged back into the develop branch and the master branch (step ⑤). After the merge, the release branch no longer needs to be kept. The project leader should push the merged master branch to the central repository promptly (step ⑥), and all members should synchronize their own develop branches promptly.

When it is time to go live, the application version is packaged and deployed directly from the master branch. When a major bug occurs in the production version, the project leader creates a hotfix branch from the master branch (step ⑦) for an emergency fix.

Finally, the fixed hotfix branch is merged back into the develop branch and the master branch (step ⑧) and pushed to the central repository (step ⑨).
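The branching rules of this workflow can be written down as a small table and turned into the commands a developer would run. This is a sketch under assumptions: the `feature/`, `release/`, and `hotfix/` name prefixes and the use of `--no-ff` are common conventions rather than something the workflow above prescribes, feature branches are always taken from develop for simplicity, and the function `plan` only builds command strings without executing anything.

```python
# Where each kind of branch starts and where it is merged when finished.
GITFLOW_RULES = {
    "feature": {"from": "develop", "merge_into": ["develop"]},
    "release": {"from": "develop", "merge_into": ["develop", "master"]},
    "hotfix": {"from": "master", "merge_into": ["master", "develop"]},
}


def plan(branch_name):
    """Return the git commands that start and later finish a gitflow branch."""
    kind = branch_name.split("/", 1)[0]
    rule = GITFLOW_RULES[kind]
    commands = [f"git checkout -b {branch_name} {rule['from']}"]
    # ... development and commits happen on the branch here ...
    for target in rule["merge_into"]:
        commands.append(f"git checkout {target}")
        commands.append(f"git merge --no-ff {branch_name}")
    commands.append(f"git branch -d {branch_name}")
    return commands


for command in plan("hotfix/fix-login"):
    print(command)
```

Keeping the rules in one table makes the asymmetry visible: only release and hotfix branches ever reach master.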

8. Practical operation
  • Get the Git repository

    • git init: initialize a repository in an existing directory
    • git clone: clone a repository from a remote server
  • Merge code from other branches

    • git merge
    • git rebase
  • Undo a merge/rebase

    • git reset --hard [commit]: reset the workspace and the staging area to the specified commit
    • git reset --merge [commit] or git reset --merge HEAD^: go back to the commit before the merge
  • Modify a commit

    • git commit --amend -m 'comment': replace the last commit
    • git rebase -i HEAD~[number n]: open the editor (vim here) with a list of the latest n commits
  • Undo a commit

    • git revert HEAD: generate a new commit that cancels out the previous commit
    • git revert commit_id: undo a commit in the middle of the history
    • git revert -m [parent-number] commit_id: undo a merge commit that brought in commits from another branch
    • git revert --no-commit commit1..commit5: undo the consecutive commits between commit1 (not included) and commit5

      Parameters of git revert:

      --no-edit: do not open the default editor; use the commit message generated by Git directly.

      --no-commit: only apply the reverting changes to the staging area and the workspace, without creating a new commit.

  • Discard a commit

    • git reset [last good SHA]: discard all commits after a given commit, so that they disappear from the commit history
    • git reset --hard [last good SHA]: the --hard parameter also returns the files in the workspace to that earlier state
  • View commit history

    • git log
  • View file changes

    • git diff: compare the workspace with the staging area, showing the file changes that have not yet been added to the staging area
    • git diff --staged (or --cached): view the changes of all files that have been added to the staging area
    • git status: display the status of the working directory and the staging area
  • Stash file changes

    • git stash: temporarily save changes to workspace files so they can be restored later
  • undo file changes

    • git checkout -- [filename]
  • Undo the files in the staging area

    • git rm --cached [filename]
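The difference between a plain git reset and git reset --hard in the list above comes down to which areas are rewound. The sketch below models this with dictionaries; the function `reset` and the commit names `c1` and `c2` are illustrative only. It also includes the `--soft` mode, which this article does not cover, for comparison.

```python
def reset(state, commits, target, mode="mixed"):
    """Return the state after resetting to `target` in the given mode."""
    if mode not in ("soft", "mixed", "hard"):
        raise ValueError("unknown mode: " + mode)
    new_state = dict(state, head=target)          # every mode moves the branch
    if mode in ("mixed", "hard"):
        new_state["index"] = dict(commits[target])      # staging area rewound
    if mode == "hard":
        new_state["workspace"] = dict(commits[target])  # files on disk rewound
    return new_state


commits = {"c1": {"a.txt": "v1"}, "c2": {"a.txt": "v2"}}
state = {"head": "c2", "index": {"a.txt": "v2"}, "workspace": {"a.txt": "v2"}}

mixed = reset(state, commits, "c1")               # like: git reset c1
hard = reset(state, commits, "c1", mode="hard")   # like: git reset --hard c1

print(mixed["head"], mixed["index"], mixed["workspace"])
print(hard["head"], hard["index"], hard["workspace"])
```

After the default (mixed) reset the edited files are still in the workspace and can be committed again; after a hard reset they are gone from the workspace as well.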

Git in-depth learning link:
