Introduction and deployment of distributed storage glusterfs

Time: 2021-04-16


1. Overview of glusterfs

Glusterfs is a scalable network file system. Compared with other distributed file systems, glusterfs offers high scalability, high availability, high performance, and horizontal scale-out. Moreover, it has no metadata server in its design, so the whole service has no single point of failure. When a client accesses glusterfs storage, the application first reads and writes data through the mount point. To users and applications, the cluster file system is transparent: they cannot tell whether the file system is local or remote. Read and write operations are handled by VFS (the virtual file system), which passes the request to the fuse kernel module; fuse then hands the data to the glusterfs client through the device /dev/fuse. Finally, the glusterfs client processes the request and sends the request or data to the glusterfs server over the network.
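
As a minimal illustration of this path (the server name, volume name, and mount point below are placeholders), a volume is mounted on a client with the native FUSE client, and the kernel then shows it as a fuse.glusterfs file system:

# mount -t glusterfs gluster-server:/volume_name /mnt/gluster
# mount | grep fuse.glusterfs    # the I/O path goes through the fuse kernel module and /dev/fuse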

2. A brief introduction to the common glusterfs volume types

A distributed volume is also called a hash volume. Files are distributed across multiple bricks by hash, with each file stored whole on a single brick.

  • Application scenario: a large number of small files
  • Advantages: good read / write performance
  • Disadvantage: if a disk or server fails, the data on that brick is lost
  • If no volume type is specified, a distributed volume is created by default
  • There is no limit to the number of bricks

To create a distributed volume:

gluster volume create volume_name node1:/data/br1 node2:/data/br1

A replicated volume keeps a copy of each file on multiple bricks. The number of bricks must equal the number of replicas, and it is recommended that the bricks be placed on different servers.

  • Application scenarios: high reliability and high performance scenarios
  • Advantages: good read performance and high data reliability
  • Disadvantages: poor write performance
  • The number of bricks must equal replica

To create a replicated volume:

gluster volume create volume_name replica 2 node1:/data/br1 node2:/data/br1

Replica: the number of copies of each file to keep.

A striped volume splits files into stripes and stores them across multiple bricks. The default stripe size is 128 KB.

  • Application scenario: large file
  • Advantages: suitable for large file storage
  • Disadvantages: low reliability, brick failure will lead to the loss of all data
  • The number of bricks must equal stripe
  • Stripe: the number of stripes

To create a striped volume:

gluster volume create volume_name stripe 2 node1:/data/br1 node2:/data/br1

A distributed striped volume distributes files across multiple nodes, and each file is striped across multiple bricks.

  • Application scenario: large number of large files with high read / write performance
  • Advantages: high concurrency support
  • Disadvantages: no redundancy, poor reliability
  • The number of bricks must be a multiple of stripe

To create a distributed striped volume:

gluster volume create volume_name stripe 2 node1:/data/br1 node2:/data/br1 node3:/data/br1 node4:/data/br1

A distributed replicated volume distributes files across multiple nodes by hash and keeps a copy of each file on multiple bricks.

  • Application scenario: a large number of file reading and high reliability scenarios
  • Advantages: high reliability, high read performance
  • Disadvantages: sacrificing storage space and poor write performance
  • The number of bricks must be a multiple of replica

To create a distributed replicated volume:

gluster volume create volume_name replica 2 node1:/data/br1 node2:/data/br1 node3:/data/br1 node4:/data/br1


A striped replicated volume stripes large files and keeps multiple copies of each stripe.

  • Application scenario: large files, and high reliability requirements
  • Advantages: large file storage, high reliability
  • Disadvantages: poor write performance at the expense of space
  • The number of bricks must equal stripe × replica

To create a striped replicated volume:

gluster volume create volume_name stripe 2 replica 2 node1:/data/br1 node2:/data/br1 node3:/data/br1 node4:/data/br1

3. Glusterfs environment


The log storage cluster uses a distributed replicated volume, which distributes files across multiple nodes by hash and keeps a copy of each file on multiple bricks. There are five servers with 90 TB of raw disk space in total, so only 45 TB is usable in this distributed replicated mode. In addition, a distributed replicated volume needs an even number of bricks, so each server is used to create two bricks. For example, 10.102.23.4:/data_01/node and 10.102.23.44:/data_01/node form a replica pair, and the other nodes are paired in the same way. 10.102.23.44 is the management node of the log storage cluster; the NFS-Ganesha service only needs to be installed on this node, and clients can then mount the volume over NFS.
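
One detail worth making explicit: with replica 2, every two consecutive bricks listed in the create command form one replica pair, which is why the brick list in the volume-creation step below alternates between servers (for example, 10.102.23.4:/data_01/node immediately followed by 10.102.23.44:/data_01/node). After the volume is created, the pairing can be read off the brick order in the volume information:

# gluster volume info data-volume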

# sed -i 's#SELINUX=enforcing#SELINUX=disabled#' /etc/sysconfig/selinux    # disable SELinux
# iptables -F    # clear firewall rules
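
If the firewall has to remain enabled instead of being flushed, a hedged alternative (assuming the default glusterfs ports: 24007-24008 for management and 49152 onward for the brick processes; widen the range to match the number of bricks per server) is to open those ports with firewalld:

# firewall-cmd --permanent --add-port=24007-24008/tcp --add-port=49152-49251/tcp
# firewall-cmd --reload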

Install glusterfs (nodes 01-05):

# yum install userspace-rcu-*
# yum install python2-gluster-3.13.2-2.el7.x86_64.rpm
# yum install tcmu-runner-* libtcmu-*
# yum install gluster*
# yum install nfs-ganesha-*
# The nfs-ganesha packages only need to be installed on the management node (10.102.23.44)
# systemctl start glusterd.service    # start glusterd on all servers
# systemctl start rpcbind
# systemctl enable glusterd.service
# systemctl enable rpcbind
# ss -lnt    # check that port 24007 is listening; if so, glusterd is running normally

Create a cluster (perform the following operation on node 10.102.23.44 to add nodes to the cluster)

[[email protected] ~]# gluster peer probe 10.102.23.44
peer probe: success.
[[email protected] ~]# gluster peer probe 10.102.23.45
peer probe: success.
[[email protected] ~]# gluster peer probe 10.102.23.46
peer probe: success.
[[email protected] ~]# gluster peer probe 10.102.23.47
peer probe: success.
[[email protected] ~]# gluster peer probe 10.102.23.4
peer probe: success.

Check the result of adding the nodes to the trusted storage pool:

[[email protected] ~]# gluster peer status
Number of Peers: 4
Hostname: 10.102.23.46
Uuid: 31b5ecd4-c49c-4fa7-8757-c01604ffcc7e
State: Peer in Cluster (Connected)
Hostname: 10.102.23.47
Uuid: 38a7fda9-ad4a-441a-b28f-a396b09606af
State: Peer in Cluster (Connected)
Hostname: 10.102.23.45
Uuid: 9e3cfb56-1ed4-4daf-9d20-ad4bf2cefb37
State: Peer in Cluster (Connected)
Hostname: 10.102.23.4
Uuid: 1836ae9a-eca5-444f-bb9c-20f032247bcb
State: Peer in Cluster (Connected)

Perform the following disk operations on all nodes:

[[email protected] ~]# fdisk /dev/sdb
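
fdisk is interactive; as a hedged, scriptable alternative that creates the same single full-disk partition on every data disk (/dev/sdb through /dev/sdk, matching the volume group commands below), parted can be used non-interactively:

# for disk in /dev/sd{b..k}; do parted -s "$disk" mklabel gpt mkpart primary 0% 100%; done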

To create a volume group:

[[email protected] ~]# vgcreate vg_data01 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
[[email protected] ~]# vgcreate vg_data02 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1

To view volume groups:

[[email protected] ~]# vgdisplay

To create a logical volume:

[[email protected] ~]# lvcreate -n lv_data01 -L 9TB vg_data01
[[email protected] ~]# lvcreate -n lv_data02 -L 9TB vg_data02
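
If the volume group turns out to have slightly less than 9 TB of free space, the fixed -L size will fail; a hedged alternative is to allocate by extents and take whatever space the volume group actually provides:

# lvcreate -n lv_data01 -l 100%FREE vg_data01
# lvcreate -n lv_data02 -l 100%FREE vg_data02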

To view logical volumes:

[[email protected] ~]# lvdisplay

Format logical volume:

[[email protected] ~]# mkfs.xfs /dev/vg_data01/lv_data01
[[email protected] ~]# mkfs.xfs /dev/vg_data02/lv_data02

Mount logical volume:

[[email protected] ~]# mkdir -p /data_01/node /data_02/node
[[email protected] ~]# vim /etc/fstab
/dev/vg_data01/lv_data01 /data_01 xfs defaults 0 0
/dev/vg_data02/lv_data02 /data_02 xfs defaults 0 0
[[email protected] ~]# mount /data_01
[[email protected] ~]# mount /data_02
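
A quick sanity check that both brick file systems are mounted as XFS on every node:

# df -hT | grep data_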

Distributed replicated mode (a composite type) requires at least four bricks, preferably spread across at least four servers.

Create volume:

[[email protected] ~]# gluster volume create data-volume replica 2   10.102.23.4:/data_01/node  10.102.23.44:/data_01/node  10.102.23.44:/data_02/node 10.102.23.45:/data_02/node  10.102.23.45:/data_01/node  10.102.23.4:/data_02/node 10.102.23.46:/data_01/node  10.102.23.47:/data_01/node  10.102.23.46:/data_02/node  10.102.23.47:/data_02/node force

Start the created volume:

[[email protected] ~]# gluster volume start data-volume
volume start: data-volume: success

The volume information can be viewed on any node:

[[email protected] ~]# gluster volume info

To view the status of the distributed volume:

[[email protected] ~]# gluster volume status

With the steps above, the deployment of the glusterfs distributed replicated volume is complete.

4. Setting up the NFS-Ganesha environment

The glusterfs service itself also supports NFS mounts. However, the existing production environment has multiple network segments, some of which cannot reach the glusterfs storage network, so NFS mounts have to be proxied through nginx. Glusterfs only supports NFSv3 mounts, which are inconvenient to proxy with nginx because they use many ports, so glusterfs and NFS-Ganesha are a perfect combination. NFS-Ganesha abstracts the back-end storage into a unified API through FSAL (the file system abstraction layer), exposes it through the Ganesha server, and the client then mounts it over the NFS protocol and operates on the mount point. NFS-Ganesha can also serve a specific NFS version.

Install NFS-Ganesha on the management node 10.102.23.44. The packages were already installed at the beginning of the glusterfs deployment, so that step is not repeated here; only the configuration file is described:

[[email protected] ~]# vim /etc/ganesha/ganesha.conf
.....................................
EXPORT
{
## Export Id (mandatory, each EXPORT must have a unique Export_Id)
#Export_Id = 12345;
Export_Id = 10;
## Exported path (mandatory)
#Path = /nonexistant;
Path = /data01;
## Pseudo Path (required for NFSv4 or if mount_path_pseudo = true)
#Pseudo = /nonexistant;
Pseudo = /data01;    # the root directory that clients mount over NFS
## Restrict the protocols that may use this export. This cannot allow
## access that is denied in NFS_CORE_PARAM.
#Protocols = 3,4;
Protocols = 4; # version of client NFS mount
## Access type for clients. Default is None, so some access must be
## given. It can be here, in the EXPORT_DEFAULTS, or in a CLIENT block
#Access_Type = RW;
Access_Type = RW;    # access permissions
## Whether to squash various users.
#Squash = root_squash;
Squash = No_root_squash;    # do not squash root
## Allowed security types for this export
#Sectype = sys,krb5,krb5i,krb5p;
Sectype = sys;    # security type
## Exporting FSAL
#FSAL {
#Name = VFS;
#}
FSAL {
Name = GLUSTER;
Hostname = "10.102.23.44";    # glusterfs management node IP
Volume = "data-volume";    # glusterfs volume name
}
}
...................
[[email protected] ~]# systemctl restart nfs-ganesha
[[email protected] ~]# systemctl enable nfs-ganesha
[[email protected] ~]# showmount -e 10.102.23.44
Export list for 10.102.23.44:

NFS-Ganesha has been set up successfully.
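
As an additional hedged check, confirm that the Ganesha service is active and listening on the NFS port (2049):

# systemctl status nfs-ganesha
# ss -lnt | grep 2049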

5. Client mount

Mount in glusterfs mode:

[[email protected] ~]# mkdir /logs
[[email protected] ~]# mount -t glusterfs 10.102.23.44:data-volume /logs/
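
To make the glusterfs mount persistent across reboots, a typical /etc/fstab entry would look like the following (a sketch; _netdev delays the mount until the network is up):

10.102.23.44:data-volume /logs glusterfs defaults,_netdev 0 0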

Mount in NFS mode:

On the client (in the 10.1.99 network segment):

[[email protected] ~]# yum -y install nfs-utils rpcbind
[[email protected] ~]# systemctl start rpcbind
[[email protected] ~]# systemctl enable rpcbind
[[email protected] ~]# mkdir /home/dwweiyinwen/logs/
[[email protected] ~]# mount -t nfs -o vers=4,proto=tcp,port=2049 10.102.23.44:/data01 /home/dwweiyinwen/logs/
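
Likewise, a hedged example of a persistent NFS entry in the client's /etc/fstab, mirroring the mount options used above:

10.102.23.44:/data01 /home/dwweiyinwen/logs nfs vers=4,proto=tcp,port=2049,_netdev 0 0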


Original text: https://www.jianshu.com/p/4b7…
