Fluid 0.3 officially released: generalizing data acceleration for cloud native scenarios

Time:2021-9-11


Author: Gu Rong, PASALab, Nanjing University

Reading guide: Data-intensive applications such as big data and AI face several pain points when compute and storage are separated in cloud native environments: high data access latency, difficult joint analysis, and complex multi-dimensional management. To address them, PASALab of Nanjing University, Alibaba, and Alluxio jointly launched the open source project Fluid in September 2020.

Fluid is an efficient support platform for data-intensive applications in cloud native environments. Since its open source release, the project has attracted attention from many experts and engineers in related fields, and with their active feedback the community has progressed rapidly. Fluid v0.3 was recently released. It adds three important features:

  • General-purpose storage acceleration: access acceleration for Kubernetes data volumes
  • Stronger protection of data access: fine-grained permission control for datasets
  • Simpler configuration: built-in optimization of the system's internal parameters, so users no longer need to tune complex settings by hand

Fluid project address: https://github.com/fluid-cloudnative/fluid

The requirements behind these three features came from production feedback from many community users. Fluid v0.3 also includes bug fixes and documentation updates. You are welcome to try Fluid v0.3. Thanks to the community members who contributed to this release. We will continue to listen to and adopt community suggestions to drive the Fluid project forward, and we look forward to more of your feedback.

Fluid v0.3 download link: https://github.com/fluid-cloudnative/fluid/releases

The following sections introduce the features of this release in more detail.

1. Support for Kubernetes data volume access acceleration

Earlier versions of Fluid already supported many underlying storage systems (such as HDFS and OSS). In real production environments, however, enterprise storage systems are more diverse, and some still could not be connected to Fluid because of incompatibility. For example, users of the Lustre distributed file system could not use Fluid normally, because the distributed cache engine used by earlier versions of Fluid is not compatible with Lustre.

To make Fluid more general in cloud native data access acceleration scenarios, Fluid v0.3 adds acceleration support for persistent volume claims (PVCs) and host path mounts. This gives a general way to connect any storage system to Fluid: whichever underlying storage system is used, as long as it can be mapped to a Kubernetes native PVC resource object or to a host directory on the cluster nodes, it can benefit from Fluid features such as distributed data caching and data affinity scheduling. The basic concept is shown in the figure below:

[Figure: basic concept of PVC and host path mount acceleration in Fluid]

Usage is simple. Users only need to set mountPoint to pvc://nfs-imagenet, where nfs-imagenet is an existing data volume in the Kubernetes cluster.

apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: fluid-imagenet
spec:
  mounts:
  - mountPoint: pvc://nfs-imagenet
    name: nfs-imagenet
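A Dataset on its own does not cache anything; in Fluid it is paired with a runtime of the same name, and workloads then mount the volume that Fluid exposes for the dataset. The following is a minimal sketch of those two remaining pieces, not a definitive configuration. It assumes the Alluxio-based runtime; the cache medium, path, quota, the pod name `imagenet-reader`, the `nginx` image, and the `/data` mount path are all illustrative values that do not come from the release notes.

```yaml
# Sketch only: cache sizing and all pod details are assumed values.
apiVersion: data.fluid.io/v1alpha1
kind: AlluxioRuntime
metadata:
  name: fluid-imagenet        # must match the Dataset name
spec:
  replicas: 1                 # number of cache workers (assumed)
  tieredstore:
    levels:
      - mediumtype: MEM       # cache in memory
        path: /dev/shm
        quota: 2Gi            # cache capacity per worker (assumed)
        high: "0.95"
        low: "0.7"
---
# A hypothetical consumer pod that reads the accelerated data.
apiVersion: v1
kind: Pod
metadata:
  name: imagenet-reader
spec:
  containers:
    - name: reader
      image: nginx
      volumeMounts:
        - name: imagenet
          mountPath: /data
  volumes:
    - name: imagenet
      persistentVolumeClaim:
        claimName: fluid-imagenet   # PVC created by Fluid for the Dataset
```

For a host directory instead of a PVC, the Fluid samples use a `local://` mount point, for example `local:///mnt/nfs-imagenet` (the path here is hypothetical).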

To verify the PVC access acceleration capability, we trained a ResNet-50 model with the TensorFlow benchmark as the test scenario. The speed improvement results are shown below:

[Figure: speed improvement results for ResNet-50 training with PVC access acceleration]

The evaluation results show that the distributed cache capability provided by Fluid speeds up the whole training task and shortens the overall training time by more than 20%. For more details about the test, please refer to the related sample documents on GitHub.

2. Dataset access control

Many enterprises that provide machine learning platform services run storage systems shared by multiple users. For security reasons, these service providers need strict access control to ensure data isolation between users: no unauthorized user may access another user's datasets.

Fluid v0.3 supports this scenario. After an underlying storage system shared by multiple users is mounted into Fluid, the file permission information exposed by Fluid (such as owner and file mode) stays consistent with the underlying storage system. In other words, file permissions are passed through transparently from the underlying storage system to the nodes where Fluid is deployed. The access control of the underlying storage system therefore also takes effect on every node running Fluid, so data isolation between users is not broken.

Fluid v0.3 also supports "temporary borrowing" of datasets, for cases where one user needs temporary access to another user's dataset. The administrator can change the ownership of a dataset on the nodes running Fluid through flexible configuration, which lets a specified user temporarily borrow someone else's dataset. This helps cluster administrators manage dataset permissions in a more fine-grained and flexible way.

Documentation for accessing data as a non-root user: https://github.com/fluid-cloudnative/fluid/blob/master/docs/zh/samples/nonroot_access.md
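The permission settings described in this section can be sketched roughly as follows. This is an illustration under assumptions rather than the authoritative configuration: the `owner` and `runAs` fields are written as recalled from the non-root access document linked above and should be checked against it, and the dataset name `shared-dataset`, the PVC name `shared-storage`, the IDs, and the user and group names are all hypothetical.

```yaml
# Sketch only: verify field names against the non-root access document.
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: shared-dataset
spec:
  mounts:
    - mountPoint: pvc://shared-storage   # hypothetical shared volume
      name: shared-storage
  owner:                # user who should own the dataset on Fluid nodes
    uid: 1201
    gid: 1201
---
apiVersion: data.fluid.io/v1alpha1
kind: AlluxioRuntime
metadata:
  name: shared-dataset
spec:
  replicas: 1
  tieredstore:
    levels:
      - mediumtype: MEM
        path: /dev/shm
        quota: 2Gi
        high: "0.95"
        low: "0.7"
  runAs:                # identity the cache engine runs as (assumed values)
    uid: 1201
    gid: 1201
    user: user1
    group: group1
```

Under this sketch, an administrator would lend the dataset to another user by pointing these IDs at that user's account.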

3. Default parameter configuration optimization

Fluid exposes many configuration parameters so that users can tune it for their own applications. Before v0.3, users had to set these parameters manually according to their environment and business goals, which is difficult and time-consuming for most users.

Fluid v0.3 ships with a large set of optimized default parameters for internal components such as Alluxio and FUSE, so users no longer need to spend much effort on parameter tuning. In our experience, the optimized defaults deliver good performance in most common Fluid usage scenarios.
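With the built-in defaults, a runtime manifest only needs its cache settings. Where a default does not suit a particular workload, a single value can still be overridden. The sketch below assumes this is done through the runtime's `properties` map; the property shown is a real Alluxio client setting, but the chosen value and the runtime name are purely illustrative and not recommendations from this release.

```yaml
# Sketch only: rely on built-in defaults and override one property.
apiVersion: data.fluid.io/v1alpha1
kind: AlluxioRuntime
metadata:
  name: fluid-imagenet
spec:
  replicas: 1
  tieredstore:
    levels:
      - mediumtype: MEM
        path: /dev/shm
        quota: 2Gi
        high: "0.95"
        low: "0.7"
  properties:
    # Assumed example override; every other property keeps its default.
    alluxio.user.block.size.bytes.default: 256MB
```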

Summary

Fluid v0.3 mainly addresses problems and requirements reported by community users in real production environments. Support for host directory and PVC mounts provides a general solution for compatibility with different underlying storage systems. Dataset access control lets Fluid meet the needs of production environments shared by multiple users. The optimized default parameter configuration makes Fluid easier to use while keeping performance stable in most scenarios.

If you have any questions, you are welcome to join the DingTalk discussion group: https://img.alicdn.com/tfs/TB1Cm4ciNvbeK8jSZPfXXariXXa-452-550.png

Acknowledgments

  • Thanks to Xu Zhihao and Luo Yili (PASALab, Nanjing University) for their contributions to Kubernetes data volume access acceleration
  • Thanks to Lv Dongdong and Xie Yuandong (Yunzhisheng) for their contributions to the dataset permission control feature

About the author

Gu Rong is an associate researcher in the computer science department of Nanjing University. His research focuses on big data processing systems. He has published more than 20 papers in leading journals and conferences such as TPDS, ICDE, JPDC, IPDPS, and ICPP, and has led several General Program and Young Scientists projects of the National Natural Science Foundation of China as well as special funded projects of the China Postdoctoral Science Foundation. His research results have been applied at companies such as Alibaba, Baidu, ByteDance, Sinopec, and Huatai Securities, and in the open source projects Apache Spark and Alluxio. He won the first prize of the Jiangsu Science and Technology Award in 2018 and the Youth Science and Technology Award of the Jiangsu Computer Society in 2019. He serves as a member of the System Software Technical Committee and a corresponding member of the Big Data Technical Committee of the China Computer Federation, and as Secretary General of the Big Data Technical Committee of the Jiangsu Computer Society. He is a co-founder of the Fluid open source project and a PMC member of the Alluxio open source project.

Alibaba Cloud Native: this official account covers microservices, Serverless, containers, Service Mesh, and other technology areas, follows cloud native technology trends and large-scale cloud native adoption in practice, and aims to be the official account that best understands cloud native developers.