Introduction: On April 27, 2021, the Cloud Native Computing Foundation (CNCF) announced that, following a vote by its Technical Oversight Committee (TOC), Fluid had been accepted as an official CNCF sandbox project. Fluid is a cloud native data orchestration and acceleration system jointly launched by Nanjing University, Alibaba Cloud, and the Alluxio open source community.
Fluid project address:
In cloud native environments, the compute-storage separation architecture improves system elasticity and flexibility, but it also challenges the computing performance and management efficiency of data-intensive applications such as big data and AI. Existing cloud native orchestration frameworks that run such applications suffer from high data access latency, difficulty analyzing multiple data sources jointly, and a complex data usage workflow. Fluid was created to solve these problems.
Fluid system architecture
Fluid is an extensible distributed data orchestration and acceleration system that runs on Kubernetes. Its goal is to build an efficient support platform for data-intensive applications in cloud native environments. The project was open-sourced in September 2020 and has developed rapidly in just over half a year. It has attracted attention and contributions from experts and engineers in many fields, and has been adopted by a number of large, well-known IT and Internet companies, including Weibo and China Telecom.
Fluid introduces a series of technical innovations in the co-orchestration of cloud native applications and data, scheduling optimization, and data caching:
- Dataset abstraction that hides the underlying storage: a custom resource definition (CRD) provides a unified abstract definition and management of data from different storage systems, with support for observability and elastic scaling.
- Distributed caching to speed up dataset reads and writes: distributed data cache engines are customized and managed by extending the CacheRuntime object. Alluxio and JindoFS are currently supported natively as cache engines.
- Intelligent data orchestration based on container scheduling: data caches are placed and resized intelligently by building on Kubernetes container scheduling and scaling.
- Co-scheduling of datasets and applications: the Kubernetes scheduler is extended to be aware of dataset cache information, so that applications are scheduled close to their cached data and benefit fully from local cache reads and writes.
- Standard access interface: datasets are accessed through PersistentVolumeClaim, the standard Kubernetes storage interface, which keeps them seamlessly compatible with cloud native applications.
- Scenario-oriented performance tuning: for deep learning, batch data processing, and similar workloads, Fluid provides dataset prewarming, metadata management optimization, small-file I/O optimization, automatic elastic scaling, and other means of improving job efficiency.
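To make the Dataset, CacheRuntime, and PersistentVolumeClaim features above more concrete, here is a minimal sketch in Python that builds three Kubernetes manifests as plain dictionaries and prints them as JSON. It is an illustration under stated assumptions, not the project's reference configuration: the API version `data.fluid.io/v1alpha1`, the field names (`mounts`, `mountPoint`, `tieredstore`, `mediumtype`, `quota`), and the convention that the claim shares the Dataset's name are assumed and should be checked against the Fluid release in use. The names `demo`, `demo-app`, `busybox`, and the `https://example.com/data/` mount point are hypothetical placeholders.

```python
import json

# Assumed API group/version for Fluid custom resources; verify it against
# the Fluid release you actually install.
FLUID_API_VERSION = "data.fluid.io/v1alpha1"


def dataset_manifest(name, mount_point):
    """Describe a remote data source as a Dataset custom resource."""
    return {
        "apiVersion": FLUID_API_VERSION,
        "kind": "Dataset",
        "metadata": {"name": name},
        "spec": {"mounts": [{"name": name, "mountPoint": mount_point}]},
    }


def alluxio_runtime_manifest(name, replicas, quota):
    """Describe an Alluxio cache engine; it shares the Dataset's name."""
    return {
        "apiVersion": FLUID_API_VERSION,
        "kind": "AlluxioRuntime",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # One in-memory cache tier per worker (assumed field names).
            "tieredstore": {
                "levels": [
                    {"mediumtype": "MEM", "path": "/dev/shm", "quota": quota}
                ]
            },
        },
    }


def pod_manifest(pod_name, dataset_name, image, mount_path):
    """A plain Pod that reads the dataset through a PersistentVolumeClaim."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [
                {
                    "name": "app",
                    "image": image,
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            "volumes": [
                {
                    "name": "data",
                    # Assumption: the claim has the same name as the Dataset.
                    "persistentVolumeClaim": {"claimName": dataset_name},
                }
            ],
        },
    }


def build_manifests():
    """Assemble the three manifests using hypothetical placeholder values."""
    return [
        dataset_manifest("demo", "https://example.com/data/"),
        alluxio_runtime_manifest("demo", replicas=2, quota="2Gi"),
        pod_manifest("demo-app", "demo", "busybox", "/data"),
    ]


if __name__ == "__main__":
    for manifest in build_manifests():
        print(json.dumps(manifest, indent=2))
        print("---")
```

The application Pod in this sketch contains nothing specific to Fluid: it only references a PersistentVolumeClaim, which is the sense in which the standard access interface keeps existing cloud native applications compatible.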
Looking to the future
The Fluid open source project is committed to accelerating cloud native infrastructure for data-intensive applications. By combining original academic research with the production experience of industry, it works with the open source community to build a unified interface for application usage and data management on the Kubernetes platform. The Fluid community currently has five maintainers, who come from Nanjing University, Alibaba, and Alluxio. Gu Rong, an associate researcher at PASALab, Nanjing University, serves as chair of the open source community. In addition, engineers from China Telecom, Weibo, BOSS Zhipin, 4Paradigm, Unisound, and other companies have contributed a substantial amount of development work.
As a support platform for data-intensive applications that is fully compatible with the native Kubernetes ecosystem, Fluid will evolve toward a more flexible, intelligent, and scalable architecture, and will keep improving the experience of developers and users. Going forward, Fluid will continue to work with the community and the wider ecosystem to promote the adoption of cloud native technology in big data and AI systems, and to expand the boundaries of cloud native together with developers around the world.