For three consecutive years, Flink has been the most active Apache open source project in the world

Time:2021-1-18

2020 is a year destined to be remembered by history. Despite the challenges facing global collaboration, the open source community led by the Apache Software Foundation, the world’s largest open source software foundation, still brought together the world’s top developers and delivered an inspiring report card. On January 1, 2021, the Apache Software Foundation published the article “Apache in 2020 – By The Digits” [1] on its official blog, reviewing the community’s development in 2020 in figures:

In the past year, the Apache Software Foundation hosted 238 projects across various fields, which together shipped nearly 3,500 releases.
All of the Apache Software Foundation’s open source software is today valued at more than 22 billion dollars.
Apache’s online conference attracted nearly 5,750 participants from more than 150 countries, and the two-day ApacheCon attracted more than 1.5 million visitors.

As one of the 199 top-level projects of the Apache Software Foundation, Apache Flink stands out in this report in terms of user activity, developer activity, and exposure.

Community mailing list activity: Top 1

Mailing lists are the common channel for communication between developers and users in Apache Software Foundation projects. A project generally has two of them: the dev@ mailing list and the user@ mailing list. Mailing list activity is often used as a measure of how actively a community communicates. In 2020, Flink ranked first among user mailing lists and second among developer mailing lists.


In particular, among the top 20 mailing lists, the Flink community is the only one that provides a dedicated mailing list for Chinese-speaking users, and in 2020 its activity was second only to that of Flink’s English user mailing list. Since 2018, Flink has ranked first in mailing list activity for three consecutive years. We are glad to see more and more native Chinese speakers making their voices heard in the open source community, which has had a great influence on the global open source software community.

Commit submission: Top 2

The number of new commits in the past year is a commonly used indicator of how actively an open source project is being developed. Every year, the Apache Software Foundation publishes the five projects with the most commits in the previous year. In 2020, Flink ranked second in the number of new commits, behind only Apache Camel, software for building routing engines. If the scope is limited to big data computing and storage, Apache Flink is the project with the most active developers. Looking at the past annual reports of 2019 [2] and 2018 [3], big data projects appear among the five most active projects every year: Flink, Hadoop, HBase, Beam, Airflow and Spark have all been listed. We draw the following table to describe this trend (because only the top 5 is published, some projects are missing in some years).

[Table: big data projects in the yearly top 5 by number of commits, 2018 to 2020]

Apache Flink is the only big data project that has appeared in the top 5 in each of the last three years, and its ranking has kept rising.

As the top 5 list changes every year, we also counted the commits of every project that has been listed in the last three years [4] and drew the following chart. The number of Flink commits has increased year by year, and the growth was particularly strong in 2020, further widening Flink’s lead among big data projects.
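The per-year counts can be reproduced from a local clone with the git command given in reference [4]. The following is only a minimal sketch: the helper names rev_list_args and count_commits are hypothetical, and it assumes git is installed and that the repository path passed on the command line is a full clone of the project being measured.

```python
import subprocess
import sys


def rev_list_args(year):
    """Build the git rev-list arguments from reference [4] for one calendar year."""
    return [
        "git", "rev-list",
        f"--after=Jan 1 {year}",
        f"--before=Jan 1 {year + 1}",
        "--all", "--no-merges", "--count",
    ]


def count_commits(repo_path, year):
    """Run git in repo_path and return the number of non-merge commits for that year."""
    result = subprocess.run(
        rev_list_args(year),
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. repo_path is not a repository
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # Hypothetical usage: python count_commits.py /path/to/clone
    repo = sys.argv[1] if len(sys.argv) > 1 else "."
    for year in (2018, 2019, 2020):
        print(year, count_commits(repo, year))
```

Because --all counts commits reachable from every ref, the result depends on which branches and tags the local clone contains, so the numbers may differ slightly from those in the chart.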

[Chart: yearly number of commits of the listed big data projects, 2018 to 2020]

GitHub visits: Top 2

The Apache Flink community is not only highly active in development and user communication; it also enjoys high exposure and heavy traffic on the Internet. According to the Apache Software Foundation’s count of GitHub traffic in 2020, Flink’s GitHub page ranked second among all projects.

Since this indicator does not appear in the Apache Software Foundation’s annual summaries for 2018 and 2019, we took the GitHub traffic figures from the annual report of fiscal year 2019 (2018.5.1-2019.4.30) [5] and the annual report of fiscal year 2020 (2019.5.1-2020.4.30) [6]:

[Figure: GitHub traffic rankings from the FY2019 and FY2020 annual reports]

Flink has risen from third place overall in mid-2018 to second place in 2020.

Summary

The Apache Software Foundation’s 2020 summary, combined with the 2018 and 2019 summaries and the fiscal-year annual reports, shows that Flink has undoubtedly grown into one of Apache’s leading projects. In user communication, development activity, and influence alike, it sits firmly near the top of all Apache open source projects.

At the same time, Flink Forward Asia 2020, the annual event of the Flink community, has just ended. There we witnessed the rapid development of the Flink community, its technological innovation, and the adoption of unified stream and batch processing in production environments. More and more enterprises, such as ByteDance, Xiaomi, NetEase and Zhihu, are exploring Flink as a unified architecture for stream and batch processing.

The large number of developers and users from China is undoubtedly one of the most important reasons for these achievements. If you are reading this article, you are probably already contributing to one of Apache’s top projects. Now 2021 has arrived. I believe that in the new year Apache Flink will continue to evolve towards the unification of stream and batch processing, of offline and real-time processing, and of big data and AI, and will make even greater achievements!

Real-time is the future, and the Flink community is looking forward to your participation!

References

[1] Apache in 2020 – By The Digits
https://blogs.apache.org/foun…
[2] Apache in 2019 – By The Digits
https://blogs.apache.org/foun…
[3] Apache in 2018 – By The Digits
https://blogs.apache.org/foun…
[4] Commit counts were obtained with the command: git rev-list --after="Jan 1 2020" --before="Jan 1 2021" --all --no-merges --count
[5] Apache FY2019 annual report
https://files-dist.s3.amazona…
[6] Apache FY2020 annual report
https://www.apache.org/founda…