O’Reilly, the well-known technology book publisher, recently released a report analyzing trends in the technology industry based on data generated by its online learning platform.
The growth of artificial intelligence and web development continues, and cloud usage, security, and privacy remain key trends.
O’Reilly’s trend analysis is based on the data its platform generates. Use of O’Reilly online learning has been growing steadily, and considering the changes the COVID-19 outbreak has brought to the technology industry, that growth is not surprising.
This analysis of technology-industry trends is not about which field has attracted sudden, short-term attention; it comes from observing long-term behavior. A trend is different from a fad. A fad flashes by, while a trend shows up over a longer time range; its development may even include setbacks, but it does not stop, because deeper forces lie behind it.
In the latter ranking, Rust’s growth is striking, at 94%. But it is not hard for Rust to grow 94% from a small base.
The statistics show that Go, up 16%, has clearly established itself, particularly as a language for concurrent programming. Rust is likely to establish its own niche in “systems programming”: building new operating systems and tooling for cloud operations. Julia, a language designed for mathematical computation, is an interesting wild card; although it declined slightly in the past year, we are optimistic about its long-term prospects.
This is not an argument for or against Python, Java, or any other language. Their usage will rise and fall to varying degrees as the software industry evolves, but none of these top-tier languages is going to disappear. When comparing them, though, we need to look at more factors.
If competition is not what matters, what are the important trends in programming languages? We found three factors that are significantly changing programming:
- Multi-paradigm languages
- Concurrent programming
Platform data for concurrency shows an annual growth rate of 8%. That is not a big number, but don’t miss the trend. Java was the first widely used language to support concurrency; in the mid-1990s, thread support was a luxury, and Moore’s law still had plenty of headroom. That is no longer true, and support for concurrency, like support for functional programming, has become table stakes. Go, Rust, and most other modern languages have built-in concurrency support. Concurrency, however, has always been one of Python’s weaknesses.
- Dynamic versus static typing
- Low-code and no-code computing
For a learning platform, it is hard to collect data about a trend that minimizes the need for learning, but low code is real and is bound to have an effect. Spreadsheets were the pioneers of low-code computing: when VisiCalc was first released in 1979, it enabled millions of people to perform important calculations without learning a programming language. Democratization is an important trend in many technical fields, and programming is no exception.
What matters is not the competition between languages, but the capabilities a language is acquiring and why. Since we appear to have reached the end of Moore’s law, concurrency will be central to the future of programming: we can no longer count on simply getting faster processors. We will be using microservices and serverless/function-as-a-service in the cloud for a long time, and these are inherently concurrent systems. Functional programming does not by itself solve the concurrency problem, but the discipline of immutability certainly helps avoid pitfalls. As software projects inevitably grow larger and more complex, it makes sense for languages to extend themselves by mixing in functional features. We need programmers who are thinking about how to use functional and object-oriented features together: what practices and patterns make sense when building enterprise-scale concurrent software?
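To make the connection between immutability and concurrency concrete, here is a small Python sketch (our own toy example, not from the report): because the worker is a pure function over immutable data, it can be fanned out across a thread pool with no locks.

```python
from concurrent.futures import ThreadPoolExecutor

# Immutable input: a tuple of readings that no task can mutate.
READINGS = tuple(range(1, 11))

def normalize(value, lo=1, hi=10):
    """Pure function: the output depends only on the inputs,
    so it is safe to run concurrently without locks."""
    return (value - lo) / (hi - lo)

def normalize_all(readings):
    # Fan the pure function out across a thread pool. Since nothing
    # is shared mutable state, tasks cannot interfere with each other.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(normalize, readings))
```

The same discipline (share nothing mutable, return fresh values) is what makes functional features attractive in concurrent code, whatever the language.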
Low-code and no-code programming will inevitably change the nature of programming and of programming languages:
- There will be new languages, new libraries, and new tools to support no-code and low-code programmers. Whatever form they take, they will need programmers to build and maintain them.
- Sophisticated computer-aided coding can help experienced programmers. Whether this means “pair programming with a machine” or algorithms that write simple programs on their own remains to be seen. These tools will not eliminate programmers; they will make programmers more productive.
- Predictably, there will be strong opposition to bringing the masses into the programmers’ field. Ignore it. Low code is part of a democratization movement, and putting computing power into the hands of more people is almost always a good thing. Programmers who understand what this change means will not be driven out of their jobs by non-programmers; they will become more productive and write the tools that others use.
Whether you are a technology leader or a new programmer, pay attention to these slow, long-term trends. They will change the face of the whole industry.
DevOps vs SRE
In the past decade, operations has undergone fundamental changes. There has been much discussion of operations culture (commonly known as DevOps), continuous integration and deployment (CI/CD), and site reliability engineering (SRE). Cloud computing has replaced data centers, colocation facilities, and in-house machine rooms. Containers allow much closer integration between developers and operations, and a great deal of work has gone into standardizing deployment.
“NoOps” has not arrived; technologies like function-as-a-service (also known as FaaS, serverless, or AWS Lambda) have only changed what operations looks like. The number of people needed to manage an infrastructure of a given size has fallen, but the infrastructure we build has expanded: it is easy to assemble tens of thousands of nodes to train or deploy a complex AI application. Even when those machines sit in Amazon’s giant data centers and are managed in bulk with highly automated tools, operations staff still need to keep systems running smoothly, monitoring, troubleshooting, and ensuring the company does not pay for resources it does not need. Serverless and other cloud technologies let the same operations team manage a larger infrastructure; they do not make operations disappear.
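As an illustration of how FaaS shrinks, but does not eliminate, the operational surface, here is a minimal Python sketch in the style of an AWS Lambda handler. The payload field and response shape below are our own invention, so treat this as a sketch under assumptions rather than a reference implementation.

```python
import json

def handler(event, context):
    """A minimal FaaS-style handler: the platform invokes this function
    per request, so there is no server process for you to run. `event`
    carries the request payload; `context` carries runtime information.
    The provider operates the servers, but someone still has to watch
    invocations, failures, and cost."""
    name = (event or {}).get("name", "world")  # hypothetical payload field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Locally, the handler can be exercised by calling it directly, which is
# also how unit tests for function-as-a-service code are usually written.
if __name__ == "__main__":
    print(handler({"name": "ops"}, None))
```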
Over the past year, usage of content with DevOps in the title fell 17%, while SRE (including “site reliability”) rose 37% and the term “operations” rose 25%. Although SRE and DevOps are distinct concepts, for many customers SRE is DevOps at Google scale, and who doesn’t want that kind of growth? SRE and DevOps emphasize similar practices: version control (GitHub up 62%, Git up 48%), testing (heavily used, but not growing year over year), continuous deployment (down 20%), monitoring (up 9%), and observability (up 128%). Terraform, HashiCorp’s open source tool for automating cloud infrastructure configuration, also shows strong growth (53%).
Interestingly, the data shows Docker itself barely moving (down 5% year over year), while usage of content about containers soared 99%. Containerization is clearly a big deal. Docker itself may have stalled, but Kubernetes’ dominance as the container orchestration tool puts containers at the center. Docker was the enabling technology, but Kubernetes made it possible to deploy containers at scale.
Kubernetes itself is another superstar, up 47% a year, and the most heavily used (and most searched) topic in this group. Kubernetes is not just an orchestration tool; it is a cloud operating system (or, as Kelsey Hightower put it, “Kubernetes will be the Linux of distributed systems”). But the data doesn’t show the number of conversations we have with people who think Kubernetes is too “complex”. We see three possible solutions:
1. A “simplified” version of Kubernetes that is not as flexible but trades away much of the complexity. K3s is one possible step in this direction. The question is, what is the cost? This is my view of the Pareto principle, also known as the 80/20 rule: given any system (such as Kubernetes), you can usually build something simpler by keeping the 80% of the features that are most widely used and dropping the other 20%. Some applications will fit within the retained 80%, but most will have to sacrifice at least some functionality for the sake of simplicity.
2. An entirely new approach: some tool that has not yet appeared, and whose shape we cannot yet guess.
3. Integrated solutions from cloud providers (for example, Microsoft’s open source Dapr distributed runtime). Not cloud providers simply offering Kubernetes as a service: what if cloud providers integrate Kubernetes’ functionality into their own stacks so that it disappears into some kind of management console? Then the question becomes: which features do you lose, and do you need them?
The rich tool ecosystem around Kubernetes (Istio, Helm, and others) shows its value. But where do we go from here? Even if Kubernetes is the right tool for managing the complexity of modern applications running in the cloud, the pursuit of simpler solutions will eventually run into ever more complex requirements. Will these solutions be enough?
Observability grew fastest over the past year, at 128%, while monitoring grew only 9%. Observability is a richer, more powerful capability than monitoring, but much of this shift is superficial: “observability” risks becoming just a new name for monitoring. If you treat observability as merely a trendier term for monitoring, you miss its value. Complex systems running in the cloud need real observability to be manageable.
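As a sketch of what observability implies beyond plain monitoring (a toy example of our own, not from the report; the field names are invented), a common starting point is emitting structured, queryable events rather than free-text log lines:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object so a log aggregator
    can index and query individual fields, not just grep text."""
    def format(self, record):
        event = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Context attached by the caller via the `extra` kwarg.
            **getattr(record, "context", {}),
        }
        return json.dumps(event)

def make_logger(name="svc"):
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger

logger = make_logger()
# The hypothetical "route" and "latency_ms" fields let you ask questions
# later ("which routes got slow?") that plain monitoring cannot answer.
logger.info("request handled",
            extra={"context": {"route": "/checkout", "latency_ms": 42}})
```

Structured events like these are one ingredient; real observability also means traces and metrics correlated across services.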
Infrastructure as code: we have seen many tools for automating configuration. However, both Chef and Puppet declined significantly (49% and 40% respectively), as did Salt. Ansible was the only such tool to rise, up 34%. Two trends explain this. First, Ansible appears to be replacing Chef and Puppet, possibly because Ansible is multilingual while Chef and Puppet are tied to Ruby. Second, Docker and Kubernetes changed the configuration game. Our data shows Chef and Puppet peaking in 2017, just as Kubernetes began its almost exponential growth. Containerized deployment seems to minimize the problem of reproducible configuration, because a container is a complete software package: if you have the container, you can deploy it any number of times and get the same result each time. In reality it has never been that simple, but the apparent simplicity reduces the need for tools like Chef and Puppet.
Looking ahead, the biggest challenge facing operations teams and data engineers will be learning how to deploy AI systems effectively. Over the past decade, the DevOps movement has produced many ideas and technologies, such as rapid automated deployment from a source repository and continuous testing. They are very effective, but AI breaks the assumptions behind them, and deployment is often the biggest obstacle to AI success.
AI breaks these assumptions because data is more important than code. We don’t yet have good tools for versioning data (though DVC is a start). Models are neither code nor data, and we don’t have good tools for versioning models either. Frequent deployment assumes the software can be built relatively quickly, but training a model can take days. It has been suggested that model training need not be part of the build process, yet training is the most important part of the application. Testing is crucial for continuous deployment, but the behavior of AI systems is probabilistic rather than deterministic, so it is hard to say that this or that test has failed. Testing is especially difficult when it covers issues like fairness and bias.
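To illustrate why probabilistic behavior complicates testing, here is a toy sketch of our own (the “model”, its noise rate, and the accuracy floor are all invented): one common workaround is to assert a statistical property, such as average accuracy staying above an agreed floor, rather than an exact output.

```python
import random
import statistics

def accuracy(model, samples):
    """Fraction of (input, label) pairs the model gets right."""
    return sum(model(x) == y for x, y in samples) / len(samples)

# A toy "model": threshold at 0.5, with simulated label noise standing
# in for the probabilistic behavior of a real learned model.
_rng = random.Random(0)
def noisy_model(x, flip_rate=0.05):
    label = 1 if x > 0.5 else 0
    return 1 - label if _rng.random() < flip_rate else label

def passes_statistical_test(model, samples, floor=0.9, runs=20):
    """Instead of asserting exact outputs (which would flake), assert
    that mean accuracy over repeated runs stays above a floor."""
    scores = [accuracy(model, samples) for _ in range(runs)]
    return statistics.mean(scores) >= floor
```

A deterministic `assert model(x) == expected` would fail intermittently here; the statistical assertion captures what “working” actually means for such a system.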
Although a nascent MLOps movement exists, our data doesn’t show people using (or searching for) much content in these areas. In many cases the content doesn’t exist yet; but users search for content whether or not it exists, so the small number of searches suggests that most of our users aren’t yet aware of the problem. Operations staff too often assume an AI system is just another application; they are wrong. AI developers assume an operations team will be able to deploy their software so they can move on to the next project; they are also wrong. These problems will eventually be solved by a new generation of tools. Indeed, those tools are already being built, but we aren’t there yet.
Artificial intelligence, machine learning and data
The healthy growth of artificial intelligence continues: machine learning up 14%, artificial intelligence up 64%, data science up 16%, and statistics up 47%. Although artificial intelligence and machine learning are distinct concepts, their definitions are often blurred. We informally define machine learning as “the part of AI that works”; AI itself is more research-oriented and more ambitious. If you accept that definition, it is not surprising that machine learning content is heavily used: it is about taking research out of the lab and putting it into practice. Nor is it surprising that AI grows steadily, because that is where forward-looking engineers look for new ideas to turn into machine learning.
Indeed, by some measures AI has stagnated: many projects never make it into production, and although natural language processing made amazing strides last year (up 21%), with results like OpenAI’s GPT-3, there are fewer headline results of the “beating humans at Go” variety. AI (along with machine learning, data, big data, and all their fellow travelers) may be entering a trough. We believe, though, that applying current research to commercial products will take years of effort.
The future of artificial intelligence lies less in amazing breakthroughs and creepy face or speech recognition than in small, mundane applications. Consider AI’s role in the development of COVID vaccines: an important supporting role, enabling researchers to sift through tens of thousands of research papers and reports, design potentially effective drugs and genes, and analyze millions of health records. Without automating these tasks, it would have been difficult to curb the spread of the epidemic.
Therefore, this is how we see the future of artificial intelligence and machine learning:
- Natural language has been (and will continue to be) a big deal. GPT-3 has changed the world. And we will find that AI gives us the best tools for detecting what is fake and what is not.
- Many companies are using AI to automate customer service, and we have made great progress in synthesizing speech, generating realistic answers, and finding solutions.
- We will see many small, embedded AI systems in everything from medical sensors to appliances to factory floors. Anyone interested in the future of technology should watch Pete Warden’s work on TinyML very carefully.
We still haven’t come to grips with the user-interface problem of human-AI collaboration. We don’t just want AI to do some work in place of humans; we want AI to cooperate with humans and produce results better than either people or machines could achieve alone.
TensorFlow is the leader among machine learning platforms, with the most searches and stable usage at 6% growth. Content about scikit-learn, Python’s machine learning library, is almost as heavily used, with 11% annual growth. PyTorch is third, but usage of PyTorch content is up 159% year over year. That growth is no doubt influenced by the popularity of Jeremy Howard’s Practical Deep Learning for Coders course and the fastai library built on PyTorch. It appears that PyTorch is more popular among researchers, while TensorFlow still dominates in production. But as Jeremy’s students move into industry, and as researchers migrate toward production, we expect the balance between PyTorch and TensorFlow to shift.
Kafka, an important tool for building data pipelines, is very stable, with growth and usage both similar to Spark’s, at about 6%. Pulsar, Kafka’s would-be “next generation” competitor, has not yet appeared in the rankings.
Tools for automating AI and machine learning development (IBM’s AutoAI, Google’s Cloud AutoML, Microsoft’s AutoML, and Amazon’s SageMaker) have attracted a lot of attention over the past year, but we see no sign that they are making significant inroads in the market. It may be that AutoAI is relatively new, or that users don’t feel they need to search for supplementary training material.
What about data science? The report “What Is Data Science” is now a decade old, but surprisingly for a ten-year-old paper, its views are up 142% over 2019. The tools, however, have changed. A decade ago, Hadoop was at the center of the data science world. It is still around, but only as a legacy system, down 23% since 2019. Spark is now the dominant data platform, and it is clearly the tool engineers want to learn about: usage of Spark content is about three times that of Hadoop. But even Spark is down 11% since last year. Ray, a newcomer that promises to make building distributed applications easier, doesn’t yet show usage to match Spark (or even Hadoop), but it is up 189%. And other tools are on the way: Dask, which is newer than Ray, is up nearly 400%.
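The programming model that Spark, Ray, and Dask scale across clusters can be sketched in miniature with nothing but the Python standard library. This toy word count (our own example, not from the report) partitions the data, maps over the partitions in parallel, and reduces the partial results:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def partition(data, n):
    """Split data into up to n roughly equal chunks."""
    size = (len(data) + n - 1) // n
    return [data[i:i + size] for i in range(0, len(data), size)]

def count_words(chunk):
    # Map step: a per-partition word count.
    return Counter(word for line in chunk for word in line.split())

def word_count(lines, workers=4):
    """Partition -> map in parallel -> reduce: the core pattern that
    distributed data frameworks scale out across many machines."""
    chunks = partition(lines, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return reduce(lambda a, b: a + b, partials, Counter())
```

The frameworks differ in how they schedule, shuffle, and recover from failure, but the partition/map/reduce shape is the common core.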
Finally, topics such as ethics, fairness, transparency, and explainability barely register in our data. That may be because few books and training courses on these subjects exist, but that is a problem in itself.
Web development

Since HTML was invented in the early 1990s, and the first web servers and browsers appeared, the web has exploded into a wide variety of platforms. These platforms make web development more flexible: they make it possible to support a huge range of devices and screen sizes, and to build sophisticated applications that run in the browser.
So what does the world of web frameworks look like? React leads in content usage and shows significant growth (34% year over year). Despite rumors that Angular is fading, it is the second most used platform, up 10%. Usage of content about Node.js, the server-side platform, is just behind Angular, up 15%.
More surprising is that Ruby on Rails has shown very strong growth (77% annually) after several years of stability. Likewise, Django (roughly the same age as Rails) shows both significant usage and 63% growth. But this growth doesn’t extend to all the older platforms. Although nearly 80% of websites still use PHP, usage of PHP content is relatively low and declining (down 8%). And while jQuery shows healthy 18% growth, usage of jQuery content is lower than for any other platform we examined.
All kinds of clouds
It is no surprise that cloud computing is growing rapidly: usage of cloud content is up 41% since last year. Usage of content about Amazon Web Services, Microsoft Azure, and Google Cloud grew even faster, at 46%. Although most companies use cloud services in some form, and many have moved significant business-critical applications and data sets to the cloud, there is still a long way to go. If there is one technology trend you need to be on top of, this is it.
In the competition among the leading cloud providers AWS, Azure, and Google Cloud, Amazon’s growth has stalled, currently at only 5%. Usage of Azure content shows 136% growth, more than any competitor, while Google Cloud’s 84% growth rate is nothing to dismiss. As Azure and Google Cloud grow, Amazon’s dominance may be challenged.
Microsoft has done a remarkable job of reinventing itself as a cloud company. Over the past decade it has rethought every aspect of its business: it is now a leader in open source, and it owns GitHub and LinkedIn. It is hard to think of another company whose transformation has been as radical.
Google faces a different set of problems. Twelve years ago, the company practically invented serverless with App Engine. It open sourced Kubernetes and has bet heavily on its leadership in artificial intelligence: TensorFlow, the leading AI platform, is highly optimized to run on Google hardware. So why is Google in third place? Its problem is not its ability to deliver cutting-edge technology but its ability to reach customers, a problem that Google Cloud CEO Thomas Kurian is trying to solve.
Although our data shows very strong growth in usage of cloud content, we don’t see significant usage of terms like “multi-cloud” and “hybrid cloud”, or of specific hybrid-cloud products such as Google’s Anthos or Microsoft’s Azure Arc. These products are new and little content about them exists yet, so low usage isn’t surprising. But in this case the usage of specific cloud technologies matters less than the fact that usage of all the cloud platforms is growing, especially content not tied to any vendor. We also see our corporate clients using content that spans all the cloud providers; it is difficult to find anyone who looks at a single provider only.
Not long ago, we were skeptical about hybrid cloud and multi-cloud. It is easy to dismiss these concepts as utopian ideas springing from the minds of the vendors who finished second, third, fourth, or fifth: if you can’t win customers away from Amazon, at least you can grab a piece of their business. But cloud computing is hybrid by nature. Engineers can’t get resources for some project, so they create an AWS account and put it on the company credit card. Then someone on another team hits the same problem, but uses Azure. Next comes an acquisition, and the new company has built its infrastructure on Google Cloud. Meanwhile, there are petabytes of data on premises that are subject to regulatory requirements and hard to move. Companies have hybrid clouds before anyone recognizes the need for a coherent cloud strategy; by the time the executives draw up a master plan, mission-critical applications have already sprung up in marketing, sales, and product development.
Every cloud provider, including Amazon, is drawn to a strategy that doesn’t lock customers into one specific cloud but facilitates management of a hybrid cloud, and they all offer tools to support hybrid cloud development. They know that support for hybrid clouds is key to cloud adoption, and that if there is any lock-in, it will be around management. As IBM’s Rob Thomas often says, “cloud is a capability, not a location.”
As expected, we see strong interest in microservices, up 10% year over year: not huge, but healthy. Serverless also shows 10% growth, though with lower usage; although it “feels” stagnant, the data shows it growing in parallel with microservices.
Security and privacy
Security has always been a problematic discipline: defenders have to get thousands of things right, while an attacker only has to find one mistake. And that mistake might be made by a careless user rather than the IT staff. On top of that, companies have often underinvested in security.
Yet despite a decade of hacks costing billions of dollars and leading to the resignations and firings of many executives, our data doesn’t give a clear answer to whether businesses have learned their lesson. Although we avoid discussing absolute usage, usage of security content is very high: higher than for any topic except the major programming languages like Java and Python. Perhaps a better comparison is against broad topics like programming or the cloud; by that measure, programming usage is greater than security, and security trails only the cloud. So usage of security-related content really is very high, and up 35% year over year.
Resources for certification are heavily used. CISSP content and training amount to 66% of general security content, down slightly since 2019. Usage of content on the CompTIA Security+ certification is about 33% of general security, with 58% growth.
Interest in hacking is fairly strong, up 16%. Interestingly, usage of content on ethical hacking (a subset of hacking) is only about half that of hacking overall, but up 33%. So we are evenly split between “good guys” and “bad guys”, with the good guys growing faster. Penetration testing, which can be considered a form of ethical hacking, shows a 14% decline; that change may simply reflect which term is more popular.
Beyond these categories, we get into the long tail. Content on specific topics such as phishing and ransomware sees relatively little use, though ransomware usage is up 155% year over year, growth that no doubt reflects the frequency and severity of ransomware attacks over the past year. Content on “zero trust”, a technique for building defensible networks, is also up 130%, though its usage remains small.
Disappointingly, we see hardly anything about privacy, including specific regulatory requirements such as GDPR: not much usage, no notable growth, and not even many search queries.
We have surveyed a large part of the technology landscape: the competition within each field and the deeper stories behind it. Trends are not just the latest fads; they are long-term processes. Containerization goes back to Unix version 7 in 1979. Sun Microsystems invented cloud computing in the 1990s with its workstations and Sun Ray terminals. We may talk about “internet time”, but the most important trends span decades, not months or years, and often involve reinventing technology that was useful but forgotten, or that appeared before its time.
With that in mind, let’s step back and look at the big picture. How do we harness the computing power needed for AI applications? We have talked about concurrency for decades, but it was long an exotic capability that mattered only for huge number-crunching tasks. We have talked about system administration for decades, while the ratio of IT staff to computers managed has gone from many-to-one to one-to-thousands (monitoring infrastructure in the cloud). As part of that evolution, automation has gone from an option to a necessity.
Finally, the most important trend may be one that doesn’t yet appear in this data at all. Technology has largely enjoyed a free ride as far as regulation and legislation are concerned. Industries such as healthcare and finance are strictly regulated, but social media, most machine learning, and even most of online commerce have seen only light regulation. That free ride is coming to an end.
The industry has moved too fast and broken too many things, and in that context privacy and related topics deserve far more attention. The question we now face is simple: what kind of future do we want to build with technology?