Introduction: I have always believed that, for programmers, code is the key way to demonstrate ability. The difference between code written by an excellent programmer and code written by an ordinary one is easy to see. Since code is both a programmer’s core skill and business card, how to get better at writing it is a perennial topic. Unfortunately, this article does not offer specific steps, silver bullets, or secret techniques. Instead, it describes the four experiences that, as I remember them, improved my coding ability the most, in the hope that they may serve as a reference.
Paragraph 1: feeling the daily challenge of a hundred-million-scale system for the first time
In 2008, the second version of HSF went live in the trading center, Taobao’s most important system at the time. On launch day, access to the Taobao website became extremely slow and the trading pages could hardly be opened. Service was finally restored by taking the new version of HSF offline.
The second version of HSF was based on JBoss Remoting. In that version, the timeout for synchronous remote calls was hard-coded to 60 seconds, and some of the called services really did take more than 10 seconds to respond. As a result, the thread pool that the web application used to process web requests was gradually occupied by these slow requests, incoming requests piled up, and pages ended up opening very slowly.
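The failure described above, where a long hard-coded timeout lets slow remote calls pin every request thread, is commonly avoided by keeping the caller's wait short and configurable per call. The following is a minimal sketch in plain Java, not HSF's actual code; `TimedCall`, `callWithTimeout`, the pool size, and the fallback parameter are all hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, for illustration only: bounds how long a caller waits for a remote call.
final class TimedCall {
    // Assumed pool size; a real system would size and bound this deliberately.
    private final ExecutorService remoteCalls = Executors.newFixedThreadPool(4);

    // Runs the call and waits at most timeoutMillis. On timeout the caller gets the
    // fallback value, so its own thread is released instead of waiting on the slow service.
    <T> T callWithTimeout(Callable<T> call, long timeoutMillis, T fallback) {
        Future<T> future = remoteCalls.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the slow call so it does not keep running
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException("remote call failed", e.getCause());
        }
    }

    void shutdown() {
        remoteCalls.shutdownNow();
    }
}
```

With a timeout of, say, a few hundred milliseconds instead of 60 seconds, a service that occasionally takes more than 10 seconds costs the caller one fallback response rather than one blocked request thread.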
After finding the cause, I decided to rewrite the entire HSF communication layer on top of Mina. Those two months of rewriting greatly improved my coding ability, both through in-depth study of network IO handling and through in-depth study of high-concurrency systems. Looking back, my way of learning was to go through all kinds of introductory material on network IO and then read the source code of Mina and of Java’s network IO. For concurrency, I relied mainly on the classic book Java Concurrency in Practice and on reading the code in java.util.concurrent (j.u.c.). The biggest difference between this period and my earlier study was that I turned “thinking in Java” into practice. By the time the rewritten version of HSF went live, I had basically mastered these areas of coding.
Besides the improvement in coding ability, the biggest lesson I learned was that in a long-running system at the hundred-million scale, many problems that seem to have only a tiny probability become serious problems. This is why highly concurrent systems are hard to write: you need to understand clearly both your own code and the implementation of every API your code calls. Only then can you really ensure that the final code is robust.
Paragraph 2: the story of the unofficial “fire brigade”
The second experience that greatly improved my coding ability was my time in the unofficial “fire brigade”. Taobao had a lot of faults in 2009, but there was no standard process or organization for handling them, so faults often went unhandled or were handled inefficiently. A colleague from the operations team therefore pulled some people together into a group called the Taobao fire brigade, whose job was to deal with all kinds of faults on Taobao. I happened to join this group, which also included Duolong, the engineer widely recognized at Ali as a technical god.
When you first face all kinds of faults, you do not know where to start. Handling faults usually requires not only the ability to write code but also a good grasp of the overall picture of a system. For example, as in the article that was especially popular a few years ago about what happens behind the scenes after you click Search, you need to be very familiar with a system’s processing flow, and this matters a great deal when dealing with faults. Once a fault has occurred, it is also very important to understand in detail how the code on the affected path actually runs. At that point tools become essential, because they effectively show you what is happening: top -H at the system level, BTrace at the Java level, and so on, let you locate the problem from the running system.
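The per-thread CPU view that the system-level tool above provides has a rough in-process counterpart in the JDK's `ThreadMXBean`. The sketch below is only an illustration of that idea, not a tool from the fire brigade; `BusyThreads`, `toNid`, and `report` are hypothetical names, and the hex conversion assumes a JDK whose thread dumps print the native thread id as `nid=0x...` (JDK 8, for example).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper names; only the java.lang.management calls are standard JDK API.
final class BusyThreads {

    // Per-thread views of top show thread ids in decimal; JDK 8-era thread dumps
    // print the same id in hex as nid=0x..., so convert before searching the dump.
    static String toNid(long tid) {
        return "0x" + Long.toHexString(tid);
    }

    // Lists up to `limit` live threads of this JVM, highest accumulated CPU time first.
    static List<String> report(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return lines; // per-thread CPU time is not available on this JVM
        }
        List<long[]> usage = new ArrayList<>(); // pairs of {threadId, cpuNanos}
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id); // -1 if the thread is no longer alive
            if (cpuNanos >= 0) {
                usage.add(new long[] {id, cpuNanos});
            }
        }
        usage.sort((a, b) -> Long.compare(b[1], a[1])); // busiest first
        for (long[] row : usage) {
            if (lines.size() >= limit) {
                break;
            }
            ThreadInfo info = mx.getThreadInfo(row[0]);
            if (info != null) { // the thread may have exited in the meantime
                lines.add(info.getThreadName() + " cpu=" + row[1] / 1_000_000 + "ms");
            }
        }
        return lines;
    }
}
```

Note that the ids used by `ThreadMXBean` are Java thread ids, not the operating-system thread ids shown by top, so `toNid` applies only to the external workflow.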
I think my improvement during this period came from a great deal of practice, because there were so many faults. At first I watched how others handled them, learning mainly from Duolong, and then I tried to solve some faults myself. As I solved more and more of them, my proficiency gradually grew.
Besides getting better at resolving faults, I saw many faults that were caused at the code level, which helped me a lot in writing robust code that avoids failures. For example, I saw many cases where abuse of thread pools led to a huge number of threads being created until no more could be created, so I learned that whenever a thread pool is used, the maximum number of threads and the queueing strategy must be explicitly controlled. As another example, I saw countless cases of OOM caused by data structures that grow automatically, so I learned that when writing code I cannot simply assume a data structure will never grow huge and leave it unprotected. That is when I understood that it is not hard to write code that works and meets the requirements, but it is really not easy to write code that runs stably for a long time under all kinds of conditions. I think this is the biggest difference between a professional programmer who writes business systems and someone who merely writes programs.
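The two lessons above, bounding a thread pool and protecting a data structure from unlimited growth, can be sketched with standard JDK classes. This is a minimal illustration under assumed limits, not code from Taobao; `Bounded`, `boundedPool`, and `cappedMap` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helpers for illustration; the limits are chosen by the caller.
final class Bounded {

    // Thread pool with hard upper bounds on both threads and queued tasks.
    // When all threads are busy and the queue is full, AbortPolicy rejects the
    // new task with RejectedExecutionException instead of letting work pile up.
    static ThreadPoolExecutor boundedPool(int maxThreads, int queueCapacity) {
        return new ThreadPoolExecutor(
                maxThreads, maxThreads,
                60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Map that evicts its oldest-inserted entry once it holds more than maxEntries,
    // so it cannot grow until the heap is exhausted. Not thread-safe on its own.
    static <K, V> Map<K, V> cappedMap(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }
}
```

Rejecting is only one possible policy; `ThreadPoolExecutor.CallerRunsPolicy` or a custom handler may fit better, but in each case the behavior under overload is decided explicitly rather than left to an unbounded queue or unbounded thread creation.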
Paragraph 3: rewriting the communication framework
In 2010, I left the middleware team to work on HBase. At that time, communication in HBase was still implemented in a very simple way, and I wondered whether I could port the communication layer I had written for HSF to HBase. Around then, Duolong was writing libeasy in C, a general-purpose communication framework for all kinds of C applications, so we ran a test. I still remember the first results: there was a huge gap in high-concurrency capability between the communication framework in the original HSF and libeasy. Duolong and I discussed how he had implemented it, and I wanted to see whether I could learn from it and apply the changes in the Java version. That is how this experience of rewriting the communication framework came about.
After the earlier years of writing HSF, I thought I had mastered the coding skills related to communication frameworks. While rewriting with Duolong, I found that the gap was still large, and Duolong taught me many details. The core of an NIO-based communication framework is to use very few IO threads to handle IO events (more threads do not help, because some parts can only run serially), so using these IO threads efficiently is very important. The IO threads should do as little unrelated work as possible. Another point is to minimize switching between IO threads and business processing threads. For example, a common technique is to hand multiple requests read from a stream to the business processing threads in one batch.
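The batching idea above can be sketched as follows: the IO thread decodes every complete request it has read and hands them to the business threads in a single submission instead of one submission per request. This is a simplified illustration, not the code of HSF or libeasy; `BatchDispatcher` and the wire format (a 4-byte length prefix followed by the payload) are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical dispatcher: decodes on the IO thread, processes on business threads.
final class BatchDispatcher {
    private final Executor businessPool;
    private final Consumer<byte[]> handler;

    BatchDispatcher(Executor businessPool, Consumer<byte[]> handler) {
        this.businessPool = businessPool;
        this.handler = handler;
    }

    // Decodes every complete frame in the buffer. Assumed frame layout:
    // 4-byte length prefix, then that many payload bytes. An incomplete
    // trailing frame is left in the buffer for the next read.
    static List<byte[]> decodeAll(ByteBuffer in) {
        List<byte[]> requests = new ArrayList<>();
        while (in.remaining() >= 4) {
            in.mark();
            int length = in.getInt();
            if (length < 0) {
                throw new IllegalArgumentException("negative frame length");
            }
            if (in.remaining() < length) {
                in.reset(); // rewind to the start of the incomplete frame
                break;
            }
            byte[] payload = new byte[length];
            in.get(payload);
            requests.add(payload);
        }
        return requests;
    }

    // Called on the IO thread after a read: one handoff for the whole batch.
    int onRead(ByteBuffer in) {
        List<byte[]> batch = decodeAll(in);
        if (!batch.isEmpty()) {
            businessPool.execute(() -> batch.forEach(handler));
        }
        return batch.size();
    }
}
```

The trade-off in this sketch is that requests in one batch are processed sequentially on a single business thread, so batching reduces handoffs at the cost of some parallelism; whether that pays off depends on how expensive each request is.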
This experience helped me get a much better grip on the details of the overall code logic, which matters a great deal when writing systems with demanding requirements. After all, for a very large-scale system, even a 1% improvement is considerable.
Paragraph 4: learning the JVM
Because I had handled so many faults, I had started sharing with colleagues how to deal with them. I later found that there were problems I could not explain clearly or did not know how to handle, so I had to learn more about the JVM. At first, however, I could not find a way into the JVM code.
Fortunately, I met a colleague who shared the same interest and was much better at it than I was: Sakya, usually known in the community as “Big R”. Sakya and I arranged to read the JVM code together at the company over several weekends. With Sakya’s guidance, I finally got started and learned how to read it. Reading the code as a pair, sharing and discussing as we went, was also very efficient.
With this experience, and by continuing to deal with faults, I gradually gained a better understanding of how the JVM is implemented. When I later shared fault cases and solved problems, I finally knew what I was talking about. It also helped my ability to handle faults and to write code. For example, I came to understand much better what so-called GC-friendly code really means, and to feel it more deeply. In fact, Java code usually does not perform too badly, because the JVM applies many optimizations at runtime to bring it up to an average level. Writing it really well, however, is very difficult, because you need to understand both the JVM and the OS underneath it.
In fact, I cannot offer a real summary, because everyone’s environment is different and everyone has their own suitable way to improve. Looking at my own experience, I think:
- If you are not in an environment that provides such challenges, set yourself a challenging task. For example, if you want to learn highly concurrent communication, try writing a framework yourself and comparing its performance head-to-head against others; this usually teaches a lot. If you want to learn GC, set yourself several exercises in controlling GC behavior. If you do have such an environment, so much the better.
- Learn from excellent programmers. I learned a lot from Duolong and Sakya, and from excellent open source code such as Netty and OpenJDK. Participating in excellent open source projects and reading excellent books (for example, Java Concurrency in Practice for concurrency, and Oracle JRockit: The Definitive Guide and In-Depth Understanding of the Java Virtual Machine for the JVM) are also good ways to learn from excellent programmers.
- Solve more problems and faults. This is definitely a very good way to improve your all-round coding ability. If your job offers few such opportunities, there are plenty online; Stack Overflow, for example, is a good practice ground.
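As one possible starting point for the GC exercises suggested in the first item above, the sketch below counts how many collections happen while a loop allocates short-lived arrays. It is only an assumed exercise format, not something from the original experience; `GcExercise` and its method names are hypothetical, while `GarbageCollectorMXBean` is standard JDK API.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical exercise harness: observe GC activity caused by a known allocation pattern.
final class GcExercise {
    // Storing each array here makes the allocation observable, so the JIT cannot remove it.
    static volatile byte[] sink;

    // Sum of collection counts across all collectors of this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Allocates `rounds` short-lived arrays and returns how many collections ran meanwhile.
    static long collectionsWhileAllocating(int rounds, int arraySize) {
        long before = totalCollections();
        for (int i = 0; i < rounds; i++) {
            sink = new byte[arraySize]; // the previous array becomes garbage immediately
        }
        return totalCollections() - before;
    }

    public static void main(String[] args) {
        long collections = collectionsWhileAllocating(1_000_000, 1024);
        System.out.println("collections during allocation: " + collections);
    }
}
```

An exercise in this style is to predict the printed number before running, then change one thing at a time, such as the heap size (`-Xmx`), the young generation size (`-Xmn`), or the collector (`-XX:+UseSerialGC`), and explain the difference.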
Finally, I still want to say that coding ability, as a programmer’s hard business card, is always the most effective way to tell programmers’ abilities apart. I think the saying “Talk is cheap. Show me the code.” will always hold true.
Author: Bi Xuan
This article is original content from Alibaba Cloud and may not be reproduced without permission.