Improve efficiency! 10 key skills for Linux administrators



Good system administrators are distinguished by their efficiency. If an efficient administrator can finish in 10 minutes a task that takes others 2 hours, he should be rewarded for it, because he saves the company time, and time is money. Here are some tips for saving time. Even if you don't get paid more for being efficient, you will at least have more free time.

Tip 1: unmount the unresponsive DVD drive

A novice's experience: when he presses the eject button on the DVD drive of a server running a certain Redmond-based operating system, the tray pops out immediately. He then complains that on most enterprise Linux servers the disk will not eject if a process is running in the mounted directory. For too long as a Linux administrator, I would simply reboot the machine to get the disk out whenever I couldn't work out what was running and why it wouldn't release the DVD drive. But that is inefficient.

Here's how to find the process that is holding the DVD drive so that you can eject it whenever you like. First, simulate the problem. Put a disk in the DVD drive, open a terminal, and mount the drive:

mount /media/cdrom

cd /media/cdrom

while [ 1 ]; do echo "All your drives are belong to us!"; sleep 30; done

Now open a second terminal and try to eject the DVD drive:

eject

You will get the following message:

umount: /media/cdrom: device is busy

Before forcing the device free, find out who is using it:

fuser /media/cdrom

You can see that a process is running there, so it is actually our own fault that the disk cannot be ejected. Now, if you are the root user, you can kill the process at will:

fuser -k /media/cdrom
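To see why the drive counts as busy, here is a minimal sketch of the kind of check fuser performs for working directories, written against the Linux /proc file system. The function name cwd_holders is made up for this illustration. It is not part of fuser, and unlike fuser it looks only at working directories, not at open files.

```shell
# Hypothetical helper: print the PIDs whose current working directory
# is at or below the given directory (Linux only, relies on /proc).
cwd_holders() {
    dir=$(cd "$1" && pwd -P) || return 1
    # Treat "/" specially so that the "$dir"/* pattern below still works.
    [ "$dir" = / ] && dir=""
    for link in /proc/[0-9]*/cwd; do
        # Entries we are not allowed to read are skipped.
        target=$(readlink "$link" 2>/dev/null) || continue
        case "$target" in
            "$dir"|"$dir"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}

# Example: list the processes sitting in the current directory
# (this includes the shell running the script).
cwd_holders "$PWD"
```

In the scenario above you would call cwd_holders /media/cdrom. Run as root it sees the processes of all users. As an ordinary user it skips processes whose /proc entries it cannot read.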

Now you can finally unmount and eject the drive:

eject

fuser does the job.

Tip 2: restore a garbled screen

Try the following:

cat /bin/cat

Watch out! The terminal now looks like garbage, and everything you type comes out scrambled. So what do you do?

Type reset. But typing reset feels uncomfortably close to typing reboot or shutdown, enough to make your palms sweat, especially when you are on a production machine.

Don't worry, the machine will not restart when you do this. Go ahead:

reset

Now the screen is back to normal. This is much better than closing the window and logging in again, especially if you had to hop through five machines over SSH to reach this one.

Tip 3: screen collaboration

David, a senior maintenance user from product engineering, calls and says, "Why can't I compile supercode.c on these new machines you deployed?"

You ask him, "What machine are you on?"

David replies, "Posh." (This fictional company named its five production servers after the Spice Girls.) Now you can show off your skills. On another machine, become David:

su - david

Go to posh:

ssh posh

Once you are there, run the following command:

screen -S foo

Then call David: "David, run the command screen -x foo in your terminal."

At this point, your session and David's are joined together in the same Linux shell. You can type and so can he, and each of you sees what the other is doing. This saves a walk to another floor, and both sides have equal control. A further advantage is that David can watch your troubleshooting skills and see exactly how the problem is solved.

Finally, you spot the problem: David's compile script hard-codes an old directory that does not exist on the new server. You mount it and recompile, the problem is solved, and David goes back to work. You can go back to whatever you were doing before.

One thing to note about this technique is that both parties need to be logged in as the same user. The screen command can do other things as well, such as providing multiple windows and splitting the screen. Read the manual page for more information.

I have one last trick for screen sessions. To detach from a session and leave it running, type

Ctrl-A D

(that is, hold down the Ctrl key and press A, then press D). You can then reattach by running the screen -x foo command again.

Tip 4: recover the root password

You forgot the root password, so now you have to reinstall the whole machine. Sadly, many people do exactly that. But it is actually very easy to boot the machine and change the password. This does not work in every case (for example, if you set a GRUB password and forgot that too), but here is an example on CentOS Linux that illustrates the general procedure.

First, restart the system. The GRUB screen shown in Figure 1 appears during the restart. Press an arrow key so that you stay on this screen instead of continuing with the normal boot.


Figure 1. GRUB screen after restart

Next, use the arrow keys to select the kernel you want to boot and press E to edit its entry. You will then see the screen shown in Figure 2.


Figure 2. Preparing to edit the kernel line

Use the arrow keys again to highlight the line that starts with kernel, and press E to edit the kernel parameters. When you reach the screen shown in Figure 3, append the number 1 to the existing parameters.


Figure 3. The number 1 is added after the parameter

Then press Enter, followed by B, and the kernel boots into single-user mode. Once there, run the passwd command to change the password of the root user:

sh-3.00# passwd
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully

You can now restart, and the machine will come up with the new password.

Tip 5: SSH backdoor

Many times I have been at a site where I needed remote support from someone who was blocked on the outside by the company firewall. Few people realize that if you can get out through a firewall, it is easy to let outside traffic in. In its crudest form, this is called "poking a hole in the firewall." I call it an SSH back door. To use it, you need a machine connected to the Internet that acts as an intermediary. In this example, that machine is called blackbox. The machine behind the company firewall is called ginger, and the machine used by technical support is called tech. Figure 4 illustrates the setup.


Figure 4. Making a hole in the firewall

Here are the steps:

Check that what you are doing is allowed, and make sure you ask the right people. Most people will worry that you are opening up the firewall, but they do not realize that the connection is completely encrypted. Moreover, someone would have to break into the external machine before they could get into the company. Alternatively, you may belong to the "ask forgiveness, not permission" school. Use your own judgment either way, and don't complain if it doesn't go your way.

SSH from ginger to blackbox with the -R flag. Suppose you are the root user on ginger, and tech needs the root user ID to help with the system. The -R flag forwards port 2222 on blackbox to port 22 on ginger. This sets up the SSH tunnel. Note that only SSH traffic can come into ginger: you are not exposing ginger unprotected to the Internet. You can do this with the following syntax, where user stands for your account on blackbox:

~# ssh -R 2222:localhost:22 user@blackbox

Once you are on blackbox, you just need to stay logged in. I always type the following command:

blackbox:~$ while [ 1 ]; do date; sleep 300; done

This keeps the session busy. Then minimize the window.
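As an alternative to the keep-busy loop, OpenSSH can hold the tunnel open by itself. The sketch below only prints the command rather than running it. The function name reverse_tunnel_cmd and the target user@blackbox are placeholders, and the 60-second interval is an arbitrary choice. The -N option tells ssh not to run a remote command, and ServerAliveInterval makes the client send periodic keepalive messages.

```shell
# Hypothetical helper: print (do not run) an ssh command that opens a
# reverse tunnel and keeps it alive without a remote shell.
#   $1 = port to open on the intermediary
#   $2 = local port to expose
#   $3 = login on the intermediary (placeholder in this example)
reverse_tunnel_cmd() {
    remote_port=$1
    local_port=$2
    target=$3
    echo "ssh -N -o ServerAliveInterval=60 -R ${remote_port}:localhost:${local_port} ${target}"
}

# Dry run for the setup described above.
reverse_tunnel_cmd 2222 22 user@blackbox
```

Whether a keepalive is needed at all depends on how aggressively the firewall drops idle connections, so treat the interval as something to tune.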

Now tell your friends on tech to SSH to blackbox without any special SSH flags. You will have to give them the password for your account on blackbox (shown here as user), though:

tech:~# ssh user@blackbox

Once tech is on blackbox, they can SSH on to ginger with the following command:

blackbox:~$ ssh -p 2222 root@localhost

The person on tech will be prompted for a password. Enter the root password of ginger. Now you and the support person from tech can work together and solve the problem. You may even want to use screen together! (See tip 3.)

Tip 6: remote VNC session through SSH channel

VNC, or virtual network computing, has been around for a long time. I usually need it only when a graphical program on the remote server is available only on that server.

For example, suppose that in tip 5 ginger is a storage server. Many storage devices come with a GUI program for managing the storage controllers. These GUI management tools often need a direct network connection to the storage, sometimes over a dedicated subnet. As a result, the GUI can be reached only through ginger.

You can try connecting to ginger with the -X option of SSH and launching the program that way, but this often needs a lot of bandwidth, and the wait is painful. VNC is a much more network-friendly tool and is available for almost every operating system.

Suppose the setup is the same as in tip 5, but you want tech to get VNC access instead of SSH access. The procedure is similar, except that you forward the VNC port. Perform the following steps:

Start a VNC server session on ginger. Run the following command:

ginger:~# vncserver -geometry 1024x768 -depth 24 :99

These options start the server with a resolution of 1024x768 and a color depth of 24 bits per pixel. On a slower connection, a depth of 8 may be a better choice. The :99 specifies the display on which the VNC server is reachable. VNC port numbers start at 5900, so :99 means that the server can be reached on port 5999.
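The display-to-port rule is simple enough to check in the shell. This is a small sketch of the arithmetic only. The function name vnc_port is made up for the example.

```shell
# Hypothetical helper: convert a VNC display number to its TCP port.
vnc_port() {
    echo $((5900 + $1))
}

vnc_port 99   # prints 5999
vnc_port 2    # prints 5902
```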

You are required to specify a password when starting the session. The user ID is the same as the user who started the VNC server (root in this case).

Connect from ginger to blackbox, forwarding port 5999 on blackbox to ginger. Do this on ginger by running the following command (user stands for your account on blackbox):

ginger:~# ssh -R 5999:localhost:5999 user@blackbox

After running this command, keep the SSH session open so that the port stays forwarded to ginger. At this point, if you were sitting at blackbox, you could reach the VNC session on ginger by running the following command:

blackbox:~$ vncviewer localhost:99

That connection is forwarded to ginger through SSH, but what we want is for tech to reach ginger over VNC. That takes one more tunnel. On tech, open a tunnel over SSH that forwards local port 5999 to port 5999 on blackbox. Do this by running the following command (user again stands for the account on blackbox):

tech:~# ssh -L 5999:localhost:5999 user@blackbox

This time the SSH flag is -L: instead of pushing port 5999 to blackbox, it pulls it from blackbox. Once you are on blackbox, keep this session open too. Now you can use VNC on tech!

On tech, run the following command to connect VNC to ginger:

tech:~# vncviewer localhost:99

Tech now has a VNC session directly to ginger. It is a bit tedious to set up, but it beats running around to fix the storage array. It also gets easier after you have practiced it a few times.

One more point about this tip: if tech is running the Windows operating system and has no command-line SSH client, tech can use PuTTY. PuTTY can be set up to forward SSH ports through the options in its sidebar. If the port were 5902 instead of 5999 as in this example, you would enter the settings shown in Figure 5.


Figure 5. PuTTY can forward SSH ports as a tunnel

With this set up, tech can connect VNC to localhost:2, just as if tech were running Linux.

Tip 7: check bandwidth

Imagine this: Company A has a storage server named ginger, which is NFS-mounted by a client node named Beckham. Company A has decided that it needs more bandwidth out of ginger, because a large number of nodes need to NFS-mount ginger's shared file system.

The most common and cheapest way to do this is to bond two Gigabit Ethernet NICs together. It is the cheapest because you usually have a spare NIC and a spare port available.

So take this approach. But now the question is: how much bandwidth do you need?

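The theoretical limit of Gigabit Ethernet is 128 MB/s. Where does this figure come from? Look at this calculation:

1Gb = 1024Mb; 1024Mb/8 = 128MB; "b" = "bits," "B" = "bytes"

(Network rates are normally quoted with decimal prefixes, so strictly speaking 1 Gbit/s is 1000 Mbit/s, or 125 MB/s.)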
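The same arithmetic as a quick shell check, shown for both the binary prefix used in the calculation above and the decimal prefix that network rates are normally quoted in. The function name mbit_to_mbyte is made up for the example.

```shell
# Hypothetical helper: convert megabits to megabytes (8 bits per byte).
mbit_to_mbyte() {
    echo $(($1 / 8))
}

echo "1 Gbit as 1024 Mbit: $(mbit_to_mbyte 1024) MB/s"   # 128
echo "1 Gbit as 1000 Mbit: $(mbit_to_mbyte 1000) MB/s"   # 125
```

Either way, this is a ceiling that real measurements will not quite reach.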

But what will you actually see, and what is a good way to measure it? I recommend a tool called iperf. Download the iperf source tarball (version 2.0.2 is used here).

You need to install the tool on a shared file system visible to both ginger and Beckham, or compile and install it on both nodes. I'll compile it in the home directory of the user bob, which is visible on both nodes:

tar zxvf iperf*gz

cd iperf-2.0.2

./configure --prefix=/home/bob/perf

make

make install

On ginger, run:

/home/bob/perf/bin/iperf -s -f M

This machine acts as the server and reports throughput in MB/s.

On the Beckham node, run:

/home/bob/perf/bin/iperf -c ginger -P 4 -f M -w 256k -t 60

The results on both screens tell you the speed. On a typical server with a Gigabit adapter, you will probably see about 112 MB/s. This is normal, since bandwidth is lost in the TCP stack and the physical cables. With two servers connected back to back, each using two bonded Ethernet cards, I got about 220 MB/s.

In practice, NFS over a bonded network reaches about 150-160 MB/s. That still indicates that the bandwidth is roughly what you would expect. If you see something much lower, you should look for a problem.

I recently ran into a case in which two NICs that used different drivers were bonded through the bonding driver. Performance was extremely poor: the bandwidth was about 20 MB/s, less than it would have been without bonding the Ethernet cards at all!

Tip 8: command line scripts and utilities

A Linux system administrator becomes more efficient by scripting on the command line with confidence. This includes using loops skillfully and knowing how to parse data with utilities such as awk, grep, and sed. In general, this reduces both the number of keystrokes and the likelihood of user error.

For example, suppose you need to generate a new /etc/hosts file for a Linux cluster that you are about to install. The usual approach is to add the IP addresses by hand in vi or another text editor. Instead, you can take the existing /etc/hosts file and append the entries to it by running the following on the command line:

# P=1; for i in $(seq -w 200); do echo "192.168.99.$P n$i"; P=$(expr $P + 1);
done >>/etc/hosts

This creates 200 host names (n001 to n200) with IP addresses 192.168.99.1 to 192.168.99.200. Filling in such a file by hand risks creating duplicate IP addresses or host names, so this is a good example of using the command line to eliminate user error. Note that this is done in the bash shell, the default on most Linux distributions.
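A variation on the same idea, as a sketch: the function below writes the entries to standard output instead of appending to /etc/hosts, so that you can review them first. The name gen_hosts and its optional count argument are made up for this example. It uses printf for the zero padding rather than seq -w.

```shell
# Hypothetical helper: print "IP hostname" pairs for nodes 1..N
# (N defaults to 200), for example "192.168.99.1 n001".
gen_hosts() {
    count=${1:-200}
    for i in $(seq 1 "$count"); do
        printf '192.168.99.%d n%03d\n' "$i" "$i"
    done
}

# Preview the first three entries.
gen_hosts 3
```

Once the output looks right, gen_hosts >> /etc/hosts (as root) has the same effect as the one-liner above.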

As another example, suppose you want to check that every compute node in the Linux cluster has the same amount of memory. Normally a distributed or parallel shell is the best tool for this, but for the sake of demonstration, SSH is used here. Assume that SSH is set up so that no password is required. Then run:

# for num in $(seq -w 200); do ssh n$num free -tm | grep Mem | awk '{print $2}';
done | sort | uniq

This command line is rather terse. (It gets worse if you put regular expressions in it.) Let's break it down and look at each part.

First, the loop runs from 001 to 200. The -w option of the seq command pads the numbers with leading zeros. The num variable is then substituted to form the name of the host to connect to over SSH. Once you have the target host, you issue a command to it. In this case:

free -tm | grep Mem | awk '{print $2}'

This command means:

1. Use the free command to get the memory size in megabytes.

2. Take the output of that command and use grep to pick out the line containing the string Mem.

3. Take that line and use awk to print the second field, which is the total memory in the node.

This is done on every node. After the command has run on each node, the entire output from all 200 nodes is piped (|) to the sort command so that all the memory values are sorted. Finally, the uniq command removes the duplicates. The command produces one of the following results:

1. If all nodes (n001 to n200) have the same memory size, only one number is displayed. This is the amount of memory that each operating system sees.

2. If the node memory size is different, you will see several memory size values.

3. Finally, if SSH fails on a node, you will see some error messages.

This command is not perfect. If you find a memory value different from what you expect, you do not know which node has the problem or how many such nodes there are. For that, you need to issue another command.

This technique provides a quick way to view something, and you can know immediately if something goes wrong. Its value lies in rapid inspection.
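One possible follow-up command, as a sketch: if each value is printed together with its node name, a short awk filter can pick out the nodes that differ from the majority. The function name odd_nodes and the sample values are made up for the illustration.

```shell
# Hypothetical helper: read "node value" pairs on stdin and print the
# nodes whose value differs from the most common value.
odd_nodes() {
    awk '
        { val[$1] = $2; count[$2]++ }
        END {
            # Find the most common value.
            for (v in count)
                if (count[v] > best) { best = count[v]; common = v }
            # Report every node that deviates from it.
            for (n in val)
                if (val[n] != common) print n, val[n]
        }' | sort
}

# Demo with made-up sample values.
printf 'n001 2026\nn002 2026\nn003 1010\n' | odd_nodes   # prints: n003 1010

# On the cluster, the input could come from a loop such as:
#   for num in $(seq -w 200); do
#       echo "n$num $(ssh n$num free -m | awk '/Mem/ {print $2}')"
#   done | odd_nodes
```

If two values are equally common, which one counts as "normal" is arbitrary, so this only helps when most nodes agree.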

Tip 9: console reconnaissance

Some software prints error messages to the console, where they may not show up in your SSH session. You can check them by using the vcs devices. In an SSH session, run the following command on the remote server:

cat /dev/vcs1

This displays the contents of the first console. You can look at the other virtual terminals by using 2, 3, and so on. If a user is typing on the remote system, you will see what he has typed.
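The vcs devices return the screen contents as one continuous run of characters with no line breaks, so the output can be hard to read. A small sketch that breaks it into rows follows. The function name vcs_rows is made up, and the default width of 80 columns is an assumption about the remote console.

```shell
# Hypothetical helper: break a raw console dump on stdin into rows of
# the given width (default 80 columns, which is an assumption).
vcs_rows() {
    fold -w "${1:-80}"
}

# Tiny demo with a 3-column "screen".
printf 'abcdef' | vcs_rows 3
echo

# On the remote server (as root) this would be:
#   vcs_rows 80 < /dev/vcs1
```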

In most data farms, a remote terminal server, a KVM, or even Serial over LAN is the best way to view this kind of information, and it also brings the benefits of out-of-band access. The vcs devices provide a quick in-band alternative that can save you a trip to the machine room to look at the console.

Tip 10: random system information collection

Tip 8 gave an example of using the command line to get the total memory of a system. In this tip, I'll show a few other ways to gather important information from a system that you need to verify, troubleshoot, or support remotely.

First, gather information about the processor. It’s easy to do with the following command:

cat /proc/cpuinfo

This command reports the speed, number, and model of the processors. In many cases, grep can be used to pull out the value you want. A check I run regularly is to determine the number of processors in the system. So if I have bought a dual-processor server with quad-core processors, I can run the following command:

cat /proc/cpuinfo | grep processor | wc -l

I then expect to see the value 8. If I don't, I call the vendor and ask them to send me another processor.
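The same check can be wrapped so that it compares the count with what you expect. This is a sketch only. The function names count_cpus and check_cpus are made up, and the optional file argument exists so that the helper can be tried on a saved copy of /proc/cpuinfo.

```shell
# Hypothetical helper: count "processor" entries in a cpuinfo file
# (defaults to /proc/cpuinfo).
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Hypothetical helper: compare the count with an expected number.
#   $1 = expected processor count, $2 = optional cpuinfo file
check_cpus() {
    expected=$1
    found=$(count_cpus "$2")
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found processors"
    else
        echo "MISMATCH: expected $expected, found $found"
        return 1
    fi
}

# Show the count on this machine, if /proc/cpuinfo is available.
if [ -r /proc/cpuinfo ]; then
    count_cpus
fi
```

For the server in the example, check_cpus 8 would print the mismatch line and return a non-zero status if a processor were missing.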

Another piece of information I often need is disk information, which the df command provides. I always add the -h flag so that the output is shown in gigabytes or megabytes. df -h also shows how the disk is partitioned.

To finish the list, here is a way to look at system firmware: the BIOS level and the firmware on the NIC.

To check the BIOS version, run the dmidecode command. Unfortunately, the information is not easy to extract with grep, so paging through the output is a less efficient way to do it. On my Lenovo T61 laptop, the output is as follows:

dmidecode | less

BIOS Information

Vendor: LENOVO

Version: 7LET52WW (1.22 )

Release Date: 08/27/2007
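Because grep alone cannot tell which section a Version line belongs to, a short awk filter can help. The following is a sketch, assuming the usual dmidecode layout in which each section starts with an unindented title and ends with a blank line. The function name bios_info is made up for the example.

```shell
# Hypothetical helper: read dmidecode output on stdin and print only
# the vendor, version, and release date from the BIOS section.
bios_info() {
    awk '
        /^BIOS Information/ { in_bios = 1; next }   # section title
        /^$/                { in_bios = 0 }         # blank line ends a section
        in_bios && /Vendor:|Version:|Release Date:/ {
            sub(/^[ \t]+/, "")                      # strip the indentation
            print
        }'
}

# Demo on a captured sample. On a real system (as root) this would be:
#   dmidecode | bios_info
printf 'BIOS Information\n\tVendor: LENOVO\n\tVersion: 7LET52WW (1.22 )\n\tRelease Date: 08/27/2007\n\nSystem Information\n\tVersion: other\n' | bios_info
```

dmidecode also has a -s option that prints a single value, for example dmidecode -s bios-version.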

This is much more efficient than restarting the machine and reading the POST output. To check the driver and firmware versions of the Ethernet adapter, run ethtool:

ethtool -i eth0

driver: e1000

version: 7.3.20-k2-NAPI

firmware-version: 0.3-0

Concluding remarks

There is a lot to learn from someone who is proficient on the command line. The best ways to learn are:

1. Work with others. Share screen sessions and watch how others work; you will discover new ways of doing things. You may need to swallow your pride and let others drive, but you can usually learn a lot.

2. Read the manual pages. Reading them carefully gives you deeper insight even into commands you already know. For example, you might not have known that you can do network programming with awk.

3. Solve problems. As a system administrator, you are always solving problems, whether you caused them or someone else did. This is experience, and experience makes you better and more efficient.

The best administrators are the more leisurely ones, because they find the fastest way to complete a task, finish it quickly, and so keep their leisure.
