Network fault diagnosis under Linux

Time:2019-11-8

Because a network service depends on many layers, a network failure can be complicated to track down. This article walks through the network problems that may appear on a Linux system: network card hardware problems, network configuration problems, driver problems, and problems at the network, transport, and application layers.

Network card failures can be divided into hardware failures and software failures. The easiest way to check for a hardware failure is to move the card to another computer. If the same problem appears there, the card is probably damaged; otherwise the card itself is fine. In practice, most network card failures are software failures, which generally fall into two categories: configuration errors and driver problems.

Diagnose network card failure

# dmesg | grep eth
eth0: registered as PCnet/PCI II 79C970A
eth0: link up
eth0: no IPv6 routers present

The command above lists the lines of the boot messages that contain the string eth. A line such as “eth0: link up” means that Linux has detected the network card and that it is working normally. The lspci command lists all PCI devices detected by the system; if the network card is on the PCI bus, it should appear in that list. Finally, ethtool can be used to check whether the Ethernet link is up.
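The check described above can be scripted. The following is a minimal sketch, assuming the driver logs messages of the form "link up" or "Link is Up"; the wording varies between drivers, and the helper name link_state_from_log is hypothetical, not a standard tool.

```shell
#!/bin/sh
# Sketch: report the last link state logged for an interface.
# Reads dmesg-style text on stdin; prints "up", "down", or "unknown".
# Note: the interface name is matched as a substring, so "eth1" would
# also match lines about "eth10".
link_state_from_log() {
    awk -v ifname="$1" '
        BEGIN { state = "unknown" }
        {
            line = tolower($0)
            if (index(line, ifname) == 0) next
            if (line ~ /link (is )?up/) state = "up"
            else if (line ~ /link (is )?down/) state = "down"
        }
        END { print state }
    '
}

# Usage on a live system:
#   dmesg | link_state_from_log eth0
#   lspci | grep -i ethernet    # confirm the card is visible on the PCI bus
```

Because the function only reads text from standard input, it can also be run against a saved copy of the boot log.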

# ethtool eth0
Settings for eth0:
       Current message level: 0x00000007 (7)
       Link detected: yes

If the output contains the line “Link detected: yes”, the network card has a working link to the device at the other end of the cable.
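To use this check in a script, the "Link detected" value can be pulled out of the ethtool output. This is a minimal sketch; the helper name link_detected is hypothetical.

```shell
#!/bin/sh
# Sketch: extract the "Link detected" value from `ethtool <iface>` output.
# Reads the output on stdin; prints "yes", "no", or "unknown" if the
# line is missing.
link_detected() {
    awk -F': *' '
        /Link detected/ { print $2; found = 1 }
        END { if (!found) print "unknown" }
    '
}

# Usage (ethtool usually needs root):
#   ethtool eth0 | link_detected
```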

Diagnose network card driver problems

On RHEL 5 and earlier, first view or edit the /etc/modprobe.conf file, which contains module installation options and aliases. (On RHEL 6 and later, the equivalent settings are kept in files under /etc/modprobe.d/.)

# more /etc/modprobe.conf
alias scsi_hostadapter mptbase
...
alias eth0 pcnet32

In the output above, the last line, “alias eth0 pcnet32”, defines eth0 as an alias for the pcnet32 module. In other words, the driver module for the Ethernet interface eth0 is pcnet32. The following command checks whether pcnet32 is among the modules currently loaded by the system.

# lsmod | grep pcnet32
pcnet32       35269      0
mii            9409      1   pcnet32

As shown, pcnet32 is already loaded. So if Linux has detected the network card but the eth0 interface does not appear in the output of “ifconfig -a”, identify the card's driver module as described above and then check whether that module is loaded.
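The two lookups above (alias to module, then module to loaded state) can be chained. The following is a minimal sketch, assuming the alias lives in a modprobe.conf-style file; the helper names driver_for_iface and module_loaded are hypothetical.

```shell
#!/bin/sh
# Sketch: print the module aliased to an interface.
# Reads modprobe.conf-style text on stdin.
driver_for_iface() {
    awk -v ifname="$1" '$1 == "alias" && $2 == ifname { print $3 }'
}

# Sketch: succeed (exit 0) if the named module is listed in
# lsmod-style text read from stdin, fail (exit 1) otherwise.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage:
#   drv=$(driver_for_iface eth0 < /etc/modprobe.conf)
#   lsmod | module_loaded "$drv" || modprobe "$drv"
```

Matching on the first column only avoids a false positive from lines such as the mii entry, which merely lists pcnet32 as a user of another module.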

Diagnose network layer problems

Diagnosing a network layer problem is straightforward: ping a domain name or IP address on the external network. If the ping succeeds, the network layer is working.

A ping can fail for many reasons, such as the cabling, the network settings, routing, or ARP. Ping the gateway first. If the gateway responds, the cabling, the local network settings, and ARP are generally fine. The command “route -n” displays the routing table, which gives the gateway address. If the routing table contains no default gateway, the routing configuration is at fault and a default gateway must be set.
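The gateway lookup can be automated by reading the default route out of the "route -n" output. This is a minimal sketch; the helper name default_gateway is hypothetical, and it assumes the usual column order (Destination, Gateway, Genmask, Flags).

```shell
#!/bin/sh
# Sketch: print the default gateway found in `route -n` output (stdin).
# The default route has destination 0.0.0.0, genmask 0.0.0.0, and the
# G (gateway) flag set. Prints nothing if no default route exists.
default_gateway() {
    awk '$1 == "0.0.0.0" && $3 == "0.0.0.0" && $4 ~ /G/ { print $2; exit }'
}

# Usage:
#   gw=$(route -n | default_gateway)
#   if [ -z "$gw" ]; then
#       echo "no default gateway set"
#   else
#       ping -c 3 "$gw"
#   fi
```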

Sometimes an ARP attack or another fault on the local network leaves the wrong MAC address for the gateway IP in the local ARP cache, which also makes pings to the gateway fail. In that case, delete the gateway's ARP entry with “arp -d <gateway IP>”, or set a static ARP entry with “arp -s <gateway IP> <gateway MAC>”.
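Before deleting or pinning an entry, it helps to see which MAC address the cache currently holds for the gateway. The following is a minimal sketch, assuming the column layout of "arp -n" (Address, HWtype, HWaddress); the helper name arp_mac_for and the variable expected_mac are hypothetical.

```shell
#!/bin/sh
# Sketch: print the cached MAC address for an IP, in lower case.
# Reads `arp -n` output on stdin; prints nothing for missing or
# incomplete entries.
arp_mac_for() {
    awk -v ip="$1" '$1 == ip && $2 == "ether" { print tolower($3); exit }'
}

# Usage (gw is the gateway IP; expected_mac is the gateway's known-good
# MAC in lower case, obtained from a trusted source):
#   seen=$(arp -n | arp_mac_for "$gw")
#   if [ "$seen" != "$expected_mac" ]; then
#       arp -d "$gw"
#       arp -s "$gw" "$expected_mac"
#   fi
```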

Diagnose transport and application layer problems

One of the most effective ways to diagnose transport layer and application layer faults is to capture packets and analyze them. Linux distributions typically ship the tcpdump tool, which can capture all packets entering or leaving the local machine; filter rules restrict the capture to the packets of interest.
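As an illustration of such a filter rule, the sketch below builds a tcpdump filter expression that keeps only traffic to or from one host on one TCP port. The helper name capture_filter, the interface eth0, and the address 192.0.2.10 are assumptions for the example.

```shell
#!/bin/sh
# Sketch: build a filter expression for one host and one TCP port.
# $1 = host IP or name, $2 = TCP port number.
capture_filter() {
    printf 'host %s and tcp port %s\n' "$1" "$2"
}

# Usage (as root): capture 20 matching packets on eth0 without
# resolving names or port numbers.
#   tcpdump -i eth0 -nn -c 20 "$(capture_filter 192.0.2.10 80)"
```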

One possible operating-system-related cause of failure is an improperly configured firewall. In Linux, the iptables firewall is enabled by default at system startup and allows only a few ports. So when a service on the local machine must be reached through a TCP or UDP port, the firewall has to open that port; otherwise other hosts cannot access the service.
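Opening a port might look like the sketch below. It prints the iptables command instead of running it, so the rule can be reviewed first; the helper name open_port_cmd is hypothetical, and inserting at the top of the INPUT chain is one possible placement, not the only one.

```shell
#!/bin/sh
# Sketch: print (do not run) an iptables command that accepts inbound
# traffic on one port. $1 = tcp or udp, $2 = port number.
open_port_cmd() {
    proto=$1
    port=$2
    case "$proto" in
        tcp|udp) ;;
        *) echo "unsupported protocol: $proto" >&2; return 1 ;;
    esac
    printf 'iptables -I INPUT -p %s --dport %s -j ACCEPT\n' "$proto" "$port"
}

# Usage:
#   iptables -L INPUT -n --line-numbers   # inspect the current rules
#   open_port_cmd tcp 80                  # review, then run the output as root
#   service iptables save                 # keep the rule across reboots (RHEL-style init)
```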