Tag:fault
-
Teach you how to build a monitoring system
Preface: Why do we need monitoring? For example, the cameras along a road monitor traffic flow and traffic incidents, so that when something goes wrong its location can be pinpointed immediately. Monitoring plays the same role in our field. It can also help us monitor business traffic, […]
-
Principle and implementation of service registration and discovery
What is service registration and discovery? For engineers working on microservices, the concepts of service registration and service discovery should be familiar. In short, when service A depends on service B, we need to tell service A where to call service B, which is the problem to be solved by service […]
-
Distributed storage system reliability: system quantitative estimation
1. Introduction: We often hear of two indicators for measuring the quality of distributed storage systems: availability and reliability. Availability refers to whether the system's services are available. It is generally measured as the time the service was available during a year divided by the total time in that year. Usually, the SLA metric is the availability metric, […]
-
Principle of Redis sentinel mode
Redis high-availability technologies: Persistence: single-machine backup (from memory to hard disk). Master-slave replication: multi-machine hot standby, load balancing, fault recovery. Sentinel: automated fault recovery. Cluster: write load balancing, horizontal expansion of storage capacity. Why sentinel mode? Only rely on the persistence scheme, and the […]
-
[North Asia server data recovery] oracle-sun-zfs file system server data recovery case
Server data recovery environment: Oracle-sun-zfs storage server; Windows operating system; ZFS file system; 4 groups with 8 hard disks in each group; all hot spares enabled. Server fault: While running normally, the server suddenly failed and stopped working. The server administrator could not enter the system after restarting the server. Contact North Asia data recovery […]
-
Use the Tianyi cloud host unit function to keep virtual machines from being put in the same basket
On February 6, 1958, British European Airways flight 609 crashed on its third attempt to take off from Munich Airport in West Germany. 23 of the 44 passengers and crew on board were killed, including 8 players and 3 staff members of Manchester United, the famous Premier League team. The air crash caused […]
-
Hadoop entry notes 11: HDFS high availability (HA)
I. High availability background. 1. Single point of failure and high availability. A single point of failure (SPOF) means that once a certain point in the system fails, the whole system stops working. In other words, a single point of failure is an overall failure. High availability (abbreviated as HA), an IT term, refers to the […]
-
MySQL high availability architecture MHA
Summary. Introduction to MHA: Developed by youshimaton of the Japanese company DeNA, it is an excellent solution for MySQL high availability. Automatic database failover can be completed within 0 to 30 seconds. MHA can ensure the consistency of data to the greatest extent in the process of failover, so as to […]
-
[Beiya server data recovery] data recovery of file system consistency error caused by LUN mapping error
Server data recovery environment: Server: Sun optical storage system; six 300 GB hard disks form a RAID 6 array, which is divided into several LUNs mapped to servers running different services; the server operating system is Sun Solaris. Fault: A new business application required an additional server. The server administrator mapped one LUN to a new server when the original […]
-
Evolution case of Internet e-commerce shopping cart architecture
The main functions of a shopping cart are: as in traditional stores, it lets users select multiple goods and settle them at one time; it functions as a temporary favorites list; for merchants, the shopping cart is one of the best places to sell to users. Early stage. ERP split. Business service splitting. WCS split. Overview of […]
-
Common fault handling under RHEL 5 system
1. All files under /boot are missing (grub, kernel, initrd, ramdisk): 1) Boot from the boot disk, enter linux rescue mode, and select local install or NFS (HTTP) install mode. 2) Enter repair mode: (1) cd /mnt/sysimage and check which files are present (it may be empty). (2) Install the kernel: cd /mnt/source/Server, then rpm -ivh kernel-2.6.18-53.el5.rpm […]
-
Using sysdig to monitor and troubleshoot server failures on Linux systems
When you need to track the system calls generated and received by a process, what first comes to mind? You might think of strace, and you would be right. What command line tools would you use to monitor raw network traffic? If you think of tcpdump, you have made an excellent choice. If you encounter a need to […]