PTA_ Data structure learning and experimental guidance_ Explanation_ 1-3.1 median of two ordered sequences

Time:2020-9-28

Advanced experiment 1-3.1 median of two ordered sequences

Given two non-descending sequences S1 and S2 of equal length, design a function that finds the median of the union of S1 and S2. The median of an ordered sequence a0, a1, …, a(n−1) is the value of a((n−1)/2), that is, the ⌊(n+1)/2⌋-th number (a0 being the first number).
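To make the definition concrete, here is a minimal sketch in C. The helper name `median_index` is hypothetical and not part of the problem statement. For a 0-based sequence of length n it returns (n − 1) / 2 using integer division. For the merged sequence of two length-N inputs, n = 2N, so the median sits at index N − 1.

```c
#include <assert.h>

/* Hypothetical helper: 0-based index of the median of a sorted
 * sequence of length n, i.e. (n - 1) / 2 with integer division.
 * In 1-based terms this is the floor((n + 1) / 2)-th number. */
int median_index(int n)
{
    assert(n > 0);
    return (n - 1) / 2;
}
```

For example 1 the merged sequence has 10 numbers, so the median is the one at index 4 (the 5th number).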

Input example 1:

5
1 3 5 7 9
2 3 4 5 6

Output example 1:

4

Input example 2:

6   
-100  -10  1  1  1  1  
-50  0  2  3  4  5

Output example 2:

1  

Problem: Advanced experiment 1-3.1 median of two ordered sequences (25 points)

Algorithm analysis

  After reading the problem, my first reaction was to take the union of the two sequences, sort it, and output the number in the middle. However, the data size reaches 100000 and the time limit is 200 ms. Quicksort has time complexity O(n log n), and I expected it to time out.
  So there must be a better algorithm.
  Next, notice that the sequences in the problem are non-descending, which suggests taking the median of each and comparing the two, using the comparison to shrink the problem. If this works, the time complexity of the algorithm should be O(log n), which meets the scoring requirements.
  So let's test this idea.
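For reference, the "first reaction" approach described above can be sketched as follows. This is only an assumed baseline for cross-checking results, not the method this article develops. The names `cmp_int` and `median_by_sorting` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator for qsort: ascending ints, written to avoid overflow. */
static int cmp_int(const void *x, const void *y)
{
    int a = *(const int *)x, b = *(const int *)y;
    return (a > b) - (a < b);
}

/* Hypothetical baseline: copy both sequences into one buffer, sort it,
 * and return the element at index (2n - 1) / 2 = n - 1. O(n log n). */
int median_by_sorting(const int s1[], const int s2[], int n)
{
    assert(n > 0);
    int *u = malloc(2 * (size_t)n * sizeof *u);
    if (u == NULL) {
        abort(); /* out of memory: nothing sensible to return */
    }
    memcpy(u, s1, (size_t)n * sizeof *u);
    memcpy(u + n, s2, (size_t)n * sizeof *u);
    qsort(u, 2 * (size_t)n, sizeof *u, cmp_int);
    int median = u[n - 1];
    free(u);
    return median;
}
```

Because it makes no use of the fact that the inputs are already sorted, it is a convenient reference answer when testing a faster solution on small inputs.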

Conjecture and verification

Conjecture based on mathematics

  We first take the median of sequence S1 and call it mida, then take the median of S2 and call it midb.
  Because S1 and S2 are in non-descending order, the numbers to the left of mida in S1 are all no greater than mida, and the numbers to its right are all no less than mida. The same holds for S2.
  Now compare mida and midb.
  Since mida is the median of S1 and midb is the median of S2, in the set U = S1 ∪ S2 no number greater than max{mida, midb} can be the median. In the same way, no number smaller than min{mida, midb} can be the median.
  By comparing mida and midb, we divide the set U into two parts:
  A = [the right part starting at min{mida, midb}, the left part ending at max{mida, midb}] and its complement ∁_U A.
  At this point, the problem is reduced to finding the median of set A.
  Then, by repeated halving, A eventually becomes a set of only two numbers. By the definition of the median, the median must be min{A}, the smaller of the two numbers.


Verification based on test cases

Let’s simulate this process.

[Figure: sequence S1]

  This is sequence S1, where mida = 5.

[Figure: sequence S2]

  This is sequence S2, where midb = 4.

Since mida > midb, U = S1 ∪ S2 is divided into two sets: A = {1,3,5} ∪ {4,5,6} (blue) and ∁_U A (white).

[Figure: U divided into set A (blue) and ∁_U A (white)]

The median must be in set A. Because the median lies in the middle of the sorted sequence, it must lie between the medians of the two ascending subsequences.

  At this point, the problem becomes finding the median of set A. The white part ∁_U A can be discarded directly.

Recursively, we can keep shrinking set A step by step.
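A single shrinking step can be sketched like this. It is a simplified illustration that assumes both index ranges have the same odd length. The struct `ranges` and the function `shrink_once` are hypothetical names introduced only for this example.

```c
#include <assert.h>

/* Hypothetical struct: inclusive index ranges kept in a and in b. */
struct ranges { int al, ar, bl, br; };

/* One shrinking step. Sketch only: it assumes both ranges have the
 * same odd length, so each median can simply be kept in its half. */
struct ranges shrink_once(const int a[], int al, int ar,
                          const int b[], int bl, int br)
{
    assert(ar - al == br - bl);      /* equal lengths */
    assert((ar - al + 1) % 2 == 1);  /* odd length only */
    int ia = (al + ar) / 2, ib = (bl + br) / 2;
    struct ranges r = { al, ar, bl, br };
    if (a[ia] > b[ib]) {
        r.ar = ia;  /* larger median: keep its left part */
        r.bl = ib;  /* smaller median: keep its right part */
    } else if (a[ia] < b[ib]) {
        r.al = ia;
        r.br = ib;
    }
    /* equal medians: ranges stay unchanged, that value is the answer */
    return r;
}
```

On example 1, the first step keeps a[0..2] = {1,3,5} and b[2..4] = {4,5,6}, and a second step keeps a[1..2] = {3,5} and b[2..3] = {4,5}.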

At this step we run into a problem, which is also a big pitfall the author fell into.

[Figure: set A at this step]

Now both sequences contain an even number of elements, and the median of each is the smaller of its two numbers, that is, the first one. If we keep iterating the same way, the next set becomes the following.

[Figure: the next set A]

Since the median of {3,5} is 3, which is less than 4, we should next take the part of the sequence to its right. But the part to the right is still {3,5}! This can cause infinite recursion or an endless loop!

  Therefore, analyzing this step shows that we must distinguish whether the number of elements is odd or even when taking the subsequences. The smallest even natural number other than 0 is 2. When a sequence has length 2 and is in ascending order, its median is simply the first number.
  If we extend this to length 4, discarding the first two numbers degenerates to the case above, that is, set A is iterated once more. Applied to this problem: the boundary number in an even-length sequence can be discarded directly. That is, mida or midb. (Dropping it also keeps the two remaining subsequences equal in length, since the right part of an even-length sequence would otherwise be one element longer than the left part.)
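The stall on a length-2 range comes down to index arithmetic, which the following sketch isolates. Both function names are hypothetical. Each one computes the new left bound for the side with the smaller median, which keeps its right part.

```c
#include <assert.h>

/* New left bound if the median is always kept (the buggy variant). */
int next_left_naive(int left, int right)
{
    assert(left <= right);
    return (left + right) / 2;
}

/* New left bound if the median is dropped for even-length ranges. */
int next_left_fixed(int left, int right)
{
    assert(left <= right);
    int mid = (left + right) / 2;
    return ((right - left + 1) % 2 == 0) ? mid + 1 : mid;
}
```

For the range [1, 2] that holds {3,5}, `next_left_naive` returns 1 again, so the range never shrinks, while `next_left_fixed` returns 2 and leaves only {5}.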

Therefore, iterating {4,5,6,1,3,5} into {4,5,3,5} is wrong!

  The correct iteration should discard mida and midb from {4,5,6,1,3,5}.

The correct set A is as follows.

[Figure: the correct set A]

The median is 4.

Code

The following code is the author's own. Since I have been reviewing C recently, it is written in C, as a tail-recursive version.

#include <stdio.h>

#define MAX_N 100000

/* Binary search function declaration: aleft and aright are the left and right subscripts of array a; bleft and bright are those of array b */
int bin_search(int a[], int aleft, int aright, int b[], int bleft, int bright); 

int main()
{
    /* static keeps the two large arrays off the stack */
    static int a[MAX_N], b[MAX_N];
    int n = 0;
    scanf("%d", &n);
    for(int i = 0;i<n;i++){
        scanf("%d", &a[i]);
    }
    for(int i=0;i<n;i++){
        scanf("%d", &b[i]);
    }
    int mid = bin_search(a, 0, n-1, b, 0, n-1);
    printf("%d\n", mid);
    return 0;
}

int bin_search(int a[], int aleft, int aright, int b[], int bleft, int bright){
    int al = 0, ar = 0, bl = 0, br = 0; /* subscripts of arrays a and b for the next recursive call */
    /* indexa: subscript of the median of array a; mida: value of the median of array a (likewise for b) */
    int indexa = (aleft+aright)/2, indexb = (bleft+bright)/2, mida = a[indexa], midb = b[indexb];
    /*If the median of two arrays is equal, it must be the solution*/
    if(mida == midb){ 
        return mida;
    }
    /* If each interval to be searched has only one number left, the smaller one is the solution */
    if(aleft >= aright && bleft >= bright){  
        return mida<midb?mida:midb;
    }
    if(mida > midb){
        bl = indexb; /* the smaller median keeps its right interval */
        br = bright;
        ar = indexa; /* the larger median keeps its left interval */
        al = aleft;
        if((aright-aleft+1) % 2 == 0){ /* for an even length, the current median is also discarded */
            bl = indexb+1;
        }
    }else if(mida < midb){
        al=indexa;
        ar = aright;
        bl=bleft;
        br = indexb;
        if((bright-bleft+1) % 2 == 0){
            al=indexa+1;
        }
    }
    return bin_search(a, al, ar, b, bl, br);
}

The run result is as follows.

[Figure: run result]

The fastest run takes about 25 ms, with no order-of-magnitude gap. If we changed the recursion into a loop and replaced buffered input with a fast-read routine, the time should be about the same, which suggests this algorithm is probably close to the fastest available.
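The loop form mentioned above could look roughly like the following. This is a sketch that mirrors the recursive function step by step, not a measured or submitted version. The name `median_iterative` is hypothetical.

```c
#include <assert.h>

/* Iterative sketch of the same search over a[0..n-1] and b[0..n-1].
 * Both arrays must be non-descending and have the same length n. */
int median_iterative(const int a[], const int b[], int n)
{
    assert(n > 0);
    int al = 0, ar = n - 1, bl = 0, br = n - 1;
    for (;;) {
        int ia = (al + ar) / 2, ib = (bl + br) / 2;
        int mida = a[ia], midb = b[ib];
        if (mida == midb) {
            return mida;                       /* equal medians: done */
        }
        if (al >= ar && bl >= br) {
            return mida < midb ? mida : midb;  /* one number left each */
        }
        int even = (ar - al + 1) % 2 == 0;     /* ranges share a length */
        if (mida > midb) {
            ar = ia;                           /* larger: keep left part */
            bl = even ? ib + 1 : ib;           /* smaller: keep right part */
        } else {
            al = even ? ia + 1 : ia;
            br = ib;
        }
    }
}
```

Each pass roughly halves both ranges, so the loop runs O(log n) times and needs no extra stack frames.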

Summary

  This problem is not difficult; it mainly exercises one's attitude toward writing code QAQ. After all, I haven't written code for a long time and am still a little rusty on boundary conditions, so I hope to be more rigorous.
  If you have any questions, please contact me~