Python obtains sample data or performs tasks proportionally

Time:2021-9-24

Obtain sample data or perform tasks in proportion

 

By: shouke QQ: 1033553122

Development environment

Windows 10

Python 3.6.5

 

Requirements

Given the total number of samples and the proportion of each category, draw samples from the categories in that proportion. For example, suppose there are four kinds of tasks to execute: A, B, C and D. The total number of task executions is 100000, and the execution counts of the categories must follow the ratio A:B:C:D = 3:5:7:9. At the macro level the tasks should appear to run at the same time, that is, the categories are interleaved rather than executed one after another.
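As a quick check of what "in proportion" means for these numbers, the following minimal sketch computes each category's expected share and count. The names `ratio`, `total` and `expected` are illustrative and do not come from the original script. Note that 100000 is not a multiple of 3 + 5 + 7 + 9 = 24, so the expected counts for B and C are not whole numbers and any real run can only approximate them.

```python
from fractions import Fraction

ratio = {'A': 3, 'B': 5, 'C': 7, 'D': 9}  # illustrative name; ratio taken from the example
total = 100000                             # total number of task executions

weight_sum = sum(ratio.values())

# Exact expected (share, count) per category, kept as fractions to avoid rounding
expected = {
    name: (Fraction(weight, weight_sum), Fraction(weight * total, weight_sum))
    for name, weight in ratio.items()
}

for name, (share, count) in expected.items():
    print(name, float(share), float(count))
```

Using `Fraction` here is only a convenience for showing that the expected counts always add up to the total exactly; plain floating-point division would work as well for a rough check.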

 

 

Code implementation

 

#!/usr/bin/env python
# -*- coding:utf-8 -*-


__author__ = 'shouke'

import time
from collections import Counter


def main():
    # Mapping from category to its proportion (ratio A:B:C:D = 3:5:7:9)
    class_proportion_map = {'A': 3, 'B': 5, 'C': 7, 'D': 9}
    class_list = []             # categories
    class_proportion_list = []  # proportion of each category, same order as class_list

    # Filling both lists in the same loop keeps each proportion at the same index as its category
    for class_type, proportion in class_proportion_map.items():
        class_list.append(class_type)
        class_proportion_list.append(proportion)

    temp_class_proportion_list = list(class_proportion_list)  # working copy, decremented on each pick
    result = []

    t1 = time.time()
    total_sample_num = 100000  # number of task executions
    for _ in range(total_sample_num):
        # All values in the working copy have reached 0: start a new round
        if max(temp_class_proportion_list) == 0:
            temp_class_proportion_list = list(class_proportion_list)
        # Pick the category with the largest remaining value
        index = temp_class_proportion_list.index(max(temp_class_proportion_list))
        result.append(class_list[index])
        temp_class_proportion_list[index] -= 1
    t2 = time.time()

    # Print the actual share of each category in the result
    c = Counter(result)
    for class_type, count in c.items():
        print(class_type, count / total_sample_num)
    print('time consuming: %s' % (t2 - t1))


main()

  

 

 

Operation results

 

Explanation

The idea behind the method above is as follows. Make a working copy of the list of per-category proportions. On each iteration, take the largest value in the copy and find the category at the same index (once the category is known, the corresponding sample data can be constructed or fetched as needed), then decrease that value in the copy by 1. When every value in the copy has reached 0 (the maximum and the minimum are both 0), reset the copy to the original proportions and repeat the process until the number of samples reaches the target total. Each full round picks every category exactly as many times as its proportion, so the final counts match the ratio exactly when the total is a multiple of the sum of the proportions, and approximately otherwise. The method assumes that the total number of samples and the proportion of each category are known in advance, and that the proportions are integers.
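The same idea can also be sketched as a generator, so that categories are produced one at a time and the caller can build each sample as soon as its category is known. This is a minimal sketch under stated assumptions, not the author's implementation: the function name `proportional_classes` and its argument are hypothetical, and the ratio is the 3:5:7:9 example from the requirements.

```python
from collections import Counter
from itertools import islice


def proportional_classes(proportions):
    """Yield category names forever, following the integer ratio in `proportions`.

    Hypothetical helper for illustration; `proportions` maps category -> integer proportion.
    """
    names = list(proportions)
    quotas = list(proportions.values())
    if not quotas or min(quotas) < 0 or max(quotas) == 0:
        raise ValueError('need non-negative integer proportions, at least one above 0')

    remaining = list(quotas)  # working copy, decremented on each pick
    while True:
        if max(remaining) == 0:        # round finished: reset the working copy
            remaining = list(quotas)
        index = remaining.index(max(remaining))
        remaining[index] -= 1
        yield names[index]


# One full round is sum(proportions) picks long: 3 + 5 + 7 + 9 = 24
one_round = list(islice(proportional_classes({'A': 3, 'B': 5, 'C': 7, 'D': 9}), 24))
counts = Counter(one_round)
print(one_round)
print(counts)
```

Taking the items with `islice` keeps the total number of samples out of the generator itself, so the same generator can serve any target total. Within a round, the category with the largest remaining value is always picked first, which means large categories lead each round and the small ones appear towards its end.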

 

 
