A practical application of the MapReduce idea in ABAP programming

Time:2022-5-9

ABAP is an enterprise application programming language. Release 7.40, which shipped in 2013, added a good deal of new syntax and many new keywords.


One of the highlights is the newly introduced REDUCE keyword. It plays a role similar to the reduce operation in the MapReduce programming model, which is widely used for parallel computation over large data sets, and it can be understood literally as "reduction".

What is the idea behind MapReduce?

MapReduce is a programming model, with an associated implementation, for generating and processing large data sets with parallel, distributed algorithms on a cluster.

A MapReduce program consists of a map procedure and a reduce method. The map procedure filters and sorts, for example sorting students into queues by name, with one queue per name.

The reduce method performs a summary operation, such as counting the students in each queue.
The MapReduce system orchestrates the distributed servers: it runs the various tasks in parallel, manages all communication and data transfer between the parts of the system, and provides redundancy and fault tolerance.

The following figure shows the working steps of a MapReduce framework that counts the words in a massive input data set (for example, larger than 1 TB). The steps are splitting, mapping, shuffling, and reducing, which together produce the final output.
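To make the four steps concrete, here is a minimal single-process sketch of a word count in ABAP. It only mirrors the shape of the steps and involves no cluster or parallelism. The report name, the sample lines, and all variable names are assumptions made for illustration.

```abap
REPORT zwordcount_sketch.

TYPES: BEGIN OF ty_word_count,
         word  TYPE string,
         count TYPE i,
       END OF ty_word_count,
       tt_word_count TYPE STANDARD TABLE OF ty_word_count WITH EMPTY KEY.

" Hypothetical input: each table row stands for one split of the data set
DATA(lt_lines) = VALUE string_table( ( `deer bear river` )
                                     ( `car car river` )
                                     ( `deer car bear` ) ).

" Splitting and mapping: break every line into single words
DATA lt_words TYPE string_table.
LOOP AT lt_lines INTO DATA(lv_line).
  SPLIT lv_line AT ` ` INTO TABLE DATA(lt_parts).
  APPEND LINES OF lt_parts TO lt_words.
ENDLOOP.

" Shuffling and reducing: group identical words, then count each group
DATA lt_counts TYPE tt_word_count.
LOOP AT lt_words INTO DATA(lv_word)
     GROUP BY ( word = lv_word size = GROUP SIZE )
     ASCENDING WITHOUT MEMBERS
     INTO DATA(ls_group).
  APPEND VALUE #( word = ls_group-word count = ls_group-size ) TO lt_counts.
ENDLOOP.
```

In a real MapReduce system each step would run on many servers at once; here the grouping and counting simply happen inside one LOOP.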


The MapReduce programming model is widely used in big data tools and frameworks such as Hadoop.


A practical application of MapReduce in a CRM system

Let's look at a practical task from the author's work. I needed to gather statistics on a CRM test system: for the database table CRM_JSTO, list the number of rows that share the same values in the two columns OBTYP (object type) and STSMA (status schema). Rows with the same OBTYP and STSMA values are analogous to the repeated words in the figure above.

The following figure shows some rows of the system table CRM_JSTO:


The following figure shows the final statistics:

The database table on the test system holds more than 550,000 rows. Of these, 90,279 rows have OBTYP set to TGP and no STSMA maintained.

The combination of COH and CRMLEAD ranks second, with 78,722 occurrences.


How is the result in the figure above calculated?

Anyone who has done some ABAP development will immediately write the following code:


It uses SELECT COUNT to complete the statistics directly in the database layer. This is also the practice SAP recommends, the so-called code pushdown principle: operations that can be carried out at the HANA database level should be pushed down there as far as possible, to make full use of HANA's computing power. As long as the database can handle the calculation logic, avoid placing that logic in the NetWeaver ABAP application layer.
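The original listing is not reproduced in this text, so the following is only a sketch of what such a query could look like. It assumes a release that supports the newer Open SQL syntax (comma-separated lists, @-escaped host variables, INTO at the end). The alias row_count and the variable lt_counts are names chosen for illustration.

```abap
" Count the rows per (OBTYP, STSMA) combination directly in the database
SELECT obtyp, stsma, COUNT(*) AS row_count
  FROM crm_jsto
  GROUP BY obtyp, stsma
  ORDER BY row_count DESCENDING
  INTO TABLE @DATA(lt_counts).
```

Only one row per group travels to the application server, rather than all of the table's rows.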


However, we also need to keep the limitations of this approach in mind. SAP's CTO once said:

There is no future with ABAP alone
There is no future in SAP without ABAP


In the future, ABAP will be open and interconnected. Back to the requirement itself: suppose the input data does not come from an ABAP database table but from an HTTP request or an IDoc sent by a third-party system. In that case we can no longer use Open SQL's SELECT COUNT and have to solve the problem in the ABAP application layer.

Here are two ways to meet this requirement in ABAP.

The first method is more traditional and is implemented in the method get_result_traditional_way:


ABAP's LOOP AT ... GROUP BY keyword combination is almost tailor-made for this requirement: specify the two columns OBTYP and STSMA for GROUP BY, and LOOP AT automatically groups the rows of the internal table by the values of these two columns. The number of rows in each group is calculated automatically through the keyword GROUP SIZE. The OBTYP and STSMA values of each group, together with the number of rows in the group, are stored in the variable group_ref specified by REFERENCE INTO. All the ABAP developer has to do is store these results in the output table.
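The body of get_result_traditional_way is not reproduced in this text. A minimal sketch of the grouping loop just described might look as follows. The report name, the result types, and the variable names other than group_ref are assumptions, and the loop is shown as a standalone report rather than as a method.

```abap
REPORT zgroupby_sketch.

TYPES: BEGIN OF ty_status_result,
         obtyp TYPE crm_jsto-obtyp,
         stsma TYPE crm_jsto-stsma,
         count TYPE i,
       END OF ty_status_result,
       tt_status_result TYPE STANDARD TABLE OF ty_status_result WITH EMPTY KEY.

DATA lt_status TYPE STANDARD TABLE OF crm_jsto WITH EMPTY KEY.
DATA lt_result TYPE tt_status_result.

SELECT * FROM crm_jsto INTO TABLE lt_status.

" One loop pass per distinct (OBTYP, STSMA) pair; GROUP SIZE supplies the count
LOOP AT lt_status INTO DATA(ls_status)
     GROUP BY ( obtyp = ls_status-obtyp
                stsma = ls_status-stsma
                size  = GROUP SIZE )
     ASCENDING WITHOUT MEMBERS
     REFERENCE INTO DATA(group_ref).
  APPEND VALUE #( obtyp = group_ref->obtyp
                  stsma = group_ref->stsma
                  count = group_ref->size ) TO lt_result.
ENDLOOP.
```

WITHOUT MEMBERS is used here because only the group key and size are needed, not the individual rows of each group.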


The second method uses the REDUCE keyword newly introduced with ABAP 7.40, as described in the title of this article:


REPORT zreduce1.

DATA: lt_status TYPE TABLE OF crm_jsto.

SELECT * INTO TABLE lt_status FROM crm_jsto.

DATA(lo_tool) = NEW zcl_status_calc_tool( ).

lo_tool = REDUCE #(
  INIT o          = lo_tool
       local_item = VALUE zcl_status_calc_tool=>ty_status_result( )
  FOR GROUPS <group_key> OF <wa> IN lt_status
      GROUP BY ( obtyp = <wa>-obtyp stsma = <wa>-stsma )
      ASCENDING
  NEXT local_item = VALUE #( obtyp = <group_key>-obtyp
                             stsma = <group_key>-stsma
                             count = REDUCE i( INIT sum = 0
                                               FOR m IN GROUP <group_key>
                                               NEXT sum = sum + 1 ) )
       o          = o->add_result( local_item ) ).

DATA(ls_result) = lo_tool->get_result( ).

At first glance the code above may seem a little obscure, but a careful reading shows that it uses the same grouping strategy as method 1, LOOP AT ... GROUP BY: the rows are grouped by OBTYP and STSMA, and each group is identified through the variable <group_key>. The number of rows in the group is then accumulated manually by the inner REDUCE expression. Reduce a large input set into smaller subsets according to the condition given by GROUP BY, then calculate over each subset separately: this is the processing idea that the REDUCE keyword conveys to ABAP developers through its literal meaning.
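The helper class zcl_status_calc_tool is not shown in this text. The following is a sketch of the part the REDUCE expression relies on, inferred only from the calls above and written as a local class. The name lcl_status_calc_tool, the parameter names, and the attribute mt_result are assumptions.

```abap
CLASS lcl_status_calc_tool DEFINITION.
  PUBLIC SECTION.
    TYPES: BEGIN OF ty_status_result,
             obtyp TYPE crm_jsto-obtyp,
             stsma TYPE crm_jsto-stsma,
             count TYPE i,
           END OF ty_status_result,
           tt_status_result TYPE STANDARD TABLE OF ty_status_result
                            WITH EMPTY KEY.
    " Collects one group result and hands back the same instance
    METHODS add_result
      IMPORTING is_result      TYPE ty_status_result
      RETURNING VALUE(ro_self) TYPE REF TO lcl_status_calc_tool.
    METHODS get_result
      RETURNING VALUE(rt_result) TYPE tt_status_result.
  PRIVATE SECTION.
    DATA mt_result TYPE tt_status_result.
ENDCLASS.

CLASS lcl_status_calc_tool IMPLEMENTATION.
  METHOD add_result.
    APPEND is_result TO mt_result.
    ro_self = me.
  ENDMETHOD.

  METHOD get_result.
    rt_result = mt_result.
  ENDMETHOD.
ENDCLASS.
```

Returning the instance itself from add_result is what allows the assignment o = o->add_result( local_item ) after NEXT to carry the collector from one group to the next.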

To summarize and compare the three implementations: when the data source is an ABAP database table, Open SQL should be preferred, so that the calculation is done in the database layer for the best performance.

When the data source is not an ABAP database table and the grouping statistic is a simple count, prefer LOOP AT ... GROUP BY ... GROUP SIZE: the counting is done in the ABAP kernel through GROUP SIZE, which gives better performance.

When the data source is not an ABAP database table and the grouping statistic requires user-defined logic, use the third solution, REDUCE, and write the custom logic after the NEXT keyword of the inner REDUCE expression.
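As an illustration of custom logic after NEXT, the sketch below sums a value per group instead of counting rows. Everything in it is hypothetical: the row type with its amount field, the sample data standing in for input from an external system, and the names ty_row, ty_total, lt_rows, and lt_totals.

```abap
REPORT zreduce_custom_sketch.

" Hypothetical row type: obtyp/stsma as in the article, amount is invented
TYPES: BEGIN OF ty_row,
         obtyp  TYPE c LENGTH 3,
         stsma  TYPE c LENGTH 8,
         amount TYPE i,
       END OF ty_row,
       tt_row TYPE STANDARD TABLE OF ty_row WITH EMPTY KEY,
       BEGIN OF ty_total,
         obtyp TYPE c LENGTH 3,
         stsma TYPE c LENGTH 8,
         total TYPE i,
       END OF ty_total,
       tt_total TYPE STANDARD TABLE OF ty_total WITH EMPTY KEY.

" Sample data, as it might arrive from an HTTP request or an IDoc
DATA(lt_rows) = VALUE tt_row(
  ( obtyp = 'TGP' stsma = ''        amount = 5 )
  ( obtyp = 'COH' stsma = 'CRMLEAD' amount = 7 )
  ( obtyp = 'TGP' stsma = ''        amount = 1 )
  ( obtyp = 'COH' stsma = 'CRMLEAD' amount = 3 ) ).

" Outer REDUCE builds the result table, inner REDUCE holds the custom logic
DATA(lt_totals) = REDUCE tt_total(
  INIT result = VALUE tt_total( )
  FOR GROUPS <group> OF <row> IN lt_rows
      GROUP BY ( obtyp = <row>-obtyp stsma = <row>-stsma )
      ASCENDING
  NEXT result = VALUE #( BASE result
                         ( obtyp = <group>-obtyp
                           stsma = <group>-stsma
                           total = REDUCE i( INIT sum = 0
                                             FOR <member> IN GROUP <group>
                                             NEXT sum = sum + <member>-amount ) ) ) ).
```

The only change from plain counting is the expression after the inner NEXT, which adds the member's amount rather than 1. Any other per-member calculation can take its place.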

Performance evaluation of the three solutions

I wrote a simple report for performance evaluation:

DATA: lt_status TYPE zcl_status_calc_tool=>tt_raw_input.

SELECT * INTO TABLE lt_status FROM crm_jsto.

DATA(lo_tool) = NEW zcl_status_calc_tool( ).

zcl_abap_benchmark_tool=>start_timer( ).
DATA(lt_result1) = lo_tool->get_result_traditional_way( lt_status ).
zcl_abap_benchmark_tool=>stop_timer( ).

zcl_abap_benchmark_tool=>start_timer( ).
lo_tool = REDUCE #(
  INIT o          = lo_tool
       local_item = VALUE zcl_status_calc_tool=>ty_status_result( )
  FOR GROUPS <group_key> OF <wa> IN lt_status
      GROUP BY ( obtyp = <wa>-obtyp stsma = <wa>-stsma )
      ASCENDING
  NEXT local_item = VALUE #( obtyp = <group_key>-obtyp
                             stsma = <group_key>-stsma
                             count = REDUCE i( INIT sum = 0
                                               FOR m IN GROUP <group_key>
                                               NEXT sum = sum + 1 ) )
       o          = o->add_result( local_item ) ).

DATA(lt_result2) = lo_tool->get_result( ).
zcl_abap_benchmark_tool=>stop_timer( ).

ASSERT lt_result1 = lt_result2.
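The class zcl_abap_benchmark_tool is not shown in this text. A minimal timer with the same two static methods could be sketched as a local class built on GET RUN TIME, which reports elapsed microseconds. The class name lcl_benchmark_tool, the attribute gv_start, and the returned value are assumptions.

```abap
CLASS lcl_benchmark_tool DEFINITION.
  PUBLIC SECTION.
    CLASS-METHODS start_timer.
    CLASS-METHODS stop_timer
      RETURNING VALUE(rv_microseconds) TYPE i.
  PRIVATE SECTION.
    CLASS-DATA gv_start TYPE i.
ENDCLASS.

CLASS lcl_benchmark_tool IMPLEMENTATION.
  METHOD start_timer.
    " Remember the runtime counter at the start of the measurement
    GET RUN TIME FIELD gv_start.
  ENDMETHOD.

  METHOD stop_timer.
    DATA lv_stop TYPE i.
    GET RUN TIME FIELD lv_stop.
    rv_microseconds = lv_stop - gv_start.
    WRITE: / |Elapsed: { rv_microseconds } microseconds|.
  ENDMETHOD.
ENDCLASS.
```

A single run is only a rough indication, so repeating each measurement several times gives a more reliable comparison.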

The test data are as follows:


The performance of these three solutions decreases in that order, while their range of application and flexibility increase in the same order.

On the ABAP test server the author works with, the LOOP AT ... GROUP BY ... GROUP SIZE solution processes the 550,000 records in 0.3 seconds, while REDUCE takes 0.8 seconds. The two solutions perform within the same order of magnitude.

Summary

MapReduce is a programming model, with an associated implementation, for generating and processing large data sets with parallel, distributed algorithms on a cluster. ABAP supports the reduce operation at the language level. This article shares a practical case from the author's work that applies the MapReduce idea to a large data set, and compares it with two more traditional solutions. With performance in the same order of magnitude as the traditional solution, the REDUCE-based solution has a wider range of application and is more extensible. I hope it inspires you when you tackle similar problems in ABAP. Thank you for reading.
