Notes on parallel streams in Java 8

Time:2022-1-25

Java 8 parallel stream considerations

When I first used parallel streams, a list query occasionally threw a NullPointerException, which puzzled me.

The code is as follows:

List<OrderListVO> orderListVOS = new LinkedList<OrderListVO>();

baseOrderBillList.parallelStream().forEach(baseOrderBill -> {
   OrderListVO orderListVO = new OrderListVO();
   //Set properties in order

   orderListVO.setOrderbillgrowthid(baseOrderBill.getOrderbillgrowthid());
   orderListVO.setOrderbillid(baseOrderBill.getOrderbillid());
   ……
   orderListVOS.add(orderListVO);
});

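The data is split across multiple tables and then assembled in the business layer. A parallel stream can speed up this purely CPU-intensive assembly step. By default, parallelStream() runs on the shared common ForkJoinPool, whose size is derived from the number of CPU cores on the server (by default, the number of available processors minus one, with the calling thread also taking part in the work).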

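Because the stream is parallel, multiple threads call add() on the orderListVOS container concurrently, but LinkedList is not thread-safe. Unsynchronized concurrent writes can corrupt the list's internal state, for example by losing elements or leaving null entries, which is consistent with the occasional NullPointerException seen when the list is read later.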

After modification:


List<OrderListVO> orderListVOS = Collections
.synchronizedList(new LinkedList<OrderListVO>());

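With this change, every element is added safely and the expected result is obtained. Note that forEach on a parallel stream does not preserve encounter order, so the order of elements in the list is not deterministic.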
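The fix above can be tried in isolation with a minimal sketch like the following. The class name SynchronizedListDemo and the helper copyInParallel are hypothetical, and Integer stands in for OrderListVO, which is not shown in full here.

```java
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SynchronizedListDemo {

    // Hypothetical helper: copies every element of `source` into a
    // synchronized list from a parallel stream and returns that list.
    static List<Integer> copyInParallel(List<Integer> source) {
        List<Integer> target = Collections.synchronizedList(new LinkedList<Integer>());
        // Each add() call is guarded by the wrapper's internal lock.
        source.parallelStream().forEach(target::add);
        return target;
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.range(0, 10_000)
                .boxed()
                .collect(Collectors.toList());
        List<Integer> copy = copyInParallel(source);
        // No element is lost, but the order of elements is not deterministic.
        System.out.println(copy.size());
    }
}
```

Swapping the synchronized wrapper for a plain LinkedList in this sketch is one way to try to reproduce the original problem, although a race like this may not show up on every run.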

In addition, streams come with a built-in terminal aggregation method, collect:


List<OrderListVO> sortedOrderListVOS = orderListVOS.parallelStream()
                .sorted(Comparator.comparing(OrderListVO::getCreatetime).reversed())
                .collect(Collectors.toList());

The collect(Collectors.toList()) call gathers the data once the operation is done. It is safe to use with a parallel stream: each thread accumulates into its own intermediate container and the partial results are merged afterwards, so the final result is correct.
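As a small runnable illustration of the same pattern, here is a sketch that sorts in descending order on a parallel stream and lets collect build the result. The class CollectDemo and the helper sortDescending are hypothetical names, and plain integers stand in for the createtime field.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class CollectDemo {

    // Hypothetical helper: sorts in descending order on a parallel stream and
    // lets collect() build the result, so no shared list is mutated by hand.
    static List<Integer> sortDescending(List<Integer> values) {
        return values.parallelStream()
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [9, 6, 5, 4, 3, 2, 1, 1]
        System.out.println(sortDescending(Arrays.asList(3, 1, 4, 1, 5, 9, 2, 6)));
    }
}
```

Unlike forEach with a shared list, the collected list keeps the sorted order even though the stream is parallel.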

Correct usage of parallelStream() in Java 8

1. Because the stream is parallel, any shared data structure it writes to must be thread-safe, for example:

listByPage.parallelStream().forEach(str-> {
           //Using thread safe data structures
           //ConcurrentHashMap
           //CopyOnWriteArrayList
           //And so on
        });
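A minimal sketch of the two structures named in the comments above might look like this. The class ThreadSafeStructuresDemo and the helpers countByLength and nonEmpty are hypothetical and only serve as an illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class ThreadSafeStructuresDemo {

    // Hypothetical helper: counts how many strings have each length.
    static Map<Integer, Integer> countByLength(List<String> items) {
        Map<Integer, Integer> counts = new ConcurrentHashMap<>();
        // merge() on ConcurrentHashMap is atomic, so concurrent updates are not lost.
        items.parallelStream()
                .forEach(str -> counts.merge(str.length(), 1, Integer::sum));
        return counts;
    }

    // Hypothetical helper: keeps the non-empty strings (order not guaranteed).
    static List<String> nonEmpty(List<String> items) {
        List<String> result = new CopyOnWriteArrayList<>();
        items.parallelStream()
                .filter(str -> !str.isEmpty())
                .forEach(result::add);
        return result;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "bb", "cc", "", "ddd");
        System.out.println(countByLength(items));
        System.out.println(nonEmpty(items).size());
    }
}
```

CopyOnWriteArrayList copies its backing array on every write, so it tends to suit read-heavy cases better than large write-heavy loops.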

2. Use the default parallel stream primarily for CPU-intensive computation

Some people may ask: isn't it also effective to run I/O-intensive operations, such as HTTP requests, in parallel?

The default parallel stream uses a global thread pool (the common ForkJoinPool), and its number of threads is set according to the number of CPU cores. If one operation ties up those threads, for example by blocking on I/O, it affects every other operation in the application that uses a parallel stream.
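To check how large that shared pool is on a given machine, a short sketch such as the following can help. The class name CommonPoolInfo and its helper are hypothetical; ForkJoinPool.getCommonPoolParallelism() is the standard JDK method that reports the value.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolInfo {

    // Hypothetical helper: returns the target parallelism of the common pool.
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("Available processors:    "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("Common pool parallelism: " + commonPoolParallelism());
    }
}
```

The value can also be overridden at JVM startup with the system property java.util.concurrent.ForkJoinPool.common.parallelism, but that changes it for every parallel stream in the process.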

Therefore, a compromise is to run the parallel stream operation inside a custom thread pool:


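ForkJoinPool forkJoinPool = new ForkJoinPool(10);
// execute() is asynchronous: it returns before the stream has finished
forkJoinPool.execute(() -> {
    listByPage.parallelStream().forEach(str -> {
        // Process each element here
    });
});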
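When the caller needs the result, or needs to know when the work has finished, a variant along these lines may be more convenient. It is only a sketch: the class CustomPoolDemo and the helper totalLength are hypothetical. It also relies on the commonly observed behavior that a parallel stream started from inside a ForkJoinPool task runs in that pool, which the Stream documentation does not guarantee.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;

class CustomPoolDemo {

    // Hypothetical helper: sums string lengths with a parallel stream started
    // from inside a dedicated pool, and waits for the result.
    static int totalLength(List<String> items, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            Callable<Integer> task = () ->
                    items.parallelStream().mapToInt(String::length).sum();
            // submit() returns a ForkJoinTask; join() blocks until it completes.
            return pool.submit(task).join();
        } finally {
            // Release the pool's worker threads once the work is done.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints 6
        System.out.println(totalLength(Arrays.asList("a", "bb", "ccc"), 4));
    }
}
```

Creating a pool per call keeps the sketch short; in a long-running service it is usually more sensible to create the pool once and reuse it.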

The above is my personal experience. I hope it serves as a useful reference, and I hope you will support developpaer.