A Detailed Look at the Performance of Java Traversal Mechanisms

Time:2020-1-10

Motivation

Recently, while solving the Lemonade Change problem on LeetCode, I noticed that the for loop and the foreach loop did not take the same amount of time: in my submission records, one took about twice as long as the other.

In general, most business logic relies on some traversal mechanism. We pay attention to the performance of data structure operations, but we tend to overlook the performance differences between traversal mechanisms. I meant to start writing this two days ago, but procrastinated.

Main text

As far as I know at this point, Java has three traversal mechanisms:

  • for loop
  • foreach loop
  • Iterator loop
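For reference, here is a minimal sketch of the three forms side by side. The class name TraversalForms and the sum-based methods are illustrative assumptions, not code from the tests further down.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; the name and the sum-based methods are illustrative only.
class TraversalForms {

    // 1. Indexed for loop: fetches each element by position.
    static long sumWithFor(List<Integer> list) {
        long sum = 0;
        for (int i = 0, size = list.size(); i < size; i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // 2. foreach (enhanced for) loop: no index and no explicit iterator in the source.
    static long sumWithForeach(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    // 3. Explicit Iterator loop.
    static long sumWithIterator(List<Integer> list) {
        long sum = 0;
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, 1, 4, 1, 5);
        System.out.println(sumWithFor(data));      // 14
        System.out.println(sumWithForeach(data));  // 14
        System.out.println(sumWithIterator(data)); // 14
    }
}
```

All three visit the same elements in the same order; they differ only in how the next element is obtained.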

Java has a great many data structures, but most of them wrap a basic data structure. For example, HashMap relies on a Node array, LinkedList is a linked list underneath, and ArrayList is a wrapper around an array.

In summary, Java has two basic data structures:

  • Array
  • Linked list

If you add hashing (which operates differently from arrays and linked lists), there are three kinds.

Because everyday development mostly uses the wrapped data structures, I will use

  • ArrayList (wrapped array)
  • LinkedList (wrapped linked list)
  • HashSet (wrapped hash table)

to compare how long these three data structures take under the different traversal mechanisms.

Some people may wonder why I do not test HashMap. In Java's design, Map was implemented first and Set was built on top of it. If you have read the source code, you will find that Set implementations such as HashSet and TreeSet each hold a corresponding Map as a field and delegate to it. Because hash lookup has O(1) time complexity, fetching the value once the key has been found costs roughly the same, so I do not test HashMap separately.

Digression

When I read Crazy Java, it said that the designers of Java implemented Set by setting the value in each entry of a Map's internal entry array to null. I have not checked this against the source code or the official documentation, so I do not know whether it is accurate. But because the keys in a hash table are distinct and the elements of a set are distinct, I think the idea is sound.
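The idea of a set living on the keys of a map can be tried with the standard library's Collections.newSetFromMap. The sketch below is only an illustration of that idea, and SetFromMapDemo is a hypothetical name. It is not how HashSet itself is written: in the OpenJDK source, HashSet delegates to an internal HashMap and stores one shared placeholder object, rather than null, as every value.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Set;

// Hypothetical demo class showing a Set whose storage is a Map.
class SetFromMapDemo {

    // Builds a Set backed by a HashMap; the map's keys are the set's elements.
    static Set<Integer> newMapBackedSet() {
        return Collections.newSetFromMap(new HashMap<Integer, Boolean>());
    }

    public static void main(String[] args) {
        Set<Integer> set = newMapBackedSet();
        System.out.println(set.add(7)); // true: 7 was not present yet
        System.out.println(set.add(7)); // false: keys are unique, so the duplicate is rejected
        System.out.println(set.size()); // 1
    }
}
```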

To keep the test fair, I apply the following constraints.

Each data structure is tested at three sizes, one per order of magnitude:

  • 10
  • 100
  • 1000

All elements are random numbers.

Each traversal prints the value of the current element.

Note: The timings depend on the local environment, so the measurements may vary from run to run, but the overall proportions should hold.
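For readers who want to reduce that environmental noise, one common approach is to keep printing out of the timed region, run some untimed warm-up rounds first, and average many rounds measured with System.nanoTime. The sketch below assumes that approach. TimingSketch and its method names are hypothetical, and this is not the code that produced the numbers in this article. For serious measurements, a benchmark harness such as JMH is the usual tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not the code used for the measurements in this article.
class TimingSketch {

    // Written to after every traversal so the result is used and the loop
    // is less likely to be optimized away.
    static volatile long sink;

    // Traverses the list with foreach and returns the sum of its elements.
    static long traverse(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    // Average nanoseconds per traversal over `rounds` timed runs,
    // after `warmup` untimed runs.
    static long averageNanos(List<Integer> list, int warmup, int rounds) {
        for (int i = 0; i < warmup; i++) {
            sink = traverse(list);
        }
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink = traverse(list);
        }
        return (System.nanoTime() - start) / rounds;
    }

    public static void main(String[] args) {
        Random random = new Random();
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            list.add(random.nextInt());
        }
        System.out.println("foreach for 1000: " + averageNanos(list, 10_000, 10_000) + "ns");
    }
}
```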

Comparison of ArrayList

Code


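import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Times for, foreach, and iterator traversal over ArrayLists of 10, 100, and 1000 elements.
public class TextArray {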

  private static Random random;

  private static List<Integer> list1;

  private static List<Integer> list2;

  private static List<Integer> list3;

  public static void execute(){
    random=new Random();
    initArray();
    testForWith10Object();
    testForEachWith10Object();
    testIteratorWith10Object();
    testForWith100Object();
    testForEachWith100Object();
    testIteratorWith100Object();
    testForWith1000Object();
    testForEachWith1000Object();
    testIteratorWith1000Object();
  }

  private static void testForWith10Object(){
    printFor(list1);
  }

  private static void testForWith100Object(){
    printFor(list2);
  }

  private static void testForWith1000Object(){
    printFor(list3);
  }

  private static void testForEachWith10Object(){
    printForeach(list1);
  }

  private static void testForEachWith100Object(){
    printForeach(list2);
  }

  private static void testForEachWith1000Object(){
    printForeach(list3);
  }

  private static void testIteratorWith10Object() {
    printIterator(list1);
  }

  private static void testIteratorWith100Object() {
    printIterator(list2);
  }

  private static void testIteratorWith1000Object() {
    printIterator(list3);
  }

  // Indexed for loop; the timed region includes printing each element.
  private static void printFor(List<Integer> list){
    System.out.println();
    System.out.print("data:");
    long start=System.currentTimeMillis();
    for(int i=0,length=list.size();i<length;i++){
      System.out.print(list.get(i)+" ");
    }
    System.out.println();
    long end=System.currentTimeMillis();
    System.out.println("for for "+list.size()+":"+(end-start)+"ms");
  }

  // foreach loop; the timed region includes printing each element.
  private static void printForeach(List<Integer> list){
    System.out.println();
    System.out.print("data:");
    long start=System.currentTimeMillis();
    for(int temp:list){
      System.out.print(temp+" ");
    }
    System.out.println();
    long end=System.currentTimeMillis();
    System.out.println("foreach for "+list.size()+":"+(end-start)+"ms");
  }

  // Explicit Iterator loop; the iterator is created before the timer starts.
  private static void printIterator(List<Integer> list){
    System.out.println();
    System.out.print("data:");
    Iterator<Integer> it=list.iterator();
    long start=System.currentTimeMillis();
    while(it.hasNext()){
      System.out.print(it.next()+" ");
    }
    System.out.println();
    long end=System.currentTimeMillis();
    System.out.println("iterator for "+list.size()+":"+(end-start)+"ms");
  }

  private static void initArray(){
    list1=new ArrayList<>();
    list2=new ArrayList<>();
    list3=new ArrayList<>();
    for(int i=0;i<10;i++){
      list1.add(random.nextInt());
    }
    for(int i=0;i<100;i++){
      list2.add(random.nextInt());
    }
    for(int i=0;i<1000;i++){
      list3.add(random.nextInt());
    }
  }
}

Output (printed element values omitted)

for for 10:1ms
foreach for 10:0ms
iterator for 10:2ms

for for 100:5ms
foreach for 100:4ms
iterator for 100:12ms

for for 1000:33ms
foreach for 1000:7ms
iterator for 1000:16ms

Mechanism   10     100     1000
for         1ms    5ms     33ms
foreach     0ms    4ms     7ms
Iterator    2ms    12ms    16ms

Conclusion

The for loop is the least stable (its cost grows fastest as the size increases), foreach comes second, and the iterator is the best.

Recommendations for use

  • When the data volume is unknown (it might be 10,000, 100,000, or something else), use the iterator for traversal
  • Prefer foreach when the data volume is known and small
  • When an index is needed, keeping an incrementing counter variable alongside the traversal is cheaper than an indexed for loop
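The last recommendation can be sketched as follows: traverse with foreach and maintain the index in a local counter. IndexedForeach and label are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: foreach traversal with a manually maintained index.
class IndexedForeach {

    // Formats each element as "index:value" using foreach plus a counter.
    static List<String> label(List<Integer> list) {
        List<String> out = new ArrayList<>();
        int index = 0;
        for (int value : list) {
            out.add(index + ":" + value);
            index++; // advance the counter once per element
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(label(Arrays.asList(10, 20, 30))); // [0:10, 1:20, 2:30]
    }
}
```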

Comparison of LinkedList

Code


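import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;

// Times for, foreach, and iterator traversal over LinkedLists of 10, 100, and 1000 elements.
public class TextLinkedList {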

  private static Random random;

  private static List<Integer> list1;

  private static List<Integer> list2;

  private static List<Integer> list3;

  public static void execute(){
    random=new Random();
    initList();
    testForWith10Object();
    testForEachWith10Object();
    testIteratorWith10Object();
    testForWith100Object();
    testForEachWith100Object();
    testIteratorWith100Object();
    testForWith1000Object();
    testForEachWith1000Object();
    testIteratorWith1000Object();
  }

  private static void testForWith10Object() {
    printFor(list1);
  }

  private static void testForEachWith10Object() {
    printForeach(list1);
  }

  private static void testIteratorWith10Object() {
    printIterator(list1);
  }

  private static void testForWith100Object() {
    printFor(list2);
  }

  private static void testForEachWith100Object() {
    printForeach(list2);
  }

  private static void testIteratorWith100Object() {
    printIterator(list2);
  }

  private static void testForWith1000Object() {
    printFor(list3);
  }

  private static void testForEachWith1000Object() {
    printForeach(list3);
  }

  private static void testIteratorWith1000Object() {
    printIterator(list3);
  }

  // Indexed for loop; on a LinkedList, each get(i) walks the list from one end to reach position i.
  private static void printFor(List<Integer> list){
    System.out.println();
    System.out.print("data:");
    long start=System.currentTimeMillis();
    for(int i=0,size=list.size();i<size;i++){
      System.out.print(list.get(i)+" ");
    }
    System.out.println();
    long end=System.currentTimeMillis();
    System.out.println("for for "+list.size()+":"+(end-start)+"ms");
  }

  private static void printForeach(List<Integer> list){
    System.out.println();
    System.out.print("data:");
    long start=System.currentTimeMillis();
    for(int temp:list){
      System.out.print(temp+" ");
    }
    System.out.println();
    long end=System.currentTimeMillis();
    System.out.println("foreach for "+list.size()+":"+(end-start)+"ms");
  }

  private static void printIterator(List<Integer> list){
    System.out.println();
    System.out.print("data:");
    Iterator<Integer> it=list.iterator();
    long start=System.currentTimeMillis();
    while(it.hasNext()){
      System.out.print(it.next()+" ");
    }
    System.out.println();
    long end=System.currentTimeMillis();
    System.out.println("iterator for "+list.size()+":"+(end-start)+"ms");
  }


  private static void initList() {
    list1=new LinkedList<>();
    list2=new LinkedList<>();
    list3=new LinkedList<>();
    for(int i=0;i<10;i++){
      list1.add(random.nextInt());
    }
    for(int i=0;i<100;i++){
      list2.add(random.nextInt());
    }
    for(int i=0;i<1000;i++){
      list3.add(random.nextInt());
    }
  }
}

Output (printed element values omitted)

for for 10:0ms
foreach for 10:1ms
iterator for 10:0ms

for for 100:1ms
foreach for 100:0ms
iterator for 100:3ms

for for 1000:23ms
foreach for 1000:25ms
iterator for 1000:4ms

Mechanism   10     100     1000
for         0ms    1ms     23ms
foreach     1ms    0ms     25ms
Iterator    0ms    3ms     4ms

Conclusion

foreach is the least stable, followed by the for loop, and the iterator is the best.

Recommendations for use

  • Prefer traversing with the iterator
  • When an index is needed, keeping an incrementing counter variable alongside the traversal is cheaper than an indexed for loop
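On a LinkedList the indexed for loop is costly for a structural reason: every get(i) call has to walk the nodes from one end of the list to position i. When both the element and its index are needed, the standard ListIterator can supply the index without that walk. A minimal sketch, where LinkedListIndexDemo and indexOfFirst are hypothetical names:

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class: index-aware traversal of a LinkedList without get(i).
class LinkedListIndexDemo {

    // Returns the index of the first element equal to target, or -1 if absent.
    static int indexOfFirst(List<Integer> list, int target) {
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) {
            int index = it.nextIndex(); // index of the element that next() will return
            if (it.next() == target) {
                return index;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>(Arrays.asList(5, 8, 13));
        System.out.println(indexOfFirst(list, 13)); // 2
        System.out.println(indexOfFirst(list, 99)); // -1
    }
}
```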

Comparison of HashSet

Note: Because a hash-based collection is not traversed the same way as the other types (a Set has no index-based access), the for loop comparison is omitted.
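To make the reason concrete: the Set interface has no get(int) method, so an indexed loop over a HashSet would first need a copy into a List, which changes what is being measured. The sketch below shows that detour. SetIndexDemo and sumByIndex are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class: indexed access to a Set's elements requires a List copy.
class SetIndexDemo {

    // Copies the set into an ArrayList, then sums the copy with an indexed for loop.
    static long sumByIndex(Set<Integer> set) {
        List<Integer> copy = new ArrayList<>(set); // the copy itself traverses the set once
        long sum = 0;
        for (int i = 0, size = copy.size(); i < size; i++) {
            sum += copy.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>(Arrays.asList(1, 2, 3));
        System.out.println(sumByIndex(set)); // 6
    }
}
```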

Code


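import java.util.HashSet;
import java.util.Iterator;
import java.util.Random;
import java.util.Set;

// Times iterator and foreach traversal over HashSets of 10, 100, and 1000 elements.
public class TextHash {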

  private static Random random;

  private static Set<Integer> set1;

  private static Set<Integer> set2;

  private static Set<Integer> set3;

  public static void execute(){
    random=new Random();
    initHash();
    testIteratorWith10Object();
    testForEachWith10Object();
    testIteratorWith100Object();
    testForEachWith100Object();
    testIteratorWith1000Object();
    testForEachWith1000Object();
  }

  private static void testIteratorWith10Object() {
    printIterator(set1);
  }

  private static void testForEachWith10Object() {
    printForeach(set1);
  }

  private static void testIteratorWith100Object() {
    printIterator(set2);
  }

  private static void testForEachWith100Object() {
    printForeach(set2);
  }

  private static void testIteratorWith1000Object() {
    printIterator(set3);
  }

  private static void testForEachWith1000Object() {
    printForeach(set3);
  }

  private static void initHash() {
    set1=new HashSet<>();
    set2=new HashSet<>();
    set3=new HashSet<>();
    for(int i=0;i<10;i++){
      set1.add(random.nextInt());
    }
    for(int i=0;i<100;i++){
      set2.add(random.nextInt());
    }
    for(int i=0;i<1000;i++){
      set3.add(random.nextInt());
    }
  }

  // Explicit Iterator loop; unlike the list tests, the iterator is created inside the timed region.
  private static void printIterator(Set<Integer> data){
    System.out.println();
    System.out.print("data:");
    long start=System.currentTimeMillis();
    Iterator<Integer> it=data.iterator();
    while (it.hasNext()){
      System.out.print(it.next()+" ");
    }
    System.out.println();
    long end=System.currentTimeMillis();
    System.out.println("iterator for "+data.size()+":"+(end-start)+"ms");
  }

  private static void printForeach(Set<Integer> data){
    System.out.println();
    System.out.print("data:");
    long start=System.currentTimeMillis();
    for(int temp:data){
      System.out.print(temp+" ");
    }
    System.out.println();
    long end=System.currentTimeMillis();
    System.out.println("foreach for "+data.size()+":"+(end-start)+"ms");
  }
}

Output (printed element values omitted)

iterator for 10:0ms
foreach for 10:0ms

iterator for 100:6ms
foreach for 100:0ms

iterator for 1000:30ms
foreach for 1000:9ms

Mechanism   10     100     1000
foreach     0ms    0ms     9ms
Iterator    0ms    6ms     30ms

Conclusion

foreach is far ahead of the iterator in performance.

Recommendations for use

I will choose foreach from now on. It performs well and is convenient to write.

Summary

  • Of the three, the for loop performs worst, and its cost grows quickly. From now on, even when I need an index, I would rather keep a counter variable than use an indexed for loop.
  • The iterator performs best on arrays and linked lists, presumably because the Java designers optimized it. When response time matters (for example, in web responses), I will give it priority.
  • foreach falls between the two. I will use it where possible when I want concise code and timing is not critical.
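When weighing foreach against the iterator, it helps to know how closely the two are related: for any Iterable, the Java language defines the foreach loop in terms of iterator(), hasNext(), and next(). The sketch below makes that visible with a list that counts iterator() requests. CountingList and ForeachUsesIterator are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;

// Hypothetical list that counts how often iterator() is requested.
class CountingList extends ArrayList<Integer> {
    int iteratorCalls = 0;

    @Override
    public Iterator<Integer> iterator() {
        iteratorCalls++;
        return super.iterator();
    }
}

// Hypothetical demo class.
class ForeachUsesIterator {

    // Runs one foreach loop over the list and reports how many iterators it requested.
    static int iteratorCallsDuringForeach(CountingList list) {
        int before = list.iteratorCalls;
        long sum = 0;
        for (int value : list) {
            sum += value; // the body only needs to consume the element
        }
        return list.iteratorCalls - before;
    }

    public static void main(String[] args) {
        CountingList list = new CountingList();
        list.add(1);
        list.add(2);
        list.add(3);
        System.out.println(iteratorCallsDuringForeach(list)); // 1
    }
}
```

Because a foreach loop over a collection obtains exactly one iterator and then drives it, timing differences between the two forms in a short test are worth re-checking across several runs.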

The above is my comparison of common data structure traversal mechanisms. Although it’s only preliminary, I’ve learned a lot from it, and I hope you can get something from it.
