Android ANR: Principle Analysis and Solutions

Time:2021-11-25

1. ANR description and causes

1.1 Introduction

ANR stands for Application Not Responding, that is, the application is not responding.

1.2 Causes

In Android, ActivityManagerService (AMS) and WindowManagerService (WMS) monitor the response time of apps. If an app cannot respond to a screen touch or keyboard input within a specific time, or a specific event is not processed in time, an ANR occurs.

The following four conditions can cause an ANR:

  • InputDispatching Timeout: the app fails to respond to a screen touch event or keyboard input event within 5 seconds.
  • BroadcastQueue Timeout: a BroadcastReceiver's onReceive() does not finish within 10 seconds for a foreground broadcast, or within 60 seconds for a background broadcast.
  • Service Timeout: a foreground service fails to complete within 20 seconds, or a background service within 200 seconds.
  • ContentProvider Timeout: publishing the ContentProvider does not finish within 10 seconds.
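
As a quick reference, the four thresholds can be written down as a small lookup. This is only a sketch in plain Java: the enum name AnrTimeout and its methods are hypothetical, and the numbers are the ones quoted in this article, not values read from a device.

```java
import java.util.concurrent.TimeUnit;

/** ANR thresholds as quoted in this article (not read from a device or from AOSP). */
enum AnrTimeout {
    // Input dispatching has a single limit, so the same value is used for both columns.
    INPUT_DISPATCHING(5, 5),
    BROADCAST(10, 60),
    SERVICE(20, 200),
    CONTENT_PROVIDER_PUBLISH(10, 10);

    private final long foregroundSeconds;
    private final long backgroundSeconds;

    AnrTimeout(long foregroundSeconds, long backgroundSeconds) {
        this.foregroundSeconds = foregroundSeconds;
        this.backgroundSeconds = backgroundSeconds;
    }

    /** Threshold in milliseconds for the foreground or background case. */
    long millis(boolean foreground) {
        return TimeUnit.SECONDS.toMillis(foreground ? foregroundSeconds : backgroundSeconds);
    }

    /** True if the elapsed time is strictly greater than the threshold. */
    boolean isExceeded(long elapsedMillis, boolean foreground) {
        return elapsedMillis > millis(foreground);
    }

    public static void main(String[] args) {
        for (AnrTimeout t : values()) {
            System.out.println(t + ": foreground=" + t.millis(true)
                    + " ms, background=" + t.millis(false) + " ms");
        }
    }
}
```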

1.3 Avoidance

Avoid time-consuming operations on the main thread (UI thread) and move them to a worker thread instead.
For multithreading, see: "Android multithreading: summary of understanding and simple use".
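
The idea can be sketched in plain Java, without Android classes. In this sketch a single-thread executor named "main-sim" stands in for the UI thread (an assumption for illustration; on Android the delivery step would normally go through a Handler bound to the main Looper). The class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OffloadSketch {
    /** Simulates slow I/O or computation that must not run on the main thread. */
    static String slowLoad() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "loaded on " + Thread.currentThread().getName();
    }

    /** Runs slowLoad() on the worker and hands the result over on the simulated main thread. */
    static CompletableFuture<String> loadOffMain(ExecutorService worker, ExecutorService main) {
        return CompletableFuture.supplyAsync(OffloadSketch::slowLoad, worker)
                .thenApplyAsync(
                        r -> r + ", delivered on " + Thread.currentThread().getName(), main);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService main = Executors.newSingleThreadExecutor(r -> new Thread(r, "main-sim"));
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
        try {
            // The simulated main thread stays free while the worker sleeps.
            System.out.println(loadOffMain(worker, main).get(5, TimeUnit.SECONDS));
        } finally {
            worker.shutdown();
            main.shutdown();
        }
    }
}
```

The simulated main thread only runs the short delivery step, so it never blocks for the 200 ms that the load takes.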

2. ANR analysis methods

2.1 Reproducing an ANR

The test below was run on a Google Pixel XL (Android 8.0). A button jumps to ANRTestActivity, whose onCreate() makes the main thread sleep for 20 seconds:

@Override
protected void onCreate(@Nullable Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_anr_test);
    // SystemClock.sleep() is the sleep function provided by Android. Its biggest difference
    // from Thread.sleep() is that it does not throw InterruptedException.
    SystemClock.sleep(20 * 1000);
}

After entering ANRTestActivity, the screen stays black for about seven or eight seconds, and then the ANR dialog finally pops up.

2.2 ANR analysis method 1: log

After triggering the ANR, check the log in logcat.

Logcat clearly records the time of the ANR, the tid of the thread, and a one-line reason: WaitingInMainSignalCatcherLoop, i.e. the main thread is stuck waiting.
The last line, "The application may be doing too much work on its main thread.", indicates that too much work may have been done on the main thread.

2.3 ANR analysis method 2: traces.txt

The second line of the log above, Wrote stack traces to '/data/anr/traces.txt', indicates that the ANR information has been written to the traces.txt file. Use adb to export this file from the phone:

  1. cd to the directory containing adb.exe, that is, the platform-tools directory of the Android SDK, for example:
cd D:\Android\AndroidSdk\platform-tools

Besides the Windows command prompt (CMD), you can also enter adb commands in the Terminal of Android Studio.

  2. From that directory, run the following adb command to export the traces.txt file:
adb pull /data/anr/traces.txt

By default, traces.txt is exported to the \platform-tools directory of the Android SDK. The file usually contains many records, so you need to find the relevant ones when analyzing.

----- pid 23346 at 2017-11-07 11:33:57 -----  -----> process ID and time of the ANR
Cmd line: com.sky.myjavatest
Build fingerprint: 'google/marlin/marlin:8.0.0/OPR3.170623.007/4286350:user/release-keys'
ABI: 'arm64'
Build type: optimized
Zygote loaded classes=4681 post zygote classes=106
Intern table: 42675 strong; 137 weak
JNI: CheckJNI is on; globals=526 (plus 22 weak)
Libraries: /system/lib64/libandroid.so /system/lib64/libcompiler_rt.so 
/system/lib64/libjavacrypto.so
/system/lib64/libjnigraphics.so /system/lib64/libmedia_jni.so /system/lib64/libsoundpool.so
/system/lib64/libwebviewchromium_loader.so libjavacore.so libopenjdk.so (9)
Heap: 22% free, 1478KB/1896KB; 21881 objects  -----> memory usage

...

"main" prio=5 tid=1 Sleeping  -----> the thread state is Sleeping
  | group="main" sCount=1 dsCount=0 flags=1 obj=0x733d0670 self=0x74a4abea00
  | sysTid=23346 nice=-10 cgrp=default sched=0/0 handle=0x74a91ab9b0
  | state=S schedstat=( 391462128 82838177 354 ) utm=33 stm=4 core=3 HZ=100
  | stack=0x7fe6fac000-0x7fe6fae000 stackSize=8MB
  | held mutexes=
  at java.lang.Thread.sleep(Native method)
  - sleeping on <0x053fd2c2> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:373)
  - locked <0x053fd2c2> (a java.lang.Object)
  at java.lang.Thread.sleep(Thread.java:314)
  at android.os.SystemClock.sleep(SystemClock.java:122)
  at com.sky.myjavatest.ANRTestActivity.onCreate(ANRTestActivity.java:20)  -----> the package name and the exact line that caused the ANR
  at android.app.Activity.performCreate(Activity.java:6975)
  at android.app.Instrumentation.callActivityOnCreate(Instrumentation.java:1213)
  at android.app.ActivityThread.performLaunchActivity(ActivityThread.java:2770)
  at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2892)
  at android.app.ActivityThread.-wrap11(ActivityThread.java:-1)
  at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1593)
  at android.os.Handler.dispatchMessage(Handler.java:105)
  at android.os.Looper.loop(Looper.java:164)
  at android.app.ActivityThread.main(ActivityThread.java:6541)
  at java.lang.reflect.Method.invoke(Native method)
  at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:240)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:767)

Searching the file for the package name (Ctrl + F) quickly locates the relevant code.
The trace above shows:

  • Process ID and package name: pid 23346, com.sky.myjavatest
  • Cause of the ANR: Sleeping
  • Line that caused the ANR: ANRTestActivity.java:20, i.e. line 20 of the class

Special note: when a new ANR occurs, the existing traces.txt file is overwritten.
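
The manual lookup above can also be scripted. The following is a minimal sketch in plain Java that pulls the same three facts out of a trace, assuming the usual layout (a "----- pid N at ..." header and a "main" prio=... tid=... State line). The class TraceSummary and its methods are hypothetical helpers, not part of any Android tool.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TraceSummary {
    private static final Pattern PID_HEADER =
            Pattern.compile("^-+\\s*pid\\s+(\\d+)\\s+at\\s");
    private static final Pattern MAIN_THREAD =
            Pattern.compile("^\"main\"\\s+prio=\\d+\\s+tid=\\d+\\s+(\\w+)");

    /** Returns the pid from the first process header, or -1 if none is found. */
    static int pid(List<String> lines) {
        for (String line : lines) {
            Matcher m = PID_HEADER.matcher(line.trim());
            if (m.find()) return Integer.parseInt(m.group(1));
        }
        return -1;
    }

    /** Returns the state word that follows tid=... on the "main" thread line. */
    static Optional<String> mainThreadState(List<String> lines) {
        for (String line : lines) {
            Matcher m = MAIN_THREAD.matcher(line.trim());
            if (m.find()) return Optional.of(m.group(1));
        }
        return Optional.empty();
    }

    /** Returns the first frame of the main thread that belongs to the given package. */
    static Optional<String> firstAppFrameOfMain(List<String> lines, String packageName) {
        String prefix = "at " + packageName + ".";
        boolean inMain = false;
        for (String line : lines) {
            String t = line.trim();
            if (MAIN_THREAD.matcher(t).find()) {
                inMain = true;
                continue;
            }
            if (inMain) {
                if (t.isEmpty()) break; // a blank line ends the thread block
                if (t.startsWith(prefix)) return Optional.of(t.substring("at ".length()));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "----- pid 23346 at 2017-11-07 11:33:57 -----",
                "Cmd line: com.sky.myjavatest",
                "",
                "\"main\" prio=5 tid=1 Sleeping",
                "  at java.lang.Thread.sleep(Native method)",
                "  at android.os.SystemClock.sleep(SystemClock.java:122)",
                "  at com.sky.myjavatest.ANRTestActivity.onCreate(ANRTestActivity.java:20)",
                "  at android.app.Activity.performCreate(Activity.java:6975)");
        System.out.println("pid=" + pid(lines));
        System.out.println("state=" + mainThreadState(lines).orElse("?"));
        System.out.println("frame=" + firstAppFrameOfMain(lines, "com.sky.myjavatest").orElse("?"));
    }
}
```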

2.4 ANR analysis method 3: Java thread call analysis

The JDK provides commands that help analyze and debug Java applications, for example:

jstack {pid}

The pid can be obtained with the jps command, which lists all Java virtual machine processes running on the current system, for example:

7266 Test
7267 Jps

For details, see: "Android application ANR analysis", Part 4, Section 1.
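
To get a feel for what such a thread dump contains, a similar listing can be produced from inside a running JVM with Thread.getAllStackTraces(). This is only an illustrative sketch: the class name is hypothetical and the output format is simplified, it is not the exact jstack format.

```java
import java.util.Map;

class StackDumpSketch {
    /** Builds a simplified, jstack-like listing of every live thread in this JVM. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" prio=").append(t.getPriority())
              .append(" state=").append(t.getState())
              .append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("  at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

A thread that is stuck shows up with a state such as TIMED_WAITING or BLOCKED, and its top frames point at the blocking call, which is the same way the traces.txt output above is read.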

2.5 ANR analysis method 4: analyzing ANR problems with DDMS

  • Use the DDMS "Update Threads" tool
  • Read the output of Update Threads

For details, see: "Android application ANR analysis", Part 4, Section 2.

3. Causes of ANR and solutions

The example above is only an ANR caused by a simple time-consuming operation on the main thread. There are many other causes of ANR:

  • Main thread blocked, or data read on the main thread

Solution: avoid deadlocks and use worker threads for time-consuming or blocking tasks. Avoid querying providers where possible, and do not overuse SharedPreferences.

  • CPU fully loaded, I/O blocked

Solution: perform file reads/writes and database operations asynchronously on a worker thread.

  • Insufficient memory

Solution: in AndroidManifest.xml you can set android:largeHeap="true" on the <application> element to increase the memory available to the app. However, this method is not recommended; fundamentally preventing memory leaks and optimizing memory use is the right approach.

  • ANRs in the major components

Time-consuming operations should also be avoided in the lifecycle callbacks of the major components. Note that a BroadcastReceiver's onReceive(), a background Service, and a ContentProvider should not run tasks for too long.

4. ANR source code analysis

Special statement: the article "Understand the triggering principle of Android ANR" documents the ANRs caused by Service, BroadcastReceiver and ContentProvider. The code below is quoted from it and summarized according to my own simple understanding.

4.1 Service Timeout caused by Service

A Service Timeout is triggered when AMS.MainHandler in the "ActivityManager" thread receives the SERVICE_TIMEOUT_MSG message.

4.1.1 Sending the delayed message

When the Service process attaches to the system_server process, realStartServiceLocked is called, which then calls mAm.mHandler.sendMessageAtTime() to send a delayed message. The delay is a defined constant, for example 20 seconds for a foreground Service. When the delay expires, AMS.MainHandler in the ActivityManager thread receives SERVICE_TIMEOUT_MSG and the timeout is triggered.

AS.realStartServiceLocked

ActiveServices.java

private final void realStartServiceLocked(ServiceRecord r,
        ProcessRecord app, boolean execInFg) throws RemoteException {
    ...
    // Send the delayed message (SERVICE_TIMEOUT_MSG)
    bumpServiceExecutingLocked(r, execInFg, "create");
    try {
        ...
        // Eventually runs the service's onCreate() method
        app.thread.scheduleCreateService(r, r.serviceInfo,
                mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
                app.repProcState);
    } catch (DeadObjectException e) {
        mAm.appDiedLocked(app);
        throw e;
    } finally {
        ...
    }
}

AS.bumpServiceExecutingLocked

private final void bumpServiceExecutingLocked(ServiceRecord r, boolean fg, String why) {
    ... 
    scheduleServiceTimeoutLocked(r.app);
}

void scheduleServiceTimeoutLocked(ProcessRecord proc) {
    if (proc.executingServices.size() == 0 || proc.thread == null) {
        return;
    }
    long now = SystemClock.uptimeMillis();
    Message msg = mAm.mHandler.obtainMessage(
            ActivityManagerService.SERVICE_TIMEOUT_MSG);
    msg.obj = proc;

    // If SERVICE_TIMEOUT_MSG has not been removed when the timeout expires, the service timeout flow runs
    mAm.mHandler.sendMessageAtTime(msg,
        proc.execServicesFg ? (now+SERVICE_TIMEOUT) : (now+ SERVICE_BACKGROUND_TIMEOUT));
}

4.1.2 Entering the main thread of the target process to create the Service

Through a chain of calls via Binder and so on, execution enters the main thread of the target process and reaches handleCreateService(CreateServiceData data).

ActivityThread.java

private void handleCreateService(CreateServiceData data) {
        ...
        java.lang.ClassLoader cl = packageInfo.getClassLoader();
        Service service = (Service) cl.loadClass(data.info.name).newInstance();
        ...

        try {
            // Create the ContextImpl object
            ContextImpl context = ContextImpl.createAppContext(this, packageInfo);
            context.setOuterContext(service);
            // Create the Application object
            Application app = packageInfo.makeApplication(false, mInstrumentation);
            service.attach(context, this, data.info.name, data.token, app,
                    ActivityManagerNative.getDefault());
            // Call the service's onCreate() method
            service.onCreate();

            // Cancel the delayed message in AMS.MainHandler
            ActivityManagerNative.getDefault().serviceDoneExecuting(
                    data.token, SERVICE_DONE_EXECUTING_ANON, 0, 0);
        } catch (Exception e) {
            ...
        }
    }

This method creates the target Service object and calls back the familiar onCreate() method of the Service. It then calls serviceDoneExecuting() to return to system_server and cancel the delayed message in AMS.MainHandler.

4.1.3 Returning to system_server to cancel the delayed message in AMS.MainHandler

AS.serviceDoneExecutingLocked

private void serviceDoneExecutingLocked(ServiceRecord r, boolean inDestroying,
            boolean finishing) {
    ...
    if (r.executeNesting <= 0) {
        if (r.app != null) {
            r.app.execServicesFg = false;
            r.app.executingServices.remove(r);
            if (r.app.executingServices.size() == 0) {
                // No service is still executing in the process that hosts this service
                mAm.mHandler.removeMessages(ActivityManagerService.SERVICE_TIMEOUT_MSG, r.app);
        ...
    }
    ...
}

In this method, once the service logic has finished, the previously sent delayed message SERVICE_TIMEOUT_MSG is removed. If this method is not called before the timeout expires, SERVICE_TIMEOUT_MSG is delivered and the ANR is reported.
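
The whole "post a delayed message, do the work, remove the message" pattern of section 4.1 can be imitated in plain Java. The sketch below is not the AOSP implementation: a ScheduledExecutorService stands in for AMS.MainHandler, and the class and method names are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class ServiceTimeoutSketch {
    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if the timeout fired before the simulated work finished. */
    static boolean timedOut(long workMillis, long timeoutMillis) {
        // Stand-in for the handler thread that would receive SERVICE_TIMEOUT_MSG.
        ScheduledExecutorService handler = Executors.newSingleThreadScheduledExecutor();
        AtomicBoolean anrReported = new AtomicBoolean(false);
        try {
            // Step 1: post the delayed "timeout message".
            ScheduledFuture<?> timeoutMessage = handler.schedule(
                    () -> anrReported.set(true), timeoutMillis, TimeUnit.MILLISECONDS);
            // Step 2: the simulated onCreate() work.
            sleepQuietly(workMillis);
            // Step 3: work is done, so remove the pending message (no effect if it already ran).
            timeoutMessage.cancel(false);
            return anrReported.get();
        } finally {
            handler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("fast work timed out: " + timedOut(20, 500));
        System.out.println("slow work timed out: " + timedOut(300, 50));
    }
}
```

The timeout is armed before the work starts and disarmed only after it ends, so the only way for the message to be delivered is for the work to run longer than the delay.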

4.2 BroadcastQueue Timeout caused by BroadcastReceiver

A BroadcastReceiver Timeout is triggered when BroadcastQueue.BroadcastHandler in the "ActivityManager" thread receives the BROADCAST_TIMEOUT_MSG message.

4.2.1 The broadcast-handling function processNextBroadcast() calls broadcastTimeoutLocked(false) to send the delayed message

Broadcasts are processed in order: parallel broadcasts first, then the current ordered broadcast.

final void processNextBroadcast(boolean fromMsg) {
    synchronized(mService) {
        ...
        //Process the current ordered broadcast
        do {
            r = mOrderedBroadcasts.get(0);
            //Get all recipients of this broadcast
            int numReceivers = (r.receivers != null) ? r.receivers.size() : 0;
            if (mService.mProcessesReady && r.dispatchTime > 0) {
                long now = SystemClock.uptimeMillis();
                if ((numReceivers > 0) &&
                        (now > r.dispatchTime + (2*mTimeoutPeriod*numReceivers))) {
                    // Step 1. Send the delayed message. This function handles several things, such as the broadcast timeout and finishing the broadcast
                    broadcastTimeoutLocked(false);
                    ...
                }
            }
            if (r.receivers == null || r.nextReceiver >= numReceivers
                    || r.resultAbort || forceReceive) {
                if (r.resultTo != null) {
                    // 2. Process the broadcast message
                    performReceiveLocked(r.callerApp, r.resultTo,
                        new Intent(r.intent), r.resultCode,
                        r.resultData, r.resultExtras, false, false, r.userId);
                    r.resultTo = null;
                }
                // 3. Cancel the broadcast timeout ANR message
                cancelBroadcastTimeoutLocked();
            }
        } while (r == null);
        ...

        //Get next ordered broadcast
        r.receiverTime = SystemClock.uptimeMillis();
        if (!mPendingBroadcastTimeoutMessage) {
            long timeoutTime = r.receiverTime + mTimeoutPeriod;
            //Set broadcast timeout
            setBroadcastTimeoutLocked(timeoutTime);
        }
        ...
    }
}

Step 1 above, broadcastTimeoutLocked(false): records the time information and calls the function that sends the delayed message.

final void broadcastTimeoutLocked(boolean fromMsg) {
    ...
        long now = SystemClock.uptimeMillis();
        if (fromMsg) {
            if (mService.mDidDexOpt) {
                // Delay timeouts until dexopt finishes.
                mService.mDidDexOpt = false;
                long timeoutTime = SystemClock.uptimeMillis() + mTimeoutPeriod;
                setBroadcastTimeoutLocked(timeoutTime);
                return;
            }
            if (!mService.mProcessesReady) {
                return;
            }

            long timeoutTime = r.receiverTime + mTimeoutPeriod;
            if (timeoutTime > now) {
                // step 2
                setBroadcastTimeoutLocked(timeoutTime);
                return;
            }
        }

Step 2 above, setBroadcastTimeoutLocked(): the concrete operation that sets the broadcast timeout, which again sends a delayed message.

final void setBroadcastTimeoutLocked(long timeoutTime) {
    if (! mPendingBroadcastTimeoutMessage) {
        Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this);
        mHandler.sendMessageAtTime(msg, timeoutTime);
        mPendingBroadcastTimeoutMessage = true;
    }
}

4.2.2 The timeoutTime parameter of setBroadcastTimeoutLocked(long timeoutTime) is the current time plus the configured timeout

That is, the line above:

long timeoutTime = SystemClock.uptimeMillis() + mTimeoutPeriod;

mTimeoutPeriod is 10 seconds for the foreground queue and 60 seconds for the background queue.

// Class-level constants of ActivityManagerService
static final int BROADCAST_FG_TIMEOUT = 10 * 1000;
static final int BROADCAST_BG_TIMEOUT = 60 * 1000;

public ActivityManagerService(Context systemContext) {
    ...
    mFgBroadcastQueue = new BroadcastQueue(this, mHandler,
            "foreground", BROADCAST_FG_TIMEOUT, false);
    mBgBroadcastQueue = new BroadcastQueue(this, mHandler,
            "background", BROADCAST_BG_TIMEOUT, true);
    ...
}
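
Besides the per-receiver timeout, the processNextBroadcast() code quoted in 4.2.1 also checks an overall limit for an ordered broadcast: now > dispatchTime + 2 * mTimeoutPeriod * numReceivers. The arithmetic can be worked through with a small sketch; the class and method names here are hypothetical, only the formula comes from the quoted code.

```java
class BroadcastDeadline {
    /** Overall limit for an ordered broadcast: dispatchTime + 2 * timeoutPeriod * numReceivers. */
    static long overallDeadline(long dispatchTime, long timeoutPeriodMillis, int numReceivers) {
        return dispatchTime + 2 * timeoutPeriodMillis * numReceivers;
    }

    /** Mirrors the quoted condition: only checked when there is at least one receiver. */
    static boolean isHung(long now, long dispatchTime, long timeoutPeriodMillis, int numReceivers) {
        return numReceivers > 0
                && now > overallDeadline(dispatchTime, timeoutPeriodMillis, numReceivers);
    }

    public static void main(String[] args) {
        // Foreground queue (10 s per receiver) with 3 receivers, dispatched at t = 1000 ms.
        long deadline = overallDeadline(1_000, 10_000, 3);
        System.out.println("overall deadline (ms): " + deadline);
        System.out.println("hung at deadline + 1: " + isHung(deadline + 1, 1_000, 10_000, 3));
    }
}
```

With the foreground period of 10 seconds and 3 receivers, the overall budget is 2 * 10 000 * 3 = 60 000 ms after dispatch.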

4.2.3 cancelBroadcastTimeoutLocked() is called in processNextBroadcast() after performReceiveLocked() has run

After performReceiveLocked() in processNextBroadcast() has processed the broadcast message, cancelBroadcastTimeoutLocked() is called to cancel the timeout message.

final void cancelBroadcastTimeoutLocked() {
    if (mPendingBroadcastTimeoutMessage) {
        mHandler.removeMessages(BROADCAST_TIMEOUT_MSG, this);
        mPendingBroadcastTimeoutMessage = false;
    }
}
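
The role of the mPendingBroadcastTimeoutMessage flag in the set and cancel functions can be seen in a small deterministic model. This is a simplified sketch with hypothetical names: a plain list of delivery times stands in for the handler's message queue, and time is passed in explicitly instead of being read from a clock.

```java
import java.util.ArrayList;
import java.util.List;

class BroadcastTimeoutModel {
    // Delivery times of queued timeout messages (stand-in for the handler queue).
    private final List<Long> queue = new ArrayList<>();
    private boolean pendingTimeoutMessage;

    /** Queues a timeout message unless one is already pending. */
    void setTimeout(long timeoutTime) {
        if (!pendingTimeoutMessage) {
            queue.add(timeoutTime);
            pendingTimeoutMessage = true;
        }
    }

    /** Removes the pending timeout message, if any. */
    void cancelTimeout() {
        if (pendingTimeoutMessage) {
            queue.clear();
            pendingTimeoutMessage = false;
        }
    }

    /** True if a queued timeout message is due at the given time. */
    boolean fires(long now) {
        for (long t : queue) {
            if (t <= now) return true;
        }
        return false;
    }

    int queued() {
        return queue.size();
    }

    public static void main(String[] args) {
        BroadcastTimeoutModel model = new BroadcastTimeoutModel();
        model.setTimeout(100);
        model.setTimeout(200); // ignored: a message is already pending
        System.out.println("queued=" + model.queued() + ", fires at 150: " + model.fires(150));
        model.cancelTimeout();
        System.out.println("after cancel, fires at 150: " + model.fires(150));
    }
}
```

The flag keeps at most one timeout message in the queue, so a single cancel is enough to disarm it.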

4.3 ContentProvider Timeout caused by ContentProvider

A ContentProvider Timeout is triggered when AMS.MainHandler in the "ActivityManager" thread receives the CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG message.
Reference: "Understand the triggering principle of Android ANR", Section 4.

5. Android ANR information collection

Whenever an ANR occurs in any of the four components or in a process, the AMS.appNotResponding() method is eventually called.
Reference: "Understand the information collection process of Android ANR".

References:

Understand the triggering principle of Android ANR
Understand the information collection process of Android ANR
Detailed explanation of Android app optimization: ANR
Android source code analysis: ANR

Author: marker_ Sky
Link: https://www.jianshu.com/p/388166988cef
Source: developeppaper
The copyright belongs to the author. For commercial reprint, please contact the author for authorization, and for non-commercial reprint, please indicate the source.