Background
Thread.UncaughtExceptionHandler
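When a thread is about to terminate because of an uncaught exception, the Java virtual machine queries the thread for its UncaughtExceptionHandler through Thread.getUncaughtExceptionHandler() and calls that handler's uncaughtException() method, passing the thread and the exception as arguments.
If a thread has no explicitly set UncaughtExceptionHandler, its ThreadGroup object acts as the handler. If the ThreadGroup has no special requirements for the exception, it forwards the call to the default uncaught exception handler (the static handler object defined on the Thread class).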
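The lookup order described above can be observed on a plain JVM. The following is a minimal sketch, not Android code: the class name HandlerLookupDemo and the helper run() are made up for illustration, and only standard java.lang.Thread APIs are used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class HandlerLookupDemo {
    /** Runs a thread that throws and returns which handler saw the exception. */
    static List<String> run(boolean setPerThreadHandler) {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        // Process-wide fallback, reached through the thread's ThreadGroup.
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> seen.add("default:" + e.getMessage()));
        try {
            Thread worker = new Thread(() -> { throw new IllegalStateException("boom"); }, "worker");
            if (setPerThreadHandler) {
                // An explicitly set handler takes precedence over the default one.
                worker.setUncaughtExceptionHandler((t, e) -> seen.add("thread:" + e.getMessage()));
            }
            worker.start();
            worker.join(); // the handler runs on the dying thread before join() returns
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(previous); // restore global state
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

With a per-thread handler the default handler is never consulted; without one, the call reaches the default handler via the ThreadGroup.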
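Exceptions fall into checked and unchecked exceptions. RuntimeException and its subclasses, together with Error and its subclasses, are unchecked; everything else is checked.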
1. On Android, any unhandled exception thrown on any thread crashes the app.
/**
* Use this to log a message when a thread exits due to an uncaught
* exception. The framework catches these for the main threads, so
* this should only matter for threads created by applications.
*/
private static class UncaughtHandler implements Thread.UncaughtExceptionHandler {
public void uncaughtException(Thread t, Throwable e) {
try {
// Don't re-enter -- avoid infinite loops if crash-reporting crashes.
if (mCrashing) return;
mCrashing = true;
if (mApplicationObject == null) {
Slog.e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
} else {
Slog.e(TAG, "FATAL EXCEPTION: " + t.getName(), e);
}
// Report the crash, which shows the crash dialog
// Bring up crash dialog, wait for it to be dismissed
ActivityManagerNative.getDefault().handleApplicationCrash(
mApplicationObject, new ApplicationErrorReport.CrashInfo(e));
} catch (Throwable t2) {
try {
Slog.e(TAG, "Error reporting crash", t2);
} catch (Throwable t3) {
// Even Slog.e() fails! Oh well.
}
} finally {
// Try everything to make sure this process goes away.
Process.killProcess(Process.myPid());
// Kill the process
System.exit(10);
}
}
}
// Install the process-wide handler for uncaught exceptions
private static final void commonInit() {
if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");
/* set default handler; this applies to all threads in the VM */
Thread.setDefaultUncaughtExceptionHandler(new UncaughtHandler());
}
As the code shows, uncaughtException ends by bringing up the crash dialog and then killing the process.
2. How to keep the app from crashing
1. Main thread
The following code catches exceptions on the main thread. It can be placed in a lifecycle method.
new Handler((Looper.getMainLooper())).post(new Runnable() {
@Override
public void run() {
for (;;) {
try {
Looper.loop();
} catch (Throwable e) {
Log.d("jackie", "==================");
e.printStackTrace();
}
}
}
});
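The idea is to post a Runnable to the main thread's MessageQueue through a Handler. When the main thread runs that Runnable, it enters our infinite loop. If the loop body were empty, the main thread would be stuck there and eventually trigger an ANR. But the for (;;) loop calls Looper.loop(), so the main thread goes back to reading Messages from the queue and executing them. From then on, every exception on the main thread propagates out of our own Looper.loop() call, where it is caught, so the main thread no longer crashes.
Why post this through new Handler(...).post instead of running while (true) { try { Looper.loop(); } catch (Throwable e) {} } directly somewhere on the main thread? Because the loop never returns: if it ran inside onCreate, the code after it would never execute. Posting it through a Handler leaves the remaining logic of the current message unaffected. MessageQueue.IdleHandler also looks like a reasonable alternative, installing the loop once the queue is idle.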
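The re-entry trick can be modelled without Android. The sketch below is a toy stand-in, not the real Looper: GuardedLoopDemo, its queue, and loop() are hypothetical names, and unlike Looper.loop() the toy loop() returns when the queue is empty so that the example terminates.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class GuardedLoopDemo {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    void post(Runnable r) { queue.add(r); }

    /** Toy stand-in for Looper.loop(): runs messages in order; exceptions propagate to the caller. */
    void loop() {
        Runnable next;
        while ((next = queue.poll()) != null) {
            next.run();
        }
    }

    /** The guard: re-enter loop() after every exception instead of letting it escape. */
    void guardedLoop() {
        for (;;) {
            try {
                loop();
                return; // only reachable in this toy model, where loop() can finish
            } catch (Throwable e) {
                log.add("caught:" + e.getMessage());
            }
        }
    }

    static List<String> demo() {
        GuardedLoopDemo d = new GuardedLoopDemo();
        d.post(d::guardedLoop);                                  // the guard is itself just a message
        d.post(() -> d.log.add("msg1"));
        d.post(() -> { throw new IllegalStateException("crash"); });
        d.post(() -> d.log.add("msg3"));
        d.loop();                                                // the "system" loop dispatches the guard first
        return d.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Once the outer loop dispatches the guard, every later message runs inside guardedLoop(), so the message after the crashing one is still delivered.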
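2. Child threads
// Intercepts uncaught exceptions from every thread. Main-thread exceptions are already caught above, so this handler only sees exceptions from child threads.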
Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler(){
@Override
public void uncaughtException(@NonNull Thread t, @NonNull Throwable e) {
// Handle the exception; t is the thread that threw it
Log.d("jackie", "============" + t.getName(), e);
}
});
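Because main-thread exceptions are already caught by the loop above, this handler only intercepts exceptions from child threads. Inside uncaughtException we can report the exception and then choose a policy: exit the app if the exception is severe enough, or let the app keep running. The same choice applies to the main-thread catch block above.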
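3. Intercepting lifecycle exceptions
In short, this replaces ActivityThread.mH.mCallback. Different Android versions need their own adaptation.
Every Activity lifecycle method is invoked from the handleMessage method of mH, so intercepting handleMessage intercepts every lifecycle exception. We cannot, however, swap out the mH object itself through reflection, because mH is an instance of H, a private class inside ActivityThread that extends Handler. But Handler calls mCallback.handleMessage before it calls handleMessage, and mCallback can be replaced.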
Class activityThreadClass = Class.forName("android.app.ActivityThread");
Object activityThread = activityThreadClass.getDeclaredMethod("currentActivityThread").invoke(null);
Field mhField = activityThreadClass.getDeclaredField("mH");
mhField.setAccessible(true);
final Handler mhHandler = (Handler) mhField.get(activityThread);
Field callbackField = Handler.class.getDeclaredField("mCallback");
callbackField.setAccessible(true);
// mhHandler is ActivityThread.mH; callbackField is the mCallback field of mH, obtained through reflection
callbackField.set(mhHandler, new Handler.Callback() {
// Intercept lifecycle-related messages
@Override
public boolean handleMessage(Message msg) {
switch (msg.what) {
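case LAUNCH_ACTIVITY: // mirrors the message code in ActivityThread.H; it differs across Android versions
try {
// Invoke ActivityThread.mH.handleMessage ourselves
mhHandler.handleMessage(msg);
} catch (Throwable throwable) {
// A lifecycle exception was caught. The Activity can simply be finished here; see the later discussion of finishing an Activity whose lifecycle threw
}
// Always consume the message, otherwise Handler would dispatch it to mH.handleMessage a second time
return true;
//... similar cases omitted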
}
return false;
}
});
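The callback-first dispatch order that makes this hook possible can be shown with a toy model. MiniHandler below is a hypothetical stand-in for android.os.Handler, reduced to the part that matters here: the callback runs before handleMessage and can consume the message. The private field is replaced by reflection, as in the code above.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class CallbackHookDemo {
    interface Callback { boolean handleMessage(int what); }

    /** Toy stand-in for Handler: the callback runs first and may consume the message. */
    static class MiniHandler {
        private Callback mCallback;                  // private, so the demo sets it via reflection
        final List<String> log = new ArrayList<>();

        void handleMessage(int what) {
            if (what == 2) throw new IllegalStateException("lifecycle crash");
            log.add("handled:" + what);
        }

        void dispatchMessage(int what) {
            if (mCallback != null && mCallback.handleMessage(what)) {
                return;                              // consumed by the callback
            }
            handleMessage(what);
        }
    }

    static List<String> demo() {
        MiniHandler h = new MiniHandler();
        Callback guard = what -> {
            try {
                h.handleMessage(what);               // run the real handler inside try/catch
            } catch (Throwable t) {
                h.log.add("caught:" + what);
            }
            return true;                             // consume, so the message is not handled twice
        };
        try {
            Field f = MiniHandler.class.getDeclaredField("mCallback");
            f.setAccessible(true);
            f.set(h, guard);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        h.dispatchMessage(1);
        h.dispatchMessage(2);
        return h.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Returning true from the callback matters: returning false after the guarded call would let the dispatcher run handleMessage again and rethrow.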
3. How Firebase Crashlytics works
To use Firebase, first add the core Firebase SDK (com.google.firebase:firebase-core):
implementation 'com.google.firebase:firebase-core:17.0.0'
Then add the SDK dependency for each specific module you use, for example Crashlytics:
implementation 'com.crashlytics.sdk.android:crashlytics:2.10.1'
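The reason for this flow is that the core package pulls in common libraries shared by several modules; you then depend on the specific artifact for each feature you need. See the FirebaseInitProvider class in the common library: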
public class FirebaseInitProvider extends ContentProvider {
/** Called before {@link Application#onCreate()}. */
@Override
public boolean onCreate() {
if (FirebaseApp.initializeApp(getContext()) == null) {
Log.i(TAG, "FirebaseApp initialization unsuccessful");
} else {
Log.i(TAG, "FirebaseApp initialization successful");
}
return false;
}
Its onCreate method initializes FirebaseApp:
@Nullable
public static FirebaseApp initializeApp(@NonNull Context context) {
synchronized (LOCK) {
if (INSTANCES.containsKey(DEFAULT_APP_NAME)) {
return getInstance();
}
// The basic Firebase options must be configured
FirebaseOptions firebaseOptions = FirebaseOptions.fromResource(context);
if (firebaseOptions == null) {
Log.w(
LOG_TAG,
"Default FirebaseApp failed to initialize because no default "
+ "options were found. This usually means that com.google.gms:google-services was "
+ "not applied to your gradle project.");
return null;
}
return initializeApp(context, firebaseOptions);
}
}
//1.initializeApp
@NonNull
public static FirebaseApp initializeApp(
@NonNull Context context, @NonNull FirebaseOptions options) {
return initializeApp(context, options, DEFAULT_APP_NAME);
}
//2.initializeApp
@NonNull
public static FirebaseApp initializeApp(
@NonNull Context context, @NonNull FirebaseOptions options, @NonNull String name) {
···
// Initialize the FirebaseApp, which contains each of our dependency modules
firebaseApp.initializeAllApis();
return firebaseApp;
}
The FirebaseApp constructor looks like this:
protected FirebaseApp(Context applicationContext, String name, FirebaseOptions options) {
this.applicationContext = Preconditions.checkNotNull(applicationContext);
this.name = Preconditions.checkNotEmpty(name);
this.options = Preconditions.checkNotNull(options);
List<ComponentRegistrar> registrars =
ComponentDiscovery.forContext(applicationContext, ComponentDiscoveryService.class)
.discover();
String kotlinVersion = KotlinDetector.detectVersion();
// Initialize each dependency module
componentRuntime =
new ComponentRuntime(
UI_EXECUTOR,
registrars,
Component.of(applicationContext, Context.class),
Component.of(this, FirebaseApp.class),
Component.of(options, FirebaseOptions.class),
LibraryVersionComponent.create(FIREBASE_ANDROID, ""),
LibraryVersionComponent.create(FIREBASE_COMMON, BuildConfig.VERSION_NAME),
kotlinVersion != null ? LibraryVersionComponent.create(KOTLIN, kotlinVersion) : null,
DefaultUserAgentPublisher.component(),
DefaultHeartBeatInfo.component());
···
}
The ComponentRuntime constructor looks like this:
public ComponentRuntime(
Executor defaultEventExecutor,
Iterable<ComponentRegistrar> registrars,
Component<?>... additionalComponents) {
eventBus = new EventBus(defaultEventExecutor);
List<Component<?>> componentsToAdd = new ArrayList<>();
componentsToAdd.add(Component.of(eventBus, EventBus.class, Subscriber.class, Publisher.class));
// Collect the components contributed by each module; they are all initialized together at the end
for (ComponentRegistrar registrar : registrars) {
componentsToAdd.addAll(registrar.getComponents());
}
for (Component<?> additionalComponent : additionalComponents) {
if (additionalComponent != null) {
componentsToAdd.add(additionalComponent);
}
}
CycleDetector.detect(componentsToAdd);
for (Component<?> component : componentsToAdd) {
Lazy<?> lazy =
new Lazy<>(
() ->
component.getFactory().create(new RestrictedComponentContainer(component, this)));
components.put(component, lazy);
}
processInstanceComponents();
processSetComponents();
}
Looking at ComponentRegistrar, each SDK only has to implement this interface, and its components are then initialized together with the rest:
/**
* Represents an SDK Registrar.
*
* <p>Individual SDKs are expected to provide an implementation of this interface in order to
* register themselves and to participate in dependency injection.
*/
// In other words: each SDK provides an implementation of this interface to register itself and take part in dependency injection.
public interface ComponentRegistrar {
/** Returns a list of components provided by this registrar. */
List<Component<?>> getComponents();
}
Messaging:
public class FirebaseMessagingRegistrar implements ComponentRegistrar {
Firestore:
public class FirestoreRegistrar implements ComponentRegistrar {
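The advantage is that a single ContentProvider manages all the dependencies, since many of these libraries need to be initialized as early as possible. The downside is that putting this much initialization into the ContentProvider's onCreate method affects startup time.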
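The registrar pattern can be reduced to a few lines. The sketch below is not Firebase code: MiniComponentRuntime, Registrar, and Lazy are simplified, hypothetical counterparts of ComponentRuntime, ComponentRegistrar, and Lazy, keyed by name instead of by type, and they only illustrate the idea that registrars contribute factories which are created lazily on first use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class MiniComponentRuntime {
    /** Each module contributes named factories, loosely like ComponentRegistrar.getComponents(). */
    interface Registrar { Map<String, Supplier<Object>> getComponents(); }

    /** Creates the value on first access and caches it. */
    static final class Lazy {
        private final Supplier<Object> factory;
        private Object value;
        Lazy(Supplier<Object> factory) { this.factory = factory; }
        synchronized Object get() {
            if (value == null) value = factory.get();
            return value;
        }
    }

    private final Map<String, Lazy> components = new LinkedHashMap<>();

    MiniComponentRuntime(List<Registrar> registrars) {
        for (Registrar registrar : registrars) {
            registrar.getComponents().forEach((name, factory) -> components.put(name, new Lazy(factory)));
        }
    }

    Object get(String name) {
        Lazy lazy = components.get(name);
        return lazy == null ? null : lazy.get();
    }

    static Registrar registrar(String name, Supplier<Object> factory) {
        return () -> Collections.singletonMap(name, factory);
    }

    static List<String> demo() {
        List<String> created = new ArrayList<>();
        MiniComponentRuntime runtime = new MiniComponentRuntime(Arrays.asList(
                registrar("messaging", () -> { created.add("messaging"); return "MessagingImpl"; }),
                registrar("firestore", () -> { created.add("firestore"); return "FirestoreImpl"; })));
        int createdBeforeUse = created.size();   // registration alone creates nothing
        runtime.get("firestore");
        runtime.get("firestore");                // second lookup hits the cache
        return Arrays.asList(String.valueOf(createdBeforeUse), String.join(",", created));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the real SDK, components marked eager (such as Crashlytics with eagerInDefaultApp()) are instantiated during initialization rather than on first lookup, which is where the startup cost comes from.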
Now let's look at the Crashlytics implementation:
public class CrashlyticsRegistrar implements ComponentRegistrar {
@Override
public List<Component<?>> getComponents() {
return Arrays.asList(
Component.builder(FirebaseCrashlytics.class)
.add(Dependency.required(FirebaseApp.class))
.add(Dependency.required(FirebaseInstallationsApi.class))
.add(Dependency.optional(AnalyticsConnector.class))
.add(Dependency.optional(CrashlyticsNativeComponent.class))
.factory(this::buildCrashlytics) // the key method
.eagerInDefaultApp()
.build(),
LibraryVersionComponent.create("fire-cls", BuildConfig.VERSION_NAME));
}
private FirebaseCrashlytics buildCrashlytics(ComponentContainer container) {
FirebaseApp app = container.get(FirebaseApp.class);
CrashlyticsNativeComponent nativeComponent = container.get(CrashlyticsNativeComponent.class);
AnalyticsConnector analyticsConnector = container.get(AnalyticsConnector.class);
FirebaseInstallationsApi firebaseInstallations = container.get(FirebaseInstallationsApi.class);
// Initialize FirebaseCrashlytics
return FirebaseCrashlytics.init(
app, firebaseInstallations, nativeComponent, analyticsConnector);
}
}
The init method:
static @Nullable FirebaseCrashlytics init(
@NonNull FirebaseApp app,
@NonNull FirebaseInstallationsApi firebaseInstallationsApi,
@Nullable CrashlyticsNativeComponent nativeComponent,
@Nullable AnalyticsConnector analyticsConnector) {
Context context = app.getApplicationContext();
// Set up the IdManager
// Executor thread that handles crashes
final ExecutorService crashHandlerExecutor =
ExecutorUtils.buildSingleThreadExecutorService("Crashlytics Exception Handler");
final CrashlyticsCore core =
new CrashlyticsCore(
app,
idManager,
nativeComponent,
arbiter,
breadcrumbSource,
analyticsEventLogger,
crashHandlerExecutor);
if (!onboarding.onPreExecute()) {
Logger.getLogger().e("Unable to start Crashlytics.");
return null;
}
final ExecutorService threadPoolExecutor =
ExecutorUtils.buildSingleThreadExecutorService("com.google.firebase.crashlytics.startup");
final SettingsController settingsController =
onboarding.retrieveSettingsData(context, app, threadPoolExecutor);
// Initialize
final boolean finishCoreInBackground = core.onPreExecute(settingsController);
onPreExecute
public boolean onPreExecute(SettingsDataProvider settingsProvider) {
// before starting the crash detector make sure that this was built with our build
// tools.
final String mappingFileId = CommonUtils.getMappingFileId(context);
Logger.getLogger().d("Mapping file ID is: " + mappingFileId);
// Throw an exception and halt the app if the build ID is required and not present.
// TODO: This flag is no longer supported and should be removed, as part of a larger refactor
// now that the buildId is now only used for mapping file association.
final boolean requiresBuildId =
CommonUtils.getBooleanResourceValue(
context, CRASHLYTICS_REQUIRE_BUILD_ID, CRASHLYTICS_REQUIRE_BUILD_ID_DEFAULT);
if (!isBuildIdValid(mappingFileId, requiresBuildId)) {
throw new IllegalStateException(MISSING_BUILD_ID_MSG);
}
final String googleAppId = app.getOptions().getApplicationId();
try {
Logger.getLogger().i("Initializing Crashlytics " + getVersion());
final FileStore fileStore = new FileStoreImpl(context);
crashMarker = new CrashlyticsFileMarker(CRASH_MARKER_FILE_NAME, fileStore);
initializationMarker = new CrashlyticsFileMarker(INITIALIZATION_MARKER_FILE_NAME, fileStore);
final HttpRequestFactory httpRequestFactory = new HttpRequestFactory();
final AppData appData = AppData.create(context, idManager, googleAppId, mappingFileId);
final UnityVersionProvider unityVersionProvider = new ResourceUnityVersionProvider(context);
Logger.getLogger().d("Installer package name is: " + appData.installerPackageName);
controller =
new CrashlyticsController(
context,
backgroundWorker,
httpRequestFactory,
idManager,
dataCollectionArbiter,
fileStore,
crashMarker,
appData,
null,
null,
nativeComponent,
unityVersionProvider,
analyticsEventLogger,
settingsProvider);
// If the file is present at this point, then the previous run's initialization
// did not complete, and we want to perform initialization synchronously this time.
// We make this check early here because we want to guarantee that the async
// startup thread we're about to launch doesn't affect the value.
final boolean initializeSynchronously = didPreviousInitializationFail();
checkForPreviousCrash();
// Exception handling is set up here: the system's DefaultUncaughtExceptionHandler is passed in and then wrapped
controller.enableExceptionHandling(
Thread.getDefaultUncaughtExceptionHandler(), settingsProvider);
//enableExceptionHandling
void enableExceptionHandling(
Thread.UncaughtExceptionHandler defaultHandler, SettingsDataProvider settingsProvider) {
// This must be called before installing the controller with
// Thread.setDefaultUncaughtExceptionHandler to ensure that we are ready to handle
// any crashes we catch.
openSession();
final CrashlyticsUncaughtExceptionHandler.CrashListener crashListener =
new CrashlyticsUncaughtExceptionHandler.CrashListener() {
@Override
public void onUncaughtException(
@NonNull SettingsDataProvider settingsDataProvider,
@NonNull Thread thread,
@NonNull Throwable ex) {
handleUncaughtException(settingsDataProvider, thread, ex);
}
};
crashHandler =
new CrashlyticsUncaughtExceptionHandler(crashListener, settingsProvider, defaultHandler);
// Install the wrapping crashHandler
Thread.setDefaultUncaughtExceptionHandler(crashHandler);
CrashlyticsUncaughtExceptionHandler
public CrashlyticsUncaughtExceptionHandler(
CrashListener crashListener,
SettingsDataProvider settingsProvider,
Thread.UncaughtExceptionHandler defaultHandler) {
this.crashListener = crashListener;
this.settingsDataProvider = settingsProvider;
this.defaultHandler = defaultHandler;
this.isHandlingException = new AtomicBoolean(false);
}
@Override
public void uncaughtException(Thread thread, Throwable ex) {
isHandlingException.set(true);
try {
if (thread == null) {
Logger.getLogger().e("Could not handle uncaught exception; null thread");
} else if (ex == null) {
Logger.getLogger().e("Could not handle uncaught exception; null throwable");
} else {
crashListener.onUncaughtException(settingsDataProvider, thread, ex);
}
} catch (Exception e) {
Logger.getLogger().e("An error occurred in the uncaught exception handler", e);
} finally {
Logger.getLogger()
.d(
"Crashlytics completed exception processing."
+ " Invoking default exception handler.");
defaultHandler.uncaughtException(thread, ex);
isHandlingException.set(false);
}
}
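The same wrap-and-delegate structure can be tried on a plain JVM. ChainingHandler below is a hypothetical, stripped-down analogue of the handler above, not Crashlytics code: it records the crash, then always hands the exception to whatever default handler was installed before it. Unlike the original, it null-checks the previous handler, which is an added assumption for environments where none is set.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

class ChainingHandler implements Thread.UncaughtExceptionHandler {
    private final Thread.UncaughtExceptionHandler previous;
    private final List<String> events;
    private final AtomicBoolean handling = new AtomicBoolean(false);

    ChainingHandler(Thread.UncaughtExceptionHandler previous, List<String> events) {
        this.previous = previous;
        this.events = events;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable ex) {
        handling.set(true);
        try {
            events.add("recorded:" + ex.getMessage());       // our own crash processing
        } catch (Exception reportingFailure) {
            events.add("reporting failed");
        } finally {
            if (previous != null) {
                previous.uncaughtException(thread, ex);      // always delegate to the earlier handler
            }
            handling.set(false);
        }
    }

    static List<String> demo() {
        List<String> events = new CopyOnWriteArrayList<>();
        Thread.UncaughtExceptionHandler original = Thread.getDefaultUncaughtExceptionHandler();
        // Stand-in for the platform handler that would normally kill the process.
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> events.add("platform:" + e.getMessage()));
        // Wrap whatever is currently installed, as enableExceptionHandling does.
        Thread.setDefaultUncaughtExceptionHandler(
                new ChainingHandler(Thread.getDefaultUncaughtExceptionHandler(), events));
        try {
            Thread worker = new Thread(() -> { throw new IllegalStateException("boom"); });
            worker.start();
            worker.join();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(original);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Delegating in the finally block means the earlier handler still runs even if our own processing fails.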
After the exception has been processed, it is finally handed to the system's defaultHandler.uncaughtException. First, let's see how Firebase processes it:
@Override
public void uncaughtException(Thread thread, Throwable ex) {
isHandlingException.set(true);
try {
if (thread == null) {
Logger.getLogger().e("Could not handle uncaught exception; null thread");
} else if (ex == null) {
Logger.getLogger().e("Could not handle uncaught exception; null throwable");
} else {
// Handled through the listener
crashListener.onUncaughtException(settingsDataProvider, thread, ex);
}
} catch (Exception e) {
Logger.getLogger().e("An error occurred in the uncaught exception handler", e);
} finally {
Logger.getLogger()
.d(
"Crashlytics completed exception processing."
+ " Invoking default exception handler.");
defaultHandler.uncaughtException(thread, ex);
isHandlingException.set(false);
}
}
// The listener that does the handling
final CrashlyticsUncaughtExceptionHandler.CrashListener crashListener =
new CrashlyticsUncaughtExceptionHandler.CrashListener() {
@Override
public void onUncaughtException(
@NonNull SettingsDataProvider settingsDataProvider,
@NonNull Thread thread,
@NonNull Throwable ex) {
handleUncaughtException(settingsDataProvider, thread, ex);
}
};
We finally reach the method that does the work, handleUncaughtException:
synchronized void handleUncaughtException(
@NonNull SettingsDataProvider settingsDataProvider,
@NonNull final Thread thread,
@NonNull final Throwable ex) {
Logger.getLogger()
.d(
"Crashlytics is handling uncaught "
+ "exception \""
+ ex
+ "\" from thread "
+ thread.getName());
// Capture the time that the crash occurs and close over it so that the time doesn't
// reflect when we get around to executing the task later.
final Date time = new Date();
final Task<Void> handleUncaughtExceptionTask =
backgroundWorker.submitTask(
new Callable<Task<Void>>() {
@Override
public Task<Void> call() throws Exception {
// We've fatally crashed, so write the marker file that indicates a crash occurred.
crashMarker.create();
long timestampSeconds = getTimestampSeconds(time);
reportingCoordinator.persistFatalEvent(ex, thread, timestampSeconds);
writeFatal(thread, ex, timestampSeconds);
writeAppExceptionMarker(time.getTime());
Settings settings = settingsDataProvider.getSettings();
int maxCustomExceptionEvents = settings.getSessionData().maxCustomExceptionEvents;
int maxCompleteSessionsCount = settings.getSessionData().maxCompleteSessionsCount;
doCloseSessions(maxCustomExceptionEvents);
doOpenSession();
trimSessionFiles(maxCompleteSessionsCount);
// If automatic data collection is disabled, we'll need to wait until the next run
// of the app.
if (!dataCollectionArbiter.isAutomaticDataCollectionEnabled()) {
return Tasks.forResult(null);
}
Executor executor = backgroundWorker.getExecutor();
return settingsDataProvider
.getAppSettings()
.onSuccessTask(
executor,
new SuccessContinuation<AppSettingsData, Void>() {
@NonNull
@Override
public Task<Void> then(@Nullable AppSettingsData appSettingsData)
throws Exception {
if (appSettingsData == null) {
Logger.getLogger()
.w(
"Received null app settings, cannot send reports at crash time.");
return Tasks.forResult(null);
}
// Data collection is enabled, so it's safe to send the report.
boolean dataCollectionToken = true;
sendSessionReports(appSettingsData, dataCollectionToken);
return Tasks.whenAll(
logAnalyticsAppExceptionEvents(),
reportingCoordinator.sendReports(
executor, DataTransportState.getState(appSettingsData)));
}
});
}
});
try {
Utils.awaitEvenIfOnMainThread(handleUncaughtExceptionTask);
} catch (Exception e) {
// Nothing to do in this case.
}
}
As shown, the work is submitted to a background executor, which writes the crash files, uploads the reports, and so on.
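The shape of that method is: capture the crash time, submit the persistence work to a background worker, and block the crashing thread until the work finishes. The sketch below shows only that shape. CrashPersistDemo and its in-memory STORE are hypothetical; the real SDK writes files and uses its own Task utilities rather than Future.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CrashPersistDemo {
    /** In-memory stand-in for the crash files written to disk. */
    static final List<String> STORE = new CopyOnWriteArrayList<>();

    /** Persists the crash on a worker thread and waits for it; returns true if it was stored in time. */
    static boolean persistAndAwait(Thread thread, Throwable crash, long timeoutMillis) {
        // Capture the time now, so it reflects the crash and not when the task eventually runs.
        final long crashTimeMillis = System.currentTimeMillis();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Callable<Boolean> persist =
                () -> STORE.add(crashTimeMillis + "|" + thread.getName() + "|" + crash.getMessage());
        Future<Boolean> task = worker.submit(persist);
        try {
            // Block the crashing thread: the process is about to die, so the write must finish first.
            return task.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException | TimeoutException e) {
            return false;
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(persistAndAwait(Thread.currentThread(), new RuntimeException("boom"), 2000));
        System.out.println(STORE);
    }
}
```

The timeout is a choice made for this sketch, so that a stuck write cannot hang the dying process forever.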
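4. How ANR (Application Not Responding) works
First, which scenarios cause an ANR?
- Service Timeout: for example, a foreground service does not finish within 20s;
- BroadcastQueue Timeout: for example, a foreground broadcast does not finish within 10s;
- ContentProvider Timeout: a content provider takes more than 10s to publish;
- InputDispatching Timeout: dispatching an input event, including key and touch events, takes more than 5s.
The ANR flow can be described as plant the bomb -> defuse the bomb -> detonate the bomb. Planting the bomb means sending a delayed message. If that message is not removed within the timeout (the bomb is not defused), it fires and produces an ANR. See the links at the end of the article for a detailed analysis.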
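The bomb metaphor maps onto a scheduled task that is cancelled when the work completes in time. The following sketch is a plain-Java model, not the system's implementation: AnrBombDemo and its methods are hypothetical, and a ScheduledExecutorService stands in for the delayed message on the system's handler.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class AnrBombDemo {
    /** Returns true if the "bomb" went off, i.e. the work did not finish within timeoutMillis. */
    static boolean runWithTimeout(Runnable work, long timeoutMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        AtomicBoolean anrTriggered = new AtomicBoolean(false);
        Runnable detonate = () -> anrTriggered.set(true);
        // Plant the bomb: a delayed task, like the delayed timeout message.
        ScheduledFuture<?> bomb = timer.schedule(detonate, timeoutMillis, TimeUnit.MILLISECONDS);
        try {
            work.run();
            bomb.cancel(false);   // defuse: the work finished, so remove the pending timeout
        } finally {
            timer.shutdownNow();
        }
        return anrTriggered.get();
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(() -> { }, 2000));
        System.out.println(runWithTimeout(() -> sleepQuietly(300), 50));
    }
}
```

Fast work cancels the pending task before it fires; slow work lets it fire, which is the point at which the system would report the ANR.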
References