Daily Question | How does a Service get revived when onStartCommand returns STICKY?

Original question: Daily Question | How does a Service get revived when onStartCommand returns STICKY? https://wanandroid.com/wenda/show/22539

1. How does the system "revive" a process after it has been killed?

1.1 How the system detects that a process was killed

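Binder has a death-notification ("obituary") mechanism: when a process dies, its local Binder objects notify the process holding the corresponding proxy Binder, through a registered death recipient such as AppDeathRecipient, that the process is gone.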

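So when the app process hosting a Service is killed by the system, that event reaches ActivityManagerService (which manages the four component types, Service included) as a Binder death notification. To understand how a Service is "revived", it helps to first see how it is "born". The figure below outlines the startService flow:

[Figure: Service start flow]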

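In that flow, the step to focus on comes right after Zygote forks the Service's process: the new process runs ActivityThread.attach(), which makes a Binder call to ActivityManagerProxy.attachApplication() to register and initialize the newly created process with ActivityManagerService.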

// frameworks/base/core/java/android/app/ActivityThread.java
public final class ActivityThread {
    ...
    final ApplicationThread mAppThread = new ApplicationThread();

    private void attach(boolean system) {
        ...
        // Obtain the Binder proxy for AMS
        final IActivityManager mgr = ActivityManagerNative.getDefault();
        try {
            // Hand the app's local Binder object (ApplicationThread) to AMS through a Binder call
            mgr.attachApplication(mAppThread);
        }
        ...
    }

    // Extends ApplicationThreadNative, i.e. it is a local (native-side) Binder object
    private class ApplicationThread extends ApplicationThreadNative {...}
}

// frameworks/base/core/java/android/app/ApplicationThreadNative.java
public abstract class ApplicationThreadNative extends Binder implements IApplicationThread {...} // local (native-side) Binder object

class ApplicationThreadProxy implements IApplicationThread { // Binder proxy object
    private final IBinder mRemote; // reference to the remote Binder
}

// frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
public final class ActivityManagerService extends ActivityManagerNative ... {
    ...

    @Override
    public final void attachApplication(IApplicationThread thread) {
        synchronized (this) {
            int callingPid = Binder.getCallingPid();
            final long origId = Binder.clearCallingIdentity();
            attachApplicationLocked(thread, callingPid);
            Binder.restoreCallingIdentity(origId);
        }
    }

    private final boolean attachApplicationLocked(IApplicationThread thread,int pid) {
        ProcessRecord app;
        ...
        try {
            // Create the AppDeathRecipient listener
            AppDeathRecipient adr = new AppDeathRecipient(app, pid, thread);
            // Register the Binder death callback
            thread.asBinder().linkToDeath(adr, 0);
            // Store it on the process bookkeeping object, ProcessRecord
            app.deathRecipient = adr;
        } catch (RemoteException e) {
           ...
        }
    }

    private final class AppDeathRecipient implements IBinder.DeathRecipient {
        final ProcessRecord mApp; // process bookkeeping object
        final int mPid; // process id
        final IApplicationThread mAppThread; // Binder proxy for the app process
        ...
        @Override
        public void binderDied() {
            synchronized(ActivityManagerService.this) {
                // Clean up after the process has died
                appDiedLocked(mApp, mPid, mAppThread, true);
            }
        }
    }
}

[Figure: Binder death notification]
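To make the registration and callback order concrete, here is a minimal plain-Java model of the pattern above. It is only a sketch: FakeRemoteBinder, DeathCallback and FakeActivityManager are hypothetical stand-ins invented for illustration, not the real IBinder, IBinder.DeathRecipient or ActivityManagerService, and the kernel-driver side of Binder is reduced to a single die() call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for IBinder.DeathRecipient. */
interface DeathCallback {
    void binderDied();
}

/** Hypothetical stand-in for the Binder object owned by an app process. */
class FakeRemoteBinder {
    private final List<DeathCallback> recipients = new ArrayList<>();
    private boolean alive = true;

    /** Plays the role of linkToDeath(): register interest in the owner's death. */
    void linkToDeath(DeathCallback recipient) {
        if (!alive) {
            throw new IllegalStateException("owner process is already dead");
        }
        recipients.add(recipient);
    }

    /** Simulates the owning process being killed: every linked recipient is notified once. */
    void die() {
        alive = false;
        List<DeathCallback> toNotify = new ArrayList<>(recipients);
        recipients.clear();
        for (DeathCallback recipient : toNotify) {
            recipient.binderDied();
        }
    }
}

/** Hypothetical stand-in for AMS: remembers which pids were reported dead. */
class FakeActivityManager {
    final List<Integer> diedPids = new ArrayList<>();

    /** Same shape as attachApplicationLocked(): link a recipient that carries the pid. */
    void attachApplication(FakeRemoteBinder appThread, int pid) {
        appThread.linkToDeath(() -> appDied(pid));
    }

    private void appDied(int pid) {
        // The real appDiedLocked() cleans up the process and its components here.
        diedPids.add(pid);
    }
}

class DeathNotificationDemo {
    public static void main(String[] args) {
        FakeActivityManager ams = new FakeActivityManager();
        FakeRemoteBinder appThread = new FakeRemoteBinder();
        ams.attachApplication(appThread, 1234);
        appThread.die();
        System.out.println(ams.diedPids); // prints [1234]
    }
}
```

The point of the model is the ordering: the recipient is linked while the process attaches, long before anything goes wrong, so the manager side needs no polling to learn about the death.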

1.2 Once AMS learns that the process of a START_STICKY Service was killed, how does it "revive" it?

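After ActivityManagerService has started the Service, it dispatches the start request to it. When the Service has finished handling the request (that is, onStartCommand has returned), it reports the result back to ActivityManagerService, which hands it over to ActiveServices. For a Service that returned START_STICKY, ActiveServices sets stopIfKilled to false on the ServiceRecord that wraps the Service's state, meaning the Service should be restarted after it is killed.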

// frameworks/base/services/core/java/com/android/server/am/ActiveServices.java
public final class ActiveServices {
    void serviceDoneExecutingLocked(ServiceRecord r, int type, int startId, int res) {
        ...
        if (r != null) {
            if (type == ActivityThread.SERVICE_DONE_EXECUTING_START) {
                r.callStart = true;
                switch (res) {
                    case Service.START_STICKY_COMPATIBILITY:
                    case Service.START_STICKY: {
                        // Set stopIfKilled on the ServiceRecord to false: restart after being killed
                        // Don't stop if killed.
                        r.stopIfKilled = false;
                        break;
                    }
                    case Service.START_NOT_STICKY: {
                        r.findDeliveredStart(startId, true);
                        if (r.getLastStartId() == startId) {
                            // Set stopIfKilled on the ServiceRecord to true: no restart after being killed
                            // There is no more work, and this service
                            // doesn't want to hang around if killed.
                            r.stopIfKilled = true;
                        }
                        break;
                    }
                    ...
                }
                ...
            }
            ...
        }
    }
}
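The mapping from the onStartCommand result to the flag can be isolated in a few lines. The sketch below is a simplified model, not AOSP code: StickyFlagModel is a hypothetical class, only the three results visible in the excerpt are handled, and r.getLastStartId() is reduced to a plain field. The constant values match the public constants on android.app.Service.

```java
/** Simplified, hypothetical model of the stopIfKilled bookkeeping shown above. */
class StickyFlagModel {
    // Same values as the public constants on android.app.Service.
    static final int START_STICKY_COMPATIBILITY = 0;
    static final int START_STICKY = 1;
    static final int START_NOT_STICKY = 2;

    boolean stopIfKilled; // initial value is not meaningful in this sketch
    int lastStartId;      // stands in for ServiceRecord.getLastStartId()

    /** Mirrors the visible branches of serviceDoneExecutingLocked(). */
    void serviceDoneExecuting(int startId, int res) {
        switch (res) {
            case START_STICKY_COMPATIBILITY:
            case START_STICKY:
                // Sticky: keep the Service around, so restart it if the process is killed.
                stopIfKilled = false;
                break;
            case START_NOT_STICKY:
                // Only the most recent start request may mark the Service as disposable.
                if (lastStartId == startId) {
                    stopIfKilled = true;
                }
                break;
            default:
                // Other results are elided in the excerpt and are not modeled here.
                break;
        }
    }
}
```

Note that the flag reflects the most recently reported result: a Service that returned START_NOT_STICKY earlier becomes restartable again as soon as a later start returns START_STICKY.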

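Picking up from appDiedLocked() in the previous subsection: that method mainly cleans up after the dead app process. For each Service that qualifies for a restart, a ServiceRestarter task (a Runnable) is posted to ActivityManagerService's Handler, which then runs the logic that restores the Service.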

// frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
public final class ActivityManagerService extends ActivityManagerNative ... {
    ...
    final ActiveServices mServices; // manages Services

    final void appDiedLocked(ProcessRecord app, int pid, IApplicationThread thread,boolean fromBinderDied) {
        ...
        handleAppDiedLocked(app, false, true);
        ...
    }

    private final void handleAppDiedLocked(ProcessRecord app,boolean restarting, boolean allowRestart) {
        ...
        // Clean up the app process's data
        boolean kept = cleanUpApplicationRecordLocked(app, restarting, allowRestart, -1,false /*replacingPid*/);
        ...
    }

    private final boolean cleanUpApplicationRecordLocked(ProcessRecord app,boolean restarting, boolean allowRestart, int index, boolean replacingPid) {
       ...
       // Kill the Services that were running in the process
       mServices.killServicesLocked(app, allowRestart);
       ...
    }
}

// frameworks/base/services/core/java/com/android/server/am/ActiveServices.java
public final class ActiveServices {
    ...
    // ServiceRecords of the Services waiting to be restarted
    final ArrayList<ServiceRecord> mRestartingServices = new ArrayList<>();

    final void killServicesLocked(ProcessRecord app, boolean allowRestart) {
        ...
        for (int i=app.services.size()-1; i>=0; i--) {
            ServiceRecord sr = app.services.valueAt(i);
            ...
            // After two or more crashes, stop restarting the Service (the process itself can still be woken up).
            // Apps flagged ApplicationInfo.FLAG_PERSISTENT (system apps) are exempt and are still restarted.
            if (allowRestart && sr.crashCount >= 2 && (sr.serviceInfo.applicationInfo.flags
                    & ApplicationInfo.FLAG_PERSISTENT) == 0) {
                ...
                // Bring the Service down
                bringDownServiceLocked(sr);
            } else if (!allowRestart || !mAm.isUserRunningLocked(sr.userId, false)) { // restart not allowed
                // Bring the Service down
                bringDownServiceLocked(sr);
            } else { // restart allowed
                // For a START_STICKY Service the returned canceled is false
                boolean canceled = scheduleServiceRestartLocked(sr, true);

                // A START_STICKY Service has stopIfKilled == false on its ServiceRecord
                // (set in serviceDoneExecutingLocked() above), so it is not brought down here
                if (sr.startRequested && (sr.stopIfKilled || canceled)) {
                    ...
                    // Bring the Service down
                    bringDownServiceLocked(sr);
                    ...
                }
            }
        }
        ...
    }
    
    private final boolean scheduleServiceRestartLocked(ServiceRecord r,boolean allowCancel) {
        boolean canceled = false;
        ...
        if (!mRestartingServices.contains(r)) {
            ...
            // Add the ServiceRecord of the Service to restart to mRestartingServices
            mRestartingServices.add(r);
            ...
        }
        ...
        // Remove any stale ServiceRestarter from ActivityManagerService's Handler, then post the updated one to run at nextRestartTime
        mAm.mHandler.removeCallbacks(r.restarter);
        mAm.mHandler.postAtTime(r.restarter, r.nextRestartTime);
        ...
        return canceled;
    }

    private class ServiceRestarter implements Runnable { // a Runnable that gets posted to the Handler
        private ServiceRecord mService; // record wrapping the Service's state

        void setService(ServiceRecord service) {
            mService = service;
        }

        public void run() {
            synchronized(mAm) {
                // Run the Service restart logic
                performServiceRestartLocked(mService);
            }
        }
    }

    final void performServiceRestartLocked(ServiceRecord r) {
        if (!mRestartingServices.contains(r)) {
            return;
        }
        ...
        try {
            // Bring the process back up and restart the Service
            bringUpServiceLocked(r, r.intent.getIntent().getFlags(), r.createdFromFg, true);
        }
        ...
    }

    private final String bringUpServiceLocked(ServiceRecord r, int intentFlags, boolean execInFg,
            boolean whileRestarting) throws TransactionTooLargeException {
        ...
        // Remove the ServiceRecord from the list of Services pending restart
        if (mRestartingServices.remove(r)) {
            r.resetRestartCounter();
            clearRestartingIfNeededLocked(r);
        }
        ...
        if (!isolated) {
            ...
            if (app != null && app.thread != null) {
                try {
                    app.addPackage(r.appInfo.packageName, r.appInfo.versionCode, mAm.mProcessStats);
                    // Start the Service
                    realStartServiceLocked(r, app, execInFg);
                    return null;
                } 
                ...
            }
        }
    }
}
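The branching in killServicesLocked() is easier to reason about as a pure function. The following sketch models only the branches visible in the excerpt above; the parts elided with "..." are not reproduced, and RestartDecision, Outcome and the parameter names are hypothetical names chosen for illustration.

```java
/** Hypothetical, simplified model of the restart decision in killServicesLocked(). */
class RestartDecision {
    enum Outcome { BRING_DOWN, RESTART_SCHEDULED }

    static Outcome decide(boolean allowRestart, int crashCount, boolean persistentApp,
                          boolean userRunning, boolean startRequested,
                          boolean stopIfKilled, boolean restartCanceled) {
        // Branch 1: crashed too often and not a persistent app, so give up on the Service.
        if (allowRestart && crashCount >= 2 && !persistentApp) {
            return Outcome.BRING_DOWN;
        }
        // Branch 2: restarts are not allowed at all, or the owning user is not running.
        if (!allowRestart || !userRunning) {
            return Outcome.BRING_DOWN;
        }
        // Branch 3: a restart has been scheduled; it only survives if the Service
        // is sticky (stopIfKilled == false) and the schedule was not canceled.
        if (startRequested && (stopIfKilled || restartCanceled)) {
            return Outcome.BRING_DOWN;
        }
        return Outcome.RESTART_SCHEDULED;
    }
}
```

For example, decide(true, 0, false, true, true, false, false) describes a started START_STICKY Service in a healthy app and yields RESTART_SCHEDULED, while flipping stopIfKilled to true (the START_NOT_STICKY case) yields BRING_DOWN.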

2. With a project full of Services that may return START_STICKY, how can they be kept from being "revived" on stock Android?

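Generally speaking, Service revival, whether through official APIs or through hacks, rarely works on Chinese vendor ROMs. 洋叔 (Yang Shu), however, ran into this problem on stock Android on a Pixel device, so it has to be tackled from the project side rather than in the framework. The options are admittedly very limited: consolidate the behavior through a hook. Maintain an allowlist/blocklist according to business needs, hook the onStartCommand() of the Services in question as required, and change the return value to START_NOT_STICKY; that resolves the revival problem. There are many hook approaches, so pick whichever suits the project: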

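| Approach | When it applies | Target | Pros | Cons | Requirements |
|---|---|---|---|---|---|
| APT | Compile time: before .java files are compiled to .class files | .java files | 1. Can weave into all classes; 2. Proxies are generated at compile time, reducing runtime cost | 1. Must be compiled with the apt compiler; 2. Proxy code has to be assembled by hand (JavaPoet can help); 3. Generates a large number of proxy classes | Flexible use of design patterns and decoupling |
| AspectJ | Compile time, load time | .java files | Powerful: besides hooking, it can add fields and interfaces to target classes, and supports more advanced techniques such as abstract aspects and inheritance | 1. Not lightweight; 2. Pointcut definitions depend on the programming language and are incompatible with lambda syntax; 3. Cannot weave into third-party libraries; 4. Some compatibility issues, e.g. with D8 and Gradle 4.x | The syntax is complex, but a few simple constructs cover the vast majority of cases |
| Javassist | Compile time (before .class files are compiled to dex), or run time | class bytecode | 1. Less overhead from generated classes; 2. Modifies compiled bytecode directly, bypassing the Java compiler, which lifts many restrictions, e.g. cross-dex references and the CLASS_ISPREVERIFIED problem in hot fixes | Adding aspect logic at run time incurs a performance cost | 1. A custom Gradle plugin; 2. Knowledge of Groovy |
| ASM | Compile time or run time | class bytecode | Small and lightweight, good performance, more efficient than Javassist | Steep learning curve | Familiarity with bytecode; ASM represents complex bytecode structures as a tree, traverses the tree with a push model, and modifies the bytecode during traversal |
| ASMDEX | Compile time and load time: after conversion to .dex | Dex bytecode; creates class files | Can weave into all classes | Steep learning curve | Good knowledge of the class file format; the code is laborious to write |
| DexMaker | Same as above | Dex bytecode; creates dex files | Same as above | Same as above | Same as above |
| Cglib | Run time: generates subclasses that intercept methods | Bytecode | Can weave even when there is no interface | 1. Cannot proxy methods declared final; 2. Must be combined with DexMaker | - |
| xposed | Run-time hook | - | Can hook methods of its own app process, of other apps, and of the system | Depends on third-party packages, poor compatibility, requires a rooted device | - |
| dexposed | Run-time hook | - | Can only hook methods in its own app process, but needs no root | 1. Depends on third-party packages, poor compatibility; 2. Supports only the Dalvik VM | - |
| epic | Run-time hook | - | Supports both the Dalvik and ART VMs | Only suitable for development and debugging; device fragmentation causes compatibility problems | - |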
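Whichever approach from the table is used, the logic the hook injects can stay very small. The following plain-Java sketch shows one possible shape for the allowlist check described in this section; StickyPolicy, its constructor argument and rewrite() are hypothetical names for illustration, and how rewrite() gets called from each onStartCommand() depends entirely on the hook approach chosen. The constant values match the public constants on android.app.Service.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper that a hook around onStartCommand() could delegate to. */
class StickyPolicy {
    // Same values as the public constants on android.app.Service.
    static final int START_STICKY_COMPATIBILITY = 0;
    static final int START_STICKY = 1;
    static final int START_NOT_STICKY = 2;

    private final Set<String> allowedStickyServices;

    /** @param allowedStickyServices class names of Services that may stay sticky */
    StickyPolicy(Set<String> allowedStickyServices) {
        this.allowedStickyServices = new HashSet<>(allowedStickyServices);
    }

    /**
     * Returns the value the hooked onStartCommand() should report to the system.
     * Sticky results from Services outside the allowlist are downgraded to
     * START_NOT_STICKY; every other result is passed through unchanged.
     */
    int rewrite(String serviceClassName, int originalResult) {
        boolean sticky = originalResult == START_STICKY
                || originalResult == START_STICKY_COMPATIBILITY;
        if (sticky && !allowedStickyServices.contains(serviceClassName)) {
            return START_NOT_STICKY;
        }
        return originalResult;
    }
}

class StickyPolicyDemo {
    public static void main(String[] args) {
        Set<String> allow = new HashSet<>();
        allow.add("com.example.PlayerService"); // hypothetical Service name
        StickyPolicy policy = new StickyPolicy(allow);
        System.out.println(policy.rewrite("com.example.PlayerService", StickyPolicy.START_STICKY)); // prints 1
        System.out.println(policy.rewrite("com.example.SyncService", StickyPolicy.START_STICKY));   // prints 2
    }
}
```

Keeping the decision in one class means the allowlist can be changed without touching the weaving code. This sketch only rewrites the sticky results discussed in this article; other return values, including START_REDELIVER_INTENT, pass through untouched.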