Dubbo Registry Layer Source Analysis: Registry Cache File Lock Failure


After every deployment, Dubbo interface calls occasionally time out, and the log shows the following error:

2019-07-23 12:52:50,932 DEBUG [ZkEventThread.java:69] : Delivering event #3600 ZkEvent[Children of /dubbo/com.zoe.base.healthservice.api.mservice.indexmanager.RecIndexEddApi/providers changed sent to com.alibaba.dubbo.remoting.zookeeper.zkclient.ZkclientZookeeperClient$2@1ced40b]
2019-07-23 12:52:50,934 DEBUG [ClientCnxn.java:818] : Reading reply sessionid:0x100010739720373, packet:: clientPath:null serverPath:null finished:false header:: 10372,3  replyHeader:: 10372,42375930,0  request:: '/dubbo/com.zoe.base.healthservice.api.mservice.indexmanager.RecIndexEddApi/providers,T  response:: s{22154898,22154898,1551930057480,1551930057480,0,1674,0,0,0,2,42375930} 
2019-07-23 12:52:50,936  WARN [AbstractRegistry.java:221] :  [DUBBO] Failed to save registry store file, cause: Can not lock the registry cache file C:\Users\Du Wenqing\.dubbo\dubbo-registry-172.16.34.101.cache, ignore and retry later, maybe multi java process use the file, please config: dubbo.registry.file=xxx.properties, dubbo version: 2.5.3, current host: 172.16.36.42
java.io.IOException: Can not lock the registry cache file C:\Users\Du Wenqing\.dubbo\dubbo-registry-172.16.34.101.cache, ignore and retry later, maybe multi java process use the file, please config: dubbo.registry.file=xxx.properties
	at com.alibaba.dubbo.registry.support.AbstractRegistry.doSaveProperties(AbstractRegistry.java:193)
	at com.alibaba.dubbo.registry.support.AbstractRegistry$SaveProperties.run(AbstractRegistry.java:150)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

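Tracing the source shows that the exception is thrown while the registry saves its local cache file, in AbstractRegistry.doSaveProperties (decompiled here):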

public void doSaveProperties(long version) {
        if (version >= this.lastCacheChanged.get()) {
            if (this.file != null) {
                Properties newProperties = new Properties();
                FileInputStream in = null;

                try {
                    if (this.file.exists()) {
                        in = new FileInputStream(this.file);
                        newProperties.load(in);
                    }
                } catch (Throwable var69) {
                    this.logger.warn("Failed to load registry store file, cause: " + var69.getMessage(), var69);
                } finally {
                    if (in != null) {
                        try {
                            in.close();
                        } catch (IOException var64) {
                            this.logger.warn(var64.getMessage(), var64);
                        }
                    }

                }

                try {
                    newProperties.putAll(this.properties);
                    File lockfile = new File(this.file.getAbsolutePath() + ".lock");
                    if (!lockfile.exists()) {
                        lockfile.createNewFile();
                    }

                    RandomAccessFile raf = new RandomAccessFile(lockfile, "rw");

                    try {
                        FileChannel channel = raf.getChannel();

                        try {
                            FileLock lock = channel.tryLock();
                            if (lock == null) {
                                throw new IOException("Can not lock the registry cache file " + this.file.getAbsolutePath() + ", ignore and retry later, maybe multi java process use the file, please config: dubbo.registry.file=xxx.properties");
                            }

                            try {
                                if (!this.file.exists()) {
                                    this.file.createNewFile();
                                }

                                FileOutputStream outputFile = new FileOutputStream(this.file);

                                try {
                                    newProperties.store(outputFile, "Dubbo Registry Cache");
                                } finally {
                                    outputFile.close();
                                }
                            } finally {
                                lock.release();
                            }
                        } finally {
                            channel.close();
                        }
                    } finally {
                        raf.close();
                    }
                } catch (Throwable var71) {
                    if (version < this.lastCacheChanged.get()) {
                        return;
                    }

                    this.registryCacheExecutor.execute(new AbstractRegistry.SaveProperties(this.lastCacheChanged.incrementAndGet()));
                    this.logger.warn("Failed to save registry store file, cause: " + var71.getMessage(), var71);
                }

            }
        }
    }

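doSaveProperties guards the cache file with an operating-system file lock taken on a sibling .lock file through java.nio's FileChannel. This FileChannel has nothing to do with the Netty channels Dubbo uses for RPC transport. The lock exists because several Java processes on the same machine can end up sharing one cache file (as the path in the log shows, the default lives under the user's .dubbo directory and appears to be named after the registry address), and unsynchronized concurrent writes could corrupt it. Contention does not cause a deadlock: the process that fails to get the lock throws the IOException shown above, logs the warning, and resubmits the save task.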
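The lock-then-write pattern is easier to see without the decompiler's nesting. The following is a minimal sketch of the same idea, not Dubbo's actual code: the class name GuardedCacheWriter and the method saveWithLock are hypothetical, and it returns false where Dubbo throws an IOException and schedules a retry.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Properties;

// Hypothetical helper, written only to illustrate the pattern used by doSaveProperties.
public class GuardedCacheWriter {

    /**
     * Writes props to cacheFile while holding an exclusive lock on "<cacheFile>.lock".
     * Returns false when another process currently holds the lock.
     */
    public static boolean saveWithLock(Path cacheFile, Properties props) {
        Path lockFile = cacheFile.resolveSibling(cacheFile.getFileName() + ".lock");
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            // Non-blocking attempt: null means another process holds the lock.
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return false; // the caller decides whether to retry later
            }
            try (OutputStream out = Files.newOutputStream(cacheFile)) {
                props.store(out, "Registry cache (sketch)");
            } finally {
                lock.release();
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Locking a separate .lock file rather than the cache file itself keeps the lock independent of the stream that rewrites the cache.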

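FileChannel.tryLock() is a non-blocking attempt, not a spin: it returns null immediately when another process already holds the lock. The no-argument form delegates to tryLock(0L, Long.MAX_VALUE, false), shown here as decompiled from the JDK's FileChannelImpl: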

public FileLock tryLock(long var1, long var3, boolean var5) throws IOException {
        this.ensureOpen();
        if (var5 && !this.readable) {
            throw new NonReadableChannelException();
        } else if (!var5 && !this.writable) {
            throw new NonWritableChannelException();
        } else {
            FileLockImpl var6 = new FileLockImpl(this, var1, var3, var5);
            FileLockTable var7 = this.fileLockTable();
            var7.add(var6);
            int var9 = this.threads.add();

            FileLockImpl var11;
            try {
                int var8;
                try {
                    this.ensureOpen();
                    var8 = this.nd.lock(this.fd, false, var1, var3, var5);
                } catch (IOException var15) {
                    var7.remove(var6);
                    throw var15;
                }

                FileLockImpl var10;
                if (var8 == -1) {
                    var7.remove(var6);
                    var10 = null;
                    return var10;
                }

                if (var8 != 1) {
                    var10 = var6;
                    return var10;
                }

                assert var5;

                var10 = new FileLockImpl(this, var1, var3, false);
                var7.replace(var6, var10);
                var11 = var10;
            } finally {
                this.threads.remove(var9);
            }

            return var11;
        }
    }

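The method proceeds in the following steps:

  • It wraps the requested lock (position, size, shared flag) in a FileLockImpl and adds it to the channel's FileLockTable.
  • It registers the calling thread in the channel's thread set (this.threads.add()) and removes it again in the finally block.
  • It calls the native this.nd.lock(...) on the file descriptor; the second argument, false, requests a non-blocking attempt.
  • If the native call returns -1, the lock is held elsewhere: the entry is removed from the FileLockTable and tryLock returns null. A return value of 1 means a shared lock was requested but an exclusive one was granted, so the table entry is replaced with an exclusive FileLockImpl. Any other value returns the lock as requested.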
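The steps above can be observed from ordinary Java code. The sketch below is an illustration under stated assumptions: TryLockProbe and its Outcome enum are hypothetical names. It relies on documented FileChannel behavior: tryLock returns null when another process holds the lock, and throws OverlappingFileLockException when the same JVM already holds an overlapping lock.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical probe class; not part of Dubbo or the JDK.
public class TryLockProbe {

    /** Possible outcomes of a single non-blocking lock attempt. */
    public enum Outcome { ACQUIRED, HELD_BY_OTHER_PROCESS, HELD_BY_THIS_JVM }

    /** Tries once to lock the file, releases immediately on success, and reports what happened. */
    public static Outcome probe(Path lockFile) {
        try (FileChannel channel = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return Outcome.HELD_BY_OTHER_PROCESS; // the case Dubbo turns into an IOException
            }
            lock.release();
            return Outcome.ACQUIRED;
        } catch (OverlappingFileLockException e) {
            return Outcome.HELD_BY_THIS_JVM; // another channel in this JVM already holds the lock
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Holds a lock on one channel and probes through a second channel in the same JVM. */
    public static Outcome probeWhileHeld(Path lockFile) {
        try (FileChannel holder = FileChannel.open(lockFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock held = holder.lock()) {
            return probe(lockFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling probeWhileHeld from a single process yields HELD_BY_THIS_JVM. Seeing HELD_BY_OTHER_PROCESS requires a second Java process holding the lock, which is the situation behind the warning in the log.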

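If a consumer calls a provider at this moment, the lock check runs first. When no lock object is returned, the cache file cannot be written, so the save fails and the process carries on with its existing local cache. The debug session looks like this: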

index2.png

index4.png

index5.png

The problem is caused by contention for the file lock, so having each service module keep its own cache file avoids it.

Concretely, add the file attribute to the registry element in the provider's XML configuration:

<dubbo:registry id="zkcenter" protocol="zookeeper" address="${dubbo.zk_address}" file="${catalina.home}/dubbo-registry/dubbo-registry.properties"/>
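The warning in the log also suggests setting dubbo.registry.file. Below is a minimal sketch of choosing a distinct cache path per module and publishing it as a system property. It assumes that Dubbo honors this property as the warning message implies and that it is set before the registry is created. RegistryCacheLocation, cacheFileFor, apply, and the module names are hypothetical.

```java
import java.io.File;

// Hypothetical helper; only the property name "dubbo.registry.file" comes from the log message.
public class RegistryCacheLocation {

    /** Builds a cache file path that is unique per module under baseDir/dubbo-registry. */
    public static String cacheFileFor(String baseDir, String moduleName) {
        return new File(new File(baseDir, "dubbo-registry"), moduleName + ".properties").getPath();
    }

    /**
     * Publishes the per-module path as a system property and returns it.
     * Assumption: this runs before Dubbo initializes its registry.
     */
    public static String apply(String baseDir, String moduleName) {
        String path = cacheFileFor(baseDir, moduleName);
        System.setProperty("dubbo.registry.file", path);
        return path;
    }
}
```

For example, RegistryCacheLocation.apply(System.getProperty("user.home"), "order-service") gives that module a file no other module writes to, so the processes no longer compete for the same lock.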