MMKV分析

5,894 阅读14分钟

前言

MMKV是腾讯开源的一个基于mmap的高性能通用key-value组件。Android传统的key-value读写工具即是SharedPreferences,轻量级、简单、易用是它的优点,但是也存在性能差、不支持多进程通信等问题。MMKV 使用了mmap内存映射文件,将内存块映射到文件,读写文件就像读写内存一样,从而性能更高,因此也就可以作为SharedPreferences的替代品。

SharedPreferences的缺点

Android 传统的SharedPreferences虽然简单易用,但是也存在一些问题:

  • 全量更新,写入效率低下。

  • 每次都需要将所有的数据加载到内存,如果存储大量数据,会占用很多的内存。

  • 多进程情况下使用安全性问题。没有合适的机制防止多进程更新所造成的冲突,官方因此不推荐多进程模式下使用,已经废弃了MODE_MULTI_PROCESS 这个标记位。

    而MMKV正是解决了上述的问题。

MMKV 实现原理

mmap内存映射文件

MMKV的核心原理即是内存映射。mmap是linux提供出来作为应用级内存映射的工具。mmap函数定义如下:

void* mmap(void* __addr, size_t __size, int __prot, int __flags, int __fd, off_t __offset);

传入起始地址内存起始地址addr,创建size大小的内存区域,将文件描述符fd关联的文件映射到这个区域,从文件的offset偏移量开始。这里的prot是访问权限,可以指定为PROT_EXEC、PROT-READ、PROT_WRITE、PROT_NONE,分别代表可执行、可读、可写、不能访问。而flags可以设置映射的对象的类型,例如:MAP_SHARED、MAP_PRIVATE、MAP_ANONYMOUS分别代表了共享对象、写时复制的私有对象、不和文件关联的匿名对象。

  • mmap和传统IO的对比

    传统IO的读操作是DMA(Direct Memory Access)先复制磁盘数据到内核缓冲区,然后CPU再将内核缓冲区的数据复制到应用程序地址空间。单次读写操作涉及到两次内存拷贝、两次上下文切换。而mmap只需要首次将磁盘数据拷贝到内核缓冲区,然后建立用户空间和内核空间的映射,之后读取这块数据就不需要内存拷贝了,也不需要上下切换。传统IO的写操作同理,也涉及到两次拷贝和两次上下文切换。而mmap只需要建立用户和内存空间的映射,往里面写数据不需要拷贝和上下文切换,系统负责将数据写回到文件中。如此可见相比传统IO,mmap减少了数据的拷贝和上下文切换,提高了IO效率。

MMKV 具体的mmap逻辑写在了MemoryFile.cpp中:

bool MemoryFile::mmap() {
    m_ptr = (char *) ::mmap(m_ptr, m_size, PROT_READ | PROT_WRITE, MAP_SHARED, m_fd, 0);
    if (m_ptr == MAP_FAILED) {
        MMKVError("fail to mmap [%s], %s", m_name.c_str(), strerror(errno));
        m_ptr = nullptr;
        return false;
    }

    return true;
}

可以看到mmap用的是共享对象的模式,并且赋予了读写的权限。

mmap听上去很完美是不是?但是实际使用中还是有缺陷,例如:mmap必须映射整页的内存,可能会造成内存的浪费,所以mmap的适用场景是大文件的频繁读写,这样就可以节省很多IO的耗时;虽然写回文件的工作由系统负责,但是并不是实时的,是定期写回到磁盘的,中间如果发生内核崩溃、断电等,还是会丢失数据,不过可以通过msync将数据同步回磁盘。

数据结构

MMKV 存取数据的协议是protobuf。protobuf协议是google出品的。在数据大小上、序列化和反序列化的速度上都比XML更有优势,数据体积小3-10倍,反序列化速度提高20-100倍。具体的protobuf协议介绍可以参考这篇博文:halfrost.com/protobuf_en…

protobuf提升了数据序列化和反序列化的速度,但是由于protobuf不支持增量的更新,那么怎么实现只添加或修改一个key-value呢?这里mmkv的做法是将数据append到内存块的末尾,那么这样就可能产生很多相同的key,而mmkv在第一次从文件加载数据到内存中时,会将后写入的新的key替换之前旧的key,保证数据是最新的。翻看mmkv源码可以看到初始化的时候,用一个map来存放从文件当中加载的键值对,那么如果存在相同的key,前面的key就会被后面的key覆盖,保证这个key的值是最新的。代码片段如下:

MMKV_Android.cpp:
MMKV::MMKV(const string &mmapID, int size, MMKVMode mode, string *cryptKey, string *rootPath){
    ...

    // sensitive zone
    {
        SCOPED_LOCK(m_sharedProcessLock);
        loadFromFile();
    }
}

MMKV_IO.cpp:
void MMKV::loadFromFile() {
    ...
    if (!m_file->isFileValid()) {
        MMKVError("file [%s] not valid", m_path.c_str());
    } else {
        ...
        // loading
        if (loadFromFile && m_actualSize > 0) {
            ...
            if (needFullWriteback) {
#ifndef MMKV_DISABLE_CRYPT
                if (m_crypter) {
                    MiniPBCoder::greedyDecodeMap(*m_dicCrypt, inputBuffer, m_crypter);
                } else
#endif
                {
                    MiniPBCoder::greedyDecodeMap(*m_dic, inputBuffer);
                }
            } else {
#ifndef MMKV_DISABLE_CRYPT
                if (m_crypter) {
                    MiniPBCoder::decodeMap(*m_dicCrypt, inputBuffer, m_crypter);
                } else
#endif
                {
                    MiniPBCoder::decodeMap(*m_dic, inputBuffer);
                }
            }
            ...
        } else {
           ...
        }
        ...
    }

    m_needLoadFromFile = false;
}

MiniPBCoder.cpp:
void MiniPBCoder::decodeOneMap(MMKVMap &dic, size_t position, bool greedy) {
    auto block = [position, this](MMKVMap &dictionary) {
        ...
        while (!m_inputData->isAtEnd()) {
            KeyValueHolder kvHolder;
            const auto &key = m_inputData->readString(kvHolder);
            if (key.length() > 0) {
                m_inputData->readData(kvHolder);
                if (kvHolder.valueSize > 0) {
                    dictionary[key] = move(kvHolder);
                } else {
                    auto itr = dictionary.find(key);
                    if (itr != dictionary.end()) {
                        dictionary.erase(itr);
                    }
                }
            }
        }
    };

    if (greedy) {
        try {
            block(dic);
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    } else {
        try {
            MMKVMap tmpDic;
            block(tmpDic);
            dic.swap(tmpDic);
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    }
}

m_dic和m_dicCrypt是一个键值对的字典,他们的数据结构定义如下:

MMKVPredef.h:

using MMKVMap = std::unordered_map<std::string, mmkv::KeyValueHolder>;
using MMKVMapCrypt = std::unordered_map<std::string, mmkv::KeyValueHolderCrypt>;

都是unordered_map,存放的key是写入内存块的key,也就是我们传入的key,存放的值是KeyValueHolder这个结构体,看下定义:

KeyValueHolder.h:

struct KeyValueHolder {
    uint16_t computedKVSize; // internal use only
    uint16_t keySize;
    uint32_t valueSize;
    uint32_t offset;

    KeyValueHolder() = default;
    KeyValueHolder(uint32_t keyLength, uint32_t valueLength, uint32_t offset);

    MMBuffer toMMBuffer(const void *basePtr) const;
};

可以看到存储了key和value的大小以及在内存中的偏移量,并没有直接存储key和value,到用的时候,凭借偏移量和大小去内存中取,节约了内存。

数据写入

举例分析String数据是如何存入的。代码片段如下:

public class MMKV implements SharedPreferences, SharedPreferences.Editor {
	...
	public boolean encode(String key, String value) {
        return encodeString(nativeHandle, key, value);
  }
  ...
  private native boolean encodeString(long handle, String key, String value);
}

native-birdge.cpp:
MMKV_JNI jboolean encodeString(JNIEnv *env, jobject, jlong handle, jstring oKey, jstring oValue) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle);
    if (kv && oKey) {
        string key = jstring2string(env, oKey);
        if (oValue) {
            string value = jstring2string(env, oValue);
            return (jboolean) kv->set(value, key);
        } else {
        		//移除这个键值对
            kv->removeValueForKey(key);
            return (jboolean) true;
        }
    }
    return (jboolean) false;
}

MMKV.cpp:
bool MMKV::set(const string &value, MMKVKey_t key) {
    if (isKeyEmpty(key)) {
        return false;
    }
    return setDataForKey(MMBuffer((void *) value.data(), value.length(), MMBufferNoCopy), key, true);
}

MMKV_IO.cpp:

bool MMKV::setDataForKey(MMBuffer &&data, MMKVKey_t key, bool isDataHolder) {
    if ((!isDataHolder && data.length() == 0) || isKeyEmpty(key)) {
        return false;
    }
    SCOPED_LOCK(m_lock);
    SCOPED_LOCK(m_exclusiveProcessLock);
    checkLoadData();

#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        ...
    } else
#endif // MMKV_DISABLE_CRYPT
    {
        auto itr = m_dic->find(key);
        if (itr != m_dic->end()) {
        		//字典中存在这个键值对,调用对应append函数
            auto ret = appendDataWithKey(data, itr->second, isDataHolder);
            if (!ret.first) {
                return false;
            }
            //更新最先找到的key&value
            itr->second = std::move(ret.second);
        } else {
        		//字典中不存在这个键值对,调用对应append函数
            auto ret = appendDataWithKey(data, key, isDataHolder);
            if (!ret.first) {
                return false;
            }
            m_dic->emplace(key, std::move(ret.second));
        }
    }
    m_hasFullWriteback = false;
#ifdef MMKV_APPLE
    [key retain];
#endif
    return true;
}

//字典中已经存在这个键值对所调用的append函数
KVHolderRet_t MMKV::appendDataWithKey(const MMBuffer &data, const KeyValueHolder &kvHolder, bool isDataHolder) {
    SCOPED_LOCK(m_exclusiveProcessLock);

    uint32_t keyLength = kvHolder.keySize;
    // size needed to encode the key
    size_t rawKeySize = keyLength + pbRawVarint32Size(keyLength);

    // ensureMemorySize() might change kvHolder.offset, so have to do it early
    {
        auto valueLength = static_cast<uint32_t>(data.length());
        if (isDataHolder) {
            valueLength += pbRawVarint32Size(valueLength);
        }
        auto size = rawKeySize + valueLength + pbRawVarint32Size(valueLength);
        bool hasEnoughSize = ensureMemorySize(size);
        if (!hasEnoughSize) {
            return make_pair(false, KeyValueHolder());
        }
    }
    auto basePtr = (uint8_t *) m_file->getMemory() + Fixed32Size;
    MMBuffer keyData(basePtr + kvHolder.offset, rawKeySize, MMBufferNoCopy);

    return doAppendDataWithKey(data, keyData, isDataHolder, keyLength);
}

//字典中不存在这个键值对时所调用的append函数
KVHolderRet_t MMKV::appendDataWithKey(const MMBuffer &data, MMKVKey_t key, bool isDataHolder) {
#ifdef MMKV_APPLE
    auto oData = [key dataUsingEncoding:NSUTF8StringEncoding];
    auto keyData = MMBuffer(oData, MMBufferNoCopy);
#else
    auto keyData = MMBuffer((void *) key.data(), key.size(), MMBufferNoCopy);
#endif
    return doAppendDataWithKey(data, keyData, isDataHolder, static_cast<uint32_t>(keyData.length()));
}

//两种情况都会调用到doAppendDataWithKey函数
KVHolderRet_t
MMKV::doAppendDataWithKey(const MMBuffer &data, const MMBuffer &keyData, bool isDataHolder, uint32_t originKeyLength) {
    ...
    try {
        if (isKeyEncoded) {
            m_output->writeRawData(keyData);
        } else {
            m_output->writeData(keyData);
        }
        if (isDataHolder) {
            m_output->writeRawVarint32((int32_t) valueLength);
        }
        m_output->writeData(data); // note: write size of data
    } catch (std::exception &e) {
        MMKVError("%s", e.what());
        return make_pair(false, KeyValueHolder());
    }

    auto offset = static_cast<uint32_t>(m_actualSize);
    auto ptr = (uint8_t *) m_file->getMemory() + Fixed32Size + m_actualSize;
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        m_crypter->encrypt(ptr, ptr, size);
    }
#endif
    m_actualSize += size;
    updateCRCDigest(ptr, size);

    return make_pair(true, KeyValueHolder(originKeyLength, valueLength, offset));
}

CodedOutputData.cpp:
void CodedOutputData::writeRawData(const MMBuffer &data) {
    size_t numberOfBytes = data.length();
    if (m_position + numberOfBytes > m_size) {
        auto msg = "m_position: " + to_string(m_position) + ", numberOfBytes: " + to_string(numberOfBytes) +
                   ", m_size: " + to_string(m_size);
        throw out_of_range(msg);
    }
    memcpy(m_ptr + m_position, data.getPtr(), numberOfBytes);
    m_position += numberOfBytes;
}

我们看最后一个CodedOutputData.cpp中的writeRawData方法,数据都被写入到内存块末尾了。m_position是一个写入数据的位置偏移,初始值为m_actualSize即文件的真实大小,这就定位到文件末尾了。再看MMKV_IO.cpp中的setDataForKey方法,它先根据key去找map中的元素,如果找到了,则将keydata的指针定位到这里,为的是取这里的key值和更新m_dic或m_dicCrypt字典中的键值对。看这句代码:itr->second = std::move(ret.second); 用了一个巧妙的方式,用move函数,用移动语义减少了内存的拷贝。对应的,如果没有找到则直接加入内存块的尾部,并将键值对添加到m_dic或m_dicCrypt中。

数据取出

再来看看String类型的数据如何取出。片段代码如下:

MMKV.java:

public String decodeString(String key) {
    return decodeString(nativeHandle, key, null);
}

private native String decodeString(long handle, String key, String defaultValue);

native-bridge.cpp:

MMKV_JNI jstring decodeString(JNIEnv *env, jobject obj, jlong handle, jstring oKey, jstring oDefaultValue) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle);
    if (kv && oKey) {
        string key = jstring2string(env, oKey);
        string value;
        bool hasValue = kv->getString(key, value);
        if (hasValue) {
            return string2jstring(env, value);
        }
    }
    return oDefaultValue;
}

MMKV_IO.cpp:

bool MMKV::getString(MMKVKey_t key, string &result) {
    if (isKeyEmpty(key)) {
        return false;
    }
    SCOPED_LOCK(m_lock);
    auto data = getDataForKey(key);
    if (data.length() > 0) {
        try {
            CodedInputData input(data.getPtr(), data.length());
            result = input.readString();
            return true;
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    }
    return false;
}

MMBuffer MMKV::getDataForKey(MMKVKey_t key) {
    checkLoadData();
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        auto itr = m_dicCrypt->find(key);
        if (itr != m_dicCrypt->end()) {
            auto basePtr = (uint8_t *) (m_file->getMemory()) + Fixed32Size;
            return itr->second.toMMBuffer(basePtr, m_crypter);
        }
    } else
#endif
    {
        auto itr = m_dic->find(key);
        if (itr != m_dic->end()) {
            auto basePtr = (uint8_t *) (m_file->getMemory()) + Fixed32Size;
            return itr->second.toMMBuffer(basePtr);
        }
    }
    MMBuffer nan;
    return nan;
}

KeyValueHolder.cpp:

MMBuffer KeyValueHolder::toMMBuffer(const void *basePtr) const {
    auto realPtr = (uint8_t *) basePtr + offset;
    realPtr += computedKVSize;
    return MMBuffer(realPtr, valueSize, MMBufferNoCopy);
}

CodedInputData.cpp:
string CodedInputData::readString() {
    int32_t size = readRawVarint32();
    if (size < 0) {
        throw length_error("InvalidProtocolBuffer negativeSize");
    }

    auto s_size = static_cast<size_t>(size);
    if (s_size <= m_size - m_position) {
        string result((char *) (m_ptr + m_position), s_size);
        m_position += s_size;
        return result;
    } else {
        throw out_of_range("InvalidProtocolBuffer truncatedMessage");
    }
}

代码比较简单。逻辑就是取出m_dic或m_dicCrypt字典中元素,即KeyValueHolder,将它转为MMBuffer,然后转为CodedInputData。调用里面的readString方法,根据指针和偏移量读取内存块中的数据。

扩容机制

MMKV的初始容量如果传入的size小于内存页大小,那容量就默认为一个内存页大小了。随着数据不断的插入,会伴随超出容量的情况,这时就需要扩容了。MMKV的扩容逻辑如下:

MMKV_IO.cpp:

// since we use append mode, when -[setData: forKey:] many times, space may not be enough
// try a full rewrite to make space
bool MMKV::ensureMemorySize(size_t newSize) {
    if (!isFileValid()) {
        MMKVWarning("[%s] file not valid", m_mmapID.c_str());
        return false;
    }

    if (newSize >= m_output->spaceLeft() || (m_crypter ? m_dicCrypt->empty() : m_dic->empty())) {
        // try a full rewrite to make space
        auto fileSize = m_file->getFileSize();
        auto preparedData = m_crypter ? prepareEncode(*m_dicCrypt) : prepareEncode(*m_dic);
        auto sizeOfDic = preparedData.second;
        size_t lenNeeded = sizeOfDic + Fixed32Size + newSize;
        size_t dicCount = m_crypter ? m_dicCrypt->size() : m_dic->size();
        size_t avgItemSize = lenNeeded / std::max<size_t>(1, dicCount);
        size_t futureUsage = avgItemSize * std::max<size_t>(8, (dicCount + 1) / 2);
        // 1. no space for a full rewrite, double it
        // 2. or space is not large enough for future usage, double it to avoid frequently full rewrite
        if (lenNeeded >= fileSize || (lenNeeded + futureUsage) >= fileSize) {
            size_t oldSize = fileSize;
            do {
                fileSize *= 2;
            } while (lenNeeded + futureUsage >= fileSize);
            MMKVInfo("extending [%s] file size from %zu to %zu, incoming size:%zu, future usage:%zu", m_mmapID.c_str(),
                     oldSize, fileSize, newSize, futureUsage);

            // if we can't extend size, rollback to old state
            if (!m_file->truncate(fileSize)) {
                return false;
            }

            // check if we fail to make more space
            if (!isFileValid()) {
                MMKVWarning("[%s] file not valid", m_mmapID.c_str());
                return false;
            }
        }
        return doFullWriteBack(move(preparedData), nullptr);
    }
    return true;
}

逻辑就是加上新元素的大小如果超过文件大小,或者这个大小加上未来可能使用到的大小(字典一半的大小或8个元素的大小)超过了文件大小,则double文件大小fileSize,并将数据写回。这里看下doFullWriteBack这个函数:

MMKV_IO.cpp:

bool MMKV::doFullWriteBack(pair<MMBuffer, size_t> preparedData, AESCrypt *newCrypter) {
    auto ptr = (uint8_t *) m_file->getMemory();
    auto totalSize = preparedData.second;
#ifdef MMKV_IOS
    auto ret = guardForBackgroundWriting(ptr + Fixed32Size, totalSize);
    if (!ret.first) {
        return false;
    }
#endif

#ifndef MMKV_DISABLE_CRYPT
    uint8_t newIV[AES_KEY_LEN];
    auto decrypter = m_crypter;
    auto encrypter = (newCrypter == InvalidCryptPtr) ? nullptr : (newCrypter ? newCrypter : m_crypter);
    if (encrypter) {
        AESCrypt::fillRandomIV(newIV);
        encrypter->resetIV(newIV, sizeof(newIV));
    }
#endif

    delete m_output;
    m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        memmoveDictionary(*m_dicCrypt, m_output, ptr, decrypter, encrypter, preparedData);
    } else {
#else
    {
        auto encrypter = m_crypter;
#endif
        memmoveDictionary(*m_dic, m_output, ptr, encrypter, totalSize);
    }

    m_actualSize = totalSize;
#ifndef MMKV_DISABLE_CRYPT
    if (encrypter) {
        recaculateCRCDigestWithIV(newIV);
    } else
#endif
    {
        recaculateCRCDigestWithIV(nullptr);
    }
    m_hasFullWriteback = true;
    // make sure lastConfirmedMetaInfo is saved
    sync(MMKV_SYNC);
    return true;
}

// we don't need to really serialize the dictionary, just reuse what's already in the file
static void
memmoveDictionary(MMKVMap &dic, CodedOutputData *output, uint8_t *ptr, AESCrypt *encrypter, size_t totalSize) {
    auto originOutputPtr = output->curWritePointer();
    // make space to hold the fake size of dictionary's serialization result
    auto writePtr = originOutputPtr + ItemSizeHolderSize;
    // reuse what's already in the file
    if (!dic.empty()) {
        // sort by offset
        vector<KeyValueHolder *> vec;
        vec.reserve(dic.size());
        for (auto &itr : dic) {
            vec.push_back(&itr.second);
        }
        //按偏移大小排序
        sort(vec.begin(), vec.end(), [](const auto &left, const auto &right) { return left->offset < right->offset; });

        // merge nearby items to make memmove quicker
        vector<pair<uint32_t, uint32_t>> dataSections; // pair(offset, size)
        dataSections.emplace_back(vec.front()->offset, vec.front()->computedKVSize + vec.front()->valueSize);
        for (size_t index = 1, total = vec.size(); index < total; index++) {
            auto kvHolder = vec[index];
            auto &lastSection = dataSections.back();
            if (kvHolder->offset == lastSection.first + lastSection.second) {
                lastSection.second += kvHolder->computedKVSize + kvHolder->valueSize;
            } else {
                dataSections.emplace_back(kvHolder->offset, kvHolder->computedKVSize + kvHolder->valueSize);
            }
        }
        // do the move
        auto basePtr = ptr + Fixed32Size;
        for (auto &section : dataSections) {
            // memmove() should handle this well: src == dst
            memmove(writePtr, basePtr + section.first, section.second);
            writePtr += section.second;
        }
        // update offset
        if (!encrypter) {
            auto offset = ItemSizeHolderSize;
            for (auto kvHolder : vec) {
                kvHolder->offset = offset;
                offset += kvHolder->computedKVSize + kvHolder->valueSize;
            }
        }
    }
    // hold the fake size of dictionary's serialization result
    output->writeRawVarint32(ItemSizeHolder);
    auto writtenSize = static_cast<size_t>(writePtr - originOutputPtr);
#ifndef MMKV_DISABLE_CRYPT
    if (encrypter) {
        encrypter->encrypt(originOutputPtr, originOutputPtr, writtenSize);
    }
#endif
    assert(writtenSize == totalSize);
    output->seek(writtenSize - ItemSizeHolderSize);
}

可以看到新建了一个CodedOutputData,指针定位到文件开头,并且将字典中的元素按照偏移从小到大排序,然后通过memmove将元素一个个写回。最后输出指针定位到新写入数据之后的末尾位置。经过这一番处理,数据就达到了合并、重整的目的。扩容操作在写入数据之前都会判断,如果容量不够,则执行扩容。

处理同步

MMKV内部定义了几种锁来处理不同场景下的同步问题:

mmkv::ThreadLock *m_lock; //线程锁,处理多线程同步
mmkv::FileLock *m_fileLock;//文件锁,处理多进程同步,被下面两种锁包装了一下
mmkv::InterProcessLock *m_sharedProcessLock;//共享锁,包装了上面的文件锁,读操作之前使用
mmkv::InterProcessLock *m_exclusiveProcessLock;//独占锁,包装了上面的文件锁,写操作之前使用

先来看看ThreadLock:

class ThreadLock {
private:
#if MMKV_USING_PTHREAD
    pthread_mutex_t m_lock;
#else
    CRITICAL_SECTION m_lock;
#endif
...
}

ThreadLock::ThreadLock() {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

    pthread_mutex_init(&m_lock, &attr);

    pthread_mutexattr_destroy(&attr);
}

可以看到就是对互斥锁pthread_mutex_t的封装,并且使用的是可重入的模式。这个用于线程间同步,在写入和读取数据之前加锁,保证数据的同步。在使用锁的时候用了一个ScopedLock包装类来操作锁,这个类的对象构造时加锁,析构时解锁,简化了操作。代码如下:

class ScopedLock {
    T *m_lock;

    void lock() {
        if (m_lock) {
            m_lock->lock();
        }
    }

    void unlock() {
        if (m_lock) {
            m_lock->unlock();
        }
    }

public:
    explicit ScopedLock(T *oLock) : m_lock(oLock) {
        MMKV_ASSERT(m_lock);
        lock();
    }

    ~ScopedLock() {
        unlock();
        m_lock = nullptr;
    }

    // just forbid it for possibly misuse
    explicit ScopedLock(const ScopedLock<T> &other) = delete;
    ScopedLock &operator=(const ScopedLock<T> &other) = delete;
};

与此同时,在读写操作的时候还会再加入独占锁(写锁)和共享锁(读锁)来处理多进程访问的同步问题。MMKV这里使用的是文件锁flock,对它进行包装。内部维护了读写的计数器,之所以这样做是因为文件锁不能实现递归,虽然可以多次加锁,但是解锁操作一次就都解锁了。再一个是为了处理锁升级和降级的情况,即读锁升级为写锁,写锁降级为读锁。对于升级,方案是先解读锁在加写锁,对于降级则是加个读锁,将写锁降级。代码片段如下:

bool FileLock::doLock(LockType lockType, bool wait) {
    if (!isFileLockValid()) {
        return false;
    }
    bool unLockFirstIfNeeded = false;

    if (lockType == SharedLockType) {
        // don't want shared-lock to break any existing locks
        if (m_sharedLockCount > 0 || m_exclusiveLockCount > 0) {
            m_sharedLockCount++;
            return true;
        }
    } else {
        // don't want exclusive-lock to break existing exclusive-locks
        if (m_exclusiveLockCount > 0) {
            m_exclusiveLockCount++;
            return true;
        }
        // prevent deadlock
        if (m_sharedLockCount > 0) {
            unLockFirstIfNeeded = true;
        }
    }

    auto ret = platformLock(lockType, wait, unLockFirstIfNeeded);
    if (ret) {
        if (lockType == SharedLockType) {
            m_sharedLockCount++;
        } else {
            m_exclusiveLockCount++;
        }
    }
    return ret;
}

bool FileLock::platformLock(LockType lockType, bool wait, bool unLockFirstIfNeeded) {
#    ifdef MMKV_ANDROID
    if (m_isAshmem) {
        return ashmemLock(lockType, wait, unLockFirstIfNeeded);
    }
#    endif
    auto realLockType = LockType2FlockType(lockType);
    auto cmd = wait ? realLockType : (realLockType | LOCK_NB);
    if (unLockFirstIfNeeded) {
        // try lock
        auto ret = flock(m_fd, realLockType | LOCK_NB);
        if (ret == 0) {
            return true;
        }
        // let's be gentleman: unlock my shared-lock to prevent deadlock
        ret = flock(m_fd, LOCK_UN);
        if (ret != 0) {
            MMKVError("fail to try unlock first fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
        }
    }

    auto ret = flock(m_fd, cmd);
    if (ret != 0) {
        MMKVError("fail to lock fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
        // try recover my shared-lock
        if (unLockFirstIfNeeded) {
            ret = flock(m_fd, LockType2FlockType(SharedLockType));
            if (ret != 0) {
                // let's hope this never happen
                MMKVError("fail to recover shared-lock fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
            }
        }
        return false;
    } else {
        return true;
    }
}

bool FileLock::unlock(LockType lockType) {
    if (!isFileLockValid()) {
        return false;
    }
    bool unlockToSharedLock = false;

    if (lockType == SharedLockType) {
        if (m_sharedLockCount == 0) {
            return false;
        }
        // don't want shared-lock to break any existing locks
        if (m_sharedLockCount > 1 || m_exclusiveLockCount > 0) {
            m_sharedLockCount--;
            return true;
        }
    } else {
        if (m_exclusiveLockCount == 0) {
            return false;
        }
        if (m_exclusiveLockCount > 1) {
            m_exclusiveLockCount--;
            return true;
        }
        // restore shared-lock when all exclusive-locks are done
        if (m_sharedLockCount > 0) {
            unlockToSharedLock = true;
        }
    }

    auto ret = platformUnLock(unlockToSharedLock);
    if (ret) {
        if (lockType == SharedLockType) {
            m_sharedLockCount--;
        } else {
            m_exclusiveLockCount--;
        }
    }
    return ret;
}

bool FileLock::platformUnLock(bool unlockToSharedLock) {
#    ifdef MMKV_ANDROID
    if (m_isAshmem) {
        return ashmemUnLock(unlockToSharedLock);
    }
#    endif
    int cmd = unlockToSharedLock ? LOCK_SH : LOCK_UN;
    auto ret = flock(m_fd, cmd);
    if (ret != 0) {
        MMKVError("fail to unlock fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
        return false;
    } else {
        return true;
    }
}

另外除了锁,MMKV内部通过维护文件大小,在读写操作之前检查现在的文件大小和内部维护的大小是否匹配,如果不匹配说明发生了多进程的写入。另一方面还维护了一个m_sequence的序列号,当内存发生重整时,会递增这个序列号,一旦自己进程的序列号和文件的序列号不一致,则说明发生了其他进程的重整操作。这两种情况都需要重新加载:

MMKV_IO.cpp:

void MMKV::checkLoadData() {
    ...
    if (m_metaInfo->m_sequence != metaInfo.m_sequence) {
        MMKVInfo("[%s] oldSeq %u, newSeq %u", m_mmapID.c_str(), m_metaInfo->m_sequence, metaInfo.m_sequence);
        SCOPED_LOCK(m_sharedProcessLock);

        clearMemoryCache();
        loadFromFile();
        notifyContentChanged();
    } else if (m_metaInfo->m_crcDigest != metaInfo.m_crcDigest) {
        MMKVDebug("[%s] oldCrc %u, newCrc %u, new actualSize %u", m_mmapID.c_str(), m_metaInfo->m_crcDigest,
                  metaInfo.m_crcDigest, metaInfo.m_actualSize);
        SCOPED_LOCK(m_sharedProcessLock);

        size_t fileSize = m_file->getActualFileSize();
        if (m_file->getFileSize() != fileSize) {
            MMKVInfo("file size has changed [%s] from %zu to %zu", m_mmapID.c_str(), m_file->getFileSize(), fileSize);
            clearMemoryCache();
            loadFromFile();
        } else {
            partialLoadFromFile();
        }
        notifyContentChanged();
    }
}

数据加密

MMKV可以选择使用加密的模式,即将数据加密存储,使用的是AES加密,需要传密钥进去,但是需要自己保存好密钥,在下次初始化时传入进去。MMKV处理的比较灵活,可以变换密钥,或者明文和密文切换。在切换之后,需要重新将数据写入一遍。代码片段如下:

bool MMKV::reKey(const string &cryptKey) {
    ...
    if (m_crypter) {
        if (cryptKey.length() > 0) {
            string oldKey = this->cryptKey();
            if (cryptKey == oldKey) {
                return true;
            } else {
                // change encryption key
                MMKVInfo("reKey with new aes key");
                auto newCrypt = new AESCrypt(cryptKey.data(), cryptKey.length());
                ret = fullWriteback(newCrypt);
                if (ret) {
                    delete m_crypter;
                    m_crypter = newCrypt;
                } else {
                    delete newCrypt;
                }
            }
        } else {
            // decryption to plain text
            MMKVInfo("reKey to no aes key");
            ret = fullWriteback(InvalidCryptPtr);
            if (ret) {
                delete m_crypter;
                m_crypter = nullptr;
                if (!m_dic) {
                    m_dic = new MMKVMap();
                }
            }
        }
    } else {
        if (cryptKey.length() > 0) {
            // transform plain text to encrypted text
            MMKVInfo("reKey to a aes key");
            auto newCrypt = new AESCrypt(cryptKey.data(), cryptKey.length());
            ret = fullWriteback(newCrypt);
            if (ret) {
                m_crypter = newCrypt;
                if (!m_dicCrypt) {
                    m_dicCrypt = new MMKVMapCrypt();
                }
            } else {
                delete newCrypt;
            }
        } else {
            return true;
        }
    }
    // m_dic or m_dicCrypt is not valid after reKey
    if (ret) {
        clearMemoryCache();
    }
    return ret;
}

性能对比

在单进程和多进程的写入和读取性能上,MMKV都是大幅超过了Android传统的SharedPreferences和SQLite。这得益于mmap和增量写入。具体测试结果参考官方:github.com/Tencent/MMK…

总结

MMKV作为一种高性能大量数据的存储组件,对比Android传统的存储方式SharedPreferences和SQLite确实有不少优势。核心是使用mmap内存映射文件,对比传统IO,在性能上有很大优势,并且将读写文件的操作变得和操作内存一样简单。翻看源码,有不少优秀的设计点。比如增量写入,重整内存,通过文件大小校验对多进程操作感知,多进程读写锁等等。但它的缺点是可能造成内存的浪费,因为必须映射内存页的整数倍,如果只存储很少量的数据,则显得大材小用。因此,可以作为一种数据存储的选择方案,在一些需要大量存储数据场景时,替代SharedPreferences。