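Introduction
MMKV is a high-performance, general-purpose key-value component open-sourced by Tencent and built on mmap. The traditional key-value storage tool on Android is SharedPreferences. It is lightweight, simple, and easy to use, but it also performs poorly and does not support multi-process access. MMKV uses mmap to map a file into memory, so reading and writing the file works like reading and writing memory. That gives it better performance, which makes it a viable replacement for SharedPreferences.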
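Drawbacks of SharedPreferences
Android's traditional SharedPreferences is simple and easy to use, but it has several problems:
- Every update rewrites the whole file, so writes are inefficient.
- All data has to be loaded into memory every time, so storing a large amount of data consumes a lot of memory.
- It is not safe across processes. There is no suitable mechanism to prevent conflicts caused by updates from multiple processes, so the official documentation discourages multi-process use and has deprecated the MODE_MULTI_PROCESS flag.
MMKV addresses exactly these problems.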
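How MMKV works
mmap memory-mapped files
The core principle of MMKV is memory mapping. mmap is the system call that Linux provides for mapping files into an application's address space. It is declared as follows: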
void* mmap(void* __addr, size_t __size, int __prot, int __flags, int __fd, off_t __offset);
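The call creates a memory region of size bytes, preferably starting at the address addr (pass NULL to let the kernel choose), and maps the file referred to by the file descriptor fd into that region, starting at offset bytes into the file. prot is the access permission and can be PROT_EXEC, PROT_READ, PROT_WRITE, or PROT_NONE, meaning executable, readable, writable, and inaccessible respectively. flags sets the type of the mapped object, for example MAP_SHARED, MAP_PRIVATE, and MAP_ANONYMOUS, which stand for a shared object, a copy-on-write private object, and an anonymous object not associated with any file.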
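mmap compared with traditional I/O
In traditional I/O, a read first has DMA (Direct Memory Access) copy the disk data into a kernel buffer, and then the CPU copies the data from the kernel buffer into the application's address space. A single read therefore involves two copies and two context switches. With mmap, the disk data only has to be brought into the kernel buffer once. After the mapping between user space and kernel space is established, reading that data again needs no further copy and no system call (the first access to a page still takes a page fault). Writes in traditional I/O are similar and also involve two copies and two context switches. With mmap, once the mapping is established, writing into it needs no extra copy or system call, and the system is responsible for writing the data back to the file. Compared with traditional I/O, mmap thus reduces data copies and context switches and improves I/O efficiency.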
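MMKV's actual mmap logic lives in MemoryFile.cpp:
bool MemoryFile::mmap() {
    m_ptr = (char *) ::mmap(m_ptr, m_size, PROT_READ | PROT_WRITE, MAP_SHARED, m_fd, 0);
    if (m_ptr == MAP_FAILED) {
        MMKVError("fail to mmap [%s], %s", m_name.c_str(), strerror(errno));
        m_ptr = nullptr;
        return false;
    }
    return true;
}
As you can see, the mapping is created in shared mode (MAP_SHARED) with read and write permissions.
mmap sounds perfect, doesn't it? In practice it still has drawbacks. For example, mmap must map whole memory pages, which can waste memory, so mmap is best suited to frequent reads and writes of large files, where it saves a lot of I/O time. Also, although the system writes the data back to the file, this does not happen in real time. Dirty pages are flushed to disk periodically, so data can still be lost if the kernel crashes or the power fails in between. You can call msync to force the data to be synchronized to disk.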
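To make the page-granularity and msync points concrete, here is a minimal sketch using plain POSIX calls. It is not MMKV code: the helper names roundUpToPage, writeThroughMapping, and readBack are made up for illustration, and the file path is whatever scratch file the caller chooses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Round a requested size up to a whole number of pages (at least one page),
// because a mapping always covers whole pages.
size_t roundUpToPage(size_t size) {
    const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    if (size == 0) {
        return page;
    }
    return ((size + page - 1) / page) * page;
}

// Write `text` into the file through a MAP_SHARED mapping, then force it to disk.
bool writeThroughMapping(const std::string &path, const std::string &text) {
    int fd = open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        return false;
    }
    const size_t size = roundUpToPage(text.size());
    // The file must be at least as large as the mapping before it is touched.
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        return false;
    }
    void *ptr = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) {
        close(fd);
        return false;
    }
    memcpy(ptr, text.data(), text.size()); // a plain memory write, no write() call
    const bool ok = msync(ptr, size, MS_SYNC) == 0; // do not wait for periodic writeback
    munmap(ptr, size);
    close(fd);
    return ok;
}

// Read the first `n` bytes back with ordinary read() to confirm the file changed.
std::string readBack(const std::string &path, size_t n) {
    std::string out(n, '\0');
    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) {
        return std::string();
    }
    ssize_t got = read(fd, &out[0], n);
    close(fd);
    out.resize(got > 0 ? static_cast<size_t>(got) : 0);
    return out;
}
```

Even a ten-byte payload ends up occupying a full page here, which is the memory overhead described above.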
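Data structure
MMKV encodes stored data using protobuf, a protocol created by Google. Compared with XML, protobuf is better in both data size and serialization/deserialization speed: the data is 3 to 10 times smaller and deserialization is 20 to 100 times faster. For a detailed introduction to the protobuf protocol, see this blog post: halfrost.com/protobuf_en…
protobuf speeds up serialization and deserialization, but it does not support incremental updates, so how can a single key-value pair be added or modified? MMKV's approach is to append the data to the end of the memory block. This can leave many records with the same key, so when MMKV first loads the data from the file into memory, later records replace earlier ones for the same key, which guarantees that the data is up to date. The MMKV source shows that initialization uses a map to hold the key-value pairs loaded from the file. If the same key appears more than once, the earlier entry is overwritten by the later one, so the value for that key is always the latest. The relevant code is as follows: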
MMKV_Android.cpp:
MMKV::MMKV(const string &mmapID, int size, MMKVMode mode, string *cryptKey, string *rootPath) {
    ...
    // sensitive zone
    {
        SCOPED_LOCK(m_sharedProcessLock);
        loadFromFile();
    }
}
MMKV_IO.cpp:
void MMKV::loadFromFile() {
    ...
    if (!m_file->isFileValid()) {
        MMKVError("file [%s] not valid", m_path.c_str());
    } else {
        ...
        // loading
        if (loadFromFile && m_actualSize > 0) {
            ...
            if (needFullWriteback) {
#ifndef MMKV_DISABLE_CRYPT
                if (m_crypter) {
                    MiniPBCoder::greedyDecodeMap(*m_dicCrypt, inputBuffer, m_crypter);
                } else
#endif
                {
                    MiniPBCoder::greedyDecodeMap(*m_dic, inputBuffer);
                }
            } else {
#ifndef MMKV_DISABLE_CRYPT
                if (m_crypter) {
                    MiniPBCoder::decodeMap(*m_dicCrypt, inputBuffer, m_crypter);
                } else
#endif
                {
                    MiniPBCoder::decodeMap(*m_dic, inputBuffer);
                }
            }
            ...
        } else {
            ...
        }
        ...
    }
    m_needLoadFromFile = false;
}
MiniPBCoder.cpp:
void MiniPBCoder::decodeOneMap(MMKVMap &dic, size_t position, bool greedy) {
    auto block = [position, this](MMKVMap &dictionary) {
        ...
        while (!m_inputData->isAtEnd()) {
            KeyValueHolder kvHolder;
            const auto &key = m_inputData->readString(kvHolder);
            if (key.length() > 0) {
                m_inputData->readData(kvHolder);
                if (kvHolder.valueSize > 0) {
                    dictionary[key] = move(kvHolder);
                } else {
                    auto itr = dictionary.find(key);
                    if (itr != dictionary.end()) {
                        dictionary.erase(itr);
                    }
                }
            }
        }
    };
    if (greedy) {
        try {
            block(dic);
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    } else {
        try {
            MMKVMap tmpDic;
            block(tmpDic);
            dic.swap(tmpDic);
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    }
}
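The "last record wins" rule in the decode loop above can be shown in isolation. The following is a simplified sketch, not MMKV's implementation: records are plain (key, value) string pairs instead of encoded bytes, and the names Record, Dict, and replay are invented for the example. As in the loop above, a record with an empty value is treated as a removal.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// One appended record: (key, value). An empty value marks a removal.
using Record = std::pair<std::string, std::string>;
using Dict = std::unordered_map<std::string, std::string>;

// Replay an append-only log from oldest to newest; later records override earlier ones.
Dict replay(const std::vector<Record> &log) {
    Dict dict;
    for (const auto &rec : log) {
        if (rec.first.empty()) {
            continue; // records without a key are skipped
        }
        if (!rec.second.empty()) {
            dict[rec.first] = rec.second; // insert, or overwrite the older value
        } else {
            dict.erase(rec.first); // empty value: the key was removed
        }
    }
    return dict;
}
```

Replaying the log {name=a, age=1, name=b, age=(empty)} leaves a single entry, name=b, even though four records were appended.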
m_dic and m_dicCrypt are dictionaries of key-value pairs. Their types are defined as follows:
MMKVPredef.h:
using MMKVMap = std::unordered_map<std::string, mmkv::KeyValueHolder>;
using MMKVMapCrypt = std::unordered_map<std::string, mmkv::KeyValueHolderCrypt>;
Both are unordered_map. The map key is the key written to the memory block, that is, the key we pass in, and the mapped value is a KeyValueHolder struct. Its definition:
KeyValueHolder.h:
struct KeyValueHolder {
    uint16_t computedKVSize; // internal use only
    uint16_t keySize;
    uint32_t valueSize;
    uint32_t offset;
    KeyValueHolder() = default;
    KeyValueHolder(uint32_t keyLength, uint32_t valueLength, uint32_t offset);
    MMBuffer toMMBuffer(const void *basePtr) const;
};
As you can see, it stores the sizes of the key and value and their offset in memory rather than the key and value themselves. When the value is needed, it is fetched from memory using the offset and size, which saves memory.
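As a rough illustration of indexing by offset instead of copying values, consider the sketch below. It is a simplification under stated assumptions: Holder, append, and valueOf are hypothetical names, the buffer is a std::vector rather than a mapped file, and the key is stored raw without the varint length prefix that MMKV uses.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified counterpart of KeyValueHolder: positions only, no data.
struct Holder {
    uint32_t offset;     // where the record (key followed by value) starts
    uint16_t headerSize; // bytes to skip to reach the value (here: the key bytes)
    uint32_t valueSize;  // length of the value in bytes
};

// Append key and value to the buffer and return a holder that points at them.
Holder append(std::vector<uint8_t> &buf, const std::string &key, const std::string &value) {
    Holder h{static_cast<uint32_t>(buf.size()), static_cast<uint16_t>(key.size()),
             static_cast<uint32_t>(value.size())};
    buf.insert(buf.end(), key.begin(), key.end());
    buf.insert(buf.end(), value.begin(), value.end());
    return h;
}

// Materialize the value only when it is asked for.
std::string valueOf(const std::vector<uint8_t> &buf, const Holder &h) {
    const uint8_t *p = buf.data() + h.offset + h.headerSize;
    return std::string(reinterpret_cast<const char *>(p), h.valueSize);
}
```

The index costs a few bytes per entry regardless of how large the values are, which is the memory saving the paragraph above refers to.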
Writing data
Let's take String data as an example of how a value is stored. The code is as follows:
public class MMKV implements SharedPreferences, SharedPreferences.Editor {
    ...
    public boolean encode(String key, String value) {
        return encodeString(nativeHandle, key, value);
    }
    ...
    private native boolean encodeString(long handle, String key, String value);
}
native-bridge.cpp:
MMKV_JNI jboolean encodeString(JNIEnv *env, jobject, jlong handle, jstring oKey, jstring oValue) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle);
    if (kv && oKey) {
        string key = jstring2string(env, oKey);
        if (oValue) {
            string value = jstring2string(env, oValue);
            return (jboolean) kv->set(value, key);
        } else {
            // a null value removes this key-value pair
            kv->removeValueForKey(key);
            return (jboolean) true;
        }
    }
    return (jboolean) false;
}
MMKV.cpp:
bool MMKV::set(const string &value, MMKVKey_t key) {
    if (isKeyEmpty(key)) {
        return false;
    }
    return setDataForKey(MMBuffer((void *) value.data(), value.length(), MMBufferNoCopy), key, true);
}
MMKV_IO.cpp:
bool MMKV::setDataForKey(MMBuffer &&data, MMKVKey_t key, bool isDataHolder) {
    if ((!isDataHolder && data.length() == 0) || isKeyEmpty(key)) {
        return false;
    }
    SCOPED_LOCK(m_lock);
    SCOPED_LOCK(m_exclusiveProcessLock);
    checkLoadData();
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        ...
    } else
#endif // MMKV_DISABLE_CRYPT
    {
        auto itr = m_dic->find(key);
        if (itr != m_dic->end()) {
            // the pair already exists in the dictionary: call the matching append overload
            auto ret = appendDataWithKey(data, itr->second, isDataHolder);
            if (!ret.first) {
                return false;
            }
            // update the entry found above with the new holder
            itr->second = std::move(ret.second);
        } else {
            // the pair is not in the dictionary yet: call the matching append overload
            auto ret = appendDataWithKey(data, key, isDataHolder);
            if (!ret.first) {
                return false;
            }
            m_dic->emplace(key, std::move(ret.second));
        }
    }
    m_hasFullWriteback = false;
#ifdef MMKV_APPLE
    [key retain];
#endif
    return true;
}
// append overload called when the pair already exists in the dictionary
KVHolderRet_t MMKV::appendDataWithKey(const MMBuffer &data, const KeyValueHolder &kvHolder, bool isDataHolder) {
    SCOPED_LOCK(m_exclusiveProcessLock);
    uint32_t keyLength = kvHolder.keySize;
    // size needed to encode the key
    size_t rawKeySize = keyLength + pbRawVarint32Size(keyLength);
    // ensureMemorySize() might change kvHolder.offset, so have to do it early
    {
        auto valueLength = static_cast<uint32_t>(data.length());
        if (isDataHolder) {
            valueLength += pbRawVarint32Size(valueLength);
        }
        auto size = rawKeySize + valueLength + pbRawVarint32Size(valueLength);
        bool hasEnoughSize = ensureMemorySize(size);
        if (!hasEnoughSize) {
            return make_pair(false, KeyValueHolder());
        }
    }
    auto basePtr = (uint8_t *) m_file->getMemory() + Fixed32Size;
    MMBuffer keyData(basePtr + kvHolder.offset, rawKeySize, MMBufferNoCopy);
    return doAppendDataWithKey(data, keyData, isDataHolder, keyLength);
}
// append overload called when the pair does not exist in the dictionary
KVHolderRet_t MMKV::appendDataWithKey(const MMBuffer &data, MMKVKey_t key, bool isDataHolder) {
#ifdef MMKV_APPLE
    auto oData = [key dataUsingEncoding:NSUTF8StringEncoding];
    auto keyData = MMBuffer(oData, MMBufferNoCopy);
#else
    auto keyData = MMBuffer((void *) key.data(), key.size(), MMBufferNoCopy);
#endif
    return doAppendDataWithKey(data, keyData, isDataHolder, static_cast<uint32_t>(keyData.length()));
}
// both cases end up calling doAppendDataWithKey
KVHolderRet_t
MMKV::doAppendDataWithKey(const MMBuffer &data, const MMBuffer &keyData, bool isDataHolder, uint32_t originKeyLength) {
    ...
    try {
        if (isKeyEncoded) {
            m_output->writeRawData(keyData);
        } else {
            m_output->writeData(keyData);
        }
        if (isDataHolder) {
            m_output->writeRawVarint32((int32_t) valueLength);
        }
        m_output->writeData(data); // note: write size of data
    } catch (std::exception &e) {
        MMKVError("%s", e.what());
        return make_pair(false, KeyValueHolder());
    }
    auto offset = static_cast<uint32_t>(m_actualSize);
    auto ptr = (uint8_t *) m_file->getMemory() + Fixed32Size + m_actualSize;
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        m_crypter->encrypt(ptr, ptr, size);
    }
#endif
    m_actualSize += size;
    updateCRCDigest(ptr, size);
    return make_pair(true, KeyValueHolder(originKeyLength, valueLength, offset));
}
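CodedOutputData.cpp:
void CodedOutputData::writeRawData(const MMBuffer &data) {
    size_t numberOfBytes = data.length();
    if (m_position + numberOfBytes > m_size) {
        auto msg = "m_position: " + to_string(m_position) + ", numberOfBytes: " + to_string(numberOfBytes) +
                   ", m_size: " + to_string(m_size);
        throw out_of_range(msg);
    }
    memcpy(m_ptr + m_position, data.getPtr(), numberOfBytes);
    m_position += numberOfBytes;
}
Look at the last function, writeRawData in CodedOutputData.cpp: all data is written to the end of the memory block. m_position is the write offset. Its initial value is m_actualSize, the actual size of the data in the file, which places it right after the existing data. Now look at setDataForKey in MMKV_IO.cpp. It first looks the key up in the map. If the key is found, keyData is pointed at the existing record so that the already-encoded key bytes can be reused, and the entry in m_dic or m_dicCrypt is then updated. Note this line: itr->second = std::move(ret.second); it uses std::move so that move semantics avoid an extra copy. If the key is not found, the new record is appended to the end of the memory block and the pair is added to m_dic or m_dicCrypt.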
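The bounds-checked, length-prefixed append can be sketched on its own. This is an illustrative stand-in, not the CodedOutputData class: BoundedWriter and varint32Size are names invented here, and the only format assumption is the standard protobuf base-128 varint used as a length prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Number of bytes a 32-bit value occupies as a protobuf base-128 varint.
size_t varint32Size(uint32_t v) {
    size_t n = 1;
    while (v >= 0x80) {
        v >>= 7;
        ++n;
    }
    return n;
}

// Hypothetical writer over a fixed buffer; it throws instead of writing past the end.
class BoundedWriter {
    uint8_t *m_ptr;
    size_t m_size;
    size_t m_position = 0;

public:
    BoundedWriter(uint8_t *ptr, size_t size) : m_ptr(ptr), m_size(size) {}
    size_t position() const { return m_position; }
    size_t spaceLeft() const { return m_size - m_position; }

    void writeVarint32(uint32_t v) {
        if (varint32Size(v) > spaceLeft()) {
            throw std::out_of_range("no space left for varint");
        }
        while (v >= 0x80) {
            m_ptr[m_position++] = static_cast<uint8_t>((v & 0x7F) | 0x80); // more bytes follow
            v >>= 7;
        }
        m_ptr[m_position++] = static_cast<uint8_t>(v);
    }

    // Write the length as a varint, then the raw bytes; nothing is written if it cannot fit.
    void writeData(const std::string &data) {
        const uint32_t length = static_cast<uint32_t>(data.size());
        if (varint32Size(length) + data.size() > spaceLeft()) {
            throw std::out_of_range("no space left for data");
        }
        writeVarint32(length);
        memcpy(m_ptr + m_position, data.data(), data.size());
        m_position += data.size();
    }
};
```

Checking the space before touching the buffer keeps a failed append from leaving a half-written record behind, which is why the capacity check happens before the write in the code above.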
Reading data
Now let's see how String data is read back. The code is as follows:
MMKV.java:
public String decodeString(String key) {
    return decodeString(nativeHandle, key, null);
}
private native String decodeString(long handle, String key, String defaultValue);
native-bridge.cpp:
MMKV_JNI jstring decodeString(JNIEnv *env, jobject obj, jlong handle, jstring oKey, jstring oDefaultValue) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle);
    if (kv && oKey) {
        string key = jstring2string(env, oKey);
        string value;
        bool hasValue = kv->getString(key, value);
        if (hasValue) {
            return string2jstring(env, value);
        }
    }
    return oDefaultValue;
}
MMKV_IO.cpp:
bool MMKV::getString(MMKVKey_t key, string &result) {
    if (isKeyEmpty(key)) {
        return false;
    }
    SCOPED_LOCK(m_lock);
    auto data = getDataForKey(key);
    if (data.length() > 0) {
        try {
            CodedInputData input(data.getPtr(), data.length());
            result = input.readString();
            return true;
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    }
    return false;
}
MMBuffer MMKV::getDataForKey(MMKVKey_t key) {
    checkLoadData();
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        auto itr = m_dicCrypt->find(key);
        if (itr != m_dicCrypt->end()) {
            auto basePtr = (uint8_t *) (m_file->getMemory()) + Fixed32Size;
            return itr->second.toMMBuffer(basePtr, m_crypter);
        }
    } else
#endif
    {
        auto itr = m_dic->find(key);
        if (itr != m_dic->end()) {
            auto basePtr = (uint8_t *) (m_file->getMemory()) + Fixed32Size;
            return itr->second.toMMBuffer(basePtr);
        }
    }
    MMBuffer nan;
    return nan;
}
KeyValueHolder.cpp:
MMBuffer KeyValueHolder::toMMBuffer(const void *basePtr) const {
    auto realPtr = (uint8_t *) basePtr + offset;
    realPtr += computedKVSize;
    return MMBuffer(realPtr, valueSize, MMBufferNoCopy);
}
CodedInputData.cpp:
string CodedInputData::readString() {
    int32_t size = readRawVarint32();
    if (size < 0) {
        throw length_error("InvalidProtocolBuffer negativeSize");
    }
    auto s_size = static_cast<size_t>(size);
    if (s_size <= m_size - m_position) {
        string result((char *) (m_ptr + m_position), s_size);
        m_position += s_size;
        return result;
    } else {
        throw out_of_range("InvalidProtocolBuffer truncatedMessage");
    }
}
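The code is fairly simple. It looks up the element in m_dic or m_dicCrypt, that is, the KeyValueHolder, converts it to an MMBuffer, and wraps that in a CodedInputData. Its readString method then reads the data from the memory block using the pointer and offset.
Growing the file
If the size passed in is smaller than one memory page, MMKV's initial capacity defaults to one page. As data keeps being inserted, the capacity will eventually be exceeded and the file has to grow. MMKV's expansion logic is as follows: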
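For readers who want to see the read side end to end, here is a small self-contained sketch of decoding a length-prefixed string from a byte buffer. It is not MMKV's CodedInputData: BoundedReader is a hypothetical name, and the only format assumption is a protobuf-style varint length followed by the raw bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical reader over a fixed buffer; it throws instead of reading past the end.
class BoundedReader {
    const uint8_t *m_ptr;
    size_t m_size;
    size_t m_position = 0;

public:
    BoundedReader(const uint8_t *ptr, size_t size) : m_ptr(ptr), m_size(size) {}
    bool isAtEnd() const { return m_position == m_size; }

    // Decode a base-128 varint: 7 payload bits per byte, high bit set means "more follows".
    uint32_t readVarint32() {
        uint32_t result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            if (m_position >= m_size) {
                throw std::out_of_range("truncated varint");
            }
            const uint8_t byte = m_ptr[m_position++];
            result |= static_cast<uint32_t>(byte & 0x7F) << shift;
            if ((byte & 0x80) == 0) {
                return result;
            }
        }
        throw std::length_error("malformed varint");
    }

    // Read the length prefix, check it against the remaining bytes, then copy the string out.
    std::string readString() {
        const size_t size = readVarint32();
        if (size > m_size - m_position) {
            throw std::out_of_range("truncated message");
        }
        std::string result(reinterpret_cast<const char *>(m_ptr + m_position), size);
        m_position += size;
        return result;
    }
};
```

The length check against the remaining bytes mirrors the truncatedMessage guard in readString above: a corrupted length prefix produces an exception rather than an out-of-bounds read.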
MMKV_IO.cpp:
// since we use append mode, when -[setData: forKey:] many times, space may not be enough
// try a full rewrite to make space
bool MMKV::ensureMemorySize(size_t newSize) {
    if (!isFileValid()) {
        MMKVWarning("[%s] file not valid", m_mmapID.c_str());
        return false;
    }
    if (newSize >= m_output->spaceLeft() || (m_crypter ? m_dicCrypt->empty() : m_dic->empty())) {
        // try a full rewrite to make space
        auto fileSize = m_file->getFileSize();
        auto preparedData = m_crypter ? prepareEncode(*m_dicCrypt) : prepareEncode(*m_dic);
        auto sizeOfDic = preparedData.second;
        size_t lenNeeded = sizeOfDic + Fixed32Size + newSize;
        size_t dicCount = m_crypter ? m_dicCrypt->size() : m_dic->size();
        size_t avgItemSize = lenNeeded / std::max<size_t>(1, dicCount);
        size_t futureUsage = avgItemSize * std::max<size_t>(8, (dicCount + 1) / 2);
        // 1. no space for a full rewrite, double it
        // 2. or space is not large enough for future usage, double it to avoid frequently full rewrite
        if (lenNeeded >= fileSize || (lenNeeded + futureUsage) >= fileSize) {
            size_t oldSize = fileSize;
            do {
                fileSize *= 2;
            } while (lenNeeded + futureUsage >= fileSize);
            MMKVInfo("extending [%s] file size from %zu to %zu, incoming size:%zu, future usage:%zu", m_mmapID.c_str(),
                     oldSize, fileSize, newSize, futureUsage);
            // if we can't extend size, rollback to old state
            if (!m_file->truncate(fileSize)) {
                return false;
            }
            // check if we fail to make more space
            if (!isFileValid()) {
                MMKVWarning("[%s] file not valid", m_mmapID.c_str());
                return false;
            }
        }
        return doFullWriteBack(move(preparedData), nullptr);
    }
    return true;
}
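The sizing arithmetic in ensureMemorySize can be pulled out into a pure function to see how it behaves on concrete numbers. This is a sketch for illustration only: nextFileSize is a made-up helper that reuses the same formulas and ignores everything else the real function does (locking, truncation, write-back).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Given the current file size, the bytes needed after a full rewrite, and the number
// of entries, return the file size to use. Same formulas as above, nothing else.
size_t nextFileSize(size_t fileSize, size_t lenNeeded, size_t dicCount) {
    assert(fileSize > 0); // doubling zero would never terminate
    const size_t avgItemSize = lenNeeded / std::max<size_t>(1, dicCount);
    // Reserve room for half as many items again, but never fewer than 8.
    const size_t futureUsage = avgItemSize * std::max<size_t>(8, (dicCount + 1) / 2);
    while (lenNeeded + futureUsage >= fileSize) {
        fileSize *= 2;
    }
    return fileSize;
}
```

For example, with a 4096-byte file, 3000 bytes needed, and 10 entries, the average item is 300 bytes and the reserve is 2400 bytes, so 5400 bytes are required and the file doubles once to 8192. With 1000 bytes needed instead, the total of 1800 fits and the size stays at 4096.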
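The logic is: if the size needed including the new element exceeds the file size, or that size plus the expected future usage (the average item size multiplied by half the number of entries, or by 8 if that is larger) exceeds the file size, the file size fileSize is doubled until it is large enough, and the data is written back. Let's look at the doFullWriteBack function: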
MMKV_IO.cpp:
bool MMKV::doFullWriteBack(pair<MMBuffer, size_t> preparedData, AESCrypt *newCrypter) {
    auto ptr = (uint8_t *) m_file->getMemory();
    auto totalSize = preparedData.second;
#ifdef MMKV_IOS
    auto ret = guardForBackgroundWriting(ptr + Fixed32Size, totalSize);
    if (!ret.first) {
        return false;
    }
#endif
#ifndef MMKV_DISABLE_CRYPT
    uint8_t newIV[AES_KEY_LEN];
    auto decrypter = m_crypter;
    auto encrypter = (newCrypter == InvalidCryptPtr) ? nullptr : (newCrypter ? newCrypter : m_crypter);
    if (encrypter) {
        AESCrypt::fillRandomIV(newIV);
        encrypter->resetIV(newIV, sizeof(newIV));
    }
#endif
    delete m_output;
    m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        memmoveDictionary(*m_dicCrypt, m_output, ptr, decrypter, encrypter, preparedData);
    } else {
#else
    {
        auto encrypter = m_crypter;
#endif
        memmoveDictionary(*m_dic, m_output, ptr, encrypter, totalSize);
    }
    m_actualSize = totalSize;
#ifndef MMKV_DISABLE_CRYPT
    if (encrypter) {
        recaculateCRCDigestWithIV(newIV);
    } else
#endif
    {
        recaculateCRCDigestWithIV(nullptr);
    }
    m_hasFullWriteback = true;
    // make sure lastConfirmedMetaInfo is saved
    sync(MMKV_SYNC);
    return true;
}
// we don't need to really serialize the dictionary, just reuse what's already in the file
static void
memmoveDictionary(MMKVMap &dic, CodedOutputData *output, uint8_t *ptr, AESCrypt *encrypter, size_t totalSize) {
    auto originOutputPtr = output->curWritePointer();
    // make space to hold the fake size of dictionary's serialization result
    auto writePtr = originOutputPtr + ItemSizeHolderSize;
    // reuse what's already in the file
    if (!dic.empty()) {
        // sort by offset
        vector<KeyValueHolder *> vec;
        vec.reserve(dic.size());
        for (auto &itr : dic) {
            vec.push_back(&itr.second);
        }
        // ascending order of offset
        sort(vec.begin(), vec.end(), [](const auto &left, const auto &right) { return left->offset < right->offset; });
        // merge nearby items to make memmove quicker
        vector<pair<uint32_t, uint32_t>> dataSections; // pair(offset, size)
        dataSections.emplace_back(vec.front()->offset, vec.front()->computedKVSize + vec.front()->valueSize);
        for (size_t index = 1, total = vec.size(); index < total; index++) {
            auto kvHolder = vec[index];
            auto &lastSection = dataSections.back();
            if (kvHolder->offset == lastSection.first + lastSection.second) {
                lastSection.second += kvHolder->computedKVSize + kvHolder->valueSize;
            } else {
                dataSections.emplace_back(kvHolder->offset, kvHolder->computedKVSize + kvHolder->valueSize);
            }
        }
        // do the move
        auto basePtr = ptr + Fixed32Size;
        for (auto &section : dataSections) {
            // memmove() should handle this well: src == dst
            memmove(writePtr, basePtr + section.first, section.second);
            writePtr += section.second;
        }
        // update offset
        if (!encrypter) {
            auto offset = ItemSizeHolderSize;
            for (auto kvHolder : vec) {
                kvHolder->offset = offset;
                offset += kvHolder->computedKVSize + kvHolder->valueSize;
            }
        }
    }
    // hold the fake size of dictionary's serialization result
    output->writeRawVarint32(ItemSizeHolder);
    auto writtenSize = static_cast<size_t>(writePtr - originOutputPtr);
#ifndef MMKV_DISABLE_CRYPT
    if (encrypter) {
        encrypter->encrypt(originOutputPtr, originOutputPtr, writtenSize);
    }
#endif
    assert(writtenSize == totalSize);
    output->seek(writtenSize - ItemSizeHolderSize);
}
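As you can see, a new CodedOutputData is created with its pointer at the beginning of the file's data area. The elements in the dictionary are sorted by offset in ascending order, adjacent items are merged into sections, and the sections are written back with memmove. Finally the output pointer is positioned at the end of the newly written data. After this pass the data has been merged and compacted. The capacity check runs before every write, and the file is expanded if there is not enough space.
Synchronization
MMKV defines several locks internally to handle synchronization in different scenarios:
mmkv::ThreadLock *m_lock; // thread lock, synchronizes threads
mmkv::FileLock *m_fileLock; // file lock, synchronizes processes; wrapped by the two locks below
mmkv::InterProcessLock *m_sharedProcessLock; // shared lock, wraps the file lock above; taken before reads
mmkv::InterProcessLock *m_exclusiveProcessLock; // exclusive lock, wraps the file lock above; taken before writes
First, ThreadLock: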
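The compaction idea (sort live items by offset, merge neighbours into sections, slide each section toward the front) can be tried on a toy buffer. The sketch below is a simplification and not MMKV's code: Item, mergeSections, and compact are hypothetical names, there is no size header and no encryption, and the buffer is a std::vector rather than a mapped file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// A live record inside the buffer: where it starts and how long it is.
struct Item {
    uint32_t offset;
    uint32_t size;
};

// Sort items by offset and merge those that sit back to back into (offset, size) sections.
std::vector<std::pair<uint32_t, uint32_t>> mergeSections(std::vector<Item> items) {
    std::sort(items.begin(), items.end(),
              [](const Item &a, const Item &b) { return a.offset < b.offset; });
    std::vector<std::pair<uint32_t, uint32_t>> sections;
    for (const Item &it : items) {
        if (!sections.empty() && it.offset == sections.back().first + sections.back().second) {
            sections.back().second += it.size; // contiguous with the previous section
        } else {
            sections.emplace_back(it.offset, it.size);
        }
    }
    return sections;
}

// Slide every section to the front of the buffer; returns the number of bytes in use.
size_t compact(std::vector<uint8_t> &buf, const std::vector<Item> &items) {
    size_t writePos = 0;
    for (const auto &section : mergeSections(items)) {
        // memmove, not memcpy: source and destination may overlap
        memmove(buf.data() + writePos, buf.data() + section.first, section.second);
        writePos += section.second;
    }
    return writePos;
}
```

With the buffer "AAxxBBBCCyyD" and live items A, B, C, and D, the B and C records merge into one section, three memmove calls are issued instead of four, and the first 8 bytes become "AABBBCCD".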
class ThreadLock {
private:
#if MMKV_USING_PTHREAD
    pthread_mutex_t m_lock;
#else
    CRITICAL_SECTION m_lock;
#endif
    ...
}
ThreadLock::ThreadLock() {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&m_lock, &attr);
    pthread_mutexattr_destroy(&attr);
}
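As you can see, it wraps the pthread_mutex_t mutex and uses the recursive (re-entrant) mode. It synchronizes threads: the lock is taken before data is written or read so that access stays consistent. Locks are operated through a ScopedLock wrapper class, whose objects lock on construction and unlock on destruction, which simplifies usage. The code is as follows: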
template <typename T>
class ScopedLock {
    T *m_lock;
    void lock() {
        if (m_lock) {
            m_lock->lock();
        }
    }
    void unlock() {
        if (m_lock) {
            m_lock->unlock();
        }
    }
public:
    explicit ScopedLock(T *oLock) : m_lock(oLock) {
        MMKV_ASSERT(m_lock);
        lock();
    }
    ~ScopedLock() {
        unlock();
        m_lock = nullptr;
    }
    // just forbid it for possibly misuse
    explicit ScopedLock(const ScopedLock<T> &other) = delete;
    ScopedLock &operator=(const ScopedLock<T> &other) = delete;
};
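The same two ideas, a recursive mutex and scope-bound locking, are available in the C++ standard library, which makes them easy to try outside MMKV. The sketch below uses std::recursive_mutex in place of ThreadLock and std::lock_guard in place of ScopedLock. The Counter class is a made-up example, not part of MMKV.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical class whose public methods each take the lock and may call one another.
class Counter {
    std::recursive_mutex m_lock; // re-entrant, like PTHREAD_MUTEX_RECURSIVE
    int m_value = 0;

public:
    void add(int n) {
        std::lock_guard<std::recursive_mutex> guard(m_lock); // locks here, unlocks at scope exit
        m_value += n;
    }

    // Holds the lock and calls add(), which locks again on the same thread.
    // A non-recursive mutex must not be locked twice by the same thread.
    void addTwice(int n) {
        std::lock_guard<std::recursive_mutex> guard(m_lock);
        add(n);
        add(n);
    }

    int value() {
        std::lock_guard<std::recursive_mutex> guard(m_lock);
        return m_value;
    }
};
```

This nesting is the reason for the recursive mode: in the write path shown earlier, setDataForKey takes m_lock and then calls other member functions that may need the same lock.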
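At the same time, read and write operations also take an exclusive lock (write lock) or a shared lock (read lock) to synchronize access from multiple processes. MMKV uses the file lock flock here and wraps it. The wrapper maintains read and write counters internally, because a file lock is not recursive: it can be locked multiple times, but a single unlock releases it entirely. The counters also handle lock upgrades and downgrades, that is, upgrading a read lock to a write lock and downgrading a write lock to a read lock. To upgrade, the code first tries a non-blocking write lock and, if that fails, releases the read lock and then takes the write lock. To downgrade, it takes a read lock, which converts the write lock into a read lock. The code is as follows: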
bool FileLock::doLock(LockType lockType, bool wait) {
    if (!isFileLockValid()) {
        return false;
    }
    bool unLockFirstIfNeeded = false;
    if (lockType == SharedLockType) {
        // don't want shared-lock to break any existing locks
        if (m_sharedLockCount > 0 || m_exclusiveLockCount > 0) {
            m_sharedLockCount++;
            return true;
        }
    } else {
        // don't want exclusive-lock to break existing exclusive-locks
        if (m_exclusiveLockCount > 0) {
            m_exclusiveLockCount++;
            return true;
        }
        // prevent deadlock
        if (m_sharedLockCount > 0) {
            unLockFirstIfNeeded = true;
        }
    }
    auto ret = platformLock(lockType, wait, unLockFirstIfNeeded);
    if (ret) {
        if (lockType == SharedLockType) {
            m_sharedLockCount++;
        } else {
            m_exclusiveLockCount++;
        }
    }
    return ret;
}
bool FileLock::platformLock(LockType lockType, bool wait, bool unLockFirstIfNeeded) {
# ifdef MMKV_ANDROID
    if (m_isAshmem) {
        return ashmemLock(lockType, wait, unLockFirstIfNeeded);
    }
# endif
    auto realLockType = LockType2FlockType(lockType);
    auto cmd = wait ? realLockType : (realLockType | LOCK_NB);
    if (unLockFirstIfNeeded) {
        // try lock
        auto ret = flock(m_fd, realLockType | LOCK_NB);
        if (ret == 0) {
            return true;
        }
        // let's be gentleman: unlock my shared-lock to prevent deadlock
        ret = flock(m_fd, LOCK_UN);
        if (ret != 0) {
            MMKVError("fail to try unlock first fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
        }
    }
    auto ret = flock(m_fd, cmd);
    if (ret != 0) {
        MMKVError("fail to lock fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
        // try recover my shared-lock
        if (unLockFirstIfNeeded) {
            ret = flock(m_fd, LockType2FlockType(SharedLockType));
            if (ret != 0) {
                // let's hope this never happen
                MMKVError("fail to recover shared-lock fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
            }
        }
        return false;
    } else {
        return true;
    }
}
bool FileLock::unlock(LockType lockType) {
    if (!isFileLockValid()) {
        return false;
    }
    bool unlockToSharedLock = false;
    if (lockType == SharedLockType) {
        if (m_sharedLockCount == 0) {
            return false;
        }
        // don't want shared-lock to break any existing locks
        if (m_sharedLockCount > 1 || m_exclusiveLockCount > 0) {
            m_sharedLockCount--;
            return true;
        }
    } else {
        if (m_exclusiveLockCount == 0) {
            return false;
        }
        if (m_exclusiveLockCount > 1) {
            m_exclusiveLockCount--;
            return true;
        }
        // restore shared-lock when all exclusive-locks are done
        if (m_sharedLockCount > 0) {
            unlockToSharedLock = true;
        }
    }
    auto ret = platformUnLock(unlockToSharedLock);
    if (ret) {
        if (lockType == SharedLockType) {
            m_sharedLockCount--;
        } else {
            m_exclusiveLockCount--;
        }
    }
    return ret;
}
bool FileLock::platformUnLock(bool unlockToSharedLock) {
# ifdef MMKV_ANDROID
    if (m_isAshmem) {
        return ashmemUnLock(unlockToSharedLock);
    }
# endif
    int cmd = unlockToSharedLock ? LOCK_SH : LOCK_UN;
    auto ret = flock(m_fd, cmd);
    if (ret != 0) {
        MMKVError("fail to unlock fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
        return false;
    } else {
        return true;
    }
}
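To see which underlying flock operations the counters actually let through, it helps to replace the system call with a log. The sketch below is a simplified model, not MMKV's FileLock: CountedLock is a hypothetical class, the simulated calls always succeed, and the initial non-blocking upgrade attempt is left out, so an upgrade always shows up as LOCK_UN followed by LOCK_EX.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum LockType { Shared, Exclusive };

// Hypothetical model of the counting logic; it records the flock() operation
// that would be issued instead of calling flock() itself.
class CountedLock {
    int m_shared = 0;
    int m_exclusive = 0;
    std::vector<std::string> &m_calls;

public:
    explicit CountedLock(std::vector<std::string> &calls) : m_calls(calls) {}

    void lock(LockType type) {
        if (type == Shared) {
            // Already holding some lock: just count, never weaken an exclusive lock.
            if (m_shared > 0 || m_exclusive > 0) {
                ++m_shared;
                return;
            }
            m_calls.push_back("LOCK_SH");
            ++m_shared;
        } else {
            if (m_exclusive > 0) {
                ++m_exclusive;
                return;
            }
            if (m_shared > 0) {
                m_calls.push_back("LOCK_UN"); // upgrade: give up the read lock first
            }
            m_calls.push_back("LOCK_EX");
            ++m_exclusive;
        }
    }

    bool unlock(LockType type) {
        if (type == Shared) {
            if (m_shared == 0) {
                return false;
            }
            if (m_shared > 1 || m_exclusive > 0) {
                --m_shared;
                return true;
            }
            m_calls.push_back("LOCK_UN");
            --m_shared;
        } else {
            if (m_exclusive == 0) {
                return false;
            }
            if (m_exclusive > 1) {
                --m_exclusive;
                return true;
            }
            // Downgrade back to a read lock if read locks are still outstanding.
            m_calls.push_back(m_shared > 0 ? "LOCK_SH" : "LOCK_UN");
            --m_exclusive;
        }
        return true;
    }
};
```

Taking two read locks, upgrading to a write lock, and then releasing all three produces only five simulated calls: LOCK_SH, LOCK_UN, LOCK_EX, LOCK_SH, LOCK_UN. The nested acquisitions are absorbed by the counters.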
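Besides locks, MMKV keeps track of the file state internally. Before a read or write it checks whether the CRC digest and file size it has recorded still match the current ones, and a mismatch means another process has written to the file. It also maintains a sequence number, m_sequence, which is incremented whenever the memory is reorganized. If the sequence number held by the current process differs from the one in the file, another process has performed a reorganization. Both cases require reloading: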
MMKV_IO.cpp:
void MMKV::checkLoadData() {
    ...
    if (m_metaInfo->m_sequence != metaInfo.m_sequence) {
        MMKVInfo("[%s] oldSeq %u, newSeq %u", m_mmapID.c_str(), m_metaInfo->m_sequence, metaInfo.m_sequence);
        SCOPED_LOCK(m_sharedProcessLock);
        clearMemoryCache();
        loadFromFile();
        notifyContentChanged();
    } else if (m_metaInfo->m_crcDigest != metaInfo.m_crcDigest) {
        MMKVDebug("[%s] oldCrc %u, newCrc %u, new actualSize %u", m_mmapID.c_str(), m_metaInfo->m_crcDigest,
                  metaInfo.m_crcDigest, metaInfo.m_actualSize);
        SCOPED_LOCK(m_sharedProcessLock);
        size_t fileSize = m_file->getActualFileSize();
        if (m_file->getFileSize() != fileSize) {
            MMKVInfo("file size has changed [%s] from %zu to %zu", m_mmapID.c_str(), m_file->getFileSize(), fileSize);
            clearMemoryCache();
            loadFromFile();
        } else {
            partialLoadFromFile();
        }
        notifyContentChanged();
    }
}
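The decision tree in checkLoadData reduces to a small pure function, which makes the three outcomes easy to check. The sketch below is illustrative only: MetaInfo here is a pared-down stand-in with just the three fields used in the comparison, and Reload and decideReload are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pared-down stand-in for the meta information compared above.
struct MetaInfo {
    uint32_t crcDigest;
    uint32_t actualSize;
    uint32_t sequence;
};

enum class Reload { None, Partial, Full };

// cached: what this process last saw. onDisk: what the meta file says now.
// mappedSize: the size this process has mapped. fileSize: the file's current size.
Reload decideReload(const MetaInfo &cached, const MetaInfo &onDisk, size_t mappedSize, size_t fileSize) {
    if (cached.sequence != onDisk.sequence) {
        return Reload::Full; // another process reorganized the data
    }
    if (cached.crcDigest != onDisk.crcDigest) {
        // Another process appended. If the file also grew, the old mapping is stale.
        return mappedSize != fileSize ? Reload::Full : Reload::Partial;
    }
    return Reload::None;
}
```

Only the common case, an append by another process without a file resize, gets away with the cheaper partial load. Any reorganization or resize forces a full reload.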
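Data encryption
MMKV can optionally run in encrypted mode, storing the data encrypted with AES. You pass in a key, but you have to keep the key safe yourself and pass it in again at the next initialization. MMKV is flexible here: the key can be changed, and the store can be switched between plaintext and ciphertext. After such a switch, all the data has to be written again. The code is as follows: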
bool MMKV::reKey(const string &cryptKey) {
    ...
    if (m_crypter) {
        if (cryptKey.length() > 0) {
            string oldKey = this->cryptKey();
            if (cryptKey == oldKey) {
                return true;
            } else {
                // change encryption key
                MMKVInfo("reKey with new aes key");
                auto newCrypt = new AESCrypt(cryptKey.data(), cryptKey.length());
                ret = fullWriteback(newCrypt);
                if (ret) {
                    delete m_crypter;
                    m_crypter = newCrypt;
                } else {
                    delete newCrypt;
                }
            }
        } else {
            // decryption to plain text
            MMKVInfo("reKey to no aes key");
            ret = fullWriteback(InvalidCryptPtr);
            if (ret) {
                delete m_crypter;
                m_crypter = nullptr;
                if (!m_dic) {
                    m_dic = new MMKVMap();
                }
            }
        }
    } else {
        if (cryptKey.length() > 0) {
            // transform plain text to encrypted text
            MMKVInfo("reKey to a aes key");
            auto newCrypt = new AESCrypt(cryptKey.data(), cryptKey.length());
            ret = fullWriteback(newCrypt);
            if (ret) {
                m_crypter = newCrypt;
                if (!m_dicCrypt) {
                    m_dicCrypt = new MMKVMapCrypt();
                }
            } else {
                delete newCrypt;
            }
        } else {
            return true;
        }
    }
    // m_dic or m_dicCrypt is not valid after reKey
    if (ret) {
        clearMemoryCache();
    }
    return ret;
}
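Performance comparison
In both single-process and multi-process read and write performance, MMKV substantially outperforms Android's traditional SharedPreferences and SQLite, thanks to mmap and incremental writes. For detailed test results, see the official repository: github.com/Tencent/MMK…
Summary
As a high-performance key-value storage component, MMKV has real advantages over Android's traditional storage options, SharedPreferences and SQLite. Its core is mmap memory-mapped files, which perform much better than traditional I/O and make reading and writing a file as simple as operating on memory. The source code contains many well-designed details, such as incremental writes, memory compaction, detecting writes from other processes by checking the file size, and inter-process read/write locks. Its drawback is that it may waste memory, because a mapping must be a whole number of memory pages, so using it to store only a tiny amount of data is overkill. It is therefore one option for data storage that can replace SharedPreferences in scenarios that need to store a lot of data.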