How Reference Search Works in IntelliJ IDEA


How does IntelliJ IDEA implement reference search?

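We are all familiar with IntelliJ IDEA, a powerful IDE whose Community Edition is open source. Android Studio is built on the IDEA Community Edition. We use Find Usages all the time to look up the references to a class or method, so let's see how it is implemented internally.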

IDEA's source is on GitHub at github.com/JetBrains/i…. Once cloned, the project can be opened in IDEA itself, which feels rather magical: the IDE can be used to develop itself :-D

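The IDEA code base is huge: the Java and Python sources alone add up to more than four million lines. In a project this size, a good way to find the entry point of a feature is to look for its test cases. Searching for Find Usages tests turns up a class named FindUsagesTest in the com.intellij.java.psi.search package. Most of its test cases call one function, ReferencesSearch.search, which looks promising, so let's follow it to its definition: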

/**
 * Searches for references to the specified element in the scope in which such references are expected to be found, according to
 * dependencies and access rules.
 *
 * @param element the element (declaration) the references to which are requested.
 * @return the query allowing to enumerate the references.
 */
@NotNull
public static Query<PsiReference> search(@NotNull PsiElement element) {
  return search(element, GlobalSearchScope.allScope(PsiUtilCore.getProjectInReadAction(element)), false);
}

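Roughly speaking, it looks for references to the given element within a search scope. Note that the function returns a Query interface rather than a finished list of results.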

This looks like the entry point for reference search, so let's keep following it.

public static Query<PsiReference> search(@NotNull PsiElement element, @NotNull SearchScope searchScope, boolean ignoreAccessScope) {
  return search(new SearchParameters(element, searchScope, ignoreAccessScope));
}

It packs the arguments from the previous step into a SearchParameters object. Skipping the unimportant details, on we go.

/**
 * Searches for references to the specified element according to the specified parameters.
 *
 * @param parameters the parameters for the search (contain also the element the references to which are requested).
 * @return the query allowing to enumerate the references.
 */
@NotNull
public static Query<PsiReference> search(@NotNull final SearchParameters parameters) {
  final Query<PsiReference> result = INSTANCE.createQuery(parameters);
  if (parameters.isSharedOptimizer) {
    return uniqueResults(result);
  }

  final SearchRequestCollector requests = parameters.getOptimizer();

  final PsiElement element = parameters.getElementToSearch();

  return uniqueResults(new MergeQuery<>(result, new SearchRequestQuery(PsiUtilCore.getProjectInReadAction(element), requests)));
}

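This step creates two Query objects, merges them, and returns the result wrapped by uniqueResults (a UniqueResultsQuery). SearchRequestQuery looks important here, so keep an eye on it. The returned Query exists so that callers can invoke its lookup methods, so let's see how findAll is implemented on it: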

@Override
@NotNull
public Collection<T> findAll() {
  List<T> result = Collections.synchronizedList(new ArrayList<>());
  Processor<T> processor = Processors.cancelableCollectProcessor(result);
  forEach(processor);
  return result;
}

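The intent is fairly clear: a result List is handed to a collecting Processor, filled in, and returned to the caller. The Processor only accumulates results in the List. The real work happens in forEach, which in turn calls forEach on the wrapped Query, myOriginal: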

private boolean process(@NotNull Set<M> processedElements, @NotNull Processor<? super T> consumer) {
  return myOriginal.forEach(new MyProcessor(processedElements, consumer));
}
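
The collect-and-deduplicate behaviour described above can be sketched in plain Java. This is a simplified model under stated assumptions, not IDEA's code: `SimpleQuery`, `unique`, and the sample hit strings are hypothetical names, and `java.util.function.Predicate` stands in for IDEA's `Processor`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

class UniqueQueryDemo {
    // Hypothetical stand-in for IDEA's Query; the real interface is richer.
    interface SimpleQuery<T> {
        // Feeds results to the consumer and stops early once it returns false.
        boolean forEach(Predicate<? super T> consumer);

        // Collects every result into a list, in the spirit of findAll().
        default List<T> findAll() {
            List<T> result = new ArrayList<>();
            forEach(item -> {
                result.add(item);
                return true;
            });
            return result;
        }
    }

    // Wraps another query and skips results that were already delivered.
    static <T> SimpleQuery<T> unique(SimpleQuery<T> original) {
        return consumer -> {
            Set<T> seen = new HashSet<>();
            // Set.add returns false for a duplicate, so duplicates never reach the consumer.
            return original.forEach(item -> !seen.add(item) || consumer.test(item));
        };
    }

    static List<String> demo() {
        SimpleQuery<String> raw = consumer -> {
            for (String hit : Arrays.asList("A.java:3", "B.java:7", "A.java:3")) {
                if (!consumer.test(hit)) return false;
            }
            return true;
        };
        return unique(raw).findAll();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A.java:3, B.java:7]
    }
}
```

The point of the design is that the query is lazy: nothing is searched until a consumer is attached, and the consumer can stop the walk at any time by returning false.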

myOriginal is the MergeQuery from earlier. Its forEach ends up calling processSubQuery:

private <V extends T> boolean processSubQuery(@NotNull Query<V> subQuery, @NotNull final Processor<? super T> consumer) {
  return subQuery.forEach(consumer);
}
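
A minimal sketch of the merge idea, assuming a hypothetical `SimpleQuery` interface in place of IDEA's `Query`: the merged query runs its first sub-query and only continues with the second if the consumer has not asked to stop. The names `merge`, `fromList`, and `demo` are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class MergeQueryDemo {
    // Hypothetical stand-in for a query: returns false if the consumer stopped the walk.
    interface SimpleQuery<T> {
        boolean forEach(Predicate<? super T> consumer);
    }

    // Chains two queries; && short-circuits, so the second is skipped after a stop.
    static <T> SimpleQuery<T> merge(SimpleQuery<T> first, SimpleQuery<T> second) {
        return consumer -> first.forEach(consumer) && second.forEach(consumer);
    }

    // Helper that turns a fixed list into a query.
    static <T> SimpleQuery<T> fromList(List<T> items) {
        return consumer -> {
            for (T item : items) {
                if (!consumer.test(item)) return false;
            }
            return true;
        };
    }

    // Collects results until 'limit' items were seen, then tells the query to stop.
    static List<String> demo(int limit) {
        SimpleQuery<String> merged =
            merge(fromList(Arrays.asList("a", "b")), fromList(Arrays.asList("c", "d")));
        List<String> seen = new ArrayList<>();
        merged.forEach(s -> {
            seen.add(s);
            return seen.size() < limit;
        });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo(10)); // prints [a, b, c, d]
        System.out.println(demo(3));  // prints [a, b, c]
    }
}
```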

In other words, MergeQuery simply calls forEach on each of its sub-queries. We flagged SearchRequestQuery as the likely suspect, so let's look there first. Its forEach ends up in processResults:

@Override
protected boolean processResults(@NotNull Processor<? super PsiReference> consumer) {
  return PsiSearchHelper.getInstance(myProject).processRequests(myRequests, consumer);
}

That in turn calls processRequests on PsiSearchHelper:

@Override
public boolean processRequests(@NotNull SearchRequestCollector collector, @NotNull Processor<? super PsiReference> processor) {
  ......
  do {
  ......
    if (!processGlobalRequestsOptimized(globals, progress, localProcessors)) {
      return false;
    }
    for (RequestWithProcessor local : locals) {
      progress.checkCanceled();
      if (!processSingleRequest(local.request, local.refProcessor)) {
        return false;
      }
    }
  ......
  }
  while (true);
}

Some irrelevant code is omitted here. Note the two calls, processGlobalRequestsOptimized and processSingleRequest. First, the implementation of processGlobalRequestsOptimized:

private boolean processGlobalRequestsOptimized(@NotNull MultiMap<Set<IdIndexEntry>, RequestWithProcessor> singles,
                                               @NotNull ProgressIndicator progress,
                                               @NotNull final Map<RequestWithProcessor, Processor<PsiElement>> localProcessors) {
  ......
  if (singles.size() == 1) {
    final Collection<? extends RequestWithProcessor> requests = singles.values();
    if (requests.size() == 1) {
      final RequestWithProcessor theOnly = requests.iterator().next();
      return processSingleRequest(theOnly.request, theOnly.refProcessor);
    }
  }
  ......
  return result;
}

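Ignoring the irrelevant code, when there is exactly one request it falls back to the same processSingleRequest we saw one level up. So let's analyse that simple case first and look at its implementation: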

private boolean processSingleRequest(@NotNull PsiSearchRequest single, @NotNull Processor<? super PsiReference> consumer) {
  final EnumSet<Options> options = EnumSet.of(Options.PROCESS_ONLY_JAVA_IDENTIFIERS_IF_POSSIBLE);
  if (single.caseSensitive) options.add(Options.CASE_SENSITIVE_SEARCH);
  if (shouldProcessInjectedPsi(single.searchScope)) options.add(Options.PROCESS_INJECTED_PSI);

  return bulkProcessElementsWithWord(single.searchScope, single.word, single.searchContext, options, single.containerName,
                                     adaptProcessor(single, consumer)
  );
}

It first sets up the search options and then calls bulkProcessElementsWithWord. Let's look at adaptProcessor first:

@NotNull
private static BulkOccurrenceProcessor adaptProcessor(@NotNull PsiSearchRequest singleRequest,
                                                     @NotNull Processor<? super PsiReference> consumer) {
  ......
  final RequestResultProcessor wrapped = singleRequest.processor;
  return new BulkOccurrenceProcessor() {
    @Override
    public boolean execute(@NotNull PsiElement scope, @NotNull int[] offsetsInScope, @NotNull StringSearcher searcher) {
      ......
        return LowLevelSearchUtil.processElementsAtOffsets(scope, searcher, !ignoreInjectedPsi,
                                                           getOrCreateIndicator(), offsetsInScope,
                                                           (element, offsetInElement) -> {
          if (ignoreInjectedPsi && element instanceof PsiLanguageInjectionHost) return true;
          return wrapped.processTextOccurrence(element, offsetInElement, consumer);
        });
    }
  };
}
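
The adapter shape used here can be modelled without the platform. In this sketch, which is an assumption-laden simplification, `OccurrenceHandler` plays the role of the per-occurrence processor and `BulkHandler` the role of the bulk processor; both names, and the use of a plain `String` as the scope, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class AdapterDemo {
    // Handles one text occurrence; returning false stops the search.
    interface OccurrenceHandler {
        boolean handle(String scopeText, int offset);
    }

    // Handles all occurrences found in one scope at once.
    interface BulkHandler {
        boolean execute(String scopeText, int[] offsets);
    }

    // Adapts a per-occurrence handler to the bulk interface by looping over the offsets.
    static BulkHandler adapt(OccurrenceHandler wrapped) {
        return (scopeText, offsets) -> {
            for (int offset : offsets) {
                if (!wrapped.handle(scopeText, offset)) return false;
            }
            return true;
        };
    }

    static List<String> demo(boolean stopAtFirst) {
        List<String> visited = new ArrayList<>();
        BulkHandler bulk = adapt((text, offset) -> {
            visited.add(text.substring(offset, offset + 3) + "@" + offset);
            return !stopAtFirst;
        });
        // "foo" occurs at offsets 0 and 14 in this text.
        bulk.execute("foo(); bar(); foo();", new int[] {0, 14});
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // prints [foo@0, foo@14]
        System.out.println(demo(true));  // prints [foo@0]
    }
}
```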

adaptProcessor ultimately delegates to wrapped.processTextOccurrence, so keep that spot in mind. Going back up one level and continuing downward, here is the implementation of bulkProcessElementsWithWord:

private boolean bulkProcessElementsWithWord(@NotNull SearchScope searchScope,
                                            @NotNull final String text,
                                            final short searchContext,
                                            @NotNull EnumSet<Options> options,
                                            @Nullable String containerName, @NotNull final BulkOccurrenceProcessor processor) {
  ......
  if (searchScope instanceof GlobalSearchScope) {
    StringSearcher searcher = new StringSearcher(text, options.contains(Options.CASE_SENSITIVE_SEARCH), true,
                                                 searchContext == UsageSearchContext.IN_STRINGS,
                                                 options.contains(Options.PROCESS_ONLY_JAVA_IDENTIFIERS_IF_POSSIBLE));

    return processElementsWithTextInGlobalScope((GlobalSearchScope)searchScope, searcher, searchContext,
                                                options.contains(Options.CASE_SENSITIVE_SEARCH), containerName, progress, processor);
  }
  ......
  return JobLauncher.getInstance().invokeConcurrentlyUnderProgress(Arrays.asList(scopeElements), progress, localProcessor);
}

Next, the implementation of processElementsWithTextInGlobalScope:

private boolean processElementsWithTextInGlobalScope(@NotNull final GlobalSearchScope scope,
                                                     @NotNull final StringSearcher searcher,
                                                     final short searchContext,
                                                     final boolean caseSensitively,
                                                     @Nullable String containerName,
                                                     @NotNull ProgressIndicator progress,
                                                     @NotNull final BulkOccurrenceProcessor processor) {
  progress.pushState();
  boolean result;
  try {
    progress.setText(PsiBundle.message("psi.scanning.files.progress"));

    String text = searcher.getPattern();
    Set<VirtualFile> fileSet = new THashSet<>();
    getFilesWithText(scope, searchContext, caseSensitively, text, fileSet);

    progress.setText(PsiBundle.message("psi.search.for.word.progress", text));

    final Processor<PsiElement> localProcessor = localProcessor(progress, searcher, processor);
    ......
    result = fileSet.isEmpty() || processPsiFileRoots(new ArrayList<>(fileSet), fileSet.size(), 0, progress, localProcessor);
  }
  finally {
    progress.popState();
  }
  return result;
}
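
The method above works in two phases: getFilesWithText first narrows the scope to the files that contain the word, and only those files are then scanned. A rough model of that idea follows, assuming a hypothetical in-memory word index (`buildWordIndex`) and file contents held in a map. IDEA's real index is far more elaborate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class IndexedSearchDemo {
    // Builds word -> files containing that word (hypothetical stand-in for a word index).
    static Map<String, Set<String>> buildWordIndex(Map<String, String> files) {
        Map<String, Set<String>> index = new HashMap<>();
        for (Map.Entry<String, String> e : files.entrySet()) {
            for (String word : e.getValue().split("\\W+")) {
                if (!word.isEmpty()) {
                    index.computeIfAbsent(word, k -> new TreeSet<>()).add(e.getKey());
                }
            }
        }
        return index;
    }

    // Phase 1: ask the index for candidate files. Phase 2: scan only those files.
    static List<String> search(Map<String, String> files, Map<String, Set<String>> index, String word) {
        List<String> hits = new ArrayList<>();
        for (String file : index.getOrDefault(word, Collections.emptySet())) {
            String text = files.get(file);
            for (int i = text.indexOf(word); i >= 0; i = text.indexOf(word, i + 1)) {
                hits.add(file + "@" + i);
            }
        }
        return hits;
    }

    static List<String> demo() {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("A.java", "class A { Foo f; Foo g; }");
        files.put("B.java", "class B { Bar b; }"); // never scanned when searching for Foo
        files.put("C.java", "Foo.run();");
        return search(files, buildWordIndex(files), "Foo");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A.java@10, A.java@17, C.java@0]
    }
}
```

The index lookup is what keeps a project-wide search cheap: files that cannot contain the word are never opened.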

localProcessor looks suspicious, so let's follow it:

private static Processor<PsiElement> localProcessor(@NotNull final ProgressIndicator progress,
                                                    @NotNull final StringSearcher searcher,
                                                    @NotNull final BulkOccurrenceProcessor processor) {
  return new ReadActionProcessor<PsiElement>() {
    @Override
    public boolean processInReadAction(PsiElement scopeElement) {
      ......
      return scopeElement.isValid() &&
             processor.execute(scopeElement, LowLevelSearchUtil.getTextOccurrencesInScope(scopeElement, searcher, progress), searcher);
    }
  };
}
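
What a call like getTextOccurrencesInScope has to produce is a list of offsets where the word occurs as a whole identifier. The sketch below is one simple way to compute that for a plain string. It assumes whole-identifier matching is what the PROCESS_ONLY_JAVA_IDENTIFIERS_IF_POSSIBLE option asks for, and `findIdentifierOccurrences` is a hypothetical helper, not a platform API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OccurrenceScanDemo {
    // Returns the offsets where 'word' occurs and is not part of a longer Java identifier.
    static int[] findIdentifierOccurrences(String text, String word) {
        if (word.isEmpty()) throw new IllegalArgumentException("word must not be empty");
        List<Integer> offsets = new ArrayList<>();
        for (int i = text.indexOf(word); i >= 0; i = text.indexOf(word, i + 1)) {
            int end = i + word.length();
            boolean startsWord = i == 0 || !Character.isJavaIdentifierPart(text.charAt(i - 1));
            boolean endsWord = end == text.length() || !Character.isJavaIdentifierPart(text.charAt(end));
            if (startsWord && endsWord) offsets.add(i);
        }
        return offsets.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // "FooBar" contains "Foo" but is a different identifier, so offset 7 is rejected.
        int[] offsets = findIdentifierOccurrences("Foo a; FooBar b; new Foo();", "Foo");
        System.out.println(Arrays.toString(offsets)); // prints [0, 21]
    }
}
```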

At last we see where the processor's execute is called. This processor is the one returned by adaptProcessor, so what actually runs is wrapped.processTextOccurrence, and wrapped refers to a SingleTargetRequestResultProcessor.

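So when was wrapped injected? Recall that the MergeQuery was built from two queries: one is the SearchRequestQuery, and the other is an ExecutorsQuery (the result of INSTANCE.createQuery). When that query runs, it goes through a series of steps driven by the parameters and ends up pointing wrapped at a Processor of type SingleTargetRequestResultProcessor.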

So what finally runs is processTextOccurrence of SingleTargetRequestResultProcessor:

@Override
public boolean processTextOccurrence(@NotNull PsiElement element, int offsetInElement, @NotNull final Processor<? super PsiReference> consumer) {
  ......
  final List<PsiReference> references = ourReferenceService.getReferences(element,
                                                                          new PsiReferenceService.Hints(myTarget, offsetInElement));
  ......
  return true;
}
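
A text match alone is not a usage: the same simple name may belong to another class, or sit in a comment. The excerpt above elides the check, but PsiReference does offer isReferenceTo for deciding whether a reference points at a given element. The following is a simplified model of that filtering step, assuming a hypothetical `Candidate` record that already knows what each occurrence resolves to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResolveFilterDemo {
    // Hypothetical candidate: a text occurrence plus the declaration it resolves to (null if none).
    static final class Candidate {
        final String location;
        final String resolvesTo;

        Candidate(String location, String resolvesTo) {
            this.location = location;
            this.resolvesTo = resolvesTo;
        }
    }

    // Keeps only the occurrences that actually resolve to the searched declaration.
    static List<String> usagesOf(String target, List<Candidate> candidates) {
        List<String> usages = new ArrayList<>();
        for (Candidate c : candidates) {
            if (target.equals(c.resolvesTo)) usages.add(c.location);
        }
        return usages;
    }

    static List<String> demo() {
        List<Candidate> candidates = Arrays.asList(
            new Candidate("A.java:3", "com.a.Foo"), // real usage
            new Candidate("B.java:9", "com.b.Foo"), // same simple name, different class
            new Candidate("C.java:1", null));       // plain text, e.g. inside a comment
        return usagesOf("com.a.Foo", candidates);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A.java:3]
    }
}
```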

Following the implementation of getReferences through a chain of jumps…

private static PsiReferenceRegistrarImpl createRegistrar(Language language) {
  ......
  List<PsiReferenceProviderBean> referenceProviderBeans = REFERENCE_PROVIDER_EXTENSION.allForLanguageOrAny(language);
  for (final PsiReferenceProviderBean providerBean : referenceProviderBeans) {
    final ElementPattern<PsiElement> pattern = providerBean.createElementPattern();
    if (pattern != null) {
      registrar.registerReferenceProvider(pattern, new PsiReferenceProvider() {

        PsiReferenceProvider myProvider;

        @NotNull
        @Override
        public PsiReference[] getReferencesByElement(@NotNull PsiElement element, @NotNull ProcessingContext context) {
          if (myProvider == null) {

            myProvider = providerBean.instantiate();
            if (myProvider == null) {
              myProvider = NULL_REFERENCE_PROVIDER;
            }
          }
          return myProvider.getReferencesByElement(element, context);
        }
      });
    }
  }

  registrar.markInitialized();

  return registrar;
}
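
The anonymous PsiReferenceProvider above is a lazy wrapper: the real provider is created only on the first call, and a no-op provider is substituted if creation yields null. A minimal standalone sketch of the same pattern, with a hypothetical `Provider` interface and a `Supplier` standing in for `providerBean.instantiate()`:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyProviderDemo {
    // Hypothetical, much simplified provider interface.
    interface Provider {
        String[] referencesFor(String element);
    }

    // Fallback used when the factory cannot produce a provider.
    static final Provider NULL_PROVIDER = element -> new String[0];

    // Wraps a factory so that the real provider is created on first use only.
    static Provider lazy(Supplier<Provider> factory) {
        return new Provider() {
            private Provider delegate; // not thread-safe in this sketch

            @Override
            public String[] referencesFor(String element) {
                if (delegate == null) {
                    delegate = factory.get();
                    if (delegate == null) delegate = NULL_PROVIDER;
                }
                return delegate.referencesFor(element);
            }
        };
    }

    // Returns how many providers existed before the first call and after two calls.
    static int[] demo() {
        AtomicInteger created = new AtomicInteger();
        Provider provider = lazy(() -> {
            created.incrementAndGet();
            return element -> new String[] {"ref:" + element};
        });
        int before = created.get();
        provider.referencesFor("a");
        provider.referencesFor("b");
        return new int[] {before, created.get()};
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println(counts[0] + " -> " + counts[1]); // prints 0 -> 1
    }
}
```

Deferring creation this way means a plugin's provider class is not loaded until some element actually needs it.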

What is ultimately called is getReferencesByElement of PsiReferenceProvider, and myProvider is in turn instantiated from a PsiReferenceProviderBean. Looking at what that class does turns up the following comment:

/**
 * Registers a {@link PsiReferenceProvider} in plugin.xml
 */
public class PsiReferenceProviderBean extends AbstractExtensionPointBean implements KeyedLazyInstance<PsiReferenceProviderBean> {

  public static final ExtensionPointName<PsiReferenceProviderBean> EP_NAME =
    new ExtensionPointName<>("com.intellij.psi.referenceProvider");

  @Attribute("language")
  public String language = Language.ANY.getID();

  @Attribute("providerClass")
  public String className;
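
Since the bean only stores a class name, something has to turn that string into an object later. One plain-Java way to do that is reflection, sketched below. This is an assumption about the general technique, not IDEA's actual instantiation code, which presumably goes through its own plugin class loading. `Provider`, `ClassNameProvider`, and `instantiate` are hypothetical names.

```java
class ReflectiveProviderDemo {
    // Hypothetical provider interface.
    public interface Provider {
        String describe();
    }

    // Hypothetical provider that a descriptor would name through a class-name string.
    public static class ClassNameProvider implements Provider {
        @Override
        public String describe() {
            return "class-name references";
        }
    }

    // Loads the class by name and calls its no-argument constructor.
    static Provider instantiate(String className) {
        try {
            return (Provider) Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    static String demo() {
        // getName() yields the binary name (with '$' for nested classes) that Class.forName expects.
        return instantiate(ClassNameProvider.class.getName()).describe();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints class-name references
    }
}
```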

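So classes of type PsiReferenceProvider are registered in plugin.xml and instantiated via reflection when they are first needed. Now let's see which classes extend PsiReferenceProvider. Among them, JavaClassReferenceProvider looks like the implementation we want. Following its getReferencesByElement leads, through another chain of jumps, to reparse in JavaClassReferenceSet, and at last we reach the core of class reference search: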

private void reparse(@NotNull String str, @NotNull PsiElement element, final boolean isStaticImport, JavaClassReferenceSet context) {
  myElement = element;
  myContext = context;
  List<JavaClassReference> referencesList = new ArrayList<>();
  int currentDot = -1;
  int referenceIndex = 0;
  boolean allowDollarInNames = isAllowDollarInNames();
  boolean allowSpaces = isAllowSpaces();
  boolean allowGenerics = false;
  boolean allowWildCards = JavaClassReferenceProvider.ALLOW_WILDCARDS.getBooleanValue(getOptions());
  boolean allowGenericsCalculated = false;
  boolean parsingClassNames = true;

  while (parsingClassNames) {
    int nextDotOrDollar = -1;
    for (int curIndex = currentDot + 1; curIndex < str.length(); ++curIndex) {
      char ch = str.charAt(curIndex);

      if (ch == DOT || ch == DOLLAR && allowDollarInNames) {
        nextDotOrDollar = curIndex;
        break;
      }

      if (ch == LT || ch == COMMA) {
        if (!allowGenericsCalculated) {
          allowGenerics = !isStaticImport && PsiUtil.isLanguageLevel5OrHigher(element);
          allowGenericsCalculated = true;
        }

        if (allowGenerics) {
          nextDotOrDollar = curIndex;
          break;
        }
      }
    }

    if (nextDotOrDollar == -1) {
      nextDotOrDollar = currentDot + 1;
      for (int i = nextDotOrDollar; i < str.length() && Character.isJavaIdentifierPart(str.charAt(i)); ++i) nextDotOrDollar++;
      parsingClassNames = false;
      int j = skipSpaces(nextDotOrDollar, str.length(), str, allowSpaces);

      if (j < str.length()) {
        char ch = str.charAt(j);
        boolean recognized = false;

        if (ch == '[') {
          j = skipSpaces(j + 1, str.length(), str, allowSpaces);
          if (j < str.length() && str.charAt(j) == ']') {
            j = skipSpaces(j + 1, str.length(), str, allowSpaces);
            recognized = j == str.length();
          }
        }

        Boolean aBoolean = JavaClassReferenceProvider.JVM_FORMAT.getValue(getOptions());
        if (!recognized && (aBoolean == null || !aBoolean.booleanValue())) {
          nextDotOrDollar = -1; // abort resolve
        }
      }
    }

    if (nextDotOrDollar != -1 && nextDotOrDollar < str.length()) {
      char c = str.charAt(nextDotOrDollar);
      if (c == LT) {
        boolean recognized = false;
        int start = skipSpaces(nextDotOrDollar + 1, str.length(), str, allowSpaces);
        int j = str.lastIndexOf(GT);
        int end = skipSpacesBackward(j, 0, str, allowSpaces);
        if (end != -1 && end > start) {
          if (myNestedGenericParameterReferences == null) myNestedGenericParameterReferences = new ArrayList<>(1);
          myNestedGenericParameterReferences.add(new JavaClassReferenceSet(
            str.substring(start, end), myElement, myStartInElement + start, isStaticImport, myProvider, this));
          parsingClassNames = false;
          j = skipSpaces(j + 1, str.length(), str, allowSpaces);
          recognized = j == str.length();
        }
        if (!recognized) {
          nextDotOrDollar = -1; // abort resolve
        }
      }
      else if (c == COMMA && myContext != null) {
        if (myContext.myNestedGenericParameterReferences == null) myContext.myNestedGenericParameterReferences = new ArrayList<>(1);
        int start = skipSpaces(nextDotOrDollar + 1, str.length(), str, allowSpaces);
        myContext.myNestedGenericParameterReferences.add(new JavaClassReferenceSet(
          str.substring(start), myElement, myStartInElement + start, isStaticImport, myProvider, this));
        parsingClassNames = false;
      }
    }

    int maxIndex = nextDotOrDollar > 0 ? nextDotOrDollar : str.length();
    int beginIndex = skipSpaces(currentDot + 1, maxIndex, str, allowSpaces);
    int endIndex = skipSpacesBackward(maxIndex, beginIndex, str, allowSpaces);
    boolean skipReference = false;
    if (allowWildCards && str.charAt(beginIndex) == QUESTION) {
      int next = skipSpaces(beginIndex + 1, endIndex, str, allowSpaces);
      if (next != beginIndex + 1) {
        String keyword = str.startsWith(EXTENDS, next) ? EXTENDS : str.startsWith(SUPER, next) ? SUPER : null;
        if (keyword != null) {
          next = skipSpaces(next + keyword.length(), endIndex, str, allowSpaces);
          beginIndex = next;
        }
      }
      else if (endIndex == beginIndex + 1) {
        skipReference = true;
      }
    }
    if (!skipReference) {
      TextRange textRange = TextRange.create(myStartInElement + beginIndex, myStartInElement + endIndex);
      JavaClassReference currentContextRef = createReference(
        referenceIndex, str.substring(beginIndex, endIndex), textRange, isStaticImport);
      referenceIndex++;
      referencesList.add(currentContextRef);
    }
    if ((currentDot = nextDotOrDollar) < 0) {
      break;
    }
  }

  myReferences = referencesList.toArray(new JavaClassReference[0]);
}

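It is long, but it is essentially a simple language parser, which matches my initial guess: the search works by parsing the characters of the source text, and all the reference information is carried in the resulting list of PsiReference objects.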
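
Stripped of generics, arrays, wildcards, and whitespace handling, the heart of reparse is splitting a qualified name at '.' (and optionally '$') and recording a text range for each part. A much reduced sketch of that idea follows. `Segment` and `split` are hypothetical names, and the real method handles many more cases.

```java
import java.util.ArrayList;
import java.util.List;

class ClassNameSplitDemo {
    // One piece of a qualified name together with its [start, end) range in the text.
    static final class Segment {
        final String text;
        final int start;
        final int end;

        Segment(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + "[" + start + "," + end + ")";
        }
    }

    // Splits at '.', and at '$' too when allowDollar is set; each piece would become one reference.
    static List<Segment> split(String qualifiedName, boolean allowDollar) {
        List<Segment> segments = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= qualifiedName.length(); i++) {
            boolean atEnd = i == qualifiedName.length();
            char ch = atEnd ? '.' : qualifiedName.charAt(i);
            if (atEnd || ch == '.' || (ch == '$' && allowDollar)) {
                segments.add(new Segment(qualifiedName.substring(start, i), start, i));
                start = i + 1;
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(split("java.util.Map", false)); // prints [java[0,4), util[5,9), Map[10,13)]
        System.out.println(split("a.Outer$Inner", true));  // prints [a[0,1), Outer[2,7), Inner[8,13)]
    }
}
```

Keeping a range per segment is what lets each part of a name such as java.util.Map be navigated and highlighted separately.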

IDEA really does have an excellent architecture, though it has its warts too :-D