This series differs from other tutorials. Most tutorials teach you how to write a simple QueryParser and then explain it, but that approach has some drawbacks:
- A simple example is far from the complex QueryParser that you actually have to write.
- It does not cover every aspect of a QueryParser.
- I find that I can't remember anything afterwards; whenever I have to write another QueryParser, I must go back to the tutorial again and again.
So we will look into Solr with a top-down approach: solrconfig.xml -> SearchHandler -> SearchComponent -> QueryParser. This will give you a deep understanding of Solr's search flow.
Note: you can find any file mentioned in this article in IntelliJ IDEA by pressing Shift twice and typing the file name.
Understanding the search flow of Solr
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
  </lst>
</requestHandler>
SearchHandler is a RequestHandler that handles search requests from clients; in the configuration above it is registered at “/select”. Let's look at its source code (I love reading source code: it explains itself, it is always up to date, and it can be debugged).
The highlighted method is where your request is handled. It passes the request through a number of SearchComponents like this (the image is from Solr in Action).
SearchComponents:
- QueryComponent is where Solr finds the relevant documents corresponding to your query.
- FacetComponent is where Solr runs faceting on the result.
- …
The great thing about Solr is that you can replace the components above with your own, and you can add your own components anywhere in this chain. Let's look at the source code of SearchHandler one more time. Here is the simplified code of the handleRequestBody method in non-distributed mode (some parts are hidden behind ...).
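As a rough, self-contained sketch of that non-distributed flow: every component is prepared first, and only then is every component processed. The classes below are simplified stand-ins written for this illustration (they only borrow the names ResponseBuilder and SearchComponent); they are not Solr's real classes, and the component names "query", "facet" and "myCustom" are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration only; NOT Solr's real classes.
final class SearchFlowSketch {

    /** Stand-in for ResponseBuilder: shared state handed to every component. */
    static final class ResponseBuilder {
        final String queryString;
        final List<String> trace = new ArrayList<>(); // records the call order
        ResponseBuilder(String queryString) { this.queryString = queryString; }
    }

    /** Stand-in for SearchComponent with its two phases. */
    interface SearchComponent {
        String name();
        default void prepare(ResponseBuilder rb) { rb.trace.add(name() + ".prepare"); }
        default void process(ResponseBuilder rb) { rb.trace.add(name() + ".process"); }
    }

    /** Hypothetical helper: builds a component that only records its calls. */
    static SearchComponent named(String name) {
        return () -> name;
    }

    /** Non-distributed flow: prepare all components, then process all of them. */
    static List<String> handleRequest(List<SearchComponent> components, ResponseBuilder rb) {
        for (SearchComponent c : components) {
            c.prepare(rb);
        }
        for (SearchComponent c : components) {
            c.process(rb);
        }
        return rb.trace;
    }

    public static void main(String[] args) {
        // A custom component appended at the end of the chain.
        List<SearchComponent> chain =
                Arrays.asList(named("query"), named("facet"), named("myCustom"));
        System.out.println(handleRequest(chain, new ResponseBuilder("title:solr")));
    }
}
```

Because the same ResponseBuilder instance is passed to every call, anything one component stores during prepare is visible to the components that run after it.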
ResponseBuilder rb holds references to almost every other object of the request. The code above is quite self-explanatory; we only have to care about the prepare and process methods of the SearchComponents. Here is the QueryComponent.prepare(ResponseBuilder) method.
So in QueryComponent.prepare(ResponseBuilder):
- (1): it gets the defType from the request (defType is the name of the QueryParser); the default defType is lucene.
- (2): it gets the actual QueryParser object for that defType and converts the query string into a Lucene query object -> this is the role of a QueryParser in Solr.
- (3): it sets the query object on the ResponseBuilder so that later steps/components can retrieve it.
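The three steps above can be sketched roughly as follows. This is an illustration under loud assumptions: the parser registry, the use of plain strings as "query objects", and the ResponseBuilder stand-in are all hypothetical simplifications, not Solr's actual API. Only the parameter name defType and the default value lucene come from the text above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Simplified stand-ins for illustration only; NOT Solr's real classes.
final class PrepareSketch {

    static final String DEFAULT_DEF_TYPE = "lucene";

    /** Stand-in for ResponseBuilder; the "query object" is just a String here. */
    static final class ResponseBuilder {
        private String query;
        void setQuery(String query) { this.query = query; }
        String getQuery() { return query; }
    }

    /** Hypothetical registry: parser name -> function that parses a query string. */
    static final Map<String, Function<String, String>> PARSERS = new HashMap<>();
    static {
        PARSERS.put("lucene", q -> "LuceneQuery(" + q + ")");
        PARSERS.put("raw", q -> "TermQuery(" + q + ")");
    }

    static void prepare(Map<String, String> params, ResponseBuilder rb) {
        // (1) read defType from the request, falling back to the default
        String defType = params.getOrDefault("defType", DEFAULT_DEF_TYPE);

        // (2) look up the parser for that name and turn the query string into a query
        Function<String, String> parser = PARSERS.get(defType);
        if (parser == null) {
            throw new IllegalArgumentException("Unknown query parser: " + defType);
        }
        String query = parser.apply(params.get("q"));

        // (3) store the query so that later steps and components can retrieve it
        rb.setQuery(query);
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("q", "solr");
        ResponseBuilder rb = new ResponseBuilder();
        prepare(params, rb);
        System.out.println(rb.getQuery()); // no defType given, so "lucene" is used
    }
}
```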
Why do we need to write a QueryParser?
The role of a QueryParser is to convert the request into a Lucene query object. So why would we need to write one? Here are some reasons:
- When you write a custom Lucene Query.
- When you want to hide the long parameters of Solr behind a shorter syntax.
- When none of Solr's built-in QueryParsers suits your application.
RawQParser explained
I will use RawQParserPlugin as an example because of its simplicity.
QParserPlugin is the first class you have to write. It acts as a factory that creates QParser objects; Solr applies the Factory pattern a lot in its source. Because a new QParser is created for each request, you can feel free to write non-thread-safe code inside your QParser. Inside the parse() method, we create a new TermQuery, where (1) is the field name and (2) is the term.
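To make the factory idea concrete, here is a minimal sketch that mirrors the shape described above without depending on Solr. All classes are simplified stand-ins that only reuse the names QParserPlugin, QParser and TermQuery. The sketch assumes the field name arrives as a local parameter called f and that the whole query string is the term; the signatures are hypothetical and shorter than the real ones.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for illustration only; NOT Solr's or Lucene's real classes.
final class RawParserSketch {

    /** Stand-in for a Lucene TermQuery: one field, one exact term. */
    static final class TermQuery {
        final String field;
        final String text;
        TermQuery(String field, String text) { this.field = field; this.text = text; }
        @Override public String toString() { return field + ":" + text; }
    }

    /** Stand-in for QParser: holds per-request state, so it need not be thread-safe. */
    abstract static class QParser {
        protected final String qstr;
        protected final Map<String, String> localParams;
        QParser(String qstr, Map<String, String> localParams) {
            this.qstr = qstr;
            this.localParams = localParams;
        }
        abstract TermQuery parse();
    }

    /** Stand-in for QParserPlugin: the factory, shared across requests. */
    abstract static class QParserPlugin {
        abstract QParser createParser(String qstr, Map<String, String> local);
    }

    /** Raw-style plugin: no analysis, the query string becomes a single term. */
    static final class RawPlugin extends QParserPlugin {
        @Override
        QParser createParser(String qstr, Map<String, String> local) {
            // A fresh parser object is created for every call.
            return new QParser(qstr, local) {
                @Override
                TermQuery parse() {
                    // (1) field name from the local param "f", (2) the term itself
                    return new TermQuery(localParams.get("f"), qstr);
                }
            };
        }
    }

    public static void main(String[] args) {
        Map<String, String> local = new HashMap<>();
        local.put("f", "title");
        QParser parser = new RawPlugin().createParser("Solr in Action", local);
        System.out.println(parser.parse()); // prints "title:Solr in Action"
    }
}
```

The plugin itself holds no request state, which is why one plugin instance can serve many requests while each parser stays private to a single one.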
I hope that after this post you can confidently read the Solr source and understand Solr's search flow. In the next part we will return to QueryParser in a more practical way: we will write a custom QParser in a new project, then build it, test it, and plug it into Solr.