This page answers frequently asked questions about searching OpenCms with Solr.
If you are interested in Solr in general, the Solr wiki is a good starting point: http://wiki.apache.org/solr/. OpenCms-specific topics are covered by this documentation.
Independently of OpenCms, a standard Solr server offers an HTTP interface that is reachable at http://localhost:8983/solr/select in a default Apache Solr installation.
You can append any valid Solr query to this URL. The HTTP response can be either JSON or XML. For example, the response to the query http://localhost:8983/solr/select?q=*:*&rows=2 could look like this:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">32</int>
<lst name="params">
<str name="q">*:*</str>
<int name="rows">2</int>
<long name="start">0</long>
</lst>
</lst>
<result name="response" numFound="139" start="0">
<doc>...</doc>
<doc>...</doc>
</result>
</response>
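As a minimal sketch of how such a query URL can be assembled programmatically, the following Java helper appends URL-encoded q, start and rows parameters to a select endpoint. The class and method names (SolrSelectUrl, build) are hypothetical and are not part of Solr or OpenCms; the endpoint is the default one mentioned above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Illustrative helper (hypothetical, not part of Solr or OpenCms). */
class SolrSelectUrl {

    /** Appends the URL-encoded q, start and rows parameters to the given select endpoint. */
    static String build(String endpoint, String query, int start, int rows) {
        // URLEncoder.encode(String, Charset) requires Java 10 or newer.
        String encodedQuery = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return endpoint + "?q=" + encodedQuery + "&start=" + start + "&rows=" + rows;
    }

    public static void main(String[] args) {
        // Builds the "match all documents" query from the example, limited to two rows.
        System.out.println(build("http://localhost:8983/solr/select", "*:*", 0, 2));
    }
}
```

Encoding the query keeps characters such as the colon or spaces from breaking the URL; Solr decodes the parameter before parsing it.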
In the following example, text is sorted according to the default German rules provided by Java. The rules for sorting German in Java are defined by a Java Locale.
Locales are typically defined as a combination of language and country, but you can specify just the language if you want. For example, if you specify "de" as the language, you get sorting that works well for the German language. If you specify "de" as the language and "CH" as the country, you get German sorting specifically tailored for Switzerland. You can see a list of supported Locales here. For more general information about how text analysis works in Solr, have a look at the Language Analysis page.
<!-- define a field type for German collation -->
<fieldType name="collatedGERMAN" class="solr.TextField">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.CollationKeyFilterFactory"
language="de"
strength="primary"
/>
</analyzer>
</fieldType>
...
<!-- define a field to store the German collated manufacturer names -->
<field name="manuGERMAN" type="collatedGERMAN" indexed="true" stored="false" />
...
<!-- copy the text to this field. We could create French, English, Spanish versions too,
     and sort differently for different users!
-->
<copyField source="manu" dest="manuGERMAN"/>
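The effect of locale-based collation at primary strength can also be tried out in plain Java, independently of Solr. The following sketch uses java.text.Collator with the German locale; the class name GermanCollationDemo and the sample words are assumptions chosen for illustration, and the sketch does not claim to reproduce the exact sort keys that the Solr filter generates.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Illustrative demo (hypothetical class) of German collation with java.text.Collator. */
class GermanCollationDemo {

    /** Returns a copy of the given words, sorted with the German collator at primary strength. */
    static List<String> sortGerman(List<String> words) {
        Collator collator = Collator.getInstance(Locale.GERMAN);
        // Primary strength compares base letters only; accents and case are not primary differences.
        collator.setStrength(Collator.PRIMARY);
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(collator);
        return sorted;
    }

    public static void main(String[] args) {
        // "\u00C4rger" is "Ärger"; a plain String sort would place it after "Zug".
        System.out.println(sortGerman(Arrays.asList("Zug", "\u00C4rger", "Apfel")));
    }
}
```

With a plain code-point comparison the umlaut would sort after "Z", whereas the collator places it next to "A", which is the behaviour a collated sort field is meant to provide.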
Returning only permission-checked resources is an expensive task, so only this limited number of results is returned. For paging over results, please have a look at the Solr parameters rows and start, e.g., at http://wiki.apache.org/solr/CommonQueryParameters. Since OpenCms version 8.5.x you can increase the number of returned documents to a size of your choice.
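Paging with rows and start boils down to simple arithmetic: start is the zero-based offset of the first document on the requested page. The following sketch illustrates this; the class SolrPaging and its methods are hypothetical helpers, not part of Solr or OpenCms, and pages are assumed to be numbered from zero.

```java
/** Illustrative paging helper (hypothetical, not part of Solr or OpenCms). */
class SolrPaging {

    /** Computes the Solr start offset for a zero-based page number and a page size (rows). */
    static int startFor(int page, int rows) {
        if (page < 0 || rows <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and rows must be > 0");
        }
        return page * rows;
    }

    /** Renders the paging part of a Solr query string for the given page. */
    static String pagingParams(int page, int rows) {
        return "start=" + startFor(page, rows) + "&rows=" + rows;
    }

    public static void main(String[] args) {
        // Third page (index 2) with ten documents per page.
        System.out.println(pagingParams(2, 10));
    }
}
```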
Yes, use the OpenCms Solr select handler at localhost:8080/opencms/opencms/handleSolrSelect and you will find the highlighting section below the list of documents within the returned XML/JSON:
<lst name="highlighting">
<lst name="a710bb16-1e04-11e2-b767-6805ca037347">
<arr name="content_en">
<str><em>YIPI</em> <em>YOHO</em> text text text</str>
</arr>
</lst>
[...]
</lst>
Currently, the OpenCms search API does not support full-featured Solr highlighting. You can, of course, still make use of the default Solr highlighting mechanism: call org.opencms.search.solr.CmsSolrResultList#getSolrQueryResponse(), which returns a SolrQueryResponse as documented in the Solr API documentation, or use the OpenCms Solr select handler at http://localhost:8080/opencms/opencms/handleSolrSelect
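If the XML response of the select handler is processed by a client, the highlighting section can be read with the standard Java DOM parser. The following is a minimal sketch under stated assumptions: the class HighlightingReader is hypothetical, and the sample input mirrors the highlighting snippet shown above (with the elided part left out) rather than a real server response.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Illustrative reader (hypothetical) for the highlighting section of a Solr XML response. */
class HighlightingReader {

    /** Returns the plain text of every str element inside the lst named "highlighting". */
    static List<String> snippets(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> result = new ArrayList<>();
            NodeList lists = doc.getElementsByTagName("lst");
            for (int i = 0; i < lists.getLength(); i++) {
                Element lst = (Element) lists.item(i);
                if (!"highlighting".equals(lst.getAttribute("name"))) {
                    continue;
                }
                // Collect all snippets below the highlighting list, regardless of document id or field.
                NodeList strs = lst.getElementsByTagName("str");
                for (int j = 0; j < strs.getLength(); j++) {
                    result.add(strs.item(j).getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the Solr response", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<response><lst name=\"highlighting\">"
            + "<lst name=\"a710bb16-1e04-11e2-b767-6805ca037347\"><arr name=\"content_en\">"
            + "<str><em>YIPI</em> <em>YOHO</em> text text text</str>"
            + "</arr></lst></lst></response>";
        System.out.println(snippets(xml));
    }
}
```

The sketch drops the em markup and keeps only the text; a client that wants to render the highlighted terms would have to walk the child nodes instead of calling getTextContent().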
Yes. For this reason, highlighting is turned off before the first search is executed. After all non-permitted resources have been filtered out of the result list, highlighting is performed.
As the index names suggest, Offline indexes also contain changes that have not yet been published, whereas Online indexes contain only those resources that have already been published. The "Online EN VFS" index is Lucene-based and likewise contains only resources that have been published.
No, permissions are checked by the OpenCms API afterwards.
You can copy the index folder WEB-INF/index/${INDEX_NAME} by hand.
Edit the opencms-search.xml file within your WEB-INF/config directory and add the following node to your index:
<param name="org.opencms.search.CmsSearchIndex.useBackupReindexing">true</param>
This will create a snapshot as explained here.
You have to set the right classes for the index and the field configuration; otherwise, the Lucene search index implementation is used.
<index class="org.opencms.search.solr.CmsSolrIndex">[...]</index>
<fieldconfiguration class="org.opencms.search.solr.CmsSolrFieldConfiguration">
[...]
</fieldconfiguration>