Class SolrIndexService


  • @DeclareRoles({"org.imixs.ACCESSLEVEL.NOACCESS","org.imixs.ACCESSLEVEL.READERACCESS","org.imixs.ACCESSLEVEL.AUTHORACCESS","org.imixs.ACCESSLEVEL.EDITORACCESS","org.imixs.ACCESSLEVEL.MANAGERACCESS"})
    @RolesAllowed({"org.imixs.ACCESSLEVEL.NOACCESS","org.imixs.ACCESSLEVEL.READERACCESS","org.imixs.ACCESSLEVEL.AUTHORACCESS","org.imixs.ACCESSLEVEL.EDITORACCESS","org.imixs.ACCESSLEVEL.MANAGERACCESS"})
    public class SolrIndexService
    extends Object
    The SolrIndexService provides methods to add, update and remove Imixs documents from a Solr index.

    The service validates the Solr index schema and updates the schema if it has changed.

    The SolrIndexService is used by the SolrUpdateService and the SolrSearchService, which extend and implement the Imixs-Index concept.

    The SolrIndexService can be configured by the following properties:

    • solr.api - api endpoint for the solr index
    • solr.core - name of the solr index core (default 'imixs-workflow')
    • solr.configset - an optional solr configset (default '_default')
    • solr.user - userid for optional basic authentication
    • solr.password - password for optional basic authentication
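
    The defaults listed above can be illustrated with a minimal sketch. It assumes the settings are available as a plain java.util.Properties object; how the real service obtains its configuration is not stated here, and the class SolrSettings and the example URL are hypothetical.

```java
import java.util.Properties;

/** Hypothetical holder for the solr.* settings; not part of the Imixs API. */
final class SolrSettings {
    final String api;
    final String core;
    final String configset;
    final String user;     // may be null: basic authentication is optional
    final String password; // may be null: basic authentication is optional

    SolrSettings(Properties p) {
        api = p.getProperty("solr.api");
        // Defaults taken from the property list in the class description.
        core = p.getProperty("solr.core", "imixs-workflow");
        configset = p.getProperty("solr.configset", "_default");
        user = p.getProperty("solr.user");
        password = p.getProperty("solr.password");
    }

    /** Basic authentication is only used when a user id is configured. */
    boolean usesBasicAuth() {
        return user != null && !user.isEmpty();
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("solr.api", "http://localhost:8983"); // example value only
        SolrSettings s = new SolrSettings(p);
        System.out.println(s.api + " core=" + s.core + " configset=" + s.configset);
    }
}
```

    Only solr.api has no default in this sketch, so a caller would have to check it for null before using it.
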
    Version:
    1.0
    Author:
    rsoika
    • Constructor Detail

      • SolrIndexService

        public SolrIndexService()
    • Method Detail

      • init

        @PostConstruct
        public void init()
        Create a rest client instance
      • setup

        public void setup​(@Observes
                          SetupEvent setupEvent)
                   throws RestAPIException
        This method verifies the schema of the Solr core. If field definitions have changed, a schema update is posted to the Solr REST API.

        The method assumes that a core is already created with a manageable schema.

        Parameters:
        setupEvent -
        Throws:
        RestAPIException
      • updateSchema

        public void updateSchema​(String schema)
                          throws RestAPIException
        Updates the schema definition of an existing Solr core.

        The schema definition is built by the method buildUpdateSchema(). The updateSchema method adds or replaces field definitions depending on the fieldList definitions provided by the Imixs SchemaService. See the method buildUpdateSchema() for details.

        The method assumes that a core already exists. Otherwise an exception is thrown.

        Parameters:
        schema - existing schema definition
        Throws:
        RestAPIException
      • indexDocuments

        public void indexDocuments​(List<ItemCollection> documents)
                            throws RestAPIException
        This method adds a collection of documents to the Lucene Solr index. The documents are added to the index immediately. Calling this method within a running transaction leads to uncommitted reads in the index. For transaction control, it is recommended to use the method solrUpdateService.updateDocuments() instead, which takes care of uncommitted reads.

        This method is used by the JobHandlerRebuildIndex only.

        Parameters:
        documents - list of ItemCollections to be indexed
        Throws:
        RestAPIException
      • indexDocument

        public void indexDocument​(ItemCollection document)
                           throws RestAPIException
        This method adds a single document to the Lucene Solr index. The document is added to the index immediately. Calling this method within a running transaction leads to uncommitted reads in the index. For transaction control, it is recommended to use the method solrUpdateService.updateDocuments() instead, which takes care of uncommitted reads.
        Parameters:
        document - ItemCollection to be indexed
        Throws:
        RestAPIException
      • removeDocuments

        public void removeDocuments​(List<String> documentIDs)
                             throws RestAPIException
        This method removes a collection of documents from the Lucene Solr index.
        Parameters:
        documentIDs - collection of UniqueIDs to be removed from the index
        Throws:
        RestAPIException
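
        As an illustration of what a removal request can look like, Solr's XML update format accepts a delete command with one id element per document. The sketch below only builds such a request body; whether removeDocuments() sends exactly this structure is an assumption, and the class DeleteDocSketch is hypothetical.

```java
import java.util.List;

final class DeleteDocSketch {
    /** Builds a Solr XML delete command with one <id> element per document ID. */
    static String buildDeleteDoc(List<String> documentIDs) {
        StringBuilder xml = new StringBuilder("<delete>");
        for (String id : documentIDs) {
            xml.append("<id>").append(escape(id)).append("</id>");
        }
        return xml.append("</delete>").toString();
    }

    // Minimal XML escaping for element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(buildDeleteDoc(List.of("id-1", "id-2")));
    }
}
```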
      • removeDocument

        public void removeDocument​(String id)
                            throws RestAPIException
        This method removes a single document from the Lucene Solr index.
        Parameters:
        id - UniqueID of the document to be removed from the index
        Throws:
        RestAPIException
      • rebuildIndex

        public void rebuildIndex()
        This method forces an update of the full text index.
      • query

        public String query​(String searchTerm,
                            int pageSize,
                            int pageIndex,
                            SortOrder sortOrder,
                            DefaultOperator defaultOperator,
                            boolean loadStubs)
                     throws QueryException
        This method posts a search query and returns the result.

        By default the method returns the documents containing all stored or DocValues fields. If the param 'loadStubs' is false, only the field '$uniqueid' is returned by the method. In that case the caller is responsible for loading the full document from the DocumentService.

        Because fieldnames must not contain $ symbols we need to replace those field names used in a query.
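
        The replacement of field names inside a search term can be sketched with a regular expression. This is only an illustration: the class QueryFieldNames and the method adaptSearchTerm are hypothetical, and the sketch rewrites every '$name:' prefix it finds, whereas the real method may restrict the replacement to fields of the index schema.

```java
import java.util.regex.Pattern;

final class QueryFieldNames {
    // Matches a '$' that starts a field name which is directly followed by ':'
    // (for example "$workflowgroup:"). The lookbehind skips a '$' inside a word.
    private static final Pattern DOLLAR_FIELD = Pattern.compile("(?<![\\w$])\\$(\\w+):");

    /** Replaces the leading '$' of each field name in the search term with '_'. */
    static String adaptSearchTerm(String searchTerm) {
        // In the replacement string, $1 refers to the captured field name.
        return DOLLAR_FIELD.matcher(searchTerm).replaceAll("_$1:");
    }

    public static void main(String[] args) {
        System.out.println(adaptSearchTerm("($workflowgroup:Invoice AND $taskid:100)"));
    }
}
```

        A limitation of this sketch is that it does not distinguish field names from quoted phrase values that happen to contain the same pattern.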

        Parameters:
        searchTerm -
        Returns:
        Throws:
        QueryException
      • adaptSolrFieldName

        public String adaptSolrFieldName​(String itemName)
        This method adapts a Solr field name to the corresponding Imixs item name. Because Solr does not accept a $ character at the beginning of a field name, we need to replace a leading _ with $ if the item is part of the Imixs Index Schema.
        Parameters:
        itemName -
        Returns:
        adapted Imixs item name
      • adaptImixsItemName

        public String adaptImixsItemName​(String itemName)
        This method adapts an Imixs item name to the corresponding Solr field name. Because Solr does not accept a $ character at the beginning of a field name, we need to replace a leading $ with _ if the item is part of the Imixs Index Schema.
        Parameters:
        itemName -
        Returns:
        adapted Solr field name
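
        The two mappings described for adaptSolrFieldName and adaptImixsItemName can be sketched as follows. The class FieldNameAdapter and its set of schema items are hypothetical stand-ins for the Imixs Index Schema; the real methods may determine schema membership differently.

```java
import java.util.Set;

final class FieldNameAdapter {
    // Hypothetical stand-in for the item names defined by the Imixs Index Schema.
    private final Set<String> schemaItems;

    FieldNameAdapter(Set<String> schemaItems) {
        this.schemaItems = schemaItems;
    }

    /** "$uniqueid" becomes "_uniqueid" if the item belongs to the schema. */
    String toSolrFieldName(String itemName) {
        if (itemName.startsWith("$") && schemaItems.contains(itemName)) {
            return "_" + itemName.substring(1);
        }
        return itemName;
    }

    /** "_uniqueid" becomes "$uniqueid" if the corresponding item belongs to the schema. */
    String toImixsItemName(String fieldName) {
        if (fieldName.startsWith("_")) {
            String candidate = "$" + fieldName.substring(1);
            if (schemaItems.contains(candidate)) {
                return candidate;
            }
        }
        return fieldName;
    }

    public static void main(String[] args) {
        FieldNameAdapter adapter = new FieldNameAdapter(Set.of("$uniqueid", "$modified"));
        System.out.println(adapter.toSolrFieldName("$uniqueid"));
        System.out.println(adapter.toImixsItemName("_version_"));
    }
}
```

        Checking against the schema keeps Solr-internal fields that start with an underscore, such as _version_, unchanged.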
      • buildUpdateSchema

        protected String buildUpdateSchema​(String oldSchema)
        This method builds a JSON structure to be used to update an existing Solr schema. The method adds or replaces field definitions in a Solr update schema.

        The param oldSchema contains the current schema definition of the core.

        In Solr there are two field attributes that define whether the value of a field is stored and returned by a query:

        {"add-field":{"name":"field1", "type":"text_general", "stored":true, "docValues":true}}

        In both cases the values are stored in the Lucene index and returned by a query.

        Stored fields (stored=true) are row-oriented. That means that, like in an SQL table, the values are stored based on the ID of the document.

        In contrast, docValues are stored column-oriented (forward index). The values are ordered based on the search term. For features like sorting, grouping or faceting, docValues generally improve performance, so docValues may look like the better choice. One important difference, however, is how the values are stored. For a stored multi-value field, the values are stored in exactly the order in which they were indexed. DocValues, in contrast, are sorted and reordered, which would falsify the result of a document returned by a query.

        In Imixs-Workflow we use the stored attribute to return parts of a document at query time. We call this a document-stub, which contains only a subset of fields. Later we load the full document from the SQL database. As stored fields in our workflow application are also often used for sorting, we combine both attributes. For a non-stored field we also set docValues=false to avoid storing fields unnecessarily.
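
        The rule described above (stored fields also get docValues, non-stored fields get neither) can be sketched as a small command builder for the Solr Schema API, which supports the commands add-field and replace-field. The class SchemaCommandSketch, the method fieldCommand and the naive string lookup in the old schema are assumptions made for this sketch, not the actual implementation.

```java
final class SchemaCommandSketch {
    /**
     * Builds one Solr Schema API command for a field. Uses "replace-field" when the
     * old schema already mentions the field name, otherwise "add-field".
     */
    static String fieldCommand(String oldSchema, String name, String type, boolean stored) {
        // Naive lookup, an assumption for this sketch: real code should parse the JSON.
        boolean exists = oldSchema.contains("\"name\":\"" + name + "\"");
        String command = exists ? "replace-field" : "add-field";
        // Stored fields also get docValues=true, non-stored fields get docValues=false.
        return "\"" + command + "\":{"
                + "\"name\":\"" + name + "\","
                + "\"type\":\"" + type + "\","
                + "\"stored\":" + stored + ","
                + "\"docValues\":" + stored + "}";
    }

    public static void main(String[] args) {
        String oldSchema = "{\"fields\":[{\"name\":\"type\",\"type\":\"string\"}]}";
        String body = "{" + fieldCommand(oldSchema, "type", "string", true) + ","
                + fieldCommand(oldSchema, "_subject", "text_general", false) + "}";
        System.out.println(body);
    }
}
```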

        Returns:
      • buildAddDoc

        protected String buildAddDoc​(List<ItemCollection> documents)
        This method returns an XML structure to add new documents to the Solr index.
        Returns:
        xml content to update documents
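
        A minimal sketch of such an XML structure, using the add/doc/field elements of Solr's XML update format. A plain Map stands in for an ItemCollection, and wrapping each value in a CDATA section is an assumption; the class AddDocSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AddDocSketch {
    /** Builds an <add> request with one <doc> per map and one <field> per entry. */
    static String buildAddDoc(List<Map<String, String>> documents) {
        StringBuilder xml = new StringBuilder("<add>");
        for (Map<String, String> doc : documents) {
            xml.append("<doc>");
            for (Map.Entry<String, String> field : doc.entrySet()) {
                // Values are assumed to contain no "]]>" sequence and no control codes.
                xml.append("<field name=\"").append(field.getKey()).append("\">")
                   .append("<![CDATA[").append(field.getValue()).append("]]>")
                   .append("</field>");
            }
            xml.append("</doc>");
        }
        return xml.append("</add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "abc");
        doc.put("_subject", "Invoice <draft>");
        System.out.println(buildAddDoc(List.of(doc)));
    }
}
```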
      • stripControlCodes

        protected String stripControlCodes​(String s)
        This helper method strips control codes and extended characters from a string. We cannot put those characters into the XML request sent to Solr.

        Background:

        In ASCII, the control codes have the decimal codes 0 through 31 and 127. On an ASCII-based system, a string whose control codes have been stripped contains only characters in the range 32 to 126.
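
        The background above can be sketched as a simple filter. This sketch removes only the control codes 0 to 31 and 127; whether and how the real method also drops extended characters is not specified here, and the class ControlCodeSketch is hypothetical.

```java
final class ControlCodeSketch {
    /** Returns the input without the ASCII control codes 0..31 and 127. */
    static String stripControlCodes(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 32 && c != 127) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripControlCodes("a\u0000b\tc\u007Fd"));
    }
}
```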

        Parameters:
        s -
        Returns:
      • stripCDATA

        protected String stripCDATA​(String s)
        This helper method strips CDATA blocks from a string. We cannot post embedded CDATA inside an already existing CDATA section when we post the XML to Solr.
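
        One possible reading of this helper is sketched below. It assumes that only the CDATA markers are removed and the enclosed text is kept; the real method may instead drop the whole block. The class CdataSketch is hypothetical.

```java
final class CdataSketch {
    /** Removes the CDATA start and end markers but keeps the enclosed text. */
    static String stripCDATA(String s) {
        return s.replace("<![CDATA[", "").replace("]]>", "");
    }

    public static void main(String[] args) {
        System.out.println(stripCDATA("before <![CDATA[inner]]> after"));
    }
}
```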

        Parameters:
        s -
        Returns:
      • flushEventLogByCount

        public boolean flushEventLogByCount​(int count)
        This method flushes a given count of eventLogEntries. The method returns true if no more eventLogEntries exist.
        Parameters:
        count - the maximum number of eventLog entries to remove.
        Returns:
        true if the cache was totally flushed.
      • flushEventLog

        public boolean flushEventLog​(int junkSize)
        Flush the EventLog cache. This method is called by the LuceneSearchService only.

        The method flushes the cache in smaller blocks of the given junkSize to avoid a heap size problem. The default flush size is 16. The eventLog cache is tracked by the flag 'dirtyIndex'.

        issue #439 - The method returns false if the event log contains more entries than defined by the given junkSize. In this case the caller should call the method again; each call always runs in a new transaction. The goal of this mechanism is to reduce the event log even in cases where the outer transaction breaks.
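
        The calling pattern described above can be sketched as a loop that repeats the flush until it reports completion. The class FlushLoopSketch and its in-memory queue are hypothetical stand-ins for the event log; transaction handling is omitted.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class FlushLoopSketch {
    // Hypothetical stand-in for the event log; not the Imixs event log itself.
    private final Deque<String> eventLog = new ArrayDeque<>();

    void add(String entry) {
        eventLog.add(entry);
    }

    /** Removes at most 'count' entries and reports whether the log is now empty. */
    boolean flushEventLogByCount(int count) {
        for (int i = 0; i < count && !eventLog.isEmpty(); i++) {
            eventLog.poll(); // a real implementation would update the index here
        }
        return eventLog.isEmpty();
    }

    /** Caller side: repeats the flush until it reports completion; returns the number of calls. */
    int flushAll(int junkSize) {
        if (junkSize <= 0) {
            throw new IllegalArgumentException("junkSize must be positive");
        }
        int calls = 1;
        while (!flushEventLogByCount(junkSize)) {
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        FlushLoopSketch sketch = new FlushLoopSketch();
        for (int i = 0; i < 40; i++) {
            sketch.add("entry-" + i);
        }
        System.out.println("calls needed: " + sketch.flushAll(16));
    }
}
```

        Processing a bounded block per call keeps the memory footprint small, and in the real service each call would commit its own block independently of the others.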

        Returns:
        true if the complete event log was flushed. If false, the method must be called again.
        See Also:
        LuceneSearchService