com.jalios.jcms.search
Class LuceneCategorySearchEngine

java.lang.Object
  extended by com.jalios.jcms.search.LuceneDataSearchEngine
      extended by com.jalios.jcms.search.LuceneCategorySearchEngine
All Implemented Interfaces:
JcmsConstants, CategorySearchEngine, JaliosConstants

public class LuceneCategorySearchEngine
extends LuceneDataSearchEngine
implements CategorySearchEngine, JcmsConstants

This CategorySearchEngine is responsible for indexing and searching JCMS categories using Lucene.

Architecture and notable points:

  • 1 Lucene index per language: WEB-INF/data/lucene/CategoriesIndices/<lang>/.
  • 1 Document per indexed Category.
  • Index optimization runs on the schedule specified by the property "search-engine.optimize-schedule" (jdring's AlarmEntry cron-like format).

  • Possible Hooks/Modification:
      • Specify the analyzer for each language: Analyzer getAnalyzer(String lang);
      • Specify the boost for each Document, in each language: LuceneSearchEnginePolicyFilter.getCategoryBoost(Category, String, float)
      • Specify the boost for each Document's Field, in each language: LuceneSearchEnginePolicyFilter.getFieldBoost(Category, String, String, String, float)
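
The one-index-per-language layout described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's actual code: the class CategoryIndexLayoutSketch, the constant BASE_DIR and the method indexDirsFor are hypothetical names, and the language codes are examples. Only the base path comes from the description above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of JCMS: one category index directory per language. */
final class CategoryIndexLayoutSketch {

    // Base path as given in the class description above.
    static final String BASE_DIR = "WEB-INF/data/lucene/CategoriesIndices";

    /** Maps each language code to its index directory, keeping the input order. */
    static Map<String, String> indexDirsFor(List<String> langs) {
        Map<String, String> langToIndexDir = new LinkedHashMap<String, String>();
        for (String lang : langs) {
            langToIndexDir.put(lang, BASE_DIR + "/" + lang + "/");
        }
        return langToIndexDir;
    }

    public static void main(String[] args) {
        // Language codes are illustrative; a real site takes them from its configuration.
        for (Map.Entry<String, String> e : indexDirsFor(Arrays.asList("en", "fr")).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Because every language has its own index, a Category is indexed once per language, each time as a separate Document.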
    Since:
    jcms-5.5.0
    Version:
    $Revision: 31378 $
    Author:
    Olivier Jaquemet

    Nested Class Summary
     
    Nested classes/interfaces inherited from class com.jalios.jcms.search.LuceneDataSearchEngine
    LuceneDataSearchEngine.MultiSearcherWrapper
     
    Field Summary
    static String ALLFIELDS_FIELD
               
    protected static String CATEGORY_INDEX_DIRECTORY
               
    static String DESCRIPTION_FIELD
               
    static String MATCHED_CATEGORIES_ATTRIBUTE
              Key of the QueryResultSet attribute in which the LuceneCategorySearchEngine stores the HashSet of matched categories.
    static String NAME_FIELD
               
    static String REVISION
               
    static String SYNONYMS_FIELD
               
     
    Fields inherited from class com.jalios.jcms.search.LuceneDataSearchEngine
    alarmMgr, channel, directoryName, ID_FIELD, indexAccessLock, INDEXING_DATE_EXTRAINFO, INDEXING_DATE_FIELD, langList, langToIndexDirMap, MAX_BUFFERED_DOCS, MAX_FIELD_LENGTH, MAX_MERGE_DOCS, MERGE_FACTOR
     
    Fields inherited from interface com.jalios.jcms.JcmsConstants
    ADATE_SEARCH, ADMIN_NOTES_PROP, ADVANCED_TAB, ARCHIVES_DIR, ASCII_WIDTH, CATEGORY_TAB, CDATE_SEARCH, COMMON_ALARM, CONTENT_TAB, COOKIE_MAX_AGE, CTRL_TOPIC_INTERNAL, CTRL_TOPIC_REF, CTRL_TOPIC_VALUE, CTRL_TOPIC_WRITE, CUSTOM_PROP, DOCCHOOSER_HEIGHT, DOCCHOOSER_WIDTH, DOCS_DIR, EDATE_SEARCH, EMAIL_REGEXP, ERROR_MSG, FORBIDDEN_FILE_ACCESS, FORBIDDEN_REDIRECT, FORCE_REDIRECT, ICON_ARCHIVE, ICON_LOCK, ICON_LOCK_STRONG, ICON_WARN, ICON_WH_BOOK_CLOSED, ICON_WH_BOOK_OPEN, INFORMATION_MSG, JALIOS_JUNIT_PROP, JCMS_CADDY, JCMS_MSG_LIST, JSYNC_DOWNLOAD_DIR, JSYNC_SYNC_ALARM, LOG_FILE, LOG_TOPIC_SECURITY, LOGGER_PROP, LOGGER_XMLPROP, MBR_PHOTO_DIR, MDATE_SEARCH, MONITOR_XML, OP_CREATE, OP_DEEP_COPY, OP_DEEP_DELETE, OP_DELETE, OP_MERGE, OP_UPDATE, PDATE_SEARCH, PHOTO_DIR, PHOTO_ICON, PHOTO_ICON_HEIGHT, PHOTO_ICON_WIDTH, PHOTO_LARGE, PHOTO_LARGE_HEIGHT, PHOTO_LARGE_WIDTH, PHOTO_NORMAL, PHOTO_NORMAL_HEIGHT, PHOTO_NORMAL_WIDTH, PHOTO_SMALL, PHOTO_SMALL_HEIGHT, PHOTO_SMALL_WIDTH, PHOTO_TINY, PHOTO_TINY_HEIGHT, PHOTO_TINY_WIDTH, PREVIOUS_TAB, PRINT_VIEW, PRIVATE_FILE_ACCESS, PUBLIC_FILE_ACCESS, READ_RIGHT_TAB, SDATE_SEARCH, SEARCHENGINE_ALARM, SESSION_AUTHORIZED_FILENAMES_SET, STATS_REPORT_DIR, STATUS_PROP, STORE_XML, TEMPLATE_TAB, THUMBNAIL_LARGE_HEIGHT, THUMBNAIL_LARGE_WIDTH, THUMBNAIL_SMALL_HEIGHT, THUMBNAIL_SMALL_WIDTH, UDATE_SEARCH, UPDATE_RIGHT_TAB, UPLOAD_DIR, URL_REGEXP, WARNING_MSG, WEBAPP_PROP, WFEXPRESS_ALARM, WFREMINDER_ALARM, WORKFLOW_TAB, WORKFLOW_XML
     
    Fields inherited from interface com.jalios.util.JaliosConstants
    CRLF, MILLIS_IN_ONE_DAY, MILLIS_IN_ONE_HOUR, MILLIS_IN_ONE_MINUTE, MILLIS_IN_ONE_MONTH, MILLIS_IN_ONE_SECOND, MILLIS_IN_ONE_WEEK, MILLIS_IN_ONE_YEAR
     
    Constructor Summary
    LuceneCategorySearchEngine()
              Initialize the Lucene Search Engine
     
    Method Summary
     void add(Category cat)
              Add given Category to this lucene search engine.
     void add(Collection<Category> coll)
              Add given Collection of Category to this lucene search engine.
    protected  void addKeywordField(org.apache.lucene.document.Document doc, Category cat, String lang, String fieldName, String fieldValue, boolean applyBoost)
              This method creates a keyword Lucene Field with the given field's value of the given Category in the given language, and adds it to the given Document.
    protected  void addUnStoredField(org.apache.lucene.document.Document doc, Category cat, String lang, String fieldName, String fieldValue, boolean applyBoost)
              This method creates an unstored Lucene Field with the given field's value of the given Category in the given language, and adds it to the given Document.
     void clearAll()
              Clear indices in this searchEngine (cannot be undone!).
     void delete(Category cat)
              Delete given Category from this lucene search engine.
     void delete(Collection<Category> coll)
              Delete given Collection of Category from this lucene search engine.
    protected  com.jalios.jcms.search.LuceneDataSearchEngine.DataIterator<Data> getAllDataIterator()
              This method must be implemented by the LuceneSearchEngine.
     Date getIndexingDate(Category cat)
              Retrieve the Date at which the specified Category was indexed in the search engine.
    protected  org.apache.log4j.Logger getLogger()
              This method must be implemented by the LuceneSearchEngine.
    protected  void indexData(org.apache.lucene.index.IndexWriter writer, Data data, String lang)
              This method indexes the given Category in the given language, into the given index writer.
     boolean search(QueryHandler qh, HashSet<? extends Publication> pubSet, QueryResultSet resultSet)
              Perform a full-text search.
     Collection<Category> searchCategories(org.apache.lucene.search.Query query)
              Find the Categories matching the specified Lucene Query.
     Collection<Category> searchCategories(QueryHandler qh)
              Find the Categories matching the specified QueryHandler search options (mainly uses the text search parameter).
     void update(Category cat)
              Update given Category in this lucene search engine.
     void update(Collection<Category> coll)
              Update given Collection of Category in this lucene search engine.
     
    Methods inherited from class com.jalios.jcms.search.LuceneDataSearchEngine
    addData, addDataCollection, clearIndices, deleteData, deleteDataCollection, getDirectory, getIndexingDate, getIndexingDate, getLastOptimizeDateSinceRestart, getLastOptimizeDuration, getLastReindexDateSinceRestart, getLastReindexDuration, getLuceneDocument, getOperationStartTime, getProgressState, getSearcher, index, index, isOperationRunning, optimizeIndices, reindexAll, remove, setIndexWriterOptions, updateData, updateDataCollection
     
    Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
     

    Field Detail

    REVISION

    public static final String REVISION
    See Also:
    Constant Field Values

    MATCHED_CATEGORIES_ATTRIBUTE

    public static final String MATCHED_CATEGORIES_ATTRIBUTE
    Key of the QueryResultSet attribute in which the LuceneCategorySearchEngine stores the HashSet of matched categories.

    See Also:
    Constant Field Values

    CATEGORY_INDEX_DIRECTORY

    protected static final String CATEGORY_INDEX_DIRECTORY
    See Also:
    Constant Field Values

    NAME_FIELD

    public static final String NAME_FIELD
    See Also:
    Constant Field Values

    SYNONYMS_FIELD

    public static final String SYNONYMS_FIELD
    See Also:
    Constant Field Values

    DESCRIPTION_FIELD

    public static final String DESCRIPTION_FIELD
    See Also:
    Constant Field Values

    ALLFIELDS_FIELD

    public static final String ALLFIELDS_FIELD
    See Also:
    Constant Field Values
    Constructor Detail

    LuceneCategorySearchEngine

    public LuceneCategorySearchEngine()
                               throws Exception
    Initialize the Lucene Search Engine

    Throws:
    Exception - if an error occurs during initialization
    Method Detail

    add

    public void add(Category cat)
    Add given Category to this lucene search engine.

    Specified by:
    add in interface CategorySearchEngine
    Parameters:
    cat - the Category to index.

    update

    public void update(Category cat)
    Update given Category in this lucene search engine.

    Specified by:
    update in interface CategorySearchEngine
    Parameters:
    cat - the Category to reindex.

    delete

    public void delete(Category cat)
    Delete given Category from this lucene search engine.

    Specified by:
    delete in interface CategorySearchEngine
    Parameters:
    cat - the Category to delete.

    add

    public void add(Collection<Category> coll)
    Add given Collection of Category to this lucene search engine.

    Specified by:
    add in interface CategorySearchEngine
    Parameters:
    coll - the Collection of Category to index.

    update

    public void update(Collection<Category> coll)
    Update given Collection of Category in this lucene search engine.

    Specified by:
    update in interface CategorySearchEngine
    Parameters:
    coll - the Collection of Category to reindex.

    delete

    public void delete(Collection<Category> coll)
    Delete given Collection of Category from this lucene search engine.

    Specified by:
    delete in interface CategorySearchEngine
    Parameters:
    coll - the Collection of Category to delete.

    getIndexingDate

    public Date getIndexingDate(Category cat)
    Retrieve the Date at which the specified Category was indexed in the search engine.

    Specified by:
    getIndexingDate in interface CategorySearchEngine
    Parameters:
    cat - the Category for which to retrieve the indexing date.
    Returns:
    the indexing date of the category, or null if it was not indexed.
    Since:
    jcms-6.0.1
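
Since the method returns null for a Category that was not indexed, callers need a null check before comparing dates. The sketch below shows one possible use, assuming the caller also knows a modification date for the Category. The class IndexingDateSketch, the method needsIndexing and the staleness rule are illustrative assumptions, not part of the JCMS API.

```java
import java.util.Date;

/** Hypothetical helper, not part of JCMS: interprets an indexing date that may be null. */
final class IndexingDateSketch {

    /**
     * Returns true when the category was never indexed (null indexing date)
     * or was modified after it was last indexed.
     */
    static boolean needsIndexing(Date indexingDate, Date modificationDate) {
        if (indexingDate == null) {
            return true; // null means "not indexed", per the Returns clause above
        }
        // Assumption: a later modification date makes the indexed copy stale.
        return modificationDate != null && modificationDate.after(indexingDate);
    }

    public static void main(String[] args) {
        System.out.println(needsIndexing(null, new Date()));
        System.out.println(needsIndexing(new Date(2000L), new Date(1000L)));
    }
}
```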

    clearAll

    public void clearAll()
    Clear indices in this searchEngine (cannot be undone!).

    Specified by:
    clearAll in interface CategorySearchEngine

    search

    public boolean search(QueryHandler qh,
                          HashSet<? extends Publication> pubSet,
                          QueryResultSet resultSet)
    Description copied from interface: CategorySearchEngine
    Perform a full-text search.

    Specified by:
    search in interface CategorySearchEngine
    Parameters:
    qh - the QueryHandler in which to find search text and search options.
    pubSet - a HashSet containing all the Publications to search.
    If empty, the search is not performed at all.
    If null, all Publications found will be returned.
    This set MUST NOT be modified by the implementation.
    resultSet - the QueryResultSet that must be filled with the matching Publications.
    Returns:
    true if a search was performed in the CategorySearchEngine. Useful to differentiate a query returning zero results from a query not performed due to missing parameters (text, for example).
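
The contract above distinguishes three cases for pubSet (empty, null, non-empty) and uses the boolean return value to separate "no results" from "not performed". The sketch below is one reading of that contract, written with plain strings standing in for Publications. The class SearchContractSketch, its parameters and the decision to return false for an empty candidate set are assumptions for illustration, not the engine's actual implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the search contract; String ids replace Publication objects. */
final class SearchContractSketch {

    /**
     * @param text       search text; null or blank means a required parameter is missing
     * @param candidates stand-in for pubSet: empty = do not search, null = no restriction
     * @param indexHits  stand-in for what the index would match for the text
     * @param resultSet  filled with the matching ids
     * @return true only if a search was actually performed
     */
    static boolean search(String text, Set<String> candidates,
                          List<String> indexHits, Set<String> resultSet) {
        if (text == null || text.trim().isEmpty()) {
            return false; // missing search text: query not performed
        }
        if (candidates != null && candidates.isEmpty()) {
            return false; // empty set: search is not performed at all
        }
        for (String id : indexHits) {
            // candidates is only read, never modified
            if (candidates == null || candidates.contains(id)) {
                resultSet.add(id);
            }
        }
        return true; // performed, even if resultSet stayed empty
    }

    public static void main(String[] args) {
        Set<String> results = new LinkedHashSet<String>();
        boolean performed = search("news", null, Arrays.asList("p1", "p2"), results);
        System.out.println(performed + " " + results);
    }
}
```

A caller can then treat a false return as "nothing was asked" and a true return with an empty result set as "asked, nothing matched".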

    searchCategories

    public Collection<Category> searchCategories(QueryHandler qh)
    Find the Categories matching the specified QueryHandler search options (mainly uses the text search parameter).

    Parameters:
    qh - the QueryHandler holding the search options (mainly the search text)
    Returns:
    a collection of Category, ordered by relevance
    Since:
    jcms-7.1

    searchCategories

    public Collection<Category> searchCategories(org.apache.lucene.search.Query query)
    Find the Categories matching the specified Lucene Query.

    Parameters:
    query - a Lucene Query
    Returns:
    a collection of Category, ordered by relevance
    Since:
    jcms-7.1

    getLogger

    protected org.apache.log4j.Logger getLogger()
    Description copied from class: LuceneDataSearchEngine
    This method must be implemented by the LuceneSearchEngine. It must return the logger to be used for log messages.

    Specified by:
    getLogger in class LuceneDataSearchEngine
    Returns:
    Logger of this engine.

    getAllDataIterator

    protected com.jalios.jcms.search.LuceneDataSearchEngine.DataIterator<Data> getAllDataIterator()
    Description copied from class: LuceneDataSearchEngine
    This method must be implemented by the LuceneSearchEngine. It must return a DataIterator used to iterate over all Data to index. Used by LuceneDataSearchEngine.reindexAll().

    Specified by:
    getAllDataIterator in class LuceneDataSearchEngine

    indexData

    protected void indexData(org.apache.lucene.index.IndexWriter writer,
                             Data data,
                             String lang)
                      throws IOException
    This method indexes the given Category in the given language, into the given index writer.

    Specified by:
    indexData in class LuceneDataSearchEngine
    Throws:
    IOException

    addUnStoredField

    protected void addUnStoredField(org.apache.lucene.document.Document doc,
                                    Category cat,
                                    String lang,
                                    String fieldName,
                                    String fieldValue,
                                    boolean applyBoost)
    This method creates an unstored Lucene Field with the given field's value of the given Category in the given language, and adds it to the given Document.

    Parameters:
    applyBoost - whether to apply the boost. Useful for appendable fields, in which case the boost should be applied only to the first element.
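
The applyBoost rule for appendable fields can be illustrated with a small stand-in model. A likely rationale, stated here as an assumption, is that older Lucene versions multiply together the boosts of same-named fields in one Document, so boosting every appended value would compound the boost. FieldStub, addAppendableField and effectiveBoost are hypothetical names. No Lucene class is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the applyBoost rule; FieldStub stands in for a Lucene Field. */
final class ApplyBoostSketch {

    /** Minimal field stand-in: name, value and boost (1.0f is neutral). */
    static final class FieldStub {
        final String name;
        final String value;
        final float boost;

        FieldStub(String name, String value, float boost) {
            this.name = name;
            this.value = value;
            this.boost = boost;
        }
    }

    /** Adds one field per value; only the first value carries the boost. */
    static List<FieldStub> addAppendableField(String fieldName, List<String> values, float boost) {
        List<FieldStub> doc = new ArrayList<FieldStub>();
        boolean applyBoost = true;
        for (String value : values) {
            doc.add(new FieldStub(fieldName, value, applyBoost ? boost : 1.0f));
            applyBoost = false; // later values of the same field stay neutral
        }
        return doc;
    }

    /** Product of the boosts of all fields with the given name (the assumed combination rule). */
    static float effectiveBoost(List<FieldStub> doc, String fieldName) {
        float product = 1.0f;
        for (FieldStub f : doc) {
            if (f.name.equals(fieldName)) {
                product *= f.boost;
            }
        }
        return product;
    }

    public static void main(String[] args) {
        List<FieldStub> doc = addAppendableField("synonyms", Arrays.asList("car", "auto", "vehicle"), 2.0f);
        System.out.println(effectiveBoost(doc, "synonyms"));
    }
}
```

Under the multiplication assumption, boosting each of the three values by 2.0 would give an effective boost of 8.0, whereas boosting only the first keeps it at 2.0.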

    addKeywordField

    protected void addKeywordField(org.apache.lucene.document.Document doc,
                                   Category cat,
                                   String lang,
                                   String fieldName,
                                   String fieldValue,
                                   boolean applyBoost)
    This method creates a keyword Lucene Field with the given field's value of the given Category in the given language, and adds it to the given Document.

    Parameters:
    applyBoost - whether to apply the boost. Useful for appendable fields, in which case the boost should be applied only to the first element.


    Copyright © 2001-2010 Jalios SA. All Rights Reserved.