LuceneFileSearchEngine (JCMS API)

java.lang.Object
- com.jalios.jcms.search.LuceneDataSearchEngine
- - com.jalios.jcms.search.LuceneFileSearchEngine

All Implemented Interfaces:

JcmsConstants, FileSearchEngine, JaliosConstants
```
public class LuceneFileSearchEngine
extends LuceneDataSearchEngine
implements FileSearchEngine, JcmsConstants
```
This class is an implementation of FileSearchEngine based on the Lucene search engine.

Since:

jcms-4.1

Version:

$Revision: 49040 $

Field Summary

Fields
Modifier and Type	Field and Description
`static java.lang.String`	`AUTHORID_FIELD` Field name for the id of the FileDocument's author, e.g. "j_2", or "21243_DBMember"
`static java.lang.String`	`CLASSNAME_FIELD` Field name for the className in jcms, e.g. "c_345", or "com.jalios.jcms.DBFileDocument"
`static java.lang.String`	`CONTENTS_FIELD` Field name for the content of the file
`static java.lang.String`	`FILE_INDEX_DIRECTORY`
`static java.lang.String`	`JALIOS_DATE_FIELD` Field name for the Indexing Date (time in ms)
`static java.lang.String`	`JCMS_ID_FIELD` Field name for the id in jcms, e.g. "c_345", or "21243_DBFileDocument"
`static java.lang.String`	`JCMS_PATH_FIELD` Field name for the relative path in jcms, e.g. "upload/docs/file.txt", or fileDoc.getFilename()
`static java.lang.String`	`MODIFIED_FIELD` Field name for the last modified date of the file (time in ms) when it was indexed
`static java.lang.String`	`PATH_FIELD` Field name for the file path (file.getPath()) when it was indexed
`static java.lang.String`	`PSTATUS_FIELD` Field name for the pstatus of the FileDocument, e.g. "-10", "0", "100"
`static java.lang.String`	`REVISION`
`static java.lang.String`	`WORKSPACEID_FIELD` Field name for the id of the FileDocument's workspace, e.g. "j_4"

Fields inherited from class com.jalios.jcms.search.LuceneDataSearchEngine
alarmMgr, directoryName, ID_FIELD, indexAccessLock, INDEXING_DATE_EXTRAINFO, INDEXING_DATE_FIELD, langList, langToIndexDirMap, MAX_BUFFERED_DOCS, MAX_FIELD_LENGTH, MAX_MERGE_DOCS, MERGE_FACTOR, multilingual

Fields inherited from interface com.jalios.jcms.JcmsConstants
ADATE_SEARCH, ADMIN_NOTES_PROP, ADVANCED_TAB, ARCHIVES_DIR, ASCII_WIDTH, CATEGORY_TAB, CDATE_SEARCH, COMMON_ALARM, CONTENT_TAB, COOKIE_MAX_AGE, CTRL_TOPIC_INTERNAL, CTRL_TOPIC_REF, CTRL_TOPIC_VALUE, CTRL_TOPIC_WRITE, CUSTOM_PROP, DOCCHOOSER_HEIGHT, DOCCHOOSER_WIDTH, DOCS_DIR, EDATE_SEARCH, EMAIL_REGEXP, ERROR_MSG, FORBIDDEN_FILE_ACCESS, FORBIDDEN_REDIRECT, FORCE_REDIRECT, ICON_ARCHIVE, ICON_LOCK, ICON_LOCK_STRONG, ICON_WARN, ICON_WH_BOOK_CLOSED, ICON_WH_BOOK_OPEN, INFORMATION_MSG, JALIOS_JUNIT_PROP, JCMS_CADDY, JCMS_MSG_LIST, JSYNC_DOWNLOAD_DIR, JSYNC_SYNC_ALARM, LOG_FILE, LOG_TOPIC_SECURITY, LOGGER_PROP, LOGGER_XMLPROP, MBR_PHOTO_DIR, MDATE_SEARCH, MONITOR_XML, OP_CREATE, OP_CREATE_STR, OP_DEEP_COPY, OP_DEEP_COPY_STR, OP_DEEP_DELETE, OP_DEEP_DELETE_STR, OP_DELETE, OP_DELETE_STR, OP_MERGE, OP_MERGE_STR, OP_UPDATE, OP_UPDATE_STR, PDATE_SEARCH, PHOTO_DIR, PHOTO_ICON, PHOTO_ICON_HEIGHT, PHOTO_ICON_WIDTH, PHOTO_LARGE, PHOTO_LARGE_HEIGHT, PHOTO_LARGE_WIDTH, PHOTO_NORMAL, PHOTO_NORMAL_HEIGHT, PHOTO_NORMAL_WIDTH, PHOTO_SMALL, PHOTO_SMALL_HEIGHT, PHOTO_SMALL_WIDTH, PHOTO_TINY, PHOTO_TINY_HEIGHT, PHOTO_TINY_WIDTH, PREVIOUS_TAB, PRINT_VIEW, PRIVATE_FILE_ACCESS, PUBLIC_FILE_ACCESS, READ_RIGHT_TAB, SDATE_SEARCH, SEARCHENGINE_ALARM, SESSION_AUTHORIZED_FILENAMES_SET, STATS_REPORT_DIR, STATUS_PROP, STORE_XML, TEMPLATE_TAB, THUMBNAIL_LARGE_HEIGHT, THUMBNAIL_LARGE_WIDTH, THUMBNAIL_SMALL_HEIGHT, THUMBNAIL_SMALL_WIDTH, TYPES_ICON_ALT_PROP, TYPES_ICON_SUFFIX_PROP, TYPES_ICON_TITLE_PROP, TYPES_PREFIX_PROP, TYPES_THUMB_SUFFIX_PROP, UDATE_SEARCH, UPDATE_RIGHT_TAB, UPLOAD_DIR, URL_REGEXP, WARNING_MSG, WEBAPP_PROP, WFEXPRESS_ALARM, WFREMINDER_ALARM, WORKFLOW_TAB, WORKFLOW_XML

Fields inherited from interface com.jalios.util.JaliosConstants
CRLF, MILLIS_IN_ONE_DAY, MILLIS_IN_ONE_HOUR, MILLIS_IN_ONE_MINUTE, MILLIS_IN_ONE_MONTH, MILLIS_IN_ONE_SECOND, MILLIS_IN_ONE_WEEK, MILLIS_IN_ONE_YEAR

Constructor Summary

Constructors
Constructor and Description

LuceneFileSearchEngine()

Method Summary

Methods
Modifier and Type	Method and Description
`void`	`add(FileDocument fileDocument)` Add given `FileDocument` to this lucene search engine.
`void`	`delete(FileDocument fileDocument)` Delete given `FileDocument` from this lucene search engine.
`protected com.jalios.jcms.search.DataIterator<Data>`	`getAllDataIterator()` This method must be implemented by the LuceneSearchEngine.
`org.apache.lucene.store.FSDirectory`	`getDirectory()` Returns the lucene directory used by this LuceneFileSearchEngine.
`org.apache.lucene.document.Document`	`getDocument(java.lang.String filename)` Retrieve the Lucene Document bound to the specified filename.
`int`	`getFileCount()`
`protected org.apache.log4j.Logger`	`getLogger()` This method must be implemented by the LuceneSearchEngine.
`org.apache.lucene.document.Document`	`getLuceneDocument(FileDocument fileDoc, java.lang.String content)` Retrieve a new lucene Document for the specified file in preparation of indexing.
`protected org.apache.lucene.index.Term`	`getPrimaryTerm(Data data)` Override method for compatibility with legacy lucene file index which uses lucene field "id" (JCMS_ID_FIELD) for Data id, instead of the lucene field "_id_" (ID_FIELD) expected by default by LuceneDataSearchEngine.
`void`	`index(FileDocument fileDoc, java.lang.String content)` Add the specified FileDocument to the index, with the specified content.
`protected void`	`indexData(org.apache.lucene.index.IndexWriter writer, Data data, java.lang.String lang)` This method indexes the given FileDocument in the default language into the given index writer.
`boolean`	`isAvailable()`
`java.util.LinkedHashMap<java.lang.String,java.lang.Float>`	`search(QueryHandler qh)` Returns the identifiers of the publications matching a Lucene search.
`boolean`	`search(QueryHandler qh, java.util.HashSet<? extends Publication> pubSet, java.util.LinkedHashMap<java.lang.String,java.lang.Float> resultMap)` Perform a full-text search on indexed files
`boolean`	`search(QueryHandler qh, java.util.HashSet<? extends Publication> pubSet, QueryResultSet resultSet, boolean searchInDB)` Perform a full-text search on indexed files
`java.util.LinkedHashMap<java.lang.String,java.lang.Float>`	`search(QueryHandler qh, java.util.List<java.lang.String> idList)` Filters the given list of publication identifiers with a Lucene search.
`void`	`update(FileDocument fileDocument)` Update the given `FileDocument` in this Lucene search engine.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - REVISION
```
public static final java.lang.String REVISION
```
    See Also:
    Constant Field Values
  - FILE_INDEX_DIRECTORY
```
public static final java.lang.String FILE_INDEX_DIRECTORY
```
    See Also:
    Constant Field Values
  - PATH_FIELD
```
public static final java.lang.String PATH_FIELD
```
    Field name for the file path (file.getPath()) when it was indexed
    
    See Also:
    Constant Field Values
  - CONTENTS_FIELD
```
public static final java.lang.String CONTENTS_FIELD
```
    Field name for the content of the file
    
    See Also:
    Constant Field Values
  - MODIFIED_FIELD
```
public static final java.lang.String MODIFIED_FIELD
```
    Field name for the last modified date of the file (time in ms) when it was indexed
    
    See Also:
    Constant Field Values
  - JALIOS_DATE_FIELD
```
public static final java.lang.String JALIOS_DATE_FIELD
```
    Field name for the Indexing Date (time in ms)
    
    See Also:
    Constant Field Values
  - JCMS_PATH_FIELD
```
public static final java.lang.String JCMS_PATH_FIELD
```
    Field name for the relative path in jcms, e.g. "upload/docs/file.txt", or fileDoc.getFilename()
    
    See Also:
    Constant Field Values
  - JCMS_ID_FIELD
```
public static final java.lang.String JCMS_ID_FIELD
```
    Field name for the id in jcms, e.g. "c_345", or "21243_DBFileDocument"
    
    See Also:
    Constant Field Values
  - AUTHORID_FIELD
```
public static final java.lang.String AUTHORID_FIELD
```
    Field name for the id of the FileDocument's author, e.g. "j_2", or "21243_DBMember"
    
    See Also:
    Constant Field Values
  - PSTATUS_FIELD
```
public static final java.lang.String PSTATUS_FIELD
```
    Field name for the pstatus of the FileDocument, e.g. "-10", "0", "100"
    
    See Also:
    Constant Field Values
  - WORKSPACEID_FIELD
```
public static final java.lang.String WORKSPACEID_FIELD
```
    Field name for the id of the FileDocument's workspace, e.g. "j_4"
    
    See Also:
    Constant Field Values
  - CLASSNAME_FIELD
```
public static final java.lang.String CLASSNAME_FIELD
```
    Field name for the className in jcms, e.g. "c_345", or "com.jalios.jcms.DBFileDocument"
    
    See Also:
    Constant Field Values
- Constructor Detail
  - LuceneFileSearchEngine
```
public LuceneFileSearchEngine()
                       throws java.lang.Exception
```
    Throws:
    
    java.lang.Exception
- Method Detail
  - getDirectory
```
public org.apache.lucene.store.FSDirectory getDirectory()
```
    Returns the lucene directory used by this LuceneFileSearchEngine.
    Warning: do not modify the index; use this method only for read-only access to the directory.
    Note: the directory may not exist; check with IndexReader.indexExists(Directory).
    
    Returns:
    the instance of the FSDirectory used internally.
  - isAvailable
```
public boolean isAvailable()
```
    Specified by:
    
    isAvailable in interface FileSearchEngine
    
    Returns:
    true if the FileSearchEngine is available
    Since:
    
    jcms-4.0
  - getDocument
```
public org.apache.lucene.document.Document getDocument(java.lang.String filename)
```
    Retrieve the Lucene Document bound to the specified filename.
    
    Parameters:
    filename - relative file path e.g. "upload/docs/file.txt"
    
    Returns:
    the Lucene Document bound to the given filename, or null if it could not be found
    Since:
    
    jcms-4.0.1
  - search
```
public boolean search(QueryHandler qh,
             java.util.HashSet<? extends Publication> pubSet,
             QueryResultSet resultSet,
             boolean searchInDB)
```
    Perform a full-text search on indexed files
    
    Specified by:
    
    search in interface FileSearchEngine
    
    Parameters:
    qh - the QueryHandler in which to find the search text and search options.
    pubSet - a HashSet containing all the Publications to search.
    If empty, the search is not performed at all.
    If null, all Publications found will be returned.
    This set MUST NOT be modified by the implementation.
    resultSet - the QueryResultSet that must be filled with the matching Publications
    searchInDB - if false, only JStore publications are set in pubSet
    
    Returns:
    true if a search was performed in the FileSearchEngine. Useful to differentiate a query returning zero results from a query not performed due to missing parameters (text, for example).
    Since:
    
    jcms-5.5.0
  - search
```
public boolean search(QueryHandler qh,
             java.util.HashSet<? extends Publication> pubSet,
             java.util.LinkedHashMap<java.lang.String,java.lang.Float> resultMap)
```
    Perform a full-text search on indexed files
    
    Parameters:
    qh - the QueryHandler in which to find the search text and search options.
    pubSet - a HashSet containing all the Publications to search.
    If empty, the search is not performed at all.
    If null, all Publications found will be returned.
    This set MUST NOT be modified by the implementation.
    resultMap - the map that must be filled with the matching {Publication id, score} pairs
    
    Returns:
    true if a search was performed in the FileSearchEngine. Useful to differentiate a query returning zero results from a query not performed due to missing parameters (text, for example).
    Since:
    
    jcms-5.5.0
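
    The pubSet and return-value rules above can be illustrated with a minimal, self-contained sketch. It is not the JCMS implementation: the class PubSetContractSketch, its in-memory INDEX, the use of plain id strings instead of Publication objects, and the constant score are all assumptions made for illustration. Returning false for an empty pubSet is likewise an assumption, inferred from "search is not performed at all".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in that models only the documented contract; not JCMS code.
final class PubSetContractSketch {

  // Toy "index": publication id -> indexed text. Ids and texts are illustrative.
  static final Map<String, String> INDEX = new LinkedHashMap<>();
  static {
    INDEX.put("c_1", "lucene file search");
    INDEX.put("c_2", "workflow notes");
    INDEX.put("c_3", "search engine setup");
  }

  /**
   * Returns true if a search was actually performed, false if it was skipped
   * (missing text, or an empty pubSet).
   */
  static boolean search(String text, Set<String> pubSet, Map<String, Float> resultMap) {
    if (text == null || text.trim().isEmpty()) {
      return false; // missing parameter: nothing to search for
    }
    if (pubSet != null && pubSet.isEmpty()) {
      return false; // empty set: the search is not performed at all (assumed to report false)
    }
    for (Map.Entry<String, String> entry : INDEX.entrySet()) {
      // A null pubSet means "no restriction"; pubSet itself is never modified.
      boolean allowed = pubSet == null || pubSet.contains(entry.getKey());
      if (allowed && entry.getValue().contains(text)) {
        resultMap.put(entry.getKey(), 1.0f); // constant score, for illustration only
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, Float> hits = new LinkedHashMap<>();
    boolean performed = search("search", null, hits);
    System.out.println(performed + " " + hits.keySet());
  }
}
```

    With this shape, a caller can tell "searched, no hits" (true with an empty map) apart from "not searched" (false).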
  - search
```
public java.util.LinkedHashMap<java.lang.String,java.lang.Float> search(QueryHandler qh,
                                                               java.util.List<java.lang.String> idList)
```
    Description copied from interface: FileSearchEngine
    
    Filters the given list of publication identifiers with a Lucene search.
    
    Specified by:
    
    search in interface FileSearchEngine
    
    Parameters:
    qh - the QueryHandler in which to find the search text and search options.
    idList - the list of publication identifiers
    
    Returns:
    a map from the identifiers of the publications matching the Lucene query to their scores. This map is a subset of idList and respects its order.
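
    The ordering guarantee ("a subset of idList, in its order") can be sketched as follows. This is an illustration only: OrderedFilterSketch and its filter method are hypothetical, and the scoredHits map merely stands in for the raw outcome of a Lucene query.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating the documented ordering guarantee; not JCMS code.
final class OrderedFilterSketch {

  /**
   * Keeps only the ids present in scoredHits, in the order of idList.
   * scoredHits stands in for the raw result of a query (id -> score).
   */
  static LinkedHashMap<String, Float> filter(List<String> idList, Map<String, Float> scoredHits) {
    LinkedHashMap<String, Float> result = new LinkedHashMap<>();
    for (String id : idList) {
      Float score = scoredHits.get(id);
      if (score != null) {
        result.put(id, score); // insertion order follows idList, not the score
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Float> hits = new HashMap<>();
    hits.put("c_3", 0.9f);
    hits.put("c_1", 0.4f);
    System.out.println(filter(Arrays.asList("c_1", "c_2", "c_3"), hits).keySet());
  }
}
```

    A LinkedHashMap is the natural return type here because it iterates in insertion order, which is what lets the result respect the order of idList.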
  - search
```
public java.util.LinkedHashMap<java.lang.String,java.lang.Float> search(QueryHandler qh)
```
    Description copied from interface: FileSearchEngine
    
    Returns the identifiers of the publications matching a Lucene search.
    
    Specified by:
    
    search in interface FileSearchEngine
    
    Parameters:
    qh - the QueryHandler in which to find the search text and search options.
    
    Returns:
    a map from the identifiers of the publications matching the Lucene query to their scores.
  - getFileCount
```
public int getFileCount()
```
    Specified by:
    
    getFileCount in interface FileSearchEngine
    
    Returns:
    the number of indexed files
    Since:
    
    jcms-4.1
  - getLuceneDocument
```
public org.apache.lucene.document.Document getLuceneDocument(FileDocument fileDoc,
                                                    java.lang.String content)
```
    Retrieve a new lucene Document for the specified file in preparation of indexing.
    
    Parameters:
    fileDoc - the FileDocument for which file is being indexed
    content - the content of the file, optional.
    
    Returns:
    A new instance of Document suitable for indexation through
  - add
```
public void add(FileDocument fileDocument)
```
    Add the given FileDocument to this Lucene search engine. This method is asynchronous: the given data may not be (and almost certainly will not be) added immediately after the call.
    
    Specified by:
    
    add in interface FileSearchEngine
    
    Parameters:
    fileDocument - the FileDocument to index.
  - update
```
public void update(FileDocument fileDocument)
```
    Update the given FileDocument in this Lucene search engine. This method is asynchronous: the given data may not be (and almost certainly will not be) updated immediately after the call.
    
    Specified by:
    
    update in interface FileSearchEngine
    
    Parameters:
    fileDocument - the FileDocument to reindex.
  - delete
```
public void delete(FileDocument fileDocument)
```
    Delete the given FileDocument from this Lucene search engine. This method is asynchronous: the given data may not be (and almost certainly will not be) deleted immediately after the call.
    
    Specified by:
    
    delete in interface FileSearchEngine
    
    Parameters:
    fileDocument - the FileDocument to delete from the index.
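
    The practical consequence of the asynchronous add, update and delete methods is that the index does not reflect a change as soon as the call returns. The sketch below models that with a queue of pending operations that an explicit flush() applies. DeferredIndexSketch and flush() are hypothetical; this page does not describe how the real engine queues or schedules its work, and the sketch is deliberately single-threaded.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of deferred indexing; not the JCMS implementation.
final class DeferredIndexSketch {
  private final Queue<Runnable> pending = new ArrayDeque<>();
  private final Set<String> indexed = new LinkedHashSet<>();

  // add and delete only record the request; the index is untouched until flush().
  void add(String id) {
    pending.add(() -> indexed.add(id));
  }

  void delete(String id) {
    pending.add(() -> indexed.remove(id));
  }

  // Stands in for the background indexing pass that applies queued requests in order.
  void flush() {
    Runnable op;
    while ((op = pending.poll()) != null) {
      op.run();
    }
  }

  int getFileCount() {
    return indexed.size();
  }

  public static void main(String[] args) {
    DeferredIndexSketch engine = new DeferredIndexSketch();
    engine.add("c_1");
    System.out.println("before flush: " + engine.getFileCount());
    engine.flush();
    System.out.println("after flush: " + engine.getFileCount());
  }
}
```

    Code that needs to observe the change (a test, for example) should therefore not assert on the index right after calling add, update or delete.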
  - index
```
public void index(FileDocument fileDoc,
         java.lang.String content)
```
    Add the specified FileDocument to the index, with the specified content.
    Thread safety: this method runs under the same lock as the indexing thread created for LuceneFileSearchEngine. It therefore blocks while an indexing operation is already in progress, and it blocks other indexing until it has finished.
    Invoke it sparingly (it should only be needed by JCMSUploadIndexer and during unit tests).
    
    Parameters:
    fileDoc - the FileDocument to be indexed in lucene
    content - the content that was extracted for the FileDocument
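
    The blocking behaviour described under "Thread safety" follows from sharing a single lock between direct and background indexing. A minimal sketch, assuming a java.util.concurrent ReentrantLock: the class SharedLockSketch and its methods are hypothetical, and the field name indexAccessLock is borrowed from the inherited field list without implying anything about its actual type.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of one lock shared by all indexing paths; not JCMS code.
final class SharedLockSketch {
  private final ReentrantLock indexAccessLock = new ReentrantLock();
  private int indexedCount = 0;

  // Direct, synchronous indexing: waits for the lock and holds it until done.
  void index(String id) {
    indexAccessLock.lock();
    try {
      indexedCount++; // stands in for writing the document to the index
    } finally {
      indexAccessLock.unlock();
    }
  }

  // Reports whether another thread could start indexing right now without waiting.
  boolean canIndexFromAnotherThread() {
    final boolean[] acquired = new boolean[1];
    Thread probe = new Thread(() -> {
      if (indexAccessLock.tryLock()) {
        acquired[0] = true;
        indexAccessLock.unlock();
      }
    });
    probe.start();
    try {
      probe.join(); // join makes the probe's write to acquired[0] visible here
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return acquired[0];
  }

  // Runs the probe while this thread holds the lock, as an indexing pass would.
  boolean probeWhileIndexing() {
    indexAccessLock.lock();
    try {
      return canIndexFromAnotherThread();
    } finally {
      indexAccessLock.unlock();
    }
  }

  int getIndexedCount() {
    return indexedCount;
  }
}
```

    While one thread holds the lock, the probe cannot acquire it, which is the same reason a long synchronous index call delays every other indexing operation.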
  - getLogger
```
protected org.apache.log4j.Logger getLogger()
```
    Description copied from class: LuceneDataSearchEngine
    
    This method must be implemented by the LuceneSearchEngine. It must return the logger to be used for log messages.
    
    Specified by:
    
    getLogger in class LuceneDataSearchEngine
    
    Returns:
    Logger of this engine.
  - getAllDataIterator
```
protected com.jalios.jcms.search.DataIterator<Data> getAllDataIterator()
```
    Description copied from class: LuceneDataSearchEngine
    
    This method must be implemented by the LuceneSearchEngine. It must return a DataIterator used to iterate over all Data to index. Used by LuceneDataSearchEngine.reindexAll().
    
    Specified by:
    
    getAllDataIterator in class LuceneDataSearchEngine
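
    The relationship between reindexAll() and getAllDataIterator() reads like a template method: the base class drives the loop and the subclass supplies the data. The sketch below shows that pattern with stand-in types. BaseEngineSketch and FileEngineSketch are hypothetical, strings replace Data, a List replaces the IndexWriter, and the assumption that a full reindex calls indexData once per item is not confirmed by this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the base engine; not the JCMS class.
abstract class BaseEngineSketch {
  final List<String> written = new ArrayList<>();

  // Subclasses supply everything that must be indexed.
  protected abstract Iterator<String> getAllDataIterator();

  // Subclasses decide how a single item is written.
  protected abstract void indexData(List<String> writer, String data);

  // Template method: a full reindex walks the iterator and indexes each item.
  final void reindexAll() {
    written.clear();
    Iterator<String> it = getAllDataIterator();
    while (it.hasNext()) {
      indexData(written, it.next());
    }
  }
}

// Hypothetical subclass playing the role of a file search engine.
final class FileEngineSketch extends BaseEngineSketch {
  @Override
  protected Iterator<String> getAllDataIterator() {
    return Arrays.asList("upload/docs/a.txt", "upload/docs/b.txt").iterator();
  }

  @Override
  protected void indexData(List<String> writer, String data) {
    writer.add("file:" + data);
  }
}
```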
  - indexData
```
protected void indexData(org.apache.lucene.index.IndexWriter writer,
             Data data,
             java.lang.String lang)
                  throws java.io.IOException
```
    This method indexes the given FileDocument in the default language into the given index writer.
    
    Specified by:
    
    indexData in class LuceneDataSearchEngine
    
    Throws:
    
    java.io.IOException
  - getPrimaryTerm
```
protected org.apache.lucene.index.Term getPrimaryTerm(Data data)
```
    Override method for compatibility with legacy lucene file index which uses lucene field "id" (JCMS_ID_FIELD) for Data id, instead of the lucene field "_id_" (ID_FIELD) expected by default by LuceneDataSearchEngine.
    
    Overrides:
    
    getPrimaryTerm in class LuceneDataSearchEngine
    
    Returns:
    a Term instance, must not return null
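
    The override described above amounts to keying documents on a different field name. A minimal sketch with stand-in types: TermSketch, DataEngineSketch and LegacyFileEngineSketch are hypothetical, and a plain id string replaces Data. Only the two field names, "id" and "_id_", are taken from the description.

```java
// Hypothetical stand-in for a Lucene Term: a (field, text) pair.
final class TermSketch {
  final String field;
  final String text;

  TermSketch(String field, String text) {
    this.field = field;
    this.text = text;
  }
}

// Hypothetical base engine: by default the primary term uses the "_id_" field.
class DataEngineSketch {
  static final String ID_FIELD = "_id_";

  protected TermSketch getPrimaryTerm(String dataId) {
    return new TermSketch(ID_FIELD, dataId);
  }
}

// Hypothetical file engine: the legacy file index keys documents on the "id" field.
final class LegacyFileEngineSketch extends DataEngineSketch {
  static final String JCMS_ID_FIELD = "id";

  @Override
  protected TermSketch getPrimaryTerm(String dataId) {
    return new TermSketch(JCMS_ID_FIELD, dataId); // never null, as the contract requires
  }
}
```

    Because updates and deletions locate a document through its primary term, using the default field against a legacy index would simply match nothing.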
