public class LuceneFileSearchEngine extends LuceneDataSearchEngine implements FileSearchEngine, JcmsConstants
Modifier and Type | Field and Description |
---|---|
static java.lang.String | AUTHORID_FIELD: Field name for the id of the FileDocument's author, e.g. "j_2" or "21243_DBMember" |
static java.lang.String | CLASSNAME_FIELD: Field name for the className in jcms, e.g. |
static java.lang.String | CONTENTS_FIELD: Field name for the content of the file |
static java.lang.String | FILE_INDEX_DIRECTORY |
static java.lang.String | JALIOS_DATE_FIELD: Field name for the Indexing Date (time in ms) |
static java.lang.String | JCMS_ID_FIELD: Field name for the id in jcms, e.g. |
static java.lang.String | JCMS_PATH_FIELD: Field name for the relative path in jcms, e.g. |
static java.lang.String | MODIFIED_FIELD: Field name for the last modified date of the file (time in ms) when it was indexed |
static java.lang.String | PATH_FIELD: Field name for the file path (file.getPath()) when it was indexed |
static java.lang.String | PSTATUS_FIELD: Field name for the pstatus of the FileDocument, e.g. "-10", "0", "100" |
static java.lang.String | REVISION |
static java.lang.String | WORKSPACEID_FIELD: Field name for the id of the FileDocument's workspace, e.g. "j_4" |
Fields inherited from class LuceneDataSearchEngine:
alarmMgr, directoryName, ID_FIELD, indexAccessLock, INDEXING_DATE_EXTRAINFO, INDEXING_DATE_FIELD, langList, langToIndexDirMap, MAX_BUFFERED_DOCS, MAX_FIELD_LENGTH, MAX_MERGE_DOCS, MERGE_FACTOR, multilingual
Fields inherited from implemented interfaces, including JcmsConstants:
ADATE_SEARCH, ADMIN_NOTES_PROP, ADVANCED_TAB, ARCHIVES_DIR, ASCII_WIDTH, CATEGORY_TAB, CDATE_SEARCH, COMMON_ALARM, CONTENT_TAB, COOKIE_MAX_AGE, CTRL_TOPIC_INTERNAL, CTRL_TOPIC_REF, CTRL_TOPIC_VALUE, CTRL_TOPIC_WRITE, CUSTOM_PROP, DOCCHOOSER_HEIGHT, DOCCHOOSER_WIDTH, DOCS_DIR, EDATE_SEARCH, EMAIL_REGEXP, ERROR_MSG, FORBIDDEN_FILE_ACCESS, FORBIDDEN_REDIRECT, FORCE_REDIRECT, ICON_ARCHIVE, ICON_LOCK, ICON_LOCK_STRONG, ICON_WARN, ICON_WH_BOOK_CLOSED, ICON_WH_BOOK_OPEN, INFORMATION_MSG, JALIOS_JUNIT_PROP, JCMS_CADDY, JCMS_MSG_LIST, JSYNC_DOWNLOAD_DIR, JSYNC_SYNC_ALARM, LOG_FILE, LOG_TOPIC_SECURITY, LOGGER_PROP, LOGGER_XMLPROP, MBR_PHOTO_DIR, MDATE_SEARCH, MONITOR_XML, OP_CREATE, OP_CREATE_STR, OP_DEEP_COPY, OP_DEEP_COPY_STR, OP_DEEP_DELETE, OP_DEEP_DELETE_STR, OP_DELETE, OP_DELETE_STR, OP_MERGE, OP_MERGE_STR, OP_UPDATE, OP_UPDATE_STR, PDATE_SEARCH, PHOTO_DIR, PHOTO_ICON, PHOTO_ICON_HEIGHT, PHOTO_ICON_WIDTH, PHOTO_LARGE, PHOTO_LARGE_HEIGHT, PHOTO_LARGE_WIDTH, PHOTO_NORMAL, PHOTO_NORMAL_HEIGHT, PHOTO_NORMAL_WIDTH, PHOTO_SMALL, PHOTO_SMALL_HEIGHT, PHOTO_SMALL_WIDTH, PHOTO_TINY, PHOTO_TINY_HEIGHT, PHOTO_TINY_WIDTH, PREVIOUS_TAB, PRINT_VIEW, PRIVATE_FILE_ACCESS, PUBLIC_FILE_ACCESS, READ_RIGHT_TAB, SDATE_SEARCH, SEARCHENGINE_ALARM, SESSION_AUTHORIZED_FILENAMES_SET, STATS_REPORT_DIR, STATUS_PROP, STORE_XML, TEMPLATE_TAB, THUMBNAIL_LARGE_HEIGHT, THUMBNAIL_LARGE_WIDTH, THUMBNAIL_SMALL_HEIGHT, THUMBNAIL_SMALL_WIDTH, TYPES_ICON_ALT_PROP, TYPES_ICON_SUFFIX_PROP, TYPES_ICON_TITLE_PROP, TYPES_PREFIX_PROP, TYPES_THUMB_SUFFIX_PROP, UDATE_SEARCH, UPDATE_RIGHT_TAB, UPLOAD_DIR, URL_REGEXP, WARNING_MSG, WEBAPP_PROP, WFEXPRESS_ALARM, WFREMINDER_ALARM, WORKFLOW_TAB, WORKFLOW_XML
CRLF, MILLIS_IN_ONE_DAY, MILLIS_IN_ONE_HOUR, MILLIS_IN_ONE_MINUTE, MILLIS_IN_ONE_MONTH, MILLIS_IN_ONE_SECOND, MILLIS_IN_ONE_WEEK, MILLIS_IN_ONE_YEAR
Constructor and Description |
---|
LuceneFileSearchEngine() |
Modifier and Type | Method and Description |
---|---|
void | add(FileDocument fileDocument): Add the given FileDocument to this Lucene search engine. |
void | delete(FileDocument fileDocument): Delete the given FileDocument from this Lucene search engine. |
protected com.jalios.jcms.search.DataIterator<Data> | getAllDataIterator(): This method must be implemented by the LuceneSearchEngine. |
org.apache.lucene.store.FSDirectory | getDirectory(): Returns the Lucene directory used by this LuceneFileSearchEngine. |
org.apache.lucene.document.Document | getDocument(java.lang.String filename): Retrieve the Lucene Document bound to the specified filename. |
int | getFileCount() |
protected org.apache.log4j.Logger | getLogger(): This method must be implemented by the LuceneSearchEngine. |
org.apache.lucene.document.Document | getLuceneDocument(FileDocument fileDoc, java.lang.String content): Retrieve a new Lucene Document for the specified file in preparation for indexing. |
protected org.apache.lucene.index.Term | getPrimaryTerm(Data data): Overridden for compatibility with the legacy Lucene file index, which uses the Lucene field "id" (JCMS_ID_FIELD) for the Data id instead of the Lucene field "_id_" (ID_FIELD) expected by default by LuceneDataSearchEngine. |
void | index(FileDocument fileDoc, java.lang.String content): Add the specified FileDocument to the index, with the specified content. |
protected void | indexData(org.apache.lucene.index.IndexWriter writer, Data data, java.lang.String lang): Indexes the given FileDocument in the default language into the given index writer. |
boolean | isAvailable() |
java.util.LinkedHashMap<java.lang.String,java.lang.Float> | search(QueryHandler qh): Returns the identifiers of the publications matching a Lucene search. |
boolean | search(QueryHandler qh, java.util.HashSet<? extends Publication> pubSet, java.util.LinkedHashMap<java.lang.String,java.lang.Float> resultMap): Perform a full-text search on indexed files. |
boolean | search(QueryHandler qh, java.util.HashSet<? extends Publication> pubSet, QueryResultSet resultSet, boolean searchInDB): Perform a full-text search on indexed files. |
java.util.LinkedHashMap<java.lang.String,java.lang.Float> | search(QueryHandler qh, java.util.List<java.lang.String> idList): Filters the given list of publication identifiers with a Lucene search. |
void | update(FileDocument fileDocument): Update the given FileDocument in this Lucene search engine. |
Methods inherited from class LuceneDataSearchEngine:
addData, addDataCollection, clearIndices, clearSearcher, deleteData, deleteDataCollection, getDirectory, getIndexingDate, getIndexingDate, getLastOptimizeDateSinceRestart, getLastOptimizeDuration, getLastReindexDateSinceRestart, getLastReindexDuration, getLuceneDocument, getOperationStartTime, getProgressState, getSearcher, index, index, isOperationRunning, optimizeIndices, reindexAll, remove, setIndexWriterOptions, updateData, updateDataCollection
public static final java.lang.String REVISION
public static final java.lang.String FILE_INDEX_DIRECTORY
public static final java.lang.String PATH_FIELD
public static final java.lang.String CONTENTS_FIELD
public static final java.lang.String MODIFIED_FIELD
public static final java.lang.String JALIOS_DATE_FIELD
public static final java.lang.String JCMS_PATH_FIELD
public static final java.lang.String JCMS_ID_FIELD
public static final java.lang.String AUTHORID_FIELD
public static final java.lang.String PSTATUS_FIELD
public static final java.lang.String WORKSPACEID_FIELD
public static final java.lang.String CLASSNAME_FIELD
public LuceneFileSearchEngine() throws java.lang.Exception
Throws: java.lang.Exception
public org.apache.lucene.store.FSDirectory getDirectory()
Returns the Lucene directory used by this LuceneFileSearchEngine.
See: IndexReader.indexExists(Directory).

public boolean isAvailable()
Specified by: isAvailable in interface FileSearchEngine

public org.apache.lucene.document.Document getDocument(java.lang.String filename)
Retrieve the Lucene Document bound to the specified filename.
Parameters:
filename - relative file path, e.g. "upload/docs/file.txt"

public boolean search(QueryHandler qh, java.util.HashSet<? extends Publication> pubSet, QueryResultSet resultSet, boolean searchInDB)
Perform a full-text search on indexed files.
Specified by: search in interface FileSearchEngine
Parameters:
qh - the QueryHandler in which to find the search text and search options.
pubSet - a HashSet containing all the Publications to search. The Publications found will be returned.
resultSet - the QueryResultSet that must be filled with the matching Publications.
searchInDB - if false, only JStore publications are set in pubSet.

public boolean search(QueryHandler qh, java.util.HashSet<? extends Publication> pubSet, java.util.LinkedHashMap<java.lang.String,java.lang.Float> resultMap)
Perform a full-text search on indexed files.
Parameters:
qh - the QueryHandler in which to find the search text and search options.
pubSet - a HashSet containing all the Publications to search. The Publications found will be returned.
resultMap - the map that must be filled with the matching {Publication id, score} pairs.

public java.util.LinkedHashMap<java.lang.String,java.lang.Float> search(QueryHandler qh, java.util.List<java.lang.String> idList)
Description copied from interface: FileSearchEngine
Filters the given list of publication identifiers with a Lucene search.
Specified by: search in interface FileSearchEngine
Parameters:
qh - the QueryHandler in which to find the search text and search options.
idList - the list of publication identifiers.
Returns: results that are contained in idList and respect its order.

public java.util.LinkedHashMap<java.lang.String,java.lang.Float> search(QueryHandler qh)
Description copied from interface: FileSearchEngine
Returns the identifiers of the publications matching a Lucene search.
Specified by: search in interface FileSearchEngine
Parameters:
qh - the QueryHandler in which to find the search text and search options.

public int getFileCount()
Specified by: getFileCount in interface FileSearchEngine
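
The contract of search(QueryHandler, List) described above (results restricted to idList and returned in its order) can be illustrated with a small, self-contained sketch. It is not the JCMS implementation: the class name IdListFilterSketch, the filter helper, and the hits map standing in for the scores of a full-text search are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of JCMS. */
public class IdListFilterSketch {

    /**
     * Keeps only the ids of idList that appear in hits, in idList order.
     * "hits" is an assumed stand-in for {id, score} pairs found by a search.
     */
    public static LinkedHashMap<String, Float> filter(List<String> idList,
                                                      Map<String, Float> hits) {
        LinkedHashMap<String, Float> result = new LinkedHashMap<>();
        for (String id : idList) {
            Float score = hits.get(id);
            if (score != null) {
                // LinkedHashMap preserves insertion order, i.e. idList order.
                result.put(id, score);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Float> hits = new HashMap<>();
        hits.put("j_7", 0.4f);
        hits.put("j_2", 0.9f);
        hits.put("j_9", 0.1f);

        List<String> idList = Arrays.asList("j_2", "j_5", "j_7");
        System.out.println(filter(idList, hits).keySet()); // prints [j_2, j_7]
    }
}
```

Using a LinkedHashMap as the return type is what lets a caller rely on iteration order, which is presumably why the documented signature returns that type rather than a plain Map.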
public org.apache.lucene.document.Document getLuceneDocument(FileDocument fileDoc, java.lang.String content)
Retrieve a new Lucene Document for the specified file in preparation for indexing.
Parameters:
fileDoc - the FileDocument whose file is being indexed.
content - the content of the file, optional.

public void add(FileDocument fileDocument)
Add the given FileDocument to this Lucene search engine.
This method is asynchronous: the given data may not be (and almost certainly will not be) added immediately after the call.
Specified by: add in interface FileSearchEngine
Parameters:
fileDocument - the FileDocument to index.

public void update(FileDocument fileDocument)
Update the given FileDocument in this Lucene search engine.
This method is asynchronous: the given data may not be (and almost certainly will not be) updated immediately after the call.
Specified by: update in interface FileSearchEngine
Parameters:
fileDocument - the FileDocument to reindex.

public void delete(FileDocument fileDocument)
Delete the given FileDocument from this Lucene search engine.
This method is asynchronous: the given data may not be (and almost certainly will not be) deleted immediately after the call.
Specified by: delete in interface FileSearchEngine
Parameters:
fileDocument - the FileDocument to delete from the index.

public void index(FileDocument fileDoc, java.lang.String content)
Add the specified FileDocument to the index, with the specified content.
Thread safety: this method runs against the indexing thread created for LuceneFileSearchEngine (i.e. using the same lock). It therefore blocks if indexing is already in progress, and it blocks other indexing until it finishes. Invoke it sparingly: it should only be needed by JCMSUploadIndexer and during unit tests.
Parameters:
fileDoc - the FileDocument to be indexed in Lucene.
content - the content that was extracted for the FileDocument.

protected org.apache.log4j.Logger getLogger()
Description copied from class: LuceneDataSearchEngine
This method must be implemented by the LuceneSearchEngine.
Specified by: getLogger in class LuceneDataSearchEngine
Returns: the Logger of this engine.

protected com.jalios.jcms.search.DataIterator<Data> getAllDataIterator()
Description copied from class: LuceneDataSearchEngine
This method must be implemented by the LuceneSearchEngine.
See: LuceneDataSearchEngine.reindexAll().
Specified by: getAllDataIterator in class LuceneDataSearchEngine
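
The difference between the asynchronous add/update/delete methods and the blocking index method documented above can be sketched with plain java.util.concurrent classes. This is a minimal illustration under stated assumptions, not the JCMS implementation: the class AsyncIndexSketch, its single-thread executor, the ReentrantLock, and the use of String ids in place of FileDocument are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for an engine with one indexing thread and one lock. */
public class AsyncIndexSketch {
    private final ReentrantLock indexLock = new ReentrantLock();
    private final ExecutorService indexingThread = Executors.newSingleThreadExecutor();
    private final List<String> indexed = new ArrayList<>();

    /** Asynchronous: returns at once; the write happens later on the indexing thread. */
    public void add(String docId) {
        indexingThread.submit(() -> write(docId));
    }

    /** Synchronous: the caller waits for the shared lock and for the write itself. */
    public void index(String docId) {
        write(docId);
    }

    // Both paths take the same lock, so they never write concurrently.
    private void write(String docId) {
        indexLock.lock();
        try {
            indexed.add(docId);
        } finally {
            indexLock.unlock();
        }
    }

    /** Returns a copy of what has been indexed so far. */
    public List<String> snapshot() {
        indexLock.lock();
        try {
            return new ArrayList<>(indexed);
        } finally {
            indexLock.unlock();
        }
    }

    /** Stops accepting work and waits up to five seconds for queued writes. */
    public void shutdownAndWait() {
        indexingThread.shutdown();
        try {
            indexingThread.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AsyncIndexSketch engine = new AsyncIndexSketch();
        engine.add("j_1");    // queued; may not be indexed yet when add returns
        engine.index("j_2");  // indexed by the time index returns
        engine.shutdownAndWait();
        System.out.println(engine.snapshot().contains("j_1")
                && engine.snapshot().contains("j_2"));
    }
}
```

In this sketch a caller that needs to read its own write immediately would have to use the blocking path, at the cost of stalling any queued indexing work, which mirrors the advice above to invoke index sparingly.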
protected void indexData(org.apache.lucene.index.IndexWriter writer, Data data, java.lang.String lang) throws java.io.IOException
indexData in class LuceneDataSearchEngine
Throws: java.io.IOException
protected org.apache.lucene.index.Term getPrimaryTerm(Data data)
Overrides: getPrimaryTerm in class LuceneDataSearchEngine
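
The method summary explains that getPrimaryTerm is overridden so that the legacy file index keeps using the field "id" rather than the default "_id_". The sketch below shows that override pattern in isolation. Only the two field values come from this page; the classes PrimaryTermSketch, DataEngine, FileEngine, the simplified Term pair, and the primaryTermFor helper are hypothetical stand-ins, not JCMS or Lucene types.

```java
/** Hypothetical illustration of overriding the primary-key field of an index. */
public class PrimaryTermSketch {

    /** Simplified stand-in for a Lucene term: a (field, text) pair. */
    public static final class Term {
        final String field;
        final String text;

        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    /** Stand-in for the generic engine, which keys documents on "_id_". */
    public static class DataEngine {
        public static final String ID_FIELD = "_id_";

        protected Term getPrimaryTerm(String dataId) {
            return new Term(ID_FIELD, dataId);
        }
    }

    /** Stand-in for the file engine, which keeps the legacy "id" field. */
    public static class FileEngine extends DataEngine {
        public static final String JCMS_ID_FIELD = "id";

        @Override
        protected Term getPrimaryTerm(String dataId) {
            return new Term(JCMS_ID_FIELD, dataId);
        }
    }

    /** Public helper so callers outside the class hierarchy can see the term. */
    public static String primaryTermFor(DataEngine engine, String dataId) {
        return engine.getPrimaryTerm(dataId).toString();
    }

    public static void main(String[] args) {
        System.out.println(primaryTermFor(new DataEngine(), "j_2")); // prints _id_:j_2
        System.out.println(primaryTermFor(new FileEngine(), "j_2")); // prints id:j_2
    }
}
```

Because update and delete operations locate an existing document through this term, keeping the legacy field name means an index built before the change can still be matched without a full reindex, under the assumption that the base class uses the primary term for those lookups.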
Copyright © 2001-2010 Jalios SA. All Rights Reserved.