com.jalios.jcms.search
Class LuceneFileSearchEngine

java.lang.Object
  extended by com.jalios.jcms.search.LuceneFileSearchEngine
All Implemented Interfaces:
JcmsConstants, FileSearchEngine, JaliosConstants

public class LuceneFileSearchEngine
extends Object
implements FileSearchEngine, JcmsConstants

This class is an implementation of FileSearchEngine based on the Lucene search engine.

Since:
jcms-4.1
Version:
$Revision: 24071 $
Author:
Olivier Dedieu, Olivier Jaquemet

Field Summary
static String CONTENTS_FIELD
          Field name for the content of the file
static String FILE_INDEX_DIRECTORY
           
static String JALIOS_DATE_FIELD
          Field name for the Indexing Date (time in ms)
static String JCMS_PATH_FIELD
          Field name for the relative path in jcms, e.g. "upload/docs/file.txt"
static String MODIFIED_FIELD
          Field name for the last modified date of the file (time in ms) when it was indexed
static String PATH_FIELD
          Field name for the file path (file.getPath()) when it was indexed
static String REVISION
           
 
Fields inherited from interface com.jalios.jcms.JcmsConstants
ADATE_SEARCH, ADMIN_NOTES_PROP, ADVANCED_TAB, ARCHIVES_DIR, ASCII_WIDTH, CATEGORY_TAB, CDATE_SEARCH, COMMON_ALARM, CONTENT_TAB, COOKIE_MAX_AGE, CRYPT_MD5, CRYPT_UNDEFINED, CRYPT_UNIX, CTRL_TOPIC_INTERNAL, CTRL_TOPIC_REF, CTRL_TOPIC_VALUE, CTRL_TOPIC_WRITE, CUSTOM_PROP, DOCCHOOSER_HEIGHT, DOCCHOOSER_WIDTH, DOCS_DIR, EDATE_SEARCH, EMAIL_REGEXP, ERROR_MSG, FORBIDDEN_FILE_ACCESS, FORBIDDEN_REDIRECT, FORCE_REDIRECT, ICON_ARCHIVE, ICON_LOCK, ICON_LOCK_STRONG, ICON_WARN, ICON_WH_BOOK_CLOSED, ICON_WH_BOOK_OPEN, INFORMATION_MSG, JALIOS_JUNIT_PROP, JCMS_CADDY, JSYNC_DOWNLOAD_DIR, JSYNC_SYNC_ALARM, LOG_FILE, LOG_TOPIC_SECURITY, LOGGER_PROP, LOGGER_XMLPROP, MBR_PHOTO_DIR, MDATE_SEARCH, MONITOR_XML, OP_CREATE, OP_DEEP_COPY, OP_DEEP_DELETE, OP_DELETE, OP_MERGE, OP_UPDATE, PDATE_SEARCH, PHOTO_LARGE, PHOTO_LARGE_HEIGHT, PHOTO_LARGE_WIDTH, PHOTO_NORMAL, PHOTO_NORMAL_HEIGHT, PHOTO_NORMAL_WIDTH, PHOTO_SMALL, PHOTO_SMALL_HEIGHT, PHOTO_SMALL_WIDTH, PHOTO_TINY, PHOTO_TINY_HEIGHT, PHOTO_TINY_WIDTH, PREVIOUS_TAB, PRINT_VIEW, PRIVATE_FILE_ACCESS, PUBLIC_FILE_ACCESS, READ_RIGHT_TAB, SDATE_SEARCH, SEARCHENGINE_ALARM, SESSION_AUTHORIZED_FILENAMES_SET, STATS_REPORT_DIR, STATUS_PROP, STORE_XML, TEMPLATE_TAB, THUMBNAIL_LARGE_HEIGHT, THUMBNAIL_LARGE_WIDTH, THUMBNAIL_SMALL_HEIGHT, THUMBNAIL_SMALL_WIDTH, UDATE_SEARCH, UPDATE_RIGHT_TAB, UPLOAD_DIR, URL_REGEXP, WARNING_MSG, WEBAPP_PROP, WFEXPRESS_ALARM, WFREMINDER_ALARM, WORKFLOW_TAB, WORKFLOW_XML
 
Fields inherited from interface com.jalios.util.JaliosConstants
CRLF, MILLIS_IN_ONE_DAY, MILLIS_IN_ONE_HOUR, MILLIS_IN_ONE_MINUTE, MILLIS_IN_ONE_MONTH, MILLIS_IN_ONE_SECOND, MILLIS_IN_ONE_WEEK, MILLIS_IN_ONE_YEAR
 
Constructor Summary
LuceneFileSearchEngine()
           
 
Method Summary
 org.apache.lucene.store.FSDirectory getDirectory()
          Returns the lucene directory used by this LuceneFileSearchEngine.
 org.apache.lucene.document.Document getDocument(String filename)
          Retrieve the Lucene Document bound to the specified filename.
 int getFileCount()
           
 org.apache.lucene.document.Document getLuceneDocument(File file, String content)
          Retrieve a new lucene Document for the specified file in preparation for indexing.
 void index(File file, org.apache.lucene.document.Document doc)
          Add the specified lucene Document to the index.
 boolean isAvailable()
           
 void optimize()
          Optimize the Lucene file index.
 void remove(File file)
          Remove the specified file from the lucene index.
 boolean search(QueryHandler qh, HashSet<? extends Publication> pubSet, QueryResultSet resultSet)
          Perform a full-text search on indexed files
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

REVISION

public static final String REVISION
See Also:
Constant Field Values

FILE_INDEX_DIRECTORY

public static final String FILE_INDEX_DIRECTORY
See Also:
Constant Field Values

PATH_FIELD

public static final String PATH_FIELD
Field name for the file path (file.getPath()) when it was indexed

See Also:
Constant Field Values

CONTENTS_FIELD

public static final String CONTENTS_FIELD
Field name for the content of the file

See Also:
Constant Field Values

MODIFIED_FIELD

public static final String MODIFIED_FIELD
Field name for the last modified date of the file (time in ms) when it was indexed

See Also:
Constant Field Values
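
The "time in ms" value stored under MODIFIED_FIELD can be compared with File.lastModified() to decide whether a file has changed since it was indexed. The following is a minimal sketch only: how the value is encoded in the index is not specified here, so the sketch assumes a plain decimal string, and the class and method names (ModifiedCheck, needsReindex) are hypothetical.

```java
public final class ModifiedCheck {
    private ModifiedCheck() {}

    /**
     * Returns true when the file should be indexed again.
     * Assumption: storedMillis is the decimal string of the last modified
     * time (in ms) recorded at indexing time; this encoding is not confirmed
     * by the documentation above.
     */
    public static boolean needsReindex(String storedMillis, long currentLastModified) {
        if (storedMillis == null) {
            return true; // no recorded date: treat the file as not indexed
        }
        try {
            return Long.parseLong(storedMillis.trim()) < currentLastModified;
        } catch (NumberFormatException e) {
            return true; // unreadable value: reindex to be safe
        }
    }
}
```

A caller would pass the stored field value together with file.lastModified().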

JALIOS_DATE_FIELD

public static final String JALIOS_DATE_FIELD
Field name for the Indexing Date (time in ms)

See Also:
Constant Field Values

JCMS_PATH_FIELD

public static final String JCMS_PATH_FIELD
Field name for the relative path in jcms, e.g. "upload/docs/file.txt", or fileDoc.getFilename()

See Also:
Constant Field Values
Constructor Detail

LuceneFileSearchEngine

public LuceneFileSearchEngine()
                       throws Exception
Throws:
Exception
Method Detail

getDirectory

public org.apache.lucene.store.FSDirectory getDirectory()
Returns the lucene directory used by this LuceneFileSearchEngine.
Warning: do not modify the index; use this method only for read-only access to the directory.
Note: the directory may not exist; check with IndexReader.indexExists(Directory).

Returns:
the instance of the FSDirectory used internally.

isAvailable

public boolean isAvailable()
Specified by:
isAvailable in interface FileSearchEngine
Returns:
true if the FileSearchEngine is available
Since:
jcms-4.0

getDocument

public org.apache.lucene.document.Document getDocument(String filename)
Retrieve the Lucene Document bound to the specified filename.

Parameters:
filename - relative file path e.g. "upload/docs/file.txt"
Returns:
the Lucene Document bound to the given filename, or null if it could not be found
Since:
jcms-4.0.1

search

public boolean search(QueryHandler qh,
                      HashSet<? extends Publication> pubSet,
                      QueryResultSet resultSet)
Perform a full-text search on indexed files

Specified by:
search in interface FileSearchEngine
Parameters:
qh - the QueryHandler in which to find the search text and search options.
pubSet - a HashSet containing all the Publications to search.
If empty, the search is not performed at all.
If null, all Publications found will be returned.
This set MUST NOT be modified by the implementation.
resultSet - the QueryResultSet that must be filled with matching Publications
Returns:
true if a search was performed in the FileSearchEngine. Useful to differentiate a query returning zero results from a query not performed due to missing parameters (text, for example)
Since:
jcms-5.5.0
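
The pubSet and return-value rules above can be modelled without JCMS or Lucene. The sketch below is an illustration of the documented contract, not the actual implementation: SearchContractSketch and its parameters are hypothetical, a generic type T stands in for Publication, indexMatches stands in for whatever the full-text index returns, and returning false for an empty pubSet is an assumption derived from "search is not performed at all".

```java
import java.util.Collection;
import java.util.Set;

public final class SearchContractSketch {
    private SearchContractSketch() {}

    /**
     * Fills resultSet with the matches allowed by pubSet.
     * Returns true only if a search was actually performed.
     */
    public static <T> boolean search(String text, Collection<T> indexMatches,
                                     Set<T> pubSet, Set<T> resultSet) {
        if (text == null || text.trim().isEmpty()) {
            return false; // missing parameter: no search performed
        }
        if (pubSet != null && pubSet.isEmpty()) {
            return false; // empty set: search is not performed at all (assumed to return false)
        }
        for (T match : indexMatches) {
            // null pubSet: every match is returned; otherwise restrict to pubSet
            if (pubSet == null || pubSet.contains(match)) {
                resultSet.add(match);
            }
        }
        return true; // pubSet itself is never modified
    }
}
```

With this model a caller can tell "searched and found nothing" (true, empty resultSet) apart from "not searched" (false).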

getFileCount

public int getFileCount()
Specified by:
getFileCount in interface FileSearchEngine
Returns:
the number of indexed files
Since:
jcms-4.1

getLuceneDocument

public org.apache.lucene.document.Document getLuceneDocument(File file,
                                                             String content)
Retrieve a new lucene Document for the specified file in preparation for indexing.

Parameters:
file - the File to index (must not be null, and the file must exist). This file MUST be located under the webapp root directory (usually inside the upload directory).
content - the content of the file, optional.
Returns:
A new instance of Document suitable for indexing through index(File, Document)
See Also:
index(File, Document)
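
The requirement that the file be located under the webapp root directory can be checked before calling getLuceneDocument(File, String), index(File, Document) or remove(File). This is a minimal sketch using only the JDK; WebappPaths and relativeJcmsPath are hypothetical names, and how JCMS itself derives the relative path is not documented here.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Path;

public final class WebappPaths {
    private WebappPaths() {}

    /**
     * Returns the '/'-separated path of file relative to webappRoot,
     * e.g. "upload/docs/file.txt".
     * Throws IllegalArgumentException if file is not located under webappRoot.
     */
    public static String relativeJcmsPath(File webappRoot, File file) throws IOException {
        // Canonical paths resolve "..", symbolic links and relative segments
        Path root = webappRoot.getCanonicalFile().toPath();
        Path target = file.getCanonicalFile().toPath();
        if (target.equals(root) || !target.startsWith(root)) {
            throw new IllegalArgumentException("File is not under the webapp root: " + file);
        }
        return root.relativize(target).toString().replace(File.separatorChar, '/');
    }
}
```

Path.startsWith compares whole path elements, so a sibling directory whose name merely begins with the root's name is rejected.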

index

public void index(File file,
                  org.apache.lucene.document.Document doc)
Add the specified lucene Document to the index.

Parameters:
file - the File to be indexed in lucene. This file MUST be located under the webapp root directory (usually inside the upload directory).
doc - the lucene Document instance built for this file (see getLuceneDocument(File, String))

remove

public void remove(File file)
Remove the specified file from the lucene index.

Parameters:
file - the File to be removed from lucene. This file MUST be located under the webapp root directory (usually inside the upload directory).

optimize

public void optimize()
Optimize the Lucene file index.

Since:
JCMS-6.0.2


Copyright © 2001-2007 Jalios SA. All Rights Reserved.