Class Repository


  • public class Repository
    extends java.lang.Object
    Represents the configuration of a repository.

    A repository is a directory in which a scan will be done.

    A repository configuration is composed of :

    • an id for this repository, used essentially to link all configuration elements in the properties (and in the logs and UI);
    • the directory to scan;
    • a class implementing "FileRepository" RepositoryIndexer;
    • the Lucene index to use, if any;
    • Scheduling parameters :
      • period : number of *minutes* between scans (if a positive integer)
      • cron : parameter to schedule scans in cron syntax (if not null and syntactically correct). The cron parameter takes priority over the period parameter.
    • stabilisation duration : a file must not be processed as soon as it has been modified (because of CTRL-S over WebDAV, for instance), so this parameter defines how long to wait. Unit : *seconds*, default value : 20 seconds;
    • processingLimitDuration : if a processing (or indexing) generates an unknown exception AND the duration of the processing exceeds this parameter (in seconds), then the file is blacklisted for this processor (default value 120s; if -1, ignored);
    • excludedDirRegExp : Perl regular expression used to exclude some directories while scanning the repository (e.g. SVN, CVS, stat...).
    The validate() method checks whether all these requirements are met.
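    The stabilisation duration described in the list above can be pictured with the following minimal sketch. It is not the actual implementation: the class name, the method name and the use of millisecond timestamps are assumptions made for illustration; only the unit (seconds) and the default value (20) come from the description.

```java
public class StabilisationCheckSketch {

    /** Default stabilisation duration quoted in the class description, in seconds. */
    public static final int DEFAULT_STABILISATION_SECONDS = 20;

    /**
     * Returns true when the file has been left untouched for at least
     * stabilisationSeconds, measured against the supplied clock value.
     * Both timestamps are in milliseconds (an assumption of this sketch).
     */
    public static boolean isStable(long lastModifiedMillis, long nowMillis, int stabilisationSeconds) {
        long quietMillis = nowMillis - lastModifiedMillis;
        return quietMillis >= stabilisationSeconds * 1000L;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // Modified 5 seconds ago: still within the stabilisation window.
        System.out.println(isStable(now - 5_000L, now, DEFAULT_STABILISATION_SECONDS));
        // Modified 60 seconds ago: considered stable.
        System.out.println(isStable(now - 60_000L, now, DEFAULT_STABILISATION_SECONDS));
    }
}
```

    Passing the current time as a parameter keeps the check deterministic and easy to test.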

    Processors are always registered using their Class object. Instantiation of processors is done lazily, on the first call to getFileProcessorSet(String) after a modification of the registration.
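    The lazy instantiation described above could look roughly like the sketch below. Everything in it is hypothetical (the Processor interface, the two sample classes, the register and getProcessors names); it only illustrates the idea of registering Class objects and building instances on first access after a registration change.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LazyRegistrySketch {

    /** Hypothetical stand-in for a processor type; not the real FileProcessor. */
    public interface Processor { }

    public static class FirstProcessor implements Processor { }

    public static class SecondProcessor implements Processor { }

    private final Set<Class<? extends Processor>> registered = new LinkedHashSet<>();

    /** Cached instances; null means "rebuild on next access". */
    private List<Processor> instances;

    /** Registers a class and invalidates the cache if the registration changed. */
    public void register(Class<? extends Processor> clazz) {
        if (registered.add(clazz)) {
            instances = null;
        }
    }

    /** Builds the instances on first access after a registration change. */
    public List<Processor> getProcessors() {
        if (instances == null) {
            List<Processor> built = new ArrayList<>();
            for (Class<? extends Processor> clazz : registered) {
                try {
                    built.add(clazz.getDeclaredConstructor().newInstance());
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
                }
            }
            instances = Collections.unmodifiableList(built);
        }
        return instances;
    }
}
```

    The sketch is not thread-safe; a real implementation would have to guard the cache if registration and lookup can happen concurrently.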

    Version:
    $Revision: 118093 $
    • Field Detail

      • CURRENT_REPOSITORY

        public static final java.lang.String CURRENT_REPOSITORY
        When a FileActionComponent (FileProcessor or FileParser) is invoked, the repository for which it is invoked is available through the processing context using this attribute :
         Repository repository = (Repository) ctxt.get(Repository.CURRENT_REPOSITORY);
         
        See Also:
        Constant Field Values
    • Constructor Detail

      • Repository

        public Repository​(java.lang.String id,
                          JProperties properties)
        The constructor sets some default values and then reads the specified properties.
        Parameters:
        id - the id of the repository to create
        properties - the properties to be used by this Repository
    • Method Detail

      • directoryIgnored

        public boolean directoryIgnored​(java.io.File file)
        Tells whether a directory under upload should be ignored, based on the exclusion pattern defined by the property :
            fileprocessor.repository.{repository-name}.excludedDirRegExp
         
        Parameters:
        file - the directory to check
        Returns:
        true if the directory should be ignored
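        As an illustration of how such an exclusion pattern might be applied, here is a minimal sketch using java.util.regex. It makes several assumptions that the documentation does not confirm: that the pattern is matched against the directory name only, that the whole name must match, and that an empty pattern excludes nothing. The property value shown is hypothetical, and the real class may rely on a Perl-compatible regular expression library rather than java.util.regex.

```java
import java.io.File;
import java.util.regex.Pattern;

public class ExcludedDirSketch {

    /**
     * Returns true when the directory name fully matches the exclusion pattern.
     * A null or empty pattern excludes nothing (assumed behaviour).
     */
    public static boolean directoryIgnored(File dir, String excludedDirRegExp) {
        if (dir == null || excludedDirRegExp == null || excludedDirRegExp.isEmpty()) {
            return false;
        }
        return Pattern.matches(excludedDirRegExp, dir.getName());
    }

    public static void main(String[] args) {
        // Hypothetical value for fileprocessor.repository.{repository-name}.excludedDirRegExp
        String regExp = "\\.svn|CVS|stat";
        System.out.println(directoryIgnored(new File("docs/.svn"), regExp)); // true
        System.out.println(directoryIgnored(new File("docs/text"), regExp)); // false
    }
}
```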
      • validate

        public boolean validate()
        Validates that the configuration of this repository is correct.
        Returns:
        true if the configuration meets all requirements
      • getIndexer

        public RepositoryIndexer getIndexer()
        For a given repository, the way LuceneDocument instances are constructed from a file and its content is specific to the structure of the index. The implementation of the RepositoryIndexer knows the structure of the index for that repository.
        Returns:
        an instance of the indexer for that repository
      • isIndexable

        public boolean isIndexable()
      • setIndexerClass

        public void setIndexerClass​(java.lang.Class<? extends RepositoryIndexer> indexerClass)
      • getId

        public java.lang.String getId()
      • setId

        public void setId​(java.lang.String id)
      • getName

        public java.lang.String getName()
      • setName

        public void setName​(java.lang.String name)
      • getActionComponent

        public FileActionComponent getActionComponent​(java.lang.Class<? extends FileActionComponent> componentClass)
        Retrieve any instance already created for the specified FileActionComponent class.
        Parameters:
        componentClass - the FileActionComponent class
        Returns:
        null if the component has not been added previously to this Repository
      • addActionComponent

        public void addActionComponent​(java.lang.Class<? extends FileActionComponent> componentClazz,
                                       java.lang.String[] extensions)
        Add the specified FileActionComponent (parser or processor) to this repository for the specified extensions.
        Parameters:
        componentClazz - the class of the component to add, must not be null
        extensions - the extensions for which the FileActionComponent will be configured, must not be null or empty (otherwise the method has no effect)
      • removeActionComponent

        public boolean removeActionComponent​(java.lang.Class<? extends FileActionComponent> componentClass)
        Remove the specified component class from this repository configuration.
        Parameters:
        componentClass - the FileActionComponent class to remove
        Returns:
        true if the component was removed, false if there was no component of this class in this repository
      • removeActionComponent

        public boolean removeActionComponent​(java.lang.Class<? extends FileActionComponent> componentClass,
                                             java.lang.String[] extensions)
        Remove the specified component class from this repository configuration for the specified extensions (if any).
        Parameters:
        componentClass - the FileActionComponent class to remove
        extensions - the extensions for which the FileActionComponent will be removed; if null or empty, the component is completely removed from the repository
        Returns:
        true if the component was removed, false if there was no component of this class in this repository
      • isProcessedByActionComponentClass

        public boolean isProcessedByActionComponentClass​(java.io.File file,
                                                         java.lang.Class<? extends FileActionComponent> clazz)
        Test whether a file may be processed by a FileActionComponent.

        In the current implementation, the test is based on the file's extension.

        Parameters:
        file - the File to test
        clazz - the class of the FileActionComponent
        Returns:
        true if the FileActionComponent may process this file
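        An extension-based test of this kind can be sketched as follows. The helper names, the lower-casing of the extension and the representation of registered extensions as a plain Set of String are assumptions made for illustration, not the documented behaviour.

```java
import java.io.File;
import java.util.Locale;
import java.util.Set;

public class ExtensionMatchSketch {

    /** Returns the lower-cased extension without the dot, or "" if there is none. */
    public static String extensionOf(File file) {
        String name = file.getName();
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Returns true when the file's extension is one of the registered extensions. */
    public static boolean mayProcess(File file, Set<String> registeredExtensions) {
        return file != null
            && registeredExtensions != null
            && registeredExtensions.contains(extensionOf(file));
    }
}
```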
      • getActionComponentClassSet

        public java.util.Set<java.lang.Class<? extends FileActionComponent>> getActionComponentClassSet()
        Retrieve a Set of all FileActionComponent classes registered and enabled for this repository.
        Returns:
        a Set of FileActionComponent classes
      • getFileProcessorSet

        public java.util.SortedSet<? extends FileProcessor> getFileProcessorSet​(java.lang.String extension)
        Retrieve a Set of all FileProcessor configured for the specified extension.
        Parameters:
        extension - a File extension, eg. "txt"
        Returns:
        a SortedSet of FileProcessor instances : may be null
      • getFileProcessorExtensionSet

        public java.util.SortedSet<java.lang.String> getFileProcessorExtensionSet()
        Retrieve a Set of all extensions with at least one FileProcessor configured.
        Returns:
        a SortedSet of file extensions : may be null
      • getFileProcessorClassSet

        public java.util.Set<java.lang.Class<? extends FileProcessor>> getFileProcessorClassSet()
        Retrieve a Set of all FileProcessor classes.
        Returns:
        a Set of FileProcessor classes : never null
      • getFileParserSet

        public java.util.SortedSet<? extends FileParser> getFileParserSet​(java.lang.String extension)
        Retrieve a Set of all FileParser configured for the specified extension.
        Parameters:
        extension - a File extension, eg. "txt"
        Returns:
        a SortedSet of FileParser instances : may be null
      • getFileParserExtensionSet

        public java.util.SortedSet<java.lang.String> getFileParserExtensionSet()
        Retrieve a Set of all extensions with at least one FileParser configured.
        Returns:
        a SortedSet of file extensions : may be null
      • getFileParserClassSet

        public java.util.Set<java.lang.Class<? extends FileParser>> getFileParserClassSet()
        Retrieve a Set of all FileParser classes.
        Returns:
        a Set of FileParser classes : never null
      • getComponentClassToExtensionSetMap

        public java.util.Map<java.lang.Class<? extends FileActionComponent>,​java.util.SortedSet<java.lang.String>> getComponentClassToExtensionSetMap()
        Retrieve a map of each component class to the extensions registered for it.
        Returns:
        a Map of class to Set of String, never null
      • getBaseDirectory

        public java.io.File getBaseDirectory()
      • setBaseDirectory

        public void setBaseDirectory​(java.io.File baseDirectory)
      • getLuceneIndex

        public java.io.File getLuceneIndex()
      • setLuceneIndex

        public void setLuceneIndex​(java.io.File luceneIndex)
      • getSchedulePeriod

        public int getSchedulePeriod()
        Retrieve the period in minutes between scans of the directory.
        Returns:
        a period in minutes (0 or negative if undefined)
      • setSchedulePeriod

        public void setSchedulePeriod​(int schedulePeriod)
        Set the period in minutes between scans of the directory.
        Parameters:
        schedulePeriod - a period in minutes
      • getScheduleCron

        public java.lang.String getScheduleCron()
        Retrieve the cron expression defining the rhythm of scans of the directory.
        Returns:
        a cron-like schedule (empty or null if undefined)
      • setScheduleCron

        public void setScheduleCron​(java.lang.String scheduleCron)
      • hasSchedule

        public boolean hasSchedule()
        Check whether this repository has a scanning schedule defined.
        Returns:
        true if a schedule cron or schedule period was set, false otherwise
        Since:
        jcms-7.0.0
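        The relationship between the two scheduling parameters can be summarised with the sketch below. It follows the rules stated in this page (a schedule exists if a cron or a positive period is set, and cron takes priority over period), but the class and method names are hypothetical and the sketch does not validate the cron syntax, which the real class is described as doing.

```java
public class ScheduleSketch {

    /** True if a non-empty cron expression or a positive period (in minutes) is set. */
    public static boolean hasSchedule(String cron, int periodMinutes) {
        return isCronDefined(cron) || periodMinutes > 0;
    }

    /** Describes which parameter would drive the scans; cron wins over period. */
    public static String effectiveSchedule(String cron, int periodMinutes) {
        if (isCronDefined(cron)) {
            return "cron: " + cron.trim();
        }
        if (periodMinutes > 0) {
            return "period: " + periodMinutes + " min";
        }
        return "none";
    }

    private static boolean isCronDefined(String cron) {
        return cron != null && !cron.trim().isEmpty();
    }
}
```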
      • getExcludedDirRegExp

        public java.lang.String getExcludedDirRegExp()
      • setExcludedDirRegExp

        public void setExcludedDirRegExp​(java.lang.String excludedDirRegExp)
      • getMaxFilesPerScan

        public long getMaxFilesPerScan()
      • setMaxFilesPerScan

        public void setMaxFilesPerScan​(long maxFilesPerScan)
      • getPostponedAlarmManagerName

        public java.lang.String getPostponedAlarmManagerName()
      • setPostponedAlarmManagerName

        public void setPostponedAlarmManagerName​(java.lang.String alarmManagerName)
      • getPostponedSchedulePeriod

        public int getPostponedSchedulePeriod()
        Retrieve the period in minutes between postponed item checks.
        Returns:
        a period in minutes (0 or negative if undefined)
      • setPostponedSchedulePeriod

        public void setPostponedSchedulePeriod​(int postponedSchedulePeriod)
        Set the period in minutes between postponed item checks.
        Parameters:
        postponedSchedulePeriod - a period in minutes
      • getPostponedScheduleCron

        public java.lang.String getPostponedScheduleCron()
        Retrieve the cron expression defining the rhythm of the postponed item checks.
        Returns:
        a cron-like schedule (empty or null if undefined)
      • setPostponedScheduleCron

        public void setPostponedScheduleCron​(java.lang.String postponedScheduleCron)
        Set the cron expression defining the rhythm of the postponed item checks.
        Parameters:
        postponedScheduleCron - JDring cron syntax
      • toStringConfiguration

        public java.lang.String toStringConfiguration()
        Convenience method to display configuration information in logs.
        Returns:
        a string suitable for displaying configuration information in logs
      • getRepositoryProperties

        public JProperties getRepositoryProperties()
      • setRepositoryProperties

        public void setRepositoryProperties​(JProperties properties)
      • getAlarmManagerName

        public java.lang.String getAlarmManagerName()
      • setAlarmManagerName

        public void setAlarmManagerName​(java.lang.String alarmManagerName)
      • getFilenameRelativeToBaseDirectory

        public java.lang.String getFilenameRelativeToBaseDirectory​(java.io.File file)
        Retrieve the filename of the specified file relative to the base directory of this Repository.
        Parameters:
        file - any file under the repository base directory
        Returns:
        a platform-independent relative path, e.g. docs/text/plain/foobar.txt, or null if the specified file was null or not inside the repository base directory
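        A platform-independent relative path of this kind can be computed with java.nio.file, as in the minimal sketch below. The class and method names are hypothetical, and returning null for the base directory itself is an assumption; the documented contract only covers a null file and a file outside the base directory.

```java
import java.io.File;
import java.nio.file.Path;

public class RelativePathSketch {

    /**
     * Returns the path of file relative to baseDirectory, using '/' as the
     * separator on every platform, or null if file is null or not inside
     * baseDirectory.
     */
    public static String relativeTo(File baseDirectory, File file) {
        if (baseDirectory == null || file == null) {
            return null;
        }
        Path base = baseDirectory.toPath().toAbsolutePath().normalize();
        Path target = file.toPath().toAbsolutePath().normalize();
        // Assumption: the base directory itself is not considered "inside".
        if (!target.startsWith(base) || target.equals(base)) {
            return null;
        }
        StringBuilder relative = new StringBuilder();
        for (Path element : base.relativize(target)) {
            if (relative.length() > 0) {
                relative.append('/');
            }
            relative.append(element.toString());
        }
        return relative.toString();
    }
}
```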
      • getFile

        public java.io.File getFile​(java.lang.String path)
        Retrieve a File instance from the specified relative path.
        Parameters:
        path - a platform-independent path relative to the repository base directory, e.g. docs/text/plain/foobar.txt
        Returns:
        a new File object or null if specified path was null.
      • getBaseDirectoryPath

        public java.lang.String getBaseDirectoryPath()
        Retrieve the base directory absolute path of this Repository
        Returns:
        an absolute directory path
      • getProcessingLimitDuration

        public int getProcessingLimitDuration()
      • setProcessingLimitDuration

        public void setProcessingLimitDuration​(int processingLimitDuration)
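        The class description states the rule governed by this parameter: a file is blacklisted for a processor when processing raises an unknown exception and its duration exceeds the limit (default 120 seconds, ignored if -1). A minimal sketch of that rule follows; the names are hypothetical and expressing the measured duration in milliseconds is an assumption.

```java
public class ProcessingLimitSketch {

    /** Default limit quoted in the class description, in seconds. */
    public static final int DEFAULT_LIMIT_SECONDS = 120;

    /**
     * Applies the blacklisting rule: unknown exception AND duration above the limit.
     * A limit of -1 disables the rule. durationMillis is in milliseconds (assumed unit).
     */
    public static boolean shouldBlacklist(boolean unknownException, long durationMillis, int limitSeconds) {
        if (limitSeconds == -1) {
            return false;
        }
        return unknownException && durationMillis > limitSeconds * 1000L;
    }
}
```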
      • getProcessingMaximumAttempts

        public int getProcessingMaximumAttempts()
        Retrieve the maximum number of attempts allowed per processing before the file gets blacklisted for the corresponding component.
        Returns:
        a number of attempts
        Since:
        JCMS-5445
      • setProcessingMaximumAttempts

        public void setProcessingMaximumAttempts​(int maxAttempts)
        Set the maximum number of attempts allowed per processing before the file gets blacklisted for the corresponding component.
        Parameters:
        maxAttempts - a number of attempts
        Since:
        JCMS-5445
      • getFileProcessingLogs

        public java.util.List<FileProcessingLog> getFileProcessingLogs​(java.io.File file)
        Retrieve all the FileProcessingLog for the specified file in this repository and webapp instance (urid).
        Parameters:
        file - the file for which logs are retrieved
        Returns:
        a list of FileProcessingLog instances corresponding to the specified file (never null)
      • getFileProcessingLogs

        public java.util.List<FileProcessingLog> getFileProcessingLogs​(java.lang.String filename)
        Retrieve all the FileProcessingLog for the specified file in this repository and webapp instance (urid).
        Parameters:
        filename - the file path relative to the base directory of the repository
        Returns:
        a list of FileProcessingLog instances corresponding to the specified file (never null)
      • getPostponedProcessingLogs

        public java.util.List<FileProcessingLog> getPostponedProcessingLogs()
        Retrieve all the postponed FileProcessingLog in this repository and webapp instance (urid).
        Returns:
        a list of FileProcessingLog instances (never null)
      • updateAllFileProcessingInfo

        public void updateAllFileProcessingInfo()
        Create or update all the FileProcessingInfo instances for this repository and webapp instance (urid).

        WARNING: VERY EXPENSIVE METHOD WHICH ITERATES OVER ALL DATABASE ENTRIES.

      • updateFileProcessingInfo

        public FileProcessingInfo updateFileProcessingInfo​(java.io.File file,
                                                           java.util.List<FileProcessingLog> logList)
        Create, update or delete the FileProcessingInfo instance for the specified file (in this repository and webapp instance, i.e. urid), from the specified list of FileProcessingLog.
        Parameters:
        file - the file for which FileProcessingInfo should be updated
        logList - the list of logs from which the info will be updated
        Returns:
        the created/updated info, or null if the info was deleted or none was created
      • getFileProcessingInfo

        public FileProcessingInfo getFileProcessingInfo​(java.io.File file)
        Retrieve the FileProcessingInfo for the specified file in this repository and webapp instance (urid).
        Parameters:
        file - the File for which FileProcessingInfo is being retrieved
        Returns:
        a FileProcessingInfo instance or null if none could be found
      • getFileProcessingInfo

        public FileProcessingInfo getFileProcessingInfo​(java.lang.String filename)
        Retrieve the FileProcessingInfo for the specified file in this repository and webapp instance (urid).
        Parameters:
        filename - the file path relative to the base directory of the repository
        Returns:
        a FileProcessingInfo instance or null if none could be found
      • deleteFileProcessingInfo

        public void deleteFileProcessingInfo​(java.lang.String filename)
        Delete the FileProcessingInfo for the specified filename
        Parameters:
        filename - the filename relative to the repository base directory
      • addFileProcessingLog

        public FileProcessingLog addFileProcessingLog​(java.io.File file,
                                                      FileActionComponent component,
                                                      ProcessingType type,
                                                      ProcessingStatus status,
                                                      long duration,
                                                      java.lang.Exception exception,
                                                      int attempt)
        Create a new instance of FileProcessingLog with the specified parameters and store it.

        The new instance will use this repository and the current webapp URID.

        Parameters:
        file - the file for which log must be added
        component - the component for which log is added
        type - the type of processing performed
        status - the new processing status
        duration - the duration it took to process
        exception - any exception that might have occurred during processing
        attempt - the number of attempts performed (currently only applies to postponed processing).
        Returns:
        the newly created FileProcessingLog, never null
      • deleteFileProcessingLog

        public void deleteFileProcessingLog​(java.lang.String filename,
                                            java.lang.Class<? extends FileActionComponent> actionComponentClass)
        Remove all logs and info for the specified filename and component.

        The info is updated if logs were only partially deleted.

        Parameters:
        filename - the filename relative to the repository base directory
        actionComponentClass - the FileActionComponent class concerned (optional)
      • deleteFileProcessingLog

        public void deleteFileProcessingLog​(java.lang.String filename,
                                            java.lang.String component)
      • blacklistFile

        public void blacklistFile​(java.lang.String filename,
                                  java.lang.Class<? extends FileActionComponent> actionComponentClass)
        Blacklist the specified file.

        If a component class is specified, the file is blacklisted for this class only.
        The file gets globally blacklisted if no class is specified.

        Parameters:
        filename - a filename, relative to the base directory of this repository
        actionComponentClass - an optional component class to blacklist
      • unBlacklistFile

        public void unBlacklistFile​(java.lang.String filename,
                                    java.lang.Class<? extends FileActionComponent> actionComponentClass)
        Unblacklist the specified file.

        If a component class is specified, the file is unblacklisted for this class only.
        The file gets globally unblacklisted if no class is specified.

        Parameters:
        filename - a filename, relative to the base directory of this repository
        actionComponentClass - an optional component class to unblacklist
      • deleteAllNonBlacklistedLogs

        public void deleteAllNonBlacklistedLogs()
        Remove all successful logs from the database and update the remaining information. WARNING: VERY EXPENSIVE METHOD WHICH ITERATES OVER ALL DATABASE ENTRIES.
      • setAttribute

        public java.lang.Object setAttribute​(java.lang.String name,
                                             java.lang.Object obj)
        Stores an attribute in this repository.
        If the object passed in is null, the effect is the same as calling removeAttribute(java.lang.String).
        Parameters:
        name - a String specifying the name of the attribute
        obj - the Object to be stored
        Returns:
        previous value associated with specified name, or null if there was no mapping for name. A null return can also indicate that null was associated with the specified name.
      • getAttribute

        public java.lang.Object getAttribute​(java.lang.Object name)
        Returns the value of the named attribute as an Object, or null if no attribute of the given name exists.
        Parameters:
        name - a String specifying the name of the attribute
        Returns:
        an Object containing the value of the attribute, or null if the attribute does not exist
      • removeAttribute

        public java.lang.Object removeAttribute​(java.lang.String name)
        Removes an attribute from this repository.
        Parameters:
        name - a String specifying the name of the attribute to remove
        Returns:
        previous value associated with specified name, or null if there was no mapping for name. A null return can also indicate that null was associated with the specified name.
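        The contract of the three attribute methods resembles that of a plain map, with the extra rule that storing null removes the attribute. The sketch below illustrates this with a HashMap; the class name is hypothetical and nothing here reflects how the real Repository stores its attributes or whether it is thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

public class AttributeStoreSketch {

    private final Map<String, Object> attributes = new HashMap<>();

    /** Stores the attribute; a null value has the same effect as removeAttribute(name). */
    public Object setAttribute(String name, Object obj) {
        if (obj == null) {
            return removeAttribute(name);
        }
        return attributes.put(name, obj);
    }

    /** Returns the stored value, or null if no attribute of that name exists. */
    public Object getAttribute(Object name) {
        return attributes.get(name);
    }

    /** Removes the attribute and returns its previous value, or null if there was none. */
    public Object removeAttribute(String name) {
        return attributes.remove(name);
    }
}
```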
      • logProcessingEvent

        public void logProcessingEvent​(ProcessingEvent newAction)
      • getDirectoryScanner

        public DirectoryScanner getDirectoryScanner()
        Retrieve the DirectoryScanner instance used for the repository scan.
        Returns:
        the DirectoryScanner instance, never null
      • scanNow

        public boolean scanNow()
        Trigger the repository scan now, in its specific thread (not in the current thread).

        The scan will NOT be triggered if it is already running.

        Returns:
        false if the scan was not started (due to scheduling not enabled or scan already running), true if the scan was successfully launched.
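        The behaviour described above (run in a dedicated thread, refuse to start when scheduling is disabled or a scan is already running) can be sketched with an AtomicBoolean guard. This is only an illustration under assumed names; the real method presumably delegates to the DirectoryScanner and the scheduling infrastructure, which this page does not describe.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ScanTriggerSketch {

    private final AtomicBoolean running = new AtomicBoolean(false);
    private final boolean schedulingEnabled;
    private final Runnable scan;

    /** scan is a hypothetical stand-in for the actual directory scan. */
    public ScanTriggerSketch(boolean schedulingEnabled, Runnable scan) {
        this.schedulingEnabled = schedulingEnabled;
        this.scan = scan;
    }

    /** Starts the scan in its own thread; returns false if it could not be started. */
    public boolean scanNow() {
        if (!schedulingEnabled) {
            return false;
        }
        // Atomically claim the "running" flag; fails if a scan is in progress.
        if (!running.compareAndSet(false, true)) {
            return false;
        }
        Thread worker = new Thread(() -> {
            try {
                scan.run();
            } finally {
                running.set(false); // release the flag even if the scan fails
            }
        }, "repository-scan");
        worker.setDaemon(true);
        worker.start();
        return true;
    }
}
```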
      • applySubstitutions

        public static java.lang.String applySubstitutions​(java.lang.String abstractFileName,
                                                          Repository repository)
        Returns a String constructed from the abstractFileName in parameter where :
        • <realpath> is replaced by the absolute real path of the webapp
        • <uploadpath> is replaced by the absolute real path of the upload/ directory in the webapp
        • <webinfpath> is replaced by the absolute real path of the WEB-INF/ directory in the webapp
        • <datapath> is replaced by the absolute real path of the WEB-INF/data/ directory in the webapp
        • <repositoryid> is replaced by the id of the repository
        • <junit> is replaced by the -junit string
        Parameters:
        abstractFileName - the filename in which any matching pattern is replaced as explained above
        repository - the repository used for this substitution
        Returns:
        the replaced String
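        The placeholder replacement listed above can be illustrated with plain String.replace calls. In this sketch the values are supplied through a map whose keys are the placeholder names without angle brackets; that map, the class name and the sample paths are assumptions, since the real method derives the values from the webapp and the Repository.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SubstitutionSketch {

    /**
     * Replaces every <key> placeholder in abstractFileName with the matching
     * value from the map. Unknown placeholders are left untouched.
     */
    public static String applySubstitutions(String abstractFileName, Map<String, String> values) {
        if (abstractFileName == null) {
            return null;
        }
        String result = abstractFileName;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("<" + entry.getKey() + ">", entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("datapath", "/srv/webapp/WEB-INF/data"); // hypothetical path
        values.put("repositoryid", "docs");                 // hypothetical repository id
        // Prints /srv/webapp/WEB-INF/data/index/docs
        System.out.println(applySubstitutions("<datapath>/index/<repositoryid>", values));
    }
}
```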