Class DefaultQueryParser

  • All Implemented Interfaces:
    org.apache.lucene.queryparser.classic.QueryParserConstants, org.apache.lucene.queryparser.flexible.standard.CommonQueryParserConfiguration

    public class DefaultQueryParser
    extends org.apache.lucene.queryparser.classic.QueryParser
    Classic QueryParser with some added features.

    Applies automatic right truncation when enabled through property query.lucene.use-right-truncation.

    Additional behavior can be configured through properties:

    • Disable wildcard query through property query.lucene.wildcard.enabled.
    • Disable fuzzy query through property query.lucene.fuzzy.enabled.
    • Use scoring boolean rewrite
      • for all multi-term queries, including right truncation PrefixQuery, through property query.lucene.scoring-boolean-rewrite.all.enabled.
      • for right truncation PrefixQuery only, through property query.lucene.scoring-boolean-rewrite.right-truncation.enabled.
      Warning: this CPU-intensive option is disabled by default (available since jcms-10.0.2 / JCMS-7005).
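
As an illustration of the properties listed above, here is a minimal sketch that loads them from a properties text and reads each one as a boolean flag. The class QueryParserPropertiesSketch, its helpers load and flag, and the values in EXAMPLE are hypothetical; the values are not the documented defaults, and this is not how JCMS itself reads its configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class QueryParserPropertiesSketch {

  // Example values only (assumption): they are NOT the documented defaults.
  static final String EXAMPLE =
      "query.lucene.use-right-truncation: true\n"
    + "query.lucene.wildcard.enabled: false\n"
    + "query.lucene.fuzzy.enabled: false\n"
    + "query.lucene.scoring-boolean-rewrite.all.enabled: false\n"
    + "query.lucene.scoring-boolean-rewrite.right-truncation.enabled: true\n";

  // Parses a "key: value" properties text into a Properties object.
  static Properties load(String text) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(text));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  // Reads a boolean flag, returning the fallback when the key is absent.
  static boolean flag(Properties props, String key, boolean fallback) {
    String raw = props.getProperty(key);
    return raw == null ? fallback : Boolean.parseBoolean(raw.trim());
  }

  public static void main(String[] args) {
    Properties props = load(EXAMPLE);
    System.out.println("right truncation: "
        + flag(props, "query.lucene.use-right-truncation", false));
    System.out.println("wildcard enabled: "
        + flag(props, "query.lucene.wildcard.enabled", true));
    System.out.println("fuzzy enabled: "
        + flag(props, "query.lucene.fuzzy.enabled", true));
  }
}
```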
    Since:
    jcms-10.0
    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.apache.lucene.queryparser.classic.QueryParser

        org.apache.lucene.queryparser.classic.QueryParser.Operator
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static boolean LUCENE_FUZZY_ENABLED  
      static boolean LUCENE_WILDCARD_ENABLED  
      static java.lang.String RIGHT_TRUNCATION_EXCLUSION_REGEX
      A regular expression used to match words in the user's search string for which right truncation will NOT be applied.
      static boolean USE_RIGHT_TRUNCATION  
      • Fields inherited from class org.apache.lucene.queryparser.classic.QueryParser

        DEFAULT_SPLIT_ON_WHITESPACE, jj_nt, token, token_source
      • Fields inherited from class org.apache.lucene.queryparser.classic.QueryParserBase

        AND_OPERATOR, field, OR_OPERATOR
      • Fields inherited from class org.apache.lucene.util.QueryBuilder

        analyzer, autoGenerateMultiTermSynonymsPhraseQuery, enableGraphQueries, enablePositionIncrements
      • Fields inherited from interface org.apache.lucene.queryparser.classic.QueryParserConstants

        _ESCAPED_CHAR, _NUM_CHAR, _QUOTED_CHAR, _TERM_CHAR, _TERM_START_CHAR, _WHITESPACE, AND, BAREOPER, Boost, CARAT, COLON, DEFAULT, EOF, FUZZY_SLOP, LPAREN, MINUS, NOT, NUMBER, OR, PLUS, PREFIXTERM, QUOTED, Range, RANGE_GOOP, RANGE_QUOTED, RANGE_TO, RANGEEX_END, RANGEEX_START, RANGEIN_END, RANGEIN_START, REGEXPTERM, RPAREN, STAR, TERM, tokenImage, WILDTERM
    • Constructor Summary

      Constructors 
      Constructor Description
      DefaultQueryParser​(java.lang.String field, org.apache.lucene.analysis.Analyzer analyzer, ParseOptions options)
      Builds a new DefaultQueryParser.
    • Method Summary

      Modifier and Type Method Description
      static float getFieldBoost​(ParseOptions.Engine engine, java.lang.String fieldName)
      Return the boost that should be applied to a field at query time for text search.
      protected org.apache.lucene.search.Query getFuzzyQuery​(java.lang.String field, java.lang.String termStr, float minSimilarity)
      Disable fuzzy query.
      protected org.apache.lucene.search.Query getWildcardQuery​(java.lang.String field, java.lang.String termStr)
      Disable wildcard query.
      protected org.apache.lucene.search.Query newTermQuery​(org.apache.lucene.index.Term term)  
      static org.apache.lucene.search.Query parse​(java.lang.String searchString, ParseOptions options, org.apache.lucene.analysis.Analyzer analyzer)
      Invoked by JCMS Lucene search engines to parse a query.
      static java.lang.String rewriteSearchString​(java.lang.String searchString, ParseOptions options)
      Computes the search string to use from the specified options.
      • Methods inherited from class org.apache.lucene.queryparser.classic.QueryParser

        Clause, Conjunction, disable_tracing, enable_tracing, generateParseException, getNextToken, getSplitOnWhitespace, getToken, Modifiers, MultiTerm, Query, ReInit, ReInit, setAutoGeneratePhraseQueries, setSplitOnWhitespace, Term, TopLevelQuery
      • Methods inherited from class org.apache.lucene.queryparser.classic.QueryParserBase

        addClause, addMultiTermClauses, escape, getAllowLeadingWildcard, getAutoGeneratePhraseQueries, getBooleanQuery, getDateResolution, getDefaultOperator, getField, getFieldQuery, getFieldQuery, getFuzzyMinSim, getFuzzyPrefixLength, getLocale, getMaxDeterminizedStates, getMultiTermRewriteMethod, getPhraseSlop, getPrefixQuery, getRangeQuery, getRegexpQuery, getTimeZone, init, newBooleanClause, newFieldQuery, newFuzzyQuery, newMatchAllDocsQuery, newPrefixQuery, newRangeQuery, newRegexpQuery, newWildcardQuery, parse, setAllowLeadingWildcard, setDateResolution, setDateResolution, setDefaultOperator, setFuzzyMinSim, setFuzzyPrefixLength, setLocale, setMaxDeterminizedStates, setMultiTermRewriteMethod, setPhraseSlop, setTimeZone
      • Methods inherited from class org.apache.lucene.util.QueryBuilder

        add, analyzeBoolean, analyzeGraphBoolean, analyzeGraphPhrase, analyzeMultiBoolean, analyzeMultiPhrase, analyzePhrase, analyzeTerm, createBooleanQuery, createBooleanQuery, createFieldQuery, createFieldQuery, createMinShouldMatchQuery, createPhraseQuery, createPhraseQuery, createSpanQuery, getAnalyzer, getAutoGenerateMultiTermSynonymsPhraseQuery, getEnableGraphQueries, getEnablePositionIncrements, newBooleanQuery, newGraphSynonymQuery, newMultiPhraseQueryBuilder, newSynonymQuery, setAnalyzer, setAutoGenerateMultiTermSynonymsPhraseQuery, setEnableGraphQueries, setEnablePositionIncrements
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
      • Methods inherited from interface org.apache.lucene.queryparser.flexible.standard.CommonQueryParserConfiguration

        getAnalyzer, getEnablePositionIncrements, setEnablePositionIncrements
    • Field Detail

      • LUCENE_WILDCARD_ENABLED

        public static boolean LUCENE_WILDCARD_ENABLED
      • LUCENE_FUZZY_ENABLED

        public static boolean LUCENE_FUZZY_ENABLED
      • USE_RIGHT_TRUNCATION

        public static boolean USE_RIGHT_TRUNCATION
      • RIGHT_TRUNCATION_EXCLUSION_REGEX

        public static java.lang.String RIGHT_TRUNCATION_EXCLUSION_REGEX
        A regular expression used to match words in the user's search string for which right truncation will NOT be applied.

        The default regex applies to:

        • punctuation...
        • words ending with punctuation (for example "said" and "world" in: "john said: hello world!")
        • numbers, as we want to match them precisely (this also prevents TooManyClauses exceptions on large indexes)
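
The exclusion rules above can be sketched with a plain java.util.regex pattern. The pattern below is a hypothetical stand-in, not the actual default value of RIGHT_TRUNCATION_EXCLUSION_REGEX, and the class RightTruncationSketch and its method applyRightTruncation are illustrative names; the sketch only assumes that right truncation means appending * to each word that is not excluded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class RightTruncationSketch {

  // Hypothetical exclusion pattern (assumption, not the JCMS default):
  // a token ending with punctuation (this also covers punctuation-only
  // tokens), or a token made only of digits.
  static final Pattern EXCLUSION = Pattern.compile(".*\\p{Punct}|\\d+");

  // Appends '*' to every whitespace-separated word not matched by EXCLUSION.
  static String applyRightTruncation(String searchString) {
    List<String> out = new ArrayList<>();
    for (String word : searchString.trim().split("\\s+")) {
      if (word.isEmpty()) {
        continue; // happens only for an empty or blank search string
      }
      boolean excluded = EXCLUSION.matcher(word).matches();
      out.add(excluded ? word : word + "*");
    }
    return String.join(" ", out);
  }

  public static void main(String[] args) {
    // "said:" and "world!" end with punctuation, so they are left untouched.
    System.out.println(applyRightTruncation("john said: hello world!"));
    // Numbers are matched precisely, so no '*' is appended to "2024".
    System.out.println(applyRightTruncation("report 2024"));
  }
}
```

Because \p{Punct} includes *, a word the user already truncated by hand (such as "hel*") is left as-is in this sketch rather than receiving a second wildcard.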
    • Constructor Detail

      • DefaultQueryParser

        public DefaultQueryParser​(java.lang.String field,
                                  org.apache.lucene.analysis.Analyzer analyzer,
                                  ParseOptions options)
        Builds a new DefaultQueryParser.
        Parameters:
        field - the default field for query terms.
        analyzer - used to find terms in the query text.
        options - parsing options
    • Method Detail

      • getWildcardQuery

        protected org.apache.lucene.search.Query getWildcardQuery​(java.lang.String field,
                                                                  java.lang.String termStr)
                                                           throws org.apache.lucene.queryparser.classic.ParseException
        Disable wildcard query.
        Overrides:
        getWildcardQuery in class org.apache.lucene.queryparser.classic.QueryParserBase
        Throws:
        org.apache.lucene.queryparser.classic.ParseException
      • getFuzzyQuery

        protected org.apache.lucene.search.Query getFuzzyQuery​(java.lang.String field,
                                                               java.lang.String termStr,
                                                               float minSimilarity)
                                                        throws org.apache.lucene.queryparser.classic.ParseException
        Disable fuzzy query.
        Overrides:
        getFuzzyQuery in class org.apache.lucene.queryparser.classic.QueryParserBase
        Throws:
        org.apache.lucene.queryparser.classic.ParseException
      • newTermQuery

        protected org.apache.lucene.search.Query newTermQuery​(org.apache.lucene.index.Term term)
        Overrides:
        newTermQuery in class org.apache.lucene.util.QueryBuilder
      • parse

        public static org.apache.lucene.search.Query parse​(java.lang.String searchString,
                                                           ParseOptions options,
                                                           org.apache.lucene.analysis.Analyzer analyzer)
                                                    throws org.apache.lucene.queryparser.classic.ParseException
        Invoked by JCMS Lucene search engines to parse a query.
        Parameters:
        searchString - the string to search/parse
        options - parsing options
        analyzer - used to find terms in the query text
        Returns:
        a new Query instance
        Throws:
        org.apache.lucene.queryparser.classic.ParseException - if the Lucene query is invalid
        Since:
        jcms-10.0.0
      • rewriteSearchString

        public static java.lang.String rewriteSearchString​(java.lang.String searchString,
                                                           ParseOptions options)
        Computes the search string to use from the specified options.
        Parameters:
        searchString - the string to search
        options - a ParseOptions instance holding the search parameters
        Returns:
        a String modified to be suitable for Lucene search.
      • getFieldBoost

        public static float getFieldBoost​(ParseOptions.Engine engine,
                                          java.lang.String fieldName)
        Return the boost that should be applied to a field at query time for text search.

        Boost values are read from property declarations of the form:

        query.lucene.boost.{search-engine-name}.{field-name}: 2.0
        For example:
           query.lucene.boost.PUBLICATION.title: 2.0
           query.lucene.boost.CATEGORY.name: 1.2
           query.lucene.boost.PUBLICATION._allfields_: 1.0
         
        Parameters:
        engine - the search engine in which search is performed
        fieldName - the name of the field being searched, for which a Query object is being built
        Returns:
        the boost value of the field, as a float, to apply to the Query object
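
The property naming scheme described above can be illustrated with a small lookup over java.util.Properties. This is a sketch only: the class FieldBoostSketch, the method lookupBoost, and the fallback value of 1.0 for a missing or malformed property are assumptions, not the documented behavior of getFieldBoost.

```java
import java.util.Properties;

class FieldBoostSketch {

  static final String PREFIX = "query.lucene.boost.";

  // Assumption: a neutral boost is used when no property is declared.
  static final float DEFAULT_BOOST = 1.0f;

  // Builds the key query.lucene.boost.{engineName}.{fieldName} and parses its value.
  static float lookupBoost(Properties props, String engineName, String fieldName) {
    String raw = props.getProperty(PREFIX + engineName + "." + fieldName);
    if (raw == null) {
      return DEFAULT_BOOST;
    }
    try {
      return Float.parseFloat(raw.trim());
    } catch (NumberFormatException e) {
      return DEFAULT_BOOST; // malformed value: fall back instead of failing the search
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("query.lucene.boost.PUBLICATION.title", "2.0");
    props.setProperty("query.lucene.boost.CATEGORY.name", "1.2");
    props.setProperty("query.lucene.boost.PUBLICATION._allfields_", "1.0");

    System.out.println(lookupBoost(props, "PUBLICATION", "title"));
    System.out.println(lookupBoost(props, "CATEGORY", "name"));
    System.out.println(lookupBoost(props, "PUBLICATION", "abstract"));
  }
}
```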