Class HtmlUtil


  • public class HtmlUtil
    extends java.lang.Object
    Html manipulation methods.
    Since:
    jcms-7.1.1
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String URL_REGEX  
    • Constructor Summary

      Constructors 
      Constructor Description
      HtmlUtil()  
    • Method Summary

      Modifier and Type Method Description
      static java.lang.String getHtmlIcon​(java.lang.String src, java.lang.String alt)
      Build an icon
      static java.lang.String getHtmlIcon​(java.lang.String src, java.lang.String alt, java.lang.String title)
      Build an icon
      static java.lang.String getHtmlIcon​(java.lang.String src, java.lang.String alt, java.lang.String title, java.lang.String css)
      Build an icon
      static java.lang.String getHtmlIcon​(java.lang.String src, java.lang.String alt, java.lang.String title, java.lang.String css, java.lang.String htmlAttributes)
      Build an icon
      static java.lang.String getSpanIcon​(java.lang.String css, java.lang.String alt, java.lang.String title, java.lang.String htmlAttributes)
      Build a span icon
      static java.lang.String html2text​(java.lang.String html)
      Extracts all text from the specified HTML and returns it.
      static boolean isSameHtml​(java.lang.String htmlFragment1, java.lang.String htmlFragment2)
      Check if the two HTML fragments are the same.
      static java.lang.String text2html​(java.lang.String text)
      Convert a text as HTML.
      static void traverse​(org.jsoup.nodes.Node node, org.jsoup.select.NodeVisitor visitor, int depth, boolean reverse)
      Traverse the tree of the specified HTML node.
      static void trimHtml​(org.jsoup.nodes.Element element)
      Remove leading and trailing empty nodes from the specified JSoup element.
      static java.lang.String truncate​(java.lang.String fragmentHtml, int maxTextLength)
      Truncate the specified HTML fragment to the maximum text length specified.
      static java.lang.String truncate​(java.lang.String fragmentHtml, int maxTextLength, java.lang.String suffix)
      Truncate the specified HTML fragment to the maximum text length specified.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • URL_REGEX

        public static java.lang.String URL_REGEX
    • Constructor Detail

      • HtmlUtil

        public HtmlUtil()
    • Method Detail

      • truncate

        public static java.lang.String truncate​(java.lang.String fragmentHtml,
                                                int maxTextLength)
        Truncate the specified HTML fragment to the maximum text length specified.

        Important remark regarding behavior:
        The specified HTML fragment is parsed with an HTML parser, so the rewritten HTML WILL be modified according to the parser's default output rules. Line breaks, indentation and other whitespace may therefore be lost in the process, but the result should render identically in a web browser.

        Parameters:
        fragmentHtml - the HTML fragment to truncate
        maxTextLength - the maximum length of text to keep (excluding the length of HTML tags, comments and attributes)
        Returns:
        the cleaned and truncated HTML; never null (an empty string is returned if the input was null)
        Since:
        jcms-7.1.1
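
        The length rule described for maxTextLength (tags, comments and attributes do not count) can be illustrated with a small stand-alone counter. This is only a sketch of the counting idea, not the HtmlUtil implementation: the class and method names are hypothetical, entities are counted as raw characters, and a literal '<' in text or a '>' inside an attribute value is not handled.

```java
// Hypothetical helper, not part of HtmlUtil: counts only the characters
// that sit outside of tags and comments in an HTML fragment.
class TextLengthSketch {

    static int visibleTextLength(String html) {
        if (html == null) {
            return 0;
        }
        int count = 0;
        int i = 0;
        while (i < html.length()) {
            if (html.startsWith("<!--", i)) {
                // Skip a whole comment, or the rest of the input if it is unterminated.
                int end = html.indexOf("-->", i + 4);
                i = (end < 0) ? html.length() : end + 3;
            } else if (html.charAt(i) == '<') {
                // Skip a tag together with its attributes.
                int end = html.indexOf('>', i + 1);
                i = (end < 0) ? html.length() : end + 1;
            } else {
                // Anything else is counted as text.
                count++;
                i++;
            }
        }
        return count;
    }
}
```

        With this rule, the fragment "<div><p>Hello <span>World!</span></p></div>" has a text length of 12, so a maxTextLength of 8 would cut inside "World!".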
      • truncate

        public static java.lang.String truncate​(java.lang.String fragmentHtml,
                                                int maxTextLength,
                                                java.lang.String suffix)
        Truncate the specified HTML fragment to the maximum text length specified.

        An optional suffix is inserted inside the block element in which the text was truncated.

        Important remark regarding behavior:
        The specified HTML fragment is parsed with an HTML parser, so the rewritten HTML WILL be modified according to the parser's default output rules. Line breaks, indentation and other whitespace may therefore be lost in the process, but the result should render identically in a web browser.

        Example (notice the indentation and the HTML-compliant output):

          String html = "<div><p>Hello <span>World!</span></p></div>";
          String suffix = "<a href='#'> Read More...</a>";
          assertEquals("<div>\n <p>Hello <span>Wo</span><a href=\"#\"> Read More...</a></p>\n</div>", 
                       HtmlUtil.truncate(html, 8, suffix));
         
         
        Parameters:
        fragmentHtml - the HTML fragment to truncate
        maxTextLength - the maximum length of text to keep (excluding the length of HTML tags, comments and attributes)
        suffix - a suffix to append inside the first truncated HTML node
        Returns:
        the cleaned and truncated HTML; never null (an empty string is returned if the input was null)
        Since:
        jcms-7.1.1
      • traverse

        public static void traverse​(org.jsoup.nodes.Node node,
                                    org.jsoup.select.NodeVisitor visitor,
                                    int depth,
                                    boolean reverse)
        Traverse the tree of the specified HTML node.

        Unlike JSoup's native NodeTraversor, this implementation is recursive and robust to node removal.

        Parameters:
        node - node to traverse
        visitor - the visitor to use
        depth - current depth
        reverse - set to true to traverse in reverse order (starting from the end), in which case head is invoked for the "closing" part of a Node
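
        The idea of a recursive traversal that tolerates node removal can be sketched without JSoup. The Node and Visitor types below are hypothetical stand-ins for org.jsoup.nodes.Node and org.jsoup.select.NodeVisitor, and iterating over a snapshot of the children is an assumption about how such robustness could be obtained, not a description of the actual HtmlUtil code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-alone sketch; not the HtmlUtil or JSoup implementation.
class TraverseSketch {

    // Minimal tree node standing in for org.jsoup.nodes.Node.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node parent;

        Node(String name) { this.name = name; }

        // Appends a child and returns this node, so calls can be chained.
        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return this;
        }

        // Detaches this node from its parent, if any.
        void remove() {
            if (parent != null) {
                parent.children.remove(this);
                parent = null;
            }
        }
    }

    // Stand-in for org.jsoup.select.NodeVisitor.
    interface Visitor {
        void head(Node node, int depth);
        void tail(Node node, int depth);
    }

    static void traverse(Node node, Visitor visitor, int depth, boolean reverse) {
        // In reverse mode, head is the first callback for a node, i.e. its "closing" part.
        visitor.head(node, depth);
        // Iterate over a copy so the visitor may remove nodes while we walk.
        List<Node> snapshot = new ArrayList<>(node.children);
        if (reverse) {
            Collections.reverse(snapshot);
        }
        for (Node child : snapshot) {
            if (child.parent != node) {
                continue; // detached by the visitor in the meantime
            }
            traverse(child, visitor, depth + 1, reverse);
        }
        visitor.tail(node, depth);
    }
}
```

        Because the loop runs over a copy of the child list, a visitor that calls remove() on the node it is visiting does not disturb the iteration over the remaining siblings.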
      • trimHtml

        public static void trimHtml​(org.jsoup.nodes.Element element)
        Remove leading and trailing empty nodes from the specified JSoup element.

        Example of use:

          final String html = "<p> <br/> &nbsp; </p> \n"
              + "<p> Hello <strong>World!</strong>! </p> \n"
              + "<p> <br/> &nbsp; </p> ";
          org.jsoup.nodes.Document doc = Jsoup.parseBodyFragment(html);
          
          HtmlUtil.trimHtml(doc.body());
          
          assertEquals("<p> Hello <strong>World!</strong>! </p>", doc.body().html());
         
         
        Parameters:
        element - the JSoup Element to trim, usually the Document body.
        Since:
        jcms-10.0.0
      • html2text

        public static java.lang.String html2text​(java.lang.String html)
        Extracts all text from the specified HTML and returns it.
        Parameters:
        html - the HTML from which text should be extracted
        Returns:
        a plain text string without any HTML content whatsoever (no tags, comments, attributes, ...). Never returns null: an empty string is returned if the specified HTML was null.
        Since:
        jcms-7.0.3
      • getHtmlIcon

        public static java.lang.String getHtmlIcon​(java.lang.String src,
                                                   java.lang.String alt)
        Build an icon
        Parameters:
        src - the image source
        alt - the image alt
        Returns:
        the image tag representation
      • getHtmlIcon

        public static java.lang.String getHtmlIcon​(java.lang.String src,
                                                   java.lang.String alt,
                                                   java.lang.String title)
        Build an icon
        Parameters:
        src - the image source
        alt - the image alt
        title - the image title
        Returns:
        the image tag representation
        Since:
        jcms-8.0.0
      • getHtmlIcon

        public static java.lang.String getHtmlIcon​(java.lang.String src,
                                                   java.lang.String alt,
                                                   java.lang.String title,
                                                   java.lang.String css)
        Build an icon
        Parameters:
        src - the image source
        alt - the image alt
        title - the image title
        css - the CSS class(es) for the class="" attribute; "icon" is set by default.
        Returns:
        the image tag representation
        Since:
        jcms-8.0.0
      • getHtmlIcon

        public static java.lang.String getHtmlIcon​(java.lang.String src,
                                                   java.lang.String alt,
                                                   java.lang.String title,
                                                   java.lang.String css,
                                                   java.lang.String htmlAttributes)
        Build an icon
        Parameters:
        src - the image source
        alt - the image alt
        title - the image title
        css - the CSS class(es) for the class="" attribute; "icon" is set by default.
        htmlAttributes - additional HTML attributes
        Returns:
        the image tag representation
        Since:
        jcms-9.0.0
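
        The way the four overloads build on each other can be pictured with a small stand-alone builder. Everything below is a hypothetical sketch: the exact markup produced by HtmlUtil.getHtmlIcon, its attribute order and its escaping are not documented here, so the output format and the delegation between overloads are assumptions made for illustration only.

```java
// Hypothetical sketch of an <img> icon builder; not the HtmlUtil implementation.
class IconSketch {

    // Escapes a value so it can be placed safely inside a double-quoted attribute.
    static String escapeAttr(String s) {
        if (s == null) {
            return "";
        }
        return s.replace("&", "&amp;")
                .replace("\"", "&quot;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Fullest variant: the shorter overloads delegate here with null arguments.
    static String icon(String src, String alt, String title, String css, String htmlAttributes) {
        StringBuilder sb = new StringBuilder("<img");
        sb.append(" src=\"").append(escapeAttr(src)).append('"');
        // "icon" is always present; extra classes are appended after it (assumed order).
        sb.append(" class=\"icon");
        if (css != null && !css.trim().isEmpty()) {
            sb.append(' ').append(escapeAttr(css.trim()));
        }
        sb.append('"');
        sb.append(" alt=\"").append(escapeAttr(alt)).append('"');
        if (title != null) {
            sb.append(" title=\"").append(escapeAttr(title)).append('"');
        }
        // Extra attributes are passed through verbatim; the caller must supply valid markup.
        if (htmlAttributes != null && !htmlAttributes.trim().isEmpty()) {
            sb.append(' ').append(htmlAttributes.trim());
        }
        return sb.append(" />").toString();
    }

    static String icon(String src, String alt) {
        return icon(src, alt, null, null, null);
    }
}
```

        Funnelling every overload into the most complete signature keeps the escaping and the default "icon" class in one place.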
      • getSpanIcon

        public static java.lang.String getSpanIcon​(java.lang.String css,
                                                   java.lang.String alt,
                                                   java.lang.String title,
                                                   java.lang.String htmlAttributes)
        Build a span icon
        Parameters:
        css - the css value
        alt - the alternative text for the glyph icon
        title - the title attribute
        htmlAttributes - the html attributes
        Returns:
        a span tag
        Since:
        jcms-9.0.0
      • isSameHtml

        public static boolean isSameHtml​(java.lang.String htmlFragment1,
                                         java.lang.String htmlFragment2)
        Check if the two HTML fragments are the same, disregarding white space (when not relevant).
        Parameters:
        htmlFragment1 - the first HTML fragment to compare
        htmlFragment2 - the second HTML fragment to compare
        Returns:
        true if the HTML fragments are the same, false otherwise.
        Since:
        jcms-10.0.0
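
        A very rough stand-alone approximation of "the same, disregarding white space" is to collapse whitespace runs before comparing. The sketch below is not the HtmlUtil implementation (which presumably compares parsed HTML); the class and method names are hypothetical, and the normalization is deliberately naive: it also collapses whitespace inside elements such as pre, where whitespace is significant.

```java
// Hypothetical sketch; compares two fragments after collapsing whitespace runs.
class SameHtmlSketch {

    // Collapses every run of whitespace to one space and trims both ends.
    // Note that Java's \s does not match the non-breaking space U+00A0.
    static String normalize(String html) {
        if (html == null) {
            return "";
        }
        return html.replaceAll("\\s+", " ").trim();
    }

    static boolean sameIgnoringWhitespaceRuns(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

        Under this rule "<p>Hello   world</p>" and "<p>Hello\n\tworld</p>" compare as equal, while "<p>Hello world</p>" and "<p>Helloworld</p>" do not.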
      • text2html

        public static java.lang.String text2html​(java.lang.String text)
        Convert text to HTML. Escapes HTML reserved characters (such as <, > and &) and converts URLs to HTML links.
        Parameters:
        text - the text
        Returns:
        the HTML
        Since:
        jcms-10.0.4
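
        The two steps described for text2html (escaping reserved characters, then turning URLs into links) can be sketched as follows. This is an illustration under assumptions, not the HtmlUtil code: the URL pattern is a hypothetical simplification (the actual value of URL_REGEX is not documented here), returning an empty string for null is assumed, and trailing punctuation after a URL ends up inside the link.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a text-to-HTML conversion; not the HtmlUtil implementation.
class Text2HtmlSketch {

    // Simplified URL pattern chosen for this sketch only.
    private static final Pattern URL = Pattern.compile("https?://[^\\s<>\"]+");

    // Escapes the HTML reserved characters; '&' must be replaced first.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    static String text2html(String text) {
        if (text == null) {
            return ""; // assumption: the documented method does not state its null behavior
        }
        StringBuilder out = new StringBuilder();
        Matcher m = URL.matcher(text);
        int last = 0;
        while (m.find()) {
            // Escape the plain text that precedes the URL.
            out.append(escape(text.substring(last, m.start())));
            // Escape the URL itself, then use it as both href and link text.
            String url = escape(m.group());
            out.append("<a href=\"").append(url).append("\">").append(url).append("</a>");
            last = m.end();
        }
        out.append(escape(text.substring(last)));
        return out.toString();
    }
}
```

        URLs are located in the raw text before escaping, so an ampersand inside a query string is escaped once, in the link, rather than breaking the match.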