Package com.jalios.util.webpage
Class WebPageMetaDataExtractorUtils
- java.lang.Object
  - com.jalios.util.webpage.WebPageMetaDataExtractorUtils
public final class WebPageMetaDataExtractorUtils extends java.lang.Object
Utilities to extract the metadata of a web page (title, description, images, ...).
- Since:
- jcms-9.0.4 and jcms-10
- Version:
- $Revision: 136288 $
- Author:
- Kevin Bransard
Method Summary
- static java.lang.String extractContent(org.jsoup.nodes.Document document, java.lang.String attrName, java.lang.String... cssQueries)
  Returns the extracted content for given cssQueries and given attribute name
- static WebPageMetaData getWebPageMetaData(java.lang.String url, java.lang.String userAgent)
  Returns metadata as WebPageMetaData object by connecting to given url
- static WebPageMetaData getWebPageMetaDataFromHtml(java.lang.String html)
  Returns metadata as WebPageMetaData object by traversing given html source
Method Detail
getWebPageMetaDataFromHtml
public static WebPageMetaData getWebPageMetaDataFromHtml(java.lang.String html)
Returns metadata as a WebPageMetaData object by traversing the given HTML source.
- Parameters:
- html - the HTML to get metadata from
- Returns:
- a WebPageMetaData object
- Since:
- jcms-9.0.4
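To give a feel for what "traversing the HTML source" for metadata can involve, here is a minimal sketch that relies only on the JDK. It is not the JCMS implementation: the class PageMetaSketch, its fields, and the regular expressions are hypothetical, and the real method presumably parses the page with jsoup rather than with regular expressions. The sketch only handles a `<title>` element and a double-quoted `<meta name="description" content="...">` tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a metadata holder; NOT the JCMS WebPageMetaData class. */
final class PageMetaSketch {
    final String title;
    final String description;

    // Assumption: only these two fields are illustrated; the real class exposes more.
    private static final Pattern TITLE = Pattern.compile(
        "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern META_DESCRIPTION = Pattern.compile(
        "<meta\\s+name=\"description\"\\s+content=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    private PageMetaSketch(String title, String description) {
        this.title = title;
        this.description = description;
    }

    /** Builds the holder from an HTML string; missing values become empty strings. */
    static PageMetaSketch fromHtml(String html) {
        return new PageMetaSketch(firstGroup(TITLE, html), firstGroup(META_DESCRIPTION, html));
    }

    /** Returns the trimmed first capture group of the first match, or "" if there is none. */
    private static String firstGroup(Pattern pattern, String html) {
        if (html == null) {
            return "";
        }
        Matcher matcher = pattern.matcher(html);
        return matcher.find() ? matcher.group(1).trim() : "";
    }

    public static void main(String[] args) {
        PageMetaSketch meta = fromHtml(
            "<html><head><title>Example</title>"
            + "<meta name=\"description\" content=\"A sample page\"></head></html>");
        System.out.println(meta.title + " / " + meta.description);
    }
}
```

Regular expressions are fragile on real-world HTML, which is one reason a tolerant parser such as jsoup is the usual choice for this kind of work.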
getWebPageMetaData
public static WebPageMetaData getWebPageMetaData(java.lang.String url, java.lang.String userAgent)
Returns metadata as a WebPageMetaData object by connecting to the given URL.
- Parameters:
- url - the URL to get metadata from
- userAgent - the user agent used to access the URL (a default user agent is used if null)
- Returns:
- a WebPageMetaData object
- Since:
- jcms-9.0.4
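The null handling of the userAgent parameter can be illustrated with the JDK's own HTTP client classes (Java 11 or later). This is a sketch under assumptions, not what JCMS does internally: the class UserAgentSketch, the method buildRequest, and the DEFAULT_USER_AGENT value are all hypothetical, and the actual default user agent is not stated in this page. The sketch only builds the request and does not connect to anything.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class UserAgentSketch {
    // Hypothetical default value; the default actually used by JCMS is not documented here.
    static final String DEFAULT_USER_AGENT = "Mozilla/5.0 (compatible; ExampleBot/1.0)";

    /** Builds a GET request for the URL, falling back to the default user agent when null. */
    static HttpRequest buildRequest(String url, String userAgent) {
        String effectiveUserAgent = (userAgent == null) ? DEFAULT_USER_AGENT : userAgent;
        return HttpRequest.newBuilder(URI.create(url))
            .header("User-Agent", effectiveUserAgent)
            .GET()
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("https://example.com/", null);
        // Prints the header that would be sent; no network access happens here.
        System.out.println(request.headers().firstValue("User-Agent").orElse("(none)"));
    }
}
```

Some sites serve different markup, or refuse the request, depending on the user agent, which is a common reason for exposing it as a parameter.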
extractContent
public static java.lang.String extractContent(org.jsoup.nodes.Document document, java.lang.String attrName, java.lang.String... cssQueries)
Returns the content extracted for the given CSS queries and attribute name.
- Parameters:
- document - the Document
- attrName - the name of the attribute to read on the elements returned by the CSS queries (can be empty)
- cssQueries - the CSS queries used to search for elements
- Returns:
- a value based on cssQueries and the attribute name
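One plausible reading of the extractContent contract is "try each query in turn and return the first non-empty value". The sketch below illustrates that reading only; it is an assumption, not the JCMS implementation. To stay runnable without extra libraries it uses the JDK's DOM and XPath APIs in place of jsoup and CSS queries, and it assumes that an empty attribute name means "use the element text". The names ExtractContentSketch, extract, and parse are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ExtractContentSketch {

    /** Returns the first non-empty value found by the queries, or "" if nothing matches. */
    static String extract(Document document, String attrName, String... queries) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        for (String query : queries) {
            Node node;
            try {
                node = (Node) xpath.evaluate(query, document, XPathConstants.NODE);
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("Invalid query: " + query, e);
            }
            if (!(node instanceof Element)) {
                continue; // no element matched this query, try the next one
            }
            Element element = (Element) node;
            // Assumption: an empty attribute name means "read the element text".
            String value = (attrName == null || attrName.isEmpty())
                ? element.getTextContent()
                : element.getAttribute(attrName);
            if (value != null && !value.trim().isEmpty()) {
                return value.trim();
            }
        }
        return "";
    }

    /** Parses well-formed XHTML into a DOM document (real-world HTML needs a tolerant parser). */
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><head><title>Plain title</title>"
            + "<meta property=\"og:title\" content=\"OG title\"/></head></html>");
        // Prefer the Open Graph title; the second query is only used if the first yields nothing.
        System.out.println(extract(doc, "content",
            "//meta[@property='og:title']", "//meta[@name='title']"));
    }
}
```

Passing the queries in priority order lets a caller prefer one source of a value (for example an Open Graph tag) and fall back to others without writing the loop each time.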