The Web plugin enables websites to be used as SCAN document sources. It works as a crawler that follows hyperlinks on web pages and recursively adds them to the repository, starting from a specified URL.
The plugin supports flexible configuration of the crawling scope, including recursion depth and restrictions on host and directory names. Standard URL filtering with include/exclude patterns is supported as well.
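The plugin's actual configuration format is not shown here, but the scope rules it describes (depth limit, host restriction, include/exclude patterns) can be sketched as a single predicate. The function name `in_scope` and its parameters are illustrative, not part of the plugin's API:

```python
import re
from urllib.parse import urlparse

def in_scope(url, start_host, depth, max_depth=3,
             include=None, exclude=None):
    """Return True if the crawler should follow this URL.

    Hypothetical sketch of the scope rules: recursion depth,
    host limitation, and include/exclude regex patterns.
    """
    if depth > max_depth:                      # recursion depth limit
        return False
    if urlparse(url).hostname != start_host:   # host name limitation
        return False
    if include and not any(re.search(p, url) for p in include):
        return False                           # must match an include pattern
    if exclude and any(re.search(p, url) for p in exclude):
        return False                           # must match no exclude pattern
    return True
```

A crawler would apply such a check to every hyperlink before queueing it, so out-of-scope links are pruned without being fetched.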
Parsed web pages are cached to speed up location updates and minimize traffic overhead. The value of the "Last-Modified" HTTP header is used to determine whether a newer version of a cached web page exists and should be downloaded.
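The freshness decision described above boils down to comparing the cached "Last-Modified" value with the one the server reports. As an illustrative sketch (the helper `needs_refresh` is not part of the plugin), the comparison might look like this in Python:

```python
from email.utils import parsedate_to_datetime

def needs_refresh(cached_last_modified, server_last_modified):
    """Decide whether a cached page should be re-downloaded.

    Both arguments are HTTP date strings from the "Last-Modified"
    header, e.g. "Mon, 01 Jan 2024 00:00:00 GMT", or None.
    """
    if cached_last_modified is None:
        return True   # nothing cached yet: download
    if server_last_modified is None:
        return True   # server gives no date: re-download to be safe
    # Re-download only if the server's copy is strictly newer.
    return (parsedate_to_datetime(server_last_modified)
            > parsedate_to_datetime(cached_last_modified))
```

In practice a crawler can avoid even this comparison by sending the cached date in an "If-Modified-Since" request header, letting the server answer 304 Not Modified instead of resending the page body.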
The plugin understands HTML META tags, using them to fill the Description and Author properties, and can optionally import META Keywords values as document tags.
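How the plugin parses META tags internally is not specified; a minimal sketch of the idea using Python's standard-library HTML parser could look like the following (the class and function names are hypothetical):

```python
from html.parser import HTMLParser

class MetaExtractor(HTMLParser):
    """Collect <meta name=... content=...> values from an HTML page."""
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            a = dict(attrs)
            name = (a.get("name") or "").lower()
            if name in ("description", "author", "keywords"):
                self.meta[name] = a.get("content", "")

def extract_meta(html):
    """Return (description, author, tags) found in the page's META tags."""
    parser = MetaExtractor()
    parser.feed(html)
    # Keywords is a comma-separated list; each entry becomes a document tag.
    keywords = parser.meta.get("keywords", "")
    tags = [t.strip() for t in keywords.split(",") if t.strip()]
    return parser.meta.get("description"), parser.meta.get("author"), tags
```

Splitting the comma-separated Keywords value into individual tags mirrors the optional tag import the plugin offers.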