Grooper Help - Version 25.0
25.0.0017 2,127

HTTP Resource

Embedded Object Grooper.Messaging

Represents a web site or web page to be imported using HTTP Import.

Remarks

An HTTPResource defines a starting point for importing web content into Grooper using the HTTP Import provider.

How it works

  • URL: Set this to the root of the website or a specific page you want to import.
  • RelativePageUrls: (Optional) Specify a list of relative URLs to include, if you want to restrict import to certain pages under the root URL.
  • LinkSelectors: Add one or more HyperlinkSelector objects to control which links are followed from each page, enabling recursive crawling, filtering, and advanced navigation.
  • Enabled: Use this to temporarily include or exclude this resource from import runs.
  • EnforceScope: When enabled, only URLs that are within the root URL's scope will be followed, preventing the crawler from leaving the intended site or section.
  • Description: Add a description to help identify the purpose or scope of this resource.
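The interaction between URL, RelativePageUrls, and EnforceScope can be illustrated with a small Python sketch. This is not Grooper code: the function names resolve_pages and in_scope are hypothetical, and the scope rule shown (same scheme and host, path at or below the root path) is only an assumption about what "within the root URL's scope" means.

```python
from urllib.parse import urljoin, urlsplit


def resolve_pages(root_url, relative_page_urls):
    """Resolve each relative URL against the root URL (hypothetical helper)."""
    return [urljoin(root_url, rel) for rel in relative_page_urls]


def in_scope(root_url, candidate_url):
    """Assumed scope rule: same scheme and host, path at or below the root path."""
    root, cand = urlsplit(root_url), urlsplit(candidate_url)
    if (root.scheme, root.netloc.lower()) != (cand.scheme, cand.netloc.lower()):
        return False
    # Compare against the root path with a trailing slash so that
    # "/docs" does not accidentally match "/docs-old".
    root_path = root.path if root.path.endswith("/") else root.path + "/"
    cand_path = cand.path or "/"
    return cand_path == root.path or cand_path.startswith(root_path)


pages = resolve_pages("https://example.com/docs/", ["intro.html", "guide/setup.html"])
followable = [u for u in pages if in_scope("https://example.com/docs/", u)]
```

Note that the root URL in this sketch ends with a slash. With standard URL resolution, a root such as https://example.com/docs (no trailing slash) would resolve intro.html to https://example.com/intro.html, which may not be what you intend.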

Example

To import all product pages from a website:

  • Set URL to https://example.com
  • Add a LinkSelector with a selector like a.product-link and enable recursion to follow all product links.
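As a rough illustration of what following a.product-link links recursively involves, the Python sketch below collects matching links from each page and visits them breadth-first. It is a conceptual model only, not how Grooper implements LinkSelectors: the names ProductLinkCollector, collect_links, and crawl are hypothetical, the fetch argument is any caller-supplied function that returns HTML for a URL, and the max_pages limit is an assumption added to keep the sketch bounded.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class ProductLinkCollector(HTMLParser):
    """Collects href values of <a> tags carrying a given CSS class."""

    def __init__(self, base_url, css_class="product-link"):
        super().__init__()
        self.base_url = base_url
        self.css_class = css_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        href = attr.get("href")
        if href and self.css_class in classes:
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def collect_links(html, base_url):
    parser = ProductLinkCollector(base_url)
    parser.feed(html)
    return parser.links


def crawl(start_url, fetch, max_pages=50):
    """Breadth-first crawl; fetch(url) must return the page's HTML as a string."""
    seen, queue, visited = {start_url}, deque([start_url]), []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in collect_links(fetch(url), url):
            if link not in seen:  # skip pages already queued or visited
                seen.add(link)
                queue.append(link)
    return visited
```

Passing fetch in as a parameter keeps the sketch free of network calls, so it can be tried against a dictionary of sample pages before pointing anything at a live site.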

Notes

  • The combination of RelativePageUrls and LinkSelectors allows you to precisely control which pages are imported and how deep the crawl goes.
  • Use LinkSelectors to include/exclude links, apply URL patterns, and manage recursion for complex site structures.
  • Each HTTPResource operates independently, so you can import from multiple sites or sections in a single import operation.
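To make the independence of resources concrete, the following Python sketch models several resources and computes the starting pages for each enabled one. The HttpResource dataclass and plan_import function are hypothetical stand-ins that mirror the property names above; they are not Grooper's object model. The fallback to the root URL when no relative pages are listed is likewise an assumption.

```python
from dataclasses import dataclass, field
from urllib.parse import urljoin


@dataclass
class HttpResource:
    """Hypothetical model mirroring the properties described on this page."""
    url: str
    relative_page_urls: list = field(default_factory=list)
    enabled: bool = True
    enforce_scope: bool = True
    description: str = ""


def plan_import(resources):
    """Map each enabled resource's root URL to its own list of start pages."""
    plan = {}
    for res in resources:
        if not res.enabled:
            continue  # disabled resources are skipped for this run
        if res.relative_page_urls:
            starts = [urljoin(res.url, rel) for rel in res.relative_page_urls]
        else:
            starts = [res.url]  # assumption: start at the root when no pages are listed
        plan[res.url] = starts
    return plan


resources = [
    HttpResource("https://example.com/docs/", ["intro.html"], description="Docs section"),
    HttpResource("https://example.org/", enabled=False, description="Temporarily excluded"),
    HttpResource("https://example.net/"),
]
plan = plan_import(resources)
```

Each entry in the resulting plan is computed from one resource alone, so disabling or changing one resource leaves the others unaffected.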

Properties

Name    Type    Description

See Also

Used By

Notification