class Aws::Kendra::Types::SeedUrlConfiguration
Provides the configuration information of the seed or starting point URLs to crawl.
*When selecting websites to index, you must adhere to the [Amazon Acceptable Use Policy][1] and all other Amazon terms. Remember that you must only use the Amazon Kendra web crawler to index your own webpages, or webpages that you have authorization to index.*

[1]: https://aws.amazon.com/aup/
@note When making an API call, you may pass SeedUrlConfiguration
  data as a hash:

      {
        seed_urls: ["SeedUrl"], # required
        web_crawler_mode: "HOST_ONLY", # accepts HOST_ONLY, SUBDOMAINS, EVERYTHING
      }
@!attribute [rw] seed_urls
The list of seed or starting point URLs of the websites you want to crawl. The list can include a maximum of 100 seed URLs.
@return [Array<String>]
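The hash form and the 100-URL limit described above can be checked locally before a request is sent. The following is a minimal sketch, not part of the SDK: `build_seed_url_configuration` is a hypothetical helper name, and the local validation is an assumption about what a caller might want, since the service performs its own checks.

```ruby
# Hypothetical helper (not an SDK method): builds the hash form of
# SeedUrlConfiguration and enforces the documented limits client-side.
def build_seed_url_configuration(seed_urls, web_crawler_mode: "HOST_ONLY")
  max_seed_urls = 100 # documented maximum number of seed URLs
  allowed_modes = %w[HOST_ONLY SUBDOMAINS EVERYTHING]

  if seed_urls.length > max_seed_urls
    raise ArgumentError, "at most #{max_seed_urls} seed URLs are allowed"
  end
  unless allowed_modes.include?(web_crawler_mode)
    raise ArgumentError, "unknown web_crawler_mode: #{web_crawler_mode}"
  end

  { seed_urls: seed_urls, web_crawler_mode: web_crawler_mode }
end

# Example: a single seed URL with the default crawl mode.
config = build_seed_url_configuration(["https://abc.example.com"])
```

The resulting hash has the same shape as the one shown in the `@note` above, so it could be passed wherever a `SeedUrlConfiguration` is expected.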
@!attribute [rw] web_crawler_mode
You can choose one of the following modes:

* `HOST_ONLY` – crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.
* `SUBDOMAINS` – crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.
* `EVERYTHING` – crawl the website host names with subdomains and other domains that the webpages link to.

The default mode is set to `HOST_ONLY`.
@return [String]
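To make the three modes concrete, the sketch below models which linked host names each mode would treat as in scope, using the examples from the descriptions above. It is only an illustration of the documented behavior: `host_in_scope?` is a hypothetical function, and the suffix-matching rule is an assumption, not how the Amazon Kendra crawler is actually implemented.

```ruby
# Hypothetical model of the documented crawl scopes (not SDK or service code).
# Returns true if a linked host would be crawled for the given seed host.
def host_in_scope?(seed_host, linked_host, mode)
  seed = seed_host.downcase
  linked = linked_host.downcase

  case mode
  when "HOST_ONLY"
    # Only the exact seed host name is crawled.
    linked == seed
  when "SUBDOMAINS"
    # The seed host and any host name nested under it are crawled.
    linked == seed || linked.end_with?(".#{seed}")
  when "EVERYTHING"
    # Any domain the webpages link to is crawled as well.
    true
  else
    raise ArgumentError, "unknown web_crawler_mode: #{mode}"
  end
end

host_in_scope?("abc.example.com", "a.abc.example.com", "HOST_ONLY")  # false
host_in_scope?("abc.example.com", "a.abc.example.com", "SUBDOMAINS") # true
```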
@see https://docs.aws.amazon.com/goto/WebAPI/kendra-2019-02-03/SeedUrlConfiguration AWS API Documentation
Constants
- SENSITIVE