Website

Browses a complete website to extract content and metadata

Detailed Explanation

Name:
- This field allows you to assign a specific name to your website data source, helping you identify it within your project.
- Example: You might name it "E-commerce Product Listings" if the data pertains to products listed on an e-commerce site.
Website URL:
- This field is for specifying the URL of the website you want to scrape or gather data from.
- Example: If your target website is https://www.example.com, you would enter this URL in the field.
URLs to Exclude:
- This optional field lists URLs to skip during scraping. Separate multiple URLs with commas.
- Example: To exclude the About and Contact pages, enter:
  https://www.example.com/about, https://www.example.com/contact
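
To illustrate how a comma-separated exclusion list might be applied, here is a minimal Python sketch. It is not the product's actual implementation: the function names are hypothetical, and exact matching (ignoring surrounding spaces and a trailing slash) is an assumption. The real scraper may match by prefix or pattern instead.

```python
def parse_exclusions(field_value: str) -> set[str]:
    """Split the comma-separated field value into a set of normalized URLs."""
    # Assumption: surrounding whitespace and a trailing slash are not significant.
    return {u.strip().rstrip("/") for u in field_value.split(",") if u.strip()}


def is_excluded(url: str, exclusions: set[str]) -> bool:
    """Return True if the URL exactly matches an excluded URL (assumed rule)."""
    return url.strip().rstrip("/") in exclusions


exclusions = parse_exclusions(
    "https://www.example.com/about, https://www.example.com/contact"
)

# Hypothetical pages discovered while browsing the site.
pages = [
    "https://www.example.com/",
    "https://www.example.com/about/",
    "https://www.example.com/products",
    "https://www.example.com/contact",
]

to_crawl = [p for p in pages if not is_excluded(p, exclusions)]
print(to_crawl)
```

In this sketch, only the home page and the products page remain in to_crawl.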
Chunk Size:
- This field sets the size of each chunk, in tokens or characters, that the scraped content is split into for processing. The default is 1024, but you can change it to suit your needs. Smaller chunks are easier to process individually, but the same content produces more of them.
- Example: If you are gathering a large volume of data, you might set the chunk size to 512 for easier processing.
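
The effect of the chunk size can be sketched in a few lines of Python. This is an illustration only: it assumes the size is counted in characters and that chunks do not overlap, neither of which is confirmed here, and chunk_text is a hypothetical helper rather than part of the product.

```python
def chunk_text(text: str, chunk_size: int = 1024) -> list[str]:
    """Split text into consecutive, non-overlapping chunks of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


sample = "a" * 2500  # stand-in for scraped page content

default_sizes = [len(c) for c in chunk_text(sample)]
smaller_sizes = [len(c) for c in chunk_text(sample, chunk_size=512)]

print(default_sizes)  # [1024, 1024, 452]
print(smaller_sizes)  # [512, 512, 512, 512, 452]
```

Halving the chunk size roughly doubles the number of chunks for the same content, which is the trade-off to keep in mind when adjusting this field.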
Cost Information:
- This section provides details about the cost associated with importing data from the specified website.
- Example: If it shows "Cost per seconds: 220 tokens" and "Remaining seconds: 203620 Seconds", then 220 tokens are charged for each second of data processed, and 203620 seconds remain.
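
As a rough worked example of the arithmetic, the Python sketch below estimates the token charge for a given processing time. It assumes the cost scales linearly with seconds at the displayed rate, which is an interpretation of the figures above and not a confirmed billing rule. The names are hypothetical.

```python
COST_PER_SECOND_TOKENS = 220   # from "Cost per seconds: 220 tokens"
REMAINING_SECONDS = 203620     # from "Remaining seconds: 203620 Seconds"


def estimate_cost(seconds: int, rate: int = COST_PER_SECOND_TOKENS) -> int:
    """Estimate tokens charged, assuming a flat per-second rate."""
    if seconds < 0:
        raise ValueError("seconds must not be negative")
    return seconds * rate


one_minute = estimate_cost(60)
all_remaining = estimate_cost(REMAINING_SECONDS)

print(one_minute)     # 13200
print(all_remaining)  # 44796400
```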


Last updated 6 months ago