PDF File

Enables the import and extraction of text from PDF files.

Detailed Explanation

Name:
- This field is for assigning a specific name to your PDF data source. It helps you identify the connector within your project.
- Example: You could name it "Annual Financial Report" if the PDF contains the year's financial reporting.
PDF File(s):
- In this section, you can upload one or more PDF files directly from your local system. The interface typically supports drag-and-drop functionality for convenience.
- Example: If you have a PDF file called 2023_Annual_Report.pdf, you can drag and drop it into this area or click to browse and select it.
PDF File URL(s) (optional):
- This optional field allows you to enter the URL of one or more PDF files if they are hosted online. This is useful if you prefer to link to a PDF rather than uploading it.
- Example: You might enter a link like https://example.com/2023_Annual_Report.pdf if the PDF is available on your company's website.
Chunk Size:
- This field sets the size of each chunk of data, measured in tokens or characters. The default value is 1024, but you can change it to suit your needs. Smaller chunks can make large PDFs easier to manage during processing or analysis.
- Example: If you are processing a very large document, you might set the chunk size to 512 for smoother handling.
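
To illustrate what the chunk size controls, here is a minimal sketch that splits extracted text into fixed-size pieces. It assumes the size is counted in characters and that chunks do not overlap; the connector's actual splitting logic is not documented here, and the function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 1024) -> list[str]:
    """Split text into consecutive pieces of at most chunk_size characters.

    Assumption: size is measured in characters with no overlap between
    chunks. The real connector may count tokens or split differently.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


# Stand-in for text extracted from a PDF (2,500 characters).
sample = "x" * 2500

default_chunks = chunk_text(sample)                # default size of 1024
small_chunks = chunk_text(sample, chunk_size=512)  # smaller chunks

print(len(default_chunks), len(small_chunks))
```

Under these assumptions, lowering the chunk size from 1024 to 512 produces more, smaller chunks from the same text.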
Cost Information:
- The connector displays the cost of importing words from the specified PDF.
- Example: If it shows "Cost per words: 7 tokens" and "Remaining words: 6399701 Words," each word processed is charged 7 tokens, and 6,399,701 words remain available for processing.
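
The arithmetic behind these figures can be sketched as follows. This is an illustration only: it assumes words are counted by splitting on whitespace and that the remaining-word balance decreases by the number of words imported, neither of which is confirmed by the connector. The function name `estimate_import` and the 12,000-word document are hypothetical.

```python
def estimate_import(word_count: int, cost_per_word: int, remaining_words: int) -> dict:
    """Estimate the token charge and the remaining-word balance after an import.

    Assumption: the charge is word_count * cost_per_word, and the balance
    drops by one for each imported word. The real billing rules may differ.
    """
    if word_count > remaining_words:
        raise ValueError("document exceeds the remaining word balance")
    return {
        "tokens_charged": word_count * cost_per_word,
        "remaining_after": remaining_words - word_count,
    }


# Figures from the example above, applied to a hypothetical 12,000-word PDF.
estimate = estimate_import(word_count=12_000, cost_per_word=7, remaining_words=6_399_701)
print(estimate)

# A word count can be approximated from extracted text by splitting on whitespace.
approx_words = len("Annual report for fiscal year 2023".split())
print(approx_words)
```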

