Mistral OCR Loader

To complete the Mistral OCR Loader form, follow the steps below.

Step-by-Step Instructions:

1. Name (Required)

  • Enter a name for your data source in the text box

  • This helps you identify this data later

  • Example: "My Documents" or "Invoice Processing"

2. Files to Load (Required)

  • Click on the upload area or drag and drop your files

  • You can upload multiple files at once

  • Supported formats: PDF, JPG, JPEG, PNG, DOCX, PPTX, WEBP

  • The system will use OCR (Optical Character Recognition) to extract text from these files
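
If you are preparing a large batch, it can help to check file extensions before uploading. The following is a minimal sketch, not part of the loader itself: the function name `split_by_support` is hypothetical, and the extension set is copied from the supported formats listed above.

```python
from pathlib import Path

# Extensions copied from the "Supported formats" list in this step.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".docx", ".pptx", ".webp"}


def split_by_support(paths):
    """Return (supported, unsupported) file names, judged by extension only."""
    supported, unsupported = [], []
    for path in map(Path, paths):
        # Compare case-insensitively so "scan.PDF" is treated like "scan.pdf".
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path.name)
        else:
            unsupported.append(path.name)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_support(["invoice.PDF", "notes.txt", "slide.pptx"])
    print("Upload:", ok)
    print("Skip:", skipped)
```

This only inspects the file name; it does not confirm that the file contents are valid or readable by OCR.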

3. Select Embedding Model

  • The default "OpenAI - Text Embedding 3 Small" is already selected

  • You can change this if needed by clicking the dropdown menu

  • This model converts the extracted text into embeddings (numeric vectors) so the AI can search and work with your content

4. Metadata (Optional)

  • Add any additional information about your data source

  • This helps the AI better understand and work with your files

  • You can leave this empty if you don't have specific metadata to add

5. Text Splitter (Optional)

  • Leave the default "markdown" setting unless you have specific requirements

  • This determines how the text will be divided for processing
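
To give a rough idea of what dividing text "by markdown" can mean, here is an illustrative sketch that starts a new chunk at each heading. It is an assumption for explanation only: the loader's actual splitting rules are not described here, and `split_markdown_by_heading` is a hypothetical name.

```python
import re


def split_markdown_by_heading(text):
    """Split markdown text into chunks, starting a new chunk at each heading line."""
    chunks, current = [], []
    for line in text.splitlines():
        # A heading is 1 to 6 '#' characters followed by whitespace.
        if re.match(r"#{1,6}\s", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    # Drop chunks that ended up empty (for example, leading blank lines).
    return [chunk for chunk in chunks if chunk]


if __name__ == "__main__":
    sample = "# Invoice\nTotal: 120\n## Items\nPaper, ink"
    for chunk in split_markdown_by_heading(sample):
        print(repr(chunk))
```

Splitting on headings tends to keep each section's text together, which is one reason a markdown-aware splitter is a reasonable default for OCR output.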

6. Final Step

  • Click the "Save" button at the bottom to process your files

Important Notes:

  • ⚠️ Cost Warning: Processing costs 500 tokens per page

  • Make sure to fill in the required fields (marked with red warning icons)

  • The system will extract text from images and documents using OCR technology
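
To budget for the cost noted above, you can estimate token usage from page counts. This is a simple arithmetic sketch using the stated rate of 500 tokens per page; the function name is hypothetical, and you supply the page count of each file yourself.

```python
TOKENS_PER_PAGE = 500  # rate stated in the cost warning above


def estimate_tokens(page_counts):
    """Estimate total tokens for a batch, given the page count of each file."""
    return TOKENS_PER_PAGE * sum(page_counts)


if __name__ == "__main__":
    # Three files with 12, 3, and 1 pages: 16 pages in total.
    print(estimate_tokens([12, 3, 1]))  # 16 * 500 = 8000
```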

Once you complete these steps and click "Save," your files will be processed and ready to use with the AI system!
