Azure Document Loader

How to Complete the Azure Document Loader Form

To complete the Azure Document Loader form, follow the steps below:

Step-by-Step Instructions:

1. Name (Required)

  • Enter a name for your data source in the text box

  • This helps you identify your documents later

  • Example: "Business Documents," "Invoice Collection," or "Research Files"

  • The placeholder shows "My data source" as an example

  • ⚠️ This field is required (red warning shown)

2. File(s) (Required)

  • Drag your files onto the upload area labeled "Drop your file here"

  • Or click "Click to browse or drag files here" to select files from your computer

  • You can upload multiple files at once

  • Supported file types:

    • Documents: PDF, DOCX

    • Images: JPEG, JPG, PNG, BMP, TIFF, HEIF

    • Spreadsheets: XLSX

    • Presentations: PPTX

    • Web: HTML

  • ⚠️ This field is required (red warning shown)
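Before uploading a large batch, it can help to check locally which files match the supported types. The sketch below is illustrative only: the extension set is copied from the list above, the function name `split_by_support` is hypothetical, and the form itself may accept or reject files by rules not described here (for example, whether `.tif` is treated like `.tiff`).

```python
from pathlib import Path

# Extensions taken from the supported-types list in this guide.
# Assumption: the loader matches on these exact extensions, case-insensitively.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx",                                     # documents
    ".jpeg", ".jpg", ".png", ".bmp", ".tiff", ".heif",   # images
    ".xlsx",                                             # spreadsheets
    ".pptx",                                             # presentations
    ".html",                                             # web
}


def split_by_support(filenames):
    """Return (supported, unsupported) lists based on file extension only."""
    supported, unsupported = [], []
    for name in filenames:
        ext = Path(name).suffix.lower()
        if ext in SUPPORTED_EXTENSIONS:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_support(["report.PDF", "notes.txt", "scan.heif"])
    print("upload:", ok)
    print("skip:", skipped)
```

This checks names only; it does not open the files or verify that their contents match the extension.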

3. Chunk Size

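  • Set the maximum size of each chunk, in tokens or characters

  • Default value is 1024 (recommended for most cases)

  • You can change this number if needed

  • This controls how your document content is divided for processing: smaller chunks give more precise matches with less surrounding context, while larger chunks keep more context in each piece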
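To get a feel for what a chunk size of 1024 means, here is a minimal sketch of fixed-size chunking. It is an assumption-laden illustration, not the loader's actual method: it counts whitespace-separated words as a stand-in for tokens, and the function name `chunk_words` is hypothetical. The real loader may count tokens or characters differently.

```python
def chunk_words(text, chunk_size=1024):
    """Split text into consecutive chunks of at most `chunk_size` words.

    Assumption: one whitespace-separated word stands in for one unit of
    chunk size. The loader's real tokenization is not documented here.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    words = text.split()
    return [
        " ".join(words[start:start + chunk_size])
        for start in range(0, len(words), chunk_size)
    ]


if __name__ == "__main__":
    sample = "word " * 2500
    chunks = chunk_words(sample, chunk_size=1024)
    print(len(chunks), "chunks")
```

Under these assumptions, a 2,500-word document produces three chunks at the default size: two full chunks and one shorter final chunk.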

4. Select Embedding Model

  • Default "OpenAI - Text Embedding 3 Small" is already selected

  • You can change this by clicking the dropdown if needed

  • The embedding model converts each chunk of your documents into a numeric vector so the content can be searched by meaning

5. Review Cost Warning

  • ⚠️ Important: Check the yellow warning box

  • Data import costs 7 tokens per word

  • This is higher than other loaders, so check the total word count of your files before importing
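The cost rule above is simple arithmetic: words multiplied by 7. The following sketch estimates the import cost of a piece of text before you upload it. The rate comes from the warning box; the function name `estimate_import_tokens` and the whitespace-based word count are assumptions, so the platform's own count may differ, especially for scanned images where the word count is only known after OCR.

```python
TOKENS_PER_WORD = 7  # rate stated in the form's yellow warning box


def estimate_import_tokens(text, tokens_per_word=TOKENS_PER_WORD):
    """Estimate import cost in tokens from a whitespace-based word count.

    Assumption: a "word" is any whitespace-separated run of characters.
    """
    word_count = len(text.split())
    return word_count * tokens_per_word


if __name__ == "__main__":
    document = "word " * 10_000  # stand-in for a 10,000-word document
    print(estimate_import_tokens(document))
```

For example, a 10,000-word document would cost roughly 10,000 × 7 = 70,000 tokens under this estimate.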

6. Final Step

  • Click the "Save" button at the bottom

  • Your files will be processed using Azure Document Intelligence

  • Text is extracted, using OCR for images and scanned pages, and made available for the AI to work with

Key Features:

  • Advanced OCR: Uses Azure Document Intelligence, Microsoft's document processing service, to extract text from images and scanned documents

  • Multiple Formats: Handles more file types than basic loaders, including images, documents, spreadsheets, and presentations

  • Smart Processing: Can handle complex document layouts and structures

Simple Summary:

  1. Give your document collection a name

  2. Upload your files (many formats supported)

  3. Adjust chunk size if needed (or keep default 1024)

  4. Keep the default embedding model (or change if needed)

  5. Click Save

  6. Done!

Your documents are processed with Azure Document Intelligence and converted to searchable text that the AI can work with.
