RAG With Azure Document Intelligence

A common challenge in creating document-based chatbots is ensuring the AI receives accurate, structured data before it generates a response. Sending raw, complex documents like invoices or receipts directly to a large language model (LLM) like GPT often leads to incorrect or incomplete answers.
The solution is to implement a Retrieval-Augmented Generation (RAG) pattern using Azure Document Intelligence with Azure Prompt Flow. The process first uses a specialized prebuilt model to extract clean key-value data, such as the merchant name, customer information, and totals, from the document. This structured data is then sent to GPT-4 for summarization and query answering. This two-step approach ensures the chatbot delivers precise and reliable answers, even from documents with complex layouts.
Why Azure Document Intelligence?
Azure Document Intelligence (formerly Form Recognizer) is the predictive AI engine that powers this solution. It is designed to:
- Read and extract information from structured documents, including invoices, receipts, government IDs, and business cards.
- Leverage prebuilt models for common document types or create custom models tailored to unique business needs.
- Return extracted text and field data as structured key-value pairs with high accuracy.
Examples (a minimal SDK sketch follows this list):
- From a government ID: Country, date of birth, expiry date, document number, etc.
- From a business card: Company name, job title, contact info, address + bounding box positions.
- From a receipt: Merchant name, date, items, subtotal, tax, and total amount.
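To make the key-value idea concrete, here is a minimal sketch (not taken from this setup) that asks the prebuilt receipt model for two of the fields listed above. The endpoint, key, and image URL are placeholders you would replace with your own values.

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest

# Placeholder credentials -- use the values from your own resource.
client = DocumentIntelligenceClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

# Analyze a receipt image by URL with the prebuilt receipt model.
poller = client.begin_analyze_document(
    "prebuilt-receipt", AnalyzeDocumentRequest(url_source="https://<your-storage>/receipt.jpg")
)
result = poller.result()

for receipt in result.documents:
    merchant = receipt.fields.get("MerchantName")
    total = receipt.fields.get("Total")
    if merchant:
        print("Merchant:", merchant.get("content"))
    if total:
        print("Total:", total.get("content"))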
Key Components in This Setup
- Predictive AI: Azure Document Intelligence (Prebuilt models)
- Generative AI: GPT-4 (Azure OpenAI)
How the Prompt Flow Works
The chatbot architecture is straightforward and efficient (a plain-Python sketch of the same pipeline follows this outline):
- User Input:
  - User Query: The question to be answered (e.g., “Who is this invoice for and what’s the total?”).
  - Document Link: A URL pointing to the document image (JPG or PNG).
- Document Processing:
  - The document is sent to an Azure Document Intelligence prebuilt model (e.g., Invoice Analysis).
  - The model analyzes the document and returns structured key-value data.
- Answer Generation:
  - The extracted data is passed to GPT-4 within Azure OpenAI.
  - GPT-4 uses this clean data to generate a concise and accurate answer to the user’s query.
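Before building this in Prompt Flow, the same two-step pattern can be sketched in plain Python. This is only an illustration of the flow above, not the implementation used later in the article; the endpoints, keys, API version, and deployment name are placeholders.

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest
from openai import AzureOpenAI


def extract_invoice_fields(document_url: str) -> str:
    """Step 1: pull a few key-value pairs out of the invoice with a prebuilt model."""
    client = DocumentIntelligenceClient(
        endpoint="https://<doc-intel-resource>.cognitiveservices.azure.com/",
        credential=AzureKeyCredential("<doc-intel-key>"),
    )
    poller = client.begin_analyze_document(
        "prebuilt-invoice", AnalyzeDocumentRequest(url_source=document_url)
    )
    result = poller.result()
    lines = []
    for invoice in result.documents:
        for name in ("VendorName", "CustomerName", "InvoiceTotal"):
            field = invoice.fields.get(name)
            if field:
                lines.append(f"{name}: {field.get('content')}")
    return "\n".join(lines)


def answer_query(user_query: str, document_url: str) -> str:
    """Step 2: hand the extracted fields to a GPT-4 deployment and let it answer."""
    extracted = extract_invoice_fields(document_url)
    openai_client = AzureOpenAI(
        azure_endpoint="https://<aoai-resource>.openai.azure.com/",
        api_key="<aoai-key>",
        api_version="2024-02-01",
    )
    response = openai_client.chat.completions.create(
        model="<gpt-4-deployment-name>",  # your Azure OpenAI deployment name
        messages=[
            {"role": "system", "content": "Answer the user's question using only the extracted invoice data provided."},
            {"role": "user", "content": f"user query: {user_query}\ndocument extracted information:\n{extracted}"},
        ],
    )
    return response.choices[0].message.content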

Exploring Azure Document Intelligence Studio
- Sign in to the Azure Portal.
- Search for Document Intelligence and click Create.
- Select your subscription, resource group, region, and provide a resource name.
- Choose a pricing tier (the Free (F0) tier is excellent for testing).
- Click Review + Create to deploy the resource.

After deployment, navigate to your resource’s Keys and Endpoint section to get your credentials. From the overview page, click Go to Document Intelligence Studio to open the testing portal.
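A small aside, and a convention rather than something this walkthrough requires: instead of pasting the key into code, you can export the two values from the Keys and Endpoint blade as environment variables and read them at runtime. The variable names below are arbitrary.

import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient

# Arbitrary variable names -- set them in your shell or compute environment first.
endpoint = os.environ["DOCUMENT_INTELLIGENCE_ENDPOINT"]
key = os.environ["DOCUMENT_INTELLIGENCE_KEY"]

client = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))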

The Studio offers two main pathways:
- Prebuilt Models: For common document types (invoices, receipts, IDs, etc.).
- Custom Models: Trained on your own specific document layouts.
Testing is simple (the SDK equivalent is sketched after these steps):
- Select a prebuilt model (e.g., Invoices).
- Drag & drop a sample invoice or use a provided example.
- Click Run analysis.
- Review the extracted key-value pairs and the bounding boxes that show where the data was found on the document.
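If you prefer code over the Studio UI, the same inspection can be done with the SDK. The sketch below uses the same API as the flow code later in the article, but the resource name and sample URL are assumptions; it dumps every field the prebuilt invoice model finds, together with its confidence score, roughly what the Studio shows in its results pane.

from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest

client = DocumentIntelligenceClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

poller = client.begin_analyze_document(
    "prebuilt-invoice",
    AnalyzeDocumentRequest(url_source="https://<your-storage>/sample-invoice.jpg"),
)
result = poller.result()

# Print every extracted field with its value and the model's confidence score.
for document in result.documents:
    for name, field in document.fields.items():
        print(f"{name}: {field.get('content')} (confidence: {field.get('confidence')})")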

Implementation: Setting Up the Prompt Flow in Azure AI Foundry
- Open Azure AI Foundry and create or navigate to a project.
- Connect an Azure OpenAI Resource: Ensure you have a GPT-4 model deployed and create a connection to it within your project using the primary key.
Creating the Flow
- Inside your project, go to Prompt Flow (under Tools) and create a new Standard Flow.
- Define the flow inputs:
  - user_query (string): The user’s question.
  - document_link (string): The URL of the document image.
Adding the Data Extraction Node (Python Tool)
Create a Python tool named document_intelligence. This tool uses the Azure Document Intelligence SDK to call the prebuilt-invoice model and extract data.
Key Code Steps:
- Import the azure.ai.documentintelligence SDK.
- Configure the client with your endpoint and key.
- Call the begin_analyze_document method with the prebuilt-invoice model and the document URL.
- Parse the result, extracting relevant fields (VendorName, CustomerName, InvoiceTotal, etc.).
- Return all extracted data as a formatted string.
from promptflow import tool
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest

endpoint = "YOUR_ENDPOINT"
key = "YOUR_PRIMARY_KEY"

# Invoice fields to read from the analysis result, mapped to the labels used in the output string.
INVOICE_FIELDS = {
    "VendorName": "vendor name",
    "VendorAddress": "vendor address",
    "VendorAddressRecipient": "vendor address recipient",
    "CustomerName": "customer name",
    "CustomerId": "customer id",
    "CustomerAddress": "customer address",
    "CustomerAddressRecipient": "customer address recipient",
    "InvoiceId": "invoice id",
    "InvoiceDate": "invoice date",
    "InvoiceTotal": "invoice total",
    "DueDate": "due date",
    "PurchaseOrder": "purchase order",
    "BillingAddress": "billing address",
    "BillingAddressRecipient": "billing address recipient",
    "ShippingAddress": "shipping address",
    "ShippingAddressRecipient": "shipping address recipient",
    "SubTotal": "subtotal",
    "TotalTax": "total tax",
    "PreviousUnpaidBalance": "previous unpaid balance",
    "AmountDue": "amount due",
    "ServiceStartDate": "service start date",
    "ServiceEndDate": "service end date",
    "ServiceAddress": "service address",
    "ServiceAddressRecipient": "service address recipient",
    "RemittanceAddress": "remittance address",
    "RemittanceAddressRecipient": "remittance address recipient",
}

# The inputs section of the tool is derived from the function's arguments after you save the code;
# typing the arguments and the return value helps Prompt Flow display them properly.
@tool
def document_intelligence(url: str) -> str:
    final_response = ""
    document_intelligence_client = DocumentIntelligenceClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
    # Analyze the document at the given URL with the prebuilt invoice model.
    poller = document_intelligence_client.begin_analyze_document(
        "prebuilt-invoice", AnalyzeDocumentRequest(url_source=url)
    )
    invoices = poller.result()
    for idx, invoice in enumerate(invoices.documents):
        print("--------Recognizing invoice #{}--------".format(idx + 1))
        # Append every field the model managed to extract as a "label: value" line.
        for field_name, label in INVOICE_FIELDS.items():
            field = invoice.fields.get(field_name)
            if field:
                final_response += f"{label}: {field.get('content')}\n"
    return final_response
Installing the SDK: The flow runs in a containerized environment, so add azure-ai-documentintelligence to the flow’s requirements.txt file and rebuild the environment so the package is installed.
Adding the Summarization Node (LLM Tool)
Create an LLM node named summarization.
Configuration:
- Model: Your deployed GPT-4 model.
- Connection: Your Azure OpenAI connection.
- Prompt:
#system:
You are a helpful AI assistant made to behave as a document chatbot that could
chat upon the content contained in documents like invoices, forms etc.
Prior to calling you, the analyse API of azure document intelligence was called
to extract information from a document and you will be provided with those
sets of information such as subtotal value, merchant address etc.
Your work is to answer the user query based upon this information and
if you don't have sufficient information to answer the user query, please
respond by saying: "I don't have suitable information to answer your query".
#user:
user query: {{question}}
document extracted information: {{doc_information}}
Inputs mapping:
- doc_information → output of the document_intelligence step
- question → inputs.user_query
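Prompt Flow prompts are Jinja2 templates, so if you want to sanity-check how the two placeholders resolve, you can render the template locally. The snippet below is only such a check with made-up values, not part of the flow itself.

from jinja2 import Template

prompt = (
    "user query: {{question}}\n"
    "document extracted information: {{doc_information}}"
)

# Made-up illustrative values, only to show how the placeholders are filled in.
rendered = Template(prompt).render(
    question="What is the invoice total?",
    doc_information="vendor name: Contoso\ninvoice total: 610.00",
)
print(rendered)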

Testing the Chatbot
Example Test:
- Input document_link: https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/invoice_sample.jpg
- Input user_query: “What is the total cost value, the customer name, and the merchant name in the invoice?”
The flow will process the invoice, extract the data, and GPT-4 will generate a clear answer based on that specific data.

Conclusion
The core principle is clear: for accurate document-based chatbots, data preprocessing is non-negotiable. Feeding raw documents to an LLM invites error. Azure Document Intelligence solves this by reliably extracting structured information from unstructured documents, and Azure Prompt Flow then orchestrates the process, passing this clean data to GPT-4 to generate precise, well-grounded answers.
The best part is the ease of exploration: Azure Document Intelligence Studio allows you to prototype and validate the entire data extraction step through a simple drag-and-drop interface, long before you write a single line of code. This combination provides a robust, scalable, and accurate solution for intelligent document processing.