Using LLM to describe images into .docx files by carlodek · Pull Request #1435 · microsoft/markitdown

carlodek · 2025-10-01T11:00:06Z

LLM-generated image descriptions for .docx documents

What I've done:

Modified converter_utils/docx/pre_process.py to detect images and insert the LLM-generated description where each image appears in the document.
Moved the _llm_caption file into the main folder, since the pre_process file now uses it as well.
Added a test document containing an image, docx_with_image_test.docx, to the test_files folder.
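
To illustrate the image-detection step described above, here is a minimal standard-library sketch. It is not the code in this PR: the function names `list_docx_images` and `describe_images` and the `caption_fn` callback are hypothetical, and it assumes images are embedded via `a:blip` elements whose relationship targets are relative to the `word/` folder.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by Office Open XML packages.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def list_docx_images(docx_bytes):
    """Return (relationship id, image bytes) pairs in document order.

    Hypothetical helper: assumes relationship targets are relative to word/.
    """
    images = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        names = set(zf.namelist())
        rels = ET.fromstring(zf.read("word/_rels/document.xml.rels"))
        targets = {
            rel.get("Id"): rel.get("Target")
            for rel in rels.iter(f"{{{REL_NS}}}Relationship")
        }
        doc = ET.fromstring(zf.read("word/document.xml"))
        # Each embedded picture is referenced by an a:blip element.
        for blip in doc.iter(f"{{{A_NS}}}blip"):
            rid = blip.get(f"{{{R_NS}}}embed")
            target = targets.get(rid)
            if target is None:
                continue
            path = "word/" + target
            if path not in names:  # skip external or unusual targets
                continue
            images.append((rid, zf.read(path)))
    return images


def describe_images(docx_bytes, caption_fn):
    """Map each image's relationship id to the text caption_fn returns for it."""
    return {rid: caption_fn(data) for rid, data in list_docx_images(docx_bytes)}
```

Here `caption_fn` stands in for whatever produces the description (in this PR, the LLM caption helper); the resulting mapping could then be used to place each description at the matching image reference.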

How to test:

I tested it with AzureOpenAI; here is a code snippet:

from packages.markitdown.src.markitdown import MarkItDown
from openai import AzureOpenAI

if __name__ == "__main__":
    AZURE_OPEN_AI_ENDPOINT = "<your_endpoint>"
    AZURE_OPEN_AI_DEPLOYMENT = "<your_deployment>"
    AZURE_OPEN_AI_KEY = "<your_api_key>"
    AZURE_OPEN_AI_API_VERSION = "<your_version>"
    file_path = "tests/test_files/docx_with_image_test.docx"
    client = AzureOpenAI(
        azure_endpoint=AZURE_OPEN_AI_ENDPOINT,
        api_key=AZURE_OPEN_AI_KEY,
        api_version=AZURE_OPEN_AI_API_VERSION
    )
    md = MarkItDown(llm_client=client, llm_model=AZURE_OPEN_AI_DEPLOYMENT, llm_prompt="Please describe the image")
    result = md.convert(file_path)
    print(result.markdown)
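
For trying the flow without Azure credentials, a stub client may be enough. The sketch below is an assumption, not part of this PR: `StubLLMClient` is a hypothetical class, and it assumes the caption helper only calls `client.chat.completions.create(model=..., messages=...)` and reads `choices[0].message.content`, as the OpenAI SDK exposes.

```python
from types import SimpleNamespace


class StubLLMClient:
    """Hypothetical offline stand-in that mimics chat.completions.create."""

    def __init__(self, reply="A placeholder image description."):
        self.reply = reply
        self.calls = []  # records every request for later inspection
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, model, messages, **kwargs):
        # Record the call and return an object shaped like an SDK response.
        self.calls.append({"model": model, "messages": messages})
        message = SimpleNamespace(content=self.reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


# Assumed usage (not verified against this PR):
#   md = MarkItDown(llm_client=StubLLMClient(), llm_model="stub")
```

If the assumption holds, every described image would receive the fixed reply, which makes it easy to check where descriptions land in the output.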


carloMobilesoft and others added 3 commits September 30, 2025 17:13

TODO: test with other docs

6310917

moved llm caption outside converters as it will be used from pre_process too

fe8a2b9

Merge branch 'main' into main

7bb883a

carlodek wants to merge 3 commits into microsoft:main from carlodek:main
