feat: Update vendored llama.cpp to include ModernBERT support #2144

@clemlesne

Context

The current release (v0.3.16, Aug 15 2025) vendors llama.cpp at commit 4227c9b (Aug 14, 2025). ModernBERT architecture support was added to llama.cpp on Dec 22, 2025 in PR ggml-org/llama.cpp#15641 (src/models/modern-bert.cpp). The vendor submodule hasn't been updated in ~7 months.

Problem

Loading a ModernBERT GGUF model fails with:

llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'modernbert'

The binary only knows bert, falcon, gemma, gpt2, llama, mamba, qwen, rwkv, and stablelm; there is no modernbert.
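To confirm that a given file really declares the `modernbert` architecture before blaming the loader, the `general.architecture` key can be read straight from the GGUF header. The following is a minimal sketch, assuming a little-endian GGUF v2 or later file; the function name `read_gguf_architecture` and its helpers are hypothetical and not part of llama-cpp-python.

```python
import io
import struct

# GGUF scalar value types mapped to struct format characters.
_SCALARS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "I", 5: "i",
            6: "f", 7: "?", 10: "Q", 11: "q", 12: "d"}
_STRING, _ARRAY = 8, 9


def _unpack(stream, fmt):
    """Read one little-endian value of the given struct format."""
    size = struct.calcsize("<" + fmt)
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of GGUF header")
    return struct.unpack("<" + fmt, data)[0]


def _read_string(stream):
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _unpack(stream, "Q")
    data = stream.read(length)
    if len(data) != length:
        raise ValueError("unexpected end of GGUF header")
    return data.decode("utf-8")


def _read_value(stream, value_type):
    """Read one metadata value, recursing into arrays."""
    if value_type in _SCALARS:
        return _unpack(stream, _SCALARS[value_type])
    if value_type == _STRING:
        return _read_string(stream)
    if value_type == _ARRAY:
        item_type = _unpack(stream, "I")
        count = _unpack(stream, "Q")
        return [_read_value(stream, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {value_type}")


def read_gguf_architecture(stream):
    """Return the 'general.architecture' string, or None if it is absent."""
    if stream.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _unpack(stream, "I")
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit lengths and is not handled here")
    _unpack(stream, "Q")             # tensor count, unused in this sketch
    kv_count = _unpack(stream, "Q")  # number of metadata key/value pairs
    for _ in range(kv_count):
        key = _read_string(stream)
        value = _read_value(stream, _unpack(stream, "I"))
        if key == "general.architecture":
            return value
    return None
```

Calling it on a file opened in binary mode, for example `read_gguf_architecture(open(path, "rb"))`, should return `modernbert` for the models listed in this issue if the conversion wrote the key as expected.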

Attempted fix

We tried updating the vendor submodule to llama.cpp b8373 (Mar 16, 2026) and building from source. Results:

  1. CMake fails with LLAVA_BUILD=ON (the default): the set_target_properties call at vendor/llama.cpp/tools/mtmd/CMakeLists.txt:37 errors out because it is incompatible with the binding's CMakeLists.txt.
  2. CMake succeeds with -DLLAVA_BUILD=OFF: the package builds and installs fine.
  3. Runtime ABI mismatch: the Python bindings call llama_get_kv_self, which no longer exists in b8373 (symbol not found). The C API has diverged significantly in 7 months.

So it's not just a submodule bump: the Python ctypes bindings need updates to match the current llama.cpp C API.
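One way to scope the binding work is to list which expected C symbols are missing from a freshly built library. This is a minimal sketch: `missing_symbols` and `EXPECTED_SYMBOLS` are hypothetical names, and the list only contains the one function mentioned above, whereas the real binding references many more.

```python
import ctypes

# Illustrative subset only; extend with the functions the binding declares.
EXPECTED_SYMBOLS = ["llama_get_kv_self"]


def missing_symbols(lib, names):
    """Return the names that cannot be resolved on the loaded library.

    ctypes.CDLL resolves functions lazily on attribute access and raises
    AttributeError for an unknown symbol, so hasattr() works as a probe.
    """
    return [name for name in names if not hasattr(lib, name)]


# Example (the library path is an assumption, adjust it to your build):
# lib = ctypes.CDLL("/path/to/libllama.so")
# print(missing_symbols(lib, EXPECTED_SYMBOLS))
```

Running this against the b8373 build should report `llama_get_kv_self` as missing, in line with the runtime error described in item 3.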

Why it matters

Several recent high-quality embedding models use the ModernBERT architecture:

  • granite-embedding-small-english-r2 (IBM, 47M params, 384d, BEIR NDCG@10 50.9, Apache 2.0)
  • modernbert-embed-base (Nomic, 149M params, Apache 2.0)
  • gte-modernbert-base (Alibaba, 149M params)

These models can't be used with llama-cpp-python until the binding is updated.

Request

Update the Python ctypes bindings and CMakeLists.txt to be compatible with a recent llama.cpp build that includes ModernBERT support (post Dec 22, 2025).
