
[OrcJIT] Add LLVM ORC JIT v2 dynamic object loader addon #254

Merged
cyx-6 merged 32 commits into apache:main from yaoyaoding:orcjit
Mar 27, 2026

Conversation

@yaoyaoding
Contributor

This PR introduces a new addon package tvm-ffi-orcjit that enables dynamic loading of TVM-FFI exported object files (.o) at runtime using LLVM's ORC JIT v2 engine.

The addon provides a Python API for loading compiled object files and calling the tvm-ffi functions defined in them.

The API is organized around three main concepts:

  • ExecutionSession: A JIT compilation context that manages the lifetime of dynamic libraries.
  • DynamicLibrary: Represents a shared library that links multiple object files.
  • Object Files: Compiled .o files containing TVM-FFI exported functions.

Usage Example

from tvm_ffi_orcjit import create_session

# Create an execution session (JIT compilation context)
session = create_session()

# Load an object file into the session, returns a DynamicLibrary handle
dylib = session.load("example.o")

# Get and call a function from the loaded object file
add_func = dylib.get_function("simple_add")
result = add_func(1, 2)
print(f"Result: {result}")  # Output: Result: 3

For incremental loading, you can add multiple object files to the same session:

session = create_session()

# Load first object file
dylib = session.load("math_ops.o")
add = dylib.get_function("simple_add")

# Incrementally load more object files into the same library
dylib.load("string_ops.o")

# Now both functions are accessible from the same library
concat = dylib.get_function("string_concat")

See the tests for more examples.

TODO

  • Support Windows platform (currently Linux and macOS only)
  • Customize memory allocation for dynamic libraries (allow user-provided allocators)

@gemini-code-assist
Contributor

Summary of Changes

Hello @yaoyaoding, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant new capability to TVM-FFI by integrating LLVM's ORC JIT v2. This integration enables the dynamic loading and execution of C++ object files directly from Python, offering a flexible and efficient way to extend TVM-FFI's functionality at runtime. The new addon provides a clear Python API for managing JIT compilation contexts and dynamic libraries, making it easier for developers to incorporate custom C++ logic without requiring a full recompile of their applications.

Highlights

  • New Addon Package: Introduces tvm-ffi-orcjit, a new addon package designed to enable dynamic loading of TVM-FFI exported object files (.o) at runtime.
  • LLVM ORC JIT v2 Integration: Leverages LLVM's ORC JIT v2 engine for Just-In-Time compilation and execution of C++ code, providing a powerful and flexible runtime environment.
  • Python API: Provides a Pythonic API centered around ExecutionSession (JIT compilation context) and DynamicLibrary (manages object files and symbols) for seamless interaction.
  • Incremental Loading: Supports incremental loading, allowing multiple object files to be added to the same DynamicLibrary instance, with functions from all loaded files remaining accessible.
  • Comprehensive Examples and Tests: Includes a quick-start example demonstrating basic usage and a robust test suite covering core functionalities, including symbol resolution and conflict handling.
Ignored Files
  • Ignored by pattern: .github/workflows/** (3)
    • .github/workflows/orcjit-publish.yml
    • .github/workflows/orcjit-tests.yml
    • .github/workflows/tvm-ffi-orcjit/ci_test.yml

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new addon package, tvm-ffi-orcjit, to enable dynamic loading of object files using LLVM's ORC JIT v2. The overall design is robust, with a clear separation between the C++ backend and the Python frontend. The implementation demonstrates a deep understanding of LLVM's JIT capabilities, including a clever workaround for the __dso_handle issue. The tests are comprehensive and cover important scenarios like symbol conflicts.

I have identified a few areas for improvement, primarily concerning inconsistencies in documentation, build scripts, and a potential bug in the CMake configuration. There is also a mismatch between a Python API and its C++ backend. Addressing these points will enhance the quality and usability of this new addon.

target_link_libraries(
tvm_ffi_orcjit
PUBLIC tvm_ffi
PRIVATE LLVM
Contributor

high

The find_package(LLVM ... CONFIG) command does not create a single LLVM target. Instead, llvm_map_components_to_libnames populates the LLVM_LIBS variable with a list of component library targets. You should link against ${LLVM_LIBS}.

  PRIVATE ${LLVM_LIBS}

Comment on lines +83 to +105
def link_against(self, *libraries: DynamicLibrary) -> None:
"""Link this library against other dynamic libraries.

Sets the search order for symbol resolution. Symbols not found in this library
will be searched in the linked libraries in the order specified.

Parameters
----------
*libraries : DynamicLibrary
One or more dynamic libraries to link against.

Examples
--------
>>> session = create_session()
>>> lib_utils = session.create_library()
>>> lib_utils.add("utils.o")
>>> lib_main = session.create_library()
>>> lib_main.add("main.o")
>>> lib_main.link_against(lib_utils) # main can call utils symbols

"""
handles = [lib._handle for lib in libraries]
self._link_func(self._handle, *handles)
Contributor

high

The link_against method accepts *libraries, which implies it can link against multiple libraries. However, the C++ backend function orcjit.DynamicLibraryLinkAgainst only supports linking against a single library at a time. This will cause a TypeError if more than one library is passed.

To fix this, you should either update the Python method to accept only one library or modify the C++ backend to handle multiple libraries. Given the current C++ implementation, changing the Python API is the most direct solution.

    def link_against(self, library: DynamicLibrary) -> None:
        """Link this library against another dynamic library.

        Sets the search order for symbol resolution. Symbols not found in this library
        will be searched in the linked library.

        Parameters
        ----------
        library : DynamicLibrary
            The dynamic library to link against.

        Examples
        --------
        >>> session = create_session()
        >>> lib_utils = session.create_library()
        >>> lib_utils.add("utils.o")
        >>> lib_main = session.create_library()
        >>> lib_main.add("main.o")
        >>> lib_main.link_against(lib_utils)  # main can call utils symbols

        """
        self._link_func(self._handle, library._handle)
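The search-order semantics described above can be modeled with plain dictionaries. This is an illustrative sketch only; the dicts and the `resolve` helper below stand in for dylibs and symbol lookup and are not the addon's real data structures.

```python
# Toy model of link_against search order: a symbol missing from the main
# library is looked up in linked libraries, in the order they were linked.
lib_utils = {"helper": "utils.helper"}        # stands in for utils.o symbols
lib_main = {"main_entry": "main.main_entry"}  # stands in for main.o symbols
search_order = [lib_main, lib_utils]          # effect of lib_main.link_against(lib_utils)

def resolve(name):
    for lib in search_order:
        if name in lib:
            return lib[name]
    raise KeyError(name)

print(resolve("helper"))  # found in lib_utils after missing from lib_main
```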

Comment on lines +19 to +20
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
add_definitions(${LLVM_DEFINITIONS_LIST})
Contributor

medium

These add_definitions are redundant as target_compile_definitions is used for the tvm_ffi_orcjit target on line 93. It's better to use target_compile_definitions for target-specific definitions. I suggest removing these lines to avoid redundancy.

Comment on lines +93 to +127
from tvm_ffi_orcjit import ObjectLoader

# Create a loader instance
loader = ObjectLoader()

# Load an object file
loader.load("example.o")

# Get and call a function
add_func = loader.get_function("simple_add")
result = add_func(1, 2)
print(f"Result: {result}") # Output: Result: 3
```

### Incremental Loading

Load multiple object files and access functions from all of them:

```python
from tvm_ffi_orcjit import ObjectLoader

loader = ObjectLoader()

# Load first object file
loader.load("math_ops.o")
add = loader.get_function("simple_add")

# Load second object file - functions from first remain accessible
loader.load("string_ops.o")
concat = loader.get_function("string_concat")

# Both functions work
print(add(10, 20)) # From math_ops.o
print(concat("Hello", "World")) # From string_ops.o
```
Contributor

medium

The usage examples in this README appear to be outdated. They refer to an ObjectLoader class which is not present in the current implementation. The correct API, as demonstrated in examples/quick-start/run.py, uses create_session, session.create_library, and lib.add(). Please update the examples to reflect the current API to avoid confusion for new users.

Comment on lines +129 to +142
### Direct Module Access

You can also use TVM-FFI's `load_module` directly (`.o` files are automatically handled):

```python
import tvm_ffi

# Load object file as a module
module = tvm_ffi.load_module("example.o")

# Get function
func = module.get_function("my_function")
result = func(arg1, arg2)
```
Contributor

medium

This section on "Direct Module Access" suggests that tvm_ffi.load_module("example.o") is supported. However, the implementation does not seem to register a module loader for .o files. If this feature is not yet implemented, it would be best to remove this section or clearly mark it as a future capability to prevent user confusion.

int (*)(void* handle, const TVMFFIAny* args, int32_t num_args, TVMFFIAny* rv);
auto c_func = reinterpret_cast<TVMFFISafeCallType>(symbol);

return Function::FromPacked([c_func, name](PackedArgs args, Any* rv) {
Member

Refer to existing LibraryModule impl.


@yzh119
Member

yzh119 commented Nov 11, 2025

One general question: how do we plan to manage the versions of these add-ons?

@tqchen
Member

tqchen commented Nov 11, 2025

I think most likely they can evolve independently for now and remain optional

tqchen previously requested changes Nov 19, 2025
[](const ORCJITExecutionSession& session, const String& name) -> ObjectRef {
return session->CreateDynamicLibrary(name);
})
.def("orcjit.DynamicLibraryAdd",
Member

because dynamic library is only a Module and is not an explicit subclass, let us avoid passing these functions via def; instead we can overload GetFunction in ORCJITDynamicLibrary to expose set-link-order and AddFile, etc.

@junrushao
Member

This PR has been outstanding for a while, and looks indeed super interesting! I don't really have the expertise to review, unfortunately. @yaoyaoding @cyx-6 What's the status of it? Happy to get it in

@yaoyaoding
Contributor Author

What's the status of it? Happy to get it in

@cyx-6 is working on this PR right now, currently adding support for CUDA kernels with ORC JIT v2.

@cyx-6 cyx-6 force-pushed the orcjit branch 4 times, most recently from 29735d0 to e2ae9da on March 10, 2026 16:42
Add ORC JIT-based module loading addon (tvm-ffi-orcjit) with:
- LLJIT wrapper exposing packed functions via TVM FFI Module interface
- Init/fini handling for ELF (.init_array/.fini_array), Mach-O
  (__mod_init_func/__mod_term_func), and COFF (.CRT$XC*/.CRT$XT*)
- COFF .pdata/.xdata relocation stripping for Windows JITLink compat
- Windows DLL import symbol resolution and PLT stub generation
- Static LLVM linking with bundled liborc_rt for manylinux/macOS/Windows
- CI workflow (Linux x86_64/aarch64, macOS arm64, Windows AMD64)
  with cibuildwheel and multi-compiler test coverage
- Comprehensive tests: error handling, type variety, cross-library
  linking, ctor/dtor ordering across platforms and compilers
- Quick-start example with C and C++ variants
- Documentation and kernel library guide
@tqchen tqchen dismissed their stale review March 23, 2026 19:03

stale

// macOS: MachOPlatform handles init/fini natively via jit_->initialize()/deinitialize().
auto& objlayer = jit_->getObjLinkingLayer();
static_cast<llvm::orc::ObjectLinkingLayer&>(objlayer).addPlugin(
std::make_unique<InitFiniPlugin>(GetRef<ORCJITExecutionSession>(this)));
Contributor

The ExecutionSession will leak: the session ref count will never reach zero.

The chain is like this: when you do GetRef<ORCJITExecutionSession>(this), it creates a session ref that is moved into InitFiniPlugin::session_ and owned by the ObjectLinkingLayer, which is owned by LLJIT, which is owned by the session.

Contributor

fixed

// Try to get the symbol - return NullOpt if not found
if (void* symbol = GetSymbol(symbol_name)) {
// Wrap C function pointer as tvm-ffi Function
TVMFFISafeCallType c_func = reinterpret_cast<TVMFFISafeCallType>(symbol);
Contributor

Since c_func returned by GetSymbol points to JIT-allocated code pages, which are owned by LLJIT, which in turn is owned by a session, referencing c_func after the dylib and session are destroyed would be dangerous.

I think it might be better to take a look at src/ffi/extra/library_module.cc and use its method:

Module self_strong_ref = GetRef<Module>(this);

Contributor

fixed

*ctx_addr = this;
}
Module::VisitContextSymbols([this](const ffi::String& name, void* symbol) {
if (void** ctx_addr = reinterpret_cast<void**>(GetSymbol(ffi::symbol::tvm_ffi_library_ctx))) {
Contributor

The GetSymbol calls seem repetitive. Since each GetSymbol call will trigger the initializers, they will be run too many times.

Contributor

Actually, in RunPendingInitializers, the initializers are only called once and then removed.
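The drain-once behavior described here can be sketched in a few lines. The names (`pending`, `run_pending_initializers`) are illustrative stand-ins, not the addon's actual implementation:

```python
# RunPendingInitializers-style pattern: initializers are stored per dylib
# and removed as they run, so a later lookup does not re-run them.
calls = []
pending = {"dylib": [lambda: calls.append("init_a"), lambda: calls.append("init_b")]}

def run_pending_initializers(dylib):
    for init in pending.pop(dylib, []):  # pop drains the entry; later calls no-op
        init()

run_pending_initializers("dylib")  # runs both initializers, in order
run_pending_initializers("dylib")  # no-op: the entry was already drained
print(calls)
```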

std::string symbol_name = symbol::tvm_ffi_symbol_prefix + std::string(name);

// Try to get the symbol - return NullOpt if not found
if (void* symbol = GetSymbol(symbol_name)) {
Contributor

Same: RunPendingInitializers might be triggered too many times.

Contributor

Same as above: the initializers are only called once and then removed.

if (is_init) {
session_->AddPendingInitializer(&jit_dylib, entry);
} else {
session_->AddPendingDeinitializer(&jit_dylib, entry);
Contributor

There might be a data race if pending_initializers_ and pending_deinitializers_ are unordered_maps, especially during materialization, which happens lazily.

Contributor

LLVM LLJIT defaults to InPlaceTaskDispatcher, which means materialization runs synchronously in the calling thread. Everything is sequential in the same thread, so there is no concurrent access to the maps. A custom concurrent TaskDispatcher could introduce races, but that is not the configuration used here.
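The point about in-place dispatch can be illustrated with a toy model: an "in-place" dispatcher runs each task immediately in the calling thread, so updates to shared state happen strictly sequentially. All names here are illustrative; this is not the LLJIT API.

```python
# Toy model of in-place task dispatch: tasks run synchronously in the
# calling thread, so mutations of the shared `pending` map are sequential
# and cannot race with each other.
pending = {}

def in_place_dispatch(task):
    task()  # runs now, in this thread; no worker threads involved

def materialize(name):
    pending[name] = "materialized"

in_place_dispatch(lambda: materialize("simple_add"))
print(pending)
```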

…tion

- Break reference cycle: InitFiniPlugin now stores a raw pointer to
  ORCJITExecutionSessionObj instead of a ref-counted ORCJITExecutionSession.
  The plugin lifetime is bounded by LLJIT which is owned by the session,
  so the raw pointer is always valid.
- Capture a strong self-ref (Module) in the Function lambda returned by
  GetFunction, preventing use-after-free if the dylib/session is destroyed
  while a caller still holds the Function.
namespace ffi {
namespace orcjit {

inline void call_llvm(llvm::Error err, const std::string& context_msg = "") {
Member

Minor: CamelCase CallLLVM; it also needs documentation. To make it lazy, we likely want to keep TVM_FFI_ORCJIT_LLVM_CALL as a macro like TVM_FFI_CUDA_CALL.

Contributor

fixed

return c, cxx


def _build_all(llvm_prefix: str) -> None:
Member

would be good to cross-check if we can leverage (enhance) tvm_ffi.cpp.build_inline here (cc @yaoyaoding)

Contributor

updated


from __future__ import annotations

import argparse
Member

would be good to cross-check if we can run this as a normal pytest, via tvm_ffi.cpp.build_inline in a temp dir, and then load and test

Contributor

tvm_ffi.cpp.build is now refactored, matching the torch API and supporting object files as input

-y
if ($LASTEXITCODE -ne 0) { throw "micromamba create failed" }

# Build static zstd from source.
Member

consider bringing this into addons/tvm-ffi-orcjit/scripts/install_llvm.ps1.

BTW, if we are using it for a GitHub workflow, it can likely be simplified via the setup-miniconda action:

https://github.com/apache/tvm/blob/main/.github/actions/setup/action.yml#L21

Contributor

scripts removed in new workflow

# Install LLVM from conda-forge using micromamba.
# Usage: bash tools/install_llvm.sh [version]
# version defaults to LLVM_VERSION env var, then 22.1.0
set -ex
Member

simplify via the setup-miniconda action

Contributor

simplified

concat = lib.get_function("concat")
result = concat("Hello, ", "World!")
print(f"concat('Hello, ', 'World!') = '{result}'")
assert result == "Hello, World!", f"Expected 'Hello, World!', got '{result}'"
Member

move the check to a sub-function so we don't need an explicit del here

Contributor

extracted to a sub-function; no more explicit del now
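The scoping pattern adopted here can be shown with a stand-in object (this is not the orcjit API): locals in a helper function are released when the frame exits, so no explicit `del` is needed to drop the handle before tearing down the library.

```python
import weakref

class Handle:
    """Stand-in for a JIT-backed function handle whose lifetime we want scoped."""

def check_with_scoped_handle():
    handle = Handle()
    # ... run assertions against `handle` here ...
    return weakref.ref(handle)  # `handle` is dropped when this frame exits

ref = check_with_scoped_handle()
print(ref() is None)  # True on CPython: the handle was freed at function return
```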


# Use the installed package if available; fall back to source tree for editable dev
try:
from tvm_ffi_orcjit import ExecutionSession
Member

keep things simple, let us just assume tvm_ffi_orcjit is installed

Contributor

simplified

cyx-6 added 3 commits March 25, 2026 01:25
…sion

- Item 3: move null checks before GetSymbol/VisitContextSymbols in constructor
- Item 5: make dylib_counter_ std::atomic<int> for thread safety
- Item 6: add comment explaining intentional LoadLibraryA without FreeLibrary
- Item 7: fix docstrings referencing non-existent create_session()
- Item 8: pass list(libraries) instead of tuple in set_link_order
- Item 9: use importlib.metadata.version() instead of hardcoded __version__
@cyx-6
Contributor

cyx-6 commented Mar 25, 2026

A high-level question: you used an InitFiniPlugin to handle the ctors/dtors of the loaded object files, and there are some *Platform classes in orc_rt (or compiler-rt) that register the handling of ctors/dtors. Maybe add some documentation to explain this (like why we can't depend on the C runtime library on MSVC, etc.). Is it possible to remove the plugin once orc_rt has better support for ctors/dtors?
@yaoyaoding here is the status of the three platforms:

  • macOS: MachOPlatform (via orc_rt) handles __mod_init_func/__mod_term_func and _cxa_atexit natively. We delegate to jit->initialize()/deinitialize() — no InitFiniPlugin needed.
  • Windows: COFFPlatform is unusable (requires MSVC CRT symbols like _CxxThrowException, RTTI vtables — stalled in LLVM for 2+ years). Our plugin handles .CRT$XC*/.CRT$XT* sections.
  • Linux: ELFNixPlatform did not handle ctors/dtors well before this commit: [ORC] Fix missing support of deinitialize on ELF platform and execution order by priority llvm/llvm-project#175981. We could return to ELFNixPlatform once a new LLVM release contains this patch; for now, InitFiniPlugin is still needed.

@cyx-6 cyx-6 force-pushed the orcjit branch 2 times, most recently from f125c41 to a2226bb on March 25, 2026 16:21
@yaoyaoding
Contributor Author

We could return to ELFNixPlatform once a new LLVM release contains this patch.

Cool! I saw your PR to the llvm.

requires MSVC CRT symbols

If we load the MSVC CRT into the process, will the COFFPlatform work?

@cyx-6
Contributor

cyx-6 commented Mar 26, 2026

If we load the MSVC CRT into the process, will the COFFPlatform work?

The MSVC CRT DLLs are already loaded in the host process — our DLLImportDefinitionGenerator resolves symbols from vcruntime140, ucrtbase, msvcp140, etc. The problem is that COFFPlatform requires a COFF ORC runtime library (orc_rt) to be built and loaded. Looking at the LLVM source, COFFPlatform needs to resolve ~10 runtime entry points like __orc_rt_coff_platform_bootstrap, __orc_rt_coff_register_jitdylib, __orc_rt_coff_register_object_sections, etc. It also aliases _CxxThrowException -> __orc_rt_coff_cxx_throw_exception

The COFF ORC runtime does exist in compiler-rt, and it implements __orc_rt_coff_cxx_throw_exception — but with a big caveat: cross-module exception throwing is unfinished. So even if we build and load the COFF orc_rt, exception handling between host code and JIT code would abort(). The LLVM test suite does include COFF ORC tests (e.g., sehframe-handling.cpp, hello-world.cpp in compiler-rt/test/orc/TestCases/Windows/x86-64/), but these require a fully functional COFF runtime which hasn't been completed.

In principle, if someone completes the COFF ORC runtime (especially the exception interop), COFFPlatform could work. But that's been stalled for 2+ years in LLVM. Our current approach (InitFiniPlugin + DLLImportDefinitionGenerator) gives us working C support without depending on the incomplete runtime.

cyx-6 added 5 commits March 26, 2026 17:01
Explain why repeated jit_->initialize() calls on macOS do not cause
refcount imbalance: LLJIT tracks InitializedDylib and switches from
dlopen to dlupdate after the first call. Similarly, RunPendingInitializers
on Linux/Windows is idempotent (drains map, no-op until new object files).
Move LLVM/zstd installation from CIBW_BEFORE_ALL scripts to a cached
host-level step, and replace zstd source builds with conda-forge
zstd-static package. Delete standalone install scripts.
Remove ZSTD_SOURCE_DIR fallback (now always find_library from conda
prefix). Add CXX_VISIBILITY_PRESET hidden to match tvm-ffi convention.
@yaoyaoding
Contributor Author

The problem is that COFFPlatform requires a COFF ORC runtime library (orc_rt) to be built and loaded.
...
In principle, if someone completes the COFF ORC runtime (especially the exception interop), COFFPlatform could work. But that's been stalled for 2+ years in LLVM. Our current approach (InitFiniPlugin + DLLImportDefinitionGenerator) gives us working C support without depending on the incomplete runtime.

Thank you for the clarification! Looks like we are on the edge of ORC JIT's current development and support coverage. It's pretty cool that you finally made it work.

@yaoyaoding
Contributor Author

The PR looks good to me now, thanks @cyx-6 !

Member

@tqchen tqchen left a comment

one minor comment, lgtm

int c;
int *d_a, *d_b, *d_c;
cudaMalloc(&d_a, sizeof(int));
cudaMalloc(&d_b, sizeof(int));
Member

a better canonical example would be passing a DLTensor in from the Python side via PyTorch

@tqchen
Member

tqchen commented Mar 26, 2026

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the tvm-ffi-orcjit addon, which enables dynamic loading and execution of compiled object files using LLVM ORC JIT v2. The implementation includes a C++ backend for session and library management, a Python wrapper, and a custom plugin to handle platform-specific static constructors and destructors. Additionally, the tvm_ffi.cpp build system is enhanced to support compiling sources into relocatable object files and handling pure C files. Review feedback focuses on minor grammatical corrections in the documentation and improving the accuracy of error messages regarding library search paths.

relocations from COFF objects before JITLink graph building, working around a
JITLink limitation with COMDAT section symbols.

Please refers to [ORCJIT_PRIMER.md](./ORCJIT_PRIMER.md) to learn more about object file, linking, llvm orcjit v2, and how the addon works.
Contributor

medium

Minor grammatical correction: refers should be refer.

Suggested change
Please refers to [ORCJIT_PRIMER.md](./ORCJIT_PRIMER.md) to learn more about object file, linking, llvm orcjit v2, and how the addon works.
Please refer to [ORCJIT_PRIMER.md](./ORCJIT_PRIMER.md) to learn more about object file, linking, llvm orcjit v2, and how the addon works.

Comment on lines +62 to +66
raise RuntimeError(
f"Could not find {_LIB_NAME}. "
f"Searched in {_LIB_PATH} and site-packages. "
f"Please ensure the package is installed correctly."
)
Contributor

medium

The error message is slightly misleading. It states that it searched in site-packages, but the code only iterates through the paths defined in _LIB_PATH. To avoid confusion for users debugging loading issues, it would be better to list the actual paths that were searched.

Suggested change
raise RuntimeError(
f"Could not find {_LIB_NAME}. "
f"Searched in {_LIB_PATH} and site-packages. "
f"Please ensure the package is installed correctly."
)
raise RuntimeError(
f"Could not find {_LIB_NAME}. "
f"Searched in: {_LIB_PATH}. "
f"Please ensure the package is installed correctly."
)

@Kathryn-cat
Contributor

thanks @cyx-6 ! the PR is LGTM

cyx-6 added 5 commits March 27, 2026 19:28
- Extract build logic from conftest.py into utils.py with
  build_test_objects() that compiles to /tmp
- test_basic.py calls utils directly at module level
- Delete conftest.py, run_all_tests.py, __init__.py
- CI workflow runs pytest + example directly
…workflow

- Extract orcjit build+test logic into .github/actions/build-orcjit-wheel/
- Add path-filtered orcjit job to ci_test.yml
- Add publish_orcjit_wheel.yml for PyPI publishing
- Remove standalone tvm_ffi_orcjit.yml (superseded)
@cyx-6 cyx-6 merged commit 1da5637 into apache:main Mar 27, 2026
12 checks passed