feat: migrate backend from per-device kernels to InfiniOps operator library by bitzyz · Pull Request #304 · InfiniTensor/InfiniTensor

bitzyz · 2026-04-29T01:52:26Z

This PR replaces InfiniTensor's per-device kernel implementations with InfiniOps, a unified operator library shared across the InfiniTensor ecosystem. This is the first step in a multi-part migration.

Add InfiniOps as a git submodule at 3rd-party/infiniops, version-pinned via .gitmodules. The submodule provides kernel implementations for CPU, CUDA, and other accelerators through a single libinfiniops.so.
Introduce the InfiniOps bridge layer (include/core/infiniops_bridge/):

adapter_kernel.h — base class for adapter kernels that delegate to InfiniOps C++ APIs.

tensor_convert.h — converts InfiniTensor TensorObj (shape, dtype, strides, device) to InfiniOps Tensor as a non-owning view.

infiniops_runtime.h / infiniops_runtime.cc — unified InfiniOpsRuntimeObj that replaces all per-device runtime classes (CudaRuntimeObj, BangRuntimeObj, KunlunRuntimeObj, etc.) with device-specific alloc/dealloc dispatching and workspace management.

cpu_fallback.h — placeholder for future GPU→CPU fallback path for operators not yet supported by InfiniOps on GPU.
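To illustrate what "non-owning view" means for tensor_convert.h, here is a minimal Python sketch. It is not the bridge's actual C++ code: `TensorView`, `to_view`, `contiguous_strides`, and `element` are hypothetical names, and strides are assumed to be counted in elements. The point is only that the converted tensor carries shape, dtype, strides, and device metadata while borrowing the original buffer instead of copying it.

```python
from array import array
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorView:
    """Hypothetical stand-in for an operator-library tensor: metadata plus a borrowed buffer."""
    data: memoryview   # borrowed from the owner, never copied
    shape: tuple
    strides: tuple     # assumed to be in elements, not bytes
    dtype: str
    device: str


def contiguous_strides(shape):
    """Row-major strides (in elements) for a densely packed tensor."""
    strides, step = [], 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def to_view(storage, shape, dtype="float32", device="cpu"):
    """Wrap existing storage without copying it (hypothetical helper)."""
    return TensorView(memoryview(storage), tuple(shape),
                      contiguous_strides(shape), dtype, device)


def element(view, index):
    """Read one element through the view using its strides."""
    offset = sum(i * s for i, s in zip(index, view.strides))
    return view.data[offset]


storage = array("f", range(6))   # the owner of the memory
view = to_view(storage, (2, 3))
storage[4] = 40.0                # a write through the owner is visible via the view
print(view.strides, element(view, (1, 1)))
```

Because the view only borrows the buffer, the owning tensor has to outlive any kernel call that uses the view.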

Implement 6 InfiniOps adapter kernels (src/kernels/infiniops/): Add, Mul, MatMul (with bias fusion via Gemm beta=1), Cast, Concat, and RMSNorm. Each adapter converts InfiniTensor tensors to InfiniOps tensors and calls the corresponding infini::ops::*::Call().
Remove legacy per-device kernel implementations for CUDA, BANG, Kunlun, Ascend, and IntelCPU backends (~24,700 lines deleted). This includes all kernel source files, runtime classes, operator timers, and device-specific headers. The remaining CPU kernels (src/kernels/cpu/) serve as fallback for operators not yet covered by InfiniOps adapters.
Simplify the build system: remove the USE_CUDA, USE_BANG, USE_KUNLUN, USE_ASCEND, and USE_INTELCPU CMake options. Device backends are now selected through InfiniOps's own flags (WITH_CPU, WITH_NVIDIA, etc.). GPU support is re-enabled by passing -DWITH_NVIDIA=ON, which is forwarded to InfiniOps through add_subdirectory.
Update the Python FFI (src/ffi/ffi_infinitensor.cc): replace all device-specific runtime factory functions (cuda_runtime(), bang_runtime(), etc.) with unified cpu_runtime() and cuda_runtime() functions that return InfiniOpsRuntimeObj. Fix copyout_numpy to use numpy.empty() instead of py::array(dtype, shape, nullptr) to avoid NumPy 2.x stride issues.
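The bias fusion mentioned for the MatMul adapter relies on the standard GEMM form C = alpha * (A @ B) + beta * C. The pure-Python sketch below shows the idea under stated assumptions: the output buffer is pre-filled with the row-broadcast bias before a GEMM with beta=1 accumulates into it. How the adapter actually prepares the output is not described in this PR, and `gemm` and `matmul_with_bias` are illustrative names, not InfiniOps APIs.

```python
def gemm(a, b, c, alpha=1.0, beta=0.0):
    """Return alpha * (a @ b) + beta * c for row-major nested lists."""
    m, k, n = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def matmul_with_bias(a, b, bias):
    """Fuse the bias add into the GEMM by seeding C with the bias and using beta=1."""
    c = [list(bias) for _ in range(len(a))]   # broadcast the bias to every output row
    return gemm(a, b, c, alpha=1.0, beta=1.0)


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
bias = [10.0, 20.0]
out = matmul_with_bias(a, b, bias)
print(out)  # [[29.0, 42.0], [53.0, 70.0]]
```

With beta=0 the same routine reduces to a plain matrix multiply, so one GEMM call can cover both the biased and unbiased cases.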

feat: use infiniops backend part1

1c5d040

bitzyz self-assigned this Apr 29, 2026

bitzyz marked this pull request as draft April 29, 2026 01:58

bitzyz added 3 commits April 29, 2026 10:54

feat: use infiniops backend part2

0f50c06

feat: use infiniops backend part3

e67da5c

feat: use infiniops backend part4

b971239

bitzyz force-pushed the dev-adapt-infiniops branch from 70e886d to 15cbfab Compare April 30, 2026 02:25

feat: use infiniops backend part5

a2575d1

bitzyz force-pushed the dev-adapt-infiniops branch from 15cbfab to a2575d1 Compare April 30, 2026 02:53

feat: use infiniops backend part6

d78ceda

bitzyz force-pushed the dev-adapt-infiniops branch 4 times, most recently from 952e424 to 889b1ad Compare May 7, 2026 03:39

feat: use infiniops backend part7

ddb8443

bitzyz force-pushed the dev-adapt-infiniops branch 2 times, most recently from 9ff0681 to 49a1769 Compare May 8, 2026 01:59

feat: use infiniops backend part8

c11b79b

bitzyz force-pushed the dev-adapt-infiniops branch from 49a1769 to c11b79b Compare May 8, 2026 06:18

feat: use infiniops backend part9

42fe071

bitzyz force-pushed the dev-adapt-infiniops branch from 0b0c969 to 0d67dc3 Compare May 18, 2026 02:52

use infiniops backend part10

70882d2

bitzyz force-pushed the dev-adapt-infiniops branch from 0d67dc3 to 70882d2 Compare May 20, 2026 06:31

feat: use infiniops backend part11

8d67a97

bitzyz wants to merge 11 commits into master from dev-adapt-infiniops
