support metax device #305

Open
Ceng23333 wants to merge 4 commits into master from support_metax

@Ceng23333

Results of the cmake build and the ctest run:

zenghua@test:~/workspace/infinitensor_metax/InfiniTensor$ docker exec   infinilm-prefill-dev bash -lc '
set -e
cd /home/zenghua/workspace/infinitensor_metax/InfiniTensor/build-metax
cmake .. -DUSE_METAX=ON -DBUILD_TEST=ON -DCMAKE_BUILD_TYPE=Release
cmake --build . -j"$(nproc)"
ctest --output-on-failure -j"$(nproc)"
'
Configuring for Release build.
-- Found Python: /opt/conda/bin/python3.10 (found version "3.10.10") found components: Interpreter Development Development.Module Development.Embed
CMake Deprecation Warning at 3rd-party/pybind11/CMakeLists.txt:8 (cmake_minimum_required):
  Compatibility with CMake < 3.10 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value.  Or, use the <min>...<max> syntax
  to tell CMake that the project requires at least <min> but has been updated
  to work with policies introduced by <max> or earlier.


-- pybind11 v2.10.3 
CMake Deprecation Warning at 3rd-party/nlohmann_json_cmake_fetchcontent/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.10 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value.  Or, use the <min>...<max> syntax
  to tell CMake that the project requires at least <min> but has been updated
  to work with policies introduced by <max> or earlier.


-- Using the single-header code from /home/zenghua/workspace/infinitensor_metax/InfiniTensor/3rd-party/nlohmann_json_cmake_fetchcontent/single_include/
CMake Deprecation Warning at 3rd-party/googletest/CMakeLists.txt:4 (cmake_minimum_required):
  Compatibility with CMake < 3.10 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value.  Or, use the <min>...<max> syntax
  to tell CMake that the project requires at least <min> but has been updated
  to work with policies introduced by <max> or earlier.


CMake Deprecation Warning at 3rd-party/googletest/googletest/CMakeLists.txt:49 (cmake_minimum_required):
  Compatibility with CMake < 3.10 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value.  Or, use the <min>...<max> syntax
  to tell CMake that the project requires at least <min> but has been updated
  to work with policies introduced by <max> or earlier.


-- Found Python: /opt/conda/bin/python3.10 (found version "3.10.10") found components: Interpreter
CMake Deprecation Warning at 3rd-party/backward-cpp/CMakeLists.txt:23 (cmake_minimum_required):
  Compatibility with CMake < 3.10 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value.  Or, use the <min>...<max> syntax
  to tell CMake that the project requires at least <min> but has been updated
  to work with policies introduced by <max> or earlier.


-- Could NOT find libdw (missing: LIBDW_LIBRARY LIBDW_INCLUDE_DIR) 
-- Could NOT find libbfd (missing: LIBBFD_LIBRARY LIBBFD_INCLUDE_DIR) 
-- Could NOT find libdwarf (missing: LIBDWARF_LIBRARY LIBDWARF_INCLUDE_DIR LIBELF_LIBRARY LIBELF_INCLUDE_DIR) 
-- MACA_PATH: /opt/maca
-- Configuring done (0.3s)
-- Generating done (0.2s)
-- Build files have been written to: /home/zenghua/workspace/infinitensor_metax/InfiniTensor/build-metax
[  1%] Built target backward
[  2%] Built target gtest
[  2%] Built target backward_object
[  3%] Built target gtest_main
[ 47%] Built target InfiniTensor
[ 48%] Built target test_graph
[ 48%] Built target backend
[ 49%] Built target test_graph_handler
[ 49%] Built target test_graph_replace
[ 50%] Built target test_tensor_save
[ 51%] Built target test_hash
[ 52%] Built target test_lazy_allocator
[ 53%] Built target test_concat
[ 53%] Built target test_conv3d
[ 54%] Built target test_broadcast
[ 55%] Built target test_verify
[ 56%] Built target test_conv_transposed_2d
[ 56%] Built target test_expand
[ 58%] Built target test_all_reduce
[ 58%] Built target test_search
[ 58%] Built target test_all_gather
[ 58%] Built target test_clip
[ 60%] Built target test_element_wise
[ 60%] Built target test_batch_norm
[ 60%] Built target test_gather_elements
[ 61%] Built target test_conv
[ 61%] Built target test_resize
[ 61%] Built target test_slice
[ 63%] Built target test_gather
[ 63%] Built target test_extend
[ 63%] Built target test_transpose
[ 64%] Built target test_pad
[ 64%] Built target test_pooling
[ 65%] Built target test_sendrecv
[ 66%] Built target test_reshape
[ 66%] Built target test_matmul
[ 67%] Built target test_split
[ 68%] Built target test_nativecpu_concat
[ 69%] Built target test_nativecpu_identity
[ 70%] Built target test_unary
[ 71%] Built target test_reduce
[ 72%] Built target test_cuda_G2BMM
[ 73%] Built target test_cuda_all_gather
[ 74%] Built target test_nativecpu_split
[ 74%] Built target test_nativecpu_elementwise
[ 75%] Built target test_nativecpu_transpose
[ 75%] Built target test_cuda_GBMM
[ 76%] Built target test_where
[ 77%] Built target test_cuda_broadcast
[ 78%] Built target test_cuda_batch_norm
[ 78%] Built target test_cuda_attention
[ 80%] Built target test_cuda_concat
[ 80%] Built target test_nativecpu_conv3d
[ 81%] Built target test_cuda_gather
[ 82%] Built target test_cuda_all_reduce
[ 83%] Built target test_cuda_inception
[ 84%] Built target test_cuda_reshape
[ 84%] Built target test_cuda_clip
[ 85%] Built target test_cuda_extend
[ 86%] Built target test_cuda_expand
[ 86%] Built target test_cuda_pad
[ 87%] Built target test_cuda_resize
[ 87%] Built target test_cuda_matmul
[ 88%] Built target test_cuda_layernorm
[ 89%] Built target test_cuda_pooling
[ 89%] Built target test_cuda_gather_elements
[ 90%] Built target test_cuda_rope
[ 90%] Built target test_cuda_sendrecv
[ 90%] Built target test_cuda_reduce
[ 91%] Built target test_cuda_element_wise
[ 93%] Built target test_cuda_slice
[ 93%] Built target test_cuda_split
[ 94%] Built target test_cuda_conv3d
[ 94%] Built target test_cuda_conv
[ 95%] Built target test_cuda_where
[ 95%] Built target test_cuda_softmax
[100%] Built target test_cuda_conv_transposed_2d
[100%] Built target test_cuda_transpose
[100%] Built target test_cudagraph
[100%] Built target test_nccl_comm
[100%] Built target test_cuda_unary
[100%] Built target test_perfengine
[100%] Built target test_cuda_conv_fp16
Test project /home/zenghua/workspace/infinitensor_metax/InfiniTensor/build-metax
      Start 42: test_cuda_GBMM
      Start 41: test_cuda_G2BMM
      Start 75: test_perfengine
      Start 76: test_cudagraph
      Start 67: test_cuda_rope
      Start 63: test_cuda_pooling
      Start 60: test_cuda_layernorm
      Start 56: test_cuda_extend
      Start 71: test_cuda_split
      Start 61: test_cuda_matmul
      Start 50: test_cuda_conv
      Start 66: test_cuda_resize
      Start 51: test_cuda_conv3d
      Start 59: test_cuda_inception
      Start 74: test_cuda_where
      Start 72: test_cuda_transpose
      Start 45: test_cuda_attention
      Start 55: test_cuda_expand
      Start 70: test_cuda_softmax
      Start 54: test_cuda_element_wise
      Start 53: test_cuda_conv_transposed_2d
      Start 57: test_cuda_gather
      Start 46: test_cuda_batch_norm
      Start 52: test_cuda_conv_fp16
      Start 69: test_cuda_slice
      Start 64: test_cuda_reduce
      Start 49: test_cuda_concat
      Start 65: test_cuda_reshape
      Start 62: test_cuda_pad
      Start 48: test_cuda_clip
      Start 58: test_cuda_gather_elements
      Start 73: test_cuda_unary
      Start  6: test_search
      Start 36: test_nativecpu_conv3d
      Start 15: test_conv
      Start 39: test_nativecpu_split
      Start 35: test_nativecpu_concat
      Start 16: test_conv3d
      Start 40: test_nativecpu_transpose
      Start 19: test_expand
      Start 26: test_reduce
      Start 13: test_clip
      Start 37: test_nativecpu_elementwise
      Start 21: test_gather
      Start 11: test_batch_norm
      Start 25: test_pooling
      Start 23: test_matmul
      Start 24: test_pad
      Start 27: test_reshape
      Start 18: test_element_wise
      Start 10: test_all_reduce
      Start 38: test_nativecpu_identity
      Start 14: test_concat
      Start 29: test_sendrecv
      Start  2: test_graph_handler
      Start  3: test_graph_replace
      Start  9: test_all_gather
      Start 32: test_transpose
      Start 34: test_where
      Start 28: test_resize
      Start  8: test_verify
      Start 33: test_unary
      Start 30: test_slice
      Start 12: test_broadcast
      Start 17: test_conv_transposed_2d
      Start 20: test_extend
      Start  5: test_lazy_allocator
      Start 22: test_gather_elements
      Start  1: test_graph
      Start  4: test_hash
      Start 31: test_split
      Start  7: test_tensor_save
      Start 43: test_cuda_all_gather
      Start 44: test_cuda_all_reduce
      Start 47: test_cuda_broadcast
      Start 68: test_cuda_sendrecv
      Start 77: test_nccl_comm
 1/77 Test #43: test_cuda_all_gather .............   Passed    0.01 sec
 2/77 Test #44: test_cuda_all_reduce .............   Passed    0.01 sec
 3/77 Test #47: test_cuda_broadcast ..............   Passed    0.00 sec
 4/77 Test #68: test_cuda_sendrecv ...............   Passed    0.00 sec
 5/77 Test  #7: test_tensor_save .................   Passed    0.01 sec
 6/77 Test #77: test_nccl_comm ...................   Passed    0.00 sec
 7/77 Test #29: test_sendrecv ....................   Passed    0.20 sec
 8/77 Test #38: test_nativecpu_identity ..........   Passed    0.21 sec
 9/77 Test #32: test_transpose ...................   Passed    0.21 sec
10/77 Test #16: test_conv3d ......................   Passed    0.23 sec
11/77 Test #34: test_where .......................   Passed    0.22 sec
12/77 Test #28: test_resize ......................   Passed    0.22 sec
13/77 Test #22: test_gather_elements .............   Passed    0.22 sec
14/77 Test  #9: test_all_gather ..................   Passed    0.23 sec
15/77 Test #24: test_pad .........................   Passed    0.25 sec
16/77 Test  #1: test_graph .......................   Passed    0.23 sec
17/77 Test #37: test_nativecpu_elementwise .......   Passed    0.27 sec
18/77 Test #31: test_split .......................   Passed    0.25 sec
19/77 Test  #5: test_lazy_allocator ..............   Passed    0.26 sec
20/77 Test  #2: test_graph_handler ...............   Passed    0.27 sec
21/77 Test #13: test_clip ........................   Passed    0.29 sec
22/77 Test #33: test_unary .......................   Passed    0.27 sec
23/77 Test  #4: test_hash ........................   Passed    0.27 sec
24/77 Test #19: test_expand ......................   Passed    0.30 sec
25/77 Test #14: test_concat ......................   Passed    0.30 sec
26/77 Test  #3: test_graph_replace ...............   Passed    0.30 sec
27/77 Test #12: test_broadcast ...................   Passed    0.30 sec
28/77 Test #23: test_matmul ......................   Passed    0.32 sec
29/77 Test #21: test_gather ......................   Passed    0.33 sec
30/77 Test #26: test_reduce ......................   Passed    0.34 sec
31/77 Test #40: test_nativecpu_transpose .........   Passed    0.35 sec
32/77 Test #10: test_all_reduce ..................   Passed    0.34 sec
33/77 Test #25: test_pooling .....................   Passed    0.35 sec
34/77 Test #18: test_element_wise ................   Passed    0.35 sec
35/77 Test #20: test_extend ......................   Passed    0.36 sec
36/77 Test #39: test_nativecpu_split .............   Passed    0.39 sec
37/77 Test #30: test_slice .......................   Passed    0.44 sec
38/77 Test #17: test_conv_transposed_2d ..........   Passed    0.44 sec
39/77 Test  #8: test_verify ......................   Passed    0.46 sec
40/77 Test #11: test_batch_norm ..................   Passed    0.48 sec
41/77 Test #27: test_reshape .....................   Passed    0.53 sec
42/77 Test #35: test_nativecpu_concat ............   Passed    0.58 sec
43/77 Test #15: test_conv ........................   Passed    0.87 sec
44/77 Test #36: test_nativecpu_conv3d ............   Passed    1.06 sec
45/77 Test  #6: test_search ......................   Passed    1.36 sec
46/77 Test #63: test_cuda_pooling ................   Passed    8.59 sec
47/77 Test #75: test_perfengine ..................   Passed    8.61 sec
48/77 Test #71: test_cuda_split ..................   Passed    8.62 sec
49/77 Test #61: test_cuda_matmul .................   Passed    8.69 sec
50/77 Test #67: test_cuda_rope ...................   Passed    8.72 sec
51/77 Test #76: test_cudagraph ...................   Passed    8.73 sec
52/77 Test #74: test_cuda_where ..................   Passed    8.73 sec
53/77 Test #56: test_cuda_extend .................   Passed    8.78 sec
54/77 Test #52: test_cuda_conv_fp16 ..............   Passed    8.79 sec
55/77 Test #60: test_cuda_layernorm ..............   Passed    8.81 sec
56/77 Test #51: test_cuda_conv3d .................   Passed    9.20 sec
57/77 Test #46: test_cuda_batch_norm .............   Passed    9.22 sec
58/77 Test #45: test_cuda_attention ..............   Passed    9.37 sec
59/77 Test #50: test_cuda_conv ...................   Passed    9.46 sec
60/77 Test #59: test_cuda_inception ..............   Passed    9.48 sec
61/77 Test #55: test_cuda_expand .................   Passed    9.59 sec
62/77 Test #70: test_cuda_softmax ................   Passed    9.61 sec
63/77 Test #53: test_cuda_conv_transposed_2d .....   Passed    9.62 sec
64/77 Test #65: test_cuda_reshape ................   Passed    9.62 sec
65/77 Test #72: test_cuda_transpose ..............   Passed    9.63 sec
66/77 Test #54: test_cuda_element_wise ...........   Passed    9.63 sec
67/77 Test #49: test_cuda_concat .................   Passed   10.07 sec
68/77 Test #58: test_cuda_gather_elements ........   Passed   10.29 sec
69/77 Test #62: test_cuda_pad ....................   Passed   10.37 sec
70/77 Test #66: test_cuda_resize .................   Passed   10.41 sec
71/77 Test #69: test_cuda_slice ..................   Passed   10.41 sec
72/77 Test #48: test_cuda_clip ...................   Passed   10.42 sec
73/77 Test #64: test_cuda_reduce .................   Passed   10.43 sec
74/77 Test #41: test_cuda_G2BMM ..................   Passed   10.48 sec
75/77 Test #57: test_cuda_gather .................   Passed   10.47 sec
76/77 Test #42: test_cuda_GBMM ...................   Passed   10.51 sec
77/77 Test #73: test_cuda_unary ..................   Passed   10.55 sec

100% tests passed, 0 tests failed out of 77

Total Test time (real) =  10.58 sec
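
For reviewing a log like the one above, it can help to pull out only the slower tests. The following is a minimal sketch, not part of this PR: it assumes the ctest output was saved to a file (for example by piping through `tee`), and the function name `slow_tests`, the file name `ctest.log`, and the 5-second default threshold are all illustrative choices.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): print the name and wall time of every
# test in a saved ctest log whose time exceeds a threshold in seconds.
slow_tests() {
    local log="$1"
    local threshold="${2:-5}"   # assumed default of 5 seconds
    # Result lines look like:
    #   "46/77 Test #63: test_cuda_pooling ......   Passed    8.59 sec"
    # After whitespace splitting: $2 is "Test", $3 is "#<id>:", $4 is the test
    # name, and the second-to-last field is the time. The check on $3 keeps the
    # "Total Test time (real) = ... sec" summary line from matching.
    awk -v t="$threshold" '
        $2 == "Test" && $3 ~ /^#[0-9]+:$/ && $NF == "sec" && ($(NF-1) + 0) > (t + 0) {
            printf "%s %s\n", $4, $(NF-1)
        }
    ' "$log"
}
```

A possible invocation, under the same assumptions, is `slow_tests ctest.log 5`. Because ctest was run with `-j"$(nproc)"`, the per-test times are wall-clock times measured while other tests ran concurrently, so they are a rough guide rather than isolated timings.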

Ceng23333 added 4 commits May 9, 2026 01:58
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>
Signed-off-by: Ceng23333 <441651826@qq.com>