Error compiling trilinos(16.0.0) on linux #13770

Closed
bztd opened this issue Feb 2, 2025 · 18 comments
Labels
type: bug The primary issue is a bug in Trilinos code or tests

Comments

@bztd

bztd commented Feb 2, 2025

I'm compiling with the SDK provided by Flatpak, using parameters copied from the Arch Linux package:

  cmake -S Trilinos-trilinos-release-"$_pkgver" \
        -B build \
        -D CMAKE_INSTALL_PREFIX:PATH=/usr \
        -D BUILD_SHARED_LIBS:BOOL=ON \
        -D Trilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=ON \
        -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=ON \
        -D Trilinos_ENABLE_SECONDARY_TESTED_CODE:BOOL=ON \
        -D Trilinos_ENABLE_PyTrilinos:BOOL=OFF \
        -D Trilinos_ENABLE_Gtest:BOOL=OFF \
        -D Trilinos_ENABLE_TESTS:BOOL=OFF \
        -D Trilinos_ENABLE_TrilinosFrameworkTests:BOOL=OFF \
        -D Trilinos_ENABLE_TrilinosATDMConfigTests:BOOL=OFF \
        -D TPL_ENABLE_gtest:BOOL=OFF \
        -D TPL_ENABLE_MPI:BOOL=ON \
        -D TPL_ENABLE_HDF5:BOOL=ON \
        -D Zoltan_ENABLE_F90INTERFACE:BOOL=ON \
        -D CMAKE_C_FLAGS="$CFLAGS -Wno-incompatible-pointer-types" \
        -D CMAKE_Fortran_FLAGS="$FCFLAGS -fallow-argument-mismatch"

I get the error:

-- Installing: /app/include/Shards_CellTopologyManagedData.hpp
-- Installing: /app/include/Shards_CellTopologyTraits.hpp
-- Installing: /app/include/Shards_BasicTopologies.hpp
-- Installing: /app/include/Shards_IndexList.hpp
-- Installing: /app/include/Shards_SimpleArrayOps.hpp
-- Installing: /app/include/Shards_TypeList.hpp
-- Installing: /app/lib64/cmake/Shards/ShardsConfig.cmake
-- Installing: /app/lib64/cmake/Shards/ShardsTargets.cmake
-- Installing: /app/lib64/cmake/Shards/ShardsTargets-release.cmake
-- Installing: /app/lib64/libtriutils.so.16.0.0
-- Installing: /app/lib64/libtriutils.so.16
-- Set non-toolchain portion of runtime path of "/app/lib64/libtriutils.so.16.0.0" to "/app/lib64"
-- Installing: /app/lib64/libtriutils.so
-- Installing: /app/include/Triutils_config.h
-- Installing: /app/include/Trilinos_Util_iohb.h
-- Installing: /app/include/Trilinos_Util_Version.h
-- Installing: /app/include/Trilinos_Util_CommandLineParser.h
-- Installing: /app/include/Trilinos_Util_CountMatrixMarket.h
-- Installing: /app/include/Trilinos_Util_CountTriples.h
-- Installing: /app/include/Trilinos_Util_CrsMatrixGallery.h
-- Installing: /app/include/Trilinos_Util_ReadMatrixMarket2Epetra.h
-- Installing: /app/include/Trilinos_Util.h
-- Installing: /app/lib64/cmake/Triutils/TriutilsConfig.cmake
-- Installing: /app/lib64/cmake/Triutils/TriutilsTargets.cmake
-- Installing: /app/lib64/cmake/Triutils/TriutilsTargets-release.cmake
CMake Error at Fbuild/packages/epetraext/src/cmake_install.cmake:57 (file):
  file INSTALL cannot find
  "/run/build/trilinos/Fbuild/packages/epetraext/src/libepetraext.so.16.0.0":
  No such file or directory.
Call Stack (most recent call first):
  Fbuild/packages/epetraext/cmake_install.cmake:47 (include)
  Fbuild/cmake_install.cmake:173 (include)

Link to compilation process: https://buildbot.flathub.org/#/builders/6/builds/177328

@bztd bztd added the type: bug The primary issue is a bug in Trilinos code or tests label Feb 2, 2025
@cgcgcg
Contributor

cgcgcg commented Feb 2, 2025

The CMake output for Trilinos says

Processing enabled external package/TPL: HDF5 (enabled explicitly, disable with -DTPL_ENABLE_HDF5=OFF)
-- Using find_package(HDF5 ...) ...
-- Found HDF5: hdf5-shared (found version "1.14.5") found components: C
-- HDF5_LIBRARY_NAMES='hdf5;z;hdf5_hl'
-- TPL_HDF5_LIBRARIES='hdf5-shared'
-- TPL_HDF5_INCLUDE_DIRS='/app/include'

but then compilation fails with

[ 10%] Linking CXX shared library libepetraext.so
/usr/lib/gcc/x86_64-unknown-linux-gnu/14.2.0/../../../../x86_64-unknown-linux-gnu/bin/ld: cannot find -lhdf5-shared: No such file or directory
collect2: error: ld returned 1 exit status
make[2]: *** [packages/epetraext/src/CMakeFiles/epetraext.dir/build.make:982: packages/epetraext/src/libepetraext.so.16.0.0] Error 1
make[1]: *** [CMakeFiles/Makefile2:10908: packages/epetraext/src/CMakeFiles/epetraext.dir/all] Error 2
make: *** [Makefile:166: all] Error 2

I don't quite understand how this happened. Is there a way to check artifacts of the build and figure out what the actual link line was? What's also odd is how the install can kick off after the build step just failed.
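
On the install step running after a failed build: the thread does not show how the build and install commands are chained, so the following is only a sketch of one way to rule that out. It runs the install step only when the build step exits successfully. The helper name `build_then_install` is hypothetical; `cmake --build` and `cmake --install` are standard CMake command-line modes (the latter needs CMake 3.15 or newer).

```shell
#!/bin/sh
# Hypothetical helper: run the install step only if the build step succeeded.
# Usage: build_then_install <build-dir>
build_then_install() {
    build_dir=$1
    # `cmake --build` returns non-zero when any target fails to compile or link.
    if ! cmake --build "$build_dir"; then
        echo "build failed; skipping install" >&2
        return 1
    fi
    cmake --install "$build_dir"
}
```

With a guard like this, the first reported error would be the linker failure itself rather than a later `file INSTALL cannot find` message.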

@bztd
Author

bztd commented Feb 2, 2025

I've also done local builds with the same error.

This is the recipe I'm following for building dependencies and Trilinos:

https://github.com/flathub/au.edu.uq.esys.escript/blob/bbbb/au.edu.uq.esys.escript.json

What files do you need?

@cgcgcg
Contributor

cgcgcg commented Feb 2, 2025

Since you have a local build: can you do make VERBOSE=1? That should give you the failing command. Then we can check if hdf5-shared is simply somehow missing or somehow pointing to the wrong location.
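
A minimal sketch of how the failing command could be pulled out of a verbose build log. It assumes a Makefile generator and a build directory named `build`; the helper name `find_link_flag` and the log file name `build.log` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every logged command line that passes a given
# linker flag, prefixed with its line number in the log.
# Usage: find_link_flag <log-file> <flag>
find_link_flag() {
    # -F: match the flag literally; -n: show line numbers;
    # -e: required because the pattern itself starts with a dash.
    grep -n -F -e "$2" "$1"
}

# Example (not run here):
#   make -C build VERBOSE=1 2>&1 | tee build.log
#   find_link_flag build.log -lhdf5-shared
```

The matching line shows the `-L` search paths that accompany the flag, which is what is needed to check whether the library is missing or simply looked up in the wrong place.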

@bztd
Author

bztd commented Feb 2, 2025

It's very strange, I just added: VERBOSE=1 -j${FLATPAK_BUILDER_N_JOBS}

and it also changed the error to:

-- Installing: /app/include/impl/Kokkos_UnorderedMap_impl.hpp
CMake Error at Fbuild/packages/kokkos/containers/src/cmake_install.cmake:61 (file):
  file INSTALL cannot find
  "/run/build/trilinos/Fbuild/packages/kokkos/containers/src/libkokkoscontainers.so.16.0.0":
  No such file or directory.
Call Stack (most recent call first):
  Fbuild/packages/kokkos/containers/cmake_install.cmake:47 (include)
  Fbuild/packages/kokkos/cmake_install.cmake:52 (include)
  Fbuild/cmake_install.cmake:123 (include)

https://buildbot.flathub.org/#/builders/6/builds/178076

@cgcgcg
Contributor

cgcgcg commented Feb 2, 2025

What I understand is:

  • Trilinos CMake finds hdf5-shared in /app/lib at configure time.
  • The link line has -L/app/lib on it as well as -lhdf5-shared, but hdf5-shared cannot be found anymore.
  • The installation target is run despite the fact that building Trilinos failed. That's why we see a different error this time. It fails on trying to install another library that wasn't built yet.

I don't understand what could cause this. Is something else deleting files at the same time? I am not convinced at the moment that this is a Trilinos problem. Did you try to reproduce this without Flatpak?

@bztd
Author

bztd commented Feb 3, 2025

True, you are right. I just recompiled hdf5 and I can't find any dynamic or static library with the name hdf5-shared. I compiled with Java support removed.

It seems that this is the same error:
#12370

@cgcgcg
Contributor

cgcgcg commented Feb 3, 2025

Ah, so the conjecture is that something doesn't work right in
https://github.com/trilinos/Trilinos/blob/master/cmake/tribits/common_tpls/FindTPLHDF5.cmake
And #12370 definitely looks similar.

@bartlettroscoe Do you understand where the non-existent hdf5-shared might come from?

@bartlettroscoe
Member

> @bartlettroscoe Do you understand where the non-existent hdf5-shared might come from?

@cgcgcg, @bztd, from looking at the CMake configure output above, it seems that find_package(HDF5 ...) is what is finding hdf5-shared. From the standpoint of Trilinos and TriBITS, find_package(HDF5 ...) is a black box. But there are features in CMake to help debug find_...().

@bztd, to debug how hdf5-shared is getting found by find_package(HDF5 ...) , please start with a clean configure with:

$ rm -r CMakeCache.txt CMakeFiles/

and then add the CMake argument --debug-find-pkg=HDF5 (CMake versions 3.23+) and configure again from scratch and post what that output shows.

(See section "Debugging find_...() Calls" in the book "Professional CMake".)
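
The two steps above could be combined roughly as follows. This is a sketch, not a Trilinos-provided script: the helper name `reconfigure_with_find_debug` and the log file name `find-debug.log` are hypothetical, and the remaining configure arguments are whatever the original configure command used.

```shell
#!/bin/sh
# Hypothetical helper: wipe cached find results in a build directory, then
# reconfigure with find-debugging enabled for one package (CMake 3.23+).
# Usage: reconfigure_with_find_debug <source-dir> <build-dir> <package> [cmake args...]
reconfigure_with_find_debug() {
    src_dir=$1
    build_dir=$2
    pkg=$3
    shift 3
    # Remove the cache so find_package() starts from scratch.
    rm -rf "$build_dir/CMakeCache.txt" "$build_dir/CMakeFiles"
    mkdir -p "$build_dir"
    # Save the configure output so it can be attached to the issue.
    status=0
    cmake -S "$src_dir" -B "$build_dir" "--debug-find-pkg=$pkg" "$@" \
        > "$build_dir/find-debug.log" 2>&1 || status=$?
    cat "$build_dir/find-debug.log"
    return "$status"
}

# Example (not run here):
#   reconfigure_with_find_debug Trilinos-src build HDF5 -D TPL_ENABLE_HDF5:BOOL=ON
```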

@bztd
Author

bztd commented Feb 3, 2025

I'm confused: is hdf5-shared a library (libhdf5-shared.so), or is that parameter passed to the linker by mistake? I compiled with the settings from the other discussion and Trilinos compiled. What I don't understand about the error is why there is no problem on Arch Linux. I'm using practically the same configuration; the only thing I've removed is the Java support in HDF5.

@bartlettroscoe
Member

> I'm confused, is hdf5-shared a library (libhdf5-shared.so) or is that parameter passed to the linker by mistake?

@bztd, can you post your generated CMakeCache.txt file and the output from cmake --debug-find-pkg=HDF5? That might give a clue where hdf5-shared is coming from.

@bztd
Author

bztd commented Feb 4, 2025

@cgcgcg
Contributor

cgcgcg commented Feb 4, 2025

@bztd Could you also post the terminal output from CMake with --debug-find-pkg=HDF5? What we are trying to understand is where the non-existent library hdf5-shared comes from. Trilinos calls a bit of code from CMake for finding HDF5. The question is, does that code produce bad outputs, or is Trilinos interpreting them in a bad way?

@bztd
Author

bztd commented Feb 4, 2025

@cgcgcg
Contributor

cgcgcg commented Feb 4, 2025

Is netcdf-c possibly the issue here? See: Unidata/netcdf-c#2869

@bztd
Author

bztd commented Feb 4, 2025

I ran grep -ra "hdf5-shared" ./* in the installation folder:

./bin/nc-config:libsprivate="-lhdf5_hl-shared -lhdf5-shared -lm -lz -lzstd -lbz2 -lcurl -lxml2"
./cmake/hdf5-config.cmake:set (${HDF5_PACKAGE_NAME}_EXPORT_LIBRARIES      hdf5-shared;hdf5_tools-shared;hdf5_hl-shared;hdf5_f90cstub-shared;hdf5_fortran-shared;hdf5_hl_f90cstub-shared;hdf5_hl_fortran-shared;hdf5_cpp-shared;hdf5_hl_cpp-shared)
./cmake/hdf5-targets-release.cmake:# Import target "hdf5-shared" for configuration "Release"
./cmake/hdf5-targets-release.cmake:set_property(TARGET hdf5-shared APPEND PROPERTY IMPORTED_CONFIGURATIONS RELEASE)
./cmake/hdf5-targets-release.cmake:set_target_properties(hdf5-shared PROPERTIES
./cmake/hdf5-targets-release.cmake:list(APPEND _cmake_import_check_targets hdf5-shared )
./cmake/hdf5-targets-release.cmake:list(APPEND _cmake_import_check_files_for_hdf5-shared "${_IMPORT_PREFIX}/lib/libhdf5.so.310.5.0" )
./cmake/hdf5-targets.cmake:foreach(_cmake_expected_target IN ITEMS hdf5-shared mirror_server mirror_server_stop hdf5_tools-shared h5diff ph5diff h5ls h5debug h5repart h5mkgrp h5clear h5delete h5import h5repack h5jam h5unjam h5copy h5stat h5dump h5format_convert h5perf_serial h5perf hdf5_hl-shared h5watch hdf5_f90cstub-shared hdf5_fortran-shared hdf5_hl_f90cstub-shared hdf5_hl_fortran-shared hdf5_cpp-shared hdf5_hl_cpp-shared)
./cmake/hdf5-targets.cmake:# Create imported target hdf5-shared
./cmake/hdf5-targets.cmake:add_library(hdf5-shared SHARED IMPORTED)
./cmake/hdf5-targets.cmake:set_target_properties(hdf5-shared PROPERTIES
./cmake/hdf5-targets.cmake:  INTERFACE_LINK_LIBRARIES "hdf5-shared"
./cmake/hdf5-targets.cmake:  INTERFACE_LINK_LIBRARIES "hdf5-shared"
./cmake/hdf5-targets.cmake:  INTERFACE_LINK_LIBRARIES "hdf5-shared"
./cmake/hdf5-targets.cmake:  INTERFACE_LINK_LIBRARIES "hdf5-shared"
./cmake/hdf5-targets.cmake:  INTERFACE_LINK_LIBRARIES "hdf5_hl-shared;hdf5-shared"
./lib/libnetcdf.settings:Extra libraries:	-lhdf5_hl-shared -lhdf5-shared -lm -lz -lzstd -lbz2 -lcurl -lxml2
./lib/cmake/netCDF/netCDFTargets.cmake:  INTERFACE_LINK_LIBRARIES "dl;hdf5_hl-shared;hdf5-shared;/usr/lib/x86_64-linux-gnu/libm.so;/usr/lib/x86_64-linux-gnu/libz.so;/usr/lib/x86_64-linux-gnu/libzstd.so;/usr/lib/x86_64-linux-gnu/libbz2.so;/usr/lib/x86_64-linux-gnu/libcurl.so;/usr/lib/x86_64-linux-gnu/libxml2.so"
./lib/pkgconfig/netcdf.pc:Libs.private: -lhdf5_hl-shared -lhdf5-shared -lm -lz -lzstd -lbz2 -lcurl -lxml2
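
In the output above, the HDF5 CMake files define hdf5-shared as an imported target whose file is libhdf5.so.310.5.0, while the netCDF files (nc-config, libnetcdf.settings, netcdf.pc) pass the same name as a plain `-l` flag, for which no library file exists. A small sketch for narrowing the search to those plain linker flags; the helper name `find_bare_link_flag` is hypothetical, and `-I` (skip binary files) assumes GNU or BSD grep.

```shell
#!/bin/sh
# Hypothetical helper: list installed files that pass a given name to the
# linker as a plain -l flag (as opposed to referring to a CMake target).
# Usage: find_bare_link_flag <install-prefix> <name>
find_bare_link_flag() {
    # -r: recurse; -I: skip binary files; -l: print matching file names only;
    # -F: literal match; -e: the pattern starts with a dash.
    grep -r -I -l -F -e "-l$2" "$1" | sort
}

# Example (not run here):
#   find_bare_link_flag /app hdf5-shared
```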

@bztd
Author

bztd commented Feb 4, 2025

I guess it's not Trilinos' problem.

@cgcgcg
Contributor

cgcgcg commented Feb 4, 2025

Yeah, seems like a netcdf issue. Maybe try pinning its version to something earlier?

@bztd
Author

bztd commented Feb 4, 2025

Thank you very much for the help.

@bztd bztd closed this as completed Feb 4, 2025

3 participants