Skip to content

gh-65821: Fix ctypes.util.find_library with musl #18380

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 55 additions & 6 deletions Lib/ctypes/util.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,8 +96,11 @@ def find_library(name):
def _is_elf(filename):
"Return True if the given file is an ELF file"
elf_header = b'\x7fELF'
with open(filename, 'br') as thefile:
return thefile.read(4) == elf_header
try:
with open(filename, 'br') as thefile:
return thefile.read(4) == elf_header
except OSError:
return False
Comment on lines +102 to +103
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation behind this? Saying the file is not an elf if we get any kind of OSError does not seem like a great idea to me 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this means to catch file does not exist and file is not readable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what'd expect to, but I want to make sure returning false on all possible OSError is actually the behavior we want, as it might introduce subtle bugs. For eg. I would expect a BrokenPipeError or TimeoutError to propagate, instead of not considering the file a valid library.

Looking at the usages of this function, I'd guess we only need PermissionError, and I think we only need it on the musl specific code path, so IMO it'd be better off there.

Anyway, I think I am being overly careful here, but would like to hear from the author if they have time. This is not blocker.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thinking here was: return True if we know it is an elf, return False if we know it is not or we don't know.

will current find_library implementation for GNU libc or *BSD raise an error in those cases?


def _findLib_gcc(name):
# Run GCC's linker with the -t (aka --trace) option and examine the
Expand Down Expand Up @@ -324,10 +327,55 @@ def _findLib_ld(name):
pass # result will be None
return result

def _findLib_musl(name):
# fallback for musl libc, which does not have ldconfig -p
# See GH-65821

from _ctypes import get_interp
interp = get_interp()
if interp == None:
return None

ldarch = re.sub(r'.*ld-musl-([^.]+)\..*', r'\1', interp)
if ldarch == interp:
# not musl
return None

from glob import glob
if os.path.isabs(name):
return name
if name.startswith("lib") and name.endswith(".so"):
suffixes = [ '.[0-9]*' ]
libname = name
else:
suffixes = ['.so', '.so.[0-9]*', f'.musl-{ldarch}.so.[0-9]*']
libname = 'lib'+name
# search LD_LIBRARY_PATH list and default musl libc locations
paths = os.environ.get('LD_LIBRARY_PATH', '').split(':')
try:
with open(f'/etc/ld-musl-{ldarch}.path') as f:
paths.extend(re.split(':|\n', f.read()))
except OSError:
paths.extend(['/lib', '/usr/local/lib', '/usr/lib'])
Comment on lines +358 to +359
Copy link
Member

@FFY00 FFY00 May 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the possible issues here? Would it make sense to be more specific with the exception? Would it make sense to let the user know that we are defaulting to /lib, /usr/local/lib, /usr/lib (via a warning for eg.)?

Should /lib actually come before /usr/local/lib?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


for d in paths:
f = os.path.join(d, name)
if _is_elf(f):
return os.path.basename(f)
prefix = os.path.join(d, libname)
for suffix in suffixes:
for f in sorted(glob('{0}{1}'.format(prefix, suffix))):
if _is_elf(f):
return os.path.basename(f)

def find_library(name):
# See issue #9998
return _findSoname_ldconfig(name) or \
_get_soname(_findLib_gcc(name)) or _get_soname(_findLib_ld(name))
return (
_findSoname_ldconfig(name)
or _get_soname(_findLib_gcc(name))
or _get_soname(_findLib_ld(name))
or _findLib_musl(name)
)

################################################################
# test code
Expand All @@ -344,6 +392,7 @@ def test():
print(find_library("m"))
print(find_library("c"))
print(find_library("bz2"))
print(find_library("libbz2.so"))

# load
if sys.platform == "darwin":
Expand All @@ -368,8 +417,8 @@ def test():
print(f"crypto\t:: {find_library('crypto')}")
print(f"crypto\t:: {cdll.LoadLibrary(find_library('crypto'))}")
else:
print(cdll.LoadLibrary("libm.so"))
print(cdll.LoadLibrary("libcrypt.so"))
print(cdll.LoadLibrary(find_library("c")))
print(cdll.LoadLibrary("libc.so"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren’t these two lines doing the same thing now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They should be. IIRC I added those tests to verify that they actually are the same thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the existing lines should be kept for general linux, and a new block added for musl-based?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that would be necessary. The updated test should be good enough for both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m unsure about this change. Maybe the author really wanted to see if libm and libcrypt could be found, not just libc which should always be there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the purpose of testing that? Those libs does not not exist on musl libc systems.

I thought that libm.so and libcrypt.so were used because they were expected to always be there. But there is no guarantee that they are.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me put it this way: should the test fail if libm.so or libcrypt.so intentionally does not exist on the filesystem?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was about not changing this test for glibc-based systems – new test for musl can use the two new lines

print(find_library("crypt"))

if __name__ == "__main__":
Expand Down
9 changes: 9 additions & 0 deletions Lib/test/test_ctypes/test_find.py
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,15 @@ def test_find_library_with_ld(self):
unittest.mock.patch("ctypes.util._findLib_gcc", lambda *args: None):
self.assertNotEqual(find_library('c'), None)

def test_find_library_musl(self):
from _ctypes import get_interp
interp = get_interp()
if interp == None or interp.find("ld-musl-") == -1:
self.skipTest('ld-musl not detected')

with unittest.mock.patch("ctypes.util._findSoname_ldconfig", lambda *args: None), \
unittest.mock.patch("ctypes.util._get_soname", lambda *args: None):
self.assertNotEqual(find_library('c'), None)

if __name__ == "__main__":
unittest.main()
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix :func:`ctypes.util.find_library` with musl libc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not mention the introduced API.

53 changes: 53 additions & 0 deletions Modules/_ctypes/callproc.c
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,13 @@
#include <alloca.h>
#endif

#ifdef HAVE_ELF_H
#include <elf.h>
#endif
#ifdef HAVE_LINK_H
#include <link.h>
#endif

#ifdef _Py_MEMORY_SANITIZER
#include <sanitizer/msan_interface.h>
#endif
Expand Down Expand Up @@ -1992,7 +1999,52 @@ buffer_info(PyObject *self, PyObject *arg)
return Py_BuildValue("siN", dict->format, dict->ndim, shape);
}

#ifndef MS_WIN32

#ifdef HAVE_DL_ITERATE_PHDR
static int
interp_cb(struct dl_phdr_info *info, size_t size, void *data)
{
const char **ps = data;
const char *base = (const char *)info->dlpi_addr;
const ElfW(Phdr) *ph = info->dlpi_phdr;
int phn = info->dlpi_phnum;

for (int i=0; i < phn; i++) {
if (ph[i].p_type == PT_INTERP) {
*ps = base + ph[i].p_vaddr;
return 1;
}
}
return 0;
}

static PyObject *
get_interp(PyObject *self, PyObject *arg)
{
const char *s = NULL;
if (PySys_Audit("ctypes.get_interp", NULL) < 0) {
return NULL;
}

if (dl_iterate_phdr(interp_cb, &s) == 1) {
return PyUnicode_FromString(s);
}

return NULL;
}

#else

static PyObject *
get_interp(PyObject *self, PyObject *arg)
{
return Py_None;
}

#endif

#endif

PyMethodDef _ctypes_module_methods[] = {
{"get_errno", get_errno, METH_NOARGS},
Expand All @@ -2015,6 +2067,7 @@ PyMethodDef _ctypes_module_methods[] = {
"dlopen(name, flag={RTLD_GLOBAL|RTLD_LOCAL}) open a shared library"},
{"dlclose", py_dl_close, METH_VARARGS, "dlclose a library"},
{"dlsym", py_dl_sym, METH_VARARGS, "find symbol in shared library"},
{"get_interp", get_interp, METH_NOARGS},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation is missing for this new API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if you introduce a new API, please use Argument Clinic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some way to add it as an internal function? So we don't expose it as a public API

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, you could prefix it with an underscore.

#endif
#ifdef __APPLE__
{"_dyld_shared_cache_contains_path", py_dyld_shared_cache_contains_path, METH_VARARGS, "check if path is in the shared cache"},
Expand Down
6 changes: 3 additions & 3 deletions configure

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions configure.ac
Original file line number Diff line number Diff line change
Expand Up @@ -2755,8 +2755,8 @@ AC_DEFINE(STDC_HEADERS, 1, [Define to 1 if you have the ANSI C header files.])

# checks for header files
AC_CHECK_HEADERS([ \
alloca.h asm/types.h bluetooth.h conio.h crypt.h direct.h dlfcn.h endian.h errno.h fcntl.h grp.h \
ieeefp.h io.h langinfo.h libintl.h libutil.h linux/auxvec.h sys/auxv.h linux/fs.h linux/memfd.h \
alloca.h asm/types.h bluetooth.h conio.h crypt.h direct.h dlfcn.h elf.h endian.h errno.h fcntl.h grp.h \
ieeefp.h io.h langinfo.h libintl.h libutil.h link.h linux/auxvec.h sys/auxv.h linux/fs.h linux/memfd.h \
linux/random.h linux/soundcard.h \
linux/tipc.h linux/wait.h netdb.h net/ethernet.h netinet/in.h netpacket/packet.h poll.h process.h pthread.h pty.h \
sched.h setjmp.h shadow.h signal.h spawn.h stropts.h sys/audioio.h sys/bsdtty.h sys/devpoll.h \
Expand Down Expand Up @@ -4717,7 +4717,7 @@ fi
# checks for library functions
AC_CHECK_FUNCS([ \
accept4 alarm bind_textdomain_codeset chmod chown clock close_range confstr \
copy_file_range ctermid dup dup3 execv explicit_bzero explicit_memset \
copy_file_range ctermid dl_iterate_phdr dup dup3 execv explicit_bzero explicit_memset \
faccessat fchmod fchmodat fchown fchownat fdopendir fdwalk fexecve \
fork fork1 fpathconf fstatat ftime ftruncate futimens futimes futimesat \
gai_strerror getegid getentropy geteuid getgid getgrgid getgrgid_r \
Expand Down
9 changes: 9 additions & 0 deletions pyconfig.h.in
Original file line number Diff line number Diff line change
Expand Up @@ -290,6 +290,9 @@
/* Define to 1 if you have the `dlopen' function. */
#undef HAVE_DLOPEN

/* Define to 1 if you have the `dl_iterate_phdr' function. */
#undef HAVE_DL_ITERATE_PHDR

/* Define to 1 if you have the `dup' function. */
#undef HAVE_DUP

Expand All @@ -308,6 +311,9 @@
/* Define to 1 if you have the <editline/readline.h> header file. */
#undef HAVE_EDITLINE_READLINE_H

/* Define to 1 if you have the <elf.h> header file. */
#undef HAVE_ELF_H

/* Define to 1 if you have the <endian.h> header file. */
#undef HAVE_ENDIAN_H

Expand Down Expand Up @@ -703,6 +709,9 @@
/* Define to 1 if you have the `linkat' function. */
#undef HAVE_LINKAT

/* Define to 1 if you have the <link.h> header file. */
#undef HAVE_LINK_H

/* Define to 1 if you have the <linux/auxvec.h> header file. */
#undef HAVE_LINUX_AUXVEC_H

Expand Down