Background
When I opened the main.c source shipped with a pwn challenge in Neovim, clangd exited immediately with status 1. The directory looked like this:
❯ ls
ld-linux-x86-64.so.2 libc.so.6 main.c pwn
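I then went to the directory containing clangd and ran ./clangd there. Everything worked.
I opened a few other C source files as a test. Everything worked.
I copied main.c to another directory and opened it there. Still fine.
So clangd appeared to crash only when handling files in this particular directory. I therefore ran /path/to/clangd from inside that directory. Reconstructed after the fact, it looked roughly like this: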
❯ /tmp/clangd_17.0.3/bin/clangd
/tmp/clangd_17.0.3/bin/clangd: libc.so.6: version `GLIBC_2.36' not found (required by /usr/lib/libpthread.so.0)
/tmp/clangd_17.0.3/bin/clangd: libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /usr/lib/libpthread.so.0)
/tmp/clangd_17.0.3/bin/clangd: libc.so.6: version `GLIBC_2.36' not found (required by /usr/lib/librt.so.1)
/tmp/clangd_17.0.3/bin/clangd: libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /usr/lib/librt.so.1)
/tmp/clangd_17.0.3/bin/clangd: libc.so.6: version `GLIBC_2.36' not found (required by /usr/lib/libdl.so.2)
/tmp/clangd_17.0.3/bin/clangd: libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /usr/lib/libdl.so.2)
/tmp/clangd_17.0.3/bin/clangd: libc.so.6: version `GLIBC_2.36' not found (required by /usr/lib/libm.so.6)
/tmp/clangd_17.0.3/bin/clangd: libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /usr/lib/libm.so.6)
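The host runs Arch Linux with glibc 2.39, while the libc.so.6 in the challenge directory is version 2.35. Experienced readers have probably spotted the problem already.
Troubleshooting
Not being familiar with clangd, I assumed the failed libc lookup above might simply be how clangd behaves, so I missed the core of the problem at the time. Instead I first looked into where clangd's log output goes. My Neovim is configured with LazyVim. I went through the documentation and did not find the log location, but I did find a way to change clangd's cmd: add a file to the plugins directory (~/.config/nvim/lua/plugins in the standard LazyVim layout), which I named nvim-lspconfig.lua, with the following content: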
return {
  {
    "neovim/nvim-lspconfig",
    opts = {
      servers = {
        clangd = {
          cmd = {
            "clangd",
            "--background-index",
            "--clang-tidy",
            "--header-insertion=iwyu",
            "--completion-style=detailed",
            "--function-arg-placeholders",
            "--fallback-style=llvm",
            "--log=verbose",
          },
        },
      },
    },
  },
}
This overrides clangd's cmd and leaves the other settings unchanged.
Later I found the log location by running :LspInfo.
Language client log: /home/juicymio/.local/state/nvim/lsp.log
The most recent entry in the log was this:
[ERROR][2024-04-02 14:13:24] .../vim/lsp/rpc.lua:734 "rpc" "/home/juicymio/.local/share/nvim/mason/bin/clangd" "stderr" "/home/juicymio/.local/share/nvim/mason/bin/clangd: libc.so.6: version `GLIBC_2.36' not found (required by /usr/lib/libpthread.so.0)\n/home/juicymio/.local/share/nvim/mason/bin/clangd: libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /usr/lib/libpthread.so.0)\n/home/juicymio/.local/share/nvim/mason/bin/clangd: libc.so.6: version `GLIBC_2.36' not found (required by /usr/lib/librt.so.1)\n/home/juicymio/.local/share/nvim/mason/bin/clangd: libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /usr/lib/librt.so.1)\n/home/juicymio/.local/share/nvim/mason/bin/clangd: libc.so.6: version `GLIBC_2.36' not found (required by /usr/lib/libdl.so.2)\n/home/juicymio/.local/share/nvim/mason/bin/clangd: libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /usr/lib/libdl.so.2)\n/home/juicymio/.local/share/nvim/mason/bin/clangd: libc.so.6: version `GLIBC_2.36' not found (required by /usr/lib/libm.so.6)\n/home/juicymio/.local/share/nvim/mason/bin/clangd: libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /usr/lib/libm.so.6)\n"
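This is the same error I got when running clangd by absolute path inside the challenge directory. The host glibc is 2.39, the libc.so.6 in the current directory is 2.35, and the latter has no GLIBC_2.36 version symbol. At this point it was fairly certain that clangd (or rather the dynamic loader acting on its behalf) was picking up the libc.so.6 in the current directory as one of its own dependencies.
Analysis
The problem was located, but why does it happen? It looks like a fairly basic mistake.
I first checked whether LD_LIBRARY_PATH was set in my environment. After a careful look, it was not (it would have been rather alarming if it were). That made clangd itself the likely culprit, so I inspected the binary.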
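The version mismatch described above can be checked without running anything from the challenge directory. The following is a minimal sketch, assuming that a plain byte scan for GLIBC_x.y strings is good enough for a quick look. The helper names glibc_versions and has_version are made up for this illustration, and a tool such as readelf -V remains the more reliable way to list version definitions.

```python
import re

# Version tags such as GLIBC_2.35 or GLIBC_2.2.5 appear as plain strings
# in the ELF string tables. Tags like GLIBC_PRIVATE are ignored on purpose.
_VERSION_RE = re.compile(rb"GLIBC_(\d+(?:\.\d+)+)")


def glibc_versions(blob: bytes) -> list[str]:
    """Return the distinct GLIBC_x.y tags found in raw bytes, sorted numerically."""
    found = {m.group(1).decode() for m in _VERSION_RE.finditer(blob)}
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def has_version(blob: bytes, wanted: str) -> bool:
    """Heuristic check: does the blob mention GLIBC_<wanted> at all?"""
    return wanted in glibc_versions(blob)
```

A possible use is to read ./libc.so.6 in binary mode and call has_version(data, "2.36"). For a 2.35 library this would be expected to return False, which is consistent with the "version `GLIBC_2.36' not found" errors.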
> readelf -d ./clangd
Dynamic section at offset 0x3260b20 contains 30 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libpthread.so.0]
0x0000000000000001 (NEEDED) Shared library: [librt.so.1]
0x0000000000000001 (NEEDED) Shared library: [libdl.so.2]
0x0000000000000001 (NEEDED) Shared library: [libm.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x0000000000000001 (NEEDED) Shared library: [ld-linux-x86-64.so.2]
0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN/../lib:]
...
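As you can see, it does set a RUNPATH, and at first glance the RUNPATH looks perfectly normal. I was stuck here for a while, with a vague feeling that the trailing colon should not be there. So I used strace to see exactly how the shared libraries were being loaded.
A quick aside on the order in which the dynamic loader searches for shared libraries: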
- DT_RPATH in the ELF binary, unless DT_RUNPATH set.
- LD_LIBRARY_PATH entries, unless setuid/setgid
- DT_RUNPATH in ELF binary
- /etc/ld.so.cache entries, unless -z nodeflib given at link time
- /lib, /usr/lib unless -z nodeflib
- Done, “not found”.
libm.so.6:
openat(AT_FDCWD, "/tmp/clangd_17.0.3/bin/../lib/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v4/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v3/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v2/libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
openat(AT_FDCWD, "libm.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
openat(AT_FDCWD, "/usr/lib/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
libc.so.6:
openat(AT_FDCWD, "/tmp/clangd_17.0.3/bin/../lib/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v4/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v3/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
openat(AT_FDCWD, "glibc-hwcaps/x86-64-v2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (没有那个文件或目录)
openat(AT_FDCWD, "libc.so.6", O_RDONLY|O_CLOEXEC) = 3
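Traces like the two above get long quickly, so it can help to filter them mechanically. The sketch below assumes the trace was saved as text in the openat(AT_FDCWD, "path", ...) = N format shown here. The function name relative_lookups is hypothetical. It reports every lookup whose path is relative, because those are the ones resolved against the current working directory.

```python
import re

# Captures the path argument and the numeric return value of an openat() line.
_OPENAT_RE = re.compile(r'openat\(AT_FDCWD, "([^"]+)".*?\)\s*=\s*(-?\d+)')


def relative_lookups(trace: str) -> list[tuple[str, bool]]:
    """Return (path, succeeded) for each openat() call on a relative path."""
    hits = []
    for line in trace.splitlines():
        m = _OPENAT_RE.search(line)
        if m is None:
            continue  # not an openat(AT_FDCWD, ...) line
        path, ret = m.group(1), int(m.group(2))
        if not path.startswith("/"):
            # With AT_FDCWD, a relative path is resolved against the cwd.
            hits.append((path, ret >= 0))
    return hits
```

Any entry with succeeded set to True means a library was loaded from whatever directory the process was started in.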
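The first lookup comes from the RUNPATH entry $ORIGIN/../lib, and the last lookup for libm.so.6 is the copy in /usr/lib. Everything in between is a relative path, which openat resolves against the current working directory. So the line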
openat(AT_FDCWD, "libc.so.6", O_RDONLY|O_CLOEXEC) = 3
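can only come from RUNPATH or from /etc/ld.so.cache. The latter is controlled by /etc/ld.so.conf and /etc/ld.so.conf.d, which I had already checked and found nothing unusual in. So it had to be the RUNPATH setting that caused the current directory to be searched. At this point V3rdant suggested I try deleting the colon, since whatever follows it might be treated as an empty path. (Thanks, V3rdant QwQ.) Using patchelf to change the RUNPATH: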
patchelf --set-rpath "\$ORIGIN/../lib" clangd
Everything went back to normal. So it really was the colon.
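The effect of the trailing colon can be modelled in a few lines. This is a rough sketch of the behaviour observed in the strace output, not a reimplementation of the glibc loader. The function name runpath_dirs and the example paths are assumptions for illustration. An empty entry is mapped to the working directory, and $ORIGIN is replaced by the directory of the executable.

```python
import os


def runpath_dirs(runpath: str, origin: str, cwd: str) -> list[str]:
    """Expand a DT_RUNPATH string into the directories it effectively searches."""
    if not runpath:
        return []
    dirs = []
    for entry in runpath.split(":"):
        if entry == "":
            # Empty entry: the loader opens the bare file name, which is a
            # relative path and therefore resolves against the cwd.
            dirs.append(cwd)
        else:
            expanded = entry.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)
            # normpath is only for readable output; it ignores symlinks.
            dirs.append(os.path.normpath(expanded))
    return dirs
```

With the original value "$ORIGIN/../lib:" the result has two entries, the second being the working directory. With the patched value "$ORIGIN/../lib" only the lib directory next to the binary remains.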
upd: RTFM
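I could not quite keep a straight face about this, so I filed an issue against clangd.
upd: A maintainer explained that the colon is produced by a CMake feature (this and this); see the issue for details.
References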
https://man7.org/training/download/shlib_dynlinker_slides.pdf
https://stackoverflow.com/a/16922836
https://www.gnu.org/software/bash/manual/bash.html#Bourne-Shell-Variables