Chen Jiehua

Debugging Python Shared Libraries with gdb

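In day-to-day Python development, performance-critical logic is often implemented in C/C++ and then called from the Python script layer. How do we debug the shared library that the C/C++ code compiles to?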

Debugging a shared library

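On Linux, the shared library that C/C++ code compiles to is the familiar .so file. Debugging a shared library is much like debugging an executable. The main difference is that a breakpoint in the library can only be resolved once the library has been loaded into memory, so we debug it indirectly, by debugging the program that loads the .so.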
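To confirm that a library such as demo.so has actually been loaded into the target process before setting breakpoints, one option on Linux is to read /proc/<pid>/maps. The following is a minimal sketch under that assumption; the helper names `loaded_shared_objects` and `is_loaded` are made up for this illustration.

```python
import os


def loaded_shared_objects(maps_text):
    """Return the set of .so paths named in the text of /proc/<pid>/maps."""
    paths = set()
    for line in maps_text.splitlines():
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            name = os.path.basename(path)
            # Match both "demo.so" and versioned names like "libc.so.6".
            if name.endswith(".so") or ".so." in name:
                paths.add(path)
    return paths


def is_loaded(pid, library_name):
    """True if a mapped shared object has this file name (Linux only)."""
    with open("/proc/%d/maps" % pid) as f:
        text = f.read()
    return any(os.path.basename(p) == library_name
               for p in loaded_shared_objects(text))
```

For example, `is_loaded(pid, "demo.so")` should only become true after the script has executed `import demo`.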

An example

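We wrote a simple Python module in C++. It uses the Python 2 C API (Py_InitModule and an initdemo entry point). The source, library.cpp, is as follows: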

#include <Python.h>

static PyObject* add(PyObject* self, PyObject* args){
    int a, b;
    if (!PyArg_ParseTuple(args, "ii", &a, &b)){
        return nullptr;
    }
    int sum = a + b;
    PyObject* ret = PyLong_FromLong(sum);
    return ret;
}

static PyMethodDef MyDemoMethods[] = {
        {"addx", add, METH_VARARGS, "add two integers"},
        {nullptr, nullptr, 0, nullptr},
};

PyMODINIT_FUNC initdemo(void){
    (void) Py_InitModule("demo", MyDemoMethods);
}

CMakeLists.txt is as follows:

cmake_minimum_required(VERSION 3.13)  # the -B option used to configure the build needs CMake 3.13+
project(demo)

set(CMAKE_CXX_STANDARD 11)
find_package(PythonLibs 2.7 REQUIRED)
message(STATUS "Python Include = ${PYTHON_INCLUDE_DIRS}")
include_directories(${PYTHON_INCLUDE_DIRS})

add_library(${PROJECT_NAME} SHARED library.cpp)
set_target_properties(${PROJECT_NAME} PROPERTIES PREFIX "")

Building it produces the shared library demo.so:

$cmake -G "Unix Makefiles" -D CMAKE_BUILD_TYPE=Debug -B build
$cd build && make

Next, a short Python script to call it. The source, mytest.py, is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import time
import demo

def func():
    for i in range(99999):
        print(demo.addx(i, i**2))
        time.sleep(1)


def main():
    func()


if __name__ == "__main__":
    main()
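One thing worth checking before leaving this script running for long: addx parses its arguments with the "ii" format, that is, as C ints. Assuming a 32-bit int, and assuming PyArg_ParseTuple rejects out-of-range values with an OverflowError (my reading of the documented behaviour of the `i` format unit), the loop cannot reach 99999. The sketch below works out the first i whose square no longer fits; `first_rejected_i` is a hypothetical helper name.

```python
INT_MAX = 2 ** 31 - 1  # assumption: C int is 32 bits on this platform


def first_rejected_i(limit=INT_MAX):
    """Smallest i >= 0 whose square no longer fits in a C int."""
    i = int(limit ** 0.5)
    # Correct any floating-point rounding in the square root estimate.
    while i * i > limit:
        i -= 1
    while (i + 1) * (i + 1) <= limit:
        i += 1
    # i is now the largest value with i*i <= limit.
    return i + 1


print(first_rejected_i())  # -> 46341
```

Under these assumptions the script would fail at i = 46341, which is far more iterations than a debugging session needs, but it explains a late OverflowError if one shows up.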

Starting the debugger

We can launch python under gdb:

$gdb python
(gdb) run mytest.py
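When python is launched from gdb like this, demo.so is not loaded until the script executes `import demo`, so a breakpoint in library.cpp cannot be resolved up front. gdb can keep such a breakpoint pending until the library appears. Below is a sketch of a wrapper that builds the corresponding gdb command line; `gdb_launch_argv` is a hypothetical helper, and the flags used (`-ex`, `--args`, `set breakpoint pending on`) are standard gdb options.

```python
import subprocess


def gdb_launch_argv(script, breakpoints, python="python"):
    """Build a gdb command line that sets pending breakpoints, then runs."""
    # Accept breakpoints in libraries that are not loaded yet without prompting.
    argv = ["gdb", "-ex", "set breakpoint pending on"]
    for location in breakpoints:
        argv += ["-ex", "break " + location]
    # --args must come last: everything after it is the program and its arguments.
    argv += ["-ex", "run", "--args", python, script]
    return argv


if __name__ == "__main__":
    # Starts an interactive gdb session; requires gdb on PATH.
    subprocess.call(gdb_launch_argv("mytest.py", ["library.cpp:8"]))
```

The attach workflow avoids the issue altogether, because the library is already loaded by the time gdb connects.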

Or attach directly to a process that is already running:

$nohup python mytest.py &
$ps ux | grep "python mytest"
$gdb
(gdb) attach <target pid>
Attaching to process xxx
Reading symbols from /usr/bin/python2.7...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Reading symbols from /lib/x86_64-linux-gnu/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libutil.so.1...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libz.so.1...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libm.so.6...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Reading symbols from /home/jachua/learncpp/pymodule/demo/demo.so...done.
Reading symbols from /lib/x86_64-linux-gnu/libstdc++.so.6...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libgcc_s.so.1...(no debugging symbols found)...done.
0x00007f4261868ff7 in select () from /lib/x86_64-linux-gnu/libc.so.6

Setting breakpoints

First, look at the source:

(gdb) list library.cpp:add
1       #include <Python.h>
2
3       static PyObject* add(PyObject* self, PyObject* args){
4           int a, b;
5           if (!PyArg_ParseTuple(args, "ii", &a, &b)){
6               return nullptr;
7           }
8           int sum = a + b;
9           PyObject* ret = PyLong_FromLong(sum);
10          return ret;

Then set breakpoints. Line 4 only declares variables, so gdb places the second breakpoint on line 5:

(gdb) break library.cpp:8
Breakpoint 1 at 0x7f4261d0f165: file /home/jachua/learncpp/pymodule/demo/library.cpp, line 8.
(gdb) break library.cpp:4
Breakpoint 2 at 0x7f4261d0f135: file /home/jachua/learncpp/pymodule/demo/library.cpp, line 5.

List all breakpoints:

(gdb) info breakpoints
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00007f4261d0f165 in add(PyObject*, PyObject*) at /home/jachua/learncpp/pymodule/demo/library.cpp:8
2       breakpoint     keep y   0x00007f4261d0f135 in add(PyObject*, PyObject*) at /home/jachua/learncpp/pymodule/demo/library.cpp:5

Resume the program:

(gdb) continue

Execution stops when the program reaches a breakpoint:

Breakpoint 2, add (self=0x0, args=0x7f4261609f38) at /home/jachua/learncpp/pymodule/demo/library.cpp:5
5           if (!PyArg_ParseTuple(args, "ii", &a, &b)){

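Inspect variables and single-step. At line 5, PyArg_ParseTuple has not run yet, so a and b still hold uninitialized garbage; they show the real arguments once execution reaches line 8: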

(gdb) print a
$1 = 21991

(gdb) print b
$2 = -192616879

(gdb) display a
1: a = 21991

(gdb) display b
2: b = -192616879

(gdb) next

Breakpoint 1, add (self=0x0, args=0x7f4261609f38) at /home/jachua/learncpp/pymodule/demo/library.cpp:8
8           int sum = a + b;
1: a = 18
2: b = 324

(gdb) display sum
3: sum = 21991

We can also look at the call stack:

(gdb) backtrace
#0  add (self=0x0, args=0x7f4261609f38) at /home/jachua/learncpp/pymodule/demo/library.cpp:8
#1  0x000055e7f487cdaa in PyEval_EvalFrameEx ()
#2  0x000055e7f48823ba in PyEval_EvalFrameEx ()
#3  0x000055e7f48823ba in PyEval_EvalFrameEx ()
#4  0x000055e7f487a866 in PyEval_EvalCodeEx ()
#5  0x000055e7f487a1f9 in PyEval_EvalCode ()
#6  0x000055e7f48ace2f in ?? ()
#7  0x000055e7f48a7d20 in PyRun_FileExFlags ()
#8  0x000055e7f48a76ca in PyRun_SimpleFileExFlags ()
#9  0x000055e7f4848188 in Py_Main ()
#10 0x00007f426179c09b in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#11 0x000055e7f4847aea in _start ()

Generate a core file:

(gdb) generate-core-file
warning: target file /proc/1528/cmdline contained unexpected null characters
Saved corefile core.1528

Debugging a core dump

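When a process exits abnormally, it usually produces a core dump file. If it does not, check whether ulimit is set correctly (here the core file size limit is 0, which disables core dumps):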

$ulimit -a 
-t: cpu time (seconds)              unlimited
-f: file size (blocks)              unlimited
-d: data seg size (kbytes)          unlimited
-s: stack size (kbytes)             8192
-c: core file size (blocks)         0
-m: resident set size (kbytes)      unlimited
-u: processes                       31797
-n: file descriptors                1024
-l: locked-in-memory size (kbytes)  65536
-v: address space (kbytes)          unlimited
-x: file locks                      unlimited
-i: pending signals                 31797
-q: bytes in POSIX msg queues       819200
-e: max nice                        0
-r: max rt priority                 0
-N 15:                              unlimited

# set the core file size limit to unlimited
$ulimit -c unlimited 
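The same limit can also be raised from inside a Python process with the standard `resource` module, which is handy when the interpreter is started by a service manager rather than an interactive shell. This is a minimal sketch; `enable_core_dumps` is a hypothetical helper, and it can only raise the soft limit up to the existing hard limit.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-size limit to the hard limit; return (soft, hard)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    if soft != hard:
        # An unprivileged process may raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = enable_core_dumps()
    unlimited = soft == resource.RLIM_INFINITY
    print("core limit:", "unlimited" if unlimited else soft)
```

The change applies to the calling process and the children it starts afterwards, not to processes that are already running.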

Now let's debug the core file we generated by hand earlier:

$gdb python -c core.1528
......
Reading symbols from python...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 1528]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `python'.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0  add (self=0x0, args=0x7f4261609f38) at /home/jachua/learncpp/pymodule/demo/library.cpp:8
8           int sum = a + b;

We can then inspect the call stack at that moment, as well as the relevant variables:

(gdb) bt
#0  add (self=0x0, args=0x7f4261609f38) at /home/jachua/learncpp/pymodule/demo/library.cpp:8
#1  0x000055e7f487cdaa in PyEval_EvalFrameEx ()
#2  0x000055e7f48823ba in PyEval_EvalFrameEx ()
#3  0x000055e7f48823ba in PyEval_EvalFrameEx ()
#4  0x000055e7f487a866 in PyEval_EvalCodeEx ()
#5  0x000055e7f487a1f9 in PyEval_EvalCode ()
#6  0x000055e7f48ace2f in ?? ()
#7  0x000055e7f48a7d20 in PyRun_FileExFlags ()
#8  0x000055e7f48a76ca in PyRun_SimpleFileExFlags ()
#9  0x000055e7f4848188 in Py_Main ()
#10 0x00007f426179c09b in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#11 0x000055e7f4847aea in _start ()

(gdb) p a
$1 = 20
(gdb) p b
$2 = 400

Python call stack

In the example above we can only see the C-level call stack, with no detail about the Python side. Installing python-dbg solves this.

$sudo apt update 
$sudo apt install python-dbg

Debug again, and we can see that python's symbol table is now loaded:

$gdb python -c core.1528
......
Reading symbols from python...Reading symbols from /usr/lib/debug/.build-id/a5/9cad4ea461069ab45e846552c0dc6ed45ef466.debug...done.
done.

warning: core file may not match specified executable file.
[New LWP 1528]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `python'.
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
#0  add (self=0x0, args=0x7f4261609f38) at /home/jachua/learncpp/pymodule/demo/library.cpp:8
8           int sum = a + b;

Now we can see the Python code and call stack directly:

(gdb) py-list
   6    import demo
   7
   8
   9    def func():
  10        for i in range(100000):
 >11            print(demo.addx(i, i**2))
  12            time.sleep(1)
  13
  14
  15    def main():
  16        func()

(gdb) py-bt
Traceback (most recent call first):
  File "mytest.py", line 11, in func
    print(demo.addx(i, i**2))
  File "mytest.py", line 16, in main
    func()
  (frame information optimized out)

Strip Symbol

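When CMake is run with -DCMAKE_BUILD_TYPE=Debug, or the compiler is invoked with -g, the compiled objects contain debug symbols and the resulting .so file is noticeably larger. (The .debug_* sections themselves are normally not mapped into memory at run time, so the cost is mainly file size.)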

In production we usually remove the debug information, yet when something goes wrong we still need to debug. What to do? objcopy and strip handle this easily.

First, compare two builds of the .so, one with debug information and one without:

# build in both Debug and Release mode
$cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Debug -B Debug && make -C Debug
$cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Release -B Release && make -C Release

Check the file sizes, and inspect the objects with the file command:

// Debug:
-rwxr-xr-x 1 jachua jachua 50472 Oct 20 10:18 demo.so
demo.so: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=6110e1c062ca76ca60fcc347e8408b16527049d1, with debug_info, not stripped

// Release:
-rwxr-xr-x 1 jachua jachua 16272 Oct 20 10:18 demo.so
demo.so: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=67323f051e2a64966f604ea315301b7e8be59ae2, not stripped
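The "with debug_info" note in the file output corresponds to the presence of a .debug_info section in the ELF file. As a rough illustration of what is being checked, here is a sketch that lists section names by reading the ELF headers directly. It assumes a little-endian 64-bit ELF such as the x86-64 demo.so above and ignores corner cases (for example files with extended section numbering); `elf64_section_names` and `has_debug_info` are hypothetical helper names.

```python
import struct


def elf64_section_names(data):
    """Return the section names of a little-endian ELF64 image (bytes)."""
    # EI_CLASS == 2 means 64-bit, EI_DATA == 1 means little-endian.
    if data[:4] != b"\x7fELF" or data[4:6] != b"\x02\x01":
        raise ValueError("not a little-endian ELF64 file")
    # Elf64_Ehdr: e_shoff at 0x28; e_shentsize, e_shnum, e_shstrndx at 0x3A.
    (shoff,) = struct.unpack_from("<Q", data, 0x28)
    shentsize, shnum, shstrndx = struct.unpack_from("<HHH", data, 0x3A)

    def header(index):
        # Elf64_Shdr: sh_name at +0x00, sh_offset at +0x18, sh_size at +0x20.
        base = shoff + index * shentsize
        (name_off,) = struct.unpack_from("<I", data, base)
        offset, size = struct.unpack_from("<QQ", data, base + 0x18)
        return name_off, offset, size

    # The section-name string table is itself a section.
    _, str_off, str_size = header(shstrndx)
    strtab = data[str_off:str_off + str_size]

    names = []
    for i in range(shnum):
        name_off = header(i)[0]
        end = strtab.index(b"\0", name_off)
        names.append(strtab[name_off:end].decode("ascii"))
    return names


def has_debug_info(path):
    """True if the ELF file at path contains a .debug_info section."""
    with open(path, "rb") as f:
        return ".debug_info" in elf64_section_names(f.read())
```

With the two builds above, `has_debug_info("Debug/demo.so")` would be expected to return True and `has_debug_info("Release/demo.so")` False. In practice `readelf -S demo.so` gives the same information.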

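We can now use objcopy and strip to slim down the library built in Debug mode. The first command copies the debug information into demo.so.dbg, the second removes it from demo.so, and the third records a link to demo.so.dbg inside demo.so so that gdb can find the symbols later: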

$objcopy --only-keep-debug demo.so demo.so.dbg
$objcopy --strip-debug demo.so
$objcopy --add-gnu-debuglink=demo.so.dbg demo.so

The strip-debug step can also be done with the strip command:

$strip --strip-debug --strip-unneeded demo.so
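The three objcopy steps above are easy to get in the wrong order when done by hand, since the debug information has to be extracted before it is stripped. A small wrapper along these lines could run them as a post-build step; it is only a sketch, the function names are hypothetical, and it assumes objcopy is on PATH.

```python
import subprocess


def split_debug_commands(so_path):
    """Return the objcopy invocations that split debug info out of so_path."""
    dbg_path = so_path + ".dbg"
    return [
        # 1. Copy only the debug sections into a separate file.
        ["objcopy", "--only-keep-debug", so_path, dbg_path],
        # 2. Remove the debug sections from the library itself.
        ["objcopy", "--strip-debug", so_path],
        # 3. Record the name of the debug file inside the library.
        ["objcopy", "--add-gnu-debuglink=" + dbg_path, so_path],
    ]


def split_debug(so_path):
    """Run the three steps in order, stopping at the first failure."""
    for argv in split_debug_commands(so_path):
        subprocess.check_call(argv)
```

For instance, `split_debug("demo.so")` run in the build directory leaves a slim demo.so next to demo.so.dbg. As far as I know, gdb resolves the debug link by looking for the file next to the library, in a `.debug` subdirectory, and under its global debug directory, so the .dbg file only needs to be copied to the machine when a debugging session is actually required.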

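Writing takes effort; if you repost this article, please credit ChenJiehua, "Debugging Python Shared Libraries with gdb" (《gdb调试Python动态链接库》).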
