feat(ci): attaching GDB to tests slows down the build process too much
Description
This issue records what has been tried so far to address the underlying problem (slow CI).
On the CI, when the tests are executed, the launch scripts (template in cmake/build/linux/template_test.sh.in) use GDB to catch crashes, print the backtrace, and generate a core file that is stored as a build artifact.
This is done because core generation is controlled by kernel settings, and we run tests in a dockerized gitlab-runner environment, which prevents ulimit from working as expected.
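For reference, the GDB-based approach described above can be sketched roughly as follows. This is a minimal sketch, not the content of template_test.sh.in, which is not reproduced in this issue and may differ. The function name run_under_gdb, the GDB override variable, and the core file argument are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a test binary under GDB in batch mode.
# Usage: run_under_gdb <core-file-path> <program> [args...]
run_under_gdb() {
    core_file="$1"
    shift
    # -batch               : run the given commands, then exit without prompting
    # -return-child-result : exit with the child's exit code when it terminates
    # -ex "run"            : start the program
    # -ex "bt"             : print a backtrace (only meaningful after a crash)
    # -ex "generate-core-file ..." : write a core file from the stopped process
    # The GDB variable is an assumption that allows substituting another binary.
    "${GDB:-gdb}" -batch -return-child-result \
        -ex "run" \
        -ex "bt" \
        -ex "generate-core-file ${core_file}" \
        --args "$@"
}
```

A call would look like run_under_gdb /tmp/test.core ./a.out 1 2 3. When the program exits normally, the bt and generate-core-file commands have no process to act on and only print an error.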
Proposal
Core generation by the kernel
Normally, you just have to specify ulimit -c unlimited. Unfortunately, we have three problems:
- We are on Ubuntu, and Ubuntu uses a special crash handler, apport, that intercepts core file creation. You may need to disable it (see https://askubuntu.com/questions/966407/where-do-i-find-the-core-dump-in-ubuntu-16-04lts). You may also need to alter the default core location (sysctl kernel.core_pattern) and edit /etc/sysctl.conf.
- We use docker, and docker shares the host kernel, but some configuration must still be done. A good starting point is https://ddanilov.me/how-to-configure-core-dump-in-docker-container. Basically, the option we need is --ulimit core=-1, passed when launching the docker image (docker run ---), and that leads to what blocks us:
- gitlab-runner is the one that launches the docker image, so it must pass the ulimit configuration to docker. Unfortunately, there is no such option (see https://gitlab.com/gitlab-org/gitlab-runner/-/issues/4482).
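To see which of the three problems applies in a given environment, the relevant settings can be inspected from a shell. The following is a small diagnostic sketch. The function names are hypothetical, and it assumes a Linux system where /proc/sys/kernel/core_pattern is readable and a shell that supports ulimit -c (bash and dash do).

```shell
#!/bin/sh
# Classify a kernel.core_pattern value. A leading '|' means the kernel pipes
# the core to a user-space handler (apport does this on Ubuntu) instead of
# writing a file directly.
core_pattern_kind() {
    case "$1" in
        '|'*) echo "piped" ;;
        /*)   echo "absolute-path" ;;
        *)    echo "relative-path" ;;
    esac
}

# Report the settings that control kernel core generation for this shell.
report_core_settings() {
    echo "core size limit (ulimit -c): $(ulimit -c)"
    if [ -r /proc/sys/kernel/core_pattern ]; then
        pattern="$(cat /proc/sys/kernel/core_pattern)"
        echo "kernel.core_pattern: ${pattern} ($(core_pattern_kind "$pattern"))"
    else
        echo "kernel.core_pattern: not readable on this system"
    fi
}

report_core_settings
```

Running the script on the host, inside a container started with docker run --ulimit core=-1, and inside a CI job allows the three outputs to be compared.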
Maybe setting the ulimit on the host permanently, in /etc/sysctl.conf and/or /etc/security/limits.conf, and mounting a special directory for core files, with the same path as the one in /etc/sysctl.conf, on both the host and the docker image, could potentially work. This is rather complicated, to say the least...
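The host-side idea above could look roughly like the following. This is an untested sketch: the directory /var/cores, the CORE_DIR variable, and the function names are assumptions, and whether limits set this way actually reach processes started by gitlab-runner inside docker has not been verified. The script only prints the lines to add and does not modify any file.

```shell
#!/bin/sh
# Hypothetical directory for core files. It must exist at the same path on the
# host and inside the container (for example via a bind mount such as
# -v /var/cores:/var/cores), because kernel.core_pattern is a host-wide setting.
CORE_DIR="${CORE_DIR:-/var/cores}"

# Print the line to add to /etc/sysctl.conf.
# %e = executable name, %p = PID, %t = time of the dump (see core(5)).
sysctl_core_line() {
    printf 'kernel.core_pattern = %s/core.%%e.%%p.%%t\n' "$1"
}

# Print the lines to add to /etc/security/limits.conf
# (format: <domain> <type> <item> <value>).
limits_core_lines() {
    printf '* soft core unlimited\n'
    printf '* hard core unlimited\n'
}

echo "# /etc/sysctl.conf"
sysctl_core_line "$CORE_DIR"
echo "# /etc/security/limits.conf"
limits_core_lines
```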
Use LLDB
LLDB is often seen as a faster alternative to GDB with the same feature set. It is indeed faster, with an almost instant start, and it can launch a process under the debugger so that the backtrace is printed on a crash. Use lldb -b --one-line "run" --one-line-on-crash "bt" --one-line-on-crash "quit 1" -- ./a.out 1 2 3 for that. Unfortunately, there are two problems:
- The core saving feature only works on Apple platforms; the ELF format is not supported. Normally, you would just add --one-line-on-crash "process save-core /tmps/xxx.core" to the previous command, but right now all you get is the error message "error: Failed to save core file for process: no ObjectFile plugins were able to save a core for this process". We have to wait for the LLDB developers or contribute the feature ourselves.
- There is no way to forward the exit code of the attached process. For a failing test that does not crash, this is catastrophic, because the test will be reported as successful.
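One possible way around the exit code problem, which has not been tried in this issue, is to run the test directly and only re-run it under LLDB when the first run was killed by a signal. The sketch below assumes that crashes are reproducible on a second run and that the shell reports a signal death as 128 + N (bash and dash do). The function name run_test_with_fallback and the LLDB override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run the test without a debugger, keep its exit status,
# and only re-run it under LLDB to get a backtrace when it died from a signal.
run_test_with_fallback() {
    status=0
    "$@" || status=$?
    # A status above 128 is treated as "killed by signal (status - 128)".
    # A program that deliberately exits with such a value is misclassified.
    if [ "$status" -gt 128 ]; then
        echo "test killed by signal $((status - 128)), re-running under LLDB" >&2
        "${LLDB:-lldb}" -b \
            --one-line "run" \
            --one-line-on-crash "bt" \
            --one-line-on-crash "quit 1" \
            -- "$@" >&2 || true
    fi
    # Always report the status of the first, undebugged run.
    return "$status"
}
```

A call such as run_test_with_fallback ./a.out 1 2 3 then returns the test's own exit status, so a failing test that does not crash is still reported as failed. It does not address the missing core file on ELF targets.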
For now we have postponed the task, as the burden of fixing this outweighs the potential benefit we may gain in terms of speed.
Functional specifications
Workflow, UX/UI design, screenshots, etc...
Technical specifications
Details of the implementation
Test plan
Describe how you will verify that the implementation fulfils the specifications