How good are the tests of popular open source projects?

2022-03-06

I have already written about mutation testing of projects written in C and C++. The first post grew out of my acquaintance with the Mull project. At that time the Mull authors had just released the first version of their LLVM-based mutation testing tool, and I wanted to try it on a network analyzer for industrial networks. In the second post I described my experience applying mutation testing to an OS kernel. I could not use Mull there because the project did not support the Clang compiler, so that time I used Frama-C to generate mutants and my own scripts to automate test runs and build reports.

A lot of time has passed since then and Mull has changed a great deal. It now supports incremental testing (only new changes are mutated), it can run tests written in scripting languages (previously it could only run unit tests), it can produce reports in the mutation testing elements format, and it has gained many other interesting features.

In this post I describe using Mull on popular open source projects. I had two goals when writing it: to show how good the tests in these projects are, and to show how relatively easy it is to set up mutation testing with a variety of build systems. For every project I list the commands needed to reproduce the run, for anyone interested. I will be very glad if this article prompts the developers of the projects described here to adopt mutation testing on a regular basis. I will add write-ups for other projects as time permits. This post was last updated on 6 March 2022.

SQLite (SUCCESS)

There is a so-called sqlite-amalgamation, a single-file distribution of the library. The SQLite documentation states that "the use of the amalgamation is recommended for all applications".

gcc shell.c sqlite3.c -lpthread -ldl

SQLite is one of the best-known examples of a well-tested open source project. Its test code is about two and a half times larger than the code of the library itself (684 KLOC vs 273 KLOC). It is worth adding that SQLite already uses mutation testing: https://www.sqlite.org/testing.html#mutation_testing. Instead of LLVM IR, SQLite mutates assembly code: the C source is translated into assembly, a mutation is applied, the result is compiled, and the tests are run.

gcc -S -o helloworld.s helloworld.c

That makes it all the more interesting for us. Let's use it to demonstrate mutation testing and try to find bugs in it. (FIXME: are the tests for SQLite unavailable?)

https://www.sqlite.org/testing.html

Using gcov (or similar) to show that every branch instruction is taken at least once in both directions is good measure of test suite quality. But even better is showing that every branch instruction makes a difference in the output. In other words, we want to show not only that every branch instruction both jumps and falls through but also that every branch is doing useful work and that the test suite is able to detect and verify that work. When a branch is found that does not make a difference in the output, that suggests that the code associated the branch can be removed (reducing the size of the library and perhaps making it run faster) or that the test suite is inadequately testing the feature that the branch implements.

SQLite strives to verify that every branch instruction makes a difference using mutation testing. A script first compiles the SQLite source code into assembly language (using, for example, the -S option to gcc). Then the script steps through the generated assembly language and, one by one, changes each branch instruction into either an unconditional jump or a no-op, compiles the result, and verifies that the test suite catches the mutation.

Unfortunately, SQLite contains many branch instructions that help the code run faster without changing the output. Such branches generate false-positives during mutation testing. As an example, consider the following hash function used to accelerate table-name lookup:

static unsigned int strHash(const char *z){
  unsigned int h = 0;
  unsigned char c;
  while( (c = (unsigned char)*z++)!=0 ){     /*OPTIMIZATION-IF-TRUE*/
    h = (h<<3) ^ h ^ sqlite3UpperToLower[c];
  }
  return h;
}

If the branch instruction that implements the c != 0 test in the while-loop condition is changed into a no-op, then the while-loop will loop forever and the test suite will fail with a time-out. But if that branch is changed into an unconditional jump, then the hash function will always return 0. The problem is that 0 is a valid hash. A hash function that always returns 0 still works in the sense that SQLite still always gets the correct answer. The table-name hash table degenerates into a linked-list and so the table-name lookups that occur while parsing SQL statements might be a little slower, but the end result will be the same.

To work around this problem, comments of the form “/*OPTIMIZATION-IF-TRUE*/” and “/*OPTIMIZATION-IF-FALSE*/” are inserted into the SQLite source code to tell the mutation testing script to ignore some branch instructions.
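The strHash case can be reproduced in miniature. The sketch below is not SQLite code: `ChainedTable`, `str_hash`, `zero_hash` and the table names are hypothetical and exist only for illustration. It shows why a mutant that makes the hash function always return 0 cannot be detected by tests that only check lookup results: every key lands in one bucket, lookups get slower, but the answers stay the same.

```python
from typing import Callable, List, Optional, Tuple


def str_hash(name: str) -> int:
    """Case-insensitive hash shaped like strHash, truncated to 32 bits."""
    h = 0
    for byte in name.lower().encode("ascii"):
        h = ((h << 3) ^ h ^ byte) & 0xFFFFFFFF
    return h


def zero_hash(name: str) -> int:
    """The 'mutant': the loop is skipped, so the hash is always 0."""
    return 0


class ChainedTable:
    """Toy hash table with separate chaining (hypothetical, not SQLite's)."""

    def __init__(self, hash_fn: Callable[[str], int], n_buckets: int = 8) -> None:
        self.hash_fn = hash_fn
        self.buckets: List[List[Tuple[str, int]]] = [[] for _ in range(n_buckets)]

    def _bucket(self, key: str) -> List[Tuple[str, int]]:
        return self.buckets[self.hash_fn(key) % len(self.buckets)]

    def put(self, key: str, value: int) -> None:
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing.lower() == key.lower():
                bucket[i] = (existing, value)
                return
        bucket.append((key, value))

    def get(self, key: str) -> Optional[int]:
        for existing, value in self._bucket(key):
            if existing.lower() == key.lower():
                return value
        return None

    def longest_chain(self) -> int:
        return max(len(bucket) for bucket in self.buckets)


NAMES = ["users", "orders", "items", "logs", "audit"]


def build(hash_fn: Callable[[str], int]) -> ChainedTable:
    table = ChainedTable(hash_fn)
    for i, name in enumerate(NAMES):
        table.put(name, i)
    return table


original = build(str_hash)
mutant = build(zero_hash)
queries = NAMES + ["USERS", "missing"]

# Functional behaviour is identical; only the chain length differs.
same_answers = all(original.get(q) == mutant.get(q) for q in queries)
print(same_answers, original.longest_chain(), mutant.longest_chain())
```

A test suite could only catch this mutant by measuring performance or inspecting the bucket layout, which is why such branches are excluded with marker comments.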

https://www.sqlite.org/th3.html#muttest

The TH3 source tree contains a script named “mutation-test.tcl” that automates the process of mutation testing.

The mutation-test.tcl script takes care of all of the details for running a mutation test:

1. The script compiles the TH3 test harness into machine code (“th3.o”) if necessary.
2. The script compiles the sqlite3.c source file into assembly language (sqlite3.s) if necessary.
3. The script loops through instructions in the assembly language file to locate branch operations.
   a. The script makes a copy of the original sqlite3.s file.
   b. The copy is edited to change the branch instruction into either a no-op or an unconditional jump.
   c. The copy of sqlite3.s is assembled into sqlite3.o then linked against th3.o to generate the “th3” executable.
   d. The “th3” binary is run and the output checked for errors.
4. The script shows progress for each cycle of the previous step then displays a summary of “survivors” at the end. A “survivor” is a mutation that was not detected by TH3.
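The mutation step of the loop described above can be sketched as follows. This is not the TH3 script, which is not publicly available; `branch_mutants`, `BRANCH_RE` and the sample assembly are hypothetical, and the regular expression covers only a small subset of x86 conditional jumps. For every conditional branch the sketch produces two mutated copies of the listing: one with the branch replaced by a no-op and one with an unconditional jump. Assembling, linking and running the tests are left out.

```python
import re
from typing import Iterator, List, Tuple

# A small, illustrative subset of x86 conditional jump mnemonics.
BRANCH_RE = re.compile(
    r"^(\s*)(je|jne|jz|jnz|jg|jge|jl|jle|ja|jae|jb|jbe)\s+(\S+)\s*$"
)


def branch_mutants(asm_lines: List[str]) -> Iterator[Tuple[int, str, List[str]]]:
    """Yield (line_index, kind, mutated_copy) for each conditional branch.

    Each branch yields two mutants: kind "nop" (never jump) and
    kind "jmp" (always jump). The input list is never modified.
    """
    for index, line in enumerate(asm_lines):
        match = BRANCH_RE.match(line)
        if match is None:
            continue
        indent, _mnemonic, target = match.groups()
        replacements = (
            ("nop", f"{indent}nop"),
            ("jmp", f"{indent}jmp {target}"),
        )
        for kind, replacement in replacements:
            mutated = list(asm_lines)  # work on a copy, as the script does
            mutated[index] = replacement
            yield index, kind, mutated


SAMPLE = [
    "\tcmpb\t$0, %al",
    "\tjne\t.L3",
    "\tmovl\t%edx, %eax",
    "\tret",
]

mutants = list(branch_mutants(SAMPLE))
for index, kind, mutated in mutants:
    print(index, kind, repr(mutated[index]))
```

In a real pipeline each mutated copy would be assembled, linked against the test harness and executed, and a mutant whose test run reports no errors would be recorded as a survivor.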

Mutation testing can be slow, since each test can take up to 5 minutes on a fast workstation, and there are two tests for each branch instruction, and over 20,000 branch instructions. Efforts are made to expedite operation. For example, TH3 is compiled in such a way that it exits as soon as it finds the first error, and as many of the mutations are easily detected, many cycles happen in only a few seconds. Nevertheless, the mutation-test.tcl script includes command-line options to limit the range of code lines tested so that mutation testing only needs to be performed on blocks of code that have recently changed.
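To get a feel for the scale, the figures quoted above can be multiplied out. The helper below is only a back-of-the-envelope estimate: it assumes every run takes the full 5 minutes and that runs are sequential, which is the worst case rather than what happens in practice, since most mutants are detected within seconds.

```python
def worst_case_days(branches: int, mutants_per_branch: int, minutes_per_run: float) -> float:
    """Upper bound on sequential mutation testing time, in days."""
    total_minutes = branches * mutants_per_branch * minutes_per_run
    return total_minutes / 60 / 24


# 20,000 branches, two mutants each, up to 5 minutes per test run.
days = worst_case_days(20_000, 2, 5)
print(f"{days:.1f} days")  # prints "138.9 days"
```

An upper bound of several months explains why limiting the run to recently changed lines matters.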

Framework: https://cris.vtt.fi/en/publications/mut-tools-mutation-testing-for-cc-programs https://www.vttresearch.com/en/project_news/efficient-mutation-testing-tools-cc-programs

How-to https://gist.github.com/ligurio/c7af2506ddd210a9efb62f82f1dc63a0#sqlite-success

Results https://bronevichok.ru/static/report-sqlite3-91f621531.html

14567 mutants killed!
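For reference, here is a minimal sketch of how a mutation score is derived from mutant counts, following the metric definitions of the mutation testing elements format used for the reports. The function name is hypothetical, and every count except 14567 is made up for illustration.

```python
def mutation_score(killed: int, survived: int, timeout: int = 0, no_coverage: int = 0) -> float:
    """Mutation score in percent: detected mutants divided by valid mutants.

    detected   = killed + timeout
    undetected = survived + no_coverage
    valid      = detected + undetected
    """
    detected = killed + timeout
    undetected = survived + no_coverage
    valid = detected + undetected
    if valid == 0:
        raise ValueError("no valid mutants, the score is undefined")
    return 100.0 * detected / valid


print(mutation_score(killed=14567, survived=0))  # every mutant killed
print(mutation_score(killed=90, survived=10))    # hypothetical counts
```

Mutants that time out count as detected, while mutants in code that no test executes count against the score.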

Note: Mull takes about 16 GB of RAM for SQLite testing
Code coverage: https://www.opencoverage.net/sqlite/index_html/index.html
How to build: https://www.sqlite.org/howtocompile.html
Run tests (TH3 is not publicly available): https://www.sqlite.org/th3.html

Questions

How does Mull figure out whether integration tests can be run concurrently?
How do I pass parameters to mull-runner, for example for make check?
A binary must already exist before mull-runner is run.
If the test script does not exist, Mull will treat all mutants as not killed.
“[warning] Cannot read coverage info: Profile uses zlib compression but the profile reader was built without zlib support”
The LLVM version used by clang must be newer than or equal to the LLVM version Mull was built with, see https://github.com/mull-project/mull/issues/902
make sure run.sh works fine, otherwise you will get unkilled mutants!
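Since a missing or failing test program skews the results, it is worth checking the baseline before starting a long mutation run. The helper below is a hypothetical pre-flight check, not part of Mull: it only verifies that the test command can be found and that it exits with status 0 on the unmutated build.

```python
import shutil
import subprocess
import sys
from typing import List, Sequence


def check_test_program(cmd: Sequence[str], timeout: float = 600.0) -> List[str]:
    """Return a list of problems; an empty list means the baseline looks sane."""
    if shutil.which(cmd[0]) is None:
        return [f"test program not found or not executable: {cmd[0]}"]
    try:
        result = subprocess.run(
            list(cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return [f"baseline test run did not finish within {timeout} seconds"]
    if result.returncode != 0:
        return [f"baseline test run failed with exit code {result.returncode}"]
    return []


# Stand-ins for a real test command such as ["./run.sh", "apps/openssl"].
passing = check_test_program([sys.executable, "-c", "raise SystemExit(0)"])
failing = check_test_program([sys.executable, "-c", "raise SystemExit(1)"])
missing = check_test_program(["no-such-test-program-xyz"])
print(passing, failing, missing)
```

If the list is not empty, the mutation report should not be trusted until the test command itself is fixed.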

General

Mull installation https://mull.readthedocs.io/en/latest/Installation.html
- curl -1sLf 'https://dl.cloudsmith.io/public/mull-project/mull-nightly/setup.deb.sh' | sudo -E bash
- sudo apt-get update
- sudo apt-get install -y mull
Create a list with all dependencies
- https://github.com/mull-project/mull/issues/861
- https://cmake.org/cmake/help/latest/command/file.html#get-runtime-dependencies
Run HTTP server
- python3 -m http.server --directory .
- chromium report.html
General
- https://mull.readthedocs.io/en/latest/tutorials/ControlMutationsTutorial.html
- Using Mull (https://mull.readthedocs.io/) with Tarantool testing.
- Support tests written in interpreted languages https://github.com/mull-project/mull/issues/778
- https://stryker-mutator.io/docs/mutation-testing-elements/mutant-states-and-metrics/
- REPORT_NAME=postgres-$(date +%F)-$(git rev-parse --short HEAD)
TODO (https://news.ycombinator.com/item?id=25381397):
- C: git, curl, u-boot
- ~~C++: tensorflow, ceph, pytorch, bitcoin, electron, Marlin, Cataclysm-DDA, llvm-project, rocksdb & QGIS~~

LibreSSL (IN PROGRESS)

$ git clone https://github.com/libressl-portable/portable libressl
$ cd libressl
$ ./autogen.sh
$ TODO

CURL (IN PROGRESS)

$ cat mull.yml
mutators:
 - cxx_logical
includePaths:
 - lib
 - src
excludePaths:
 - tests
quiet: false

$ git clone https://github.com/curl/curl --depth 1
$ cd curl
$ autoreconf -fi
$ export CC=clang-12 CFLAGS="-fembed-bitcode -g -O0  -fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line"; ./configure --with-openssl
$ make -j
$ make test
$ mull-runner-12 ./src/.libs/curl --report-name=REPORT --reporters=Elements -test-program=make -- -f $(pwd)/Makefile test

PHP (TODO)

It is a critical open source project according to Google.
Code coverage (80%): http://gcov.php.net/PHP_7_4/lcov_html/ext/openssl/index.php

OpenSSL (SUCCESS)

https://mull.readthedocs.io/en/0.17.1/tutorials/MakefileIntegration.html
It is a critical open source project according to Google.
How to measure code coverage: https://gist.github.com/mrash/8383288c66f973a2bbb2
Code coverage (43%): https://www.opencoverage.net/openssl/index_html/index.html
Testing: https://wiki.openssl.org/index.php/Unit_Testing

$ git clone https://github.com/openssl/openssl
$ export CC=clang-12
$ ./config -no-asm -no-shared -no-module -no-des -fembed-bitcode -g -O0  -fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line enable-unit-test
$ make -j all
$ make tests # (no effect with -j option)
...
All tests successful.
Files=241, Tests=3277, 411 wallclock secs ( 8.21 usr  0.77 sys + 341.39 cusr 56.84 csys = 407.21 CPU)
Result: PASS
make[1]: Leaving directory '/home/ubuntu/openssl'
$ mull-runner-12 ./apps/openssl --report-name=REPORT --reporters=Elements -test-program=make -- -f $(pwd)/Makefile tests

------------------ OBSOLETE -------------------------

# Remove -Wa and noexecstack flags in Makefile, because "clang: error: -Wa, is not supported with -fembed-bitcode".
# mull-cxx -keep-executable -mutate-only -output=openssl.mutated --workers=12 -linker-flags="-lm -ldl -lpthread -lrt -lcrypto -lssl -L. -Wl,-rpath,.,--export-dynamic" -compilation-flags="-isystem /usr/lib/llvm-10/lib/clang/10.0.0/include/ -isystem ./apps/include/ -isystem . -isystem ./include/" --timeout=6000 apps/openssl
# \$1 is escaped so that it is expanded when run.sh runs, not when the file is written.
cat << EOF > run.sh
#!/bin/sh

set -e

cp \$1 apps/openssl
make -f $(pwd)/Makefile tests
EOF
chmod +x run.sh
REPORT_NAME=openssl-$(git rev-parse --short HEAD)
mull-runner --workers=1 --report-name=$REPORT_NAME --reporters=Elements --include-not-covered -test-program=./run.sh openssl.mutated

CMake (SUCCESS?)

Blocked by
- ~~https://github.com/mull-project/mull/issues/862~~
- ~~https://github.com/mull-project/mull/issues/899~~

$ git clone https://github.com/Kitware/CMake
$ mkdir build && cd build && cmake -DCMAKE_C_FLAGS="-fembed-bitcode -g -O0" -DCMAKE_CXX_FLAGS="-fembed-bitcode -g -O0" -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DCMAKE_C_COMPILER="/usr/bin/clang" -DCMAKE_CXX_COMPILER="/usr/bin/clang++" -DBUILD_TESTING=ON ..
$ ctest --tests-regex "testdriver.+" Tests/TestDriver/
$ mull-runner-12 bin/cmake --report-name=REPORT --reporters=Elements -test-program=ctest -- --tests-regex "testdriver.+" Tests/TestDriver/

QEMU (TODO)

It is a critical open source project according to Google.
How to measure code coverage: https://qemu.readthedocs.io/en/latest/devel/testing.html

git clone https://github.com/qemu/qemu
sudo apt install -y ninja-build
mkdir build && cd build
../configure --cc=clang --cxx=clang++ --extra-cflags="-fembed-bitcode -g" --extra-cxxflags="-fembed-bitcode -g" --target-list=x86_64-softmmu
make

RCU

https://www.cefns.nau.edu/~adg326/mutation17.pdf

We used the tool developed by Andrews et al. [6] to generate mutants. We decided to use this tool as it was evaluated on a set of eight well-known subject programs, part of a Siemens suite [6]. The tool is also simple in design and implementation; a 350 LOC Prolog program and a shell script. This tool generates mutants from a source file, treating each line of code in sequence and applying four classes of “mutation operators”.

PostgreSQL (SUCCESS)

~~Blocked by https://github.com/mull-project/mull/issues/859~~
Code coverage: https://coverage.postgresql.org/
Tests: https://www.postgresql.org/docs/14/regress.html
Requirements:
- https://www.postgresql.org/docs/8.4/install-requirements.html
- https://wiki.postgresql.org/wiki/Compile_and_Install_from_source_code

$ sudo apt install libpam0g-dev libreadline-dev libkrb5-dev flex bison zlibc zlib1g-dev
$ sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison libxml2-dev libxslt-dev libssl-dev libxml2-utils xsltproc
$ curl -O https://ftp.postgresql.org/pub/source/v14.1/postgresql-14.1.tar.gz
$ git clone https://github.com/postgres/postgres # requires bison and flex
$ CC=clang-12 CXX=clang++-12 CXXFLAGS="-fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line -g -O0" CFLAGS="-fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line -g -O0" ./configure
$ make -j
$ make check
$ mull-cxx --workers=3 -keep-executable -mutate-only -output=postgres.mutated -compilation-flags="-isystem /usr/include/ -isystem /usr/lib/llvm-10/lib/clang/10.0.0/include/ -isystem ./src/include/ -isystem ./src/include/common/ -isystem /usr/include/security" -linker-flags="-lm -ldl -lpthread -lrt -fprofile-instr-generate -fcoverage-mapping" src/backend/postgres
# \$1 is escaped so that it is expanded when run.sh runs, not when the file is written.
cat << EOF > run.sh
#!/bin/sh

set -e

cp \$1 $(pwd)/src/backend/postgres
make -f $(pwd)/Makefile check
EOF
chmod +x run.sh
REPORT_NAME=postgres-$(date +%F)-$(git rev-parse --short HEAD)
REPORT_NAME=postgres-14rc1
mull-runner postgres.mutated --report-name=$REPORT_NAME --reporters=Elements --reporters=IDE --debug --test-program=bash -- ./run.sh postgres.mutated

SQLite (SUCCESS)

$ sudo apt-get install -y tcl8.6-dev tclsh clang
$ git clone https://github.com/sqlite/sqlite
$ CC=clang-12 CFLAGS="-fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line -g -O0" ./configure
# Remove fuzztest for test target in Makefile
# Reference run with single thread (-j has no effect):
$ make test # build testfixture
$ # llvm-profdata merge default.profraw -o default.profdata
$ # llvm-cov report ./testfixture  -summary-only -instr-profile=default.profdata
...
# SQLite 2021-09-30 10:47:10 7d16b302826fec3606dbc6e20df0d2182f6946a2ed4076d2412d1df30c552ecb
# 0 errors out of 251021 tests on ubuntu-basic-1-2-10gb-sergeyb Linux 64-bit little-endian
# All memory allocations freed - no leaks
# Maximum memory usage: 9196480 bytes
# Current memory usage: 0 bytes
# Number of malloc()  : -1 calls
#
# real    3m55.552s
# user    1m41.499s
# sys     0m32.020s
$ REPORT_NAME=report-sqlite3-$(git rev-parse --short HEAD)
$ # mull-cxx --exclude-path=ext/* --exclude-path=src/tclsqlite.c --exclude-path="src/test_*" --workers=5 -keep-executable -mutate-only -output=testfixture.mutated -compilation-flags="-isystem /usr/lib/llvm-6.0/lib/clang/6.0.0/include/" -linker-flags="-lm -ldl -lpthread -lz -ltcl8.6 -fprofile-instr-generate -fcoverage-mapping" -coverage-info=default.profdata testfixture

...
       [################################] 84/84. Finished in 111ms
[info] Applying mutations (threads: 1)
       [################################] 14567/14567. Finished in 10623ms
[info] Deduplicate mutants (threads: 1)
       [################################] 1/1. Finished in 48ms
[info] Compiling original code (threads: 5)
       [################################] 84/84. Finished in 400210ms
[info] Link mutated program (threads: 1)
       [################################] 1/1. Finished in 1449ms
[info] Mutated executable: testfixture.mutated
[info] Total execution time: 455251ms
$ mull-runner-12 ./testfixture --report-name=$REPORT_NAME --reporters=Elements --ide-reporter-show-killed --test-program=make -- -f $(pwd)/Makefile test
...
/home/ubuntu/sqlite-custom/sqlite3.c:166781:28: warning: Killed: Replaced - with + [cxx_sub_to_add]
  while( nKey2 && pK2[nKey2-1]==' ' ) nKey2--;
                           ^
[info] All mutations have been killed
[info] Mutation score: 100%
[info] Total execution time: 35431ms
$

FreeRDP (SUCCESS?)

Blocked by
- ~~https://gitlab.kitware.com/cmake/cmake/-/merge_requests/3661#note_617780~~

git clone https://github.com/FreeRDP/FreeRDP
mkdir build
cd build
cmake -DBUILD_TESTING=ON -DCMAKE_C_FLAGS="-fembed-bitcode -g -O0  -fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line" -DCMAKE_CXX_FLAGS="-fembed-bitcode -g -O0  -fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line" -DCMAKE_BUILD_TYPE=Debug  -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
make -j
make test
TODO: mull-runner-12 ./client/X11/xfreerdp -ide-reporter-show-killed -test-program=make -- Makefile ./client/X11/xfreerdp

Lua (SUCCESS)

Regression testsuite: https://www.lua.org/tests/ (How to run: https://www.lua.org/tests/#basic)
~~Blocked by https://github.com/mull-project/mull/issues/858~~

$ curl -O https://www.lua.org/ftp/lua-5.4.3.tar.gz

--- Makefile.orig       2022-01-31 13:55:04.759326427 +0300
+++ Makefile    2022-01-31 13:55:16.667338525 +0300
@@ -6,8 +6,9 @@
 # Your platform. See PLATS for possible values.
 PLAT= guess
 
-CC= gcc -std=gnu99
-CFLAGS= -O2 -Wall -Wextra -DLUA_COMPAT_5_3 $(SYSCFLAGS) $(MYCFLAGS)
+CC= clang-12 -fembed-bitcode -g
+MULL_FLAGS= -fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line
+CFLAGS= -O2 -Wall -Wextra -DLUA_COMPAT_5_3 $(SYSCFLAGS) $(MYCFLAGS) $(MULL_FLAGS)
 LDFLAGS= $(SYSLDFLAGS) $(MYLDFLAGS)
 LIBS= -lm $(SYSLIBS) $(MYLIBS)
$ make -j
$ cd src
$ # curl -O https://www.lua.org/tests/lua-5.4.3-tests.tar.gz
$ # tar xvzf lua-5.4.3-tests.tar.gz
$ # cd lua-5.4.3-tests
$ # lua all.lua
$ git clone https://framagit.org/fperrad/lua-Harness
# see docs/usage.md
$ cd lua-Harness/test_lua && mull-runner-12 $LUA --report-name=REPORT --reporters=Elements -test-program=prove -- --exec="$LUA" *.t

LuaJIT (BLOCKED)

Blocked by https://github.com/mull-project/mull/issues/995
Building https://luajit.org/install.html

mutators:
 - cxx_logical
includePaths:
 - src
excludePaths:
 - doc
quiet: false

curl -O https://luajit.org/download/LuaJIT-2.1.0-beta3.tar.gz
# Replace the DEFAULT_CC value in src/Makefile with clang
CFLAGS="-fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -g -grecord-command-line" make
mull-runner-12 ...

Tarantool (SUCCESS)

Code coverage (84%): https://coveralls.io/github/tarantool/tarantool?branch=master
Blocked by https://github.com/mull-project/mull/pull/855
Blocked by https://github.com/mull-project/mull/issues/994

With Clang 12 on Ubuntu 20.04:

git clone https://github.com/tarantool/tarantool
cd tarantool
cat << EOF > mull.yml
mutators:
 - cxx_all
 - cxx_arithmetic
 - cxx_arithmetic_assignment
 - cxx_assignment
 - cxx_bitwise
 - cxx_bitwise_assignment
 - cxx_boundary
 - cxx_calls
 - cxx_comparison
 - cxx_const_assignment
 - cxx_decrement
 - cxx_default
 - cxx_increment
 - cxx_logical
includePaths:
 - src
excludePaths:
 - src/lib/small
 - src/lib/tzcode   # https://github.com/mull-project/mull/issues/994
 - src/lib/uri      # uri_parser.c
 - src/lib/core
 - src/lib/json     # json.c
 - src/box          # xlog.c
quiet: false
EOF
mkdir build && cd build
cmake -DBUILD_TESTING=ON -DCMAKE_C_FLAGS="-fembed-bitcode -g -O0  -fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line" -DCMAKE_CXX_FLAGS="-fembed-bitcode -g -O0  -fexperimental-new-pass-manager -fpass-plugin=/usr/lib/mull-ir-frontend-12 -grecord-command-line" -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DCMAKE_C_COMPILER=clang-12 -DCMAKE_CXX_COMPILER=clang++-12 ..
make -j
??? mull-runner-12 ./src/tarantool --report-name=REPORT --reporters=Elements -test-program=make -- -f $(pwd)/Makefile test

Tags: draft
