Commit Briefs

0adf54a339 Sergey Bronnikov

update paths (ligurio/dev, origin/ligurio/dev)


8eaceb07ca Sergey Bronnikov

cmake tests


dff58e6289 Sergey Bronnikov

Update assert msg


76afb5cfa7 Sergey Bronnikov

Update cmake for tests


6df34787f1 Sergey Bronnikov

lua_unref


8e3a954daa Sergey Bronnikov

todo


01014385e9 Sergey Bronnikov

CMakePresets


5eceb8748a Sergey Bronnikov

rename titles


7a5eaf0ca1 Sergey Bronnikov

gh actions fixes


57a732f3cb Sergey Bronnikov

cleanup dead code


Branches

Tags

This repository contains no tags

Tree

.github/
.gitignore
.luacheckrc
CHANGELOG.md
CMakeLists.txt
CMakePresets.json
CONTRIBUTING.md
LICENSE
README.md
docs/
examples/
luzer/
luzer-scm-1.rockspec
mutator/

README.md

[![Static analysis](https://github.com/ligurio/luzer/actions/workflows/check.yaml/badge.svg)](https://github.com/ligurio/luzer/actions/workflows/check.yaml)
[![Testing](https://github.com/ligurio/luzer/actions/workflows/test.yaml/badge.svg)](https://github.com/ligurio/luzer/actions/workflows/test.yaml)
[![License: ISC](https://img.shields.io/badge/License-ISC-blue.svg)](https://opensource.org/licenses/ISC)
[![Luarocks](https://img.shields.io/luarocks/v/ligurio/luzer/scm-1)](https://luarocks.org/modules/ligurio/luzer)

# luzer

A coverage-guided, native Lua fuzzer.

## Overview

Fuzzing is a type of automated testing which continuously manipulates inputs to
a program to find bugs. `luzer` uses coverage guidance to intelligently walk
through the code being fuzzed to find and report failures to the user. Since it
can reach edge cases which humans often miss, fuzz testing can be particularly
valuable for finding security exploits and vulnerabilities.

`luzer` is a coverage-guided Lua fuzzing engine. It supports fuzzing of Lua
code as well as C extensions written for Lua. `luzer` is based on
[libFuzzer][libfuzzer-url]. When fuzzing native code, `luzer` can be used in
combination with Address Sanitizer or Undefined Behavior Sanitizer to catch
extra bugs.

## Installation

```sh
$ luarocks --local install luzer
$ eval $(luarocks path)
```

## Writing Fuzz Tests In Lua

- There must be exactly one fuzz target per fuzz test.
- Fuzz targets should be fast and deterministic so the fuzzing engine can work
  efficiently, and new failures and code coverage can be easily reproduced.
- Since the fuzz target is invoked in parallel across multiple workers and in
  nondeterministic order, the state of a fuzz target should not persist past
  the end of each call, and the behavior of a fuzz target should not depend on
  global state.

```lua
local luzer = require("luzer")

local function TestOneInput(buf)
    local b = {}
    buf:gsub(".", function(c) table.insert(b, c) end)
    if b[1] == 'c' then
        if b[2] == 'r' then
            if b[3] == 'a' then
                if b[4] == 's' then
                    if b[5] == 'h' then
                        assert(nil)
                    end
                end
            end
        end
    end
end

luzer.Fuzz(TestOneInput)
```
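
The rule about not depending on global state can be illustrated with a small
sketch. The function `count_words` is a hypothetical stand-in for the code
under test and is not part of `luzer`; the point is only that all state is
created inside the call, so repeated runs on the same input behave the same.
The call to `luzer.Fuzz()` is left as a comment so the snippet runs on its own.

```lua
-- Hypothetical code under test: counts how often each word occurs.
local function count_words(s)
    local counts = {}  -- fresh table on every call, nothing is cached
    for word in s:gmatch("%a+") do
        counts[word] = (counts[word] or 0) + 1
    end
    return counts
end

-- Fuzz target: keeps no state between calls and reads no globals.
local function TestOneInput(buf)
    local counts = count_words(buf)
    for word, n in pairs(counts) do
        assert(n >= 1, "word counted zero times: " .. word)
    end
end

-- To fuzz it: require("luzer").Fuzz(TestOneInput)
```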

While fuzzing is in progress, the fuzzing engine generates new inputs and runs
them against the provided fuzz target. By default, it continues to run until a
failing input is found, or the user cancels the process (e.g. with `Ctrl+C`).

The output will look something like this:

```
$ luajit examples/example_basic.lua
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1557779137
INFO: Loaded 1 modules   (151 inline 8-bit counters): 151 [0x7f0640e706e3, 0x7f0640e7077a),
INFO: Loaded 1 PC tables (151 PCs): 151 [0x7f0640e70780,0x7f0640e710f0),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2	INITED cov: 17 ft: 18 corp: 1/1b exec/s: 0 rss: 26Mb
#32	NEW    cov: 17 ft: 24 corp: 2/4b lim: 4 exec/s: 0 rss: 26Mb L: 3/3 MS: 5 ShuffleBytes-ShuffleBytes-CopyPart-ChangeByte-CMP- DE: "\x00\x00"-
...
```

The first lines indicate that the "baseline coverage" is gathered before
fuzzing begins.

To gather baseline coverage, the fuzzing engine executes both the seed corpus
and the generated corpus, to ensure that no errors occurred and to understand
the code coverage the existing corpus already provides.

**Fuzzing API**

The luzer module provides a function `Fuzz()`.

`Fuzz(test_one_input, custom_mutator, args)` starts the fuzzer. This function
does not return.

- `test_one_input` is the fuzzer's entry point: a function that takes a single
  argument, a string holding the raw input bytes. It is invoked repeatedly,
  each time with a new input.
- `custom_mutator` defines a custom mutator function (equivalent to
  `LLVMFuzzerCustomMutator`). Default is `nil`.
- `args` is a table with the process arguments to pass to the fuzzer. The
  `corpus` field specifies the path to a directory with the seed corpus; see the
  [libFuzzer documentation][libfuzzer-options-url] for a list of other options.
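
A minimal sketch of calling `Fuzz()` with an `args` table. Only the `corpus`
field is named in this README; `max_len` is assumed here to be forwarded as
libFuzzer's `-max_len` option, and the directory name `corpus` is a placeholder.
The `pcall` guard is there only so the file can be loaded where `luzer` is not
installed.

```lua
-- Load luzer if it is installed; the guard only keeps this sketch loadable
-- on systems without the module.
local has_luzer, luzer = pcall(require, "luzer")

-- Fuzz target: fails on any input that starts with "boom".
local function TestOneInput(buf)
    assert(buf:sub(1, 4) ~= "boom", "found the magic prefix")
end

local args = {
    corpus = "corpus",  -- placeholder path to a directory with seed inputs
    max_len = 64,       -- assumption: forwarded as libFuzzer's -max_len
}

if has_luzer then
    -- No custom mutator is used, so the second argument is nil.
    -- This call does not return.
    luzer.Fuzz(TestOneInput, nil, args)
end
```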

It may be desirable to reject some inputs, i.e. to not add them to the corpus.
For example, when fuzzing an API consisting of parsing and other logic, one may
want to allow only those inputs into the corpus that parse successfully. If the
fuzz target returns `-1` on a given input, `luzer` will not add that input to
the corpus, regardless of what coverage it triggers.
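
As a sketch of input rejection: the target below returns `-1` for inputs that
do not parse as a number, so only parseable inputs would be kept. The function
`double` is a hypothetical stand-in for the code under test, and the call to
`luzer.Fuzz()` is left as a comment so the snippet runs on its own.

```lua
-- Hypothetical code under test: doubles a number.
local function double(n)
    return n * 2
end

local function TestOneInput(buf)
    local n = tonumber(buf)
    if n == nil then
        -- The input did not parse: ask luzer not to add it to the corpus.
        return -1
    end
    -- The input parsed: exercise the rest of the logic.
    assert(type(double(n)) == "number")
end

-- To fuzz it: require("luzer").Fuzz(TestOneInput)
```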

**Structure-Aware Fuzzing**

`luzer` is based on a coverage-guided, mutation-based fuzzer (libFuzzer). It has
the advantage of not requiring any grammar definition for generating inputs,
which makes setup easier. The disadvantage is that it is harder for the fuzzer
to generate inputs for code that parses complex data types: inputs are often
rejected early, resulting in low coverage. To address this, `luzer` offers
`FuzzedDataProvider` and two functions to customize the mutation strategy,
which is especially useful when fuzzing functions that require structured
input.

Often, a raw byte string is not a convenient input for the code being fuzzed.
Similar to libFuzzer, `luzer` provides a `FuzzedDataProvider` that can simplify
the task of creating a fuzz target by translating the raw input bytes received
from the fuzzer into useful primitive Lua types.

You can construct the `FuzzedDataProvider` with:

```lua
local fdp = luzer.FuzzedDataProvider(input_bytes)
```

The `FuzzedDataProvider` then supports the following functions:

- `consume_string(max_length)` - consume a string with length in the range `[0,
  max_length]`. When it runs out of input data, returns what remains of the input.
- `consume_strings(max_length, count)` - consume a list of `count` strings with
  length in the range `[0, max_length]`.
- `consume_integer(min, max)` - consume a signed integer in the range
  `[min, max]`.
- `consume_integers(min, max, count)` - consume a list of `count` integers in the
  range `[min, max]`.
- `consume_number(min, max)` - consume a floating-point value in the range
  `[min, max]`.
- `consume_numbers(min, max, count)` - consume a list of `count` floats in the
  range `[min, max]`. If there's no input data left, returns `min`. Note that
  `min` must be less than or equal to `max`.
- `consume_boolean()` - consume either `true` or `false`, or `false` when no
  data remains.
- `consume_booleans(count)` - consume a list of `count` booleans.
- `consume_probability()` - consume a floating-point value in the range `[0, 1]`.
  If there's no input data left, always returns 0.
- `remaining_bytes()` - returns the number of unconsumed bytes in the fuzzer
  input.

Examples:

```lua
> luzer = require("luzer")
> fdp = luzer.FuzzedDataProvider(string.rep("A", 10^9))
> fdp:consume_boolean()
true
> fdp:consume_string(2, 10)
AAAAAAAAA
```
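
A sketch of a fuzz target built on `FuzzedDataProvider`, using only the
`consume_integer` and `consume_string` methods listed above. The function
`repeat_label` is a hypothetical stand-in for the code under test. The check is
kept in a separate function that takes the provider as an argument, so it can
also be exercised with a stand-in object when `luzer` is not installed.

```lua
-- Hypothetical code under test: repeats a string `times` times.
local function repeat_label(label, times)
    return string.rep(label, times)
end

-- Builds structured arguments from the provider and checks an invariant.
local function check_with_provider(fdp)
    local times = fdp:consume_integer(0, 8)
    local label = fdp:consume_string(16)
    local out = repeat_label(label, times)
    assert(#out == #label * times, "unexpected output length")
end

-- Fuzz target to pass to luzer.Fuzz(); needs the luzer module when called.
local function TestOneInput(buf)
    local luzer = require("luzer")
    check_with_provider(luzer.FuzzedDataProvider(buf))
end

-- To fuzz it: require("luzer").Fuzz(TestOneInput)
```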

Learn more about structure-aware fuzzing in the
[documentation](docs/structure-aware-fuzzing.md).

## Using Custom Mutators Written In Lua

`luzer` allows [custom mutators][libfuzzer-mutators-url] to be written in Lua 5.1
(including LuaJIT), 5.2, 5.3 or 5.4.

The environment variable `LIBFUZZER_CUSTOM_MUTATOR_LUA_SCRIPT` can be set to
the path to the Lua mutator script. The default path is
`./mutator.lua`.

To run the Lua example, use

```sh
LIBFUZZER_CUSTOM_MUTATOR_LUA_SCRIPT=./mutator.lua example_compressed
```

All you need to do on the C/C++ side is

```cpp
#include "mutator.cpp"
```

in the target file where you have `LLVMFuzzerTestOneInput` (or any other
compilation unit that is linked to the target) and then build with the Lua
include and linker flags added to your build configuration.

Then write a Lua script that does what you would like the fuzzer to do; the
`mutator.lua` script is a reasonable starting point. Point
`LIBFUZZER_CUSTOM_MUTATOR_LUA_SCRIPT` at your script if it is not at the
default path, then run your fuzzing as shown in the examples above.

**Custom mutator API**

- `LLVMFuzzerCustomMutator(data, size, max_size, seed)` - optional
  user-provided custom mutator that is called for each mutation. Mutates raw
  data in `[data, data+size)` in place. Returns the new size, which is not
  greater than `max_size`. Given the same seed, it produces the same mutation.
- `LLVMFuzzerMutate(data, size, max_size)` - libFuzzer-provided function to be
  used inside `LLVMFuzzerCustomMutator`. Mutates raw data in
  `[data, data+size)` in place. Returns the new size, which is not greater than
  `max_size`.

## License

Copyright © 2021-2023 [Sergey Bronnikov][bronevichok-url].

Distributed under the ISC License.

[libfuzzer-url]: https://llvm.org/docs/LibFuzzer.html
[libfuzzer-options-url]: https://llvm.org/docs/LibFuzzer.html#options
[libfuzzer-mutators-url]: https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md
[bronevichok-url]: https://bronevichok.ru/