Fuzz-testing network protocols

Fuzzing, or fuzz-testing, is a test methodology for improving your product’s robustness. It detects unexpected behavior by feeding a large amount of random data to any software that accepts input.

In this article, I’ll give a brief explanation of how effective fuzzing is at making software robust, especially software that contains a network protocol stack, and how we can fuzz-test efficiently.

Why fuzzing?

Because unexpected behavior is always a big concern for software. In the usual unit tests, we can feed software some data that causes failure on purpose, but in such tests the data is known and predictable. What fuzzing can reveal instead is the unknown, unpredictable failure that we couldn’t foresee while writing code (“test with ALL the unexpected data” is logically impossible).
Fuzzing, which gives machine-generated random data to the software under test, is a simple yet effective way to find bugs that are hard for humans to anticipate.

Note that fuzzing is NOT just for improving validation of user input. Fuzzing sometimes finds more critical issues such as incorrect memory allocation. Obviously, in that case what should be fixed is the handling of data and memory allocation, not the validation of user input.

Also note that we should NOT fixate on the particular fuzzing input (= the machine-generated data) that causes a crash. Thoughts like “We don’t need fuzzing, as no one can send such data to our products” are not good, because in many cases fuzzing finds more fundamental issues that may also happen with other data. The more data we give, the more opportunities we have to find them.

Why is that for networking?

Because networking protocols always come with uncertainty.
As anyone who has experience implementing networking protocols might agree, making an implementation robust enough is no easy work, as a protocol stack has to:

Convert raw binary into structured, meaningful data and vice versa, in the way that the specifications ambiguously define.
Queue and dequeue, combine and split, and correlate the data coming from and forwarded to the network.
Do all of the above as fast as possible.

Even after releasing with full unit test coverage, implementors of networking protocols are always anxious and suspicious, like René Descartes. We try to be as suspicious as possible, and finally find that the only thing certain for us is that there exists a “me” thinking about it :P

Steps to fuzz-test efficiently

Basically, fuzz-testing consists of the following five steps.

1. Determine what to test

Testing everything is ideal but not realistic. Determining what to test means determining what to compromise on; it is often hard to prioritize test targets because both the input (fuzz) and the output (issue) are unpredictable. The list below might give a hint, but it depends heavily on the situation you’re in.

User interactions
As described above, fuzzing is not just about validating user input, but it is a fact that the places where users interact with your software are the most likely to have issues. For example, decoder functions can crash on malformed input from packets coming over the network.
Software implementations
The most typical issue is a crash of the software, and crashes can be classified by impact. If the software under test is developed in-house, you can predict the impact to some extent. This applies only to crashes (which are not always the worst issue fuzzing finds), but it may help prioritize the parts of the software. If the software has neither segmentation nor resiliency, this cannot be taken into consideration.
- total vs. partial: If the software has appropriate segmentation, a crash can be partial, which does not deny service as a whole but leaves the software unstable.
- permanent vs. temporary: If the service is resilient enough to come back to work by a timer or something similar, the crash can be temporary.

Again, criticality cannot be estimated beforehand. Trying to prioritize by the criticality of the issues that might be found makes no sense in fuzz-testing: if an issue is really expected to happen, it can be fixed without fuzzing.

2. Prepare instrumentation

To observe what happened as accurately as possible, and to find the conditions under which an issue happens and its root cause without much pain, it is necessary to prepare before running tests. With a better understanding of the issues that occurred, we can prioritize which should be remediated first. As for finding conditions and root causes, we should expect issues that are not 100% reproducible, often because they occur only under complicated conditions. In such cases the analysis will be much easier if the software under test is well instrumented.

The most typical form of instrumentation for networking products is a periodic health check. Checking whether the software under test is still working, by giving it normal inputs to which it is supposed to respond, helps you see when a failure happens. If your product has a multi-layered protocol stack, it is better to put instrumentation in each layer, as it helps you observe exactly what the input caused.
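A health check of this kind can be sketched in a few lines. This is only an illustration under my own assumptions: ToyEchoService is a made-up in-memory stand-in for the software under test (with a planted bug), and is_healthy and run_with_health_checks are hypothetical helper names, not part of any real tool.

```python
class ToyEchoService:
    """Hypothetical stand-in for the software under test: echoes its input."""

    def __init__(self):
        self.alive = True

    def handle(self, data: bytes):
        if not self.alive:
            return None  # a crashed service never answers
        if data.startswith(b"\xff"):
            self.alive = False  # planted bug: this input "crashes" the service
            return None
        return data


def is_healthy(service, probe: bytes = b"ping") -> bool:
    """Send a normal input and check that the normal answer comes back."""
    return service.handle(probe) == probe


def run_with_health_checks(service, inputs, check_every: int = 1):
    """Feed inputs and probe the service every `check_every` inputs.

    Returns the index of the last input sent before a failed probe,
    or None if every probe succeeded.
    """
    for index, data in enumerate(inputs):
        service.handle(data)
        if (index + 1) % check_every == 0 and not is_healthy(service):
            return index
    return None


if __name__ == "__main__":
    suspect = run_with_health_checks(ToyEchoService(), [b"a", b"\xffboom", b"c"])
    print("health check failed after input index:", suspect)
```

Probing after every input pinpoints the culprit but slows the run down; with a larger check_every the culprit is only known to be among the last few inputs sent, so it is a trade-off between speed and how much you have to replay later.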

Other than that, consider what is NOT normal behavior and try to detect it with instrumentation. For example, if the software under test responds to fuzz with some data you don’t recognize, that might be much more critical than a mere crash.
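One simple way to catch such behavior is to compare every reply against the set of replies you consider acceptable for malformed input. The sketch below assumes you have recorded (sent, reply) pairs during the run; the reply values in EXPECTED_REPLIES and the function names are invented for illustration.

```python
# Hypothetical set of replies we accept as a normal reaction to malformed input.
EXPECTED_REPLIES = {b"ERR_MALFORMED", b"ERR_UNSUPPORTED"}


def classify_reply(reply):
    """Classify one reply as 'silent', 'expected' or 'unknown'."""
    if reply is None:
        return "silent"  # no answer at all; may be fine or may be a crash
    if reply in EXPECTED_REPLIES:
        return "expected"
    return "unknown"  # the software said something we did not anticipate


def unknown_replies(exchanges):
    """Return the (sent, reply) pairs whose reply deserves a closer look."""
    return [(sent, reply) for sent, reply in exchanges
            if classify_reply(reply) == "unknown"]


if __name__ == "__main__":
    recorded = [
        (b"\x00\x01", b"ERR_MALFORMED"),
        (b"\xff\xff", None),
        (b"\x41\x41", b"\x7fELF"),  # unexpected data coming back
    ]
    for sent, reply in unknown_replies(recorded):
        print("suspicious exchange:", sent, "->", reply)
```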

3. Run tests

For developers, implementing fuzz-testing from inside the software is not much of a burden. Just feed functions an enormous amount of random and/or suspicious data, such as the entries in the Big List of Naughty Strings, and you may find some issues. However, to be more efficient, dedicated tools (“fuzzers”) are available. See Fuzzing tools below for more detail.
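To give an idea of how small such an in-process fuzz loop can be, here is a minimal sketch. Everything in it is an assumption of mine: decode_message is a toy decoder for a made-up format (2-byte big-endian length followed by the payload) with a deliberately missing check, and fuzz is a hypothetical helper that treats ValueError as a clean rejection and anything else as a finding.

```python
import random
import struct


def decode_message(packet: bytes) -> bytes:
    """Toy decoder (hypothetical format): 2-byte big-endian length, then payload."""
    # Missing check: inputs shorter than 2 bytes make unpack_from raise struct.error.
    (length,) = struct.unpack_from(">H", packet, 0)
    payload = packet[2:2 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")  # clean rejection of bad input
    return payload


def fuzz(target, iterations: int = 1000, max_len: int = 8, seed: int = 0):
    """Call `target` with random byte strings and collect unexpected exceptions."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    findings = []
    for _ in range(iterations):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # the decoder rejected the input on purpose
        except Exception as exc:  # anything else is a finding worth analyzing
            findings.append((data, exc))
    return findings


if __name__ == "__main__":
    results = fuzz(decode_message)
    print(len(results), "unexpected exceptions")
    for data, exc in results[:3]:
        print(repr(data), "->", type(exc).__name__)
```

Seeding the random generator is a deliberate choice here: it keeps the inputs reproducible, which matters a lot once you move on to analyzing what was found.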

4. Analyze the findings

Analysis is useful mostly for prioritizing which issues to remediate first. So, if you found only a few issues, spending resources and time on this step might be a waste.

Note that sometimes what you witness and the real root cause differ. With source code, the stack trace often makes it obvious what causes the issue, but in software that has many lines of code and high cyclomatic complexity, where the stack trace points is not necessarily the root cause.
A typical example of this is a class and its subclass: at a glance the issue appears to be caused by a bad implementation of a subclass, but it is actually due to the parent class’s implementation. In this case fixing the subclass only prevents the issue there; it may still happen in another subclass that inherits from the same parent.
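The class and subclass situation can be illustrated with a toy example. The class names Message, Hello and Bye are invented for this sketch; the point is only that a local patch in one subclass leaves the unchecked parent method in place.

```python
class Message:
    """Hypothetical parent class shared by all message types."""

    def __init__(self, raw: bytes):
        self.raw = raw

    def field(self, offset: int) -> int:
        # Root cause: no bounds check, so short input raises IndexError here.
        return self.raw[offset]


class Hello(Message):
    def version(self) -> int:
        # Local patch added after a crash was seen in Hello:
        # it hides the symptom in this subclass only.
        if len(self.raw) < 2:
            raise ValueError("Hello message too short")
        return self.field(1)


class Bye(Message):
    def reason(self) -> int:
        # Still relies on the unfixed parent method.
        return self.field(1)


if __name__ == "__main__":
    short = b"\x01"
    for call in (Hello(short).version, Bye(short).reason):
        try:
            call()
        except Exception as exc:
            print(call.__qualname__, "->", type(exc).__name__)
```

Moving the length check into Message.field would fix both subclasses at once, which is what root cause analysis should lead to.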

5. Remediate issues

Now you can fix (or ask the vendor to fix) the issue. If you’re not really sure how to fix it at this step, it means you haven’t analyzed enough in the previous step. In that case you should go back and look at “what actually happened” and “what caused the issue”.

Have you fixed it? Congrats! But remediation after finding issues is not just a fix in source code. At the very least, test cases should be added so that it does not happen again.
Fuzzing never finishes; after the first round, consider how you can introduce fuzzing sustainably in your team so that the quality of your software improves continuously without regressions.
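The added test cases can be as simple as replaying the inputs that once caused trouble. The sketch below uses Python’s standard unittest module; decode_message is a hypothetical, already fixed toy decoder (2-byte big-endian length followed by the payload), and CRASH_CORPUS stands for whatever inputs your own fuzzing run saved.

```python
import struct
import unittest


def decode_message(packet: bytes) -> bytes:
    """Hypothetical fixed decoder: 2-byte big-endian length, then payload."""
    if len(packet) < 2:
        raise ValueError("short header")  # the check that was missing before
    (length,) = struct.unpack_from(">H", packet, 0)
    payload = packet[2:2 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


# Inputs saved from an earlier fuzzing run (made-up examples).
CRASH_CORPUS = [b"", b"\x00", b"\x00\x05ab"]


class CrashRegressionTest(unittest.TestCase):
    def test_saved_inputs_are_rejected_cleanly(self):
        for data in CRASH_CORPUS:
            with self.subTest(data=data):
                # A clean ValueError is fine; any other exception fails the test.
                with self.assertRaises(ValueError):
                    decode_message(data)
```

Saved in a test module, this runs with python -m unittest like any other test, so the corpus keeps guarding the fix even when nobody is actively fuzzing.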

Integrating fuzz-tests into your CI procedure might help, but I don’t recommend it, because it means anyone who pushes code can face a robustness issue, which can be a nightmare for those who are not familiar with it. If you’re in a small, well-organized team where every member knows the whole codebase, it is a good idea, though.

Fuzzing tools

Here are some examples of the most popular ones, but as each fuzzer has different goals, please look for the one that fits your situation yourself. Awesome-Fuzzing, a curated list of fuzzing resources, can be a good partner. (Oh… there must be blog posts better than mine!)

Generic Standalone Fuzzer

american fuzzy lop: A security-oriented fuzzer that employs a novel type of compile-time instrumentation and genetic algorithms to automatically discover clean, interesting test cases that trigger new internal states in the targeted binary.
radamsa: A general purpose fuzzer and test case generator.
Peach Fuzzer: Framework which helps to create custom dumb and smart fuzzers.

Language-specific Libraries

libFuzzer: A library for coverage-guided fuzz testing.
go-fuzz: A coverage-guided fuzzing solution for testing of Go packages.
sulley: A pure-python fully automated and unattended fuzzing framework.

Even for those who purchase their networking equipment, fuzzing can be useful for improving service continuity. There is open-source software for fuzzing like the tools listed above, but if your equipment uses minor protocols, it is a good idea to use commercial products like Defensics (by Synopsys), a de facto standard commercial fuzzer that supports a lot of protocols.

There are also other fuzzer products and services that are meant to be more domain-specific. For example, in mobile networking, P1 Security and Security Research Labs are the big players. In many cases these companies also provide one-shot security testing services that include fuzzing. Trying such a service before starting tests yourself might be a good choice, especially if you don’t have employees familiar with such tests.

For a more specific categorization of fuzzers (coverage-guided, etc.) and the technologies/algorithms used, I’ll hopefully publish another article in the future.

Happy fuzzing!