Testing Using Property-Based Tests

Who will guard the guardians? pt1

“Quis custodiet ipsos custodes?” is a Latin phrase from the Satires of the Roman poet Juvenal (Satire VI, lines 347–348). It translates to “But who will guard the guardians?”

It's also the title of two posts about testing and about ways to improve the test harness you already have. (If you don't have a set of test scripts yet, please have a look at my previous post about testing!)

This post is about property-based testing, an interesting way of testing your code. It was introduced by the QuickCheck framework in Haskell. It's not a magic bullet but rather a complement to the traditional, example-based testing we usually do, and it applies to the same scope: from unit tests to integration tests.

In example-based unit tests you:

define some example inputs
define the expected results
run your code and check that the actual results match the expected ones
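
As a minimal sketch of these three steps, here is an example-based test written for pytest. Python's built-in `sorted` is used purely as a stand-in for "your code", and the test name is made up for illustration.

```python
def test_sorted_examples():
    # Steps 1 and 2: hand-picked example inputs paired with expected results.
    cases = [
        ([3, 1, 2], [1, 2, 3]),
        ([], []),
        ([5, 5, 1], [1, 5, 5]),
    ]
    # Step 3: run the code and check that actual and expected match.
    for given_input, expected in cases:
        assert sorted(given_input) == expected
```

The test only ever exercises the three inputs we thought of, which is exactly the limitation discussed next.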

Usually we pick some typical inputs that we expect. If we are feeling creative, we also add some edge cases that we think will break the system or function under test. But we have work to do, and these cases are often far from exhaustive. How can we be sure that we have done all that we can? Or: “Who is guarding the guardians?”

Enter PBT to the rescue. In property-based testing (PBT), on the other hand:

you describe the properties of the input
you describe the properties of the output
you have the computer try lots of random examples and check that none of them fail
if any do fail, the framework automatically shrinks the input to the minimal case that still triggers the failure
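
The steps above can be sketched with Hypothesis as follows. This is only an illustration: the built-in `sorted` stands in for the code under test, and the test name is made up.

```python
from collections import Counter

from hypothesis import given, strategies as st


# Property of the input: any list of integers, generated by Hypothesis.
@given(st.lists(st.integers()))
def test_sorted_properties(xs):
    result = sorted(xs)
    # Property of the output: every element is <= the one after it.
    assert all(a <= b for a, b in zip(result, result[1:]))
    # Property of the output: same elements as the input, only reordered.
    assert Counter(result) == Counter(xs)
```

If one of the assertions failed, Hypothesis would report a shrunk counterexample, i.e. a list as small and simple as it could find that still fails.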

In Python, Hypothesis is a great property-based testing library that lets you write such tests alongside pytest (it ships with a pytest plugin).

We are going to use this library on a small example. We will specify properties derived from very simple functional requirements (the inputs are floats). Hypothesis will then generate test cases that try to falsify those properties, and shrink any failing values to determine the minimal failing case or cases.

Simple app

But enough talking, let's test a simple calculator app in Python. Here is our calculator app in all its glory:

calc app
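
The listing was originally an image and is not reproduced here. As a rough sketch of what such a calculator could look like, assuming four plain functions (the file name `calc.py` and the function names are my assumptions, not necessarily the ones in the original code):

```python
# calc.py (hypothetical file name)

def add(x, y):
    """Return the sum of x and y."""
    return x + y


def subtract(x, y):
    """Return x minus y."""
    return x - y


def multiply(x, y):
    """Return the product of x and y."""
    return x * y


def divide(x, y):
    """Return x divided by y; Python raises ZeroDivisionError when y == 0."""
    return x / y
```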

Simple tests

So let's create some test methods for the app using pytest:

calc app_t
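
The test listing was also an image. A minimal sketch of example-based tests of this kind, assuming the calculator exposes an `add(x, y)` function (an assumed name, redefined inline here so the snippet runs on its own):

```python
def add(x, y):
    # Stand-in for the calculator function under test (assumed name).
    return x + y


def test_add_positive_numbers():
    assert add(2, 3) == 5


def test_add_negative_numbers():
    assert add(-2, -3) == -5


def test_add_floats():
    # 0.5 and 0.25 are exactly representable, so == is safe here.
    assert add(0.5, 0.25) == 0.75
```

pytest collects any function whose name starts with `test_`, so no extra registration is needed.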

We run the tests we created:

pytest -v

This results in:

calc pytest

Great! Everything passes! We are great. We tested everything.

But is this the best we can do?

OK, let's now try property-based testing with the hypothesis module.

After the necessary changes, the testing code becomes:

calc hyptest

We run the tests again using

pytest -v

calchyptestresults

Looking a bit more closely at the pytest output, we see this:

calchyptestresult2

What on earth is this trickery?
Well, it just says:

any float could be passed in, including nan, and in that case... BOOM
so please, kind person writing the code, go back to your function and add something to deal with NaN (not a number) cases
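
One possible way to act on that advice, sketched under my own assumptions (the `add` function, the choice to raise `ValueError`, and the test names are all hypothetical): reject NaN explicitly in the function, and tell Hypothesis to generate only the inputs the function is meant to accept.

```python
import math

from hypothesis import given, strategies as st


def add(x, y):
    # Hypothetical guard: fail loudly instead of silently propagating NaN.
    if math.isnan(x) or math.isnan(y):
        raise ValueError("NaN is not a valid operand")
    return x + y


# Only finite floats: no nan, and no inf (inf + -inf would itself be nan).
finite_floats = st.floats(allow_nan=False, allow_infinity=False)


@given(finite_floats, finite_floats)
def test_add_is_commutative_for_finite_floats(x, y):
    assert add(x, y) == add(y, x)


def test_add_rejects_nan():
    # The NaN case becomes an explicit, ordinary test.
    try:
        add(float("nan"), 1.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for NaN input")
```

Whether to reject NaN, return it, or handle it some other way is a design decision for your own code. The point is that the failing case is now written down and guarded.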

So all in all, it's not rocket science. It's just a way of automatically searching for inputs that break your functions, so that you discover new test cases to write and guard against.

And you don't need to ruin your test patterns or delete your existing tests. You can run Hypothesis-based tests side by side with your other tests in pytest.

If you want to download the code you've seen above, click here so you can add your own tests and play with it. Have fun! Unzip it into a folder and run

pytest -v

in your bash terminal when you are in that folder. Simples (in a meerkat voice).

But I'm a data scientist! I deal with dataframes every day. Little toy examples will not do!!

Well, you are in luck, because there is a community of people who have created a lot of strategies for dealing with things like dataframes, GeoJSON and a lot of other constructs that you might find useful.

Have a look here and here for some examples.
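
To give a flavour of what this looks like, here is a small sketch using the `hypothesis.extra.pandas` module that comes with Hypothesis (it requires pandas to be installed). The column names `price` and `qty` and the property being checked are invented for illustration.

```python
from hypothesis import given, strategies as st
from hypothesis.extra.pandas import column, data_frames


# Generate whole dataframes: each column gets a dtype and a value strategy.
@given(
    data_frames(
        [
            column("price", elements=st.floats(min_value=0, max_value=1e6), dtype=float),
            column("qty", elements=st.integers(min_value=0, max_value=1000), dtype=int),
        ]
    )
)
def test_total_is_never_negative(df):
    # Property: with non-negative prices and quantities, the total is >= 0.
    total = (df["price"] * df["qty"]).sum()
    assert total >= 0
```

Hypothesis will also try the empty dataframe here, a case that is easy to forget in hand-written examples.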

So is PBT only a Python/Haskell thing?

Property-based testing has been implemented in many languages. Below are some example implementations other than the Python one used above. When I find more I will add them here.

The most datasciencey are: