### 27 January 2015 2:30 PM (pflua | quickcheck | testing | Igalia)

Most tests are useless without data, and the data they need to run can be fairly complicated. Generating a lot of data by hand sounds tedious, and would get rid of a lot of the advantages of generating tests; the flexibility and speed of having a large test case would disappear. So, what does it look like to generate data for tests? Here is an example of generating pflua's AST, one of its intermediate representations, using a functional grammar; the complete code is available. The key thing to notice is that it is actually fairly easy to do this; it is less than a page of straightforward code, mainly choosing randomly between alternatives.

An AST is generated by running Logical(). The simplest alternatives are to generate a one-word AST, which can be 'true', 'false', or 'fail'. Alternatively, a conditional (with if) or a comparison (with a relative operator like '==') can be generated. But this text explanation is already longer than the code; here's the implementation of 'Logical':

function Logical() return choose({ Conditional, Comparison, True, False, Fail })() end

To handle complications, like avoiding dividing by zero or accessing beyond the bounds of a packet, a Lua table ('db') also gets passed around to the arithmetic operations. 'Comparison' then unwraps the assertions, generating code to test if they're violated and failing if they are.

With the general concepts explained, without further ado, here is the actual code to generate random ASTs for tests.

function True() return { 'true' } end function False() return { 'false' } end function Fail() return { 'fail' } end function ComparisonOp() return choose({ '<', '>' }) end function BinaryOp() return choose({ '+', '-', '/' }) end function UnaryOp() return choose({ 'uint32', 'int32', 'ntohs', 'ntohl' }) end -- Boundary numbers are often particularly interesting; test them often function Number() if math.random() < 0.2 then return math.random(-2^31, 2^32 - 1) else return choose({ 0, 1, -2^31, 2^32-1, 2^31-1 }) end end function Len() return 'len' end function Unary(db) return { UnaryOp(), Arithmetic(db) } end function Binary(db) local op, lhs, rhs = BinaryOp(), Arithmetic(db), Arithmetic(db) if op == '/' then table.insert(db, { '!=', rhs, 0 }) end return { op, lhs, rhs } end function PacketAccess(db) local pkt_access_size = choose({1, 2, 4}) local position = {'uint32', Arithmetic(db) } table.insert(db, {'>=', 'len', {'+', position, pkt_access_size}}) return { '[]', position, pkt_access_size } end function Arithmetic(db) return choose({ Unary, Binary, Number, Len, PacketAccess })(db) end function Comparison() local asserts = {} local expr = { ComparisonOp(), Arithmetic(asserts), Arithmetic(asserts) } while #asserts > 0 do expr = { 'if', table.remove(asserts), expr, { 'fail' } } end return expr end function Conditional() return { 'if', Logical(), Logical(), Logical() } end function Logical() return choose({ Conditional, Comparison, True, False, Fail })() end

The above approach is not perfect; it generates quite large test cases. 'choose' weights options equally, though Number() demonstrates how bias can be introduced if some cases should be tested more often than others. Nonetheless, it's a real-world example of something good enough to be useful, while being easy and flexible at the same time.