Commit
Add support for DECIMAL types to ArgumentTypeFuzzer (facebookincubator#9373)

Summary:
Pull Request resolved: facebookincubator#9373

ArgumentTypeFuzzer serves two purposes.

1. Given a generic function signature, generate a random valid return type.
2. Given a generic function signature and a concrete return type, generate a list of valid argument types.

Consider the signature of the map_keys function:

```
map(K, V) -> array(K)
```

This signature has 2 type variables: K and V. It doesn’t fully specify the return type, only that it must be an array. There are many possible valid return types, but all of them have the form array(K). When asked to generate a valid return type for this signature, ArgumentTypeFuzzer may return array(bigint) or array(row(...)), but it should not return ‘varchar’.

Now, if we fix the return type to array(bigint) and ask ArgumentTypeFuzzer to generate valid argument types, we may get back a map(bigint, varchar) or a map(bigint, double), but we do not expect a ‘varchar’ or a map(integer, float). By specifying the return type as array(bigint) we effectively bind one of the type variables: K = bigint. At the same time we leave V unspecified, and ArgumentTypeFuzzer is free to choose any type for it.

To generate a return type, create an ArgumentTypeFuzzer by specifying a function signature and a random number generator, then call the fuzzReturnType() method.

```
ArgumentTypeFuzzer fuzzer(signature, rng);
auto returnType = fuzzer.fuzzReturnType();
```

To generate argument types for a given return type, create an ArgumentTypeFuzzer by specifying a function signature, a return type and a random number generator, then call the fuzzArgumentTypes() method.

```
ArgumentTypeFuzzer fuzzer(signature, returnType, rng);
if (fuzzer.fuzzArgumentTypes()) {
  auto argumentTypes = fuzzer.argumentTypes();
}
```

Notice that fuzzArgumentTypes() may fail to generate valid argument types. This can happen if the specified 'returnType' is not a valid return type for this signature.
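The binding of type variables described above, and the reason fuzzArgumentTypes() can fail, can be illustrated with a small standalone sketch. It does not use Velox classes: ToyType, Bindings, isVariable and bindVariables are hypothetical names invented for this illustration, and treating any single uppercase letter as a type variable is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy type model; not Velox's Type or TypeSignature classes.
struct ToyType {
  std::string name;              // "array", "map", "bigint", or a variable such as "K"
  std::vector<ToyType> children; // nested type parameters; empty for scalars
};

using Bindings = std::map<std::string, ToyType>;

// Structural equality of two toy types.
bool sameType(const ToyType& a, const ToyType& b) {
  if (a.name != b.name || a.children.size() != b.children.size()) {
    return false;
  }
  for (std::size_t i = 0; i < a.children.size(); ++i) {
    if (!sameType(a.children[i], b.children[i])) {
      return false;
    }
  }
  return true;
}

// Assumption of this sketch: a type variable is a single uppercase letter.
bool isVariable(const ToyType& t) {
  return t.children.empty() && t.name.size() == 1 && t.name[0] >= 'A' &&
      t.name[0] <= 'Z';
}

// Matches a signature type against a concrete type and records what each
// type variable must be. Returns false if the concrete type does not fit.
bool bindVariables(
    const ToyType& pattern,
    const ToyType& actual,
    Bindings& bindings) {
  if (isVariable(pattern)) {
    auto it = bindings.find(pattern.name);
    if (it == bindings.end()) {
      bindings.emplace(pattern.name, actual);
      return true;
    }
    // A variable that is already bound must match its earlier binding.
    return sameType(it->second, actual);
  }
  if (pattern.name != actual.name ||
      pattern.children.size() != actual.children.size()) {
    return false;
  }
  for (std::size_t i = 0; i < pattern.children.size(); ++i) {
    if (!bindVariables(pattern.children[i], actual.children[i], bindings)) {
      return false;
    }
  }
  return true;
}
```

Matching the return pattern array(K) against array(bigint) succeeds and leaves K bound to bigint while V stays free. Matching it against ‘varchar’ fails, which corresponds to the case where a fuzzer cannot produce argument types for the requested return type.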
This change extends ArgumentTypeFuzzer to support signatures that use generic decimal types.

Consider the signature of the ‘least’ function:

```
(decimal(p, s),...) -> decimal(p, s)
```

This signature has 2 integer variables: p and s. The return type is not fully specified. It can be any valid decimal type. ArgumentTypeFuzzer::fuzzReturnType() needs to generate values for ‘p’ and ‘s’ and return a decimal(p, s) type. If the return type is fixed, say decimal(10, 7), ArgumentTypeFuzzer::fuzzArgumentTypes() needs to figure out that p=10 and s=7 and return a random number of argument types, all of which are decimal(10, 7).

Consider a slightly different function, ‘between’:

```
(decimal(p, s), decimal(p, s)) -> boolean
```

This signature also has 2 integer variables: p and s. The return type is fully specified, though. Hence, ArgumentTypeFuzzer::fuzzReturnType() should always return ‘boolean’. However, when the return type is fixed to the only possible value, ‘boolean’, ArgumentTypeFuzzer::fuzzArgumentTypes() may generate any valid values for p and s and return any decimal type for the arguments, as long as both argument types are the same. A pair of {decimal(10, 7), decimal(10, 7)} is a valid response, as is {decimal(18, 15), decimal(18, 15)}.

Let’s also look at the signature of the ‘floor’ function:

```
(decimal(p, s)) -> decimal(rp, 0)
```

This function has 3 integer variables: p, s, and rp. The ‘rp’ variable has a constraint:

```
rp = min(38, p - s + min(s, 1))
```

The return type can be any decimal with scale 0. ArgumentTypeFuzzer::fuzzReturnType() may return decimal(10, 0) or decimal(7, 0), but it should not return decimal(5, 2).

If we fix the return type and ask ArgumentTypeFuzzer to generate valid argument types, it will need to figure out how to generate values p and s such that rp = min(38, p - s + min(s, 1)). This is a pretty challenging task. Hence, ArgumentTypeFuzzer::fuzzArgumentTypes() doesn’t support signatures with constraints on integer variables.
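To make the ‘floor’ constraint concrete, here is a minimal standalone sketch that evaluates rp for given p and s, and then inverts the formula by exhaustive search over all valid decimal(p, s) types. The function names floorResultPrecision and solveForArguments are hypothetical, and the brute-force inversion is an illustration only; it is not something this change implements, since fuzzArgumentTypes() does not support constrained signatures.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

constexpr int kMaxPrecision = 38;

// Evaluates the constraint rp = min(38, p - s + min(s, 1)).
int floorResultPrecision(int p, int s) {
  assert(p >= 1 && p <= kMaxPrecision);
  assert(s >= 0 && s <= p);
  return std::min(kMaxPrecision, p - s + std::min(s, 1));
}

// Hypothetical inversion by exhaustive search: lists every (p, s) pair whose
// floor() result precision equals 'rp'. Feasible here only because the
// domain is tiny; a general constraint solver would be much harder to write.
std::vector<std::pair<int, int>> solveForArguments(int rp) {
  std::vector<std::pair<int, int>> solutions;
  for (int p = 1; p <= kMaxPrecision; ++p) {
    for (int s = 0; s <= p; ++s) {
      if (floorResultPrecision(p, s) == rp) {
        solutions.emplace_back(p, s);
      }
    }
  }
  return solutions;
}
```

For example, floorResultPrecision(5, 2) evaluates to 4, so floor over decimal(5, 2) yields decimal(4, 0). Going the other way, a single rp generally maps to many (p, s) pairs, which is part of what makes argument fuzzing under such constraints awkward.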
It should be noted that ArgumentTypeFuzzer::fuzzReturnType() may also need to make sure that the generated ‘rp’ is such that there exist ‘p’ and ‘s’ for which the formula above is true. For this particular formula this is easy because a solution exists for any rp: p = rp, s = 0. However, this is not true in general. It might be better to not support ArgumentTypeFuzzer::fuzzReturnType() for signatures with constraints on integer variables.

To fuzz argument or return types, ArgumentTypeFuzzer needs to generate valid values for integer variables. Unlike type variables, integer variables have implicit constraints. A variable that represents precision must have a value in the [1, 38] range. A variable that represents scale must have a value in the [0, precision] range. The fuzzer needs a way to determine which variable represents precision and which represents scale, and for each scale variable it needs to figure out the corresponding precision.

The fuzzer infers these properties from the way variables are used. It examines the argument types and the return type to figure out what each variable represents. When encountering a decimal(p, s) type, the fuzzer determines that p is a precision and s is a scale. When encountering a decimal(p, 5) type, the fuzzer determines that p is a precision that must be >= 5. When encountering decimal(10, s), the fuzzer determines that s is a scale that must be in the [0, 10] range.

This logic is implemented in the ArgumentTypeFuzzer::determineUnboundedIntegerVariables method.

Reviewed By: bikramSingh91

Differential Revision: D55772808

fbshipit-source-id: 708f202fc7270aaeaa59e28175aea5147b4a7981
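The inference rules for integer variables described above can be sketched in a few dozen lines. This is a simplified standalone model, not the actual ArgumentTypeFuzzer code: Slot, DecimalUse, ScaleBound, Inferred, inferVariables and fuzzValues are hypothetical names, and the sketch assumes every decimal occurrence has already been parsed into a precision slot and a scale slot.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <random>
#include <string>
#include <vector>

constexpr int kMaxPrecision = 38;

// One parameter of decimal(...): a variable name or an integer literal.
struct Slot {
  std::string variable; // non-empty when the slot is a variable
  int literal = 0;      // used only when 'variable' is empty
  bool isVariable() const { return !variable.empty(); }
};

Slot var(const std::string& name) { return Slot{name, 0}; }
Slot lit(int value) { return Slot{"", value}; }

// One decimal(precision, scale) occurrence in a signature.
struct DecimalUse {
  Slot precision;
  Slot scale;
};

struct ScaleBound {
  int maxLiteral = kMaxPrecision; // from decimal(10, s): s <= 10
  std::string precisionVariable;  // from decimal(p, s): s <= p
};

struct Inferred {
  std::map<std::string, int> precisionMin;  // variable -> lower bound
  std::map<std::string, ScaleBound> scales; // variable -> upper bound info
};

// Classifies each variable by how it is used and collects its bounds.
Inferred inferVariables(const std::vector<DecimalUse>& uses) {
  Inferred info;
  for (const auto& use : uses) {
    if (use.precision.isVariable()) {
      int& minValue =
          info.precisionMin.try_emplace(use.precision.variable, 1).first->second;
      if (!use.scale.isVariable()) {
        // decimal(p, 5) implies p >= 5.
        minValue = std::max(minValue, use.scale.literal);
      }
    }
    if (use.scale.isVariable()) {
      ScaleBound& bound = info.scales[use.scale.variable];
      if (use.precision.isVariable()) {
        bound.precisionVariable = use.precision.variable;
      } else {
        bound.maxLiteral = std::min(bound.maxLiteral, use.precision.literal);
      }
    }
  }
  return info;
}

// Picks random values: precisions first, then scales within [0, bound].
std::map<std::string, int> fuzzValues(const Inferred& info, std::mt19937& rng) {
  std::map<std::string, int> values;
  for (const auto& [name, minValue] : info.precisionMin) {
    values[name] =
        std::uniform_int_distribution<int>(minValue, kMaxPrecision)(rng);
  }
  for (const auto& [name, bound] : info.scales) {
    int upper = bound.maxLiteral;
    if (!bound.precisionVariable.empty()) {
      upper = std::min(upper, values.at(bound.precisionVariable));
    }
    values[name] = std::uniform_int_distribution<int>(0, upper)(rng);
  }
  return values;
}
```

Generating precisions before scales is a design choice of this sketch: a scale bounded by a precision variable can only be drawn once that precision has a concrete value.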