Data: Add TCK coverage for reader default values #16638
Co-authored-by: Joy Haldar <joy.haldar@target.com>
| Arguments.of(Types.DoubleType.get(), Literal.of(-0.0D)), | ||
| Arguments.of(Types.DateType.get(), Literal.of(DateTimeUtil.isoDateToDays("2024-12-17"))), | ||
| Arguments.of( | ||
| Types.TimeType.get(), Literal.of(DateTimeUtil.isoTimeToMicros("23:59:59.999999"))), |
Re-enabled the TIME case here. It passes for AVRO and PARQUET; ORC skips it. The case was added commented out in DataTestBase and I haven't traced why, so I'm flagging it in case that was intentional.
| .map(generator -> Arguments.of(format, generator))) | ||
| .toList(); | ||
|
|
||
| private static final List<Arguments> PRIMITIVE_TYPES_AND_DEFAULTS = |
Could we use a DataGenerator for this?
The matrix is a list of (type, default-literal) pairs, and the test asserts that the reader injects exactly that literal, so each type has to stay paired with its expected value. A DataGenerator provides one schema() plus random rows, so it can't carry that pairing, at least as I understand the current interface.
Can we create a generator whose schema has default values, and add two tests: one with nulls, where we check for the default values, and one with random non-null data, where we check for the generated data?
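To make the two-test suggestion concrete, here is a minimal, self-contained sketch of the idea. It does not use the Iceberg API: `DefaultValueSketch`, `Field`, and `read` are hypothetical names invented for illustration, and it assumes "nulls" means "the column is absent from the written data", which may not match what the reviewer intends. The default `34` for `value_int` is borrowed from the expected values in this PR.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of "schema with defaults"; not the Iceberg or TCK API.
class DefaultValueSketch {

  // A read-schema field paired with the default the reader should inject.
  static final class Field {
    final String name;
    final Object defaultValue;

    Field(String name, Object defaultValue) {
      this.name = name;
      this.defaultValue = defaultValue;
    }
  }

  // Projects a stored row onto the read schema. A column that was never
  // written takes the field's default; a written column keeps its value.
  static Map<String, Object> read(List<Field> readSchema, Map<String, Object> stored) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (Field field : readSchema) {
      Object value =
          stored.containsKey(field.name) ? stored.get(field.name) : field.defaultValue;
      result.put(field.name, value);
    }
    return result;
  }

  public static void main(String[] args) {
    List<Field> readSchema =
        Arrays.asList(new Field("value_str", null), new Field("value_int", 34));

    // Test 1: the column is missing from the written data, so expect the default.
    Map<String, Object> withoutColumn = new HashMap<>();
    withoutColumn.put("value_str", "a");
    System.out.println(read(readSchema, withoutColumn));

    // Test 2: the column was written, so expect the written value, not the default.
    Map<String, Object> withColumn = new HashMap<>(withoutColumn);
    withColumn.put("value_int", 7);
    System.out.println(read(readSchema, withColumn));
  }
}
```

In this shape the expected default travels with the schema field rather than with a separate (type, literal) matrix, which is the trade-off being discussed above. Whether the real DataGenerator interface can expose it this way is an open question in this thread.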
|
|
||
| @ParameterizedTest | ||
| @FieldSource("FILE_FORMATS") | ||
| void testDefaultValues(FileFormat fileFormat) throws IOException { |
Do we need this test? It seems to be the same as testSchemaEvolutionAddColumn.
| expectedNested.setField("missing_inner_float", -0.0F); | ||
| expected.setField("nested", expectedNested); | ||
| } | ||
| return expected; |
| .copy("value_str", val.getField("value_str"), "value_int", 34))); | ||
| expected.setField("nested_map", rebuilt); | ||
| } | ||
| return expected; |
nit: add a new line here. Please check the rest of the code for the same issue. Thanks.
| .collect(Collectors.toList()); | ||
| expected.setField("nested_list", rebuilt); | ||
| } | ||
| return expected; |
| List<Record> genericRecords = RandomGenericData.generate(writeSchema, 10, 1L); | ||
| writeGenericRecords(fileFormat, writeSchema, genericRecords); | ||
|
|
||
| Schema expectedSchema = |
Can we reuse writeSchema to create expectedSchema?
Adds the reader default-value tests from DataTestBase into the Base Format model TCK: