-
Notifications
You must be signed in to change notification settings - Fork 246
fix: re-enable Comet abs
#2595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: re-enable Comet abs
#2595
Conversation
Codecov Report
✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 59.33%. Comparing base (f09f8af
) to head (1ddff98
).
Additional details and impacted files
@@ Coverage Diff @@ ## main #2595 +/- ## ============================================ + Coverage 56.12% 59.33% +3.20% - Complexity 976 1444 +468 ============================================ Files 119 146 +27 Lines 11743 13758 +2015 Branches 2251 2353 +102 ============================================ + Hits 6591 8163 +1572 - Misses 4012 4373 +361 - Partials 1140 1222 +82
☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.
🚀 New features to boost your workflow:
- ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
c163a77
to
0212d8b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the diff, test with ANSI mode on/off.
0212d8b
to
f93a55f
Compare
f93a55f
to
9cb2f6c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We store negative values in these columns, I think the schema should not be unsigned int.
// CometTestBase.scala record.add(8, (-i).toByte) record.add(9, (-i).toShort) record.add(10, -i) record.add(11, (-i).toLong)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have UINT here to make sure we cover all types that parquet has. The data files created here are specifically designed to test whether parquet readers can handle all types correctly. Negative values stored in a UINT parquet type test the values around the boundary of allowed values.
To illustrate with an example, when you store the value -1 in a UINT_8 field what gets stored is the bit pattern 0xff. On reading, this is read back as the value 255 which is the maximum value for a UINT_8.
This is both correct and desirable.
Thanks @hsiang-c WDYT of implementing abs with spark flavor in DF? Like I did recently for concat apache/datafusion#18128
Uh oh!
There was an error while loading. Please reload this page.
Which issue does this PR close?
Closes #1890
Partially closes #2314
Rationale for this change
abs
implementation returns unexpected result and was turned off.What changes are included in this PR?
org.apache.spark.SparkArithmeticException
on theMIN_VALUE
of Spark'sIntegralType
, see doc.CometTestBase
, changed the types of column_9
,_10
,_11
and_12
fromUINT_8/16/32/64
toINT_8/16/32/64
b/c we actually have negative values in test data.How are these changes tested?
MIN_VALUE
and decimal values with different precision and scale.