Add polars notebook #19
Reviewer's Guide by Sourcery
This pull request adds a new section for Polars, a high-performance alternative to pandas, to the project. It includes updates to the main README.md file and introduces a new README.md file for the Polars section, explaining its purpose and contents.
File-Level Changes
sourcery-ai (bot) left a comment
Hey @gjbex - I've reviewed your changes - here's some feedback:
Overall Comments:
- There's a typo in the main README.md file. 'Kllustrations' should be 'Illustrations' in the Polars entry.
Here's what I looked at during the review
- 🟡 General issues: 5 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟡 Documentation: 1 issue found
suggestion: Consider using a dynamic range for patient iteration.
The loop iterates over a hardcoded range from 1 to 10. Consider using a dynamic range based on the actual number of patients in the dataset to make the code more flexible.
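A minimal sketch of what a data-driven range could look like. The notebook's actual data structure is not shown in this review, so the `measurements` dictionary and its `patient_<n>` keys are hypothetical stand-ins.

```python
# Hypothetical data: one entry per patient, keyed as "patient_<n>".
measurements = {
    "patient_1": [36.6, 37.1],
    "patient_2": [38.2, 37.9],
    "patient_3": [36.9, 37.0],
}

# Derive the upper bound from the data instead of hardcoding it.
nr_patients = len(measurements)

visited = []
for patient_nr in range(1, nr_patients + 1):
    visited.append(f"patient_{patient_nr}")

print(visited)
```

If the dataset gains or loses patients, the loop bound follows without a code change.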
suggestion (bug_risk): Check for key existence in dictionary access.
When accessing dictionary keys using formatted strings, ensure that the keys exist to avoid potential KeyErrors. Consider using a safer method like dict.get() with a default value.
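As an illustration of the `dict.get()` approach, the sketch below looks up a key built from a formatted string and falls back to a default when it is absent. The `doses` dictionary and the `dose_for` helper are hypothetical names, not taken from the notebook.

```python
# Hypothetical lookup table; note that "patient_2" is missing.
doses = {"patient_1": 2.5, "patient_3": 1.0}


def dose_for(patient_nr, default=None):
    """Return the dose for a patient, or `default` if the key is absent."""
    # dict.get() avoids the KeyError that doses[f"patient_{patient_nr}"]
    # would raise for a missing patient.
    return doses.get(f"patient_{patient_nr}", default)


print(dose_for(1))
print(dose_for(2))
print(dose_for(2, default=0.0))
```

Whether a silent default is preferable to an explicit error depends on the analysis: a default can hide missing data, so it may be worth choosing a sentinel such as `None` that is easy to detect downstream.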
suggestion (performance): Optimize null check to specific columns.
Checking for nulls across all columns might be inefficient for large datasets. Consider checking specific columns that are critical for your analysis to improve performance.
time_series.select(
    pl.col('date'),
    pl.any_horizontal(
        pl.col(['critical_column1', 'critical_column2', 'critical_column3']).is_null()
    ).alias('has_null')
)
suggestion (performance): Optimize date filtering by converting column type.
If date filtering is a frequent operation, consider converting the date column to a datetime type once and using direct comparisons to improve performance.
suggestion: Specify interpolation method and handle edge cases.
The suggestion to use linear interpolation is vague. Specify the method of interpolation and consider handling edge cases where interpolation might not be appropriate, such as at the boundaries of the dataset.
issue (documentation): Fix typo in 'Kllustrations'
The word 'Kllustrations' should be 'Illustrations'.
Summary by Sourcery
Add a new Polars section to the project, including a README and a Jupyter notebook that explores functional differences between pandas and Polars, replicating the existing pandas notebook.
New Features:
Documentation: