Commit b7f205b

Add generation script for large CSV files
1 parent 434ba32 commit b7f205b

2 files changed: +23 -0 lines changed

source-code/polars/README.md

Lines changed: 4 additions & 0 deletions
@@ -10,4 +10,8 @@ Polars is an alternative to pandas that is designed to have better performance.
    directory with the same name.
 1. `polars_versus_pandas_benchmarks.ipynb`: Jupyter notebook that compares the
    performance of polars and pandas on a variety of operations.
+1. `create_csv_data.py`: Python script to generate one or more large CSV files
+   for benchmarking.
+1. `create_csv_data.slurm`: Slurm script to run `create_csv_data.py` on a
+   cluster.
 1. `data`: Directory containing the data used in the notebook.
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+#!/usr/bin/env -S bash -l
+#SBATCH --account=lpt2_sysadmin
+#SBATCH --nodes=1
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=1
+#SBATCH --mem=2G
+#SBATCH --time=01:00:00
+#SBATCH --mail-user=geertjan.bex@uhasselt.be
+#SBATCH --mail-type=FAIL,END
+
+module purge
+module load Python/3.11.3-GCCcore-12.3.0
+
+# This should generate a file of approximately 6 GB
+python ./create_csv_data.py \
+    --files 1 \
+    --cols 100 \
+    --rows 2500000 \
+    large_data
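
The diff shown here does not include the body of `create_csv_data.py`, so the following is only a minimal sketch of what a generator with the same command-line interface (`--files`, `--cols`, `--rows`, and a positional base name) might look like. The column names, the `_000.csv` file-name suffix, the `--seed` option, the default values, and the use of uniform random floats are all assumptions made for illustration; the actual script may differ on every one of them.

```python
import argparse
import csv
import random
from pathlib import Path


def parse_args(argv=None):
    """Parse the options used in the Slurm script (defaults are assumed)."""
    parser = argparse.ArgumentParser(
        description="Generate CSV files filled with random floats"
    )
    parser.add_argument("--files", type=int, default=1,
                        help="number of CSV files to write")
    parser.add_argument("--cols", type=int, default=10,
                        help="number of columns per file")
    parser.add_argument("--rows", type=int, default=1000,
                        help="number of data rows per file")
    # Hypothetical option, not present in the Slurm script above.
    parser.add_argument("--seed", type=int, default=None,
                        help="seed for reproducible output")
    parser.add_argument("base_name",
                        help="base name of the generated files")
    return parser.parse_args(argv)


def write_csv(path, n_rows, n_cols, rng):
    """Write a header plus n_rows rows of n_cols random floats to path."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([f"col_{i}" for i in range(n_cols)])
        # Rows are written one at a time, so memory use stays small
        # regardless of the size of the file on disk.
        for _ in range(n_rows):
            writer.writerow([rng.random() for _ in range(n_cols)])


def main(argv=None):
    """Generate the requested files and return their paths."""
    args = parse_args(argv)
    rng = random.Random(args.seed)
    paths = []
    for file_nr in range(args.files):
        # Assumed naming scheme: <base_name>_<three-digit index>.csv
        path = Path(f"{args.base_name}_{file_nr:03d}.csv")
        write_csv(path, args.rows, args.cols, rng)
        paths.append(path)
    return paths
```

Calling `main(["--files", "1", "--cols", "100", "--rows", "2500000", "large_data"])` would mirror the invocation in the Slurm script. That run writes 2,500,000 × 100 = 250,000,000 values, so the 6 GB estimate in the script's comment corresponds to roughly 24 bytes per value, separator included. A sketch like this one streams rows to disk, which fits a small memory request such as `--mem=2G`, although its run time and exact output size are not something this example establishes.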
