148,187 questions
 
1 vote, 0 answers, 37 views
 
 Polars LazyFrame sink_parquet + PartitionByKey slower to S3 than local disk
 I'm wondering why I'm seeing such poor performance when writing a LazyFrame using PartitionByKey to S3 when compared to other methods. Here is a simple test script that writes out some random data to ...
 
 
 
 
0 votes, 1 answer, 83 views
 
 Combining two dataframes and keeping the average
 I'm new to coding, and I'm trying to combine the data from two weather stations into one new dataframe sorted by Datetime. I want this new dataframe to contain the average values of the two original ...
 
 
 
 
2 votes, 1 answer, 120 views
 
 How can I force release of internally allocated memory to avoid accumulated allocations leading to MemoryError? [closed]
This is my first Stack Overflow post, so apologies in advance if I break any rules; feedback is appreciated. I could have posted on Staging Ground and should have done that earlier, but I have had this issue too long and need a ...
 
 
 
 
0 votes, 1 answer, 145 views
 
 Why do my loops not start? Indexing loops problems?
I'm trying to convert a SAS program into an R one, and I have stumbled on the for() loop and array part. The log keeps saying:
"Error in for (. in i) seq_len(NBR_LIGNES_MAX) : 4 arguments ...
 
 
 
 
-3 votes, 0 answers, 51 views
 
 Return only 3 numbers with duplicate values in each row from a dataframe [duplicate]
 From a dataframe I would like to return only the duplicates with 3 different numbers in each row:
df = pd.DataFrame([[4,6,10,21,30,4,6,21,33], # 4,6,21 this has 3 duplicate
 [1,2,4,16,...
 
 
 
 
5 votes, 2 answers, 153 views
 
 Drop duplicate values when merging dataframe
I have a DataFrame that I want to merge, dropping only duplicate values based on column name and row. For example, key_x and key_y have the same values in rows 0, 3, 10, 12, and 15.
My DataFrame
...
 
 
 
 
0 votes, 0 answers, 42 views
 
 pywebview: error maximum recursion depth exceeded before pressing button when passing pandas/model objects in js_api
 I’m embedding a small UI with pywebview and want Python to JS live updates. I created a GPSSpoofingDetector class that loads a pickled sklearn model and a pandas test CSV. I want a JavaScript "Start" ...
 
 
 
 
1 vote, 2 answers, 93 views
 
 Combining Identically Indexed and Column Dataframes into 3d Dataframe
 I have 3 2D DataFrames, all with identical indexes (datetime range) and column names, but different data for these labels. I would like to combine these three 2D dataframes into 1 3D DataFrame with an ...
 
 
 
 
-1 votes, 1 answer, 97 views
 
 Compare 2 columns in Polars and rearrange them when they match and unmatch?
I have a Polars DataFrame with 2 columns [Col01 & Col02]. They hold the same values, though not the same number of times [e.g. Col01 can have, say, 5 rows of '00000' while Col02 may have 20 rows of '00000' ...
 
 
 
 
0 votes, 2 answers, 172 views
 
 Error due to single-level dataframe merge with multi-level indexed dataframe
 # Read lookup file which only contains 5 columns.
df_lookup = pd.read_excel(
 os.path.join(path, 'lookup.xlsx'),
 index_col=[0, 1, 2, 3, 4])
# sample df_lookup
# |A |B |C |D |E |
# |--|--|--|--|...
 
 
 
 
2 votes, 0 answers, 71 views
 
 How to control the zorder values on superimposed bars in a histogram plot in matplotlib
 I have a list of three dataframes, each of them having four columns of interest. I want to create a figure with four subplots (one for each column). In each subplot, first, I want to create a ...
 
 
 
 
7 votes, 1 answer, 218 views
 
 How to write a pandas-compatible, non-elementary expression in narwhals
 I'm working with the narwhals package and I'm trying to write an expression that is:
applied over groups using .over()
Non-elementary/chained (longer than a single operation)
Works when the native df ...
 
 
 
 
4 votes, 4 answers, 144 views
 
 Create an incremental suffix for values in a pandas column that have duplicate values in another column
 Setup
I have a dataframe, df
import pandas as pd
df = pd.DataFrame(
 {
 'Name':['foo','foo','foo','bar','bar','bar','baz','baz','baz'],
 'Color':['red','blue','red','green','green','...
 
 
 
 
-2 votes, 1 answer, 200 views
 
 How to read a Microsoft SQL Data with Polars [closed]
I would like to read a database with Polars and benefit from its speed compared to pandas.
Currently I use this function to read the database with pandas. So my question is simple: how do I convert it to Polars and get ...
 
 
 
 
3 votes, 1 answer, 123 views
 
 Increase the date by number of months in pandas
I have the pandas data frame below:
import pandas as pd
import numpy as np
dat = pd.DataFrame({'A' : [1,2,3,4,5], 'B' : ['2002-01-01', '2003-01-01', '2004-01-01', '2004-01-01', '2005-01-01']})
dat['A'] = ...