Advice
3 votes
3 replies
57 views

I am looking for some assistance with how to convert the XML data below into a dataframe. I have managed to write working code in R (XML package; the code is messy), but then I realised it might even be ...
4 votes
2 answers
195 views

I have a dataset with a column of groups, dates, day of the week and some data columns. For each date in each group, I want to work out the same day average from the last 3 weeks (l3w). I've been ...
0 votes
2 answers
86 views

Suppose I have the following polars DataFrame: df = pl.DataFrame({"a": [["A111", "A110"], ["Z254"], ["B897", "C768", "D456"]]}) ...
0 votes
0 answers
75 views

I'm working in R and have a dataframe mtcars with cars having a column wt (weight of the cars). I'm trying to calculate skewness of weights. The following is the exercise as shown in the book. The 1st ...
4 votes
4 answers
184 views

I'm trying to figure out how to change values in a column (Age), based on the values of two separate columns (Species and Length). I have a dataset of fish lengths, with all of them designated either ...
Ray • 85
3 votes
1 answer
123 views

I have the following dataframe: df <- data.frame( Form=rep(c("Fast", "Medium", "Slow"), each = 3), Parameter =rep(c("Fmax", "TMAX", "B...
Maz • 33
3 votes
1 answer
75 views

hist_df_2["time"] = hist_df_2.apply(lambda row : hist_df_2['timestamp'].replace(str(hist_df_2['date']), ''), axis=1) I tried this to remove the date part from the timestamp. However, for ...
1 vote
1 answer
129 views

I'm working with a large Pandas DataFrame and a multi-dimensional NumPy array. My goal is to efficiently "broadcast" a specific column of the DataFrame across one or more dimensions of the ...
-1 votes
0 answers
83 views

Say I have a pandas DataFrame with more than 2 columns and more than 2 rows. I want to apply a function, such as a datatype conversion, to each element in at least two columns. I would like it to be efficient,...
Best practices
0 votes
4 replies
44 views

I have loaded an Excel file into a Python data frame, but once it is loaded, the file is locked against further changes. I want to unlink the file so that I can also edit the file directly.
Tarun • 1
4 votes
0 answers
135 views

I am trying to filter out the rows whose URI column contains an empty string from a Parquet file with over 50 million rows, using import polars as pl lf = pl.scan_parquet("data.parquet") lf.filter(pl....
5 votes
3 answers
249 views

How can I query columns that are lists or dicts? Here is some basic JSON-like data. [ { "id": 1, "name": "John Doe", "age": 30, ...
1 vote
3 answers
114 views

If I have saved a data frame using pickle in a binary file, how can I access it? def create_dataset(path): """ creates an binary file with dataset saved in it. "...
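A minimal sketch of the general round trip this question asks about, assuming the file was written with `pickle.dump`. The helper names `save_dataset` and `load_dataset` are hypothetical, and a list of row dicts stands in for the data frame so that the example needs only the standard library. For an actual pandas DataFrame, `pd.read_pickle(path)` is the usual shortcut.

```python
import pickle
import tempfile
from pathlib import Path


def save_dataset(path, dataset):
    """Write any picklable object to a binary file (hypothetical helper)."""
    with open(path, "wb") as fh:
        pickle.dump(dataset, fh)


def load_dataset(path):
    """Read back the object stored by save_dataset (hypothetical helper)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Stand-in for the data frame: a list of row dicts (an assumption for illustration).
rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dataset.pkl"
    save_dataset(path, rows)
    restored = load_dataset(path)

# The restored object is an equal copy of what was saved.
print(restored == rows)
```

The file must be opened in binary mode (`"rb"`) for reading, matching the `"wb"` used for writing. Only unpickle files from a trusted source, since loading a pickle can execute arbitrary code.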
-3 votes
2 answers
112 views

I’m trying to write a Python script that allows the user to input the name of a column and then prints the value counts of that column from a pandas DataFrame. Here's what I currently have: def ...
4 votes
2 answers
124 views

Given a DataFrame with a column of multiple rows, I am trying to generate a column with a different random sample for each row from the same range, so I tried to write this: >>> import polars as ...
