3,475 questions
3
votes
1
answer
78
views
How to pass argument to func in `pandas.resampler.agg()` when using dict input?
I am trying to resample a pandas DataFrame, and for some columns I would like to compute a sum. Additionally, I want to get None/NaN as the result when there are no rows in a resampling period. For aggregation on ...
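One possible approach to the question above, sketched under assumptions: with a dict passed to `agg()`, extra keyword arguments cannot be routed to individual functions, so a small wrapper that fixes `min_count=1` on `Series.sum` can be used per column. The column names `a` and `b`, the sample data, and the helper `sum_or_nan` are hypothetical and not taken from the question.

```python
import pandas as pd

# Hypothetical sample data: no rows fall on 2024-01-02.
idx = pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-03"])
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 30.0]}, index=idx)


def sum_or_nan(s: pd.Series) -> float:
    """Sum a series, returning NaN when it has no valid values."""
    # min_count=1 makes the sum of an empty series NaN instead of 0.
    return s.sum(min_count=1)


# Each column gets its own aggregation; the wrapper carries the extra argument.
result = df.resample("D").agg({"a": sum_or_nan, "b": "mean"})
print(result)
```

A `lambda s: s.sum(min_count=1)` would behave the same way; the named function only keeps the resulting code easier to read.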
-1
votes
1
answer
55
views
Column-wise aggregation of array vectors: calculating mean per "level" for bid/ask data
I am currently working on a data analysis task in DolphinDB where I need to perform column-wise aggregation on array vectors that store level-10 bid/ask data. Specifically, I have data for bid ...
1
vote
1
answer
107
views
How to prevent duplicate transaction calculations in a ClickHouse materialized view
I’m planning to use ClickHouse to calculate wallet balances based on transactions in my base table. However, there’s an issue: if something goes wrong and I end up inserting the same transactions into ...
0
votes
1
answer
142
views
How to sum two columns and calculate their average in BigQuery?
I'm working with Google BigQuery and I have a table with two numeric columns: grade1 and grade2. I want to calculate the total sum of both columns combined (row-wise) and then find the average of ...
2
votes
1
answer
135
views
PySpark aggregations optimization
I have a huge dataframe with 3B rows. I'm running the PySpark code below with the Spark config.
spark = SparkSession\
.builder\
.appName("App")\
.config("spark....
0
votes
0
answers
79
views
PySpark aggregations fail
I have a PySpark dataframe that contains 100M rows. I'm trying to do a series of aggregations on multiple columns, after a groupby.
df_agg = df.groupby("colA","colB","colC"...
5
votes
2
answers
175
views
Simpler forwarding of contained object
I have a proprietary file format definition that contains a header format:
class Header
{
public:
uint32_t checksum;
uint16_t impedance;
uint16_t type_of_data;
uint32_t ...
1
vote
1
answer
140
views
How do I use a drop down to change field in Vega visualization
In Vega or Vega-Lite, I want to create a stacked area chart where I can change the field used to color the visualization. Here is an example visualization. In this example, I would like to be able ...
2
votes
1
answer
83
views
Problems refactoring pandas.DataFrame.groupby.aggregate to dask.dataframe.groupby.aggregate with custom aggregation
I would like to run groupby and aggregation over a dataframe where the aggregation joins strings with the same id.
The df looks like this:
In [1]: df = pd.DataFrame.from_dict({'id':[1,1,2,2,2,3], '...
2
votes
2
answers
152
views
Does a multiplicity of 0..* always require a reference in the form of an instance variable?
I have modeled the relationship between LeaseAgreement and Person as an aggregation. The '1' on the Person side is meant to indicate that each LeaseAgreement has exactly one reference to a Person (in ...
0
votes
2
answers
125
views
Using StringAgg after filter & distinct
I'm using StringAgg and order as follows:
# Get order column & annotate with list of credits
if request.POST.get('order[0][name]'):
order = request.POST['order[0][name]']
...
0
votes
0
answers
54
views
Average aggregation of data stream in bytewax
I want to aggregate the values of my DataStream in tumbling windows of 10 seconds.
Unfortunately, the Bytewax documentation is very limited, and I can't find any other source where an average of ...
0
votes
0
answers
73
views
Running OpenSearch term aggregations in parallel
We have a query calculating the number of terms on multiple fields.
{
"query": {
"bool": {
"filter": [
{
"term": {
"...
4
votes
2
answers
242
views
How to represent a Map<Enum, Class> relationship in a UML class diagram? [closed]
I have a class Car, an enum Position, and a class Wheel. In Car, I have a map attribute:
private Map<Position, Wheel> wheels;
I want to represent this structure in a UML class diagram. My ...
0
votes
1
answer
108
views
Masked aggregations in pytorch
Given data and mask tensors, is there a PyTorch way to obtain masked aggregations of the data (mean, max, min, etc.)?
x = torch.tensor([
[1, 2, -1, -1],
[10, 20, 30, -1]
])
mask = torch.tensor([
...
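
A minimal sketch of one way masked aggregations might be written, under stated assumptions: the mask in the question is truncated, so the mask below is a hypothetical one that treats `-1` entries as padding. The helper names `masked_mean`, `masked_max`, and `masked_min` are illustrative, not part of PyTorch.

```python
import torch

x = torch.tensor([
    [1, 2, -1, -1],
    [10, 20, 30, -1],
], dtype=torch.float32)

# Assumption: valid entries are those not equal to the -1 padding value.
mask = x != -1


def masked_mean(data: torch.Tensor, mask: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Mean over `dim`, counting only positions where mask is True."""
    m = mask.to(data.dtype)
    # clamp avoids division by zero for rows with no valid entries
    count = m.sum(dim=dim).clamp(min=1)
    return (data * m).sum(dim=dim) / count


def masked_max(data: torch.Tensor, mask: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Max over `dim`; masked-out positions are replaced by -inf first."""
    return data.masked_fill(~mask, float("-inf")).max(dim=dim).values


def masked_min(data: torch.Tensor, mask: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Min over `dim`; masked-out positions are replaced by +inf first."""
    return data.masked_fill(~mask, float("inf")).min(dim=dim).values


mean_vals = masked_mean(x, mask)
max_vals = masked_max(x, mask)
min_vals = masked_min(x, mask)
print(mean_vals, max_vals, min_vals)
```

Filling masked positions with the identity element of each reduction (0 for sum, -inf for max, +inf for min) keeps everything vectorized; rows with no valid entries would yield 0 for the mean and infinities for max/min in this sketch.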