Class DataFrameGroupBy (1.7.0)
 
 
 
 
 
 
DataFrameGroupBy(
 block: bigframes.core.blocks.Block,
 by_col_ids: typing.Sequence[str],
 *,
 selected_cols: typing.Optional[typing.Sequence[str]] = None,
 dropna: bool = True,
 as_index: bool = True
)
Class for grouping and aggregating relational data.
Methods
agg
agg(func=None, **kwargs) -> bigframes.dataframe.DataFrame
Aggregate using one or more operations.
| Parameter | |
|---|---|
| Name | Description | 
| func | function, str, list, dict or None. Function to use for aggregating the data. Accepted combinations are: - string function name - list of function names, e.g.  | 
aggregate
aggregate(func=None, **kwargs) -> bigframes.dataframe.DataFrame
API documentation for aggregate method.
all
all() -> bigframes.dataframe.DataFrame
Return True if all values in the group are true, else False.
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | DataFrame or Series of boolean values, where a value is True if all elements are True within its respective group; otherwise False. | 
any
any() -> bigframes.dataframe.DataFrame
Return True if any value in the group is true, else False.
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | DataFrame or Series of boolean values, where a value is True if any element is True within its respective group; otherwise False. | 
count
count() -> bigframes.dataframe.DataFrame
Compute count of group, excluding missing values.
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Count of values within each group. | 
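To make "excluding missing values" concrete, here is a plain-Python sketch of the counting rule. It does not call bigframes: `group_count` is a hypothetical helper written for this illustration, and `None` stands in for a missing value.

```python
from collections import defaultdict


def group_count(rows, key, value):
    """Count non-missing `value` entries per `key` (hypothetical helper)."""
    counts = defaultdict(int)
    for row in rows:
        # Adding 0 still registers the key, so an all-missing group reports 0.
        counts[row[key]] += 0 if row[value] is None else 1
    return dict(counts)


rows = [
    {"key": "a", "val": 1},
    {"key": "a", "val": None},  # missing, not counted
    {"key": "b", "val": 3},
    {"key": "b", "val": 4},
    {"key": "c", "val": None},  # group with only missing values
]
result = group_count(rows, "key", "val")
print(result)  # {'a': 1, 'b': 2, 'c': 0}
```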
cumcount
cumcount(ascending: bool = True)
Number each item in each group from 0 to the length of that group - 1.
| Parameter | |
|---|---|
| Name | Description | 
| ascending | bool, default True. If False, number in reverse, from length of group - 1 to 0. | 
| Returns | |
|---|---|
| Type | Description | 
| Series | Sequence number of each element within each group. | 
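The numbering rule can be sketched in plain Python as follows. This is an illustration of the semantics described above, not the bigframes implementation; `group_cumcount` is a hypothetical helper name.

```python
from collections import defaultdict


def group_cumcount(keys, ascending=True):
    """Number each item within its group, in row order (hypothetical helper)."""
    # First pass: group sizes, needed only for the descending numbering.
    sizes = defaultdict(int)
    for k in keys:
        sizes[k] += 1

    seen = defaultdict(int)
    out = []
    for k in keys:
        pos = seen[k]  # 0-based position of this row inside its group
        seen[k] += 1
        out.append(pos if ascending else sizes[k] - 1 - pos)
    return out


keys = ["a", "a", "b", "a", "b"]
forward = group_cumcount(keys)
backward = group_cumcount(keys, ascending=False)
print(forward)   # [0, 1, 0, 2, 1]
print(backward)  # [2, 1, 1, 0, 0]
```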
cummax
cummax(
 *args, numeric_only: bool = False, **kwargs
) -> bigframes.dataframe.DataFrame
Cumulative max for each group.
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Cumulative max for each group. | 
cummin
cummin(
 *args, numeric_only: bool = False, **kwargs
) -> bigframes.dataframe.DataFrame
Cumulative min for each group.
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Cumulative min for each group. | 
cumprod
cumprod(*args, **kwargs) -> bigframes.dataframe.DataFrame
Cumulative product for each group.
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Cumulative product for each group. | 
cumsum
cumsum(
 *args, numeric_only: bool = False, **kwargs
) -> bigframes.dataframe.DataFrame
Cumulative sum for each group.
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Cumulative sum for each group. | 
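The cumulative methods (cummax, cummin, cumprod, cumsum) share one pattern: a running value is kept per group and updated row by row. A minimal plain-Python sketch of that pattern, using a hypothetical `group_cumulative` helper and ignoring missing-value handling, might look like this. It does not call bigframes.

```python
import operator


def group_cumulative(keys, values, op):
    """Apply a running binary operation within each group (hypothetical helper)."""
    running = {}
    out = []
    for k, v in zip(keys, values):
        # The first value of a group starts the running result.
        running[k] = v if k not in running else op(running[k], v)
        out.append(running[k])
    return out


keys = ["a", "b", "a", "b", "a"]
values = [3, 10, 1, 20, 2]

cumsum = group_cumulative(keys, values, operator.add)
cummax = group_cumulative(keys, values, max)
cummin = group_cumulative(keys, values, min)
cumprod = group_cumulative(keys, values, operator.mul)
print(cumsum)   # [3, 10, 4, 30, 6]
print(cummax)   # [3, 10, 3, 20, 3]
print(cummin)   # [3, 10, 1, 10, 1]
print(cumprod)  # [3, 10, 3, 200, 6]
```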
diff
diff(periods=1) -> bigframes.series.Series
First discrete difference of element. Calculates the difference of each element compared with another element in the group (default is element in previous row).
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | First differences. | 
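As a rough illustration, the per-group difference can be sketched in plain Python. The helper `group_diff` is hypothetical, handles only positive `periods`, and uses `None` where no earlier row exists in the group. It does not call bigframes.

```python
from collections import defaultdict


def group_diff(keys, values, periods=1):
    """Difference between each value and the one `periods` rows earlier in its group."""
    history = defaultdict(list)
    out = []
    for k, v in zip(keys, values):
        previous = history[k]
        # No result until the group has at least `periods` earlier rows.
        out.append(v - previous[-periods] if len(previous) >= periods else None)
        previous.append(v)
    return out


keys = ["a", "b", "a", "b", "a"]
values = [1, 10, 4, 15, 9]
first_diff = group_diff(keys, values)
second_diff = group_diff(keys, values, periods=2)
print(first_diff)   # [None, None, 3, 5, 5]
print(second_diff)  # [None, None, None, None, 8]
```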
expanding
expanding(min_periods: int = 1) -> bigframes.core.window.Window
Provides expanding functionality.
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | An expanding grouper, providing expanding functionality per group. | 
kurt
kurt(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrame
Return unbiased kurtosis over requested axis.
Kurtosis obtained using Fisher's definition of kurtosis (kurtosis of normal == 0.0). Normalized by N-1.
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
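For reference, the standard bias-corrected sample excess kurtosis (Fisher's definition) can be computed as in the sketch below. That bigframes uses exactly this estimator is an assumption; the page only says the result is unbiased, follows Fisher's definition, and is normalized by N-1. `unbiased_kurtosis` is a hypothetical helper that would be applied to each group's values separately.

```python
def unbiased_kurtosis(values):
    """Bias-corrected excess kurtosis of one group's values (hypothetical helper).

    Assumes at least 4 observations that are not all identical.
    """
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)  # sum of squared deviations
    m4 = sum((v - mean) ** 4 for v in values)  # sum of fourth-power deviations
    numerator = n * (n + 1) * (n - 1) * m4
    denominator = (n - 2) * (n - 3) * m2 ** 2
    # Subtracting this term makes a normal distribution score 0.
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return numerator / denominator - correction


kurt_a = unbiased_kurtosis([1, 2, 3, 4, 5])
print(round(kurt_a, 6))  # -1.2
```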
kurtosis
kurtosis(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrame
API documentation for kurtosis method.
max
max(numeric_only: bool = False, *args) -> bigframes.dataframe.DataFrame
Compute max of group values.
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
| min_count | int, default 0. The required number of valid values to perform the operation. If fewer than  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Computed max of values within each group. | 
mean
mean(numeric_only: bool = False, *args) -> bigframes.dataframe.DataFrame
Compute mean of groups, excluding missing values.
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
| Returns | |
|---|---|
| Type | Description | 
| pandas.Series or pandas.DataFrame | Mean of groups. | 
median
median(
 numeric_only: bool = False, *, exact: bool = True
) -> bigframes.dataframe.DataFrame
Compute median of groups, excluding missing values.
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
| exact | bool, default True. Calculate the exact median instead of an approximation. | 
| Returns | |
|---|---|
| Type | Description | 
| pandas.Series or pandas.DataFrame | Median of groups. | 
min
min(numeric_only: bool = False, *args) -> bigframes.dataframe.DataFrame
Compute min of group values.
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
| min_count | int, default 0. The required number of valid values to perform the operation. If fewer than  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Computed min of values within each group. | 
nunique
nunique() -> bigframes.dataframe.DataFrame
Return DataFrame with counts of unique elements in each position.
prod
prod(numeric_only: bool = False, min_count: int = 0)
Compute prod of group values.
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
| min_count | int, default 0. The required number of valid values to perform the operation. If fewer than  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Computed prod of values within each group. | 
quantile
quantile(
 q: typing.Union[float, typing.Sequence[float]] = 0.5, *, numeric_only: bool = False
) -> bigframes.dataframe.DataFrame
Return group values at the given quantile, a la numpy.percentile.
Examples:
>>> import bigframes.pandas as bpd
>>> bpd.options.display.progress_bar = None
>>> df = bpd.DataFrame([
... ['a', 1], ['a', 2], ['a', 3],
... ['b', 1], ['b', 3], ['b', 5]
... ], columns=['key', 'val'])
>>> df.groupby('key').quantile()
     val
key
a    2.0
b    3.0
<BLANKLINE>
[2 rows x 1 columns]
| Parameters | |
|---|---|
| Name | Description | 
| q | float or array-like, default 0.5 (50% quantile). Value(s) between 0 and 1 providing the quantile(s) to compute. | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Return type determined by caller of GroupBy object. | 
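The example output above is consistent with linear interpolation between neighbouring sorted values, which is the default method of numpy.percentile. The plain-Python sketch below reproduces those numbers under that assumption. `linear_quantile` and `group_quantile` are hypothetical helpers and do not call bigframes.

```python
import math
from collections import defaultdict


def linear_quantile(values, q):
    """Quantile of one group using linear interpolation (assumed method)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)          # fractional index into the sorted values
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


def group_quantile(rows, q=0.5):
    """Group (key, value) pairs by key and compute the quantile per group."""
    groups = defaultdict(list)
    for key, val in rows:
        groups[key].append(val)
    return {key: linear_quantile(vals, q) for key, vals in groups.items()}


rows = [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 3), ("b", 5)]
medians = group_quantile(rows)
quartiles = group_quantile(rows, q=0.25)
print(medians)    # {'a': 2.0, 'b': 3.0}
print(quartiles)  # {'a': 1.5, 'b': 2.0}
```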
rolling
rolling(window: int, min_periods=None) -> bigframes.core.window.Window
Returns a rolling grouper, providing rolling functionality per group.
| Parameter | |
|---|---|
| Name | Description | 
| min_periods | int, default None. Minimum number of observations in window required to have a value; otherwise, result is  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Return a new grouper with our rolling appended. | 
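To show how `window` and `min_periods` interact within a group, here is a plain-Python sketch of a per-group rolling sum. `group_rolling_sum` is a hypothetical helper and does not call bigframes. Treating `min_periods=None` as "equal to the window size" follows the pandas convention and is an assumption here, as is using `None` for positions without enough observations.

```python
from collections import defaultdict


def group_rolling_sum(keys, values, window, min_periods=None):
    """Sum over the last `window` rows of each group (hypothetical helper)."""
    if min_periods is None:
        min_periods = window  # assumption: mirrors the pandas default
    history = defaultdict(list)
    out = []
    for k, v in zip(keys, values):
        history[k].append(v)
        recent = history[k][-window:]  # rows of this group inside the window
        out.append(sum(recent) if len(recent) >= min_periods else None)
    return out


keys = ["a", "a", "a", "b", "b"]
values = [1, 2, 3, 10, 20]
strict = group_rolling_sum(keys, values, window=2)
lenient = group_rolling_sum(keys, values, window=2, min_periods=1)
print(strict)   # [None, 3, 5, None, 30]
print(lenient)  # [1, 3, 5, 10, 30]
```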
shift
shift(periods=1) -> bigframes.series.Series
Shift each group by periods observations.
| Parameter | |
|---|---|
| Name | Description | 
| periods | int, default 1. Number of periods to shift. | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Object shifted within each group. | 
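Shifting within groups means each row receives the value from `periods` rows earlier in the same group, never from another group. A plain-Python sketch of that behaviour follows. `group_shift` is a hypothetical helper that handles only positive `periods`, uses `None` for positions with no earlier row, and does not call bigframes.

```python
from collections import defaultdict


def group_shift(keys, values, periods=1):
    """Return, for each row, the value `periods` rows earlier in its group."""
    history = defaultdict(list)
    out = []
    for k, v in zip(keys, values):
        previous = history[k]
        # Rows at the start of a group have nothing to shift in.
        out.append(previous[-periods] if len(previous) >= periods else None)
        previous.append(v)
    return out


keys = ["a", "b", "a", "b", "a"]
values = [1, 10, 4, 15, 9]
shifted = group_shift(keys, values)
print(shifted)  # [None, None, 1, 10, 4]
```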
skew
skew(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrame
Return unbiased skew within groups.
Normalized by N-1.
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
std
std(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrame
Compute standard deviation of groups, excluding missing values.
For multiple groupings, the result index will be a MultiIndex.
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Standard deviation of values within each group. | 
sum
sum(numeric_only: bool = False, *args) -> bigframes.dataframe.DataFrame
Compute sum of group values.
| Parameters | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. | 
| min_count | int, default 0. The required number of valid values to perform the operation. If fewer than  | 
| Returns | |
|---|---|
| Type | Description | 
| Series or DataFrame | Computed sum of values within each group. | 
var
var(*, numeric_only: bool = False) -> bigframes.dataframe.DataFrame
Compute variance of groups, excluding missing values.
For multiple groupings, the result index will be a MultiIndex.
| Parameter | |
|---|---|
| Name | Description | 
| numeric_only | bool, default False. Include only float, int, boolean columns. |