Quasi-arithmetic mean

Generalization of means

In mathematics and statistics, the quasi-arithmetic mean or generalised f-mean or Kolmogorov-Nagumo-de Finetti mean^[1] is one generalisation of the more familiar means such as the arithmetic mean and the geometric mean, using a function $f$. It is also called Kolmogorov mean after Soviet mathematician Andrey Kolmogorov. It is a broader generalization than the regular generalized mean.

Definition

If $f$ is a function which maps an interval $I$ of the real line to the real numbers, and is both continuous and injective, the f-mean of $n$ numbers $x_{1},\dots ,x_{n}\in I$ is defined as $M_{f}(x_{1},\dots ,x_{n})=f^{-1}\left({\frac {f(x_{1})+\cdots +f(x_{n})}{n}}\right)$, which can also be written

$$M_{f}({\vec {x}})=f^{-1}\left({\frac {1}{n}}\sum _{k=1}^{n}f(x_{k})\right)$$

We require $f$ to be injective in order for the inverse function $f^{-1}$ to exist. Since $f$ is continuous and defined over an interval, its image is also an interval, so the average ${\frac {f(x_{1})+\cdots +f(x_{n})}{n}}$ lies within the domain of $f^{-1}$.

Since $f$ is injective and continuous, it follows that $f$ is a strictly monotonic function, and therefore that the f-mean is neither larger than the largest number of the tuple $x$ nor smaller than the smallest number in $x$.
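The definition translates almost directly into code. The following is a minimal sketch rather than a reference implementation: the function name `quasi_arithmetic_mean` and its signature are assumptions made for illustration, and the caller is assumed to supply a continuous injective `f` together with its inverse `f_inv`.

```python
import math
from typing import Callable, Sequence


def quasi_arithmetic_mean(
    f: Callable[[float], float],
    f_inv: Callable[[float], float],
    xs: Sequence[float],
) -> float:
    """Return f_inv(average of f(x) for x in xs).

    Assumes f is continuous and injective on an interval containing xs,
    and that f_inv is its inverse on the image of that interval.
    """
    if not xs:
        raise ValueError("the f-mean of an empty tuple is undefined")
    # fsum limits rounding error when accumulating the transformed values.
    return f_inv(math.fsum(f(x) for x in xs) / len(xs))


# With f = log the f-mean is the geometric mean: sqrt(2 * 8) = 4.
m = quasi_arithmetic_mean(math.log, math.exp, [2.0, 8.0])
```

Passing the inverse explicitly avoids numerical root-finding; the result stays between the smallest and largest input because `f` is strictly monotonic.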

Examples

If $I=\mathbb {R}$, the real line, and $f(x)=x$ (or indeed any linear function $x\mapsto a\cdot x+b$ with $a$ not equal to 0), then the f-mean corresponds to the arithmetic mean.
If $I=\mathbb {R} ^{+}$, the positive real numbers, and $f(x)=\log(x)$, then the f-mean corresponds to the geometric mean. According to the f-mean properties, the result does not depend on the base of the logarithm as long as it is positive and not 1.
If $I=\mathbb {R} ^{+}$ and $f(x)={\frac {1}{x}}$, then the f-mean corresponds to the harmonic mean.
If $I=\mathbb {R} ^{+}$ and $f(x)=x^{p}$ with $p\neq 0$, then the f-mean corresponds to the power mean with exponent $p$.
If $I=\mathbb {R}$ and $f(x)=\exp(x)$, then the f-mean is the mean in the log semiring, which is a constant-shifted version of the LogSumExp (LSE) function (which is the logarithmic sum), $M_{f}(x_{1},\dots ,x_{n})=\mathrm {LSE} (x_{1},\dots ,x_{n})-\log(n)$. The $-\log(n)$ corresponds to dividing by $n$, since division becomes subtraction in the logarithmic domain. The LogSumExp function is a smooth maximum: a smooth approximation to the maximum function.
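The examples above can be checked numerically. The sketch below uses a hypothetical helper `f_mean` (not taken from any library) and arbitrary sample values; the exponent `p = 3` and the linear map `5x - 2` are illustrative choices only.

```python
import math
import statistics


def f_mean(f, f_inv, xs):
    """Quasi-arithmetic mean f_inv(average of f(x)); hypothetical helper."""
    return f_inv(math.fsum(f(x) for x in xs) / len(xs))


xs = [1.0, 2.0, 4.0]
p = 3.0  # illustrative exponent for the power mean

# Any linear f with nonzero slope gives the arithmetic mean.
arithmetic = f_mean(lambda x: 5.0 * x - 2.0, lambda y: (y + 2.0) / 5.0, xs)
geometric = f_mean(math.log, math.exp, xs)
harmonic = f_mean(lambda x: 1.0 / x, lambda y: 1.0 / y, xs)
power = f_mean(lambda x: x ** p, lambda y: y ** (1.0 / p), xs)
log_mean_exp = f_mean(math.exp, math.log, xs)

# LogSumExp computed directly, for comparison with the f = exp case.
lse = math.log(math.fsum(math.exp(x) for x in xs))
```

Each value can be compared with the corresponding function in Python's `statistics` module, and `log_mean_exp` with `lse - math.log(len(xs))`.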

Properties

The following properties hold for $M_{f}$ for any single function $f$:

Symmetry: The value of $M_{f}$ is unchanged if its arguments are permuted.

Idempotency: for all $x$, $M_{f}(x,\dots ,x)=x$.

Monotonicity: $M_{f}$ is monotonic in each of its arguments (since $f$ is monotonic).

Continuity: $M_{f}$ is continuous in each of its arguments (since $f$ is continuous).

Replacement: Subsets of elements can be averaged a priori, without altering the mean, given that the multiplicity of elements is maintained. With $m=M_{f}(x_{1},\dots ,x_{k})$ it holds:

$$M_{f}(x_{1},\dots ,x_{k},x_{k+1},\dots ,x_{n})=M_{f}(\underbrace {m,\dots ,m} _{k{\text{ times}}},x_{k+1},\dots ,x_{n})$$

Partitioning: The computation of the mean can be split into computations of equal sized sub-blocks: $M_{f}(x_{1},\dots ,x_{n\cdot k})=M_{f}(M_{f}(x_{1},\dots ,x_{k}),M_{f}(x_{k+1},\dots ,x_{2\cdot k}),\dots ,M_{f}(x_{(n-1)\cdot k+1},\dots ,x_{n\cdot k}))$

Self-distributivity: For any quasi-arithmetic mean $M$ of two variables: $M(x,M(y,z))=M(M(x,y),M(x,z))$.

Mediality: For any quasi-arithmetic mean $M$ of two variables: $M(M(x,y),M(z,w))=M(M(x,z),M(y,w))$.

Balancing: For any quasi-arithmetic mean $M$ of two variables: $M{\big (}M(x,M(x,y)),M(y,M(x,y)){\big )}=M(x,y)$.

Central limit theorem: Under regularity conditions, for a sufficiently large sample, ${\sqrt {n}}\{M_{f}(X_{1},\dots ,X_{n})-f^{-1}(E_{f}(X_{1},\dots ,X_{n}))\}$ is approximately normal.^[2] A similar result is available for Bajraktarević means and deviation means, which are generalizations of quasi-arithmetic means.^[3]^[4]

Scale-invariance: The quasi-arithmetic mean is invariant with respect to offsets and scaling of $f$: $\forall a\ \forall b\neq 0\ ((\forall t\ g(t)=a+b\cdot f(t))\Rightarrow \forall x\ M_{f}(x)=M_{g}(x))$.
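Several of the algebraic properties above can be spot-checked numerically. The following sketch is an illustration under assumptions, not a proof: it fixes $f(x)=x^{3}$ on the positive reals, uses a hypothetical helper `f_mean`, and compares both sides of each identity on arbitrary sample values up to floating-point tolerance.

```python
import math


def f_mean(f, f_inv, xs):
    """Quasi-arithmetic mean f_inv(average of f(x)); hypothetical helper."""
    return f_inv(math.fsum(f(x) for x in xs) / len(xs))


def cube(x):
    return x ** 3


def cbrt(y):
    # Inverse of cube on the positive reals only.
    return y ** (1.0 / 3.0)


def M(*xs):
    """f-mean for the illustrative choice f(x) = x**3."""
    return f_mean(cube, cbrt, xs)


x = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0]

# Replacement: average the first k = 2 elements, keep their multiplicity.
m = M(*x[:2])
replacement_ok = math.isclose(M(*x), M(m, m, *x[2:]))

# Partitioning: three sub-blocks of equal size two.
partition_ok = math.isclose(M(*x), M(M(*x[0:2]), M(*x[2:4]), M(*x[4:6])))

# Two-variable identities.
a, b, c, d = 1.0, 2.0, 3.0, 5.0
mediality_ok = math.isclose(M(M(a, b), M(c, d)), M(M(a, c), M(b, d)))
self_dist_ok = math.isclose(M(a, M(b, c)), M(M(a, b), M(a, c)))
balancing_ok = math.isclose(M(M(a, M(a, b)), M(b, M(a, b))), M(a, b))

# Scale-invariance: g = 7 - 2 f yields the same mean as f.
scale_ok = math.isclose(
    M(*x),
    f_mean(lambda t: 7.0 - 2.0 * cube(t), lambda y: cbrt((7.0 - y) / 2.0), x),
)
```

All six flags are expected to be true, since each identity reduces to an identity between ordinary arithmetic means of the values $f(x_{i})$.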

Characterization

There are several different sets of properties that characterize the quasi-arithmetic mean (i.e., each function that satisfies these properties is an f-mean for some function f).

Mediality is essentially sufficient to characterize quasi-arithmetic means.^[5] (chapter 17)
Self-distributivity is essentially sufficient to characterize quasi-arithmetic means.^[5] (chapter 17)
Replacement: Kolmogorov proved that the five properties of symmetry, fixed-point (idempotency), monotonicity, continuity, and replacement fully characterize the quasi-arithmetic means.^[6]
Continuity is superfluous in the characterization of two-variable quasi-arithmetic means. See [10] for the details.
Balancing: An interesting problem is whether this condition (together with the symmetry, fixed-point, monotonicity and continuity properties) implies that the mean is quasi-arithmetic. Georg Aumann showed in the 1930s that the answer is no in general,^[7] but that if one additionally assumes $M$ to be an analytic function then the answer is positive.^[8]

Homogeneity

Means are usually homogeneous, but for most functions $f$, the f-mean is not. Indeed, the only homogeneous quasi-arithmetic means are the power means (including the geometric mean); see Hardy–Littlewood–Pólya, page 68.

The homogeneity property can be achieved by normalizing the input values by some (homogeneous) mean $C$.

$$M_{f,C}x=Cx\cdot f^{-1}\left({\frac {f\left({\frac {x_{1}}{Cx}}\right)+\cdots +f\left({\frac {x_{n}}{Cx}}\right)}{n}}\right)$$

However, this modification may violate monotonicity and the partitioning property of the mean.
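As an illustration of the normalization, the sketch below takes $f=\exp$ (whose f-mean is not homogeneous) and, as an assumption made for the example, the arithmetic mean as the homogeneous mean $C$. The helper names `f_mean` and `homogenized_mean` are hypothetical.

```python
import math


def f_mean(f, f_inv, xs):
    """Quasi-arithmetic mean f_inv(average of f(x)); hypothetical helper."""
    return f_inv(math.fsum(f(x) for x in xs) / len(xs))


def homogenized_mean(f, f_inv, c, xs):
    """C(x) * M_f(x_1 / C(x), ..., x_n / C(x)) for a homogeneous mean c."""
    scale = c(xs)
    return scale * f_mean(f, f_inv, [x / scale for x in xs])


def arithmetic(xs):
    """Arithmetic mean, used here as the normalizing mean C."""
    return math.fsum(xs) / len(xs)


xs = [1.0, 2.0, 4.0]
doubled = [2.0 * x for x in xs]

# The plain exp-mean does not scale with its inputs ...
plain = f_mean(math.exp, math.log, xs)
plain_doubled = f_mean(math.exp, math.log, doubled)

# ... whereas the normalized version does.
normalized = homogenized_mean(math.exp, math.log, arithmetic, xs)
normalized_doubled = homogenized_mean(math.exp, math.log, arithmetic, doubled)
```

Doubling the inputs doubles `normalized` but not `plain`, because the rescaled arguments $x_{i}/Cx$ are unchanged when every $x_{i}$ is multiplied by the same positive factor.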

Generalizations

Consider a Legendre-type strictly convex function $F$. Then the gradient map $\nabla F$ is globally invertible and the weighted multivariate quasi-arithmetic mean^[9] is defined by $M_{\nabla F}(\theta _{1},\ldots ,\theta _{n};w)={\nabla F}^{-1}\left(\sum _{i=1}^{n}w_{i}\nabla F(\theta _{i})\right)$, where $w$ is a normalized weight vector ($w_{i}={\frac {1}{n}}$ by default for a balanced average). From the convex duality, we get a dual quasi-arithmetic mean $M_{\nabla F^{*}}$ associated to the quasi-arithmetic mean $M_{\nabla F}$. For example, take $F(X)=-\log \det(X)$ for $X$ a symmetric positive-definite matrix. The pair of matrix quasi-arithmetic means yields the matrix harmonic mean: $M_{\nabla F}(\theta _{1},\theta _{2})=2(\theta _{1}^{-1}+\theta _{2}^{-1})^{-1}$.
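The matrix harmonic mean in this example can be recovered in a few steps. The derivation below is a sketch that assumes the standard identity $\nabla (-\log \det X)=-X^{-1}$ for symmetric positive-definite $X$ and balanced weights $w_{1}=w_{2}={\tfrac {1}{2}}$.

```latex
\begin{aligned}
\nabla F(X) &= -X^{-1}, \qquad (\nabla F)^{-1}(Y) = -Y^{-1},\\
M_{\nabla F}(\theta_1,\theta_2)
  &= (\nabla F)^{-1}\!\left(\tfrac12\nabla F(\theta_1)+\tfrac12\nabla F(\theta_2)\right)\\
  &= -\left(-\tfrac12\theta_1^{-1}-\tfrac12\theta_2^{-1}\right)^{-1}\\
  &= 2\left(\theta_1^{-1}+\theta_2^{-1}\right)^{-1}.
\end{aligned}
```

Here $(\nabla F)^{-1}$ is applied to a negative-definite matrix, so the result is again symmetric positive-definite.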

References

Andrey Kolmogorov (1930) "On the Notion of Mean", in "Mathematics and Mechanics" (Kluwer 1991), pp. 144–146.
Andrey Kolmogorov (1930) "Sur la notion de la moyenne", Atti Accad. Naz. Lincei 12, pp. 388–391.
John Bibby (1974) "Axiomatisations of the average and a further generalisation of monotonic sequences", Glasgow Mathematical Journal, vol. 15, pp. 63–65.
Hardy, G. H.; Littlewood, J. E.; Pólya, G. (1952) Inequalities. 2nd ed. Cambridge Univ. Press, Cambridge.
B. De Finetti (1931) "Sul concetto di media", Giornale dell'Istituto Italiano degli Attuari, vol. 3, pp. 369–396.

[1] Nielsen, Frank; Nock, Richard (June 2017). "Generalizing skew Jensen divergences and Bregman divergences with comparative convexity". IEEE Signal Processing Letters. 24 (8): 2. arXiv:1702.04877. Bibcode:2017ISPL...24.1123N. doi:10.1109/LSP.2017.2712195. S2CID 31899023.
[2] de Carvalho, Miguel (2016). "Mean, what do you Mean?". The American Statistician. 70 (3): 764–776. doi:10.1080/00031305.2016.1148632. hdl:20.500.11820/fd7a8991-69a4-4fe5-876f-abcd2957a88c. S2CID 219595024.
[3] Barczy, Mátyás; Burai, Pál (2022-04-01). "Limit theorems for Bajraktarević and Cauchy quotient means of independent identically distributed random variables". Aequationes Mathematicae. 96 (2): 279–305. arXiv:1909.02968. doi:10.1007/s00010-021-00813-x. ISSN 1420-8903.
[4] Barczy, Mátyás; Páles, Zsolt (2023-09-01). "Limit Theorems for Deviation Means of Independent and Identically Distributed Random Variables". Journal of Theoretical Probability. 36 (3): 1626–1666. arXiv:2112.05183. doi:10.1007/s10959-022-01225-6. ISSN 1572-9230.
[5] Aczél, J.; Dhombres, J. G. (1989). Functional equations in several variables. With applications to mathematics, information theory and to the natural and social sciences. Encyclopedia of Mathematics and its Applications, 31. Cambridge: Cambridge Univ. Press.
[6] Grudkin, Anton (2019). "Characterization of the quasi-arithmetic mean". Mathematics Stack Exchange.
[7] Aumann, Georg (1937). "Vollkommene Funktionalmittel und gewisse Kegelschnitteigenschaften". Journal für die reine und angewandte Mathematik. 1937 (176): 49–55. doi:10.1515/crll.1937.176.49. S2CID 115392661.
[8] Aumann, Georg (1934). "Grundlegung der Theorie der analytischen Mittelwerte". Sitzungsberichte der Bayerischen Akademie der Wissenschaften: 45–81.
[9] Nielsen, Frank (2023). "Beyond scalar quasi-arithmetic means: Quasi-arithmetic averages and quasi-arithmetic mixtures in information geometry". arXiv:2301.10980 [cs.IT].
