Foundations of Geometric Deep Learning109
[x]={g.x;g∈G}theorbitassociated withx,thenthe distributionofx, condi-
tionalon [x],isthe uniformmeasureon theorbit;precisely theHaarmeasure
of the group. Inother words,the data distribution is unchangedif we replace x
by g.x for any g∈G.
RecallthedecompositionoferrorfromChapter2.Letusfirstaddressthe
approximation error. We decompose
∥f–f
∗
∥
2
=∥f–S
G
f+S
G
f–f
∗
∥
2
=∥f–S
G
f∥
2
+∥S
G
f–f
∗
∥
2
+2⟨f–S
G
f,S
G
f–f
∗
⟩
=∥f–S
G
f∥
2
+∥S
G
f–f
∗
∥
2
+2E
[x]
E
x|[x]
[(f(x)–S
G
f(x))(S
G
f(x)–f
∗
(x))]
=∥f–S
G
f∥
2
+∥S
G
f–f
∗
∥
2
+2E
[x]
[(S
G
f(x)–S
G
f(x))(S
G
f(x)–f
∗
(x))]
=∥f–S
G
f∥
2
+∥S
G
f–f
∗
∥
2
,
wherethe fourthequalityholdsbecause f
∗
isconstant alongorbits.Inwords,
thegroupaveragingisanorthogonalprojectioninL
2
(X,ν).Moreover,we
immediatelydeducethat∥f–f
∗
∥
2
≥∥S
G
f–f
∗
∥
2
foranyf∈F,andtherefore
thattheapproximationerrorofS
G
FisupperboundedbythatofF.Addi-
tionally,wewon’tformalizeithere,butitisnothardtoseethatsincegroup
averagingisaprojection,thestatisticalerrorisreducedbyaveraging.We
thushaveahypothesisclassS
G
Fforwhichboththeapproximationandthe
statistical errors are upper bounded by those of F.
Thissoundslikegreatnewsindeed,reaffirmingtheadvantageofusing
knownsymmetriesofthetargetfunctionintoourhypothesisclass.However,
there are(at least)twoimportantissues atthis point.First, thegroup averaging
operatordefinesaconvenientmathematicalobject,butitrequiresanaverag-
ing overgroup orbits, whichcan become prohivitively expensiveveryquickly,
forsufficientlylargegroups.Next,wehavearguedthatstatisticalerrorswill
besmallerinthesymmetrichypothesisspace,butwehaven’tactuallyquan-
tifiedbyhowmuch.Asitturnsout,thegainscanberoughylyquantifiedas
follows:given adataset of n points,the symmetric learning amountsto having
instead ≃|G|n datapoints whenever Gis discrete.In the continuoussetting, the
gains amountto using[x] instead ofx∈R
Ω
, since [x]liveson aspace oflower
dimensionality|Ω|–dim(G)thanx.Whilethismaybeasubstantialgain,itdoes
not unfortunately break thecurse of dimensionality. Indeed, inthe example of
images,theEuclideangroupinR
2
hasatmost6degreesoffreedom,atiny
number against the thousands or millions of pixels!
Wethusneed toaddressthesetwomajor aspectswithadditional ideas.For
thatpurpose,wewillnowintroducetwoimportantconceptsthatextendthe
notion of symmetricrepresentation and enable efficient learning in theface of