Group Convolution on Homogeneous Spaces

• Convolutional neural networks rely on matching input features with appropriately transformed filter parameters; this idea extends well beyond shift-equivariance on grids.

• We can define a rich family of group convolutions which apply a more general filter transformation, followed by an inner product with the features over the domain, so long as the input domain is homogeneous.

• For shift-equivariant functions on grids, the domain and group have identical structure; constructing the more general case requires carefully keeping track of the structure within the group.

• This allows us to construct convolutional neural networks over spheres, DNA sequences, 3D medical scans, and many more domains.

Our discussion of grids highlighted how shifts and convolutions are intimately connected: convolutions are linear shift-equivariant operations, and vice versa, any shift-equivariant linear operator is a convolution. Furthermore, shift operators can be jointly diagonalised by the Fourier transform. As it turns out, this is part of a far larger story: both convolution and the Fourier transform can be defined for any group of symmetries that we can sum or integrate over.

Consider the Euclidean domain $\Omega = \mathbb{R}$. We can understand the convolution as a pattern matching operation: we match shifted copies of a filter θ(u) with an input signal x(u). The value of the convolution $(x \star \theta)(u)$ at a point u is the inner product of the signal x with the filter shifted by u,

$$(x \star \theta)(u) = \langle x, S_u \theta \rangle = \int_{\mathbb{R}} x(v)\,\theta(u+v)\,dv.$$

Note that in this case u is both a point on the domain $\Omega = \mathbb{R}$ and also an element of the translation group, which we can identify with the domain itself, $G = \mathbb{R}$. We will now show how to generalise this construction, by simply replacing the translation group by another group G acting on Ω.

194Chapter 7

7.1 Domain

As discussed in Chapter 3, the action of the group G on the domain Ω induces a representation ρ of G on the space of signals $\mathcal{X}(\Omega)$ via $(\rho(g)x)(u) = x(g^{-1}u)$. In the above example, G is the translation group whose elements act by shifting the coordinates, $u + v$, whereas ρ(g) is the shift operator acting on signals as $(\rho(g)x)(u) = x(u - v)$. Finally, in order to apply a filter to the signal, we invoke our assumption of $\mathcal{X}(\Omega)$ being a Hilbert space, with an inner product

$$\langle x, \theta \rangle = \int_{\Omega} x(u)\,\theta(u)\,du,$$

where we assumed, for the sake of simplicity, scalar-valued signals, $\mathcal{X}(\Omega, \mathbb{R})$. Having thus defined how to transform signals and match them with filters, we can define the group convolution for signals on Ω,

$$(x \star \theta)(g) = \langle x, \rho(g)\theta \rangle = \int_{\Omega} x(u)\,\theta(g^{-1}u)\,du. \tag{7.96}$$

Note that $x \star \theta$ takes values on the elements g of our group G rather than points on the domain Ω. Hence, the next layer, which takes $x \star \theta$ as input, should act on signals defined on the group G, a point we will return to shortly.

Just like how the traditional Euclidean convolution is shift-equivariant, the more general group convolution is G-equivariant. The key observation is that matching the signal x with a g-transformed filter ρ(g)θ is the same as matching the inverse transformed signal $\rho(g^{-1})x$ with the untransformed filter θ. Mathematically, this can be expressed as $\langle x, \rho(g)\theta \rangle = \langle \rho(g^{-1})x, \theta \rangle$. With this insight, G-equivariance of the group convolution (Equation 7.96) follows immediately from its definition and the defining property $\rho(h^{-1})\rho(g) = \rho(h^{-1}g)$ of group representations,

$$(\rho(h)x \star \theta)(g) = \langle \rho(h)x, \rho(g)\theta \rangle = \langle x, \rho(h^{-1}g)\theta \rangle = \rho(h)(x \star \theta)(g).$$
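The two steps behind this chain of equalities can be spelled out as a short worked derivation. It is only a sketch, and it assumes that the measure $du$ on Ω is invariant under the action of G (as is the case for the Lebesgue measure under translations, or the counting measure on a finite grid); the substitution variable $w$ is introduced here purely for illustration.

```latex
\begin{align*}
\langle x, \rho(g)\theta \rangle
  &= \int_\Omega x(u)\,\theta(g^{-1}u)\,du
   = \int_\Omega x(gw)\,\theta(w)\,dw
   = \langle \rho(g^{-1})x, \theta \rangle
   && (u = gw,\ du = dw), \\
(\rho(h)x \star \theta)(g)
  &= \langle \rho(h)x, \rho(g)\theta \rangle
   = \langle x, \rho(h^{-1})\rho(g)\theta \rangle
   = \langle x, \rho(h^{-1}g)\theta \rangle \\
  &= (x \star \theta)(h^{-1}g)
   = \big(\rho(h)(x \star \theta)\big)(g).
\end{align*}
```

In the last step, ρ(h) acts on $x \star \theta$ as a signal on G, i.e. $(\rho(h)y)(g) = y(h^{-1}g)$.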

The G-equivariant group convolution can be seen as a "generator" of models: it provides a recipe for constructing G-equivariant neural networks given any suitable G. To understand how to ground this in concrete model equations, we need to look at specific examples.

7.1.1 Grid convolution

Figure 7.1
Left: Cosmic microwave background radiation, captured by the Planck space observatory, is a signal on $S^2$. Right: The action of the special orthogonal group, SO(3), on the sphere, $S^2$. Three types of rotation are possible; SO(3) is a three-dimensional manifold.

The case of shift equivariance over the (one-dimensional) grid we have studied throughout Chapter 6 is obtained with the choice $\Omega = \mathbb{Z}_n = \{0, \ldots, n-1\}$ and the cyclic shift group $G = \mathbb{Z}_n$. The group elements in this case are cyclic shifts of indices, i.e., an element $g \in G$ can be identified with some $u \in \{0, \ldots, n-1\}$ such that $g.v = (v - u) \bmod n$, whereas the inverse element is $g^{-1}.v = (v + u) \bmod n$. Importantly, in this example the elements of the group (shifts) are also elements of the domain (indices). We thus can, with some abuse of notation, identify the two structures (i.e., Ω = G); our expression for the group convolution in this case

$$(x \star \theta)(g) = \sum_{v=0}^{n-1} x_v\,\theta_{g^{-1}v}$$

leads to the familiar convolution

$$(x \star \theta)_u = \sum_{v=0}^{n-1} x_v\,\theta_{(v+u) \bmod n}.$$
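As a minimal sketch of the cyclic case, the snippet below implements the sum above in plain Python and checks G-equivariance numerically. The helper names (`act`, `inverse`, `compose`, `rho_domain`, `rho_group`) are our own and purely illustrative; group elements are identified with integers u as in the text, so that $g.v = (v - u) \bmod n$.

```python
def act(g, v, n):
    """Action of the shift g on the index v: g.v = (v - g) mod n."""
    return (v - g) % n

def inverse(g, n):
    """Inverse shift, so that inverse(g).v = (v + g) mod n."""
    return (-g) % n

def compose(g, h, n):
    """The element gh (apply h first, then g)."""
    return (g + h) % n

def group_conv(x, theta):
    """(x * theta)(g) = sum_v x_v * theta_{g^{-1}.v} = sum_v x_v * theta_{(v+g) mod n}."""
    n = len(x)
    return [sum(x[v] * theta[act(inverse(g, n), v, n)] for v in range(n))
            for g in range(n)]

def rho_domain(h, x):
    """Representation on signals over the domain: (rho(h) x)_v = x_{h^{-1}.v}."""
    n = len(x)
    return [x[act(inverse(h, n), v, n)] for v in range(n)]

def rho_group(h, y):
    """Representation on signals over the group: (rho(h) y)(g) = y(h^{-1} g)."""
    n = len(y)
    return [y[compose(inverse(h, n), g, n)] for g in range(n)]

x = [1, 2, 3, 4, 5]
theta = [1, 0, -1, 2, 0]
y = group_conv(x, theta)

# Equivariance: transforming the input and then convolving equals
# convolving and then transforming the output.
for h in range(len(x)):
    assert group_conv(rho_domain(h, x), theta) == rho_group(h, y)
```

Keeping the action on the domain and the composition within the group as separate functions is deliberate: for the cyclic group the two can be confused without consequence, but the distinction matters as soon as the group is no longer identical to the domain.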

7.1.2 Spherical convolution

Now consider the two-dimensional sphere $\Omega = S^2$ with the group of rotations, the special orthogonal group G = SO(3). While chosen for pedagogical reasons, this example is actually very practical and arises in numerous applications. In astrophysics, for example, observational data often naturally has spherical geometry (Figure 7.1). The same can be said of any task involving weather prediction (Lam et al. 2023). Furthermore, spherical symmetries are very important in applications in chemistry when modeling molecules and trying to predict their properties, e.g. for the purpose of virtual drug screening.

Representing a point on the sphere as a three-dimensional unit vector u with $\|u\| = 1$, the action of the group can be represented as a 3×3 orthogonal matrix R with det(R) = 1. The spherical convolution can thus be written as the inner product between the signal and the rotated filter,

$$(x \star \theta)(R) = \int_{S^2} x(u)\,\theta(R^{-1}u)\,du.$$

The first thing to note is that now the group is not identical to the domain: the group SO(3) is a Lie group that is in fact a three-dimensional manifold, whereas $S^2$ is a two-dimensional one. Consequently, in this case, unlike the previous example, the convolution is a function on SO(3) rather than on Ω.

This has important practical consequences: in our Geometric Deep Learning blueprint, we concatenate multiple equivariant maps ("layers" in deep learning jargon) by applying a subsequent operator to the output of the previous one. In the case of translations, we can apply multiple convolutions in sequence, since their outputs are all defined on the same domain Ω. In the general setting, since $x \star \theta$ is a function on G rather than on Ω, we cannot use exactly the same operation subsequently: the next operation has to deal with signals on G, i.e. $x \in \mathcal{X}(G)$. Our definition of group convolution allows this case: we take as domain Ω = G acted on by G itself via the group action $(g, h) \mapsto gh$ defined by the composition operation of G. This yields the representation ρ(g) acting on $x \in \mathcal{X}(G)$ by $(\rho(g)x)(h) = x(g^{-1}h)$. Just like before, the inner product is defined by integrating the point-wise product of the signal and the filter over the domain, which now equals Ω = G. In our example of spherical convolution, a second layer of convolution would thus have the form

$$((x \star \theta) \star \phi)(R) = \int_{SO(3)} (x \star \theta)(Q)\,\phi(R^{-1}Q)\,dQ.$$

7.1.3 Limitations

Since convolutions involve inner products, which in turn require integrating over the domain Ω, we can only use them on domains Ω that are small (in the discrete case) or low-dimensional (in the continuous case).

For instance, we can use convolutions on the plane $\mathbb{R}^2$ (two dimensional) or the special orthogonal group SO(3) (three dimensional), or on the finite set of nodes of a graph (n-dimensional). It might then be tempting to construct a highly expressive graph neural network by performing this kind of convolution directly on the group of permutations $S_n$. We can, for example, first transform features from a set of nodes V into $S_n$:

$$(x \star \theta)(\sigma) = \sum_{u \in V} x(u)\,\theta(\sigma^{-1}(u)), \tag{7.97}$$

and then continue transforming on $S_n$ as follows:

$$((x \star \theta) \star \phi)(\sigma) = \sum_{\sigma' \in S_n} (x \star \theta)(\sigma')\,\phi(\sigma^{-1} \circ \sigma'), \tag{7.98}$$

where $\sigma, \sigma' \in S_n$ are permutations and ∘ is permutation composition (see Figure 7.2 for an example over three nodes and six permutations).

Figure 7.2
[Diagram: input nodes V, followed by lifted and convolved features, each indexed by the six permutations (123), (213), (132), (312), (231), (321).]
Leveraging the group convolutional framework to construct a learnable $S_3$-equivariant transformation over three-node graphs, per Equations 7.97–7.98. The first layer maps node features in $\mathcal{X}(V)$ directly to permutation features in $\mathcal{X}(S_3)$, via parameters $\theta : V \to \mathbb{R}$; the second layer maps $\mathcal{X}(S_3) \to \mathcal{X}(S_3)$ via parameters $\phi : S_3 \to \mathbb{R}$. While elegant and spanning a rich class of permutation-equivariant graph models, it quickly becomes unwieldy at larger |V|, and does not easily transfer across graphs of different sizes.
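Equations 7.97–7.98 are small enough to be written out directly for three nodes. The following is a minimal sketch, not a practical layer: permutations are stored as tuples mapping an index to its image, and all function names (`lift`, `group_conv`, `act_on_nodes`, `act_on_group`) as well as the numerical values are our own illustrative choices.

```python
from itertools import permutations

def compose(s, t):
    """(s o t)(u) = s(t(u)); a permutation is a tuple mapping index -> image."""
    return tuple(s[t[u]] for u in range(len(s)))

def inverse(s):
    inv = [0] * len(s)
    for u, su in enumerate(s):
        inv[su] = u
    return tuple(inv)

def lift(x, theta, group):
    """Eq. 7.97: (x * theta)(sigma) = sum_u x(u) theta(sigma^{-1}(u))."""
    return {s: sum(x[u] * theta[inverse(s)[u]] for u in range(len(x)))
            for s in group}

def group_conv(y, phi, group):
    """Eq. 7.98: (y * phi)(sigma) = sum_{sigma'} y(sigma') phi(sigma^{-1} o sigma')."""
    return {s: sum(y[t] * phi[compose(inverse(s), t)] for t in group)
            for s in group}

def act_on_nodes(pi, x):
    """(rho(pi) x)(u) = x(pi^{-1}(u)) for node signals."""
    pi_inv = inverse(pi)
    return [x[pi_inv[u]] for u in range(len(x))]

def act_on_group(pi, y, group):
    """(rho(pi) y)(sigma) = y(pi^{-1} o sigma) for signals on the group."""
    return {s: y[compose(inverse(pi), s)] for s in group}

group = list(permutations(range(3)))                 # the six elements of S_3
x = [2, -1, 5]                                       # one scalar feature per node
theta = [1, 3, -2]                                   # theta : V -> R
phi = {s: i * i - 2 for i, s in enumerate(group)}    # phi : S_3 -> R

y = lift(x, theta, group)        # signal on S_3
z = group_conv(y, phi, group)    # still a signal on S_3

# Both layers commute with relabelling the nodes.
for pi in group:
    assert lift(act_on_nodes(pi, x), theta, group) == act_on_group(pi, y, group)
    assert group_conv(act_on_group(pi, y, group), phi, group) == act_on_group(pi, z, group)
```

Note that both `y` and `z` hold one value per permutation, which is exactly what makes this construction costly as the number of nodes grows.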

However, in practice, such a layer cannot be constructed on any but the smallest of graphs, because the permutation group $S_n$ has n! elements, and the layer above requires storing a feature vector, $(x \star \theta)(\sigma)$, for each of those elements. While such constructions are indeed impractical, it is interesting to ponder that there exists an extremely rich family of permutation-equivariant models acting directly on the permutation group in this way, encompassing layers of possibly significant amounts of expressive power.
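To get a feeling for how quickly n! grows, the storage requirement can be tabulated with a few lines of Python. This is a back-of-the-envelope sketch; the function name and the `channels` parameter are hypothetical, and the graph sizes are chosen arbitrarily for illustration.

```python
import math

def lifted_feature_count(num_nodes, channels=1):
    """Number of scalars needed to store one feature vector per permutation."""
    return math.factorial(num_nodes) * channels

for n in (3, 5, 10, 22):
    print(n, lifted_feature_count(n))
```

Already for ten nodes the lifted signal has 3,628,800 entries per channel, and for 22 nodes the count exceeds $10^{21}$.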

Similarly, integrating over higher-dimensional groups like the affine group of the plane (containing translations, rotations, shearing and scaling, for a total of 6 dimensions) is not feasible in practice. Nevertheless, as we have seen in Chapter 5, we can still build equivariant convolutions for large groups G by working with signals defined on low-dimensional spaces Ω on which G acts. Indeed, it is possible to show that any equivariant linear map $f : \mathcal{X}(\Omega) \to \mathcal{X}(\Omega')$ between two domains Ω, Ω′ can be written as a generalised convolution similar to the group convolution discussed here.

We also note that the Fourier transform we derived in Chapter 6 from the shift-equivariance property of the convolution can be extended to a more general case by projecting the signal onto the matrix elements of irreducible representations of G.

Finally, we point to the assumption that has so far underpinned our discussion in this Chapter: whether Ω was a grid, plane, or the sphere, we could transform every point into any other point, intuitively meaning that all the points on the domain "look the same." A domain Ω with such a property is called a homogeneous space, where for any $u, v \in \Omega$ there exists $g \in G$ such that $g.u = v$. In future Chapters, we will try to relax this assumption.
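For finite domains, the homogeneity condition can be checked by brute force: the orbit of a single point must cover the whole domain. The sketch below contrasts the cyclic grid with a 3×3 pixel patch acted upon only by 90-degree rotations about its centre; the latter example is our own and is not homogeneous, since the centre pixel can never be moved. The helpers assume the list of actions enumerates the whole group.

```python
def orbit(point, group_actions):
    """All points reachable from `point` by some group element."""
    return {act(point) for act in group_actions}

def is_homogeneous(points, group_actions):
    """True if the group acts transitively, i.e. one orbit covers the domain.

    Assumes `group_actions` lists every element of the group (including identity).
    """
    return orbit(points[0], group_actions) == set(points)

def cyclic_shifts(n):
    """The n elements of Z_n acting on indices by g.v = (v - u) mod n."""
    return [lambda v, u=u: (v - u) % n for u in range(n)]

def rot90_power(k):
    """Rotation by k * 90 degrees about the origin, acting on integer pixels."""
    def act(p):
        i, j = p
        for _ in range(k):
            i, j = -j, i
        return (i, j)
    return act

grid = list(range(6))
patch = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
rotations = [rot90_power(k) for k in range(4)]

print(is_homogeneous(grid, cyclic_shifts(6)))   # True
print(is_homogeneous(patch, rotations))         # False
```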

7.2 Model

As discussed thus far, we can generalise the convolution operation from signals on a Euclidean space to signals on any homogeneous space Ω acted upon by a group G. By analogy to the Euclidean convolution, where a translated filter is matched with the signal, the idea of group convolution is to move the filter around the domain using the group action, e.g. by rotating and translating. By virtue of the transitivity of the group action, we can move the filter to any position on Ω. In this section, we will discuss several concrete examples of the general idea of group convolution, including implementation aspects and architectural choices.

7.2.1 Discrete group convolution

We begin by considering the case where the domain Ω as well as the group G are discrete. As our first example, we consider medical volumetric images represented as signals on 3D grids with discrete translation and rotation symmetries. The domain is the 3D cubical grid $\Omega = \mathbb{Z}^3$ and the images (e.g. MRI or CT 3D scans) are modelled as functions $x : \mathbb{Z}^3 \to \mathbb{R}$, i.e. $x \in \mathcal{X}(\Omega)$. Although, in practice, such images have support on a finite cuboid $[W] \times [H] \times [D] \subset \mathbb{Z}^3$, we instead prefer to view them as functions on $\mathbb{Z}^3$ with appropriate zero padding. As our symmetry, we consider the group $G = \mathbb{Z}^3 \rtimes O$ of distance- and orientation-preserving transformations on $\mathbb{Z}^3$. This group consists of translations ($\mathbb{Z}^3$) and the discrete rotations O generated by 90 degree rotations about the three axes (see Figure 7.3).
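The discrete rotation group can be enumerated by repeatedly multiplying the three generating 90-degree rotations until no new matrices appear. The snippet below is a small sketch of this closure computation using integer 3×3 matrices stored as nested tuples; the names `RX`, `RY`, `RZ` and `generate_group` are our own.

```python
def matmul(a, b):
    """Product of two 3x3 integer matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

def det(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
RX = ((1, 0, 0), (0, 0, -1), (0, 1, 0))    # 90 degrees about the x-axis
RY = ((0, 0, 1), (0, 1, 0), (-1, 0, 0))    # 90 degrees about the y-axis
RZ = ((0, -1, 0), (1, 0, 0), (0, 0, 1))    # 90 degrees about the z-axis

def generate_group(generators):
    """Closure of the generators under matrix multiplication."""
    elements = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = matmul(s, g)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

rotations = generate_group([RX, RY, RZ])
print(len(rotations))   # 24
```

Every element is a signed permutation matrix with determinant +1, which is why these rotations map the integer grid onto itself.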

Figure 7.3
A 3×3 filter, rotated by all 24 elements of the discrete rotation group O, generated by 90-degree rotations about the vertical axis (red arrows), and 120-degree rotations about a diagonal axis (blue arrows).

As our second example, we consider DNA sequences made up of four letters: C, G, A, and T. The sequences can be represented on the 1D grid $\Omega = \mathbb{Z}$ as signals $x : \mathbb{Z} \to \mathbb{R}^4$, where each letter is one-hot coded in $\mathbb{R}^4$. Naturally, we have a discrete 1D translation symmetry on the grid, but DNA sequences have an additional interesting symmetry. This symmetry arises from the way DNA is physically embodied as a double helix, and the way it is read by the molecular machinery of the cell. Each strand of the double helix begins with what is called the 5′-end and ends with a 3′-end, with the 5′ on one strand complemented by a 3′ on the other strand. In other words, the two strands have an opposite orientation. Since the DNA molecule is always read off starting at the 5′-end, but we do not know which one, a sequence such as ACCCTGG is equivalent to the reversed sequence with each letter replaced by its complement, CCAGGGT. This is called reverse-complement symmetry of the letter sequence, and is depicted in Figure 7.4. We thus have the two-element group $\mathbb{Z}_2 = \{0, 1\}$ corresponding to the identity 0 and reverse-complement transformation 1 (and composition 1 + 1 = 0 mod 2). The full group combines translations and reverse-complement transformations.
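The reverse-complement transformation is easy to state in code, both on letter strings and on one-hot signals. In the sketch below, the channel order C, G, A, T is an assumption made for illustration (any fixed order works, with a correspondingly different channel permutation), and the function names are our own.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
ALPHABET = "CGAT"   # assumed channel order for the one-hot encoding

def reverse_complement(seq):
    """Reverse the sequence and replace each letter by its complement."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def one_hot(seq):
    """Encode a sequence as a list of 4-dimensional one-hot vectors."""
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]

def reverse_complement_one_hot(x):
    """Same transformation on one-hot signals: reverse positions, permute channels."""
    perm = [ALPHABET.index(COMPLEMENT[letter]) for letter in ALPHABET]
    return [[row[perm[c]] for c in range(len(ALPHABET))] for row in reversed(x)]

seq = "ACCCTGG"
print(reverse_complement(seq))   # CCAGGGT

# The transformation is its own inverse, matching 1 + 1 = 0 mod 2.
assert reverse_complement(reverse_complement(seq)) == seq
# Acting on letters and acting on one-hot signals agree.
assert reverse_complement_one_hot(one_hot(seq)) == one_hot(reverse_complement(seq))
```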

In this discrete case, the previously defined group convolution (Equation 7.96) is given as the following inner product:

$$(x \star \theta)(g) = \sum_{u \in \Omega} x_u\,\rho(g)\theta_u, \tag{7.99}$$

between the (single-channel) input signal x and a filter θ transformed by $g \in G$ via $\rho(g)\theta_u = \theta_{g^{-1}u}$, and the output $x \star \theta$ is a function on G. Note that, since Ω is discrete, we have replaced the integral from Equation 7.96 by a sum.

Figure 7.4
A schematic of the DNA's double helix structure, with the two strands coloured in blue and red. Note how the sequences in the helices are complementary and read in reverse (from 5′ to 3′).

7.2.2 Transform + Convolve approach

We will show that the discrete group convolution can be implemented in two steps: a filter transformation step, and a translational convolution step. The filter transformation step consists of creating rotated (or reverse-complement transformed) copies of a basic filter, while the translational convolution is the same as in standard CNNs and thus efficiently computable on hardware such as GPUs. To see this, note that, in both of our examples, we can write a general transformation $g \in G$ as a transformation $h \in H$ (e.g. a rotation or reverse-complement transformation) followed by a translation $k \in \mathbb{Z}^d$ (with d = 3 and d = 1 in our two examples), i.e. g = kh (with juxtaposition denoting the composition of the group elements k and h). By properties of the group representation, we have ρ(g) = ρ(kh) = ρ(k)ρ(h). Thus,

$$(x \star \theta)(kh) = \sum_{u \in \Omega} x_u\,\rho(k)\rho(h)\theta_u = \sum_{u \in \Omega} x_u\,(\rho(h)\theta)_{u-k}. \tag{7.100}$$

We recognise the last equation as the standard (planar Euclidean) convolution of the signal x and the transformed filter ρ(h)θ. Thus, to implement group convolution for these groups, we take the canonical filter θ, create transformed copies $\theta_h = \rho(h)\theta$ for each $h \in H$ (e.g. each rotation $h \in O$ or reverse-complement DNA symmetry $h \in \mathbb{Z}_2$), and then convolve x with each of these filters: $(x \star \theta)(kh) = (x \star \theta_h)(k)$. For both of our examples, the symmetries act on filters by simply permuting the filter coefficients, as shown in Figure 7.3 for discrete rotations. Hence, these operations can be implemented efficiently using an indexing operation with pre-computed indices.
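The two-step recipe can be sketched for the DNA example in plain Python. Several choices below are assumptions made for illustration only: the channel order C, G, A, T, an odd-length filter whose taps are indexed from −r to r, zero padding outside the sequence, and integer filter values so that the equivariance check is exact. All function names are our own.

```python
ALPHABET = "CGAT"            # assumed one-hot channel order
COMP_PERM = [1, 0, 3, 2]     # complement swaps C<->G and A<->T in that order

def one_hot(seq):
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]

def rc_signal(x):
    """Reverse-complement of a finite one-hot signal (reversal about its centre)."""
    return [[row[COMP_PERM[c]] for c in range(4)] for row in reversed(x)]

def rc_filter(theta):
    """Step 1, filter transformation: theta_h = rho(h) theta, a pure re-indexing.

    Taps are stored for offsets -r..r, so reversing the list maps offset u to -u.
    """
    return [[tap[COMP_PERM[c]] for c in range(4)] for tap in reversed(theta)]

def correlate(x, theta):
    """Step 2, translational part of Eq. 7.100: y_k = sum_u <x_u, theta_{u-k}>."""
    n, r = len(x), len(theta) // 2
    out = []
    for k in range(n):
        total = 0
        for offset in range(-r, r + 1):
            u = k + offset
            if 0 <= u < n:   # zero padding outside the sequence
                total += sum(a * b for a, b in zip(x[u], theta[offset + r]))
        out.append(total)
    return out

def group_conv(x, theta):
    """One orientation channel per element of H = Z_2."""
    return {"identity": correlate(x, theta),
            "reverse_complement": correlate(x, rc_filter(theta))}

x = one_hot("ACCCTGGA")
theta = [[1, 0, 2, -1], [0, 3, 1, 0], [2, -1, 0, 1]]   # 3 taps, 4 channels

y = group_conv(x, theta)
y_rc = group_conv(rc_signal(x), theta)

# Transforming the input swaps the two orientation channels and reverses each.
assert y_rc["identity"] == y["reverse_complement"][::-1]
assert y_rc["reverse_complement"] == y["identity"][::-1]
```

The final two assertions illustrate the general pattern: a transformation of the input both transforms each feature map spatially and permutes the feature maps among themselves.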

While we defined the feature maps that are produced by the group convolution $x \star \theta$ as functions on G, the fact that we can split any $g \in G$ into g = kh means that we can also think of them as a stack of Euclidean feature maps (sometimes called orientation channels), with one feature map per filter transformation / orientation h. For instance, in our first example we would associate to each filter rotation (each node in Figure 7.3) a feature map, which is obtained by convolving (in the traditional translational sense) the input with the rotated filter. These feature maps can thus still be stored as a W × H × C array, where the number of channels C equals the number of independent filters times the number of transformations $h \in H$ (e.g. rotations).

As previously shown, the group convolution is equivariant: $(\rho(g)x) \star \theta = \rho(g)(x \star \theta)$. What this means in terms of orientation channels is that, under the action of h, each orientation channel is transformed, and the orientation channels themselves are permuted. For instance, if we associate one orientation channel per transformation in Figure 7.3 and apply a rotation by 90 degrees about the z-axis (corresponding to the red arrows), the feature maps will be permuted as shown by the red arrows.

This description makes it clear that a group convolutional neural network bears much similarity to a traditional CNN. Hence, many of the network design patterns discussed in Chapter 6, such as residual networks, can be used with group convolutions as well.

7.2.3 Spherical CNNs via the Fourier domain

For the continuous symmetry group of the sphere that we saw in Section 7.1.2, it is possible to efficiently implement the convolution in the spectral domain, using the appropriate Fourier transform (we remind the reader that the convolution on $S^2$ is a function on SO(3), hence we need to define the Fourier transform on both these domains in order to implement multi-layer spherical CNNs). Spherical harmonics are an orthogonal basis on the 2D sphere, analogous to the classical Fourier basis of complex exponentials. On the special orthogonal group, the Fourier basis is known as the Wigner D-functions. Both of these bases find wide applications in quantum mechanics and chemistry.

In both cases, the Fourier transforms (coefficients) are computed as the inner product with the basis functions, and an analogue of the Convolution Theorem holds: one can compute the convolution in the Fourier domain as the element-wise product of the Fourier transforms. Furthermore, FFT-like algorithms exist for the efficient computation of the Fourier transform on $S^2$ and SO(3). Using this approach, Cohen et al. (2018) were able to build the first efficient spherical CNN; we defer to their paper for specific details.
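The spherical transforms themselves are beyond a short listing, but the principle can be illustrated on the much simpler cyclic group of Section 7.1.1, where the Fourier basis consists of complex exponentials. The sketch below is therefore only an analogy, not the spherical algorithm of Cohen et al. (2018); it uses a naive quadratic-time DFT for clarity, and all function names are our own. For the convolution $(x \star \theta)_u = \sum_v x_v \theta_{(v+u) \bmod n}$ with a real-valued signal x, the Fourier coefficients of the output are the element-wise products of the conjugated coefficients of x with those of θ.

```python
import cmath

def dft(x):
    """Inner products of x with the complex exponential basis (naive O(n^2))."""
    n = len(x)
    return [sum(x[v] * cmath.exp(-2j * cmath.pi * f * v / n) for v in range(n))
            for f in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[f] * cmath.exp(2j * cmath.pi * f * u / n) for f in range(n)) / n
            for u in range(n)]

def cyclic_group_conv(x, theta):
    """Direct evaluation: (x * theta)_u = sum_v x_v theta_{(v+u) mod n}."""
    n = len(x)
    return [sum(x[v] * theta[(v + u) % n] for v in range(n)) for u in range(n)]

def spectral_group_conv(x, theta):
    """Same result via element-wise products in the Fourier domain (x real)."""
    X, T = dft(x), dft(theta)
    Y = [xf.conjugate() * tf for xf, tf in zip(X, T)]
    return [value.real for value in idft(Y)]

x = [1.0, 2.0, 0.0, -1.0, 3.0]
theta = [0.5, 0.0, -1.0, 2.0, 0.0]

direct = cyclic_group_conv(x, theta)
spectral = spectral_group_conv(x, theta)
assert all(abs(a - b) < 1e-9 for a, b in zip(direct, spectral))
```

In practice the naive `dft` would be replaced by a fast transform, which is where the computational benefit of the spectral route comes from.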

7.3 Case Study: TacticAI

As might be expected, group convolutional networks manifest in a wide variety of interesting domains, but perhaps the reader might still find it surprising that they have seen notable use in association football analytics. To round off the groups Chapter, we provide a targeted overview of the resulting TacticAI method, developed in collaboration with Liverpool FC (Wang et al. 2024). Rather than surveying the entire architecture and the tasks it was deployed on, we particularly focus on its group-equivariant aspects (see also Figure 7.5), along with meaningful context on why these aspects were called for.

Figure 7.5
Illustration of a single layer of TacticAI's group-equivariant neural network. For a given corner kick situation (here visualised using only six players for clarity), TacticAI first generates four views, corresponding to all four possible corner kick locations (e.g., $X_{\updownarrow}$ for the vertically reflected corner). Then, a graph neural network, $\mathcal{G}$, processes carefully chosen pairs of views, in a way that follows the equivariance constraint. Lastly, all of the computed view-pair representations are aggregated to produce latent view features (e.g. $H_{\updownarrow}$ for the vertically-reflected view).

Problem setup. TacticAI is a system capable of predictive and generative modelling over various tactical setups in football. The tactical setup input is represented as a graph of 22 nodes, one for each player, with each node's features, $x_u \in \mathbb{R}^k$, comprising both spatial (current position and velocity) and physical (player height, weight and ball possession) information about the corresponding player. It is assumed that all pairs of players are connected to each other, allowing the model to infer the most important connections automatically (as we discussed in Chapter 5). The system then needs to either predict the outcome of a future event (e.g. who will make next contact with the ball, a node classification task; or whether a shot will be taken, a graph classification task) or generate novel setups (that, e.g., modulate the likelihood a shot will be taken).

Figure 7.6
The Cayley graph of the dihedral group $D_2 = \{e, \updownarrow, \leftrightarrow, \leftrightarrow\updownarrow\}$, organising all of its individual elements and their action on a domain spanning the four corners of a rectangle. Note that the arrows are bidirectional, since each $D_2$ group action is its own inverse. Further, note that composing both horizontal and vertical reflections is equivalent to a 180° rotation.

As it is often tricky to meaningfully influence open-play tactics in football, TacticAI focusses all of its attention on modelling outcomes in set pieces, during which the game is effectively frozen. Specifically, corner kick setups were targeted, as they occur reasonably frequently, start from a rigid position, and offer immediate goal-scoring potential. Further, corner kick tactics are often determined well in advance of individual matches, allowing for a cleaner window for TacticAI to influence coaching decisions.

An unexpected symmetry. The decision to focus on corner kicks also brings with it an undesirable tradeoff: they are simply not extremely frequent, leading to relatively modest dataset sizes. Indeed, even though TacticAI was trained over several seasons' worth of Premier League data, only 9,693 unique training examples were extractable from this data, a far cry from the large scale datasets used for many of the case studies previously discussed in this book. Further coupled with the scarcity of certain target events (such as shots, which are relatively rare), the quantity of useful signal is significantly reduced.

At this level of data availability, exploiting symmetries arises as a natural approach for optimising the extent to which the provided data is utilised. That said, it was not immediately obvious where such symmetries could come from. Owing to a direct suggestion from the Liverpool FC collaborators, it was concluded that, if one varies the specific corner in which the set piece is being taken (out of the four possible corners of the pitch), the outcomes would remain approximately equivariant.

The symmetry group which governs changing corner positions in a way that preserves the relative positioning of all other points is the dihedral group, $G = D_2$. This group comprises only four elements: $D_2 = \{e, \updownarrow, \leftrightarrow, \leftrightarrow\updownarrow\}$, enumerating four possible transformations: identity, vertical flip, horizontal flip, vertical and horizontal flip (180° rotation). It is fully described by Figure 7.6; note that each operation is its own inverse, i.e., $g = g^{-1}$ for all $g \in D_2$. Accordingly, TacticAI's neural network layers are designed to be $D_2$-equivariant.

Note that, in the context of football, $D_2$ equivariance is not exact: reflecting a particular corner kick may not result in exactly the same outcomes. Among other factors, many corner-kick takers tend to have a preferred foot, which directly impacts the precision with which they can cross the ball into the penalty box. However, the data efficiency benefits of constraining a model into outputting $D_2$-equivariant predictions may outweigh any inaccuracies in those predictions, an assumption that turned out to be true.

Constructing $D_2$-equivariant GNNs. Since $D_2$ is a small, discrete group, the blueprint of group convolutions we described throughout this Chapter applies directly. Firstly, let us assume we have at our disposal a graph neural network (GNN) layer, $\mathcal{G}(X)$, that can operate over input node features, X (note we omit the adjacency matrix here for brevity, as it will always remain fully connected). Our $D_2$-equivariant GNN will carefully distribute $\mathcal{G}$ across various views into a corner kick tactical setup, in a way that preserves the $D_2$ symmetry in the outputs (as pointed out by Figure 7.5).

Once we have our input features, $X \in \mathbb{R}^{22 \times k}$, we first "lift" them into the $D_2$ space by generating all four transformed views: $X_g = X\rho(g)$ for $g \in D_2$. The corresponding representation matrices $\rho(g) \in \mathbb{R}^{k \times k}$ are simple to construct if our spatial features are zero-centered around the pitch center: we simply need to flip the sign of all columns of X corresponding to the axes being reflected. For example, over vertical flips this amounts to the following diagonal matrix:

$$\rho(\updownarrow)_{ij} = \begin{cases} 1 & i = j \wedge i \text{ is not a } y\text{-axis feature} \\ -1 & i = j \wedge i \text{ is a } y\text{-axis feature} \\ 0 & i \neq j \end{cases}$$
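A small sketch makes the lifting step concrete. The feature layout used below, $[x, y, v_x, v_y, \text{height}, \text{weight}]$, is a hypothetical one chosen for illustration and is not taken from TacticAI; likewise, the element names `"e"`, `"v"`, `"h"`, `"hv"` and the helper functions are our own.

```python
X_AXIS = {0, 2}   # hypothetical layout: columns 0 and 2 hold x-position and x-velocity
Y_AXIS = {1, 3}   # columns 1 and 3 hold y-position and y-velocity

# Columns whose sign is flipped by each element of D_2.
FLIPPED = {"e": set(), "v": Y_AXIS, "h": X_AXIS, "hv": X_AXIS | Y_AXIS}

def rho(g, k):
    """Diagonal k x k representation matrix of the group element g."""
    return [[(-1 if i in FLIPPED[g] else 1) if i == j else 0 for j in range(k)]
            for i in range(k)]

def matmul(a, b):
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def lift(X):
    """All four views X_g = X rho(g)."""
    k = len(X[0])
    return {g: matmul(X, rho(g, k)) for g in FLIPPED}

X = [[3, -2, 1, 0, 180, 75]]   # a single made-up player
views = lift(X)
print(views["v"])    # [[3, 2, 1, 0, 180, 75]]
print(views["hv"])   # [[-3, 2, -1, 0, 180, 75]]

# Each element is its own inverse, and the two flips compose to the 180-degree rotation.
identity = rho("e", 6)
assert all(matmul(rho(g, 6), rho(g, 6)) == identity for g in FLIPPED)
assert matmul(rho("v", 6), rho("h", 6)) == rho("hv", 6)
```

Only the spatial columns change sign; physical attributes such as height are untouched by every view.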


Now, we can follow Equation 7.96 to define our group-convolutional layer:

$$H_g = \sum_{h \in D_2} \mathcal{G}\left(X_h \,\|\, X_{g^{-1}h}\right),$$

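wherein we've replaced the (inner) product with our GNN layer $\mathcal{G}$ applied over concatenated features. This layer yields latent representations $H_g \in \mathbb{R}^{22 \times l}$ for all views $g \in D_2$, and additional such layers may be easily stacked. Final predictions may be obtained through either frame averaging ($H = \frac{1}{4}(H_e + H_{\updownarrow} + H_{\leftrightarrow} + H_{\leftrightarrow\updownarrow})$) or retrieving $H_e$, depending on whether an invariant or equivariant prediction is required.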
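To make the layer tangible, here is a toy sketch under several loudly stated assumptions: the views are aggregated by a plain sum over h, the stand-in network `toy_gnn` (a per-node linear map plus a sum-pooled term over the fully connected graph, followed by a ReLU) is hypothetical and far simpler than a real GNN, and the feature layout $[x, y, \text{height}]$ and integer weights are invented so that the check is exact. It is not TacticAI's implementation.

```python
GROUP = ["e", "v", "h", "hv"]
SIGNS = {"e": (1, 1), "v": (1, -1), "h": (-1, 1), "hv": (-1, -1)}  # (sign of x, sign of y)

def compose(g, h):
    """Composition in D_2: the sign patterns multiply."""
    sx, sy = SIGNS[g][0] * SIGNS[h][0], SIGNS[g][1] * SIGNS[h][1]
    return next(name for name in GROUP if SIGNS[name] == (sx, sy))

def view(X, g):
    """X rho(g) for the hypothetical feature layout [x, y, height]."""
    sx, sy = SIGNS[g]
    return [[sx * row[0], sy * row[1]] + list(row[2:]) for row in X]

def toy_gnn(Z, W_self, W_pool):
    """Stand-in for the GNN layer on a fully connected graph (hypothetical)."""
    pooled = [sum(column) for column in zip(*Z)]
    return [[max(0, sum(w * z for w, z in zip(W_self[c], row))
                    + sum(w * p for w, p in zip(W_pool[c], pooled)))
             for c in range(len(W_self))] for row in Z]

def d2_layer(views, W_self, W_pool):
    """H_g = sum_h G(X_h || X_{g^{-1} h}), with g^{-1} = g in D_2."""
    H = {}
    for g in GROUP:
        outputs = [toy_gnn([a + b for a, b in zip(views[h], views[compose(g, h)])],
                           W_self, W_pool) for h in GROUP]
        H[g] = [[sum(out[i][c] for out in outputs) for c in range(len(W_self))]
                for i in range(len(outputs[0]))]
    return H

def frame_average(H):
    """Average of the latent representations over all four views."""
    return [[sum(H[g][i][c] for g in GROUP) / len(GROUP)
             for c in range(len(H["e"][0]))] for i in range(len(H["e"]))]

X = [[1, 2, 3], [-2, 1, 5], [0, -3, 4]]               # three made-up players
W_self = [[1, -1, 0, 2, 0, 1], [0, 2, 1, -1, 1, 0]]   # l = 2 output features
W_pool = [[1, 0, 0, 0, -1, 0], [0, 1, 0, 1, 0, 0]]

H = d2_layer({g: view(X, g) for g in GROUP}, W_self, W_pool)

# Reflecting the input permutes the lifted views and leaves the
# frame-averaged output unchanged.
for a in GROUP:
    Xa = view(X, a)
    assert all(view(Xa, g) == view(X, compose(a, g)) for g in GROUP)
    Ha = d2_layer({g: view(Xa, g) for g in GROUP}, W_self, W_pool)
    assert frame_average(Ha) == frame_average(H)
```

The check only concerns the frame-averaged (invariant) output; it holds for any choice of the stand-in network, since the network is applied identically to every pair of views.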

This approach has successfully delivered on its promise in the low-data set-piece analytics domain: for receiver and shot prediction, it improved the baseline GNN's predictive power by over 5%; a comparable jump was obtained by leveraging a graph structure in the first place (compared to using a Deep Sets model). Accordingly, $D_2$ equivariance became one of the uniquely notable features of TacticAI.