Preface
La connaissance de certains principes supplée facilement à la connoissance de certains
faits.
—Claude Adrien Helvétius, De l’esprit (1759)
In October 1872, the philosophy faculty of a small university in the Bavarian
city of Erlangen appointed a new young professor. As customary, he
was requested to deliver an inaugural research programme, which he published
under the somewhat long and boring title ‘Vergleichende Betrachtungen
über neuere geometrische Forschungen’ (‘A comparative review of recent
researches in geometry’). The professor was Felix Klein, only twenty-three
years of age at that time, and his inaugural work has entered the annals of
mathematics as the "Erlangen Programme."
The nineteenth century had been remarkably fruitful for geometry. For the
first time in nearly two thousand years after Euclid’s Elements, the construction
of projective geometry by Poncelet, hyperbolic geometry by Gauss, Bolyai,
and Lobachevsky, and elliptic geometry by Riemann showed that an entire zoo
of diverse geometries was possible. However, these constructions had quickly
diverged into independent and unrelated fields, with many mathematicians of
that period questioning how the different geometries are related to each other
and what actually defines a geometry.
The breakthrough insight of Klein was to approach the definition of geometry
as the study of invariants, or in other words, structures that are preserved
under a certain type of transformation (symmetries). Klein used the formalism
of group theory to define such transformations and used the hierarchy of groups
and their subgroups in order to classify different geometries arising from them.
Thus, the group of rigid motions leads to the traditional Euclidean geometry,
while affine or projective transformations produce, respectively, the affine and
projective geometries.
The impact of the Erlangen Programme on geometry was very profound.
Furthermore, it spilled over to other fields, especially physics, where symmetry
principles made it possible to derive conservation laws from first principles of
symmetry (an astonishing result known as Noether’s Theorem), and even enabled
the classification of elementary particles as irreducible representations of the
symmetry group.
At the time of writing, the state of the field of deep learning is somewhat
reminiscent of the field of geometry in the nineteenth century. There is a
veritable zoo of neural network architectures for various kinds of data, but few
unifying principles. As in times past, this makes it difficult to understand the
relations between various methods, inevitably resulting in the reinvention and
re-branding of the same concepts in different application domains. For a novice
entering the field, absorbing the sheer volume of redundant and unconnected
ideas is a major challenge.
In this book, we make a modest attempt to apply the Erlangen Programme
mindset to the domain of deep learning, with the ultimate goal of obtaining a
systematisation of this field and ‘connecting the dots’. We call this
geometrisation attempt ‘Geometric Deep Learning’, and true to the spirit of Felix Klein,
propose to derive different inductive biases and network architectures
implementing them from first principles of symmetry and invariance. In particular,
we focus on a large class of neural networks designed for analysing
unstructured sets, grids, graphs, and manifolds, and show that they can be understood
in a unified manner as methods that respect the structure and symmetries of
these domains.
We believe this book will appeal to a broad audience of deep learning
researchers, practitioners, and enthusiasts. A novice may use it as an overview
and introduction to Geometric Deep Learning. A seasoned deep learning expert
may discover new ways of deriving familiar architectures from basic principles
and perhaps some surprising connections. Practitioners may get new insights
on how to solve problems in their respective fields. As a textbook, we believe
the book can be used in an advanced (graduate) machine learning course, or as
a foundational ML course for a mathematically oriented audience.
With such a fast-paced field as modern machine learning, the risk of writing
a book like this is that it becomes obsolete and irrelevant before it sees
the light of day. Having focused on foundations, our hope is that the key
concepts we discuss will transcend their specific realisations; or, as Helvétius
(1759) put it, "the knowledge of certain principles easily compensates the lack
of knowledge of certain facts".
What this book is about
Our book is designed to introduce existing deep learning architectures through
the prism of geometry and categorise them based on the fundamental symmetries
of the data they work on. We take care not to express predilections for any
specific architecture (though we might have our own such preferences – nihil
humanum nobis alienum) as we believe there is no "one true architecture",
much like there was no "one true geometry" in mathematics.
What this book is not about
True to Helvétius’ maxim, which suggests focusing on principles and their
architectural instances, we avoid detailed discussion of specific machine learning
pipelines (such as self-supervised learning, generative modelling, or
reinforcement learning) and their training and regularisation procedures (such as many
gradient-descent variants or batch normalisation). As machine learning
practitioners ourselves, we are however aware that "gray are all theories, and
green alone Life’s golden tree". It is not uncommon to see benchmarks and
leaderboards dominated by architectures that do not necessarily have rigorous
mathematical underpinnings, which we will typically refrain from
exemplifying. There is a plethora of reasons why this could happen in practice. One
often-cited reason is a bias in the data that does not reflect the symmetries
of the problem we actually care about. Another reason is the ‘hardware
lottery’ (Hooker 2021) and ‘hype lottery’ phenomena, where substantial resources
can incentivise large-scale hyperparameter tuning leading to "winning" tricks
that have nothing to do with the choice of architecture (Trockman and Kolter
2022). Finally, in many applications, a carefully tuned domain-specific
architecture may outperform a generic mathematically principled one on particular
problems (Liu et al. 2022).
How to use this book
We anticipate the best way to use this book is as a way to master the geometric
approach of categorising and reasoning about deep learning architectures: it
can serve as a useful study companion while learning about existing
architectures, or a source of inspiration while devising or describing novel ones. These
principles are embodied by the courses we have delivered first at the African
Master’s in Machine Intelligence (AMMI) in 2021–2022 and subsequently at
Cambridge and Oxford in 2022–2026. Accordingly, we expect that our text
can serve as a valuable foundation for undergraduate or graduate-level courses
in machine learning and provide our lecture slides as an accompaniment to the
book at geometricdeeplearning.com.
We have attempted to cast a reasonably wide net in terms of the architectures
we discuss here, in order to illustrate the power of our geometric blueprint.
Hence, our book could be interpreted as a survey of machine learning
architectures (circa 2022); yet, we find this to be a suboptimal way to utilise it.
Indeed, our work does not attempt to accurately summarise the entire existing
wealth of research on Geometric Deep Learning. Rather, we study several
well-known architectures in depth in order to demonstrate the key principles
and ground them in existing research, with the hope that we have left sufficient
references for the reader to meaningfully apply these principles to any future
geometric deep architecture they encounter or devise.
Suggested Pre-requisites
While trying to make the book self-contained, we assume the reader to have
a good grasp of several basic mathematical concepts. Should you wish a
thorough introduction to the essentials, we warmly recommend starting with Borde
and Bronstein (2025): it was written precisely with our textbook in mind.
For those who need to fill any lacunae, the classical book of Bruckner,
Bruckner, and Thomson (2008) provides a full overview of real analysis,
including calculus (the notions of derivatives and integration), linear algebra
(vector spaces and matrices), functional analysis (metric-, Banach-, Hilbert-,
and $L^p$-spaces), and harmonic analysis (Fourier series and transforms). Group
theory plays a central role in our exposition. For a lightweight introduction to
this subject, we recommend the visual approach of Carter (2021). A deeper
study of the subject including the notions of Fourier transforms on groups
and irreducible representations is presented in the book of Folland (1989) on
abstract harmonic analysis. Our discussion on manifolds would benefit from a
basic background in differential geometry, for which we suggest the classical
text of Do Carmo (2016).
We also find it useful to have an understanding of the foundations of signal
processing, for which Mallat (1999) is an excellent text. Further, many of the
constructs and data domains we use will have an underlying graph structure,
so we believe that a foundation in graph theory may be of benefit as well; we
recommend Chung and Graham (1997) for spectral graph theory and Shuman
et al. (2013) and Sandryhaila and Moura (2013) for graph signal processing.
Lastly, as one might expect, a prior understanding of the foundations of
machine learning may amplify the reader’s understanding of the significance
of the various architectures we discuss, as well as offer a way to spot
connections to their implementation before we elaborate on them. There are many
suitable introductory texts, of which we recommend Murphy (2022) and the
most recent book by Bishop and Bishop (2024).
