Re: [PATCH] [RFC] arm64: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION
From: Sedat Dilek
Date: Fri Feb 26 2021 - 04:07:05 EST
On Fri, Feb 26, 2021 at 9:14 AM Arnd Bergmann <arnd@xxxxxxxxxx> wrote:
>
> On Fri, Feb 26, 2021 at 1:36 AM Sedat Dilek <sedat.dilek@xxxxxxxxx> wrote:
> >
> > On Thu, Feb 25, 2021 at 12:21 PM Arnd Bergmann <arnd@xxxxxxxxxx> wrote:
> > >
> > > From: Arnd Bergmann <arnd@xxxxxxxx>
> > >
> > > When looking at kernel size optimizations, I found that arm64
> > > does not currently support HAVE_LD_DEAD_CODE_DATA_ELIMINATION,
> > > which enables the --gc-sections flag to the linker.
> > >
> > > I see that for a defconfig build with llvm, there are some
> > > notable improvements from enabling this, in particular when
> > > combined with the recently added CONFIG_LTO_CLANG_THIN
> > > and CONFIG_TRIM_UNUSED_KSYMS:
> > >
> > >     text     data    bss      dec     hex filename
> > > 16570322 10998617 506468 28075407 1ac658f defconfig/vmlinux
> > > 16318793 10569913 506468 27395174 1a20466 trim_defconfig/vmlinux
> > > 16281234 10984848 504291 27770373 1a7be05 gc_defconfig/vmlinux
> > > 16029705 10556880 504355 27090940 19d5ffc gc+trim_defconfig/vmlinux
> > > 17040142 11102945 504196 28647283 1b51f73 thinlto_defconfig/vmlinux
> > > 16788613 10663201 504196 27956010 1aa932a thinlto+trim_defconfig/vmlinux
> > > 16347062 11043384 502499 27892945 1a99cd1 gc+thinlto_defconfig/vmlinux
> > > 15759453 10532792 502395 26794640 198da90 gc+thinlto+trim_defconfig/vmlinux
> > >
> >
> > Thanks for the numbers.
> > Does CONFIG_TRIM_UNUSED_KSYMS=y have an impact to the build-time (and
> > disc-usage - negative way means longer/bigger)?
> > Do you have any build-time for the above numbers?
>
> They are in the mailing list archive I linked to:
>
> ==== defconfig ====
> 332.001786355 seconds time elapsed
> 8599.464163000 seconds user
> 676.919635000 seconds sys
> ==== trim_defconfig ====
> 448.378576012 seconds time elapsed
> 10735.489271000 seconds user
> 964.006504000 seconds sys
> ==== gc_defconfig ====
> 324.347492236 seconds time elapsed
> 8465.785800000 seconds user
> 614.899797000 seconds sys
> ==== gc+trim_defconfig ====
> 429.188875620 seconds time elapsed
> 10203.759658000 seconds user
> 871.307973000 seconds sys
> ==== thinlto_defconfig ====
> 389.793540200 seconds time elapsed
> 9491.665320000 seconds user
> 664.858109000 seconds sys
> ==== thinlto+trim_defconfig ====
> 580.431820561 seconds time elapsed
> 11429.515538000 seconds user
> 1056.985745000 seconds sys
> ==== gc+thinlto_defconfig ====
> 389.484364525 seconds time elapsed
> 9473.831980000 seconds user
> 675.057675000 seconds sys
> ==== gc+thinlto+trim_defconfig ====
> 580.824912807 seconds time elapsed
> 11433.650337000 seconds user
> 1049.845569000 seconds sys
>
Thanks for the numbers Arnd.
> So HAVE_LD_DEAD_CODE_DATA_ELIMINATION is a small improvement
> on build time (since it can spend less time linking), while
> CONFIG_TRIM_UNUSED_KSYMS slows it down quite a bit. Combining
> CONFIG_TRIM_UNUSED_KSYMS with CONFIG_THINLTO is really
> slow because here most of the time is spent in the final link (especially
> when you have many CPU cores to do the earlier bits quickly), but then
> it does the link twice.
>
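To put rough numbers on that summary, here is a small sketch that
relates each build to the plain defconfig build. The figures are copied
from the "dec" column of the size table and from the elapsed times
quoted above; the helper name compare_builds is made up for
illustration, and a single run per configuration is of course not a
benchmark.

```shell
#!/bin/sh
# Columns: build name, "dec" size in bytes, elapsed seconds.
# All values are copied from the size table and timings quoted above.
builds='defconfig 28075407 332.001786355
trim_defconfig 27395174 448.378576012
gc_defconfig 27770373 324.347492236
gc+trim_defconfig 27090940 429.188875620
thinlto_defconfig 28647283 389.793540200
thinlto+trim_defconfig 27956010 580.431820561
gc+thinlto_defconfig 27892945 389.484364525
gc+thinlto+trim_defconfig 26794640 580.824912807'

# Print the size and elapsed time of each build relative to the first
# row (plain defconfig). compare_builds is a hypothetical helper name.
compare_builds() {
    printf '%s\n' "$builds" | awk '
        NR == 1 { size0 = $2; time0 = $3 }
        {
            printf "%-26s size %+5.1f%%  elapsed %+5.1f%%\n",
                $1, ($2 - size0) / size0 * 100, ($3 - time0) / time0 * 100
        }'
}

compare_builds
```

On these numbers, gc alone comes out about 1.1% smaller and about 2.3%
faster to build, while gc+thinlto+trim is about 4.6% smaller but takes
about 75% longer.
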
My first pre-v5.12-rc1 kernel-build was with Clang-ThinLTO enabled.
But with the next ones I jumped to Sami's Clang-CFI.
> > BTW, is CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y setable for x86 (64bit)?
> > ( Did not look or check for it. )
>
> No, in mainline, HAVE_LD_DEAD_CODE_DATA_ELIMINATION is currently
> only selected on MIPS and PowerPC. I only sent experimental patches to
> enable it on arm64 and m68k, but have not tried booting them. If you
> select the symbol on x86, you should see similar results.
>
OK, I see:
$ git grep HAVE_LD_DEAD_CODE_DATA_ELIMINATION arch/mips/
arch/mips/Kconfig: select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
$ git grep HAVE_LD_DEAD_CODE_DATA_ELIMINATION arch/powerpc/
arch/powerpc/Kconfig: select HAVE_LD_DEAD_CODE_DATA_ELIMINATION
So, I need to add this to arch/x86/Kconfig.
Do you happen to know whether changes to arch/x86/kernel/vmlinux.lds.S
(sections) are needed?
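As a side note on the two greps above: a small helper can list every
architecture that selects the symbol in one pass. This is only a
sketch; arches_with_dce is a made-up name, and it assumes the select
line sits in arch/<arch>/Kconfig itself, as it does for MIPS and
PowerPC in the output above.

```shell
#!/bin/sh
# List the architectures whose top-level Kconfig selects
# HAVE_LD_DEAD_CODE_DATA_ELIMINATION.
# $1 is the root of a kernel source tree.
# arches_with_dce is a hypothetical helper, not a kernel script.
arches_with_dce() {
    for kconfig in "$1"/arch/*/Kconfig; do
        # Skip the unexpanded glob if there is no arch/ directory.
        [ -f "$kconfig" ] || continue
        if grep -q '^[[:space:]]*select HAVE_LD_DEAD_CODE_DATA_ELIMINATION' "$kconfig"; then
            # Print just the architecture directory name, e.g. "mips".
            basename "$(dirname "$kconfig")"
        fi
    done
}
```

Called as "arches_with_dce ." from the top of a kernel checkout, it
prints one architecture name per line.
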
Last question:
Over the last few days I have seen a lot of fixes touching inlining in
LLVM/Clang v13-git.
What git tag are you using?
What are your experiences?
Are there any pending patches (kernel-side)?
I use:
$ /opt/llvm-toolchain/bin/clang --version
dileks clang version 13.0.0 (https://github.com/llvm/llvm-project.git c465429f286f50e52a8d2b3b39f38344f3381cce)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm-toolchain/bin
My LLVM toolchain is ThinLTO+PGO optimized for Linux-kernel builds.
- Sedat -