
Discussion: Zen 7 speculation thread

Page 20 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

Meteor Late

Senior member
Dec 15, 2023
340
374
96
If there were overclocking tools for Mac and you raised your voltage limit and the clock, you could get into a very power-inefficient zone too. But Apple just prevents you from doing it.

Yes, but that is all irrelevant when Apple is already much faster in 1T while consuming less. Who cares if they would double or even triple the power consumption when clocked, say, 7% higher? A good way to measure efficiency is either performance at the same power or power at the same performance.

I think AMD is at a worse point on the curve from an efficiency standpoint. But if you clock it around 10% lower to reach a more optimal point, the deficit is still way too high.

Meteor Late

Senior member
Dec 15, 2023
340
374
96
Why do you think I don't understand this? When did I ever disagree with this statement?

All I'm asking you is when you lower the clocks to achieve the same perf/watt as an M4 Pro, what is the speed of Strix Halo?

It would be better to simply lower the frequency so that it consumes a similar amount of power to the M4 Pro. That would be more apples to apples: performance at the same power.

inquiss

Senior member
Oct 13, 2010
543
802
136
Yes, but that is all irrelevant when Apple is already much faster in 1T while consuming less. Who cares if they would double or even triple the power consumption when clocked, say, 7% higher? A good way to measure efficiency is either performance at the same power or power at the same performance.

I think AMD is at a worse point on the curve from an efficiency standpoint. But if you clock it around 10% lower to reach a more optimal point, the deficit is still way too high.
The point you're missing is that Apple has much better idle consumption. Because AMD has a lot of cores and a chunk of idle consumption, it really throws the single-thread efficiency calculation completely out the window. It's true that if you run one thread Apple is much more efficient, but that's more about the other factors at play than about single-thread efficiency alone, regardless of where each chip sits on its own V/f curve and its own performance. It looks so skewed because of the idle power.
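To make the idle-power argument concrete, here is a minimal sketch. Every number in it is made up for illustration (none of the wattages or scores come from this thread or from any measurement), and `perf_per_watt` is a hypothetical helper. It only shows how a fixed idle/uncore floor, when charged entirely to a single thread, can dominate the pts/W figure.

```python
def perf_per_watt(score: float, core_watts: float, idle_watts: float) -> float:
    """Points per watt when the whole package power is charged to one thread."""
    return score / (core_watts + idle_watts)

# Hypothetical chips: identical score and identical active-core power,
# differing only in the fixed idle/uncore floor (illustrative values).
low_idle = perf_per_watt(score=100.0, core_watts=5.0, idle_watts=1.0)
high_idle = perf_per_watt(score=100.0, core_watts=5.0, idle_watts=15.0)

print(f"low idle floor:  {low_idle:.2f} pts/W")
print(f"high idle floor: {high_idle:.2f} pts/W")
print(f"apparent efficiency gap: {low_idle / high_idle:.2f}x")
```

With these invented inputs the two cores are equally efficient by construction, yet the measured gap comes out above 3x purely because of the idle floor.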
Reactions: Tlh97, Covfefe and Josh128

Josh128

Golden Member
Oct 14, 2022
1,434
2,178
106
So what is Strix Halo's ST speed if you lower the clocks to get the same perf/watt as M4 Pro?

At stock clocks, M4 Pro is already 52% faster than Strix Halo ST.
You do realize M4 is on 3nm and Strix Halo is on 4nm, right? Let's circle back when they are on the same process; otherwise the discussion is pretty pointless.

Joe NYC

Diamond Member
Jun 26, 2021
3,815
5,363
136
I think AMD is at a worse point on the curve from an efficiency standpoint. But if you clock it around 10% lower to reach a more optimal point, the deficit is still way too high.

Yes, but it is not 1 to 1.

You can lower power consumption by 50% and lose 5% of performance - as a hypothetical scenario.
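As a quick check of what that hypothetical scenario implies, the sketch below computes the resulting change in perf/watt. The 50% and 5% figures are the hypothetical ones from this post, not measurements, and `efficiency_gain` is an illustrative helper name.

```python
def efficiency_gain(power_scale: float, perf_scale: float) -> float:
    """Factor by which perf/watt changes when power and performance are rescaled."""
    return perf_scale / power_scale

# Hypothetical scenario from the post: power cut to 50%, performance down to 95%.
gain = efficiency_gain(power_scale=0.50, perf_scale=0.95)
print(f"perf/watt changes by {gain:.2f}x")
```

Under those assumed numbers, perf/watt would improve by about 1.9x for a 5% loss in speed.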

Meteor Late

Senior member
Dec 15, 2023
340
374
96
Yes, but it is not 1 to 1.

You can lower power consumption by 50% and lose 5% of performance - as a hypothetical scenario.

Yes, AMD would definitely improve its efficiency more than Apple would by lowering the clock speed 5 or 10%, because they are at a higher point on the curve. But the gap would still be way too high.

Markfw

Moderator Emeritus, Elite Member
May 16, 2002
27,304
16,134
136
You keep repeating this but that's exactly what would happen if Intel puts enough atoms into a single chip.

What you're missing is area efficiency, which is a factor when it comes to MT scaling. Ultimately what matters for a server chip is area x efficiency x performance of one core x # of cores. Area directly correlates to cost of the chip.

So you put enough atoms into a single chip and it can win nT race by a mile. It's likely terribly inefficient area wise for the performance.
NO, as the added Mod comment says, no trolling or baiting. We have heard enough about Intel.

Second, small cores without SMT and AVX-512 are very lacking. SMT alone doubles (or greatly increases) the effective number of cores. Once you add the ability to do other things like AVX-512, they are left in the dust.
Reactions: Tlh97 and marees

Meteor Late

Senior member
Dec 15, 2023
340
374
96
The point you're missing is that Apple has much better idle consumption. Because AMD has a lot of cores and a chunk of idle consumption, it really throws the single-thread efficiency calculation completely out the window. It's true that if you run one thread Apple is much more efficient, but that's more about the other factors at play than about single-thread efficiency alone, regardless of where each chip sits on its own V/f curve and its own performance. It looks so skewed because of the idle power.

AMD has other parts with fewer cores if you want to compare, for example the lower variants of Strix Point, Krackan Point...

Joe NYC

Diamond Member
Jun 26, 2021
3,815
5,363
136
Why do you think I don't understand this? When did I ever disagree with this statement?

All I'm asking you is when you lower the clocks to achieve the same perf/watt as an M4 Pro, what is the speed of Strix Halo?

You don't achieve the same perf/watt. But what you will achieve is a perf/watt ratio in ST similar to the one you get in MT. Meaning, not a 350% to 400% difference but more like a 35% to 40% difference.

In other words, the wet dream (below) of 360% gap in efficiency growing to 720% gap in efficiency is just not living in reality:

Can you estimate how much more performance and efficiency gains AMD needs to overtake Apple's M8?

Here's a baseline for you via Notebookcheck. M4 Pro is roughly 52% faster in Cinebench ST and 3.6x more efficient than Strix Halo.

Let's suppose M8 doubles M4 perf/watt to 19 pts/W. Let's suppose ST is increased by 46% over 4 generations to 260 points.

Will Zen 7 increase Strix Halo efficiency by 7.2x while also increasing ST performance by 2.2x?

| Benchmark | Strix Halo 395+ | M4 Pro Mini | M4 Max | % Difference (M4 Max vs Strix Halo) |
|---|---|---|---|---|
| Memory Bandwidth | 256 GB/s | 273 GB/s | 546 GB/s | +113.3% |
| Cinebench 2024 ST | 116.8 | 178 | 178 | +52.4% |
| Cinebench 2024 MT | 1648 | 1729 | 2069 | +25.6% |
| Geekbench ST | 2978 | 3836 | 3880 | +30.3% |
| Geekbench MT | 21269 | 22509 | 25760 | +21.1% |
| 3DMark Wildlife (GPU) | 19615 | 19345 | 37434 | +90.8% |
| GFX Bench (fps) (GPU) | 114 | 125.8 | 232 | +103.5% |
| Blender GPU Party Tug (GPU) | 55 sec | 43 sec (or M4 Max; column unclear) | | |
| Cinebench ST Power Efficiency | 2.62 pts/W | 9.52 pts/W | | |
| Cinebench MT Power Efficiency | 14.7 pts/W | 20.2 pts/W | | |
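The ratios argued over in this post can be recomputed from the figures quoted above. This is a minimal sketch: the input numbers are copied from the table as posted, the "M8" values are the hypothetical ones from the quoted post (2x M4 Pro perf/watt, 260 points ST), and the variable names are illustrative only.

```python
# Figures as quoted in the table above (Cinebench 2024, via Notebookcheck).
strix_halo = {"cb_st": 116.8, "st_eff": 2.62, "mt_eff": 14.7}
m4_pro = {"cb_st": 178.0, "st_eff": 9.52, "mt_eff": 20.2}

# Current gaps, M4 Pro relative to Strix Halo.
st_perf_gap = m4_pro["cb_st"] / strix_halo["cb_st"]
st_eff_gap = m4_pro["st_eff"] / strix_halo["st_eff"]
mt_eff_gap = m4_pro["mt_eff"] / strix_halo["mt_eff"]

# Hypothetical "M8" from the quoted post: double the M4 Pro ST perf/watt
# and an ST score of 260 points. These are suppositions, not data.
m8_st_eff = 2 * m4_pro["st_eff"]
m8_cb_st = 260.0
needed_eff = m8_st_eff / strix_halo["st_eff"]
needed_st = m8_cb_st / strix_halo["cb_st"]

print(f"ST performance gap: {st_perf_gap:.2f}x")
print(f"ST efficiency gap:  {st_eff_gap:.2f}x")
print(f"MT efficiency gap:  {mt_eff_gap:.2f}x")
print(f"Needed vs hypothetical M8: {needed_eff:.2f}x efficiency, {needed_st:.2f}x ST")
```

The ST efficiency gap comes out at roughly 3.6x while the MT gap is roughly 1.37x, which is the 35% to 40% range referred to earlier in this post. The required multipliers against the hypothetical M8 come out near 7.3x and 2.2x.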
Reactions: Tlh97 and Covfefe

Joe NYC

Diamond Member
Jun 26, 2021
3,815
5,363
136
Yes, AMD would definitely improve its efficiency more than Apple would by lowering the clock speed 5 or 10%, because they are at a higher point on the curve. But the gap would still be way too high.

I have posted repeatedly that the Mac M line has an advantage in performance and efficiency, which I have never once disputed.

All I am disputing is the 360% efficiency advantage going on to 720% efficiency advantage.

Calling BS on it.

Meteor Late

Senior member
Dec 15, 2023
340
374
96
I have posted repeatedly that the Mac M line has an advantage in performance and efficiency, which I have never once disputed.

All I am disputing is the 360% efficiency advantage going on to 720% efficiency advantage.

Calling BS on it.

Yeah, usually what you want to compare is two metrics between two processors:
- Performance at the same power: possible to do. Just downclock Zen 5 laptop parts to match M4 or M5 1T power consumption, then assess the difference in performance.
- Power at the same performance: not possible, I think, because you cannot manually downclock or power-limit an Apple CPU.

The difference in the first scenario will always be much lower than in the second, because power scales roughly quadratically with performance when the performance comes from clock speed. Of course, it depends on the process node, on which part of the curve we are on, etc. So, for example, 50% more performance at the same power is more impressive than 50% less power at the same performance.
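The relationship between the two metrics can be sketched under a simple model. The assumption here, following the "quadratic" rule of thumb in this post, is that both chips follow P = c * perf**k with the same exponent k = 2 and differ only in the constant c. Real V/f curves are not a clean power law, so treat this purely as an illustration; the function names are hypothetical.

```python
def power_ratio_at_same_perf(perf_advantage: float, k: float = 2.0) -> float:
    """Relative power of the better chip at equal performance, given its
    performance advantage at equal power, assuming P = c * perf**k."""
    return perf_advantage ** -k

def perf_advantage_at_same_power(power_ratio: float, k: float = 2.0) -> float:
    """Performance advantage at equal power, given the better chip's
    relative power at equal performance, under the same assumption."""
    return power_ratio ** (-1.0 / k)

# 50% more performance at the same power -> relative power at the same performance.
same_perf_power = power_ratio_at_same_perf(1.5)
# 50% less power at the same performance -> performance advantage at the same power.
same_power_perf = perf_advantage_at_same_power(0.5)

print(f"power at same performance: {same_perf_power:.3f}x")
print(f"performance at same power: {same_power_perf:.3f}x")
```

Under this assumed model, a 50% performance lead at equal power corresponds to about 0.44x the power at equal performance (a 56% reduction), while a 50% power reduction at equal performance corresponds to only about a 1.41x performance lead at equal power.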
Reactions: Joe NYC

Kepler_L2

Golden Member
Sep 6, 2020
1,015
4,339
136
FP512 - I wonder what that might be. An AVX-512 equivalent synchronized with Intel?
That's what they already have with Zen 5; it just means 512-bit execution units (unlike Zen 4 with AVX-512 on FP256).
1/2 ACE - ACE seems to come from Advanced Matrix Extensions from the AMD / Intel collaboration, and presumably, since this referred to 1/2 CCD (8 cores), it could be 1/2 of AMD's planned ACE unit.
Again, it just means 512-bit execution instead of 1024-bit (double pumped).
It mentions 4x FP8 performance and 2x Int8 performance. I am assuming that these will be new datatypes for AVX-512. Zen 6 is already adding 2x FP16 performance, so Zen 7 seems to be extending it further to 8x FP8. I presume this will also be part of the AVX-512 definition.
FP8 support (2x) plus 2x FMA/iFMA execution ports.

Also, IMO this is why they are going very aggressively to A14: such a massive FPU plus AMX/ACE support would just be too big and power hungry even on N2P.

511

Diamond Member
Jul 12, 2024
4,822
4,389
106
Second, small cores without SMT and avx-512 are very lacking. Just SMT alone doubled (or greatly increases) the number or cores. Once you add ability for other things like avx-512 they are lost in the dust.
Funny you say that: E cores had 4-way SMT and AMX-512 at one point, but the program bit the dust thanks to amazing Intel decisions.

Joe NYC

Diamond Member
Jun 26, 2021
3,815
5,363
136
That's what they already have with Zen 5; it just means 512-bit execution units (unlike Zen 4 with AVX-512 on FP256).

Again, it just means 512-bit execution instead of 1024-bit (double pumped).

FP8 support (2x) plus 2x FMA/iFMA execution ports.

Also, IMO this is why they are going very aggressively to A14: such a massive FPU plus AMX/ACE support would just be too big and power hungry even on N2P.

Thanks.

BTW, this Tweet seems to indicate that on the Zen 6 side, the mobile cores will have 256b vectors. Does that mean AMD is going back to a Zen 4-type implementation in mobile?

If so, IMO, it is a smart decision.


yuri69

Senior member
Jul 16, 2013
684
1,224
136
If die size is going up by 40% (from 70 mm² to 98 mm²), the core count is going up by 33%, and there is some density increase from the new node, there should be some extra transistors and die area per core with Zen 7.
Oh boy, not the "IPC power of extra transistors" again.

Compared to Zen 6, this Zen 7 from MLID has 33% more cores, twice the L2 per core, and support for what are likely many nearly fixed-use vector/FP ISA extensions... Throw in some structural increases, and add that "invisible" stuff like security, RAS, profiling, or QoS, which will surely eat more transistors in the 2029 timeframe...
Reactions: MrMPFR

AnandTech is part of Future plc, an international media group and leading digital publisher. Visit our corporate site.
© Future Publishing Limited Quay House, The Ambury, Bath BA1 1UA. All rights reserved. England and Wales company registration number 2008885.