Intel Panther Lake will allegedly reintegrate the memory controller into the compute tile — Nova Lake is expected to separate the two again with added optimizations

(Image credit: Intel)

Intel's Core Ultra 200S series, codenamed Arrow Lake, has launched after a long wait, but initial reviews have been disappointing: Arrow Lake shows generational regressions in gaming performance, as noted in our Core Ultra 9 285K review. Intel is seemingly back at the drawing board and will allegedly revamp Panther Lake by moving the IMC (Integrated Memory Controller) back onto the Compute Tile, according to hardware leaker Kopite on X, which should nip most of the latency issues in the bud. Moreover, hardware sleuth Jaykihn alleges that Panther Lake lacks a dedicated SoC Tile.

Intel's disaggregated approach was bound to have problems, as we saw with AMD's RDNA 3 architecture. The latency problem is twofold: a slow ring bus (about 3.9 GHz) and an off-die memory controller. Raptor Lake could match Zen 4 in gaming performance thanks to a blazing-fast ring bus, clocked at roughly 5 GHz, and a simple monolithic design. With Arrow Lake, data has to cross dies to reach the memory controller before it reaches the DRAM. On top of that come the penalties of a ring bus that is roughly 22% slower, leading to atrocious L3 access latencies. Here is a simple diagram of Arrow Lake for explanation.

Arrow Lake diagram (Image credit: Jaykihn)
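
As a rough check on the ring-bus figures quoted above, the clock gap can be worked out in a few lines of Python. This is only a back-of-the-envelope sketch: the two clock speeds are the approximate values from the article, while the cycle count for an L3 access (`ASSUMED_L3_RING_CYCLES`) is a made-up illustrative number, not a measurement, and real L3 latency depends on more than the ring clock alone.

```python
# Approximate ring-bus clocks quoted in the article.
ARROW_LAKE_RING_GHZ = 3.9
RAPTOR_LAKE_RING_GHZ = 5.0

# HYPOTHETICAL value for illustration only: ring-clock cycles per L3 access.
ASSUMED_L3_RING_CYCLES = 40


def clock_deficit(slow_ghz: float, fast_ghz: float) -> float:
    """Return the fraction by which slow_ghz falls short of fast_ghz."""
    return 1.0 - slow_ghz / fast_ghz


def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a cycle count at clock_ghz (GHz) into nanoseconds."""
    return cycles / clock_ghz


deficit = clock_deficit(ARROW_LAKE_RING_GHZ, RAPTOR_LAKE_RING_GHZ)
print(f"Ring clock deficit: {deficit:.0%}")

# The same number of ring cycles takes longer in wall-clock time on a slower ring.
for name, ghz in [("Raptor Lake", RAPTOR_LAKE_RING_GHZ),
                  ("Arrow Lake", ARROW_LAKE_RING_GHZ)]:
    latency_ns = cycles_to_ns(ASSUMED_L3_RING_CYCLES, ghz)
    print(f"{name}: {latency_ns:.2f} ns for {ASSUMED_L3_RING_CYCLES} ring cycles")
```

With these inputs, a 3.9 GHz ring comes out about 22% slower than a 5 GHz one, so an access that costs the same number of ring cycles takes proportionally longer in nanoseconds, before any die-to-die hop to the memory controller is counted.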

Now that we have some context, let's move on to today's leak. Kopite alleges that Intel's upcoming mobile-only Panther Lake CPUs will reintegrate the IMC with the Compute Tile. Hence, data will not need to be routed through additional die-to-die interconnects to reach the IMC.

"Although PTL will reintegrate IMC into the compute die, NVL will once again separate and optimize it." (October 26, 2024)

But, as always, there's a catch. Apparently, Panther Lake will not feature an SoC Tile, which may be the only plausible reason for Intel to abandon its AMD-esque strategy of separating the cores and the IMC. Take this with a pinch of salt, as there is no confirmation from Intel, and even the leaker seems unsure.

"Subsystems traditionally located on a separate SOC tile are moved to the compute tile in PTL due to the lack of a dedicated SOC tile. The lack of a dedicated SOC tile is due to scale." (October 26, 2024)

Nova Lake, the successor to Arrow Lake, will reportedly separate the cores and the IMC again, but with added optimizations. We believe the IMC issue should not be that hard to fix, given that AMD has used the same 2.5D strategy since the inception of Zen 2. The real question is the ring bus: will Intel be forced to redesign its interconnects, akin to AMD's renowned Infinity Fabric?

Most of this is unclear as of now, but everyone in the enthusiast field shares one question: "Why did Arrow Lake underdeliver?" Intel axed Meteor Lake on the desktop to focus on Arrow Lake, and this is the first new architecture we've seen since 2021 (Alder Lake), as Raptor Lake was merely a refresh. For now, let's hope Intel can address Arrow Lake's inconsistencies through microcode and Windows updates, and then we'll see the final numbers.

Hassam Nasir is a die-hard hardware enthusiast with years of experience as a tech editor and writer, focusing on detailed CPU comparisons and general hardware news. When he’s not working, you’ll find him bending tubes for his ever-evolving custom water-loop gaming rig or benchmarking the latest CPUs and GPUs just for fun.

Comments from the forums (10 comments)

TheSecondPower

I remember hearing that Panther Lake will succeed Lunar Lake. Lunar Lake already shares one die for the memory controller and CPU cores and GPU and NPU. There is a separate I/O die though.

I think Broadwell-E used a mesh interconnect instead of a ring bus because it had too many cores for a ring bus. I'm guessing Intel's server CPUs use something similar. As I recall it wasn't great for latency.

I'm guessing AMD's solution is a little different. Zen 1 and 2 had clusters of 4 cores and since Zen 3 AMD has had clusters of 8 cores, where latency is low inside a cluster and a little high outside of it.
waltc3

RDNA 3 is, IIRC, a GPU architecture, never to be confused with a CPU architecture...;) Big differences all around.
TheSecondPower

waltc3 said:
RDNA 3 is, IIRC, a GPU architecture, never to be confused with a CPU architecture...;) Big differences all around.
RDNA 3 and Zen 2 were both AMD products that moved the memory controller away from the compute units, like Arrow Lake and Meteor Lake do. However, while this was largely successful for Zen 2 (because of huge advancements in other respects), it didn't go over so well for RDNA 3. Yes, Arrow Lake and RDNA 3 are not alike, since one is a CPU line and one a GPU line, but they have memory latency troubles in common.
Batuhangüvenç

Lunar Lake has the memory controller on the compute tile, plus a platform controller tile. Maybe this is why Panther Lake will not have an SoC tile, since Lunar Lake is a far more successful design (better cache latencies, and core-to-core latency even better than Raptor Lake). I think Panther Lake will basically be a more powerful Lunar Lake, so they give the tiles the same names as in Lunar Lake. 4P+8E on the same ring for good performance, and 4E on a separate bus to maximize efficiency at light workloads. 12 Xe3 cores, Intel 18A, 10 percent more IPC for the P-cores. Dream CPU.
bit_user

TheSecondPower said:
I think Broadwell-E used a mesh interconnect instead of a ring bus because it had too many cores for a ring bus.
No, it was the last generation to use one or more rings (up to 2.5, IIRC). Skylake-SP was the first to use a mesh.

TheSecondPower said:
I'm guessing Intel's server CPUs use something similar. As I recall it wasn't great for latency.
Eh, server CPUs should really be designed to optimize latency under high utilization, which makes your typical latency benchmark pretty irrelevant. Meshes scale very well, at least until the point where they start crossing die boundaries (which burned Sapphire Rapids and is something Intel walked back in Emerald Rapids).

TheSecondPower said:
I'm guessing AMD's solution is a little different. Zen 1 and 2 had clusters of 4 cores and since Zen 3 AMD has had clusters of 8 cores, where latency is low inside a cluster and a little high outside of it.
Uh, I thought we were talking about memory latency. That matters a lot more than core-to-core.
bit_user

TheSecondPower said:
RDNA 3 and Zen 2 were both AMD products that moved the memory controller away from the compute units, like Arrow Lake and Meteor Lake do. However, while this was largely successful for Zen 2 (because of huge advancements in other respects), it didn't go over so well for RDNA 3.
I'm not sure how harmful it really was for RDNA3's performance. Yes, it has higher L3 latencies than RDNA2, but it also scaled L3 bandwidth quite significantly.

Source: https://chipsandcheese.com/p/microbenchmarking-amds-rdna-3-graphics-architecture
GPUs love bandwidth and are a lot more tolerant of latency than CPUs. The only way to really know how much it hurt them is by profiling actual games and seeing to what extent shader occupancy is being constrained by L2 misses vs. RDNA2.
Kamen Rider Blade

Am I the only one wondering WTF with Intel?

I can understand using a "Tile-Based Architecture" for DeskTop / WorkStation.

Why not just go with Monolithic for LapTop parts, just like AMD?

When Power / Efficiency / Thermals matter, just make it Monolithic.

Keep your Design in check and not make it too large like AMD does, and it should be fine.

All the excess Power/Heat/Latency is more tolerable in a DeskTop setting, where you can benefit from the Tile/Chiplet design's inherent modularity.
bit_user

Kamen Rider Blade said:
Am I the only one wondering WTF with Intel?

I can understand using a "Tile-Based Architecture" for DeskTop / WorkStation.

Why not just go with Monolithic for LapTop parts, just like AMD?

When Power / Efficiency / Thermals matter, just make it Monolithic.
If you read their launch material around Meteor Lake, they were keen to highlight how they put all the essential blocks in the SoC tile (including 2x LPE cores), which enabled them to completely power down the GPU tile and CPU tile for light-duty usage. Meteor Lake did achieve impressive battery life in tasks like video calls, where it just had a minimal amount of compute + (hardware-accelerated) video encode/decode to do.

Also, Intel's packaging technology is more power-efficient than AMD's. So, it's much less of a downside for laptops than when AMD does it (e.g. Dragon Range).

Clearly, it had significant downsides for them. However, we should recognize that there was a certain logic behind the decision.
Kamen Rider Blade

bit_user said:
If you read their launch material around Meteor Lake, they were keen to highlight how they put all the essential blocks in the SoC tile (including 2x LPE cores), which enabled them to completely power down the GPU tile and CPU tile for light-duty usage. Meteor Lake did achieve impressive battery life in tasks like video calls, where it just had a minimal amount of compute + (hardware-accelerated) video encode/decode to do.

Also, Intel's packaging technology is more power-efficient than AMD's. So, it's much less of a downside for laptops than when AMD does it (e.g. Dragon Range).

Clearly, it had significant downsides for them. However, we should recognize that there was a certain logic behind the decision.

But can't Intel do the same type of Power Gating for each of the ASIC's separated IP sections when they're Monolithic?

Wouldn't the benefits of Monolithic out-weigh having to deal with all the down-sides of Chiplet Tech?
bit_user

Kamen Rider Blade said:
But can't Intel do the same type of Power Gating for each of the ASIC's separated IP sections when they're Monolithic?
I honestly don't know how much more effective it is to power down an entire die than just gating all of the corresponding parts.

Kamen Rider Blade said:
Wouldn't the benefits of Monolithic out-weigh having to deal with all the down-sides of Chiplet Tech?
On balance, it would seem Meteor Lake went too far, given which way they went with Lunar Lake and seem to be going with Panther Lake.

It reminds me a little of how they overdid chiplets in Ponte Vecchio, which had somewhere around 50 different chiplets, packed and stacked in there. I wonder who in their right mind thought it was a good idea to lean so heavily into relatively new tech, like that. If you look at how AMD does things, it's very incremental and I'm sure they learn a lot in each generation.