Debunking ARC myths
Myth 1: ARC specs are superior to the competition
True only on paper. In reality it performs horrendously, and in worst-case scenarios it is comparable to a GPU from 2009, 15 years ago.
https://i0.wp.com/chipsandcheese.com/wp-content/uploads/2022/10/arc_1workgroup_bandwidth-2.png?ssl=1
The ancient HD 7950 GPU clobbers ARC in memory performance.
In fact, the A770’s single workgroup bandwidth is closest to the very old AMD HD 5850.
The actual memory throughput is also horrible: https://i0.wp.com/chipsandcheese.com/wp-content/uploads/2022/10/arc_vram_scaling1-1.png?ssl=1
The A770 is comparable to the RX6600XT's memory throughput only when enough parallel work is thrown at it. While the RX6600XT gets 234GB/s out of its potential 256GB/s, the A770 fluctuates between 150-250GB/s (staying close to 150) and needs a larger workload to reach the top of that range.
A770: 150-250GB/s = 30-50% of theoretical
RX6600XT: 234GB/s = 91%
RTX 3060 Ti: 425GB/s = 95%
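The percentages above are just measured bandwidth divided by theoretical peak. Here is a minimal sketch of that arithmetic. The 256GB/s peak for the RX6600XT is from the post; the 512GB/s (A770) and 448GB/s (RTX 3060 Ti) peaks are my assumptions from public spec sheets, not numbers from the article. The function and table names are made up for illustration.

```python
# Theoretical peak memory bandwidth in GB/s.
# ASSUMPTION: 512 (A770) and 448 (RTX 3060 Ti) come from public spec sheets,
# not from the post. 256 (RX6600XT) is the figure quoted in the post.
THEORETICAL_GBPS = {"A770": 512, "RX6600XT": 256, "RTX 3060 Ti": 448}

# Measured bandwidth in GB/s as quoted in the post, stored as (low, high).
MEASURED_GBPS = {
    "A770": (150, 250),
    "RX6600XT": (234, 234),
    "RTX 3060 Ti": (425, 425),
}


def utilization(measured, theoretical):
    """Return measured bandwidth as a percentage of theoretical peak."""
    return 100.0 * measured / theoretical


def utilization_table():
    """Map each card to its (low, high) utilization, rounded to whole percent."""
    rows = {}
    for card, (low, high) in MEASURED_GBPS.items():
        peak = THEORETICAL_GBPS[card]
        rows[card] = (round(utilization(low, peak)), round(utilization(high, peak)))
    return rows


if __name__ == "__main__":
    for card, (low, high) in utilization_table().items():
        if low == high:
            print(f"{card}: {low}% of theoretical")
        else:
            print(f"{card}: {low}-{high}% of theoretical")
```

With those assumed peaks the A770 works out to roughly 29-49%, which is where the "30-50%" above comes from. If you plug in the 16GB card's higher rated bandwidth instead, the A770 percentages come out lower still.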
Chipsandcheese points out that at a high workload count you can get close to theoretical, which is why it shines in RT and 4K workloads. At least comparatively: a 400mm2 part performing like a 250mm2 x60 Ti isn't fantastic, it's just way better than before.
Summary: The article points out that ARC has 1) low memory performance 2) difficulty utilizing its memory bandwidth 3) low fused multiply-add throughput, comparable to AMD's Terascale (the 5850 for one) 4) GPU execution latency on the high side 5) cache that isn't the best either, although not so far behind.
C&C points out that a lot (not all) of the problems are likely due to having the iGPU mindset. They need to grow out of it by sticking to making more dGPUs in the future.
Myth 2: DX9 and DX11 APIs are still emulated.
False. DX11's performance deficiencies exist because they were hyper-focused on making value GPUs (iGPUs), and at such low performance levels driver bottlenecks (which show up as CPU bottlenecks) are essentially hidden. dGPUs perform several times higher, so now they need to optimize for individual titles. DX11 has never been emulated. They are rewriting the stack, and in the latest interview with GN they said they are close to finishing it (a new high-performance driver optimized for dGPUs). Up to now they've been whitelisting individual games, which is why you see a large number of game updates listed in the driver notes.
This is unlike DX9, which prior to April of 2023 ran on the D3D9on12 translation layer. A late 2022 driver introduced a completely new stack with native DX9 support, and the full transition happened in early 2023. I would not be surprised, though, if even DX9 needs further optimization, similar to how DX11 was hampered by using an iGPU stack.
Myth 3: All problems can be fixed by drivers.
Maybe some, but they'll need to put in tremendous work. If they address the above hardware deficiencies in Battlemage, it'll be easier and faster to optimize drivers for it. The current Alchemist is an imbalanced architecture.