Meet the G5 processor, which is in fact IBM's PowerPC 970FX processor. The RISC ISA, which is quite complex and can hardly be called "Reduced" (the R of RISC), provides 32 architectural registers. Architectural registers are the registers that are visible to the programmer, mostly the compiler programmer. These are the registers that can be used to program the calculations in the binary (and assembler) code.
Compilers for the PowerPC 970FX should thus be able to produce code that is cleaner, with less shuffling of data between the L1 cache, "secret" rename registers and architectural registers. The end result is better performance, thanks to less "bookkeeping". Insiders say that the PowerPC 970FX has less "register pressure" than, for example, the EM64T and AMD64 CPUs (16 registers), which in turn have less register pressure than the "older" 32 bit x86 CPUs with only 8 architectural registers.
Remember that most of the performance boost (10-30%) noticed in x86 64 bit programs came from the 8 extra registers available in "pure" 64 bit mode.
The 970FX is deeply pipelined, quite a bit deeper than the Athlon 64 or Opteron. While the Opteron has a 12 stage pipeline for integer calculations, the 970FX goes deeper and ends up with 16 stages. Floating point is handled through 21 stages, where the Opteron only needs 17. Those 21 stages might make you think that the 970FX is close to a Pentium 4 Northwood, but you should remember that the Pentium 4 also had 8 stages in front of the trace cache. The Pentium 4's 20 stages were counted from the trace cache. So, the Pentium 4 has to do less work in those 20 stages than what the 970FX performs in its 16 or 21 stages. When it comes to branch misprediction penalties, the 970FX penalty will be closer to that of the Pentium 4 (Northwood). But when it comes to frequency headroom, the 970FX should do - in theory - better than the Opteron, but does not come close to the "old" Pentium 4.
The design philosophy of the 970FX is very aggressive. It is not only a deeply pipelined processor, but also a very wide superscalar CPU that can theoretically sustain up to 5 instructions (4 + 1 branch) per clock cycle.
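The pipeline depths quoted in this section are easier to compare once they are put side by side. The small Python sketch below does only that arithmetic: the stage counts are the ones given in the text, while the names `PIPELINE_STAGES`, `extra_depth_percent` and `pentium4_total_stages` are made up for this illustration. It says nothing about real-world performance, only about relative depth.

```python
# Stage counts as quoted in the text; nothing here is measured.
PIPELINE_STAGES = {
    "Opteron": {"integer": 12, "floating_point": 17},
    "PowerPC 970FX": {"integer": 16, "floating_point": 21},
}

# Pentium 4 (Northwood): 20 stages counted from the trace cache,
# plus the 8 stages that the text places in front of the trace cache.
P4_FROM_TRACE_CACHE = 20
P4_BEFORE_TRACE_CACHE = 8


def extra_depth_percent(cpu, baseline, unit):
    """Return how much deeper cpu's pipeline is than baseline's, in percent."""
    deeper = PIPELINE_STAGES[cpu][unit]
    shallower = PIPELINE_STAGES[baseline][unit]
    return 100.0 * (deeper - shallower) / shallower


def pentium4_total_stages():
    """Return the Pentium 4 stage count including the pre-trace-cache stages."""
    return P4_FROM_TRACE_CACHE + P4_BEFORE_TRACE_CACHE


if __name__ == "__main__":
    for unit in ("integer", "floating_point"):
        pct = extra_depth_percent("PowerPC 970FX", "Opteron", unit)
        print(f"970FX {unit} pipeline: {pct:.1f}% deeper than the Opteron's")
    print(f"Pentium 4 stages including the front end: {pentium4_total_stages()}")
```

Counting the front end, the Pentium 4 ends up noticeably deeper than the 970FX's longest pipeline, which is consistent with the remark that it does less work per stage.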