Modern CPU Architecture: A Crash Course

July 10 Geo Miller

Just what is going on with processors these days? HCW regulars have noted a severe decline in new hardcore enthusiasts. The exciting days of overclocking, case mods, and drooling over new architectures seem to have passed. But what is to blame for this new age of premods and complacency with stock speeds? Processors just kept getting faster as engineers packed more hardware onto each chip, and seemed to leave us behind. And as clock speed has become less important for system performance, marketing groups don’t even seem to know what to call their chips anymore.

So now, it’s time to catch up. We will look briefly at the flagship chips made by Intel and AMD.

A processor’s task is to Fetch instructions, Decode what to do, Execute the request, and Store the results. To increase clock speed, these four steps have to be broken up into smaller stages, so that the clock doesn’t have to wait for every line in the processor to stabilize for the complete result of each instruction. Much like an assembly line, each instruction is decoded and sent through the processor in stages along the pipeline, with several instructions in flight at once. More stages mean a faster clock, but also more problems when the code requires a jump (if, else, switch, case), when execution would be faster in a different order, or when information needed by a new instruction is still in the pipeline. It’s a delicate balance with no perfect solution. Engineers are forced to try designs and simulate which one will complete the consumer’s workload the fastest. It’s no longer a simple race of frequency.
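
To make the assembly-line trade-off concrete, here is a rough sketch of how cycle counts change with pipeline depth. It is a toy model, not how any real chip is simulated: the function names, the flush cost of one refill per wrong jump, and the workload numbers are all assumptions made up for illustration.

```python
def pipeline_cycles(instructions, stages, mispredicted_jumps=0):
    """Clock cycles for a run of instructions on an idealized pipeline."""
    # The first instruction takes `stages` cycles to travel the whole line;
    # after that, one instruction finishes every cycle.
    cycles = stages + (instructions - 1)
    # Assumption: a wrongly guessed jump discards the work in flight, so the
    # pipeline spends about (stages - 1) extra cycles refilling.
    cycles += mispredicted_jumps * (stages - 1)
    return cycles


def run_time_ns(instructions, stages, logic_ns, latch_ns=0.0, mispredicted_jumps=0):
    """Wall-clock time when the total logic delay is split evenly over stages."""
    # Each stage holds 1/stages of the logic plus a fixed per-stage overhead.
    cycle_ns = logic_ns / stages + latch_ns
    return pipeline_cycles(instructions, stages, mispredicted_jumps) * cycle_ns


# Hypothetical workload: 1000 instructions, 50 of them wrongly guessed jumps.
shallow = pipeline_cycles(1000, 14, mispredicted_jumps=50)
deep = pipeline_cycles(1000, 31, mispredicted_jumps=50)
print(shallow, deep)  # 1663 2530
```

The deeper pipeline needs far more cycles for the same work, but each of its cycles is shorter. Whether it comes out ahead depends on how often jumps are guessed wrong and on the per-stage overhead (`latch_ns` here), which is exactly the balance engineers have to simulate.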

Intel Core 2 Duo

The Core processor is Intel’s response to the death of clock speed. The Pentium 4’s route to more speed was to keep breaking the pipeline into more stages. Performance gains quickly plateaued, because the processor had to flush and repopulate its buffers whenever it changed the path of execution. Intel ended up with 20+ stages before abandoning this strategy. (The earlier Pentiums had only 10 to 12!) Core scaled back to 14 stages, but added more on-chip buffer space for reordering instructions that aren’t dependent on each other. Transistors are smaller, so rather than using them to get one instruction processed faster, Core uses more hardware in an attempt to predict the best possible execution pattern. That’s right, it executes instructions before you even know you want them.
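
The idea of reordering instructions that don’t depend on each other can be sketched in a few lines. This is only an illustration under loud assumptions: the `schedule` function, the (destination, sources) instruction format, the one-cycle latency for every instruction, and the two-wide issue width are all hypothetical, and it only tracks “reads a value written earlier” dependencies. It is not a description of Intel’s actual scheduler.

```python
def schedule(program, width=2):
    """Group instructions into cycles, issuing at most `width` per cycle.

    `program` is a list of (dest, sources) pairs in program order. An
    instruction may issue once every earlier instruction whose result it
    reads has finished in a previous cycle.
    """
    # For each instruction, find the earlier instructions it depends on.
    last_writer = {}
    deps = []
    for index, (dest, sources) in enumerate(program):
        deps.append({last_writer[s] for s in sources if s in last_writer})
        last_writer[dest] = index

    done = set()
    waiting = list(range(len(program)))
    cycles = []
    while waiting:
        # Issue the oldest instructions whose inputs are all available.
        ready = [i for i in waiting if deps[i] <= done][:width]
        cycles.append(ready)
        done.update(ready)
        waiting = [i for i in waiting if i not in ready]
    return cycles


# Hypothetical program: two independent load/add chains feeding a final add.
program = [
    ("a", ["x"]),       # 0: a = load x
    ("b", ["a"]),       # 1: b = a + 1
    ("c", ["y"]),       # 2: c = load y
    ("d", ["c"]),       # 3: d = c + 1
    ("e", ["b", "d"]),  # 4: e = b + d
]
print(schedule(program, width=2))  # [[0, 2], [1, 3], [4]]
print(len(schedule(program, width=1)))  # 5
```

Instruction 2 jumps ahead of instruction 1 because it doesn’t need instruction 0’s result, so the five instructions finish in three cycles instead of five.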

Core 2 is merely an upgraded Core 1; Core 2 doesn’t mean “a pair of Core 1s.” I love Marketing. The frequencies they run at are roughly the same, but Core 2 has approximately 290 million transistors while Core 1 has 151 million. Core 2 also boasts EM64T, Intel’s 64-bit instruction support. A Core Duo or Core 2 Duo has two cores (each with its own pipeline for Fetch/Decode/Execute/Store) on the same chip. The cores share an L2 cache so that they can work on the same data set without having to go out to main memory.

AMD Athlon 64 X2

AMD jumped off the frequency bandwagon long ago, when they started rating their processors as “1400+, 2000+, etc.” They saw that Intel’s Pentium 4 was going to bottleneck itself with its sole concentration on clock speed. AMD slowed the clock, but got more work done in each clock cycle. AMD also pushes 64-bit computing further in the execute stage, which allows more information to be processed in each instruction. Each Athlon 64 X2 core has its own L1 and L2 cache. On a dual-core chip, the two cores are linked on-die through a shared request interface and crossbar, while what AMD calls the “HyperTransport” bus connects the chip to the rest of the system. The consequence of two L2 caches is that more data transfer is required if two threads must work on the same data set. On the upside, there is more room for each core to work on separate processes. Finally, Quad FX is arguably the first push for consumer dual-processor systems. Yes, to make it more complicated, AMD wants you to have 2 processors, each with 2 cores and 2 L2 caches. Depending on cache size, AMD’s transistor count ranges from 120 to 250 million, with 15 stages, similar to Intel’s count but constructed into a different pipeline.
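
The “slower clock, more work per cycle” trade behind AMD’s model numbers comes down to simple arithmetic: throughput is clock speed times instructions finished per cycle. The sketch below uses made-up figures purely for illustration; they are not measurements of any real AMD or Intel chip, and the function names are hypothetical.

```python
def instructions_per_second(clock_ghz, instructions_per_cycle):
    """Throughput = cycles per second * instructions finished per cycle."""
    return clock_ghz * 1e9 * instructions_per_cycle


def equivalent_clock_ghz(clock_ghz, instructions_per_cycle, rival_ipc):
    """Clock a rival chip would need to match this chip's throughput."""
    return clock_ghz * instructions_per_cycle / rival_ipc


# Hypothetical chips: one at 2.0 GHz doing 1.5 instructions per cycle,
# versus a rival that only manages 1.0 instruction per cycle.
print(instructions_per_second(2.0, 1.5))    # 3000000000.0
print(equivalent_clock_ghz(2.0, 1.5, 1.0))  # 3.0
```

In this invented case the 2.0 GHz chip keeps pace with a 3.0 GHz rival, which is the kind of comparison a rating like “2000+” is meant to suggest.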

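Intel has responded to AMD’s Quad FX with the Intel Core 2 Quad. The two are very different, however: Quad FX uses two separate processors, while the Core 2 Quad puts all four cores in one package, built from two dual-core dies whose paired cores each still share an L2 cache.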

Unfortunately, both of these processors tell us the same thing. The days of pushing your hardware to the limit are on their way out. Speed is less about brute force and more about elegance in design. How can we continue to be hardCORE? The best answer I can give for now is Knowledge.

Filed under: PC Hardware

3 Comments »

Comment by Jim Conklin
2007-07-10 10:52:33

Very informative technical piece.

 
Comment by Jeffrey
2007-07-11 13:43:19

So, just out of curiosity, when can we expect to see motherboards that take four of the quad-core chips by either AMD or Intel? I’ve built a few computational clusters and you get tremendous bang for the buck with them compared to big iron from SGI, CRAY or SUN, but you eventually run up against a wall of diminishing returns because with 128+ nodes in a rack, the industry’s 2-5% accepted failure rate comes after you in the form of LOTS of hardware failures. Now I’m thinking fewer nodes with as many cores per node as possible...

 
Comment by Geo Miller
2007-07-13 11:32:19

A good point. Intel is talking about their “80-core” processor — so clearly computing clusters are going to have fewer nodes in the future.

 