The Long Mode Chronicles
How the World Became x86-64 Inside
Back in 2024, I was looking up details about the timeline of the x86-64 transition when I came across posts by Bob Colwell, former Pentium Pro chief architect at Intel. He posted a response on Quora to a question that asked why AMD developed the x86-64 ISA, and if Intel with their vast resources would have been able to develop both. His answer was that Intel did implement a version of 64-bit x86 in Pentium 4, but it was fused off and no one could use it. Higher-ups objected that it would jeopardize their IA64 efforts and overrode him, but his thinking was that he could leave the logic in, disabled. If or when Intel was ready to embrace 64-bit x86, Intel could move quickly. A screenshot of his post is below.
The fact that Intel had a 64-bit x86 implementation isn’t new. The tension behind the scenes and a former chief architect openly admitting this was, however. Some tech sites picked up on this information when I shared it, but the way articles framed this era didn’t seem right to me. Tom’s Hardware wrote, “The nugget indicates that Intel could have beaten AMD to the x86-64 punch if the former wasn't dead-set on the x64-only Itanium line of CPUs.”
Meanwhile, Techspot’s title read, “Intel could have beaten AMD to 64-bit transition but wrongly chose not to.” My question to these claims is, “Are they true?”
I recently came across this technical report released in February of 2007 by two Microsoft employees, Matthew Kerner and Neil Padgett, that answers these questions and provides historical context. This post is primarily a summary of the report, which is composed of interviews with employees at Microsoft, AMD, and Intel who were involved in the decision making at the time. I’ve filled in some additional details based on information from other sources I’ve found. While this isn’t the first piece that talks about the history of this transition, very few have covered it from the perspective of insiders at all three companies who worked on this.
First, the key characters that were interviewed for the paper are listed below.
Secondly, there will be a lot of codenames used in this post, so to avoid confusion, here are definitions.
IA64: Intel Architecture 64, Intel’s 64-bit VLIW ISA
VLIW: Very Long Instruction Word
EPIC: Explicitly Parallel Instruction Computing, Intel’s version of VLIW
Merced: code name for IA64 processor, later branded as Itanium
Prescott: processor in the Pentium 4 line that was the first to officially support Intel64
AMD64: AMD’s proposed ISA to extend x86 to 64 bits. Also referred to as x86-64 and x64
Intel64: Intel’s implementation of AMD64. Also referred to as Yamhill, Clackamas, and EM64T
Sledgehammer: AMD’s codename for their first chip supporting the AMD64 ISA, eventually branded as Opteron. Also referred to as K8
The Motivation for IA64
The IA64 effort became public through a joint announcement by Intel and HP in 1994. During this time period, Intel engineers were concerned about the scalability of CISC architectures and believed that x86 was approaching a performance ceiling while RISC architectures would continue to hold the performance lead. They were worried that the increase in complexity would be difficult to manage and that a clean sheet was necessary to achieve further improvements in performance. To put this in perspective, Intel’s flagship microprocessor at this time was the Pentium series, with the successor Pentium Pro (“P6”) still in active development. Pentium was superscalar but in-order; out-of-order execution wasn’t added until Pentium Pro.
When the IA64 project was started it was perceived that the performance of RISC machines would outstrip that of [Complex Instruction Set Computing (CISC)] machines. … They felt that they could get higher performance with a [Very Long Instruction Word (VLIW)] architecture than they could with a CISC architecture. - Dave Cutler, “A History of Modern 64-bit Computing”
While Intel considered alternatives like DEC’s Alpha to succeed x86, another opportunity came along. HP was embarking on a new architecture because they believed that a VLIW chip could outperform a RISC chip. The idea was that compilers could optimize code across a larger instruction window and extract significantly more instruction level parallelism (ILP) than a CPU could at runtime. Meanwhile, HP was reluctant to pour more money into their fabs, so they considered partnering with Intel on account of their superior fabrication process. HP had more expertise in processor architecture with PA-RISC, Intel had the fabs, and both companies had compiler and OS expertise, so eventually HP approached Intel to form a partnership that became IA64.
There was also a business case for IA64. Other x86 CPU vendors (AMD, Cyrix, IDT) were allowed to make PC processors that were socket-compatible with Pentium. As the market leader, Intel had to continuously innovate to avoid x86 becoming a commodity. Developing a new ISA with patent protection would allow Intel to get a head start on the competition.
IA64 Development
While the IA64 ISA was developed internally by Intel and HP, Intel also invested a lot of time and effort in software. Intel engineers made the initial IA64 Windows port, which was later taken over by Microsoft.
One feature of IA64 was the ability to choose different levels of speculation. More aggressive speculation led to better performance, but it also resulted in much larger generated code size. For example, a feature called instruction predication allowed the processor to execute both paths of a branch and only commit instructions once the branch was resolved. Predicated instructions had to be executed, even if the instructions in that branch path were never committed. Early on, it was difficult to fill the IA64 instruction bundles, and nops made up 20% to 30% of code streams. While the Intel compiler team used many of the new IA64 features to maximize performance, the Microsoft compiler team chose a reduced degree of speculation to achieve a smaller code footprint.
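To make the cost of predication concrete, here is a minimal toy sketch in Python. It is not real IA64 semantics or anything taken from the report. The function name `abs_diff_predicated` and the slot model are hypothetical, chosen only to show that both arms of a branch take up execution work while only one arm commits.

```python
def abs_diff_predicated(a, b):
    """Toy model of if-converted code for: r = a - b if a > b else b - a.

    Hypothetical illustration only; real IA64 predication is far richer.
    """
    # A compare sets a pair of complementary predicates.
    p1 = a > b
    p2 = not p1
    # Both arms occupy issue slots and are computed, whatever the predicates say.
    slots = [(p1, a - b), (p2, b - a)]
    executed = committed = 0
    result = None
    for predicate, value in slots:
        executed += 1
        if predicate:  # only a true predicate lets the result commit
            result = value
            committed += 1
    return result, executed, committed


print(abs_diff_predicated(7, 3))  # (4, 2, 1)
```

In this toy, two instructions are executed for every one committed, which is the trade the paragraph above describes: no branch to mispredict, in exchange for wasted work and larger code.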
While the first generation of Itanium parts (“Merced”) was targeted for 1999, it faced repeated schedule slips and ended up shipping in 2001 with a clock speed of 800 MHz. The initial target market was high-end workstations, but over time the scope of the market shrank to servers and then to high-end servers.
The Birth of AMD64
Fred Weber speculated that three groups of people drove Intel’s choice of a new ISA for IA64 rather than an extension of x86:
People who believed a new ISA was required to achieve higher performance.
People who believed a new ISA, with strong patent protection, was required to get ahead of the competition.
People who believed that Intel had enough influence on the market to make any architecture a success.
In addition, Intel made it difficult for competitors to follow their lead. AMD, originally a second source of Intel-compatible processors, was forced to develop their own x86 processor in-house with K5. With the Pentium II, Intel changed sockets, and competitors would have had to pay licensing fees if they wanted to use Intel’s chipsets. Rather than pay, AMD opted to develop their own chipsets for K7/Athlon. Weber credits overcoming these struggles, first with developing CPUs and later their own platforms, as a key contributor to AMD64’s success.
Weber said that AMD considered various options for a 64-bit ISA, including Alpha, SPARC, MIPS, PowerPC, and even IA64, along with support for legacy x86 binaries. However, Weber understood how difficult it was to get developers to adopt a new ISA. With IA64’s focus on scientific and numerical workloads rather than general computing, AMD saw an opportunity to attack an area Intel had left exposed. On the systems side, they believed they could build 2-way and 4-way “glueless”[1] symmetric multiprocessing (SMP) systems to break into the server market. The server market’s higher margins, along with the belief that server market share would lead to high volumes in the desktop PC market, eventually led to AMD betting on a 64-bit extension to x86. Succeeding in this effort opened the possibility for AMD to escape Intel’s shadow. It was a big bet given the relative size of AMD compared to Intel, but it was also the one with the best chances.
AMD eventually announced AMD64 in 1999 and published the ISA spec in 2000. Their first 64-bit processor, code named “Sledgehammer”, was presented at the Microprocessor Forum in 1999 and publicly announced in 2001. This means that the project began after the existence of x86 processors with superscalar and out-of-order execution, so AMD architects were confident that they could continue to improve x86 performance beyond what was believed possible less than a decade prior.
AMD approached Microsoft to collaborate on an AMD64 port for Windows and was also open to feedback on their proposed ISA. This stood in contrast to Intel’s approach to IA64, where the spec was developed internally and released without much feedback from outside the Intel/HP alliance. While there was no contract for the AMD/Microsoft partnership, in the end Microsoft developed the AMD64 port of Windows and contributed several features to the ISA. I won’t go into the technical details of the features, but they were:
RIP-relative addressing
CR8 (control register), which improved interrupt performance over the task priority register (TPR)
SwapGS
Exception trap frame improvements
NX bit
Fast FP switch
Transparent multicore support, which used the same method for detecting multiple cores as SMT/HyperThreading on Intel processors
In parallel, AMD also worked with the open source community. CodeSourcery developed a Pascal compiler, SuSE ported the C and Fortran compilers, and the community also made a Linux port. Keep in mind that Linux was less than 10 years old at this point but was rapidly becoming more popular.
The majority of features from x86 were unchanged in AMD64, and while this may not have been the cleanest approach, from a development perspective it was the easiest. Porting to AMD64, from OS and tools to applications, was much smoother. The same microarchitecture was used in 32-bit mode and 64-bit mode, so improvements to one mode applied to the other. In contrast, supporting two different ISAs on the same silicon would often force tradeoffs.
Market Reception
One response to IA64 versus AMD64 worth noting is expressed by the following quotes from Linus Torvalds, the outspoken creator of Linux. These are sourced from online articles that are no longer available.
“[Intel] threw out all the good parts of the x86 because people thought those parts were ugly. They aren't ugly, they're the 'charming oddity' that makes it do well.”
"Right now Intel doesn't even seem to be interested in '64-bit for the masses', and maybe IBM will be. AMD certainly seems to be serious about the 'masses' part, which in the end is the only part that really matters."
"Code size matters. Price matters. Real world matters. And ia-64 at least so far falls flat on its face on ALL of these."
While not everyone was as direct as Torvalds, the market spoke with sales numbers. Itanium sales in the first full quarter of sales in 2003 were estimated at fewer than 2,500 servers, with a focus on scale-up SMP systems. Opteron, on the other hand, sold an estimated 150,000 units in its first year and finally penetrated the server market. While these were big numbers for AMD given its initial server market share of 0%, it’s worth noting that Intel sold around 6 million 32-bit servers in that same period.
While Itanium didn’t hit the projected targets, the release of Intel64 products in 2004 and subsequent price war did slow down some of the Opteron share gain. By Q1 2006, Opteron had reached 22.9% server market share.
Intel64
One of the most interesting parts of this saga was how the development of Intel64 came about. After the Sledgehammer presentation at the Microprocessor Forum, Intel felt they could not ignore it and had to plan a response. They knew that no matter how Intel responded, AMD was likely to get 64-bit x86 silicon first.
One option they considered was to preannounce a competing version of a 64-bit x86[2]. However, this plan had a few problems. The industry, including Microsoft, was unlikely to get behind yet another competing ISA without a compelling reason. More importantly, any public acknowledgement of this effort would have jeopardized IA64 adoption. Developing this in secret wasn’t an option either, as they relied on outside vendors for OS and toolchain support.
Eventually Bhandarkar proposed an AMD64-compatible ISA in mid-2000. This effort came under different names like Yamhill, Clackamas, and EM64T, but it was finally branded as Intel64. Intel knew that if vendors were coding for AMD64, support for Intel64 could be added quickly.
Once Intel decided on this plan, they monitored Windows source code for changes related to AMD64, created their own internal builds, and tested them before eventually sharing these plans with Microsoft. This plan was disclosed to partners in 2002, pre-production testing was done in 2003, and Intel64 production silicon shipped in 2004 in a later stepping of their Prescott line of chips.
Conclusion
Going back to the original question from the beginning of this post, was it true that “Intel could have beaten AMD to the x86-64 punch if the former wasn't dead-set on the x64-only Itanium line of CPUs”? Is it true, as the Techspot article claims, that “Intel could have beaten AMD to 64-bit transition but wrongly chose not to”? The answer to both of these is simply “no”. Intel was in a partnership with HP, and any public effort to work on x86-64, or for that matter any other 64-bit ISA, would have jeopardized IA64. While the benefit of hindsight allows us to cast judgment that IA64 was doomed to fail, there was genuine belief for a long time that IA64 was the future due to Intel’s sheer size and influence and AMD’s underdog status. This quote from a December 2005 CNet article expresses this sentiment.
The initial Itanium prospects were impressive. All the major server and operating systems companies jumped on board.
…
"The momentum was huge," Gwennap said. "There was this incredible anticipation and expectation that this was going to be the next big thing. Intel was on a roll, and with HP backing them, then other companies started jumping on the bandwagon."
Many factors led to IA64’s muted reception, from Itanium’s poor legacy x86 performance to its high cost, but one could argue the final nail in the coffin for this project was Intel64.
Intel's moves to keep Xeon competitive hurt Itanium. “Once AMD showed Intel what to do with x86--adding 64-bit support--that was the end of Itanium right there. When Intel announced it was going to do (64-bit x86 chips), it was obvious Itanium was irrelevant for anything but the high end of the market,” said Peter Glaskowsky, an Envisioneering analyst and chief architect of start-up MemoryLogix.
While Colwell’s Quora post doesn’t refer to his 64-bit implementation of x86 as Intel64 or any of its variants, the timeline makes it likely. Since his version of events is slightly different from what was described in the paper, it’s only fair to include his perspective on how things came about. He posted a second answer on Intel’s 64-bit journey, which I’ve included below.
While the report credits Bhandarkar with the decision to pursue what would eventually become Intel64, it’s not clear how Colwell’s depiction of the tension with upper management fits into all of this. One possibility is that the engineers were allowed to work on Intel64 as long as it was not publicized. However, Colwell also stated that he was ordered to remove the 64-bit logic from Pentium 4, and that he compromised by fusing it off instead.
Finally, I should at least mention rumors of Intel’s “P7” project that was supposed to follow Pentium Pro. It was the original 64-bit x86 project that was canceled and replaced with the IA64 effort. This article in Real World Technologies goes into a little more detail.
In early 1993 Intel’s Santa Clara processor design team had just finished off the P5 project (Pentium) and started work on the P7. Intel had initiated a new strategy to operate two separate x86 processor development teams in parallel, in an overlapped fashion. Under this strategy, when the first team finishes the generation N processor design it starts work on the generation N+2 processor, while the second team is in the middle of the generation N+1 processor project. The hope was to cut the four-year intervals between new processor cores in half. When work on the P7 started up in Santa Clara, Intel’s P6 team in Hillsboro Oregon was about 18 months and a lot of hard work away from delivering the Pentium Pro. The P7 was a powerful 64-bit x86 compatible successor to the P6 envisioned to have around 20 million transistors or nearly four times as many as the Pentium Pro. In some ways Intel’s original P7 project conceptually resembles AMD’s K8 “sledgehammer”, the 64-bit successor to the K7 Athlon.
The P7 progressed only far along enough for Intel’s engineers to realize that extending x86 to 64 bits, and staying competitive with RISC processors, would be challenging to say the least. Around this time Intel entered into an alliance with Hewlett Packard to develop a high performance 64-bit processor incorporating variable length VLIW technology from HP’s “Wide word” extension to its Precision Architecture RISC architecture. In 1994 the Santa Clara team dropped all work on the 64 bit x86 processor design called P7 and started on the first implementation of the new IA-64 architecture arising from the Intel-HP alliance, a processor later known as Merced. The Merced project adopted the P7 designation, and its troubled offspring is targeted to reach the market later this year under the name “Itanium”.
In retrospect, supporting two completely different ISAs on the same silicon ended up being very costly, even if it wasn’t the main reason for IA64’s lack of adoption. Chip design is an exercise in economics, where area is yield and yield is cost. Unless you’re a certain fruit-based company that owns the full stack from OS and tools to hardware, getting the industry to migrate to a new ISA is extremely difficult. This lesson applies to today as ARM architectures are trying to displace x86-64 in both consumer and server markets. It can still happen, but maybe not on the timeline that some are hoping for.
This story is also an example of serendipitous timing. It’s easy to criticize the IA64 project in retrospect, but the decisions were made with the best available information at the time. It was only a few years later that the possibilities began to shift.
Edit: This is the link to the technical report. However, if you’re having trouble accessing it, I have a copy of it here.
[1] “Glueless” describes a point-to-point multiprocessor system, that is, one that does not rely on a shared bus system with a central memory controller and coherency point.
[2] The paper described this as “a competing ISA with a RISC-like 64-bit extension to x86”, which I interpreted to mean a 64-bit version of x86 that was incompatible with AMD64.

Hans De Vries figured it out from die shots in March/April 2003:
April 2003: http://www.chip-architect.org/news/2003_04_20_Looking_at_Intels_Prescott_part2.html
March 2003: http://www.chip-architect.org/news/2003_03_26_Prescott_clues_for_Yamhill.html
I'm a little disappointed there's no mention of Solaris, which had amd64 support with the 2005 shipment of Solaris 10 FCS.