Agner`s CPU blog

Many software programmers consider Intel's compiler the best optimizing compiler on the market, and it is often the preferred compiler for the most critical applications. Likewise, Intel is supplying a lot of highly optimized function libraries for many different technical and scientific applications. In many cases, there are no good alternatives to Intel's function libraries.

Unfortunately, software compiled with the Intel compiler or the Intel function libraries has inferior performance on AMD and VIA processors. The reason is that the compiler or library can make multiple versions of a piece of code, each optimized for a certain processor and instruction set, for example SSE2, SSE3, etc. The system includes a function that detects which type of CPU it is running on and chooses the optimal code path for that CPU. This is called a CPU dispatcher. However, the Intel CPU dispatcher does not only check which instruction set is supported by the CPU, it also checks the vendor ID string. If the vendor string says "GenuineIntel" then it uses the optimal code path. If the CPU is not from Intel then, in most cases, it will run the slowest possible version of the code, even if the CPU is fully compatible with a better version.

I have complained about this behavior for years, and so have many others, but Intel have refused to change their CPU dispatcher. If Intel had advertised their compiler as compatible with Intel processors only, then there would probably be no complaints. The problem is that they are trying to hide what they are doing. Many software developers think that the compiler is compatible with AMD processors, and in fact it is, but unbeknownst to the programmer it puts in a biased CPU dispatcher that chooses an inferior code path whenever it is running on a non-Intel processor. If programmers knew this fact they would probably use another compiler. Who wants to sell a piece of software that doesn't work well on AMD processors?

Because of their size, Intel can afford to put more money into their compiler than other CPU vendors can. The Intel compiler is relatively cheap, it has superior performance, and the support is excellent. Selling such a compiler is certainly not a profitable business in itself, but it is obviously intended as a way of supporting Intel's microprocessors. There would be no point in adding new advanced instructions to the microprocessors if there were no tools to use these instructions. AMD is also making a compiler, but the current version supports only Linux, not Windows.

Various people have raised suspicion that the biased CPU dispatching has made its way into common benchmark programs (link link). This is a serious issue indeed. We know that many customers base their buying decision on published benchmark results, and a biased benchmark means an unfair market advantage worth billions of dollars.

The legal battle

AMD have sued Intel for unfair competition at least since 2005, and the case has been settled in November 2009. This settlement deals with many issues of unfair competition, apparently including the Intel compiler. The settlement says:

2.3 TECHNICAL PRACTICES

Intel shall not include any Artificial Performance Impairment in any Intel product or require any Third Party to include an Artificial Performance Impairment in the Third Party's product. As used in this Section 2.3, "Artificial Performance Impairment" means an affirmative engineering or design action by Intel (but not a failure to act) that (i) degrades the performance or operation of a Specified AMD product, (ii) is not a consequence of an Intel Product Benefit and (iii) is made intentionally to degrade the performance or operation of a Specified AMD Product. For purposes of this Section 2.3, "Product Benefit" shall mean any benefit, advantage, or improvement in terms of performance, operation, price, cost, manufacturability, reliability, compatibility, or ability to operate or enhance the operation of another product.

In no circumstances shall this Section 2.3 impose or be construed to impose any obligation on Intel to (i) take any act that would provide a Product Benefit to any AMD or other non-Intel product, either when such AMD or non-Intel product is used alone or in combination with any other product, (ii) optimize any products for Specified AMD Products, or (iii) provide any technical information, documents, or know how to AMD.

This looks like a victory for AMD. If we read "any Intel product" as Intel's compilers and function libraries, "any Third Party" as programmers using these compilers and libraries, and "Artificial Performance Impairment" as the CPU dispatcher checking the vendor ID string; then the settlement puts an obligation on Intel to change their CPU dispatcher. I will certainly check the next version of Intel's compiler and libraries to see if they have done so or they have found a loophole in the settlement.

Interestingly, this is not the end of the story. Only about one month after the AMD/Intel settlement, the US Federal Trade Commission (FTC) filed an antitrust complaint against Intel. The accusations in the FTC complaint are unusually strong:

Intel sought to undercut the performance advantage of non-Intel x86 CPUs relative to Intel x86 CPUs when it redesigned and distributed software products, such as compilers and libraries.
[...]
To the public, OEMs, ISVs, and benchmarking organizations, the slower performance of non-Intel CPUs on Intel-compiled software applications appeared to be caused by the non-Intel CPUs rather than the Intel software. Intel failed to disclose the effects of the changes it made to its software in or about 2003 and later to its customers or the public. Intel also disseminated false or misleading documentation about its compiler and libraries. Intel represented to ISVs, OEMs, benchmarking organizations, and the public that programs inherently performed better on Intel CPUs than on competing CPUs. In truth and in fact, many differences were due largely or entirely to the Intel software. Intel's misleading or false statements and omissions about the performance of its software were material to ISVs, OEMs, benchmarking organizations, and the public in their purchase or use of CPUs. Therefore, Intel's representations that programs inherently performed better on Intel CPUs than on competing CPUs were, and are, false or misleading. Intel's failure to disclose that the differences were due largely to the Intel software, in light of the representations made, was, and is, a deceptive practice. Moreover, those misrepresentations and omissions were likely to harm the reputation of other x86 CPUs companies, and harmed competition.
[...]
Some ISVs requested information from Intel concerning the apparent variation in performance of identical software run on Intel and non-Intel CPUs. In response to such requests, on numerous occasions, Intel misrepresented, expressly or by implication, the source of the problem and whether it could be solved.
[...]
Intel's software design changes slowed the performance of non-Intel x86 CPUs and had no sufficiently justifiable technological benefit. Intel's deceptive conduct deprived consumers of an informed choice between Intel chips and rival chips, and between Intel software and rival software, and raised rivals' costs of competing in the relevant CPU markets. The loss of performance caused by the Intel compiler and libraries also directly harmed consumers that used non-Intel x86 CPUs.

The remedy that the FTC asks for is also quite farreaching:

Requiring that, with respect to those Intel customers that purchased from Intel a software compiler that had or has the design or effect of impairing the actual or apparent performance of microprocessors not manufactured by Intel ("Defective Compiler"), as described in the Complaint:

Intel provide them, at no additional charge, a substitute compiler that is not a Defective Compiler;

Intel compensate them for the cost of recompiling the software they had compiled on the Defective Compiler and of substituting, and distributing to their own customers, the recompiled software for software compiled on a Defective Compiler; and

Intel give public notice and warning, in a manner likely to be communicated to persons that have purchased software compiled on Defective Compilers purchased from Intel, of the possible need to replace that software.

Maybe the FTC has decided that the AMD/Intel settlement was not a fair and sufficient remedy against Intel's monopoly behavior? The settlement compensates AMD, but not VIA and other microprocessor vendors, and not the customers who have been harmed by insufficient competition and by the "defective" software produced with the Intel compiler.

My own findings

When I started testing Intel's compiler several years ago, I soon found out that it had a biased CPU dispatcher. Back in January 2007 I complained to Intel about the unfair CPU dispatcher. I had a long correspondence with Intel engineers about the issue, where they kept denying the problem and I kept providing more evidence. They said that:

The CPU dispatch, coupled with optimizations, is designed to optimize performance across Intel and AMD processors to give the best results. This is clearly our goal and with one exception we believe we are there now. The one exception is that our 9.x compilers do not support SSE3 on AMD processors because of the timing of the release of AMD processors vs. our compiler (our compiler was developed before AMD supported SSE3). The future 10.x compilers, which enter beta this quarter and release around the middle of the year, will address this now that we've had time to tune and adjust to the new AMD processors.

Sounds nice, but the truth is that the CPU dispatcher didn't support SSE or SSE2 or any higher SSE in AMD processors and still doesn't today (Intel compiler version 11.1.054). I have later found out that others have made similar complaints to Intel and got similarly useless answers (link link).

The Intel CPU dispatcher does not only check the vendor ID string and the instruction sets supported. It also checks for specific processor models. In fact, it will fail to recognize future Intel processors with a family number different from 6. When I mentioned this to the Intel engineers they replied:

You mentioned we will not support future Intel processors with non-'6' family designations without a compiler update. Yes, that is correct and intentional. Our compiler produces code which we have high confidence will continue to run in the future. This has the effect of not assuming anything about future Intel or AMD or other processors. You have noted we could be more aggressive. We believe that would not be wise for our customers, who want a level of security that their code (built with our compiler) will continue to run far into the future. Your suggested methods, while they may sound reasonable, are not conservative enough for our highly optimizing compiler. Our experience steers us to issue code conservatively, and update the compiler when we have had a chance to verify functionality with new Intel and new AMD processors. That means there is a lag sometime in our production release support for new processors.

In other words, they claim that they are optimizing for specific processor models rather than for specific instruction sets. If true, this gives Intel an argument for not supporting AMD processors properly. But it also means that all software developers who use an Intel compiler have to recompile their code and distribute new versions to their customers every time a new Intel processor appears on the market. Now, this was three years ago. What happens if I try to run a program compiled with an old version of Intel's compiler on the newest Intel processors? You guessed it: It still runs the optimal code path. But the reason is more difficult to guess: Intel have manipulated the CPUID family numbers on new processors in such a way that they appear as known models to older Intel software. I have described the technical details elsewhere.

Perhaps the initial design of Intel's CPU dispatcher was indeed intended to optimize for known processor models only, without regard for future models. If any of my students had made such a solution that was not future-oriented, I would consider it a serious flaw. Perhaps the Intel engineers discovered the missing support for future processors too late so that they had to design the next generation of their processors in such a way that they appeared as known models to existing Intel software.

After Intel had flatly denied to change their CPU dispatcher, I decided that the most efficient way to make them change their minds was to create publicity about the problem. I contacted several IT magazines, but nobody wanted to write about it. Sad, but not very surprising, considering that they all depend on advertising money from Intel. The only publicity was my own optimization manual where I have described the problem in detail and given instructions on how to replace the unfair CPU dispatcher. I wonder why AMD didn't create public awareness about the problem. Were they obliged to keep quiet about an ungoing lawsuit? And what about VIA/Centaur?

Workarounds

At present, we don't know if or when Intel will make a new compiler and new software libraries that do not check the vendor ID string. In the meantime, here is what we can do about the problem.

Use another compiler. In my tests, the Gnu compiler for Linux has an optimizing performance similar to the Intel compiler, but the Gnu function library (glibc) is inferior. All other compilers gave lower performance in my tests. There is no other Windows compiler with a similar performance, not even the Gnu compiler for Windows.
Use the Intel software and patch the CPU dispatcher. In my C++ manual, I have provided the code for alternative CPU dispatchers for Intel's compiler and function libraries and descriptions on how to patch them into your software. This, of course, relies on undocumented details of the Intel software. This dispatcher-patch can improve performance on non-Intel processors considerably in many cases.
Never trust any benchmark unless it is open source and compiled with a neutral compiler, such as Gnu or Microsoft.
It is possible to change the CPUID of AMD processors by using the AMD virtualization instructions. I hope that somebody will volunteer to make a program for this purpose. This will make it easy for anybody to check if their benchmark is fair and to improve the performance of software compiled with the Intel compiler on AMD processors.

Links

My Discussion in Aceshardware forum 2007.

Discussion in AMD Developer Forums 2008.

My Discussion in AMDzone 2009.

Discussion in comp.arch 2004.

Complaint to Intel 2004, discussion in slashdot.org.

Mark Mackey, complaint to Intel 2005.

PCMark 2005 benchmark proven unfair. Arstechnica.

Testimony by John Oram regarding BAPCo benchmark organization.

Comment on AMD Developer Central 2005.

AMD files lawsuit 2005.

AMD untitrust complaints 2005.

Settlement agreement between AMD and Intel, 2009.

FTC complaint 2009.

Technical details in my C++ optimization manual.

[Added later:]

Author: Agner	Date: 2019-12-29 10:32
You need to link the patch before any Intel function library. I just found another function that needs patching. The function named `int mkl_serv_intel_cpu()` returns 1 when running on an Intel CPU and 0 when running on another brand of CPU. I have provided a patch at /www.agner.org/optimize/intel_dispatch_patch.zip

Author: Agner	Date: 2010-10-01 06:11
Here is a CPUID manipulation program for AMD processors that I have received from a Russian friend. link. I haven't tested it. Use it at your own risk. Explanation.

Author:	Date: 2012-01-30 19:15
I just came across this: https://github.com/jimenezrick/patch-AuthenticAMD Interesting. Not what I want though, I'm googling for a way to make KVM return "GenuineIntel" for the CPUID instruction.

Author: Agner	Date: 2011-02-09 02:15
AVK wrote: Does it mean that AMD CPUs finally get a feature to change its CPUID.Vendor string like VIA CPUs have? Unfortunately not. It is read-only.

Author:	Date: 2010-02-01 18:50
Posted a blog on compilers/benchmarks - Chipping Away the Façade on Compilers and Benchmarks for AMD Processors blogs.amd.com/work/2010/01/22/chipping-away-the-facade-on-compilers-and-benchmarks-for-amd-processors/

Author:	Date: 2014-09-16 04:37
I recently peeked at the generated code for the intel c++ compiler suite 2015 when on x64 not only is the cpu dispatcher present but it seem there is an entry for AuthenticAMD in it I sadly don't have enough knowledge about x64 assembly to know what is going on

Author: Agner	Date: 2011-12-25 04:19
Daniel wrote: Seems like the x64 equivalent for AVX is smaller than the 32bit number. Adding a simple switch for 64bit and changing the the value at the line __intel_cpu_indicator = 0x20000; from 0x20000 to 1<<16 for the 64bit branch solves the problem. Please email me with an example of a function that fails with __intel_cpu_indicator = 0x20000. I will investigate it.

Author: David	Date: 2012-05-19 16:35
For the static IPP library, can't a custom dispatcher be built with something like... if not "GenuineIntel" then if cpu reports support for SSE2 / 3 / 4 / etc.. then staticinitcpu(the intelcpu compatible with those instructions)?? Has such a dispatcher already been created by AMD or someone??

Author:	Date: 2013-05-20 18:05
Open-source Yeppp! library (www.yeppp.info) provides vector mathematical functions optimized for both Intel and AMD processors, and demonstrates performance comparable to the best vendor-optimized libraries: www.yeppp.info/home/yeppp-performance-numbers

Author: Agner	Date: 2013-05-21 07:22
Marat Dukhan wrote: Open-source Yeppp! library (www.yeppp.info) provides vector mathematical functions optimized for both Intel and AMD processors, and demonstrates performance comparable to the best vendor-optimized libraries: www.yeppp.info/home/yeppp-performance-numbers Thank you. That is good news indeed. I hope that you will add more functions to the library.

Author: Agner	Date: 2017-04-12 00:52
The single-thread instructions per clock rate of Ryzen is higher than for any Intel processor, except for 256-bit vector code. I am testing the Ryzen right now and the test results are coming soon. Please be patient.

Author:	Date: 2019-02-12 14:54
It's been almost 2 years since you mentioned that you were testing ryzen, have you finished? I don't see any results. You may have just placed them somewhere and never replied to this thread. I did not see them.

Author: Agner	Date: 2019-02-13 00:33
The test results for Ryzen are in my optimization manual. It can execute five instructions per clock in small loops.

Author: Agner	Date: 2017-07-27 00:48
Intel has a long history of monopoly-like behavior and expensive legal battels, according to this video blogger: www.youtube.com/watch?v=osSMJRyxG0k.

Author: Walker	Date: 2020-06-29 03:27
Check d3dx9_43.dll and d3dx10_43.dll. Branch selection dependent on cpuid "GenuineIntel" is still there and functional. By Microsoft. lol And AMD did lick Microsoft all over with their Windows 10 only(not, 7 runs perfectly) support for ryzen in contrast. Pathetic.

Author: Agner	Date: 2022-08-08 04:58
The new Intel compiler is better. For latest updates, see my new forum: link.

Agner`s CPU blog

Will Intel be forced to remove the "cripple AMD" function from their compiler?

The legal battle

My own findings

Workarounds

Links

Out of court settlement with FTC

Mathematica

Mathcad

Matlab

Preliminary test results for Matlab

Intel Math Kernel Library

Intel Vector Math Library

Intel Performance Primitives

Intel Threading Building Blocks

Intel standard C library and standard math library

Intel Short Vector Math Library

Intel C++ compiler

Other Intel products