Hi Agner,
I have read your blog and tutorial (Optimizing software in C++), with special focus on section 13.7 (pages 139-141).
I am trying to run MKL 2019 on an AMD CPU under Ubuntu 18.04 by changing the CPU dispatcher as per your code snippet in intel_mkl_feature_patch.c (from asmlib.zip).
I added your CPU dispatcher snippet to the application and tried to compile it, additionally linking libmkl_core.so, but I got the following errors:
undefined reference to __intel_mkl_feature_indicator
undefined reference to __intel_mkl_feature_indicator_x
undefined reference to __intel_mkl_features_init_x
undefined reference to __intel_mkl_feature_indicator_x
undefined reference to __intel_mkl_feature_indicator
When I link dispatchpatch64.o from asmlib.zip directly, I get an undefined reference to intel_mkl_patch.
Could you please let me know what other system dependencies are required to use your patch? Or has the CPU dispatcher in the new MKL changed, so that the above functions no longer exist?
Thanks,
Biplab Raut
Deng wrote:
Recently I tested inteldispatchpatch.zip (2014-07-30) from asmlib on the MKL library in Intel Parallel Studio 2015 Update 3. It gave me an error like "Intel MKL ERROR: CPU 1 is not supported." on AMD servers (from mkl_blas_xdgemv ()). If I remove 'dispatchpatch64.o' during linking, MKL works on AMD servers.
It seems to me that the dispatch patch does not work on newer Intel MKL. Do you know of another way to patch it correctly?
Agner Fog wrote:
inhahe wrote:
I think it's arguable whether or not Intel crippled AMD via an "affirmative engineering or design action" as opposed to a "failure to act" (as distinguished in the settlement).
Checking for vendor ID is an affirmative action. The grey area is whether they are optimizing for specific CPU models or for specific instruction sets. There is only one case where they distinguish between different CPU models that have the same instruction set, namely Pentium 4 versus Pentium M. In most cases, however, they use the same code path for both, or the two paths are identical or almost identical. The distinction may be unimportant from a technical point of view, but it may give Intel a legal excuse for claiming that they are optimizing for specific CPU models.
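To make the distinction between checking the vendor ID and checking the instruction set concrete, here is a minimal sketch in C. It is only an illustration and not Intel's actual dispatcher code: the names `pick_by_vendor`, `pick_by_feature`, `CPU_HAS_SSE2` and the `code_path` enum are hypothetical, and the vendor string and feature bits are passed in as parameters, whereas a real dispatcher would read them from the CPUID instruction.

```c
#include <string.h>

/* Hypothetical feature flag for this sketch; real code would read CPUID. */
enum { CPU_HAS_SSE2 = 1 };

typedef enum { PATH_GENERIC, PATH_SSE2 } code_path;

/* Vendor-based dispatch: the SSE2 path is chosen only when the vendor
   string is "GenuineIntel", even if another CPU reports SSE2 support. */
code_path pick_by_vendor(const char *vendor, unsigned features) {
    if (strcmp(vendor, "GenuineIntel") == 0 && (features & CPU_HAS_SSE2))
        return PATH_SSE2;
    return PATH_GENERIC;
}

/* Feature-based dispatch: the vendor string is ignored and only the
   reported instruction set decides which path is taken. */
code_path pick_by_feature(const char *vendor, unsigned features) {
    (void)vendor; /* deliberately unused */
    return (features & CPU_HAS_SSE2) ? PATH_SSE2 : PATH_GENERIC;
}
```

With the vendor string "AuthenticAMD" and the SSE2 flag set, `pick_by_vendor` returns `PATH_GENERIC` while `pick_by_feature` returns `PATH_SSE2`. For "GenuineIntel" with the same flag, both return `PATH_SSE2`.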
I doubt Intel can be required to optimize specifically for a CPU that's not theirs
The settlement doesn't require that.
Another reason it's a gray area is that it's possible that the code path optimizations they took were obvious and would likely apply to any modern x86 CPU (though the fact that AMD and Intel are the only two players in the game sort of makes it beg the question),
Most optimizations are indeed obvious applications of the available instruction set. If you have SSE2 you can do four additions in one instruction. That's an obvious thing to do regardless of CPU model. Don't forget there is a third player, VIA. Their chips are fast enough to be relevant here.
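As a small illustration of the point about SSE2, the sketch below adds four 32-bit integers with a single `_mm_add_epi32` intrinsic from `<emmintrin.h>`. The function name `add4_i32` is made up for this example, and the scalar fallback is included only so that the code also compiles on targets without SSE2. Nothing in it depends on the CPU vendor.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Adds four 32-bit integers element-wise: out[i] = a[i] + b[i]. */
void add4_i32(const int32_t *a, const int32_t *b, int32_t *out) {
#if defined(__SSE2__)
    /* Unaligned loads, so the arrays need no special alignment. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* One instruction performs all four additions. */
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Scalar fallback for targets compiled without SSE2. */
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```

Called with `{1, 2, 3, 4}` and `{10, 20, 30, 40}`, the function writes `{11, 22, 33, 44}` to the output array on either path.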
given the small excerpt of the settlement shown, it seems possible to me that what they *actually* did is make something up that will sate AMD's lawyers while at the same time leaving the door open for them to either continue the same practice, or cease the practice (if it's too obviously anti-competitive or if they explicitly said they'd cease it elsewhere) but instate similar and/or related practices in the future, on account of the fact that those practices can easily be classified as "failures to act." [...]
However, if the decisions for how and when to use SSE instructions are intricately tied in with the rest of their code path algorithm (and possibly rely on internal structure of the CPU design), then the caveats I brought up earlier still apply.
Yes, they will probably be able to claim that. From a merely technical perspective, I think it's a bad idea to make different code paths for two processors that support the same instruction set based on whether a particular instruction runs a little faster on one than on the other. If you consider the time it takes to develop a complete program plus the time it takes to market it, then it is likely that the processors you optimized for will be obsolete for your most demanding customers before the time your software peaks on the market. My advice would certainly be to optimize for the newest processor, but make sure you maintain compatibility with older processors.
But of course Intel compiler engineers are not obliged to listen to my advice if doing otherwise enables them to harm their competitors.
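The advice to optimize for the newest processor while staying compatible with older ones can be sketched as a dispatch table ordered from newest to oldest instruction set. This is one possible layout under stated assumptions, not the method of any particular compiler or library: the `ISA_*` flags, the `impl_*` functions and `select_impl` are hypothetical names, and the feature mask is assumed to have been filled in elsewhere (for example from CPUID).

```c
#include <stddef.h>

/* Hypothetical instruction-set flags for this sketch. */
enum { ISA_SSE2 = 1, ISA_AVX2 = 2 };

typedef const char *(*impl_fn)(void);

/* Stand-ins for real implementations; each just reports its name. */
const char *impl_generic(void) { return "generic"; }
const char *impl_sse2(void)    { return "sse2"; }
const char *impl_avx2(void)    { return "avx2"; }

struct candidate {
    unsigned required; /* all of these flags must be present */
    impl_fn fn;
};

/* Picks the first (newest) implementation whose requirements are met.
   The last entry requires nothing, so older processors still get a
   working code path. */
impl_fn select_impl(unsigned features) {
    static const struct candidate table[] = {
        { ISA_AVX2 | ISA_SSE2, impl_avx2 },
        { ISA_SSE2,            impl_sse2 },
        { 0,                   impl_generic },
    };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if ((features & table[i].required) == table[i].required)
            return table[i].fn;
    }
    return impl_generic; /* not reached: the last entry always matches */
}
```

Supporting a newer instruction set then means adding one row at the top of the table, while the fallback row stays untouched.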
In any case, whether not supporting optimizations on AMD's CPUs was an affirmative design decision to undermine AMD machines or merely a failure to act (to benefit AMD machines), either way, it's clearly wrong for them to publish benchmarks to OEMs, etc. comparing AMD CPUs to Intel CPUs using their own compiler that specifically optimizes for Intel CPUs (based on Vendor ID no less!, but either way) and not for AMD CPUs. It's misleading, and according to the FTC, even when specifically confronted with the issue they would habitually either mislead or directly lie about the cause for the speed difference and whether it could be solved. So *that's* the part that's really devious, and I can see why the FTC sued them.
I *hate* companies like that. Incidentally, though, all companies are companies like that.
Fortunately, not all companies are like that. I am sure this case has harmed Intel's reputation. They can be damn sure that their next compiler version will be thoroughly scrutinized. Hopefully, they will take their reputation into account when they design the next compiler version and function libraries.