Search found 75 matches

by agner
2023-02-11, 5:29:45
Forum: Agner's CPU blog
Topic: Branch Predictor and Instruction Pipelining, how can I prove the existence of it?
Replies: 3
Views: 51625

Re: Branch Predictor and Instruction Pipelining, how can I prove the existence of it?

The "likely" macro is a hint to the compiler, not to the CPU. The compiler may swap if-else branches to put the most likely branch first.
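As a rough illustration of the point above, here is a minimal sketch of how such a macro is commonly defined on GCC and Clang. The names `LIKELY`, `UNLIKELY` and `count_non_negative` are hypothetical and chosen only for this example; `__builtin_expect` is a compiler builtin, so the hint affects code layout only and sends nothing to the CPU.

```cpp
// Hypothetical macro names; __builtin_expect is a GCC/Clang builtin.
// On other compilers the macros fall back to the plain condition.
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

// Counts the non-negative values in data[0..n-1].
// The hint tells the compiler that negative values are assumed to be rare,
// so it may lay out the increment as the fall-through path.
int count_non_negative(const int* data, int n) {
    int count = 0;
    for (int i = 0; i < n; ++i) {
        if (LIKELY(data[i] >= 0)) {
            ++count;
        }
    }
    return count;
}
```

The result is the same with or without the hint. Only the ordering of the generated branches may change, and the compiler is free to ignore the hint.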
by agner
2022-11-20, 6:22:08
Forum: Agner's CPU blog
Topic: Question about your Testp
Replies: 1
Views: 38739

Re: Question about your Testp

No, you need root permission. Installing the linux driver requires sudo.
by agner
2022-11-06, 6:21:24
Forum: Agner's CPU blog
Topic: AMD Zen 4 Ryzen 7000 series with AVX-512 support
Replies: 10
Views: 131704

Re: AMD Zen 4 Ryzen 7000 series with AVX-512 support

Vladislav_152378 wrote:
What is predication they're all talking about?
It is the use of masks. A mask can selectively enable or disable individual vector elements.
Mysticial wrote:
If 512-bit was split into the different pipes on the same cycle, it should be 1.0/cycle as you wouldn't be able to use ...
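To make the idea of masking concrete, the following is a scalar sketch of a merge-masked add over eight 32-bit lanes. The function name `masked_add` is hypothetical. It only models the per-element behavior (a set mask bit enables the lane, a clear bit keeps the old value) and does not use real vector instructions. It is meant to mirror the kind of semantics that AVX-512 masked intrinsics such as `_mm256_mask_add_epi32` provide, under that assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a merge-masked vector add (hypothetical helper).
// Bit i of mask enables lane i: enabled lanes receive a[i] + b[i],
// disabled lanes keep the value already present in src[i].
std::array<std::int32_t, 8> masked_add(const std::array<std::int32_t, 8>& src,
                                       std::uint8_t mask,
                                       const std::array<std::int32_t, 8>& a,
                                       const std::array<std::int32_t, 8>& b) {
    std::array<std::int32_t, 8> result = src;
    for (std::size_t i = 0; i < 8; ++i) {
        if ((mask >> i) & 1u) {
            result[i] = a[i] + b[i];
        }
    }
    return result;
}
```

With a mask of 0x05, only lanes 0 and 2 are updated and all other lanes are left unchanged.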
by agner
2022-11-04, 14:12:02
Forum: Agner's CPU blog
Topic: AMD Zen 4 Ryzen 7000 series with AVX-512 support
Replies: 10
Views: 131704

Test results for AMD Zen 4

I have now had access to test a Zen 4, and the results are quite good. The Zen 4 is produced with a 5 nm process making it possible to run at a clock frequency of 4.5 - 5.7 GHz. The high clock frequency combined with a bigger micro-op cache, L2, and L3 caches, improved branch instructions, and sever...
by agner
2022-09-06, 14:59:13
Forum: Agner's CPU blog
Topic: Which C++ compiler is best?
Replies: 3
Views: 58867

Re: Which C++ compiler is best?

nanxiao wrote:
it is said the Rust program performance is quite good
The Rust compiler is based on LLVM, the same back end that Clang uses. This means that you can get performance similar to Clang C++ in some cases, but not all. You had better do your own tests.
by agner
2022-09-06, 4:31:54
Forum: Agner's CPU blog
Topic: AMD Zen 4 Ryzen 7000 series with AVX-512 support
Replies: 10
Views: 131704

Re: AMD Zen 4 Ryzen 7000 series with AVX-512 support

Dannotech wrote:
it would only split AVX 512 into half and double pump it through the AVX 256 vector engine to save power
This was expected. It saves silicon space. It might still give a performance benefit in the decoding stage. I will test the performance if somebody can give me remote access to a...
by agner
2022-08-09, 8:35:04
Forum: Agner's CPU blog
Topic: Which C++ compiler is best?
Replies: 3
Views: 58867

Which C++ compiler is best?

C++ is the best programming language if you want programs to run fast. My C++ optimization manual explains why. The number of C++ compilers on the market is decreasing. Many compilers have disappeared from the market or are no longer maintained. Do you remember compiler names like Borland, Embarcade...
by agner
2022-08-08, 9:13:02
Forum: Agner's CPU blog
Topic: Intel's "cripple AMD" function
Replies: 6
Views: 332113

New Clang-based Intel compiler is better

There is an important update to this story. Intel have switched to a Clang-based compiler that works well with non-Intel microprocessors. If you download the new Intel "oneAPI" Compiler, you get two versions. A legacy version named "Classic" which is a continuation of the old compiler, and a new "LL...
by agner
2022-08-07, 9:53:28
Forum: Agner's CPU blog
Topic: C++ Vector class library
Replies: 5
Views: 86659

New release supports half precision floating point vectors

Version 2.02.00 of the Vector Class Library is now released. New in this version: Half precision floating point vectors, using the new AVX512-FP16 instruction set (or emulated on lower instruction sets). Includes all general functions and operators, exponential function, trigonometric functio...