News and research about CPU microarchitecture and software optimization
by mineiro » 2023-08-25, 3:13:11
I recently used the testp tool to measure two instructions, each unrolled 1000 times:
rept 1000
mov (r|e)ax, 123
endm
I got the following results:
mov rax,123            mov eax,123
Clock    Core cyc      Clock    Core cyc
546      442           382      312
558      469           386      312
676      550           388      311
From these results I assumed that, in 64-bit code, it is preferable to use 32-bit registers rather than 64-bit registers where possible.
Is that assumption mistaken?
Or can I conclude that both instructions take the same number of cycles to execute?
(Linux - Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz).
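As a rough check on the numbers above, the following Python sketch divides the posted core-cycle counts by the unroll factor and compares the run-to-run spread of each series. The helper name summarize and the choice of the fastest run as the representative value are assumptions made for illustration; they are not part of testp.

```python
# Core-cycle counts copied from the table in the post
# (three runs, 1000 unrolled instructions each).
core_cycles = {
    "mov rax,123": [442, 469, 550],
    "mov eax,123": [312, 312, 311],
}
UNROLL = 1000  # repetitions in the rept block


def summarize(samples, n=UNROLL):
    """Return (best cycles per instruction, relative spread between runs).

    Hypothetical helper: uses the fastest run as the representative value,
    on the assumption that slower runs include more measurement noise.
    """
    best = min(samples) / n
    spread = (max(samples) - min(samples)) / min(samples)
    return best, spread


for name, samples in core_cycles.items():
    best, spread = summarize(samples)
    print(f"{name}: best {best:.3f} cycles/instr, run-to-run spread {spread:.1%}")
```

With the posted figures, the mov eax runs agree to within about one percent, while the mov rax runs differ from each other by roughly a quarter. Three runs are too few to separate a real difference from measurement noise with any confidence.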
by agner » 2023-08-25, 6:22:49
An optimizing assembler should encode mov rax,123 as mov eax,123 because writing to eax zero-extends the result into rax anyway. The two instructions should give identical results. Test results may vary for random reasons. Zero extension cannot be used with negative constants, because the upper 32 bits of rax must then be all ones rather than zero. mov rax,-123 is two bytes longer than mov eax,-123.
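The size and zero-extension points can be illustrated with a small Python sketch that builds the two machine-code encodings by hand and models the resulting value of rax. The function names are hypothetical. The sketch assumes the standard x86-64 encodings B8 id for mov eax,imm32 and REX.W C7 /0 id for mov rax with a sign-extended 32-bit immediate. It does not show what any particular assembler actually emits.

```python
import struct

MASK64 = (1 << 64) - 1


def enc_mov_eax_imm32(value):
    """mov eax, imm32 -> opcode B8 followed by a 4-byte immediate (5 bytes)."""
    return b"\xB8" + struct.pack("<I", value & 0xFFFFFFFF)


def enc_mov_rax_simm32(value):
    """mov rax, imm32 (sign-extended) -> REX.W, C7, ModRM C0, 4-byte immediate (7 bytes)."""
    return b"\x48\xC7\xC0" + struct.pack("<i", value)


def rax_after_mov_eax(value):
    """Writing eax clears the upper 32 bits of rax (zero extension)."""
    return value & 0xFFFFFFFF


def rax_after_mov_rax(value):
    """The 64-bit form sign-extends the 32-bit immediate to 64 bits."""
    return value & MASK64


for imm in (123, -123):
    short, long_ = enc_mov_eax_imm32(imm), enc_mov_rax_simm32(imm)
    same = rax_after_mov_eax(imm) == rax_after_mov_rax(imm)
    print(f"imm={imm}: {len(short)} vs {len(long_)} bytes, same rax value: {same}")
```

For 123 both forms leave the same value in rax, so the shorter encoding is a safe substitute. For -123 the 32-bit form would leave 0x00000000FFFFFF85 in rax instead of 0xFFFFFFFFFFFFFF85, so the longer encoding is required.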