Added __*shift_nz routines (for shifting by non-zero or a constant amount) #756
ZERICO2005 wants to merge 2 commits into master
| or a, $10 | ||
| call __lshru | ||
| call __lshru_nz |
Proof: Shift amount should be 5-29, and the same shift amount was used in DJNZ (which would break if the shift amount could be zero)
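The DJNZ argument can be illustrated with a small C model. This is only a sketch: `djnz_iterations` is a hypothetical helper, not part of the toolchain, and it models nothing beyond the fact that DJNZ decrements the 8-bit B register first and then jumps while B is non-zero.

```c
#include <stdint.h>

/* Hypothetical model of a loop terminated by DJNZ.
   The body runs once, then B is decremented (8-bit wraparound)
   and the loop repeats while B is non-zero.
   Returns how many times the body executed. */
unsigned djnz_iterations(uint8_t b)
{
    unsigned count = 0;
    do {
        count++;        /* loop body */
        b--;            /* DJNZ: decrement B ... */
    } while (b != 0);   /* ... and jump back if B is not zero */
    return count;
}
```

In this model a starting value of 5 or 29 gives exactly that many iterations, while a starting value of 0 wraps around and runs 256 times. That is why code which already feeds the same amount into DJNZ can be assumed to have a non-zero shift amount.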
| ex (sp), hl | ||
| ; shift is non-zero and [1, 11] in the non-UB case | ||
| call c, __llshl | ||
| call c, __llshl_nz |
Proof: A is [1, 204] here, and call c, __llshl_nz will only call the function if the shift amount is less than 31.
| ld c, a ; A is [1, 23] | ||
| ; shift until the MSB of the mantissa is the LSB of the exponent | ||
| call __ishl | ||
| call __ishl_nz |
Proof: rcf \ adc hl, hl was done prior to .L.subnormal, which means that __ictlz will return a value that is at least 1 since the LSB will be cleared.
| ex (sp), hl ; (SP) = shift | ||
| call __llshru | ||
| call __llshru_nz |
Proof: Shift amount is [1, 11], and the exact same shift amount was used for DJNZ, which would break if the shift amount were zero.
| call __ictlz | ||
| ld c, a | ||
| call __ishl | ||
| call __ishl_nz |
Proof: add hl, hl is done, meaning that the LSB is 0, so __ictlz will return a value that is at least 1
| ld d, c ; store C | ||
| ld c, a | ||
| call __ishl | ||
| call __ishl_nz |
Proof: Since sub a, 23 set the carry flag, A became [-23, -1]; neg then makes A [1, 23]
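The range argument can be checked with a short C sketch. It assumes A holds an unsigned 8-bit value before the subtraction and that the carry flag after sub means an unsigned borrow (A < 23). `neg_after_sub23` is a hypothetical helper written for this illustration only.

```c
#include <stdint.h>

/* Hypothetical model of `sub a, 23` followed by `neg`.
   Returns the value NEG leaves in A when SUB set the carry flag,
   or -1 when SUB did not set carry (that path is not taken). */
int neg_after_sub23(uint8_t a)
{
    if (a >= 23) {
        return -1;                     /* no borrow: carry clear */
    }
    uint8_t diff = (uint8_t)(a - 23);  /* wraps to 256 + (a - 23) */
    uint8_t neg  = (uint8_t)(0u - diff); /* NEG: A = 0 - A */
    return neg;                        /* equals 23 - a */
}
```

Under those assumptions every input that sets carry (0 through 22) yields a result between 1 and 23, so the shift amount handed to __ishl_nz is never zero.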
Similar in spirit to #755.
__*shift_nz may allow for a small speed optimization by optionally skipping a test for a shift-by-zero.

I know that Clang/LLVM has some functionality to detect whether a shift amount is non-zero, so the compiler could output __*shift_nz when applicable.

__*shift_nz will be emitted either when the shift amount is a constant, or when the shift amount is a variable that is proven to be non-zero.

Calling __*shift_nz with a shift amount of zero is undefined behavior.

Additionally, it is always safe to convert __*shift_nz back to __*shift.

Pros:
- If you see call __*shift in the compiler output, then you can be almost certain that the shift amount is a variable instead of a constant
- If __lshl_nz is used, then it might be possible to not link __lshl, which would save 3 bytes (although this would need FASMG require to implement)

Cons:

Here, 3F + 1 is saved by skipping the check for a shift-by-zero in __lshl.

Note that __*shift_nz aliases __*shift if no optimizations are possible.

Here is a list of routines where __*shift_nz is faster:
- __bshl
- __bshru
- __bshrs
- __lshl
- __lshru
- __lshrs
- __i48shru
- __i48shrs
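The relationship between the two entry points can be sketched in C. This is a rough model, not the actual assembly: `model_lshl` and `model_lshl_nz` are hypothetical names, and the one-bit-per-iteration loop with an 8-bit counter is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical model of a *_nz shift routine.
   The loop body runs before the counter is tested,
   so the caller must guarantee count != 0. */
uint32_t model_lshl_nz(uint32_t value, uint8_t count)
{
    do {
        value <<= 1;    /* shift one bit per iteration */
        count--;        /* 8-bit counter, as with DJNZ */
    } while (count != 0);
    return value;
}

/* Hypothetical model of the regular routine: the same loop,
   guarded by the shift-by-zero test that the _nz variant skips. */
uint32_t model_lshl(uint32_t value, uint8_t count)
{
    if (count == 0) {
        return value;   /* nothing to shift */
    }
    return model_lshl_nz(value, count);
}
```

For every non-zero count the two functions return the same result, which mirrors why converting __*shift_nz back to __*shift is always safe. The only difference is the guard, so the saving applies only when the caller can already prove that the count is non-zero.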