Some interesting ARM 32 results

rhyde

Post by rhyde »

I've recently been working on some numeric conversion code for AArch32 (32-bit ARM code). A few interesting results from that work:

1. Lookup tables (256 elements) work great for converting integers to strings of hexadecimal digits, assuming you don't mind a 512-byte lookup table (two bytes per entry). Much faster than the traditional approach, at least on a Pi 400; YMMV on other CPUs/systems.

2. Tables are a spectacular failure when going the other direction (strings to numeric values). I used a jump table with the Thumb TBH instruction to implement a switch statement that classifies each input character and processes it accordingly. The traditional "shift 4 and add" algorithm was quite a bit faster.

3. Neon sucks. I did a Neon version of the numeric-to-string function. The code was very short. Alas, it ran much slower than the lookup table approach (see item 1). I'm sure Neon on A32 is good for something, but converting 32-bit values to hex strings is not one of those things. (Note: I tried two different Neon algorithms, one using the VTBX table-lookup instruction and the other using the traditional "shift and add" approach; neither worked well.)

4. Surprise, surprise: though the ARM supports 64-bit floating-point arithmetic in hardware (at least on the Cortex-A-class CPUs I'm working on), there are no instructions to convert a double-precision float to a 64-bit integer or vice versa. I had to do that in software (64-bit integer to double wasn't so bad; the other direction was a bit hairy).
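
For anyone who wants to see the idea behind item 1 without reading assembly, here is a minimal C sketch of a 256-entry, two-characters-per-entry table (512 bytes total). It is not the posted ARM code; the names `hex_pairs`, `init_hex_pairs`, and `u32_to_hex` are made up for illustration, and the choice of uppercase digits is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* 256 entries x 2 ASCII characters = 512 bytes, one entry per byte value.
   (Hypothetical name; not taken from the original assembly code.) */
static char hex_pairs[256][2];

/* Fill the table once at startup. A production version would more likely
   use a precomputed constant table. Uppercase digits are an assumption. */
void init_hex_pairs(void)
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 0; i < 256; i++) {
        hex_pairs[i][0] = digits[i >> 4];    /* high nibble */
        hex_pairs[i][1] = digits[i & 0x0F];  /* low nibble  */
    }
}

/* Write 8 hex digits plus a terminating NUL into out (at least 9 bytes).
   Each byte of the value costs one table lookup and one 2-byte copy. */
void u32_to_hex(uint32_t value, char *out)
{
    for (int i = 0; i < 4; i++) {
        uint8_t byte = (uint8_t)(value >> (24 - 8 * i)); /* most significant first */
        memcpy(out + 2 * i, hex_pairs[byte], 2);
    }
    out[8] = '\0';
}
```

The point of the two-character entries is that one lookup produces two output digits, so a 32-bit value needs only four lookups instead of eight nibble conversions.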
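
The "shift 4 and add" approach from item 2 looks roughly like this in C. Again, this is only a sketch under my own assumptions: the function name `hex_to_u32`, the 8-digit limit, and the error handling are illustrative and not a transcription of the posted assembly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Parse 1 to 8 hexadecimal digits from a NUL-terminated string.
   Returns false on empty input, an invalid character, or more than
   8 digits. (Hypothetical helper; the error policy is an assumption.) */
bool hex_to_u32(const char *s, uint32_t *result)
{
    uint32_t value = 0;
    int count = 0;

    for (; *s != '\0'; s++) {
        unsigned c = (unsigned char)*s;
        uint32_t digit;

        /* Classify the character with plain range compares. */
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return false;

        if (++count > 8) return false;  /* would not fit in 32 bits */

        /* Shift the running value left 4 bits and add the new digit. */
        value = (value << 4) + digit;
    }

    if (count == 0) return false;
    *result = value;
    return true;
}
```

For example, `hex_to_u32("1f", &v)` succeeds and leaves `0x1F` in `v`, while `hex_to_u32("12G4", &v)` fails on the `G`.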
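
To illustrate item 4, here is one possible way to do both conversions in software, sketched in C. It assumes IEEE 754 doubles and round-to-nearest mode, and it relies only on 32-bit integer to double conversions (which the hardware does provide). The names `i64_to_double` and `double_to_i64` are hypothetical, truncation toward zero is my choice, and this is not necessarily how the posted assembly does it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* 64-bit signed integer to double: convert the signed high word and the
   unsigned low word separately, then combine. hi * 2^32 and lo are both
   exact in a double, so the single addition rounds only once. */
double i64_to_double(int64_t value)
{
    /* Two's-complement reinterpretation of the top 32 bits (assumed, as
       on every mainstream compiler). */
    int32_t  hi = (int32_t)(uint32_t)((uint64_t)value >> 32);
    uint32_t lo = (uint32_t)value;
    return (double)hi * 4294967296.0 + (double)lo;
}

/* Double to 64-bit signed integer, truncating toward zero.
   Returns false for NaN, infinities, and out-of-range values. */
bool double_to_i64(double d, int64_t *out)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);            /* raw IEEE 754 bit pattern */

    bool     neg  = (bits >> 63) != 0;
    int      exp  = (int)((bits >> 52) & 0x7FF) - 1023;   /* unbiased exponent */
    uint64_t mant = (bits & 0xFFFFFFFFFFFFFull) | (1ull << 52); /* implicit 1 */

    if (exp < 0) {              /* |d| < 1, including zero and subnormals */
        *out = 0;
        return true;
    }
    if (exp >= 63) {            /* too large, infinite, or NaN ...        */
        if (neg && exp == 63 && mant == (1ull << 52)) {
            *out = INT64_MIN;   /* ... except exactly -2^63               */
            return true;
        }
        return false;
    }

    /* Align the 53-bit significand so its binary point sits at bit 0. */
    uint64_t mag = (exp >= 52) ? mant << (exp - 52) : mant >> (52 - exp);
    *out = neg ? -(int64_t)mag : (int64_t)mag;
    return true;
}
```

The double-to-integer direction is longer because it has to pick apart the exponent and significand and deal with the range edge cases by hand, which fits the "bit hairy" description above.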

I still have a lot of cleanup and optimization to do on this code before putting it in "The Art of ARM Assembly, Volume 2." But I did post the code in the "Generic Assembly" topic here, if you're interested in looking at it.

Cheers,
Randy Hyde