crc32: align memory access
Some of the crc32 assembly code reads a byte array as if it were an array of unsigned long values. That is incorrect, because the byte array is not guaranteed to be aligned to 'unsigned long'.

The patch makes crc32 calculate the hash byte-by-byte on the unaligned prefix, then word-by-word on aligned addresses, and again byte-by-byte on the tail, which is shorter than a word.

Assuming that the word-by-word part is the longest, this should roughly halve the number of memory/cache loads: an unaligned word load required loading 2 words and merging them into one, while an aligned word load is a single load.

Part of #4609
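
For illustration, the prefix / aligned body / tail split can be sketched in plain C as below. This is not the code from the patch: the function names are hypothetical, the checksum is computed bit by bit in software instead of with the assembly mentioned above, and the reflected CRC-32 polynomial 0xEDB88320 is an assumption chosen only to keep the example self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold one byte into the CRC. The polynomial is an assumption. */
static uint32_t
crc32_byte(uint32_t crc, unsigned char byte)
{
	crc ^= byte;
	for (int i = 0; i < 8; i++)
		crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	return crc;
}

/* Reference version: every byte is loaded separately. */
static uint32_t
crc32_bytewise(uint32_t crc, const unsigned char *buf, size_t len)
{
	while (len-- > 0)
		crc = crc32_byte(crc, *buf++);
	return crc;
}

/* Hypothetical sketch of the aligned scheme: prefix, words, tail. */
static uint32_t
crc32_aligned(uint32_t crc, const unsigned char *buf, size_t len)
{
	const size_t word = sizeof(unsigned long);

	/* 1. Prefix: bytes up to the first word-aligned address. */
	size_t prefix = (word - (uintptr_t)buf % word) % word;
	if (prefix > len)
		prefix = len;
	crc = crc32_bytewise(crc, buf, prefix);
	buf += prefix;
	len -= prefix;

	/* 2. Body: one load per word, always from an aligned address. */
	for (; len >= word; buf += word, len -= word) {
		unsigned long w;
		/* memcpy() avoids aliasing problems; compilers usually
		 * turn it into a single word load. */
		memcpy(&w, buf, word);
		const unsigned char *bytes = (const unsigned char *)&w;
		/* Bytes are consumed in memory order, so the result
		 * does not depend on endianness. */
		for (size_t i = 0; i < word; i++)
			crc = crc32_byte(crc, bytes[i]);
	}

	/* 3. Tail: fewer than sizeof(unsigned long) bytes remain. */
	return crc32_bytewise(crc, buf, len);
}
```

Both functions are expected to return the same value for any buffer offset and length; only the number and alignment of the memory loads differ.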