the TiAlloc() and TiFree() code path. The rationale of the changes: Performing the one-time initialization check (first call to mmapTileStore()) for every TiAlloc() is unnecessary if the decision code is reworked to allow NULL pointers to exist in the computation and still make the correct decision. Use of 'unsigned long' for pointer arithmetic is not compatible with _WIN64 compiler model LLP64. Changed to 'pointertype'. The computations addition/subtraction/greater-than were performed multiple times. Seemed a convoulted method when a single operation should be good enough. The use of a single-linked-list with HEAD and TAIL does not make sense. My gut is telling me that for the purpose of memory allocation a LIFO is better as a free-and-reuse of the most recently freed item is more likely to already be in the CPU cache and the oldest freed item is more likely to have been evicted from CPU cache. So given the use of a custom allocator and no support to reclaim/compact or manage fragmentation other factor didn't carry any weight. Technically TiAlloc() returns undefined memory and the first action of the caller is to write new values, but the point remains that write is more likely to cause eviction of something else from cache with the original FIFO scheme. Due to the free list use during TiFree() there is a write to each alloc which will make the cache-line hotter in the cache (less likely to have been evicted) when using LIFO scheme than FIFO scheme. Further more the use of HEAD and TAIL had a far more complex TileStoreFree() than was necessary even for such a list. A 4 line method turned into 7 with multiple conditions tested when branching. The TileStoreFreeList/TileStoreFreeList_end were public symbols which may also impact the freedom the compiler has to optimise around them. Using LIFO single-linked-list resulted in the removal of TileStoreFreeList_end and associated simplification. Use of 'static' for the methods mmapTileStore() and getTimeFromTileStore() these are not public API so adding the 'static' give the compiler a hint these methods maybe inlined as they are not accessile from outside this compliation unit. The -O3 assembly result looks quite healthy in achieving the original goal of instruction and branch reductions with the compiler able to inline all 3 methods into a single TiAlloc(). |
||
|---|---|---|
| .. | ||
| Makefile | ||
| search.c | ||
| tile.c | ||
| tile.h | ||