This attempts to remove the number of active lines of code, branches from

the TiAlloc() and TiFree() code path.

The rationale of the changes:

Performing the one-time initialization check (first call to
mmapTileStore()) for every TiAlloc() is unnecessary if the decision code
is reworked to allow NULL pointers to exist in the computation and still
make the correct decision.

Use of 'unsigned long' for pointer arithmetic is not compatible with _WIN64
compiler model LLP64.  Changed to 'pointertype'.

The computations addition/subtraction/greater-than were performed
multiple times.  Seemed a convoulted method when a single operation
should be good enough.

The use of a single-linked-list with HEAD and TAIL does not make sense.
My gut is telling me that for the purpose of memory allocation a LIFO is
better as a free-and-reuse of the most recently freed item is more likely
to already be in the CPU cache and the oldest freed item is more likely
to have been evicted from CPU cache.  So given the use of a custom
allocator and no support to reclaim/compact or manage fragmentation
other factor didn't carry any weight.
Technically TiAlloc() returns undefined memory and the first action of
the caller is to write new values, but the point remains that write is
more likely to cause eviction of something else from cache with the
original FIFO scheme.
Due to the free list use during TiFree() there is a write to each alloc
which will make the cache-line hotter in the cache (less likely to have
been evicted) when using LIFO scheme than FIFO scheme.

Further more the use of HEAD and TAIL had a far more complex
TileStoreFree() than was necessary even for such a list.  A 4 line
method turned into 7 with multiple conditions tested when branching.
The TileStoreFreeList/TileStoreFreeList_end were public symbols which may
also impact the freedom the compiler has to optimise around them.

Using LIFO single-linked-list resulted in the removal of
TileStoreFreeList_end and associated simplification.

Use of 'static' for the methods mmapTileStore() and getTimeFromTileStore()
these are not public API so adding the 'static' give the compiler a hint
these methods maybe inlined as they are not accessile from outside this
compliation unit.

The -O3 assembly result looks quite healthy in achieving the original
goal of instruction and branch reductions with the compiler able to
inline all 3 methods into a single TiAlloc().
This commit is contained in:
Darryl L. Miles 2025-01-31 16:25:52 +00:00 committed by R. Timothy Edwards
parent 1c3a059031
commit 51ec834f6c
1 changed files with 25 additions and 22 deletions

View File

@ -663,7 +663,7 @@ TiJoinY(
#ifdef HAVE_SYS_MMAN_H
/* MMAP the tile store */
static signed char
static void *
mmapTileStore(void)
{
int prot = PROT_READ | PROT_WRITE;
@ -676,25 +676,17 @@ mmapTileStore(void)
TxError("TileStore: Unable to mmap ANON SEGMENT\n");
_exit(1);
}
_block_end = (void *) ((pointertype) _block_begin + map_len);
_block_end = (void *) ((char *) _block_begin + map_len);
_current_ptr = _block_begin;
return 0;
return _current_ptr;
}
Tile *
static Tile *
getTileFromTileStore(void)
{
Tile *_return_tile = NULL;
if (!_block_begin && !_block_end)
{
mmapTileStore();
}
/* Check if we can get the tile from the
* Free list
*/
/* Check if we can get the tile from the Free list */
if (TileStoreFreeList)
{
_return_tile = TileStoreFreeList;
@ -704,19 +696,30 @@ getTileFromTileStore(void)
/* Get it from the mmap */
if (((pointertype)_current_ptr + sizeof(Tile))
> (pointertype)_block_end)
void *nextp = (char*)_current_ptr + sizeof(Tile);
if ((pointertype)nextp > (pointertype)_block_end)
{
/* this will trigger for initial allocatation on startup, due to:
* 0 + sizeof(Tile) > 0
* _current_ptr + sizeof(Tile) > _block_end
* so there is no need to explicitly test for that in the allocation path
* the comparison is greater-than (instead of greater-than-or-equal-to)
* so the last block/byte can be allocated
*/
mmapTileStore();
}
_current_ptr = (void *)((pointertype)_current_ptr + sizeof(Tile));
if ((pointertype)_current_ptr > (pointertype) _block_end)
{
fprintf(stderr,"TileStore: internal assertion failure...");
_exit(1);
/* _current_ptr will be updated, so recompute nextp */
nextp = (char*)_current_ptr + sizeof(Tile);
ASSERT((pointertype)nextp <= (pointertype)_block_end, "nextp");
}
return (Tile *)((pointertype)_current_ptr - sizeof(Tile));
void *thisp = _current_ptr;
/* TODO investigate alignment padding, maybe best to pad Tile for this */
/* advance _current_ptr for next time */
_current_ptr = nextp;
return (Tile *)thisp;
}
Tile *