image_pdfimage_print

For over two decades—since 2002—U-Boot has relied on version 2.6.6 of Doug Lea’s malloc (dlmalloc, old docs) to handle dynamic memory allocation. While reliable, the codebase was showing its age.

In a massive 37-patch series, we have finally updated the core allocator to dlmalloc 2.8.6. This update brings modern memory efficiency algorithms, better security checks, and a cleaner codebase, all while fighting to keep the binary size footprint minimal for constrained devices.

Why the Update?

The move to version 2.8.6 isn’t just about bumping version numbers. The new implementation offers specific technical advantages:

  • Improved Binning: The algorithm for sorting free chunks is more sophisticated, leading to better memory efficiency and less fragmentation.
  • Overflow Protection: Robust checks via MAX_REQUEST prevent integer overflows during allocation requests, improving security.
  • Reduced Data Usage: The old pre-initialised av_[] array (which lived in the data section) has been replaced with a _sm_ struct in the BSS (Block Started by Symbol) section. This change reduces usage in the data section by approximately 1.5KB, a significant win for boards with constraints on initialised data.

The Battle for Code Size

One of the biggest challenges in embedded development is code size. Out of the box, the newer dlmalloc was significantly larger than the old version—a non-starter for SPL (Secondary Program Loader) where every byte counts.

To combat this, the patch series a Kconfig option to strip down the allocator for space-constrained SPL builds (specifically via CONFIG_SYS_MALLOC_SMALL).

Key optimisations include:

  1. NO_TREE_BINS: Disables complex binary-tree bins for large allocations (>256 bytes), falling back to a simple doubly-linked list. This trades O(log n) performance for O(n) but saves ~1.25KB of code.
  2. SIMPLE_MEMALIGN: Simplifies the logic for aligned allocations, removing complex retry fallback mechanisms that are rarely needed in SPL. This saves ~100-150 bytes.
  3. NO_REALLOC_IN_PLACE: Disables the logic that attempts to resize a memory block in place. Instead, it always allocates a new block and copies data. This saves ~500 bytes.

With these adjustments enabled, the new implementation is actually smaller than the old 2.6.6 version on architectures like Thumb2.

Keeping U-Boot Features Alive

This wasn’t a clean upstream import. Over the last 20 years, U-Boot accrued many custom modifications to dlmalloc. This series carefully ports these features to the new engine:

  • Pre-relocation Malloc: Support for malloc_simple, allowing memory allocation before the main heap is initialised.
  • Valgrind Support: Annotations that allow developers to run U-Boot sandbox under Valgrind to detect memory leaks and errors.
  • Heap Protection: Integration with mcheck to detect buffer overruns and heap corruption.

New Documentation and Tests

To ensure stability, a new test suite (test/common/malloc.c) was added to verify edge cases, realloc behaviours, and large allocations. Additionally, comprehensive documentation has been added to doc/develop/malloc.rst, explaining the differences between pre- and post-relocation allocation and how to tune the new Kconfig options.

Next Steps

The legacy allocator is still available via CONFIG_SYS_MALLOC_LEGACY for compatibility testing, but new boards are encouraged to use the default 2.8.6 implementation.

Author

  • Simon Glass is a primary author of U-Boot, with around 10K commits. He is maintainer of driver model and various other subsystems in U-Boot.

Leave a Reply

Your email address will not be published. Required fields are marked *