Dive into the world of C++ memory allocation with this comprehensive guide on allocators. Learn how to customize container memory usage, optimize memory usage and speed, and avoid common pitfalls. Discover examples, best practices, and performance considerations when working with allocators.
Custom STL Allocators Pete Isensee Xbox Advanced Technology Group pkisensee@msn.com
Topics • Allocators: What are They Good For? • Writing Your First Allocator • The Devil in the Details • Allocator Pitfalls • State • Syntax • Testing • Case Study
Containers and Allocators • STL containers allocate memory • e.g. vector (contiguous), list (nodes) • string is a container, for this talk • Allocators provide a standard interface for container memory use • If you don’t provide an allocator, one is provided for you
Example • Default Allocator list<int> b; // same as: list< int, allocator<int> > b; • Custom Allocator #include "MyAlloc.h" list< int, MyAlloc<int> > c;
The Good • Original idea: abstract the notion of near and far memory pointers • Expanded idea: allow customization of container allocation • Good for • Size: Optimizing memory usage (pools, fixed-size allocators) • Speed: Reducing allocation time (single-threaded, one-time free)
Example Allocators • No heap locking (single thread) • Avoiding fragmentation • Aligned allocations (_aligned_malloc) • Fixed-size allocations • Custom free list • Debugging • Custom heap • Specific memory type
The Bad • No realloc() • Requires advanced C++ compilers • C++ Standard hand-waving • Generally library-specific • If you change STL libraries you may need to rewrite allocators • Generally not cross-platform • If you change compilers you may need to rewrite allocators
The Ugly • Not quite real objects • Allocators with state may not work as expected • Gnarly syntax • map<int,char> m; • map<int,char,less<int>, MyAlloc<pair<const int,char> > > m;
Pause to Reflect • “Premature optimization is the root of all evil” – Donald Knuth • Allocators are a last resort and low-level optimization • Especially for games, allocators can be the perfect optimization • Written correctly, they can be introduced w/o many code changes
Writing Your First Allocator • Create MyAlloc.h • #include <memory> • Copy or derive from the default allocator • Rename “allocator” to “MyAlloc” • Resolve any helper functions • Replace some code with your own
Writing Your First Allocator • Demo • Visual C++ Pro 7.0 (13.00.9466) • Dinkumware STL (V3.10:0009) • 933MHz PIII w/ 512MB • Windows XP Pro 2002 • Launch Visual Studio
Two key functions • Allocate • Deallocate • That’s all!
Conventions template< typename T > class allocator { typedef size_t size_type; typedef T* pointer; typedef const T* const_pointer; typedef T value_type; };
Allocate Function • pointer allocate( size_type n, allocator<void>::const_pointer p = 0) • n is the number of items T, NOT bytes • returns pointer to enough memory to hold n * sizeof(T) bytes • returns raw bytes; NO construction • may throw an exception (std::bad_alloc) • default calls ::operator new • p is optional hint; avoid
Deallocate function • void deallocate( pointer p, size_type n ) • p must come from allocate() • p must be raw bytes; already destroyed • n must match the n passed to allocate() • default calls ::operator delete(void*) • Most implementations allow and ignore NULL p; you should too
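The two functions above can be sketched as a complete, minimal allocator. This is an illustration under assumptions, not the code from the talk: it relies on `std::allocator_traits` (C++11 and later) to supply the members it omits, whereas a compiler of the talk's era would also need `construct`, `destroy`, `address`, `max_size` and the reference typedefs. The counters `g_allocCalls` and `g_deallocCalls` and the function `allocationsBalance` are hypothetical names added only so the behavior can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Hypothetical counters, present only to make the example observable.
static std::size_t g_allocCalls = 0;
static std::size_t g_deallocCalls = 0;

template <typename T>
class MyAlloc {
public:
    typedef T value_type;
    typedef T* pointer;
    typedef const T* const_pointer;
    typedef std::size_t size_type;
    typedef std::ptrdiff_t difference_type;

    template <typename U> struct rebind { typedef MyAlloc<U> other; };

    MyAlloc() {}
    template <typename U> MyAlloc(const MyAlloc<U>&) {}

    // n counts objects of type T, not bytes. The memory is raw:
    // no constructors run here. ::operator new throws std::bad_alloc.
    pointer allocate(size_type n) {
        ++g_allocCalls;
        return static_cast<pointer>(::operator new(n * sizeof(T)));
    }

    // p must come from allocate(); its objects are already destroyed.
    // A null p is accepted and ignored.
    void deallocate(pointer p, size_type /*n*/) {
        if (p == 0) return;
        ++g_deallocCalls;
        ::operator delete(p);
    }
};

// Stateless allocator: every instance is interchangeable.
template <typename T, typename U>
bool operator==(const MyAlloc<T>&, const MyAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const MyAlloc<T>&, const MyAlloc<U>&) { return false; }

// Fills a vector and a list, then checks that every allocation was released.
bool allocationsBalance() {
    g_allocCalls = 0;
    g_deallocCalls = 0;
    {
        std::vector<int, MyAlloc<int> > v;
        for (int i = 0; i < 100; ++i) v.push_back(i);
        assert(v.size() == 100);
        std::list<int, MyAlloc<int> > l(v.begin(), v.end());
    }
    return g_allocCalls > 0 && g_allocCalls == g_deallocCalls;
}
```

Calling `allocationsBalance()` is a cheap first test for any replacement body you put inside `allocate` and `deallocate`.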
A Custom Allocator • Demo • That’s it! • Not quite: the devil is in the details • Construction • Destruction • Example STL container code • Rebind
Construction • allocate() doesn’t call constructors • Why? Performance • Allocators provide construct function void construct(pointer p, const T& t) { new( (void*)p ) T(t); } • Placement new • Doesn’t allocate memory • Calls copy constructor
Destruction • deallocate() doesn’t call destructors • Allocators provide a destroy function void destroy( pointer p ) { ((T*)p)->~T(); } • Direct destructor invocation • Doesn’t deallocate memory • Calls destructor
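The separation between memory and object lifetime can be seen in a small sketch. The type `Tracked` and the helpers `rawConstruct`, `rawDestroy` and `lifecycleDemo` are hypothetical names invented for this illustration; the helpers mirror the bodies of `construct` and `destroy` shown above as free functions.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical type that counts how many instances are alive.
struct Tracked {
    static int live;
    std::string name;
    explicit Tracked(const std::string& n) : name(n) { ++live; }
    Tracked(const Tracked& other) : name(other.name) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Placement new: runs the copy constructor at p, allocates nothing.
template <typename T>
void rawConstruct(T* p, const T& t) { new (static_cast<void*>(p)) T(t); }

// Direct destructor call: ends the object's lifetime, frees nothing.
template <typename T>
void rawDestroy(T* p) { p->~T(); }

bool lifecycleDemo() {
    Tracked source("a");                          // one live object
    void* raw = ::operator new(sizeof(Tracked));  // raw bytes only
    Tracked* p = static_cast<Tracked*>(raw);
    bool ok = (Tracked::live == 1);               // allocation built nothing

    rawConstruct(p, source);
    ok = ok && (Tracked::live == 2) && (p->name == "a");

    rawDestroy(p);
    ok = ok && (Tracked::live == 1);              // memory is still owned

    ::operator delete(raw);                       // now release the bytes
    return ok;
}
```

After `lifecycleDemo()` returns, `Tracked::live` is back to zero because `source` has gone out of scope.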
Example: Vector template< typename T, typename A > class vector { A a; // allocator pointer pFirst; // first object pointer pEnd; // 1 beyond end pointer pLast; // 1 beyond last };
Example: Reserve void vector::reserve( size_type n ) { pointer p = a.allocate( n, 0 ); // loop on a.construct() to copy // loop on a.destroy() to tear down a.deallocate( pFirst, capacity() ); pLast = p + size(); pEnd = p + n; pFirst = p; }
Performance is paramount • Reserve • Single allocation • Doesn’t default construct anything • Deals properly with real objects • No memcpy • Copy constructs new objects • Destroys old objects • Single deallocation
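The reserve sequence described above can be sketched as a compilable miniature. `MiniVec` and `reserveDemo` are hypothetical names, and the code makes several simplifying assumptions: it goes through `std::allocator_traits` (C++11) rather than calling the allocator's members directly, it assumes the allocator's pointer type is a plain `T*`, and it has no exception safety if a copy constructor throws.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

template <typename T, typename A = std::allocator<T> >
class MiniVec {
    typedef std::allocator_traits<A> Traits;
    A a;        // allocator
    T* pFirst;  // first object
    T* pLast;   // 1 beyond last object
    T* pEnd;    // 1 beyond end of storage
public:
    MiniVec() : pFirst(0), pLast(0), pEnd(0) {}
    MiniVec(const MiniVec&) = delete;             // copying omitted for brevity
    MiniVec& operator=(const MiniVec&) = delete;
    ~MiniVec() {
        for (T* q = pFirst; q != pLast; ++q) Traits::destroy(a, q);
        if (pFirst) Traits::deallocate(a, pFirst, capacity());
    }

    std::size_t size() const { return static_cast<std::size_t>(pLast - pFirst); }
    std::size_t capacity() const { return static_cast<std::size_t>(pEnd - pFirst); }
    const T& operator[](std::size_t i) const { return pFirst[i]; }

    void reserve(std::size_t n) {
        if (n <= capacity()) return;
        const std::size_t count = size();
        T* p = Traits::allocate(a, n);            // single allocation, raw memory
        for (std::size_t i = 0; i < count; ++i)   // copy construct new objects
            Traits::construct(a, p + i, pFirst[i]);
        for (std::size_t i = 0; i < count; ++i)   // destroy old objects
            Traits::destroy(a, pFirst + i);
        if (pFirst) Traits::deallocate(a, pFirst, capacity());  // single deallocation
        pFirst = p;
        pLast = p + count;
        pEnd = p + n;
    }

    // Note: t must not refer to an element of this container.
    void push_back(const T& t) {
        if (pLast == pEnd) reserve(capacity() ? capacity() * 2 : 4);
        Traits::construct(a, pLast, t);
        ++pLast;
    }
};

bool reserveDemo() {
    MiniVec<std::string> v;
    v.push_back("alpha");
    v.push_back("beta");
    v.reserve(64);
    assert(v.size() == 2);
    return v.capacity() == 64 && v[0] == "alpha" && v[1] == "beta";
}
```

Saving `size()` in `count` before the pointers change matters: once `pFirst` is reassigned, `size()` no longer describes the old contents.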
Rebind • Allocators don’t always allocate T list<Obj> ObjList; // allocates nodes • How? Rebind template<typename U> struct rebind { typedef allocator<U> other; }; • To allocate an N given type T Alloc<T> a; T* t = a.allocate(1); // allocs sizeof(T) Alloc<T>::rebind<N>::other na; N* n = na.allocate(1); // allocs sizeof(N)
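A sketch of rebind in use, roughly the way a list might obtain a node allocator from the element allocator it was given. `Node`, `g_lastBytes` and `rebindDemo` are hypothetical names for this illustration, and the node layout is an assumption rather than any vendor's actual one. Inside a template the rebound type has to be spelled with `typename` and `template`, since it depends on `T`.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical probe: records the byte size of the most recent allocation.
static std::size_t g_lastBytes = 0;

template <typename T>
struct Alloc {
    typedef T value_type;
    template <typename U> struct rebind { typedef Alloc<U> other; };

    Alloc() {}
    template <typename U> Alloc(const Alloc<U>&) {}  // lets Alloc<N> be built from Alloc<T>

    T* allocate(std::size_t n) {
        g_lastBytes = n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

// Hypothetical list node: what a list<T> really allocates.
template <typename T>
struct Node {
    Node* prev;
    Node* next;
    T value;
};

template <typename T>
bool rebindDemo() {
    Alloc<T> a;
    T* t = a.allocate(1);                      // allocates sizeof(T)
    bool ok = (g_lastBytes == sizeof(T));
    a.deallocate(t, 1);

    // Dependent name: needs both 'typename' and 'template'.
    typedef typename Alloc<T>::template rebind< Node<T> >::other NodeAlloc;
    NodeAlloc na(a);                           // template copy constructor
    Node<T>* n = na.allocate(1);               // allocates sizeof(Node<T>)
    ok = ok && (g_lastBytes == sizeof(Node<T>));
    na.deallocate(n, 1);
    return ok;
}
```

In C++11 and later, `std::allocator_traits<A>::rebind_alloc<U>` names the same type and also works for allocators that omit the nested `rebind`.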
Allocator Pitfalls • To Derive or Not to Derive • State • Copy ctor and template copy ctor • Allocator comparison • Syntax issues • Testing • Case Study
To Derive or Not To Derive • Deriving from std::allocator • Dinkumware derives (see <xdebug>) • Must provide rebind, allocate, deallocate • Less code; easier to see differences • Writing from scratch • Allocator not designed as base class • Josuttis and Austern write from scratch • Better understanding • Personal preference
Allocators with State • State = allocator member data • Default allocator has no data • C++ Std says (paraphrasing 20.1.5): • Vendors encouraged to support allocators with state • Containers may assume that all instances of an allocator type are interchangeable and compare equal, i.e. that allocators don’t have state
State Recommendations • Be aware of compatibility issues across STL vendors • list::splice() or C::swap() will indicate if your vendor supports stateful allocators • Dinkumware: yes • STLport: no • Test carefully
State Implications • Container size increase • Must provide allocator: • Constructor(s) • Default may be private if parameters required • Copy constructor • Template copy constructor • Global comparison operators (==, !=) • No assignment operators required • Avoid static data; generates one per T
Heap Allocator Example template< typename T > class Halloc { Halloc(); // could be private explicit Halloc( HANDLE hHeap ); Halloc( const Halloc& ); // copy template< typename U > // templatized Halloc( const Halloc<U>& ); // copy };
Template Copy Constructor • Can’t see private data template< typename U > Halloc( const Halloc<U>& a ) : m_hHeap( a.m_hHeap ) {} // error • Solutions • Provide public data accessor function • Or allow access to other types U template <typename U> friend class Halloc;
Allocator comparison • Example template< typename T, typename U > bool operator==( const Alloc<T>& a, const Alloc<U>& b ) { return a.state == b.state; } • Provide both == and != • Should be global functions, not members • May require accessor functions
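The pieces a stateful allocator needs (constructor, copy constructor, template copy constructor, cross-type friendship and global comparison operators) fit together roughly as follows. This sketch is portable and so departs from the slides: the `Heap` struct is a hypothetical stand-in for the Win32 `HANDLE`, the memory still comes from `::operator new`, and `statefulDemo` is an invented name.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Hypothetical stand-in for a heap handle; it only tracks bytes in use.
struct Heap {
    std::size_t bytesInUse;
    Heap() : bytesInUse(0) {}
};

template <typename T>
class Halloc {
public:
    typedef T value_type;
    template <typename U> struct rebind { typedef Halloc<U> other; };

    explicit Halloc(Heap* heap) : m_heap(heap) {}
    Halloc(const Halloc& a) : m_heap(a.m_heap) {}      // copy
    template <typename U>
    Halloc(const Halloc<U>& a) : m_heap(a.m_heap) {}   // templatized copy

    T* allocate(std::size_t n) {
        m_heap->bytesInUse += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) {
        m_heap->bytesInUse -= n * sizeof(T);
        ::operator delete(p);
    }

    Heap* heap() const { return m_heap; }              // public accessor

private:
    // Without this, Halloc<T> cannot read Halloc<U>::m_heap.
    template <typename U> friend class Halloc;
    Heap* m_heap;
};

// Global, not member: two allocators are equal when they share a heap,
// meaning memory from one may be released through the other.
template <typename T, typename U>
bool operator==(const Halloc<T>& a, const Halloc<U>& b) { return a.heap() == b.heap(); }
template <typename T, typename U>
bool operator!=(const Halloc<T>& a, const Halloc<U>& b) { return !(a == b); }

bool statefulDemo() {
    Heap h1, h2;
    Halloc<int> a(&h1);
    Halloc<double> b(a);          // template copy constructor keeps the heap
    Halloc<int> c(&h2);
    bool ok = (a == b) && (a != c);
    {
        std::list<int, Halloc<int> > l(a);   // the list rebinds a to its node type
        l.push_back(1);
        ok = ok && (h1.bytesInUse > 0) && (h2.bytesInUse == 0);
    }
    assert(h1.bytesInUse == 0);   // every node went back to the same heap
    return ok;
}
```

The list is built from the named allocator `a` rather than a temporary, because `l( Halloc<int>(&h1) )` would be parsed as a function declaration.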
Syntax: Typedefs • Prefer typedefs • Offensive list< int, Alloc< int > > b; • Better // .h typedef Alloc< int > IAlloc; typedef list< int, IAlloc > IntList; // .cpp IntList v;
Syntax: Construction • Containers accept allocators via ctors IntList b( IAlloc( x,y,z ) ); • If none specified, you get the default IntList b; // calls IAlloc() • Map/multimap requires pairs Alloc< pair< const K,T > > a; map< K, T, less<K>, Alloc< pair< const K,T > > > m( less<K>(), a );
Syntax: Other Containers • Container adaptors accept containers via constructors, not allocators Alloc<T> a; deque< T, Alloc<T> > d(a); stack< T, deque<T,Alloc<T> > > s(d); • String example Alloc<T> a; basic_string< T, char_traits<T>, Alloc<T> > s(a);
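The constructor forms from the last three slides can be exercised together in one sketch. `CountingAlloc`, the typedef names and `syntaxDemo` are hypothetical; the allocator simply counts calls through a caller-supplied counter so that each container's use of it is visible. The map allocator is declared on `pair<const K,T>` because that is the map's actual `value_type`, and some current library implementations reject a mismatch.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <map>
#include <new>
#include <stack>
#include <string>
#include <utility>

template <typename T>
class CountingAlloc {
public:
    typedef T value_type;
    template <typename U> struct rebind { typedef CountingAlloc<U> other; };

    explicit CountingAlloc(std::size_t* counter) : m_counter(counter) {}
    template <typename U>
    CountingAlloc(const CountingAlloc<U>& a) : m_counter(a.counter()) {}

    T* allocate(std::size_t n) {
        ++*m_counter;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
    std::size_t* counter() const { return m_counter; }

private:
    std::size_t* m_counter;
};

template <typename T, typename U>
bool operator==(const CountingAlloc<T>& a, const CountingAlloc<U>& b) {
    return a.counter() == b.counter();
}
template <typename T, typename U>
bool operator!=(const CountingAlloc<T>& a, const CountingAlloc<U>& b) {
    return !(a == b);
}

// Typedefs keep the declarations readable.
typedef CountingAlloc<int> IAlloc;
typedef std::deque<int, IAlloc> IntDeque;
typedef std::stack<int, IntDeque> IntStack;
typedef CountingAlloc<std::pair<const int, char> > MapAlloc;
typedef std::map<int, char, std::less<int>, MapAlloc> CharMap;
typedef CountingAlloc<char> CAlloc;
typedef std::basic_string<char, std::char_traits<char>, CAlloc> CString;

// Returns how many allocations the containers requested.
std::size_t syntaxDemo() {
    std::size_t count = 0;

    MapAlloc ma(&count);
    CharMap m(std::less<int>(), ma);   // map: comparator first, then allocator
    m.insert(std::make_pair(1, 'a'));

    IAlloc ia(&count);
    IntDeque d(ia);                    // container takes the allocator
    IntStack s(d);                     // adaptor takes the container
    s.push(42);
    assert(s.top() == 42);

    CAlloc ca(&count);
    CString str(ca);
    str.assign(100, 'x');              // long enough to need heap storage
    return count;
}
```

At least three allocations are expected: one map node, the deque's storage and the string's buffer. The exact total is implementation-specific.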
Testing • Test the normal case • Test with all containers (don’t forget string, hash containers, stack, etc.) • Test with different objects T, particularly those w/ non-trivial dtors • Test edge cases like list::splice • Verify that your version is better! • Allocator test framework: www.tantalon.com/pete.htm
Case Study • In-place allocator • Hand off existing memory block • Dole out allocations from the block • Never free • Example usage typedef InPlaceAlloc< int > IPA; void* p = malloc( 1024 ); list< int, IPA > x( IPA( p, 1024 ) ); x.push_back( 1 ); free( p ); • View code
In-Place Allocator • Problems • Fails w/ multiple concurrent copies • No copy constructor • Didn’t support comparison • Didn’t handle containers of void* • Correct implementation • Reference counted • Copy constructor implemented • Comparison operators • Void specialization
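A simplified sketch of the in-place idea, written to avoid the copy problem listed above. It is not the talk's implementation and swaps in a different technique: instead of reference counting, every allocator copy points at a caller-owned `Arena` record, so all copies share one cursor. `Arena`, its fields and `inPlaceDemo` are hypothetical names; the code assumes C++11 (`alignof`) and a block at least as aligned as `malloc` guarantees, and it omits the `void` specialization.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <list>
#include <new>

// Hypothetical caller-owned bookkeeping for the block being handed off.
struct Arena {
    char* base;
    std::size_t size;
    std::size_t used;
    Arena(void* p, std::size_t n) : base(static_cast<char*>(p)), size(n), used(0) {}
};

template <typename T>
class InPlaceAlloc {
public:
    typedef T value_type;
    template <typename U> struct rebind { typedef InPlaceAlloc<U> other; };

    explicit InPlaceAlloc(Arena* arena) : m_arena(arena) {}
    template <typename U>
    InPlaceAlloc(const InPlaceAlloc<U>& o) : m_arena(o.arena()) {}

    // Dole out the next suitably aligned slice of the block.
    T* allocate(std::size_t n) {
        const std::size_t align = alignof(T);
        const std::size_t start = (m_arena->used + align - 1) / align * align;
        const std::size_t bytes = n * sizeof(T);
        if (start > m_arena->size || bytes > m_arena->size - start)
            throw std::bad_alloc();            // block exhausted
        m_arena->used = start + bytes;
        return reinterpret_cast<T*>(m_arena->base + start);
    }

    // Never free: the owner releases the whole block at once.
    void deallocate(T*, std::size_t) {}

    Arena* arena() const { return m_arena; }

private:
    Arena* m_arena;
};

template <typename T, typename U>
bool operator==(const InPlaceAlloc<T>& a, const InPlaceAlloc<U>& b) {
    return a.arena() == b.arena();
}
template <typename T, typename U>
bool operator!=(const InPlaceAlloc<T>& a, const InPlaceAlloc<U>& b) {
    return !(a == b);
}

// Returns the number of bytes the list consumed from the block.
std::size_t inPlaceDemo() {
    void* p = std::malloc(1024);
    assert(p != 0);
    std::size_t used = 0;
    {
        Arena arena(p, 1024);
        InPlaceAlloc<int> a(&arena);
        std::list<int, InPlaceAlloc<int> > x(a);
        x.push_back(1);
        x.push_back(2);
        used = arena.used;
    }                                          // list is destroyed first
    std::free(p);                              // then the block is released
    return used;
}
```

The inner scope makes sure the list is gone before the block is freed, since destroying the list still walks nodes that live inside the block.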
In-Place Summary • Speed • Scenario: add x elements, remove half • About 50x faster than default allocator! • Advantages • Fast; no overhead; no fragmentation • Whatever memory you want • Disadvantages • Proper implementation isn’t easy • Limited use
Recommendations • Allocators: a last resort optimization • Base your allocator on <memory> • Beware porting issues (both compilers and STL vendor libraries) • Beware allocators with state • Test thoroughly • Verify speed/size improvements
Recommendations part II • Use typedefs to simplify life • Don’t forget to write • Rebind • Copy constructor • Templatized copy constructor • Comparison operators • Void specialization
References • C++ Standard sections 20.1.5, 20.4.1 • Your STL implementation: <memory> • GDC Proceedings: References section • Game Programming Gems 3 • pkisensee@msn.com • www.tantalon.com/pete.htm