Wednesday, April 20, 2011

I want to make my own Malloc

I want to make my own malloc/free so I can make a precise copying allocator.

Any gurus have any tips and suggestions?

I have a few questions for now:

  1. Should I just malloc large chunks of memory, and then distribute from that so I don't have to call the system calls?
  2. How are copying collectors usually done? I would imagine that this part is a bit complicated to do efficiently. My naive implementation would just malloc a block the size of the remaining objects which would require 2x the space.
From stackoverflow
  • There's rather a lot of good literature on implementing malloc and similar things. but I notice that you include C++ here -- are you aware that you can write your own implementation of new and delete in C++? That might be useful as a way to do it easily.

    In any case, the characteristics you want are going to depend pretty heavily on your workload, that is, on the pattern of usage over time. If you have only mallocs and new frees, it's easy, obviously. If you have only mallocs of one, or a few different, block sizes, that's also simple.

    In other languages, you get some leverage by having the ability to chain memory together, but C isn't that smart.

    The basic implementation of malloc simply allocates a header that contains the data length, an "in use flag", and the malloced memory. Malloc then constructs a new header at the end of its space, allocates the memory, and returns a pointer. When you free, it just resets the in use flag.

    The trick is that when you do a lot of mallooc and free, you can quickly get a lot of small blobs that aren't in use, but are hard to allocate. So you need some kind of bumpo gc to merge blocks of memory.

    You could do a more complicated gc, but remember that takes time; you don't want a free to take up a lot of time.

    There's a nice paper on Solaris malloc implementations you might find interesting. Here's another on building an alternative malloc, again in Solaris, but the basics are the same. And you should read the Wikipedia article on garbage collection, and follow it to some of the more formal papers.

    Update

    You know, you really should have a look at generational garbage collectors. The basic idea is that the longer something remains allocated, the more likely is it to stay allocated. This is an extension of the "copying" GC you mention. Basically, you allocate new stuff in one part of your memory pool, call it g0. When you reach a high water mark on that, you look through the allocated blocks and copy the ones that are still in use to another section of memory, call it g1, Then you can just clear the g0 space and start allocating there. Eventually g1 gets to its high water mark and you fix that by clearing g0, and clean up g1 moving stuff to g0, and when you're done, you rename the old g1 as g0 and vice versa and continue.

    The trick is that in C especially, the handles you hand out to malloc'ed memory are straight raw pointers; you can't really move things around without some heap big medicine.

    Second update

    In comments, @unknown asks "Wouldn't moving stuff around just be a memcpy()". And indeed it would. but consider this timeline:

    warning: this is not complete, and untested, just for illustration, for entertainment only, no warranty express or implied

    /* basic environment for illustration*/
    void * myMemoryHdl ;
    unsigned char lotsOfMemory[LOTS]; /* this will be your memory pool*/
    

    You mallocate some memory

    /* if we get past this, it succeded */
    if((myMemoryHdl = newMalloc(SIZE)) == NULL)
        exit(-1);
    

    In your implementation of malloc, you create the memory and return a pointer to the buffer.

    unsigned char * nextUnusued = &lotsOfMemory[0];
    int partitionSize = (int)(LOTS/2);
    int hwm = (int) (partition/2);
    /* So g0 will be the bottom half and g1 the top half to start */
    unsigned char * g0 = &lotsOfMemory[0];
    unsigned char * g1 = &lotsOfMemory[partitionSize];
    
    
    void * newMalloc(size_t size){
       void * rtn ;
       if( /* memory COMPLETELY exhausted */)
          return NULL;
       /* otherwise */
       /* add header at nextUnused */
       newHeader(nextUnused);     /* includes some pointers for chaining
                                   * and a field with values USED or FREE, 
                                   * set to USED */
       nextUnused += HEADERLEN ;  /* this could be niftier */
       rtn = nextUnused ;
       nextUnused += size ;
    }
    

    Some of the things are freed

      newFree(void * aHandle){
         *(aHandle-offset) = FREE ; /* set the flag in the header, 
                                     * using an offset. */
      }
    

    So now you do all the stuff and you get to your high water mark.

     for( /* each block in your memory pool */ )
        if( /* block header is still marked USED */ ) {
            memcpy(/* block into other partition */);
        }
     /* clear the partition */
     bzero(g0, partitionSize);
    

    Now, go back to the original handle you saved in myMemHdl. What does it point to? (Answer, you just set it to 0x00 with bzero(3).)

    That's where the magic comes in. In C at least, the pointer you returned from your malloc is no longer under your control -- you can't move it around after the fact. In C++, with user-defined pointer-like types, you can fix that.

    Unknown : Since malloc implementations usually have a header with the data length, then wouldn't a movement simply involve a memcpy?
    Charlie Martin : Yup. But now, where is the pointer you gave the caller to malloc going to point?
  • Don't use idea #1, you'd be wasting a lot of resources that might potentially remain unused for the entire execution of an application, all the while stopping other processes from utilising it (depending on how much you plan to take).

    David Thornley : Many OSs work on the basis that you only use memory if you actually use it, so staking out a large chunk of address space is harmless per se. In general, if writing your own allocator, you do want to limit system calls from it.
    Unknown : @ David, do you have more information on that comment? I wonder if its true.
    janneb : Yes, it is true. Look up how virtual memory works, or specifically how OS's use 'overcommit'.
    el.pescado : Moreover, operating systems can allocate process only whole pages of memory. Most commonly, page size is 4kb. So, even if you want to allocate a single byte, OS has to give you 4kb (one page).
  • It'd be helpful to know a little more about what you're trying to do. Is this for a single type? Or set of relatively uniform types?

    • Memory pools are a tried-and-true technique if you need to speed up allocs for a give type (one possible downside is you can be hopping all over memory, affecting cache performance)
    • Application-wide allocation techniques usually include rounding all allocs up to profile-driven bucket sizes so that you keep fragmentation down.
    • Multi-threaded apps usually have per-thread pools that can be alloc'd from without risking contention with other threads

    The doc's for tcmalloc are pretty good at describing the current state of the art for these techniques.

    There isn't too much complicated stuff here, but you're probably reinventing the wheel. There are a few open source libraries that do this.

  • Here is a nice paper on this.

    Dynamic Storage Allocation: A Survey and Critical Review

  • Read about Doug Lea's malloc

    http://g.oswego.edu/dl/html/malloc.html

    Unknown : nice article, upvoted.
    unforgiven3 : Doug Lea was one of my favorite professors at SUNY Oswego. I learned a lot from his classes/books.
    iain : wow how lucky was that. he is a concurrent god.

0 comments:

Post a Comment