Lab-4: Malloc lab
Introduction
In this lab you will be writing a dynamic storage allocator for C programs, i.e., your own version of the malloc, free and realloc routines. You are encouraged to explore the design space creatively and implement an allocator that is correct, efficient and fast.This is an individual lab.
Obtaining the lab
First, click on Lab4's github classroom invitation link (posted on Campuswire) and select your NYU NetID. Next, clone your repo by typing the following
$ cd cso-labs $ git clone git@github.com:nyu-cso-sp26/lab4-<YourGithubUsername>.git lab4
This lab's files are located in the lab4/ subdirectory.
Files you can modify
The only files you should modify for this lab are mm-implicit.c. If you are doing the bonus part of this lab, you'll also modify mm-buddy.c.
Malloc API
The API of your malloc library consists of the following 5 functions (declared in mm-common.h):int mm_init(void); void* mm_malloc(size_t size); void mm_free(void *ptr); void* mm_realloc(void *ptr, size_t size); heap_info_t mm_checkheap(bool verbose);Note that we use the prefix mm_* in the above APIs to distinguish our custom malloc implementation from those come with the standard C library. The API semantics are as follows:
mm_init: This function is called to initialize your malloc library prior to any any mm_malloc and mm_free calls. In this lab, the given tester programs (mm_unittest.c, mdriver.c) invoke mm_init. The return value is -1 if there was a problem in performing the initialization, 0 otherwise.
mm_malloc: This function allocates size bytes of contiguous space on the heap. It returns a pointer to the allocated space. The allocated space should lie within the heap region and should not overlap with any other allocated space.
We will comparing your implementation to the version of malloc supplied in the standard C library. Since the standard C library malloc always returns payload pointers that are aligned to 16 bytes, your malloc implementation should do likewise and always return 16-byte aligned pointers.
mm_free: This function frees the corresponding chunk whose payload is pointed to by ptr. It returns nothing. This function assumes that the pointer ptr was returned by an earlier call to mm_malloc or mm_realloc and has not yet been freed.
mm_realloc: This function allocates size bytes of contiguious space, with the following constraints.
- if ptr is NULL, the call is equivalent to mm_malloc(size)
- if size is equal to zero, the call is equivalent to mm_free(ptr)};
- if ptr is not NULL, it must have been returned by an earlier call to mm_malloc or mm_realloc. The call to mm_realloc changes the memory block pointed to by ptr (i.e.the old block) to one with size bytes and returns the address of the new block. Notice that the address of the new block might be the same as the old block, or it might be different, depending on your implementation, the amount of internal fragmentation in the old block, and the size of the realloc request.
- The contents of the new block are the same as those of the old ptr block, up to the minimum of the old and new sizes (the new size can be either smaller or larger than the old size). Everything else is uninitialized. For example, if the old block is 8 bytes and the new block is 12 bytes, then the first 8 bytes of the new block are identical to the first 8 bytes of the old block and the last 4 bytes are uninitialized. Similarly, if the old block is 8 bytes and the new block is 4 bytes, then the contents of the new block are identical to the first 4 bytes of the old block.
Lastly, mm_checkheap: The mm_cheapheap routine is an auxilary routine that checks the consistency of your heap. The function argument verbose can be used to turn on/off verbose debug printing.
The mm_malloc, mm_free, mm_realloc semantics match those of the C standard library's malloc, realloc, and free routines. Type man malloc for the complete documentation.
When implementing mm_init, mm_free, mm_mallocmm_realloc functions, you need to invoke the following functions which emulate the OS' syscalls, instead of using the real ones (in order to make testing easier). These emulated syscalls are defined in memlib.c and described below:
void *mem_sbrk(int incr): Expands the heap by incr bytes, where incr is a positive non-zero integer and returns a generic pointer to the first byte of the newly allocated heap area. The semantics are identical to the Unix sbrk function, except that mem_sbrk accepts only a positive non-zero integer argument.
void *mem_heap_lo(void): Returns a generic pointer to the first byte in the heap.
void *mem_heap_hi(void): Returns a generic pointer to the last byte in the heap.
Your implementation must observe the following rules:
- You should not invoke any memory-management related library calls or system calls, i.e. you cannot use malloc, calloc, free, realloc, sbrk, brk or any variants of these calls in your code. You may use memcpy and other C library functions that are not related to memory-management. In case of doubt, please check with the staff on Campuswire.
- You should not use any dynamically sized data structures such as dynamically-sized arrays, trees, or lists in your program. Yes, you can define a global array of a fixed number of elements of structs, pointers or other scalars.
- For consistency with the C standard library's malloc package, which returns blocks aligned on 16-byte boundaries, your allocator must always return pointers that are aligned to 16-byte boundaries. The tester will enforce this requirement for you.
Implement implicit list malloc
You are asked to implement an existing simple design, i.e. the implicit list based malloc (with both header and footer).To implement implicit-list, you shall work from the skeleton code given to you in mm-implicit.{c,h}. You should only modify mm-implicit.c while leaving mm-implicit.h unchanged.
In mm-implicit.h, we give you the definition of chunk header (or footer) in the implicit list design. Since the header is 8-byte in size, we'll pad the start of the heap by 8 bytes so that the payloads start at 16-byte aligned boundaries. Since our implementation frequently refers to the size of the header, we set global variable hdr_size = sizeof(header_t), and use hdr_size later in the code.
You shall complete the implicit list implementation in the following 3 stages:
- Complete various helper functions that manipulate the header, including set_size_status, get_size, get_status, set_status, header2footer, payload2header.
- Complete helper function next_chunk_header and prev_chunk_header
- Implement mm_heapcheck.
- Complete helper functions split, first_fit and use them to implement mm_malloc.
- Complete helper functions coalesce and use them to implement mm_free.
- Complete mm_realloc.
After completing each stage of the implementation, make sure that the unit tests corresponding to those functions that you've implemented pass:
$ make $ ./implicit-unittest start test_set_size_status...passed start test_get_size...passed start test_get_status...passed start test_set_status...passed start test_header2footer...passed start test_payload2header...passed start test_next_chunk_header...passed start test_prev_chunk_header...passed start test_first_chunk_header...passed start test_ask_os_for_chunk...passed start test_mm_checkheap...passed start test_split...passed start test_first_fit...passed start test_mm_malloc...passed start test_coalesce_next...passed start test_coalesce_prev...passed start test_mm_free...passed start test_mm_realloc...passed
Now you are ready to run our trace-based tester. The tester program is provided as mdriver.c which is linked to your malloc implementation and tests it for correctness, space utilization, and throughput. We name the tester that's been linked with the implicit-list malloc as implicit-mdriver.
The tester implicit-mdriver reads a set of trace files, each of which contains a sequence of allocate, reallocate, and free events corresponding to some example application workload. It then calls your mm_malloc, mm_realloc, and mm_free routines according to the sequence of events.
To run the tester, type:
$ ./implicit-mdriver -V
The -V turns on verbose printing in the tester. It's unlikely that you'll pass all the tracefiles even though you've already passed the unit tests. To debug, first identify which tracefile caused the error based on the output of `./implicit-mdriver -V`. Then, run the tester on that specific tracefile instead of all of the required tracefiles. For example, suppose you've identified that tracefile named amptjp-bal.rep has caused error. Then, type the following to run the test based on amptjp-bal.rep tracefile alone:
$./implicit-mdriver -V -f traces/amptjp-bal.rep
Note that all the tracefiles reside in the traces/ subdirectory. You can also type $./implicit-mdriver -h to see a full list of command line options.
The next step in debugging is to understand why this tracefile has triggered error in your code. In order to do this, you must understand the sequence of malloc/free/realloc requests that your code has been invoked with according to the tracefile. Open the tracefile, which is a simple ASCII file, using any editor of your choice. For example, below is the content of the first 10 lines of realloc-bal.rep. The comments following // annotate what each line means:
a 0 512 //(m)alloc request 0 demands 512 bytes a 1 128 //(m)alloc request 1 demands 128 bytes r 0 640 //realloc demands request 0 be enlarged to 640 bytes a 2 128 //(m)alloc request 2 demands 128 bytes f 1 //free request 1 r 0 768 //realloc demands request 0 be enlarged to 768 (note this is the 2nd time request 0 has been realloc-ed) a 3 128 //(m)alloc request 3 demaands 128 bytes f 2 //free request 2 r 0 896 //realloc demands request 0 be further enlarged to 896 bytes (3rd time request 0 has been realloc-ed) a 4 128 //(m)alloc request 4 demands 128 bytes ...
Once you know the sequence of malloc/free/realloc requests of a tracefile, then you know what the expected behavior of your heap should be. And you can run your code in gdb to see if the code's actual behavior matches its expected behavior.
Once you got every tracefile to pass correctly, you can test all tracefiles again to make sure everything goes well. Here's the example output if the tester runs successfully.
$ ./implicit-mdriver Using default tracefiles in ./traces/ Performance index = 60.0 * util + 40.0 * (your throughput)/(libc's throughput) 11 out of 11 traces passed, average performance index 48.4 (out of 100.0)You can refer to here to understand how the performance index is calculated. As we expect, the implicit list design works but does not achieve great performance.
Bonus part (15% extra): Implement a buddy system based malloc
For extra challenge after finishing implicit-list malloc, you can implement a buddy system based malloc.
You are supposed to modify only these files: mm-buddy.c mm-buddy-unittest.c. This time, we no longer provide the unit test implementation nor any skeleton code. You should write your own unit test (mm-buddy-unittest.c) in the style of mm-implicit-unittest.c to test the correctness of your implementation in mm-buddy.c. The given makefile compiles your unit test mm-buddy-unittest.c into binary executable buddy-unittest which you can invoke.
Once you finish the implementation, invoke the trace-based tester as follows:
$ ./buddy-mdriver -V
Programming Advice
- Is every block in the free list marked as free?
- If you are coalescing with both previous and next free chunk, there should not be any contiguous free blocks. Is this property always true?
- Is every free block in the free list?
- Do the pointers in the free list point to valid free blocks?
- Do any allocated blocks overlap?
- Do the pointers in a heap block point to valid heap addresses?
- If you decide to do the bonus part, write unit tests (in mm-buddy-unittest.c) in the style of mm-implicit-unittest.c. They might seem tedious, but they are crucial for debugging.
- When testing with trace-based tester, use simple workloads first to debug correctness. We provide two very simple trace files short{1,2}.rep to simplify debugging. Type mdriver -f ./traces/short1.rep to test your memory allocator on one of these workloads first.
- The first 9 traces contain requests to malloc and free. The last 2 traces contain requests for realloc, malloc, and free. We recommend that you start by getting your malloc and free routines working correctly and efficiently on the first 9 traces. Only then should you turn your attention to the realloc implementation. For starters, build mm_realloc on top of your existing mm_malloc and mm_free implementations. But to get really good performance, you will need to build a stand-alone mm_realloc.
- Use gdb. It will help you isolate and identify out of bounds memory references.
- For advanced students, you should consider learning to use a profiler, such as gprof, to understand and optimize the performance of your program.
Heap consistency checker:
Your heap consistency checker mm_checkheap should be thorough. Some examples of what a heap checker might check are:
Evaluation
You will receive zero points if you break any of the rules or your code is buggy and crashes the tester. Otherwise, your grade will be calculated as follows:
Correctness of implicit-list (80 points). To achieve full score, your implicit list implementation must pass both the unit test implicit-unittest and trace-based test implicit-mdriver.
Performance of your implicit-list (20 points). Two performance metrics will be used to evaluate your solution:
Space utilization: The peak ratio between the aggregate amount of memory used by the tester (i.e., allocated via mm_malloc or mm_realloc but not yet freed via mm_free) and the size of the heap used by your allocator. The optimal ratio equals to 1. You should find good policies to minimize fragmentation in order to make this ratio as close as possible to the optimal.
Throughput: The average number of operations completed per second.
The tester program summarizes the performance of your allocator by computing a performance index, which is a weighted sum of the space utilization (U) and throughput (T)
Performance index = 0.6*U + 0.4*min(1, T/T_libc)
where U is your space utilization, T is your throughput, and T_libc is the measured throughput of Standard C library's malloc on your system on the default traces. Throughput is calculated as the number of requests in a given trace divided by how long it takes a malloc library to process all those requests. If your performance index is close enough to our reference implementation, then you get full performance credit.
