Lab 3: Memory management¶
Handed out Wednesday November 3, 2021
Due Wednesday November 17, 2020
Objectives¶
Modify memory layout to move stack to top of address space (70%)
Implement stack growth (30%)
Preliminaries¶
You can get the starter code from the lab2 repository.
Part 1: Changing memory layout¶
Overview¶
In this part, you’ll be making changes to the xv6 memory layout. Sound simple? Well, there are a few tricky details.
Details¶
In xv6, the VM system uses a simple two-level page table. If you do not remember the details, read Section 20.3 of OS 3 easy steps. However, you may find the description in Chapter 1 of the xv6 manual sufficient (and more relevant to the assignment).
The xv6 address space is currently set up like this:
code
stack (fixed-sized, one page)
heap (grows towards the high-end of the address space)
In this part of the xv6 project, you’ll rearrange the address space to look more like Linux:
code
heap (grows towards the high-end of the address space)
... (gap)
stack (at end of address space; grows backwards)
You can see the general map of the kernel memory in memlayout.h
;
the user memory starts at 0 and goes up to KERNBASE
.
Note that we will not be changing the kernel memory layout at all,
only the user memory layout.
Right now, the program memory map is determined by how we load the
program into memory and set up the page table (so that they are
pointing to the right physical pages). This is all implemented in
exec.c as part of the exec system call using the underlying support
provided to implement virtual memory in vm.c
. To change the memory
layout, you have to change the exec code to load the program and
allocate the stack in the new way that we want.
Moving the stack up will give us space to allow it to grow, but it
complicates a few things. For example, right now xv6 keeps track of
the end of the virtual address space using one value (sz). Now you
have to keep more information potentially e.g., the end of the bottom
part of the user memory (i.e., the top of the heap, which is called
brk
in Un*x), and bottom page of the stack.
Once you figure out in exec.c
where xv6 allocates and initializes
the user stack; then, you’ll have to figure out how to change that to
use a page at the high-end of the xv6 user address space, instead of
one between the code and heap.
Some tricky parts: Let me re-emphasize: one thing you’ll have to be
very careful with is how xv6 currently tracks the size of a process’s
address space (currently with the sz
field in the proc
struct).
There are a number of places in the code where this is used (e.g.,
to check whether an argument passed into the kernel is valid; to copy
the address space). We recommend keeping this field to track the size of
the code and heap, but doing some other accounting to track the stack,
and changing all relevant code (i.e., that used to deal with sz
)
to now work with your new accounting. Note that this potentially
includes the shared memory code that you are writing for part 2.
Part 2: Growing the Stack¶
The final item, which is challenging: automatically growing the
stack backwards when needed. Getting this to work will make you into
a kernel boss, and also get you those last 10% of credit. Briefly,
here is what you need to do. When the stack grows beyond its
allocated page(s) it will cause a page fault because it is accessing
an unmapped page. If you look in traps.h, this trap is T_PGFLT
which is currently not handled in our trap handler in trap.c
.
This means that it goes to the default handling of unknown traps,
and causes a kernel panic.
So, the first step is to add a case in trap to handle page faults. For now, your trap handler should simply check if the page fault was caused by an access to the page right under the current top of the stack. If this is the case, we allocate and map the page, and we are done. If the page fault is caused by a different address, we can go to the default handler and do a kernel panic like we did before.
Bonus (5%): Write code to try and get the stack to grow into the heap. Were you able to? If not explain why in detail showing the relevant code.
Hints¶
IMPORTANT Check the Survival guide to walk you through this assignment.
Particularly useful for this project:
Chapter 1 of xv6 + anything else about fork()
and exec()
,
as well as virtual memory.
Take a look at the exec
code in exec.c which loads a
program into memory. It will be using VM functions from vm.c
such as allocuvm
(which uses mappages
). These will be very
instructive for implementing shm_open
(in lab 4) – we are
allocating a new page the first time (similar to allocuvm
) and
adding it to the page table (similar to mappages
).
It may be helpful to try to answer these questions to yourself:
Read chapter 2 in the xv6 book. Briefly explain the operation of
allocuvm()
andmappages()
and Figure 1-2. Check howexec
uses them for an idea.Explain how you would given a virtual address figure out the physical address if the page is mapped otherwise return an error. In other words, how would you find the page table entry and check if its valid, and how would you use it to find the physical address.
Find where in the code we can figure out the location of the stack.
Survival guide¶
From assignment:
Right now, the program memory map is determined by how we load the
program into memory and set up the page table (so that they are
pointing to the right physical pages). This is all implemented in
exec.c
as part of the exec system call using the underlying support
provided to implement virtual memory in vm.c
. To change the memory
layout, you have to change the exec code to load the program and
allocate the stack in the new way that we want.
Specifically, lets start by opening up exec.c
check the exec(...)
function which implements the system call. Exec does the following:
Part 1.¶
Opens the executable file and parses it. The rest of this paragraph is FYI. Typically executable files are made up of a header including information that allows us to index the rest of the file. The file after that consists of sections including the code, the global data, and sometimes other sections like uninitialized data. These are the parts of the memory that we need to initialize from the executable. The header information includes the number of sections, the start of each section in the file, and where it maps to in virtual memory, and the length of each section.
Part 2.¶
Initializes the kernel memory using setupkvm()
which maps the
pages of the kernel to the process address space. We dont really need
to know what happens in here.
Part 3.¶
It then moves on to load the sections of the executable file
into memory using loaduvm()
which creates the memory pages for each
section and maps them to the address space (by initializing the page
table pointers – more details in a bit). These sections in xv6 are
loaded starting at VA 0, and going up. Each new section starts at the
beginning of a new page. Recall that sections include code,
global/static data, etc..
Conveniently, we can then keep track of where the user address space
ends, which also defines the size of the process using one value
(proc->sz
). So, as we map new pages, sz
(rounded up to the
next page) can serve as their virtual address since we are simply
filling in the address space sequentially.
In VM, we map virtual pages to physical frames. XV6 has no swap so
all memory pages have to be in physical memory. Physical memory is
allocated by the kernel allocator kmalloc
. If you want to dig
deeper (you need to for the bonus part) you will see how these pages
are initialized using kinit()
and kinit2()
which are called
during the boot process in main.c
. As a result, we use
kmalloc()
as we request each new page inside vm.c
to allocate
a new physical page.
The vm.c
functions such as allocuvm()
typically follow this up
with a call to mappages
which is used to initialize the page table
entries mapping the virtual address to the physical page that it just
allocated. Otherwise these physical frames that we allocate cannot be
used by our process.
Part 4.¶
At this point, we loaded code and data sections, and its time
to create the stack. xv6 does not support a heap at the moment (there
is no malloc()/free()
available to user programs if you noticed).
It currently maps the stack in its virtual address space at a page right
after the last page we loaded from the executable (i.e., at sz
rounded up to the next page boundary).
Since the stack grows down, allocating a page here means there is no room to grow the stack – as it grows down, it will run into the code/data. To protect against that, xv6 adds one page buffer and marks it as unreadable so that in case the stack grows, we get a memory error and can stop the program. The code to create the stack is:
// round sz up to the next page boundary since stack must start
// in a new page
sz = PGROUNDUP(sz);
// our first introduction to allocuvm; it allocates and maps two pages
if((sz = allocuvm(pgdir, sz, sz + 2*PGSIZE)) == 0)
goto bad;
// we clear the PTE for the first page to create a buffer page
// between stack and code/data
clearpteu(pgdir, (char*)(sz - 2*PGSIZE));
TODO 1: This is the part of the code that we need to change to move
the stack. The current code calls allocuvm to create two pages,
one for the stack and one as a guard page starting at VA sz
which
is right after the code and data. It then clears the page table entry
for the guard page.
We want to locate the stack starting at the top of the
user address space to give it room to grow. To understand what we
need to do, lets look at allocuvm
.
It takes 3 parameters:
The page table (
pgdir
). This will not changeThe virtual address of the first page we are mapping – this needs to change to point to the top page of the user part of memory (right under
KERNBASE
). If you usekernbase
, you will try to map the page to the kernel address space.The virtual address of the last page we are mapping. For us, we are creating a stack with only a single page, so this can another address in the same page, slightly bigger than the first address.
allocuvm
allocates the page, and maps it to the page table. So,
basically we are done with moving the stack by just changing these
parameters to the right value. However, there are a few loose ends to
tie up.
Part 5.¶
Finally, we initialize the stack pointer, currently to sz
.
TODO 2: you will have to change this to the address of the top word in
the stack page. Note that KERNBASE
is the first word in the kernel
address space, so this is the word right under that.
We proceed to initialize the stack with the parameters for main as per the linux/x86 stack convention. The details are not important to us for now.
Loose Ends/Other changes.¶
Now that we moved the stack, a few places in the Kernel that hard coded the previous location of the stack have to be changed. These include:
TODO 3: All of the functions that are defined in syscall.c
(and sysfile.c
) for accessing the user stack have some checks to
see if the addresses are indeed on the stack. These checks compare
the address against sz
since that was the top of the stack in the
old implementation. You have to change those checks (or remove them
if it is easier). Check all the accessor functions such as
argint, argstr, argptr, argfd,
etc…
TODO 4: copyuvm()
. This function is used as part of fork()
to create a copy of the address space of the parent process calling
fork to the child process. It is implemented in vm.c
If you look at this function, it is one big for loop that iterates over the virtual address space and copies the pages one by one. The loop starts with:
for(i = 0; i < sz; i += PGSIZE){
since it assumes the virtual address space starts at 0 and goes to sz
.
Now this has to be changed to take into account the new stack.
If we look deeper, it reads the page table to get the PTE for the
page, allocates a new physical frames, and copies the page from the
parent memory to the new page. Finally it uses mappages
to map this
new copy to the child address space by adding a PTE to its page table.
How do we change it? Now sz
tracks the bottom part of the address
space, so its ok to leave that loop alone. We have to keep track of
the size of the stack, and added another loop that iterates over the
stack page(s) and does the same thing (kmallocs
a page for each one,
memmoves
to create a copy from the parent, and then mapages()
to add it to page table).
The loop will be very similar, with the exception of the virtual
address ranges that iterates over. Before we add stack growth, the
stack is only one page, but as the stack grows we need to keep track
of the number of stack pages. To prepare for this, we need to add a
variable in struct proc to keep track of the size of the stack (in
pages or bytes–either is fine, but I recommend pages). This counter
starts out with a stack of one page; set it in exec()
.
Debugging: If your stack moved correctly, xv6 will be able to boot into shell successfully.
If you don’t allocate/map the stack correctly, you will get errors either in the allocation functions (e.g., remapping errors) or as your program runs (page faults).
If you don’t take care of all the argint()
etc.. functions some of
your system calls will not be able to pass parameters correctly. The
results could be weird. For example, printf won’t print, and wait
wont wait (leading to init continuing to fork processes, etc…)
Growing the stack.¶
Now that our stack has been moved, we have room to grow it. When the a program causes the stack to grow to an offset bigger than one page, at this point, we will be accessing a page that is not allocated/mapped. This will cause a page fault. Basically, we will trap to the trap handler in trap.c
In there there is a switch statement with a case for every supported
trap. We need to add a case for page faults. This page fault has
trap number 14 (or T_PGFLT
) as defined in traps.h
.
TODO 5: Add a case for the page fault. When a page fault occurs,
you can check the address that caused the page fault in a hardware
register called CR2
. The CR register (standing for Control
Register) keep track of important hardware state information.
You can read the CR2
register using the function rcr2()
.
Once you have the offending address, next we need to check if it is
from the page right under the current bottom of the stack. If it is,
we need to grow the stack. You can use allocuvm
again, but you have
to initialize it with the right parameters to allocate one page at the
right place. After that, you can increment your stack size counter,
which finishes your trap handler.
Voila! you should be good to go.
To check if the stack grows correctly, write a recursive program that nests deep enough to get a stack longer than 1 page. You should get a page fault and grow the stack correctly if your implementation works.