
Memory Abstraction

Memory in this topic predominantly refers to RAM. For reference, here is the memory hierarchy you probably saw in CS2100:

heirarchy

RAM (Random Access Memory) is physical memory storage that can be treated as an array of bytes, where each byte has a unique index known as a physical address. It is a contiguous memory region with an interval of consecutive addresses.

physicalMem

We are interested in seeing how we can use it to run processes. A process is a running abstraction of a program: after writing and compiling the program into a.out, we run it (e.g. by pressing enter) and it becomes a process. But before it can run, it needs to be loaded into memory.

illustration

We have here our program that gets compiled into executable code. Once we press 'enter', it has to be loaded into (any free space in) memory.

codeCompile

loadMem

But, where in the memory does the process go? Once the process runs, does the size change?

Generally, there are 2 types of data in a process: transient data (local variables, things we create/delete) and persistent data (global variables). We need to think not just about where this data goes but also about how it can grow and shrink.

Managing Memory

In terms of memory, the OS does the following:

  • Allocate memory space to new process
    • Else cannot run the process!
  • Manage memory space for process
  • Protect memory space of processes from each other
    • Avoid overwriting memory
  • Provide memory related system calls to processes
    • Where do they exist in memory
  • Manage memory space for internal use

The key idea of memory abstraction is presenting a logical interface for memory accesses.

Without Abstraction

Let's consider the case without memory abstraction. Suppose a process directly uses physical addresses, e.g. load $reg1, [1024]. We then need to work out how memory locations are accessed, who has access, whether multiple processes can share the physical memory correctly, and whether the address space can be protected easily.

Something good about using physical addresses is that we don't need any management when loading a process: if process A goes there, it goes there!

noAbsPro

But obviously, there is a problem with this. If a process B comes in and its addresses overlap with A's, what do we do now? We can try something like memory swapping, or, since we still have free space in memory, we can place process B below A and update its addresses (e.g. add 8000 to each).

noAbsCon

Address Relocation

Address relocation involves recalculating the memory references during loading. Since every memory reference needs to be recalculated, this causes slow loading times. It is also not easy to distinguish a memory reference from a normal integer constant. To avoid adding the offset to every instruction at load time, we can try storing the offset somewhere instead.

Base + Limit Registers

On the CPU, we can make use of special registers. Storing the offset in a register means that for every address the process uses, the hardware simply adds whatever is in that register.

The offset is the base (if B starts at 8000, that is the offset), and to protect processes from each other, we have a limit that says how large the process's memory space is. If the process tries to access an address beyond that, there will be an error.

Base register - The base of all memory references. During compilation, all memory references are compiled as offsets from this register. During loading, the base register is initialized to the starting address of the process's memory space.

Limit register - Indicates the range of the memory space of the current process; every memory access is checked against the limit to protect memory space integrity.

baseLimit

If process A tries to access 5001, we know it cannot belong to A. But how is this check done?

To access logical address adr, we calculate actual = base + adr and check that adr < limit. So every memory access incurs this additional check (an additional cost). However, this is fundamentally a very useful idea, and it was later generalized into the segmentation mechanism. It also provides a crude memory abstraction: the address 4096 in process A and in process B no longer refers to the same physical location.
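The check-and-relocate step can be sketched as a small function. This is a minimal sketch of what the hardware does on every access, with hypothetical register values (process B loaded at 8000 with a 4096-byte space).

```python
# Sketch of the base + limit check performed on every memory access.
# base and limit values are hypothetical, chosen to match the running example.

def translate(adr, base, limit):
    """Translate logical address adr to a physical address, or fail."""
    if adr >= limit:                     # access outside the process's space
        raise MemoryError(f"address {adr} out of range (limit {limit})")
    return base + adr                    # relocate relative to the base register

base, limit = 8000, 4096
print(translate(1024, base, limit))      # 9024: logical 1024 relocated by the base
```

Note that the check is on the logical address against the limit, so a process cannot even name an address outside its own space.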

Since the address 4096 is not physical, it can be thought of as logical. The process thinks 4096 is where it put its instruction; processes have their own idea of what memory looks like, and it is up to the OS to figure out where to actually place them.

Logical Address

Embedding actual physical addresses in a program is a bad idea, so we have the concept of a logical address. It is how the process views its memory space, not a physical address in general. A mapping between logical and physical addresses is needed. Each process has a self-contained, independent logical memory space.

load $reg1, [1024]

Contiguous Memory Management

Processes must be in memory during execution; this is the load-store memory execution model. Here, we make a few assumptions:

  • Each process occupies a contiguous memory region
Processes are loaded in one go (process A, then B)
  • The physical memory is large enough to contain one or more processes with complete memory space
    • Both process A and B can be fully fitted in memory

We need to have multiple processes in the memory at the same time to support multitasking (so we can switch from 1 process to another). When physical memory is full, we free up memory by removing terminated processes and swapping blocked process to secondary storage.

Memory Partitioning

A memory partition is the contiguous memory region allocated to a single process. There are 2 allocation schemes here: fixed-size and variable-size partitions. Note that one part of memory is always reserved for the OS; after all, the OS itself needs to be loaded in memory to run.

Fixed-Size Partition

Physical memory is divided into fixed-size partitions, and a process occupies exactly 1 partition. These are easier to keep track of.

fixSize

Since memory sizes are powers of 2, it is easy to partition memory into fixed sizes. As seen from the illustration, we put the processes into the different partitions. But the leftover space within a partition is not enough for another process, causing wasted space. This is known as internal fragmentation: internal because the waste is inside that partition.

Some pros of this scheme are that it is easy to manage and fast to allocate: every free partition is the same, so there is no need to choose.

However, the partition size needs to be large enough to contain the largest processes. Smaller processes will waste memory space causing internal fragmentation.
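The waste is easy to quantify. A tiny sketch, with an invented partition size and invented process sizes, to show how internal fragmentation adds up:

```python
# Internal fragmentation under fixed-size partitions: each process occupies
# a whole partition, so any unused remainder inside it is wasted.
# Partition and process sizes below are made up for illustration.

PARTITION = 1024  # bytes per fixed partition

def internal_fragmentation(process_size):
    """Bytes wasted inside the partition holding this process."""
    return PARTITION - process_size

for size in (1000, 512, 100):
    print(size, "->", internal_fragmentation(size), "bytes wasted")
# 1000 ->  24 bytes wasted
#  512 -> 512 bytes wasted
#  100 -> 924 bytes wasted
```

The smaller the process relative to the partition, the worse the waste, which is exactly why the partition size must be a compromise.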

Variable-Sized Partition

Partitions are created based on the actual size of the process, and the OS keeps track of the occupied and free memory regions, performing splitting and merging when necessary. This is harder to keep track of but offers better memory usage.

dynamicSpace

When a process finishes, it frees memory for other processes to be loaded in. If a gap is not big enough, we may have to move other processes around, leaving a lot of misaligned space. There can be many small gaps which together are big enough to store a process but individually are too small. This is known as external fragmentation.

We have to bring the free space together (remember assumption 1: a process's memory must be contiguous); this is known as compaction. But during compaction, since memory addresses are changing, we cannot run the processes. Stopping them can be a waste of time!
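Compaction itself is conceptually simple: slide every occupied partition down so the free space coalesces into one hole. A minimal sketch, with an invented partition layout:

```python
# Compaction sketch: relocate every occupied partition toward address 0 so
# all free space merges into one hole at the top. Layout is invented.

def compact(partitions):
    """partitions: list of (name, start, size); returns the relocated list."""
    next_free = 0
    moved = []
    for name, _, size in sorted(partitions, key=lambda p: p[1]):
        moved.append((name, next_free, size))   # every base address changes,
        next_free += size                       # which is why processes must stop
    return moved

layout = [("A", 0, 100), ("C", 300, 50), ("E", 500, 80)]
print(compact(layout))  # [('A', 0, 100), ('C', 100, 50), ('E', 150, 80)]
```

Every relocated process's base address changes, which is why nothing can run while compaction is in progress.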

So this scheme is good in the sense that it is flexible and removes internal fragmentation, but it requires the OS to maintain more information and takes more time to locate an appropriate free region.

Allocation Algorithms

Assuming the OS maintains a list of partitions and holes, we need an algorithm to locate a hole for a partition of size N, i.e. to search for a hole of size M >= N. Here, we have several variants:

  • First-Fit: Take the first hole that is large enough
  • Best-Fit: Find the smallest hole that is large enough
  • Worst-Fit: Find the largest hole

To do this, the system has to maintain information about which partitions are occupied and where the gaps are. The chosen hole is then split into N and M-N, where N becomes the new partition and M-N is the leftover space, which is a new hole.

Over time, as this repeats, we can end up with many small holes. Thus when an occupied partition is freed, we should merge it with adjacent holes if possible. By consolidating the holes, we have enough contiguous space when another process comes in. Remember we are doing contiguous memory management: a process has to fit in memory in one piece, not in multiple fragments.

Another way is compaction: move the occupied partitions around to create one consolidated hole. However, this cannot be invoked too frequently as it is very time consuming. While moving things around, we cannot run anything else until the memory is rearranged, so no real work is done.
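The three fit strategies can be sketched as scans over the same hole list. The holes below (start, size) are invented for illustration:

```python
# First-, best- and worst-fit over a list of holes (start, size).
# All three scan the same list; they differ only in which hole they pick.

def first_fit(holes, n):
    """First hole that is large enough."""
    return next((h for h in holes if h[1] >= n), None)

def best_fit(holes, n):
    """Smallest hole that is large enough."""
    fits = [h for h in holes if h[1] >= n]
    return min(fits, key=lambda h: h[1], default=None)

def worst_fit(holes, n):
    """Largest hole overall (if it fits)."""
    fits = [h for h in holes if h[1] >= n]
    return max(fits, key=lambda h: h[1], default=None)

holes = [(100, 50), (300, 200), (700, 120)]
print(first_fit(holes, 100))   # (300, 200): first sufficiently large hole
print(best_fit(holes, 100))    # (700, 120): smallest sufficient hole
print(worst_fit(holes, 100))   # (300, 200): largest hole
```

Best-fit minimizes the leftover sliver, worst-fit keeps leftovers usable, and first-fit is simply the fastest to find.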

First-Fit

The OS maintains a linked list of partition information (where the processes are and which are the gaps).

Illustration

part1

At the start, the free space runs from 256 to 4095, with the first part of memory reserved for the OS (the OS is resident in memory).

part2

Process A comes in and occupies the next part of memory. The linked list is updated.

part3

Process B enters and goes into the first available space, at 768.

part4

When reclaiming space, we mark the node as free rather than deleting it (no need to overwrite the information). So that space is now marked available (F for free, T for taken).

part5

Now process C enters and has 2 candidate holes in memory. But since C is larger than the hole A left behind, the first location that fits is after B.

part6

Process C finishes and its area in memory is marked as available.

As seen, the information starts to form a long list as we go. The longer the list, the more time and effort needed to traverse it, and the blocks on the list are not evenly sized. We also have no sense of when we might find the space we need: how far do we have to go? We need a faster and better mechanism for this.

Buddy System

Buddy memory allocation improves on first-fit by providing an efficient algorithm for partition splitting. It locates a good match of free partition (hole) and handles partition de-allocation and coalescing.

The idea is to look at memory (whose size is a power of 2) and divide it as necessary. A free block is split into halves repeatedly to meet a request; the 2 halves form sibling blocks (buddy blocks). When 2 buddy blocks are both free, they can be merged back into a larger block. Starting from whatever free block is available, we keep dividing by 2 until we have a block that fits the process.

buddy

This gives us faster lookup and easier management. We keep an array A[0..k] where \(2^k\) is the largest allocatable block size. Each array element A[j] is a linked list keeping track of free blocks of size \(2^j\), and each free block is identified by just its starting address. So at the start, when all memory is free, the only index with a free block recorded is the highest one.

In actual implementation, there may be a smallest allocatable block size as well since a block that is too small is not cost effective to manage.

Allocation:

  1. To allocate a block of size N, we find the smallest s such that \(2^s \ge N\).
  2. Access A[s] for a free block
    • If free block exists, remove block from free block list and allocate the block
    • Else, find the smallest r from s+1 to k such that A[r] has a free block B. Split B repeatedly from level r-1 down to s, so that each of A[s..r-1] gains a new free block; then repeat step 2
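The allocation steps above can be sketched with the free lists kept as sets of start addresses. A minimal sketch, assuming a 512-byte memory (k = 9) as in the notes' illustration:

```python
import math

K = 9                                   # largest block is 2^9 = 512 bytes
A = {j: set() for j in range(K + 1)}    # A[j]: start addresses of free 2^j blocks
A[K].add(0)                             # initially one free 512-byte block

def allocate(n):
    """Return the start address of a block of size >= n, or None."""
    s = max(0, math.ceil(math.log2(n)))        # smallest s with 2^s >= n
    r = next((j for j in range(s, K + 1) if A[j]), None)
    if r is None:
        return None                            # no free block is large enough
    addr = A[r].pop()
    while r > s:                               # split repeatedly, recording the
        r -= 1                                 # upper half as free at each level
        A[r].add(addr + 2 ** r)
    return addr

p = allocate(100)   # needs 2^7 = 128: split 512 -> 2 x 256, then 256 -> 2 x 128
print(p, sorted(A[7]), sorted(A[8]))   # 0 [128] [256]
```

After the call, one 128-byte block and one 256-byte block remain free, matching the walkthrough in the illustration below.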

Example

For a process of size N = 100, the closest power of 2 \(\ge 100\) is \(2^7\) (128). So we look at A[7]. If there isn't a free block there, we go to the next highest level, \(2^8\) (256). We split that block into 2 parts of 128 each and place the process in one of them.

Deallocation:

  1. To free a block B, check in A[s] where \(2^s\) == size of B
  2. If buddy C of B exists (and is free)
    • Remove B and C from list
    • Merge B and C to get larger block B'
    • Goto step 1 where B <- B'
  3. Else (if buddy not free)
    • Insert B to list in A[s]
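Deallocation with coalescing can be sketched the same way. This sketch assumes the buddy of a block at addr of size \(2^s\) is found by flipping bit s of the address (addr XOR \(2^s\)), and starts from an invented state where two 128-byte blocks at 0 and 128 are in use out of a 512-byte memory:

```python
K = 9                                   # memory is 2^9 = 512 bytes
A = {j: set() for j in range(K + 1)}    # A[j]: start addresses of free 2^j blocks

def free_block(addr, s):
    """Free a 2^s block at addr, merging with its buddy while possible."""
    while s < K:
        buddy = addr ^ (1 << s)        # flip bit s to get the buddy's address
        if buddy not in A[s]:
            break                      # buddy still in use: stop merging
        A[s].remove(buddy)             # both halves free: merge upward
        addr = min(addr, buddy)
        s += 1
    A[s].add(addr)

# Invented state: blocks at 0 and 128 (size 128) allocated, 256-byte block free.
A[8].add(256)
free_block(128, 7)                     # buddy 0 is still taken, so just listed
free_block(0, 7)                       # 0 and 128 merge into 256 at 0, which
                                       # then merges with 256 into the full 512
print(sorted(A[9]))                    # [0]
```

Each merge moves the block one level up the array, exactly mirroring step 2's "goto step 1" loop.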

But how do we know which blocks are buddies (next to each other)? Observe that a block address A looks like \(xxxx00..00_2\). Splitting it into 2 blocks of half the size gives \(B = xxxx00..00_2\) and \(C = xxxx10..00_2\): the two addresses differ only in the bit corresponding to the new block size. This works because we are storing the start address of each block!

Example

  • \(A = 0 (000000_2)\) , size = 32
  • After splitting, we get
    • \(B = 0 (000000_2)\), Size = 16
    • \(C = 16 (010000_2)\), size = 16

So 2 blocks B and C of size \(2^s\) are buddies if the \(s^{th}\) bits of their addresses are complements of each other and all leading bits above s are the same.
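This bit condition reduces to a single XOR: flipping bit s of a block's address yields its buddy. A tiny sketch using the sizes from the example above:

```python
# The buddy relationship as bit arithmetic: for a block of size 2^s at addr,
# the buddy is addr with bit s flipped, i.e. addr XOR 2^s.

def buddy_of(addr, size):
    """Start address of the buddy block (size must be a power of two)."""
    return addr ^ size

print(buddy_of(0, 16))     # 16: B = 000000, C = 010000 from the example
print(buddy_of(16, 16))    # 0: the relation is symmetric
```

Being a single XOR, the buddy lookup is constant time, which is what makes coalescing in the buddy system so cheap.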

To make it easier to visualize, we can take a look at the illustration

Illustration

buddy1

We begin with free memory of 512

buddy2

A process P of size 100 comes in (the smallest s with \(2^s \ge 100\) is s = 7). Memory is split into 2 blocks of 256, and the array and linked list are updated.

buddy3

One block of 256 is further split into 2 blocks of 128, and P is allocated at address 0. Now the only free blocks are one of 128 and one of 256.

buddy4

Another process Q of size 250 enters. Since there is a free block of 256, Q is allocated at address 256. Now only 1 block of size 128 is free.

buddy5

Process P finishes, and its block is freed and goes back to the list. Since the blocks at 0 and 128 are buddies, we can merge them together.

buddy6

Merging them creates a block of size 256, and we update the array and linked list to point to the block of size 256 at location 0.