J-Format

Now we arrive at the last instruction format, the J-format. Only one instruction in this format is of interest to us: the j instruction. As before, we need to keep our design principles in mind.

Design Principle #2

Keep the common part common

Design Principle #3

Use as much of the part as possible

Design Principle #4

Keep the instruction size uniform

Given that, the only common part is the opcode.

Bits

J-format instructions use the following fields, with the names and numbers of bits shown:

Field   opcode   immediate
Bits    6        26

The immediate field is 26-bit.

Fields Meaning
opcode Specifies the instruction.
immediate The target address to be further processed.

Steps

The steps to assemble J-format instructions can be summarised as follows:

  1. Find the value of opcode.
  2. Compute the number for immediate.
  3. Convert the values to binary.
  4. Combine the fields.
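
To make the last two steps concrete, here is a minimal Python sketch of packing the two fields into a single 32-bit word. The function name `encode_j_format` is our own (hypothetical), and the values passed in (opcode 2 and immediate 2) are simply the ones used in the worked example later in this section.

```python
def encode_j_format(opcode: int, immediate: int) -> int:
    """Pack a 6-bit opcode and a 26-bit immediate into one 32-bit word."""
    if not 0 <= opcode < (1 << 6):
        raise ValueError("opcode must fit in 6 bits")
    if not 0 <= immediate < (1 << 26):
        raise ValueError("immediate must fit in 26 bits")
    # The opcode occupies the top 6 bits, the immediate the low 26 bits.
    return (opcode << 26) | immediate


word = encode_j_format(0x02, 2)
print(f"{word:032b}")   # 00001000000000000000000000000010
print(f"0x{word:08X}")  # 0x08000002
```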

Opcode Values

The value of the opcode field is summarised below:

Operation Hexadecimal Decimal
j 02 02

Immediate Field

For branches, PC-relative addressing was used because we do not need to branch too far away. However, a general jump (j) may need to go anywhere in memory! Think about calling a function[1]: the function can be anywhere.

The ideal case would be to specify a full 32-bit memory address to jump to. Unfortunately, we cannot do this, for a simple reason: 6 of the 32 instruction bits are still needed for the opcode. That leaves only 26 bits for the address, even after removing every other field.

Fortunately, just like with branches, there is an optimisation: jump only to word-aligned addresses, so the last 2 bits are always 00. By assuming that the address always ends with 00, we can leave those bits out. As such, we can now specify 28 bits of the 32-bit address.

This still leaves us with 4 missing bits. How do we get these remaining 4 bits? We cannot simply set them to all 0. To see why, suppose we are currently at address 0xFFFFFFFC, and further assume that the whole program is located at addresses starting from 0xF0000000. Now we have two choices:

  1. The remaining 4 bits are the least significant bits (LSB), and we set them to 0000. Together with the word alignment, this forces the last 6 bits to be 000000.

    • Then we cannot jump to a nearby address such as 0xFFFFFFF0.
    • We also cannot jump to any address for which the last 6 bits are not 0.
  2. The remaining 4 bits are the most significant bits (MSB), and we set them to 0000. This means the address has to be of the form 0000 XXXX XXXX XXXX XXXX XXXX XXXX XX00.

    • Then we cannot jump to anywhere within our program, since our program starts at 1111 0000 0000 0000 0000 0000 0000 0000.

So, the remaining 4 bits cannot simply be all 0. The simplest solution is to take them from the 4 most significant bits of $PC + 4. Although this means we cannot jump to anywhere in memory, it is sufficient most of the time.

Maximum Jump Range

What is the maximum jump range?

A jump can only reach addresses within the current 256MB region; it cannot cross a 256MB boundary. This is equivalent to 256 × 2^20 bytes or, to put it more simply, 2^28 bytes. We can compute this as follows:

  1. The number of bits we can control is 26, which on its own would cover 2^26 bytes.
  2. The last 2 bits are always 00, so we can specify 2^26 words, or 2^28 bytes.
  3. 2^28 bytes = 256MB.
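
The arithmetic in the steps above can be checked with a few lines of Python. This is only a sanity check; the constant names are our own.

```python
IMMEDIATE_BITS = 26   # bits available in the immediate field
BYTES_PER_WORD = 4    # word-aligned targets: the last 2 address bits are 00
MB = 2 ** 20          # one megabyte in bytes

reachable_words = 2 ** IMMEDIATE_BITS
reachable_bytes = reachable_words * BYTES_PER_WORD

print(reachable_bytes == 2 ** 28)  # True
print(reachable_bytes // MB)       # 256
```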

Address Calculation

  • Without bitwise operations:

    $PC' = ($PC + 4) - (($PC + 4) % 268435456) + (immediate × 4)

  • With bitwise operations:

    $PC' = (($PC + 4) & 0xF0000000) | (immediate × 4)

Here, $PC + 4 is the address of the next instruction and $PC' is the target address. The magic number 268435456 is equal to 2^28. Since the magic number can make the calculation harder to follow, it is also illustrated with a diagram below.

[Diagram: Address Calculation]
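
As a quick check that the two formulas agree, here is a small Python sketch of both. The function names are our own, and masking $PC + 4 to 32 bits is an assumption we add so that Python's unbounded integers behave like a 32-bit register.

```python
REGION_SIZE = 1 << 28  # 268435456, the magic number in the first formula


def jump_target_arithmetic(pc: int, immediate: int) -> int:
    """Target address without bitwise operations."""
    next_pc = (pc + 4) % (1 << 32)  # assumption: 32-bit wrap-around
    return next_pc - (next_pc % REGION_SIZE) + immediate * 4


def jump_target_bitwise(pc: int, immediate: int) -> int:
    """Target address with bitwise operations."""
    next_pc = (pc + 4) & 0xFFFFFFFF  # assumption: 32-bit wrap-around
    return (next_pc & 0xF0000000) | (immediate << 2)


# $PC = 20 and immediate = 2 give target address 8.
print(jump_target_arithmetic(20, 2), jump_target_bitwise(20, 2))  # 8 8

# A program living in the 0xF... region keeps its upper 4 bits.
print(hex(jump_target_bitwise(0xF0000100, 0x40)))  # 0xf0000100
```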

Examples

Jump

We simply have to follow the steps, but to find the immediate we need to reverse the address calculation above. We will illustrate that with an example.

Jump
Loop: beq  $t1, $zero, End    # addr: 8  <- jump target
      #op  $rs, $rt  , label
      add  $t0, $t0  , $t2    # addr: 12
      addi $t1, $t1  , -1     # addr: 16
      j    Loop               # addr: 20 <- current PC
End:                          # addr: 24
Field       Decimal   Binary
opcode      2         000010
immediate   2         00000000000000000000000010
  1. We start with the target address which is 8 and convert this to binary: 0000 0000 0000 0000 0000 0000 0000 1000
    • We keep the first 4 bits for comparison later: 0000
    • Since the last 2 bits are always 0, we can simplify this binary to: 00 0000 0000 0000 0000 0000 0000 0010
    • Since there are only 26 bits for immediate, we remove the first 4 bits: 00 0000 0000 0000 0000 0000 0010
  2. We look at the value of $PC + 4 which is 24 and convert this to binary: 0000 0000 0000 0000 0000 0000 0001 1000
    • We are only concerned with the first 4 bits: 0000
  3. Since the first 4 bits from step (2) are the same as the first 4 bits from step (1), we can proceed
    • The immediate value is then simply the 26 bits from step (1): 00 0000 0000 0000 0000 0000 0010
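
The three steps above can be sketched in Python as follows. The function name `j_immediate` and the choice to raise `ValueError` for unreachable targets are our own assumptions; a real assembler may handle such cases differently.

```python
def j_immediate(pc: int, target: int) -> int:
    """Compute the 26-bit immediate for a j at address pc jumping to target."""
    if target % 4 != 0:
        raise ValueError("target must be word-aligned (last 2 bits are 00)")
    next_pc = pc + 4
    # Steps (1) to (3): the first 4 bits of the target and of $PC + 4 must match.
    if (target >> 28) != (next_pc >> 28):
        raise ValueError("target is outside the current 256MB region")
    # Drop the last 2 bits, then keep only the low 26 bits.
    return (target >> 2) & 0x3FFFFFF


imm = j_immediate(pc=20, target=8)
print(imm)            # 2
print(f"{imm:026b}")  # 00000000000000000000000010
```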

If you know the target address is a valid target address (from step (3) above), you can simply get the immediate value as shown in the diagram below:

[Diagram: Immediate]

Given the steps, we can now combine the binaries:

000010 00000000000000000000000010

or more simply:

00001000000000000000000000000010

We can also convert this into hexadecimal by splitting it into 4-bit groups:

Binary to Hexadecimal
   00001000000000000000000000000010
=> 0000 1000 0000 0000 0000 0000 0000 0010
=> 0    8    0    0    0    0    0    2
=> 08000002

0x08000002
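
The same grouping can be reproduced in Python as a quick check. This is only a sketch of the manual method; the variable names are our own.

```python
bits = "00001000000000000000000000000010"

# Split into 4-bit groups, then convert each group to one hex digit.
groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
hex_digits = "".join(f"{int(group, 2):X}" for group in groups)

print(" ".join(groups))  # 0000 1000 0000 0000 0000 0000 0000 0010
print("0x" + hex_digits)  # 0x08000002
```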

Far Away

Note that both branch instructions (i.e., beq and bne) and jump instructions (i.e., j) have a limit on how far they can go. A branch can only reach about ±2^15 instructions away, and a jump is restricted to the current 256MB region.
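
To get a feel for these two limits, here is a rough Python sketch of a reachability check. The function names are our own. We also assume that the branch offset is a signed 16-bit count of instructions measured from $PC + 4, and that all addresses are word-aligned.

```python
def branch_can_reach(pc: int, target: int) -> bool:
    """True if a beq/bne at pc can reach target (assumed signed 16-bit offset)."""
    offset_in_instructions = (target - (pc + 4)) // 4
    return -(1 << 15) <= offset_in_instructions < (1 << 15)


def jump_can_reach(pc: int, target: int) -> bool:
    """True if a j at pc can reach target (same first 4 bits as $PC + 4)."""
    return ((pc + 4) >> 28) == (target >> 28)


# Far away but inside the same 256MB region: only j can reach it.
print(branch_can_reach(0, 0x00100000), jump_can_reach(0, 0x00100000))
# prints: False True

# Close by but across a 256MB boundary: only a branch can reach it.
print(branch_can_reach(0x0FFFFFF8, 0x10000004), jump_can_reach(0x0FFFFFF8, 0x10000004))
# prints: True False
```

The two calls show why neither instruction is strictly more capable than the other: each can reach targets that the other cannot.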

What if we want to branch/jump further than the instruction allows us to? We will discuss the approach that we can use in both cases.

Branch

Given the instruction beq $s0, $s1, label, what happens if the address of label is farther away from the $PC than the beq (or bne) instruction can support? Here, we can chain beq (or bne) with a j instruction. In other words, we introduce an intermediate label (e.g., mid_branch) that contains a j label instruction. We then replace the original instruction with beq $s0, $s1, mid_branch.

Of course, you can also chain with another beq instruction instead of a j instruction. This is especially useful when you are trying to branch to outside of the 256MB boundary, which we will discuss later.

Original
label: # can be above or below
  :
  :    # code omitted
  :
beq $s0, $s1, label
Replacement
label: # can be above or below
  :
  :    # code omitted
  :
mid_branch: j label
  :
  :    # code omitted
  :
beq $s0, $s1, mid_branch

Jump

Given the instruction j label, what happens if the address of label is outside of the 256MB boundary? In this case, we will need to chain the instruction with a beq instruction that is guaranteed to always branch. Similar to before, we add an intermediate label. Here, the beq will need to be at or near the edge of the 256MB boundary.

Original
label: # can be above or below
  :
  :    # code omitted <- boundary
  :
j label
Replacement
label: # can be above or below
  :
  :    # code omitted <- boundary
  :
mid_jump: beq $zero, $zero, label  # always branches
  :
  :    # code omitted
  :
j mid_jump

Warning

There are a few things we have to be careful of. Firstly, we need to ensure that the added instruction can be reached from the original instruction. Additionally, we need to ensure that the added instruction can reach the original target. Since we are adding instructions, the original instruction may be pushed down, which may move it relative to the boundary. This also means that we may need to change the immediate value for other instructions.

Secondly, we need to ensure that the added instruction is not executed unintentionally. In our replacement recipe above, we omitted the code before and after the added instruction. One trick is to add a branch/jump just before the added instruction that goes to the line immediately after it. If we do that, normal execution of the program will skip the added instruction.

Replacement
  :    # code omitted <- boundary
          beq $zero, $zero, after  # ---+
mid_jump: # added instruction           |
after:                             # <--+
  :    # code omitted

  1. Although calling a function is NOT tested, it is a useful analogy for this.