Topic: Reverse engineering tip, tricks, and tools

Let's leave "Fernly running on MT6261" for discussions that are about just that topic. It might be better having a separate thread for chat about Fernvale RE methods which doesn't have to always involve Fernly or the MT6261 processor. I don't claim to be a master by any means but I'll kick it off with a couple of things I stumbled onto that have been working for me.

Re: Reverse engineering tip, tricks, and tools

A quick but handy one... If you're still getting the hang of ARM assembly language, create a file called "hexfn.c" and paste in this program fragment:

extern char *msg;

static inline unsigned char foo(int z)
{
  if (z <= 9)
    return '0' + z;
  else
    return 'a' + z - 10;
}

void testfn(long x)
{
  msg[7] = foo(x & 0xf);
  msg[6] = foo((x >> 4) & 0xf);
  msg[5] = foo((x >> 8) & 0xf);
  msg[4] = foo((x >> 12) & 0xf);
  msg[3] = foo((x >> 16) & 0xf);
  msg[2] = foo((x >> 20) & 0xf);
  msg[1] = foo((x >> 24) & 0xf);
  msg[0] = foo((x >> 28) & 0xf);
}

Then cross-compile it with "arm-none-eabi-gcc -S hexfn.c" or whatever the command is for your toolchain installation. (Use the native compiler if you're one of those swanky Novena users, I guess.) The "-S" flag tells it to stop after compiling and write the file hexfn.s, which is ARM assembler source that does the same thing as your C routine. You can also add optimizer flags such as -O3 or -Os, which sometimes change the output instructions radically.

I did just this when I needed an ARM assembler function to output a 32-bit int as hexadecimal. There's a lot to learn from seeing different ways to write something where you already know exactly what it does.
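
Another variant worth feeding to the compiler is a loop instead of eight unrolled stores. This is only a sketch: the name hex32 and the caller-supplied buffer are my own choices for illustration, and I've used uint32_t so the shifts behave the same on a 64-bit host as on the ARM target.

```c
#include <stdint.h>

/* Convert one 4-bit value (0..15) to a lowercase hex digit. */
static inline char hex_digit(unsigned z)
{
  return (char)(z <= 9 ? '0' + z : 'a' + z - 10);
}

/* Write x as 8 hex digits plus a terminating NUL into out[9].
   hex32 is a hypothetical name; it produces the same digits as testfn. */
void hex32(uint32_t x, char *out)
{
  int i;

  for (i = 7; i >= 0; i--) {
    out[i] = hex_digit(x & 0xf);  /* lowest nibble goes rightmost */
    x >>= 4;
  }
  out[8] = '\0';
}
```

Compiling both versions with -S at a few optimization levels and putting the .s files side by side should show how differently the same job can come out.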

For questions about what an opcode like "bleq" or whatever does, just google "arm assembler instruction bleq". You'll get a gazillion hits. The pages at infocenter.arm.com and www.keil.com/support were especially handy, I thought.

3 (edited by isogashii 2016-04-15 02:41:45)

Re: Reverse engineering tip, tricks, and tools

At a certain point, I knew that 0xfff00b64 was the entry point for the ROM serial input function, and Fernly's serial output had worked without changing anything, but serial input was NFG. I wrote a C-callable "rom_getchar()" function for Fernly to get past this point, and then started pulling sections of the ROM code into it. At one stage, it looked like this:

.text
.global    rom_getchar

rom_getchar:
    push    {lr}
    adr    r0, char_buffer
    mov    r1, #1
    mvn    r2, #0
    bl    usb_uart_in
    ldrb    r0, char_buffer
    pop    {pc}

char_buffer:
    .word    0
w_0b65:
    .word    0xfff00b65

usb_uart_in:
    push    {r3, lr}

# level 0: this is ROM call
#    ldr    r3, w_0b65
# level 1: call extracted ROM fn below
    adr    r3, l_0b64+1
    blx    r3

    pop    {r3, pc}

.thumb

# define and use "tlink" macro to generate link routines for long calls in thumb mode
# E.g., "tlink 0b7e" defines entry "t_0b7e" which calls 0xfff00b7e entry point in ROM
#
.macro tlink addr
t_\addr:
    push    {r7, lr}
    ldr    r7, =(0xfff00000 + 0x\addr + 1)
    blx    r7
    pop    {r7, pc}
.endm

#
# level 1 ROM replacement fn: 0xb64
#

tlink    1d1e

l_0b64:
    push {r4, r5, r6, lr}
    movs r5, r0
    movs r6, r1
    movs r4, #0
    b l_0b76

l_0b6e:
# level 1: this is ROM call
#    bl t_1d1e
# level 2: call extracted ROM fn below
    bl l_1d1e
    strb r0, [r5, r4]
    add r4, #1

l_0b76:
    cmp r4, r6
    blo l_0b6e
    movs r0, #0
    pop {r4, r5, r6, pc}

#
# level 2 ROM replacement fn: 0x1d1e
#

tlink    1794
tlink    1cf0

l_1d1e:
    push {r3, r4, r7, lr}
    movs r0, #0
    add r3, sp, #0
    strb r0, [r3]
    ldr r4, d_48ac
    add r4, #0x44
    b l_1d30

l_1d2c:
    bl t_1794
# Level 3:
#    bl l_1794

l_1d30:
    ldr r0, [r4]
    cmp r0, #0
    beq l_1d2c
    mov r0, sp
    bl t_1cf0
# Level 3:
#    bl l_1cf0
    add r3, sp, #0
    ldrb r0, [r3]
    pop {r3, r4, r7, pc}

d_48ac:
    .word    0x700048ac

The Radare2 disassembly output I was working from for the "Level 1 function" was this:

  |||||||   0x00000b64      70b5           push {r4, r5, r6, lr}
  |||||||   0x00000b66      051c           adds r5, r0, 0
  |||||||   0x00000b68      0e1c           adds r6, r1, 0
  |||||||   0x00000b6a      0024           movs r4, 0
  ||||||,=< 0x00000b6c      03e0           b 0xb76
  |||||.--> 0x00000b6e      01f0d6f8       bl 0x1d1e
  |||||||   0x00000b72      2855           strb r0, [r5, r4]
  |||||||   0x00000b74      0134           adds r4, 1
  ||||||`-> 0x00000b76      b442           cmp r4, r6
  |||||`==< 0x00000b78      f9d3           blo 0xb6e
  |||||||   0x00000b7a      0020           movs r0, 0
  |||||||   0x00000b7c      70bd           pop {r4, r5, r6, pc}
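
As a cross-check on the listing, the "bl 0x1d1e" target can be recomputed by hand from the raw bytes. The sketch below assumes the classic two-halfword Thumb BL encoding (11 offset bits in each halfword); the function names are made up for illustration. The bytes 01 f0 d6 f8 are in memory order, so the two little-endian halfwords are 0xf001 and 0xf8d6.

```c
#include <stdint.h>

/* Nonzero if (hw1, hw2) looks like a Thumb BL pair: the first halfword
   starts with bits 11110, the second with bits 11111. */
int is_thumb_bl(uint16_t hw1, uint16_t hw2)
{
  return (hw1 & 0xf800) == 0xf000 && (hw2 & 0xf800) == 0xf800;
}

/* Target address of a Thumb BL pair located at address pc. */
uint32_t thumb_bl_target(uint32_t pc, uint16_t hw1, uint16_t hw2)
{
  /* high 11 bits and low 11 bits form a signed offset counted in halfwords */
  uint32_t off = ((uint32_t)(hw1 & 0x7ff) << 12) |
                 ((uint32_t)(hw2 & 0x7ff) << 1);

  if (off & 0x400000)     /* sign bit of the 23-bit byte offset */
    off |= 0xff800000;    /* sign-extend to 32 bits */
  return pc + 4 + off;    /* in Thumb state PC reads as instruction + 4 */
}
```

For the instruction at 0xb6e this gives 0xb6e + 4 + 0x11ac = 0x1d1e, which agrees with what Radare2 printed.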

A few things worth commenting on:

Most of the ROM is Thumb code, but Fernly compiles to classic 32-bit ARM instructions. Everything I wrote above the ".thumb" directive assembles to ARM instructions; everything below it to Thumb code. The way you get from one to the other is with the "blx" instruction. ARM instructions are aligned to 4-byte addresses and Thumb instructions to 2-byte addresses, so the low address bit of a branch destination would otherwise always be 0, and it gets used as a flag to switch processor modes: 1 means Thumb. That's why there's the "adr r3, l_0b64+1" instruction to load the target address of the subroutine call and also flag a mode switch.
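
The bit-0 convention is easy to write out in C if that helps to picture it. This is just host-side arithmetic with made-up helper names, not anything taken from Fernly.

```c
#include <stdint.h>

/* Value to load into a register before BX/BLX so that the core
   switches to Thumb state at addr. */
uint32_t thumb_entry(uint32_t addr)
{
  return addr | 1u;
}

/* Nonzero if a BX/BLX to this value would enter Thumb state. */
int enters_thumb(uint32_t target)
{
  return (int)(target & 1u);
}

/* Where a Thumb target actually starts executing: bit 0 is only a flag
   and is not part of the fetch address. */
uint32_t fetch_address(uint32_t target)
{
  return target & ~1u;
}
```

So thumb_entry(0xfff00b64) is 0xfff00b65, the same value stored at the w_0b65 label in the listing above.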

Notice that 0xfff00b64 just calls another ROM function at 0xfff01d1e, making one call for each byte requested. I'd discovered the real ROM getchar() function.
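
Read back into C, the level 1 function comes out roughly like the sketch below. The names read_bytes, get_byte_fn and fake_get_byte are invented here, and passing the byte source as a function pointer is only so the sketch can run on a host; the ROM calls 0x1d1e directly. The caller also sets r2 to 0xffffffff, but the level 1 code never reads it, so it is left out.

```c
#include <stdint.h>

/* Stands in for the ROM routine at 0x1d1e, which returns one byte. */
typedef uint8_t (*get_byte_fn)(void);

/* C reading of l_0b64: fill buf with len bytes, one call per byte,
   and always return 0. The index compare is unsigned (blo). */
int read_bytes(uint8_t *buf, uint32_t len, get_byte_fn get_byte)
{
  uint32_t i;

  for (i = 0; i < len; i++)
    buf[i] = get_byte();
  return 0;
}

/* Host-side stand-in for the ROM getchar: returns 'A', 'B', 'C', ... */
static uint8_t next_fake = 'A';

uint8_t fake_get_byte(void)
{
  return next_fake++;
}
```

Calling read_bytes(buf, 3, fake_get_byte) on a host fills buf with "ABC", which is a quick way to check that the reading of the loop is right.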

In Thumb mode, a branch-and-link instruction doesn't have enough address bits to jump all the way from the 0x7000xxxx RAM block to the 0xfff0xxxx ROM area. I wrote the "tlink" macro to make it more convenient to deal with this. The line "tlink 1d1e" creates a wrapper function called "t_1d1e". I just replaced the subroutine call in the Radare2 output with a call to this function. In the next stage, I moved that function into Fernly and used the label "l_1d1e" for the local replacement version running in RAM.
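
To put a number on "not enough address bits": assuming the classic two-halfword Thumb BL, the offset is a signed 23-bit byte count relative to the instruction address plus 4, so the reach is roughly 4 MB either way. A small check, with a made-up function name and a made-up RAM address in the example:

```c
#include <stdint.h>

/* Nonzero if a Thumb BL at address `from` can reach `to`.
   Allowed byte offsets from (from + 4) are -0x400000 .. +0x3ffffe.
   Bit 0 of `to` (the Thumb flag) is ignored. */
int thumb_bl_reaches(uint32_t from, uint32_t to)
{
  uint32_t d = (to & ~1u) - (from + 4);   /* wraps modulo 2^32 */

  /* small positive offsets, or negative ones seen as large unsigned */
  return d <= 0x3ffffeu || d >= 0xffc00000u;
}
```

Inside the ROM, thumb_bl_reaches(0xb6e, 0x1d1e) is true. From a hypothetical RAM location such as 0x70006000 to 0xfff01d1e it is false, which is why the tlink wrappers load the full address into a register and use blx.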

Unfortunately Radare2 doesn't disassemble binary files to instructions that are totally compatible with GNU Assembler. I spent a lot of time cleaning and correcting the listing output. Examples: "movs r4, 0" had to be changed to "mov r4, #0". Immediate operands need a '#' sign before them, and there's no option in Thumb mode for the MOV operation with an immediate operand to change the status register--what the 's' suffix should mean. The assembler throws an error for either of these things. I also changed "adds r5, r0, 0" (add 0 to r0 and store in r5) to "mov r5, r0" for the sake of readability, since that's what it really does.

The main reason for doing all this was that, after making sure it still worked when linked into Fernly, I could insert instructions to dump out register values at different places, or even just to signal that a certain function actually ran when a character was read from USB serial, which wasn't always easy to tell just from reading. Eventually the function calls went 8 levels deep before it all finally bottomed out as raw register reads.

Re: Reverse engineering tip, tricks, and tools

It's always fun to bring up a new board. And fantastic job reverse-engineering it :)

The difference between "movs" and "mov" is one of unified syntax. Or rather, non-unified syntax. You can try setting ".syntax unified" at the top of your ASM file to see if that helps things in the future: https://sourceware.org/binutils/docs/as … on_002dSet

Re: Reverse engineering tip, tricks, and tools

Many thanks for the kind words and the tip on unified syntax, xobs. I'll take a look at that.

I checked the code again and there are several "movs r4, #0" instructions and such in there that the assembler obviously accepted. I remember having to take the 's' off some opcodes, but I've got something mixed up with something else. (There's also at least one "add r3, sp, #0" that I never changed to a MOV instruction.) I'm still new to ARM assembler myself, and I've been a lot more confused than this many times.

The main thing was to change as little as needed to regenerate the same code, ideally the same binary opcodes as the ROM dump except for minor things like branch-and-link address offsets. That was the reason for staying in Thumb mode with the extra trouble it brought. It seemed like there were fewer chances to draw a wrong conclusion and get sent off in the wrong direction--not that it never happened!--if the assembler listing matched the ROM dump byte-for-byte as much as it could.
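
A byte-for-byte check like that can be automated. Here is a rough sketch of one way to do it, not the tool I actually used: it compares two Thumb code images halfword by halfword and ignores the offset bits of BL pairs, since those change when the code moves. The function name is invented, and literal-pool data that happens to start with 0xf would be compared loosely as well.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare two little-endian Thumb images of n bytes (n even).
   Halfwords whose top four bits are 1111 are BL prefix/suffix halves;
   for those only the top 5 opcode bits are compared.
   Returns the byte offset of the first mismatch, or -1 if none. */
long first_mismatch(const uint8_t *rom, const uint8_t *mine, size_t n)
{
  size_t i;

  for (i = 0; i + 1 < n; i += 2) {
    uint16_t a = (uint16_t)(rom[i] | (rom[i + 1] << 8));
    uint16_t b = (uint16_t)(mine[i] | (mine[i + 1] << 8));

    if ((a & 0xf000) == 0xf000 && (b & 0xf000) == 0xf000) {
      if ((a & 0xf800) != (b & 0xf800))   /* ignore the 11 offset bits */
        return (long)i;
    } else if (a != b) {
      return (long)i;
    }
  }
  return -1;
}
```

Fed the ROM bytes from 0xb64 onward and the bytes assembled from l_0b64, it should return -1 if the only differences are in the bl offset.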

xobs wrote:

It's always fun to bring up a new board.

Yes, definitely, if by "board" you mean "watch". :) I've been startled a few times by remembering that I can and do wear it on my wrist. It feels bigger working with it than something like an Arduino.