Lately, my student and I have been trying to take apart a firmware image at work. We couldn’t identify the chip because we were told not to damage the hardware. There is no datasheet and we are pretty much stuck. After taking a step back and digging into other documentation, I realised that we might be dealing with some proprietary runtime system. Gosh.
Anyway, this matter at work showed me that I need to build a stronger foundation in firmware analysis. Since I’m already on the SJCAM4000, I thought I should take a look at its firmware… and build on what I already know about firmware analysis.
Firmware
In a nutshell, firmware is a special type of software that provides the code and resources that tell the hardware in a system what to do. It has to be held in persistent storage (e.g. ROM) as it contains all the critical functions needed to run the hardware.
Firmware Components
Typically, an image may contain the following components:
- Bootloader
- Kernel
- Drivers
- File-system
- Applications
- Resources (e.g. picture files)
What’s in the firmware depends on the vendor. Generally, there are three types of firmware image:
- Full (OS + Bootloader + Libraries + Applications)
- Integrated (Libraries + Applications)
- Partial (Applications/Libraries/Resources)
Firmware Security
The firmware image could be compressed and/or encrypted to protect it against reverse engineering.
From what I understand, it seems like it’s still very common to distribute the firmware without any security measures.
Analysing SJ4000 WIFI Firmware v1.8
I downloaded SJCAM_SJ4000WIFI_Firmware_v1.8 (filename: FW96655A) from the official website.
One way to start understanding the firmware is to read the change logs. I’m not too sure whether they contain typos, as the change logs on the website seem to be outdated… but they show that the camera is able to mount exFAT and FAT32 file systems.
Reconnaissance
The first thing I did was to run the file command to identify the format of the firmware image. It does not reveal much, apart from telling us that the image is not a recognised archive format.
➜ file FW96655A.bin
FW96655A.bin: data
Next, I used the strings command to see if we could find any interesting readable strings in the image. This helps to see whether the firmware image is encrypted. Fortunately, it is not! Furthermore, we are able to identify the actual model of the camera. It appears that the SJ4000 might be a white-label version of the Novatek NT96655 dash camera.
➜ strings FW96655A.bin > strings.txt
➜ head -10 strings.txt
BCL1
NT96655 1000000020100701
0LD:GS650
...
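For readers who want to script this step, here is a minimal sketch of roughly what strings does: it collects runs of printable ASCII bytes. The function name ascii_strings and the sample bytes are made up for illustration; only the first 8 bytes come from the hexdump of the image, the rest is filler.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# First 8 bytes of the image, followed by made-up filler for the demo.
sample = bytes.fromhex("42434c31 8f6c0009") + b"\x00\x01NT96655 1000000020100701\x00"
print(ascii_strings(sample))  # ['BCL1', 'NT96655 1000000020100701']
```

If most of a large image yields nothing but short, random-looking runs, that is a hint (not proof) that it is encrypted or heavily compressed.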
We used hexdump to view the first 16 bytes of the firmware. The first 4 bytes, 42 43 4c 31, are the magic number of the Basic Compression Library (BCL). Fortunately, BCL is open-source and can be found here.
➜ hexdump -Cn 16 FW96655A.bin
00000000 42 43 4c 31 8f 6c 00 09 00 5e a5 e8 00 33 5c 90 |BCL1.l...^...3\.|
Decompress the Firmware
We need to decompress the firmware image in order to analyse it further. To do this, we downloaded the source code and compiled the library using the make command. This generates bfc.exe in the src folder, which can be used to compress a test file.
➜ make && chmod +x bfc.exe
➜ ./bfc.exe
Usage: ./bfc.exe command [algo] infile outfile
Commands:
c Compress
d Deompress
Algo (only specify for compression):
rle RLE Compression
sf Shannon-Fano compression
huff Huffman compression
lz LZ77 Compression
rice8 Rice compresison of 8-bit data
rice16 Rice compresison of 16-bit data
rice32 Rice compresison of 32-bit data
rice8s Rice compresison of 8-bit signed data
rice16s Rice compresison of 16-bit signed data
rice32s Rice compresison of 32-bit signed data
We ran all the algorithms on a test file and dumped the first 16 bytes of each compressed file.
(RLE) 42 43 4c 31 00 00 00 01 00 00 00 64 00 68 65 6c
(LZ) 42 43 4c 31 00 00 00 09 00 00 00 64 00 68 65 6c
(SF) 42 43 4c 31 00 00 00 0a 00 00 00 64 5b 66 df 56
(HUFF) 42 43 4c 31 00 00 00 02 00 00 00 64 17 7b 95 bc
From this exercise, we can see that the first 8 bytes of the firmware resemble those of the file we compressed with the LZ algorithm (last byte = 09). We can also determine from the source code that the compression header is 12 bytes long: the first 4 bytes hold the magic number, the next 4 bytes hold the algorithm type and the last 4 bytes hold the size of the input file (0x64, i.e. a 100-byte test file).
void WriteWord32( int x, FILE *f )
{
fputc( (x>>24)&255, f );
fputc( (x>>16)&255, f );
fputc( (x>>8)&255, f );
fputc( x&255, f );
}
/* Write header */
fwrite( "BCL1", 4, 1, f );
WriteWord32( algo, f );
WriteWord32( insize, f );
The fifth and sixth bytes of the firmware (8f 6c) might mean something to the firmware developer. To find out, we would need to do a statistical analysis of past firmware images to see if there are any patterns. We will not do that in this post.
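The 12-byte header layout described above can be sketched in a few lines of Python. This is only an illustration: parse_bcl_header is a hypothetical helper, the algorithm ids are the ones observed in the test dumps above, and splitting the algorithm word into a low and a high half is my assumption for coping with the 8f 6c bytes in the firmware.

```python
import struct

# Algorithm ids as observed in the test files compressed with bfc above.
ALGO_NAMES = {1: "RLE", 2: "Huffman", 9: "LZ77", 10: "Shannon-Fano"}

def parse_bcl_header(header: bytes) -> dict:
    """Decode the big-endian 12-byte header: magic, algorithm, input size."""
    if len(header) < 12:
        raise ValueError("need at least 12 bytes")
    magic, algo, size = struct.unpack(">4sII", header[:12])
    if magic != b"BCL1":
        raise ValueError("not a BCL1 container")
    algo_low = algo & 0xFFFF  # assumption: only the low half selects the algorithm
    return {
        "algo": algo_low,
        "algo_name": ALGO_NAMES.get(algo_low, "unknown"),
        "algo_high": algo >> 16,  # zero in bfc output, 0x8f6c in the firmware
        "uncompressed_size": size,
    }

firmware_header = bytes.fromhex("42434c318f6c0009005ea5e800335c90")
print(parse_bcl_header(firmware_header))
```

For the firmware, the third word decodes to 6202856, which matches the output size that bfc reports later.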
Now, we should be able to decompress the firmware. Unfortunately, we encountered a segmentation fault.
➜ ./bfc.exe d ../../Firmware/FW96655A.bin ../../Firmware/FW96655A.bin.2
LZ77 decompress ../../Firmware/FW96655A.bin to ../../Firmware/FW96655A.bin.2...
Input file: 3366036 bytes
Output file: 6202856 bytes
[1] 14926 segmentation fault ./bfc.exe d ../../Firmware/FW96655A.bin ../../Firmware/FW96655A.bin.2
We could use a debugger to find out where the segmentation fault occurs, but I decided to start from the source code to see if we can find the cause. I started with ReadWord32, which is called to read 4 bytes from the given file.
int ReadWord32( FILE *f )
{
unsigned char buf[4];
fread( buf, 4, 1, f ); // segmentation fault might happen here?
return (((unsigned int)buf[0])<<24) +
(((unsigned int)buf[1])<<16) +
(((unsigned int)buf[2])<<8) +
(unsigned int)buf[3];
}
My first guess was that a segmentation fault might happen if the f pointer is already at the end of the file (although fread at end-of-file just returns a short count rather than crashing, so this is unlikely to be the culprit). Now, let’s find out where ReadWord32 is used. The function is called 3 times in a row when the program prepares for decompression. After that, the buffer (from the 13th byte onwards) is passed to the respective algorithm to decompress the data.
insize = GetFileSize( f );
if( command == 'd' )
{
/* Read header */
algo = ReadWord32( f ); /* Dummy */
algo = ReadWord32( f );
outsize = ReadWord32( f );
insize -= 12;
}
There are two ways this code might have gone wrong at this point.
- The header might be longer than 12 bytes. This means that the number of data bytes might be less than the size of the buffer allocated by the decompression library. Hence, the library might hit a segmentation fault when it deals with the null characters at the end of the buffer.
- The output size allocated by the decompression library is incorrect. Hence, it might hit a segmentation fault as the output bytes are copied to an illegal memory location.
I took a look at the LZ77 source file, lz.c, to find out which bytes are written to the output.
out[ outpos ++ ] = (unsigned char) marker;
outpos += _LZ_WriteVarSize( bestlength, &out[ outpos ] );
outpos += _LZ_WriteVarSize( bestoffset, &out[ outpos ] );
inpos += bestlength;
bytesleft -= bestlength;
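In this snippet, bestlength is the length of the best match found at the current input position; it is written together with bestoffset after the marker byte. Nothing here writes an extra size field, so the 4 bytes that follow the 12-byte header were presumably added by the vendor’s tooling rather than by BCL. Their value, 00 33 5c 90 (13th to 16th byte), is 3366032, which is exactly the file size minus 16, so we can confirm that it is the size of the compressed data.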
➜ hexdump -Cn 16 FW96655A.bin
00000000 42 43 4c 31 8f 6c 00 09 00 5e a5 e8 00 33 5c 90 |BCL1.l...^...3\.|
The size written in the header is 4 bytes smaller than the size computed as GetFileSize(f) - 12 by the decompression program. The size variable, insize, is used as the loop condition in the function LZ_Uncompress. This causes the code to access out-of-bounds memory.
LZ_Uncompress( in, out, insize ){
...
do{
symbol = in[ inpos ++ ]; // seg fault
...
}
while( inpos < insize ); // insize = 3366036, size(in) = 3366032
...
}
To fix this problem, all we have to do is compute the correct insize variable. Instead of subtracting only the 12 bytes of header, we need to subtract 16 bytes.
if( command == 'd' )
{
/* Read header */
algo = ReadWord32( f ); /* Dummy BCL1*/
algo = ReadWord32( f );
outsize = ReadWord32( f );
insize -= 16; // instead of 12 bytes
}
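Rather than hardcoding 16, the header length could be detected from the file itself. The sketch below is an assumption on my part, not something BCL does: it treats the word at offset 12 as a compressed-size field only when it equals the file size minus 16. payload_size is a hypothetical helper and expects at least 16 header bytes.

```python
import struct

def payload_size(file_size: int, header: bytes) -> tuple:
    """Return (header_length, compressed_length) for a BCL1 image.

    Assumption: if the big-endian word at offset 12 equals file_size - 16,
    the image carries an extra length field, so the header is 16 bytes long.
    Otherwise fall back to the plain 12-byte bfc header.
    """
    (extra,) = struct.unpack(">I", header[12:16])
    if extra == file_size - 16:
        return 16, extra
    return 12, file_size - 12

firmware_header = bytes.fromhex("42434c318f6c0009005ea5e800335c90")
print(payload_size(3366048, firmware_header))  # (16, 3366032)
```

With the firmware's numbers (a 3366048-byte file), this yields the same 3366032-byte input that the patched tool reports.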
After modifying the source code, we were able to decompress the image without any problem.
➜ ./example d ../../Firmware/FW96655A.bin ../../Firmware/FW96655A.bin.2
size = 3366048
LZ77 decompress ../../Firmware/FW96655A.bin to ../../Firmware/FW96655A.bin.2...
Input file: 3366032 bytes
Output file: 6202856 bytes
We confirmed that we had decompressed the file successfully by running the strings command again. Instead of garbage, we are able to extract complete ASCII strings from the firmware image.
---Decompressed Below---
<html>
<head><title>Page Not found</title></head>
<body><h2>The requested URL was not found on this server.</h2></body>
</html>
---Compressed Below---
<html>
<head><title>Page Not found</
<body><h2>The
\ed URL was n
9 on this
\.</h2></
4p#=
4L%=
4('=
We ran binwalk -A and noticed that the firmware uses the MIPS architecture (little-endian, MIPSEL). We can most likely throw this file into a disassembler and get some good analysis results.
...
4190512 0x3FF130 MIPSEL instructions, function epilogue
4190600 0x3FF188 MIPSEL instructions, function epilogue
4190704 0x3FF1F0 MIPSEL instructions, function epilogue
4192296 0x3FF828 MIPSEL instructions, function epilogue
4192644 0x3FF984 MIPSEL instructions, function epilogue
4192884 0x3FFA74 MIPSEL instructions, function epilogue
4192920 0x3FFA98 MIPSEL instructions, function epilogue
4192964 0x3FFAC4 MIPSEL instructions, function epilogue
...
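To get a feel for how such an opcode scan can work, here is a rough sketch that counts the MIPS32 return instruction jr $ra (encoded as 0x03E00008) in both byte orders. It is a simplified stand-in for illustration; binwalk's actual signatures may differ, and count_jr_ra is a hypothetical name.

```python
import struct

JR_RA = 0x03E00008          # MIPS32 encoding of "jr $ra", the usual function return
JR_RA_SWAPPED = 0x0800E003  # the same four bytes read in the opposite byte order

def count_jr_ra(data: bytes) -> dict:
    """Count 4-byte-aligned "jr $ra" words, reading the data as little-endian."""
    counts = {"little": 0, "big": 0}
    usable = len(data) - len(data) % 4  # ignore trailing bytes that do not fill a word
    for (word,) in struct.iter_unpack("<I", data[:usable]):
        if word == JR_RA:
            counts["little"] += 1
        elif word == JR_RA_SWAPPED:
            counts["big"] += 1
    return counts

# Two little-endian returns separated by a nop.
print(count_jr_ra(bytes.fromhex("0800e003 00000000 0800e003")))
```

A decompressed image in which the "little" count dominates by a wide margin is consistent with the MIPSEL result above.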
We can confirm that this firmware is using eCos v2.0.6 based on the strings below.
➜ strings FW96655A.bin.2 | grep ecos
/cygdrive/c/project-ecos/ecos-2.0.6/packages/infra/v2_0_60/src/tcdiag.cxx
/cygdrive/c/project-ecos/ecos-2.0.6/packages/infra/v2_0_60/src/pure.cxx
'/cygdrive/c/project-ecos/ecos-2.0.6/packages/kernel/v2_0_60/src/common/clock.cxx
Seems like Michael Lo might be the guy who prepared the operating system. (Hah!)
/home/michael.lo/project-ecos/ecos-rtl8189es/install/include/cyg/io/eth/rltk/819x/wrapper/skbuff.h
Anyway, I thought it might be quite interesting to grep for http in the firmware. Indeed, we see a hardcoded IP address in the firmware.
➜ grep -i http strings.uncompress
POOL_ID_HTTPLVIEW
HTTP/1.0 200 OK
HTTP/1.0 404 Not Found
HTTP/1.0 405 Method Not Allowed
HTTP/1.1 200 OK
HTTP/1.1 200 OK
http://115.29.201.46:8020/download/filedesc.xml
http://%s%s
PhotoExe_OpenHttpLiveView
http file server
HTTP/1.1 200 OK
HTTP/1.1 201 Created
HTTP/1.1 206 Partial Content
HTTP/1.1 400 Bad Request
HTTP/1.1
HTTP/1.0
Not support HTTP method %d
HTTP/1.1 200 OK
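The same search can be narrowed to actual URLs and IPv4 addresses with two regular expressions. This is a minimal sketch; find_endpoints is a hypothetical helper and the patterns are deliberately loose (they do not validate octet ranges and will also pick up format strings such as http://%s%s).

```python
import re

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def find_endpoints(lines) -> tuple:
    """Return (urls, ipv4_addresses) found in an iterable of strings."""
    urls, ips = [], []
    for line in lines:
        urls += URL_RE.findall(line)
        ips += IPV4_RE.findall(line)
    return urls, ips

# A few lines taken from the grep output above.
lines = [
    "HTTP/1.1 200 OK",
    "http://115.29.201.46:8020/download/filedesc.xml",
    "http://%s%s",
]
print(find_endpoints(lines))
```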
Unfortunately, http://115.29.201.46:8020/download/filedesc.xml is no longer available. But it is still useful for finding more information about the camera online.
For instance, here’s the document that describes all the possible GET commands on the NT96660. Apparently, the IP address used to host an XML document that determines whether the camera should update its firmware. It also seems like the camera hosts an HTTP server.
Indeed, I was able to access my camera’s home page by visiting its IP address, which can also be found in the strings. The most interesting command that we can issue over wifi is the OTA firmware update.
Apparently, you will need to upload the firmware onto the camera. The firmware should be placed in a location that is understood by the project. I assume this refers to the project creator.
<Function>
<Cmd>3013</Cmd>
<Status>-18</Status>
</Function>
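A reply like the one above is easy to handle in a script. The sketch below only parses the XML shown here; parse_reply is a hypothetical helper, and what a status of -18 means is not something I have confirmed.

```python
import xml.etree.ElementTree as ET

def parse_reply(xml_text: str) -> tuple:
    """Return (command_id, status) from a <Function> reply."""
    root = ET.fromstring(xml_text)
    return int(root.findtext("Cmd")), int(root.findtext("Status"))

reply = "<Function><Cmd>3013</Cmd><Status>-18</Status></Function>"
cmd, status = parse_reply(reply)
print(cmd, status)  # 3013 -18
```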
It seems that, as long as you know where the firmware should reside, you can run this command over wifi to flash the camera with rogue firmware. However, we have to note that an attacker’s motivation in this case will not be strong, since the SJ4000 is not switched on all the time. Anyway, it seems like this could be the location that they are looking for:
This probably means that my MicroSD card is mounted on A:/. I can probably upload firmware.bin and execute the HTTP command to update the firmware.
Scanning
Before we get into some hardcore reverse engineering work, we can always run a quick network scan to find out if there are any open ports on the camera. There are 3 open ports. I’m glad that remote administration ports (e.g. SSH) are not in the list!
➜ nmap -Pn 192.168.1.254
Starting Nmap 7.70 ( https://nmap.org ) at 2019-05-28 07:24 +08
Nmap scan report for 192.168.1.254
Host is up (0.0073s latency).
Not shown: 997 closed ports
PORT STATE SERVICE
80/tcp open http
3333/tcp open dec-notes
8192/tcp open sophos
It seems like port 3333 is used to stream the camera’s status and events, while port 8192 is used to serve the photo preview.
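If nmap is not at hand, a plain TCP connect check covers the three ports found above. This is a minimal sketch using only the standard library; is_port_open is a hypothetical helper, and it only tells us whether the handshake succeeds, not what the service is.

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable or timed out
        return False
```

From a machine connected to the camera’s wifi, calling is_port_open("192.168.1.254", 3333) should return True while the camera is on.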
Disassemble the Firmware
Since we know that the firmware uses MIPS, we can attempt to analyse it using a disassembler. Ghidra is very useful in this case since the free version of IDA does not support MIPS.
The program identified a substantial number of functions (in the window on the right). It took a while to analyse and identify all the strings in the firmware. I’m not too sure if that is because I’m running Ghidra in a virtual machine.
Anyway, the first step in disassembling the firmware is to find out the load address. The correct memory mapping is important as it maps the branches and references to the right addresses. One technique is to look for static addresses in the instruction code.
It may not be very obvious, but the code is trying to reference something at 0x800023c4. In MIPS, sw means store word: it stores the value in the left operand at the address given by the right operand.
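To make the address arithmetic concrete, here is a small sketch that decodes a lui/sw pair and computes the store target. The two instruction words are hypothetical (lui $at, 0x8000 followed by sw $v0, 0x23c4($at)), chosen only because they produce 0x800023c4; they are not copied from the firmware.

```python
LUI, SW = 0x0F, 0x2B  # MIPS32 opcodes for "load upper immediate" and "store word"

def decode_itype(word: int) -> dict:
    """Split a 32-bit MIPS I-type instruction into its fields."""
    return {
        "opcode": word >> 26,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }

def sign_extend16(value: int) -> int:
    """Interpret a 16-bit immediate as a signed offset."""
    return value - 0x10000 if value & 0x8000 else value

def store_target(lui_word: int, sw_word: int) -> int:
    """Address written by 'sw rt, imm(base)' when base was set by the given lui."""
    lui, sw = decode_itype(lui_word), decode_itype(sw_word)
    if lui["opcode"] != LUI or sw["opcode"] != SW:
        raise ValueError("expected a lui followed by a sw")
    if sw["rs"] != lui["rt"]:
        raise ValueError("sw does not use the register loaded by lui")
    return ((lui["imm"] << 16) + sign_extend16(sw["imm"])) & 0xFFFFFFFF

# Hypothetical pair: lui $at, 0x8000 ; sw $v0, 0x23c4($at)
print(hex(store_target(0x3C018000, 0xAC2223C4)))  # 0x800023c4
```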
If 0x800023c4 does not exist in our firmware, it could mean two things:
- The load address is incorrect.
- It refers to RAM, which is not part of our firmware image.
I think there are methods to find the load address, but I don’t know MIPS well enough to read the instructions. So I decided to try my luck and reload the firmware at 0x80000000 instead.
And booom! We got it right! It’s obvious that we got it correct because the instructions are now decompiled nicely into readable code, as opposed to the previous time we loaded the file.
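A less luck-based way to guess the load address is to tally the upper halves that the code loads with lui, since absolute addresses are usually built that way. The sketch below is a heuristic only, with a hypothetical helper name; a dominant value such as 0x8000 would be consistent with references like 0x800023c4, but it does not prove the base address.

```python
import struct
from collections import Counter

def lui_histogram(code: bytes, little_endian: bool = True) -> Counter:
    """Count the 16-bit immediates loaded by MIPS lui instructions."""
    fmt = "<I" if little_endian else ">I"
    usable = len(code) - len(code) % 4  # ignore trailing bytes that do not fill a word
    counts = Counter()
    for (word,) in struct.iter_unpack(fmt, code[:usable]):
        # lui has opcode 0x0F and its rs field is always zero
        if word >> 26 == 0x0F and (word >> 21) & 0x1F == 0:
            counts[word & 0xFFFF] += 1
    return counts

# Made-up little-endian snippet: lui $at,0x8000 ; sw $v0,0x23c4($at) ; lui $v0,0x8000
demo = struct.pack("<III", 0x3C018000, 0xAC2223C4, 0x3C028000)
print(lui_histogram(demo).most_common(1))  # [(32768, 2)]
```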
What’s next
After this, we can analyse the code to look for bugs and patch the firmware for better use.