Lexmark Printers Firmware Extraction - Part A

Tal Zohar

September 9th, 2020

Visibility is Foundational to Security

One of the biggest benefits of the Medigate platform is the visibility it provides into the connected devices in the network. This visibility is based on extensive research that our Medigate Labs research team has done over the years into the minutiae of what makes up medical, IoMT and IoT devices and how they work.

The Medigate platform is able to identify a device's operating system (OS), including its version, which is critical for assessing risk and understanding the potential threats and vulnerabilities of that device. For example, devices whose OSes have reached their end of life (EoL) are more vulnerable to security risks, due to a lack of security updates. In addition, knowing the specific version of the OS helps pinpoint whether or not the device is affected by a specific vulnerability.

One of the device types that organizations need to pay particular attention to is printers. Due to how common and potentially vulnerable printers are, it is extremely important to assess the risks these devices pose.

Honing in on Lexmark Printers
Encrypted Firmware Updates
Hardware Firmware Extraction
Conclusion

Honing in on Lexmark Printers

When we looked at which printers were in the networks of our many customers, we found Lexmark was one of the most common brands used by health systems. We then took a closer look at Lexmark and all its models and versions. The following is what we learned.

Encrypted Firmware Updates

The first thing we examined was the firmware updates of Lexmark printers. Using a simple entropy test, we observed that they were encrypted. Entropy represents the randomness of a binary stream. The entropy test examines the probability of each bit value within a block of arbitrary size, moving through the data with a sliding window. Those probabilities are entered into the entropy equation (shown below for one bit):

$$H = -p_0 \log_2 p_0 - p_1 \log_2 p_1$$

In the case of a random binary block, the probabilities will be equal:

$$p_0 = p_1 = \frac{1}{2}$$

This makes the entropy for that block:

$$H = -\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2} = 1$$

If the data as a whole is truly random, the entropy will be constant at 1. If it's not, the entropy will change frequently.

A random stream is most commonly caused by:

Encrypted data
Compressed data
A stream created by a random generator (like a key generator)

In the graph below, we show the entropy test results of the Lexmark firmware update we downloaded. You can see the entropy rises at the beginning and stays consistently at '1' until the end. Typically, there is a header at the beginning, which accounts for the starting point, and the rest is the binary stream. Because the binary stream is quite large and stays consistently at '1' without any ripple, we assumed the binary stream was encrypted.
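The entropy test described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool we used: the function names, the 4096-byte block size and the step are assumptions chosen for the example.

```python
import math


def bit_entropy(block: bytes) -> float:
    """Shannon entropy per bit: H = -p0*log2(p0) - p1*log2(p1)."""
    total_bits = len(block) * 8
    if total_bits == 0:
        return 0.0
    ones = sum(bin(b).count("1") for b in block)
    p1 = ones / total_bits
    p0 = 1.0 - p1
    h = 0.0
    for p in (p0, p1):
        if p > 0:  # by convention, 0 * log2(0) contributes nothing
            h -= p * math.log2(p)
    return h


def sliding_entropy(data: bytes, block_size: int = 4096, step: int = 4096):
    """Yield (offset, entropy) for each window of block_size bytes."""
    last_start = max(len(data) - block_size, 0)
    for off in range(0, last_start + 1, step):
        yield off, bit_entropy(data[off:off + block_size])
```

Plotting the values yielded by `sliding_entropy` against the offset gives a graph of the kind described above. Note that a per-bit measure only checks the balance of zeros and ones, so a structured pattern with balanced bits also scores 1; a byte-level histogram is a stricter variant of the same idea.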

After seeing this, we decided to buy a printer to further investigate. We wanted to extract the firmware to better understand how it works and the update process.

Hardware Firmware Extraction

We chose the Lexmark MS811n for our investigation.

The first thing we did when it arrived was disassemble the side door to reveal the main printed circuit board (PCB).

We then started to identify its elements and create a simple block diagram of how it works, which you can see below.

Block Diagram

Note that we made assumptions about the Serial and JTAG interfaces based on earlier experience. From the block diagram, we created a research tree (see below) with the primary goal of extracting the firmware from the hardware.

Research Tree

We calculated the risk of damage vs the probability of success to create the following priorities for our investigation:

Serial Port
JTAG Port
NAND Flash

The following is what we did for each of these components.

Serial Port

Serial Interface Exploration

We suspected the component in the picture was a serial interface connector.

The red mark indicates pin number one from our perspective. Typically, the minimum number of signals is three:

TxD
RxD
GND

The fourth signal is usually VCC, which is the voltage source of the processor's I/Os. So, our assumption was that these four signals were used at the connector.

To check which pin was actually used for each signal, we ran some tests:

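The "output/pull-up" tests were done by connecting a 1Kohm resistor between the signal and GND. If the pin was only pulled up (rather than actively driven), the test resistor and the pull-up resistor would form a voltage divider and the voltage would drop to:

$$V = V_{CC} \cdot \frac{R_{test}}{R_{pull\text{-}up} + R_{test}}$$

In this case (see image below), that means the voltage should drop to 0.3V. It did in fact drop to that voltage at pin #3.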
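The divider arithmetic behind this test can be checked with a short sketch. The supply voltage and pull-up value below are assumptions (3.3V and 10Kohm), picked only because they are common and reproduce the 0.3V figure; the measured values are in the image and are not restated here.

```python
def divider_voltage(vcc: float, r_pullup: float, r_test: float) -> float:
    """Voltage on a pulled-up pin when r_test is connected from the pin to GND."""
    return vcc * r_test / (r_pullup + r_test)


# Assumed values, not taken from the measurements: 3.3 V supply, 10 kOhm pull-up.
v = divider_voltage(vcc=3.3, r_pullup=10_000, r_test=1_000)
print(f"{v:.2f} V")
```

A pin that is actively driven high would hold its level much closer to VCC under the same 1Kohm load, which is what separates an output from a pull-up in this test.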

Serial Setup

We used a Saleae logic analyzer to sample TxD at bootup and measure the baud rate, which was 38400. We then used an FTDI UART-to-USB cable to connect and open a terminal at that baud rate, and we got a boot log and startup!
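One way to derive a baud rate from a logic capture is to take the shortest interval between two signal edges as one bit time. The sketch below assumes the edge timestamps have already been exported as a list of seconds; the function name and the list of standard rates are illustrative, not part of any analyzer software.

```python
STANDARD_BAUDS = (9600, 19200, 38400, 57600, 115200)


def estimate_baud(edge_times_s):
    """Estimate the baud rate from timestamps (in seconds) of edges on TxD.

    The shortest gap between consecutive edges approximates one bit time,
    and the result is snapped to the nearest standard rate.
    """
    if len(edge_times_s) < 2:
        raise ValueError("need at least two edges")
    gaps = [b - a for a, b in zip(edge_times_s, edge_times_s[1:])]
    raw_rate = 1.0 / min(gaps)
    return min(STANDARD_BAUDS, key=lambda rate: abs(rate - raw_rate))


# Synthetic edges with a bit time of about 26.04 microseconds.
edges = [0.0, 26.04e-6, 78.12e-6, 104.16e-6, 208.32e-6]
print(estimate_baud(edges))
```

A bit time of about 26 microseconds corresponds to 38400 baud, since 1 / 38400 is roughly 26.04 microseconds.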

From the boot log, we learned the OS is Linux and the bootloader is U-Boot, which is what we suspected. Unfortunately, the printer blocked the RxD signal, so we could see the logs but couldn't run any commands. We had to drop the UART branch of the research tree completely.

JTAG Port

JTAG Interface Exploration

We located a connector next to the Serial Port that we suspected was a JTAG interface connector.

Typically, the JTAG Signals are:

TMS
TCK
TDI
TDO
TRST (optional)
VCC
GND

We needed to identify these seven signals among the 12 pins of the connector. The possible assignments are the k-permutations of n, where k=7 and n=12, which puts the number of potential combinations at 3,991,680:

$$P(12, 7) = \frac{12!}{(12-7)!} = 3{,}991{,}680$$

This means we couldn't just brute-force the right combination. To reduce the number of possible combinations, we ran some tests similar to the ones we did for the Serial Port.
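The size of the search space can be verified with Python's standard library. The one-second-per-attempt figure is purely a hypothetical number used to give a feel for the scale.

```python
import math

# Ordered assignments of 7 signals to 12 pins: P(12, 7) = 12! / 5!
count = math.perm(12, 7)
print(count)

# Hypothetical pace of one attempt per second, for scale only.
seconds_per_try = 1.0
days = count * seconds_per_try / 86_400
print(f"{days:.1f} days")
```

At that assumed pace, trying every assignment would take about 46 days, which is why narrowing the candidates electrically comes first.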

Each assumption we made was based on previous experience and common conventions, such as:

TCK is known to have a pull-down resistor
TDO is the only output among the JTAG signals
Reset indicators - when we pulled down the reset with a 1Kohm resistor, the printer restarted
TRST is the only one with a 1Kohm pull-up and a 10Kohm input (still not certain though)

We were left with two unknown signals and one uncertain one. In the worst-case scenario, that gave us P(4,3) = 4!/(4-3)! = 24 permutations, so we decided to use the only pink gadget the world has, the JTAGULATOR, to try to identify the pins.
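The remaining search space is small enough to list explicitly. In the sketch below, both the signal names and the pin numbers are hypothetical placeholders, since the actual candidates are not stated in the text; only the count of 24 comes from it.

```python
from itertools import permutations

# Hypothetical placeholders: three unresolved signals, four candidate pins.
unresolved_signals = ("TMS", "TDI", "TRST")
candidate_pins = (4, 6, 8, 10)

assignments = [
    dict(zip(unresolved_signals, pins))
    for pins in permutations(candidate_pins, len(unresolved_signals))
]
print(len(assignments))
for assignment in assignments[:3]:
    print(assignment)
```

Each dictionary is one candidate wiring to try, which is essentially what an automated pin finder iterates over.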

The JTAGULATOR's main purpose is to find the right permutation by trying to read the IDCODE for each one. Because we already knew some of the signals, the brute-force should have been much easier. Unfortunately, it could not find the right permutation, so we decided to try doing it manually with a SEGGER J-Link probe.

Again, we could not find the right permutations, but did make some interesting observations. We noticed that when we assigned the #10 pin as TDI, every time we pulled it down, TDO pulled down as well. The same thing happened when we pulled it up:

Our conclusion was that pin #10 was TRST, as we suspected. At that point, we decided to drop the JTAG branch and start trying to extract the firmware directly from the Flash memory.

Reading the NAND Flash

Background

NAND Flash is Flash memory built from floating-gate transistors that are connected in series and resemble a NAND gate. The NAND Flash is built of blocks, with each block containing several pages. A page is the smallest unit that can be read. When we want to write, we have to erase a full block and then write the relevant pages. The Flash assembled in the Lexmark printer is a Macronix MX30LF1G08.

The structure of that Flash

A page is 2048 bytes.
Each page has a spare area of 64 bytes.
A block contains 64 pages.
The Flash contains 1K blocks.

The total size of the Flash is 64K*(2048+64) = 138,412,032 bytes, or about 132MB. When we subtracted the spare bytes we got 128MB (1Gbit). The purpose of the spare bytes is to manage the Flash, but they can also be used for user data or, more commonly, for an error correction code (ECC) such as a Hamming code.
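The geometry arithmetic can be laid out as a short calculation. The constant names are ours; the values are the ones listed above, with 1K blocks taken as 1024.

```python
PAGE_DATA = 2048        # main-area bytes per page
PAGE_SPARE = 64         # spare bytes per page
PAGES_PER_BLOCK = 64
BLOCKS = 1024           # "1K blocks"

page_raw = PAGE_DATA + PAGE_SPARE                  # bytes per page, with spare
block_raw = page_raw * PAGES_PER_BLOCK             # bytes per block, with spare
total_raw = block_raw * BLOCKS                     # full dump size
total_data = PAGE_DATA * PAGES_PER_BLOCK * BLOCKS  # dump size without spare

print(page_raw, block_raw)
print(total_raw, total_raw / 2**20)
print(total_data, hex(total_data), total_data / 2**20)
```

With these values, a raw page is 2112 bytes, a raw block is 135,168 bytes, the full dump is 138,412,032 bytes (132 MiB), and the main area alone is 134,217,728 bytes (128 MiB, or 1 Gbit).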

De-soldering the Flash

We decided to de-solder the Flash instead of tapping its I/Os in place. We used a Weller hot air station to de-solder it. Before de-soldering, we protected the rest of the circuit around the Flash with high-temperature tape, a.k.a. Kapton tape.

After de-soldering it, we needed to clean the remaining solder from the contacts.

This allowed us to finally read the content of the Flash memory with a universal memory programmer. We used the Elnec BeeProg2. In our experience it is a very reliable programmer, but it is pretty pricey.

After we read the content, we got a binary blob of about 132MB (138,412,032 bytes).

Binary Exploration

When we opened the Flash image in a hex editor, we noticed the string "programz" was found every 2112 bytes.

The value 2112 = 2048 + 64 bytes, which is the exact size of each page with its spare bytes. We assumed the string was a signature of the management library they used. If the Flash is managed, we wanted to understand the method, because the image could contain non-contiguous blocks that would invalidate the data. We were able to determine that the data in the spare bytes was not 100% ECC, because we found the "programz" string there.
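A check like this can also be automated instead of done by eye in a hex editor. The sketch below lists every offset of a marker and reports the most common spacing between hits; the helper names are ours, and the synthetic image places the marker at the start of the spare area purely as an assumption for the demo.

```python
from collections import Counter


def marker_offsets(data: bytes, marker: bytes = b"programz"):
    """Return the offset of every occurrence of marker in data."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1)
    return offsets


def dominant_spacing(offsets):
    """Return the most common gap between consecutive offsets, or None."""
    gaps = Counter(b - a for a, b in zip(offsets, offsets[1:]))
    return gaps.most_common(1)[0][0] if gaps else None


# Synthetic four-page image; the marker position inside the page is assumed.
page = b"\xAA" * 2048 + b"programz" + b"\xFF" * 56
image = page * 4
print(dominant_spacing(marker_offsets(image)))
```

On a real dump, a dominant spacing equal to the raw page size is a strong hint that the marker lives in the per-page spare area.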

Block Management

If the Flash is managed and contains bad blocks that can't be used, we expected to find a run of zeros the size of one block (135,168 bytes). First, we ran an entropy test to see if the information was continuous.

When we saw there were some gaps, we checked whether they contained zeros or something else. We found that at each entropy drop, there was a sequence of 0xFF bytes. Our next test was to look for any sequence of zeros, excluding the spare bytes, to see if we had missed any bad blocks. The longest sequence of zeros we found was 0xF84 (3972) bytes. This was more than one page, but less than a whole block. Therefore, our conclusion was that there were no bad blocks at all. As a result, we could read the whole Flash and ignore the spare bytes.
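Both checks, looking for blocks filled with a single value and finding the longest run of a given byte, are easy to script. The following is a minimal sketch with hypothetical helper names; it assumes the caller passes in the data with the spare bytes already removed when that is what the test calls for.

```python
import re


def longest_run(data: bytes, value: int):
    """Return (offset, length) of the longest run of byte `value`, or (-1, 0)."""
    pattern = re.compile(re.escape(bytes([value])) + b"+")
    best = (-1, 0)
    for match in pattern.finditer(data):
        length = match.end() - match.start()
        if length > best[1]:
            best = (match.start(), length)
    return best


def uniform_blocks(data: bytes, block_size: int):
    """Return (block_index, fill_byte) for every block holding a single value."""
    found = []
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        if block and block.count(block[0]) == len(block):
            found.append((off // block_size, block[0]))
    return found


sample = b"\x01" * 10 + b"\x00" * 5 + b"\x02" + b"\x00" * 3
print(longest_run(sample, 0x00))
```

A longest zero run shorter than one block, as we observed, means `uniform_blocks` would report no all-zero block for the same data.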

Conclusion

The most productive and efficient way to extract the Lexmark firmware from hardware is to de-solder the Flash and read it with a universal programmer. The Flash is probably partially managed. If there is a bad block, the controller should skip it and read the next block. Because we didn't find any bad blocks, we could consider the data as a continuous stream. After excluding the spare bytes, we got 128MB (2048*64k =0x8000000 bytes).
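Removing the spare bytes from a raw dump amounts to keeping the first 2048 bytes of every 2112-byte page. The sketch below assumes the programmer stores each page's spare area directly after its main area, which matches the 2112-byte periodicity we saw but is still an assumption about the dump layout; the function name and the file names in the usage comment are hypothetical.

```python
PAGE_DATA = 2048
PAGE_SPARE = 64


def strip_spare(raw: bytes, page_data: int = PAGE_DATA,
                page_spare: int = PAGE_SPARE) -> bytes:
    """Drop the spare area that follows each page's main area."""
    page_raw = page_data + page_spare
    if len(raw) % page_raw != 0:
        raise ValueError("dump length is not a whole number of raw pages")
    return b"".join(
        raw[off:off + page_data] for off in range(0, len(raw), page_raw)
    )


# Usage with hypothetical file names:
# with open("flash_raw.bin", "rb") as f:
#     data = strip_spare(f.read())
# with open("flash_data.bin", "wb") as f:
#     f.write(data)
```

For a full dump of this chip, the output should be exactly 0x8000000 bytes long.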

Limitation:

There is a chance that some of the data is corrupted. If so, fixing it would require using the information in the spare bytes, for example any ECC stored there.

To see how and what we learned from further firmware analysis and file extraction, please read Part B of this blog.

References

“Entropy (information theory)”, Wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Entropy_(information_theory)
"Universal asynchronous receiver-transmitter", Wikipedia, the free encyclopedia. https://en.wikipedia.org/wiki/Universal_asynchronous_receiver-transmitter
Randy Johnson, Steward Christie (Intel Corporation, 2009), JTAG 101—IEEE 1149.x and Software Debug.
JTAGULATOR, GRAND IDEA STUDIO. http://www.grandideastudio.com/jtagulator/
“Flash memory”, Wikipedia, the free encyclopedia. https://en.wikipedia.org/wiki/Flash_memory
