The Silent Spy Among Us: Smart Intercom Attacks

Vera Mens

/ March 9th, 2023

UPDATE, March 13:

Akuvox has reached out to Team82 informing us that it has confirmed the 13 vulnerabilities we uncovered in the E11 smart intercom, and said it will update firmware for these devices before March 20.

In a statement to customers and partners emailed to Team82, Akuvox said:

“We noticed a recent research report about Akuvox E11 vulnerabilities from Claroty and relevant media coverages. Once we confirmed the existence of the vulnerabilities, we have given the top priority to patch the vulnerabilities. An updated firmware will be released before March 20, 2023 and be available on the Akuvox Knowledge Base.

“Additionally, Akuvox strictly complies with the laws and regulations in all countries and regions where we operate and is committed to continuously enhancing product security to meet the most stringent requirements and to best protect the users of our products.”

What started out as a journey to learn more about a new smart intercom inside the Claroty offices turned into an expansive Team82 research project that uncovered 13 vulnerabilities in the popular Akuvox E11. The vulnerabilities could allow attackers to execute code remotely in order to activate and control the device’s camera and microphone, steal video and images, or gain a network foothold.

Vulnerabilities and their main attack vectors
Our Research Journey
Explore the Firmware
Akuvox Web Server Emulation
Vulnerability Research: Local Environment
Unprotected Routes
Command Injection for Easy Pwn
Vulnerability Research: Cloud Environment
SIP and Cross-Account Abuse
Mitigations
CVEs

Vulnerabilities and their main attack vectors:

Remote code execution within the local area network
Remote activation of the device’s camera and microphone and transmission of data back to the attacker
Access to an external, insecure FTP server and the download of stored images and data

The vulnerabilities remain unpatched after many unsuccessful attempts to contact and coordinate the disclosure with the Chinese vendor, a global leader in SIP-based smart intercoms. Our efforts to reach Akuvox began in January 2022, and along the way several support tickets were opened by Team82 and immediately closed by the vendor before our account was ultimately blocked on Jan. 27, 2022.

We involved the CERT Coordination Center (CERT/CC), which also made multiple attempts to contact the vendor to no avail. After months of failed attempts, we disclosed our findings to CISA in December 2022; CISA also had no success in working with Akuvox, and today published an advisory describing 13 vulnerabilities found by Team82. The flaws include missing authentication, hard-coded encryption keys, missing or improper authorization, and the exposure of sensitive information to unauthorized users.

Our Research Journey

Open Sesame

Last year, as Claroty’s business grew, so did the number of its employees. Very soon our startup-size offices were too small and we had to move to others. When we arrived at our shiny new offices, something new greeted us at the door: an Akuvox E11 smart intercom.

A new door phone is not very exciting for most people, but we are security researchers; we see a camera attached to an ethernet cable and our heart pumps faster.

Our first notion in poking around this new connected device was to figure out if this could make our team’s life simpler. For example, having our research space near the closest office entrance meant that we’d spend a lot of time getting up from our desk letting people in if the receptionist wasn’t around—not fun.

We decided to look for an API that we could use to open the door; given this is a smart intercom, there must be one. Fortunately, we didn’t have to dig too deeply; it was right in the documentation:

Unlock Door Instructions — An API that allows users to remotely unlock doors secured by Akuvox E11 devices.

So we could use an API that takes credentials for authentication and an action, which in our case would be to open the door:

http://IP/fcgi/do?action=OpenDoor&UserName=XXX&Password=XXX&DoorNum=1

From there, we were able to write a simple Slack bot using this API and never have to get up out of our chairs again.
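To make the request format concrete, here is a minimal Python sketch that assembles the same URL. The function name and the example address are illustrative assumptions, not part of the Akuvox documentation or of the Slack bot itself; only the query parameters come from the URL quoted above.

```python
from urllib.parse import urlencode


def build_open_door_url(ip: str, username: str, password: str, door_num: int = 1) -> str:
    """Return an OpenDoor request URL in the format quoted in the article."""
    # Parameter names and their order mirror the documented URL.
    query = urlencode({
        "action": "OpenDoor",
        "UserName": username,
        "Password": password,
        "DoorNum": door_num,
    })
    return f"http://{ip}/fcgi/do?{query}"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address; XXX are placeholder credentials.
    print(build_open_door_url("192.0.2.10", "XXX", "XXX"))
```

Note that the credentials travel in the query string over plain HTTP, so anyone able to observe the traffic can read them.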

Clarobot Interface — A view of the Slack bot written by Team82 using the Akuvox API.

Naturally, our curiosity wasn’t satisfied, and we wanted answers to more questions, such as: Are all Akuvox API calls protected by a username and password? What other actions can be performed?

To find out, and rather than brick a production device and anger our IT department, we bought our own device.

Akuvox Intercom eBay — Team82 purchased an Akuvox E11 to conduct its research.

Explore the Firmware

While we waited for the physical device to arrive, we were able to download the firmware online:

We discovered the firmware was not encrypted, which meant we could unpack it with the binwalk utility:

Let’s extract it and see what we could find:

First was a SquashFS partition with a Linux-like directory structure, which made the firmware easier to emulate and review.
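For readers curious about what a tool like binwalk looks for, the sketch below scans a firmware blob for the SquashFS magic bytes. It assumes a little-endian SquashFS image (magic "hsqs"), which the article does not state, and the function name is hypothetical. binwalk validates far more than this, so a hit here is only a candidate offset.

```python
from typing import List

# Magic bytes at the start of a little-endian SquashFS superblock (assumed layout).
SQUASHFS_MAGIC_LE = b"hsqs"


def find_squashfs_offsets(blob: bytes) -> List[int]:
    """Return every offset at which the SquashFS magic appears in the blob."""
    offsets = []
    start = blob.find(SQUASHFS_MAGIC_LE)
    while start != -1:
        offsets.append(start)
        start = blob.find(SQUASHFS_MAGIC_LE, start + 1)
    return offsets


if __name__ == "__main__":
    # Synthetic stand-in for a firmware image: 16 bytes of header, then the magic.
    fake_firmware = b"\x00" * 16 + SQUASHFS_MAGIC_LE + b"\x01" * 8
    print(find_squashfs_offsets(fake_firmware))
```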

Our next step was to find its main services. The best candidate was the local web server, which would allow us to discover where configurations reside (because we could follow where they are saved), and what actions can be performed on the device.

Usually, the web server is brought up right at initialization, making it straightforward to find the locations of its configuration and implementation files. Therefore, we looked at the init script directory and searched for interesting scripts and binary executions. Finally, in the init.sh script, we found a lighttpd web server initialization process and were able to continue our journey.

The web server configuration resides in: /app/config/web/lighttpd.conf

By digging further into the configuration files, we learned that the web server uses the FCGI module and that the pages are served by /app/bin/fcgiserver.fcgi. The implementation of the main functionalities resides in /app/lib/libservlets.so.

Akuvox Web Server Emulation

Since we hadn’t received the physical device yet, but had the binaries of the web server, why not just run them? Akuvox devices, like many other embedded devices, run on ARM-based CPUs. Therefore, we needed a Raspberry Pi or similar device to run the binaries within the Akuvox firmware. Luckily for us, we always have a couple of Raspberry Pis lying around.

Akuvox Web Server Simulation — A Raspberry Pi used to emulate the Akuvox E11 web server.

Since the web server uses absolute paths for file access, we wouldn’t be able to run the binaries inside the RPi root file system; all of its paths would be relative to the RPi file system. This was solved by using chroot on the “/” directory of the Akuvox firmware; once inside the chrooted file system, all paths would be relative to the firmware’s file system.

The procedure of chrooting to another file system is always similar, and goes something like this:

> mkdir $SQUASH_FS/system $SQUASH_FS/proc $SQUASH_FS/dev $SQUASH_FS/tmp

> sudo mount -t proc /proc $SQUASH_FS/proc/

> sudo mount --rbind /dev $SQUASH_FS/dev/

> echo 'export SHELL=/system/bin/sh' >> ./system/etc/mkshrc

> sudo chroot . /system/bin/sh

Now that we were within the Akuvox’s file system, we could bring up the web server:

> /app/bin/lighttpd -D -f /app/config/web/lighttpd.conf -m /app/lib

From there, we were able to browse http://IP_OF_RPI and access an Akuvox web interface:

Akuvox Login — The Akuvox web interface.

Vulnerability Research: Local Environment

Starting with the Web

The default password for the local web interface is admin/admin (noted in the intercom’s documentation). Our goal was then to explore the web server and find interesting and unprotected routes.

After some poking at the main binary, we found an unprotected route (CVE-2023-0354):

http://IP/fcgi/do?id=6&id=2&Operation=GetDivContent&DivName=[REDACTED]

UI screen:

This route is used to extract the device configuration, which includes information about the device, and credentials for various services embedded in the configuration:

The passwords, however, are encrypted (the cloud token is not):

Let’s see how the encryption is implemented and whether the development team made mistakes that could help us decrypt the passwords:

ptr to enrypted — libcfg.so: AesAndBase64Encrypt used to encrypt password with a hardcoded key.

We found the key needed for decryption embedded within the firmware.

Despite the function name (aes_decrypt), the encryption is not the AES you are familiar with, but a proprietary cipher (CVE-2023-0353). Since the decryption code resides on the firmware, we can use it to decrypt the credentials. (Fraunhofer, a German research organization, reported the issue before this research. Refer here for more information.)

Let’s decrypt the password used for web login:

The credentials are admin/admin (as we expected in our newly chrooted environment). That is a cool finding because it means that we have a web authentication bypass for every network accessible device. Using Censys we found about 5,000 of these devices exposed to the internet:

Unprotected Routes

Another interesting route is the following one, which we redacted:

http://IP/fcgi/do?id=8&Operation=GetDivContent&DivName=[REDACTED1]

http://IP/fcgi/do?id=8&Operation=GetDivContent&DivName=[REDACTED2]

Those routes enable network sniffing on the device and do not require authentication. This is bad by itself, but it is even worse because most of the communication to and from the device is either unencrypted or improperly encrypted, meaning that passwords for services, images, and logs are effectively transmitted in plain text.

Conveniently enough, the generated configuration and PCAP files can be downloaded without authentication as well.

http://IP/[REDACTED]/config.tgz

http://IP/[REDACTED]/phone.pcap

This information allowed us to capture network activity from the device and see all communication with the cloud and with local parties. It also allowed us to capture SIP calls. By decoding the packets, we can recreate the video and voice that were communicated.
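As a rough illustration of how little is needed to start working with such a capture, the sketch below walks the records of a capture file using only the Python standard library. It assumes the download is a classic little-endian pcap file with microsecond timestamps (magic 0xA1B2C3D4), which the article does not confirm, and the function name is hypothetical. Decoding SIP or media streams from the frames would require further protocol parsing that is not shown here.

```python
import struct
from typing import Iterator, Tuple

PCAP_MAGIC_LE = 0xA1B2C3D4  # classic pcap, little-endian, microsecond timestamps
GLOBAL_HEADER_LEN = 24
RECORD_HEADER_LEN = 16


def iter_pcap_records(data: bytes) -> Iterator[Tuple[float, bytes]]:
    """Yield (timestamp, raw_frame) pairs from a classic pcap byte string."""
    if len(data) < GLOBAL_HEADER_LEN:
        raise ValueError("too short for a pcap global header")
    magic = struct.unpack_from("<I", data, 0)[0]
    if magic != PCAP_MAGIC_LE:
        raise ValueError("not a little-endian microsecond pcap file")
    offset = GLOBAL_HEADER_LEN
    while offset + RECORD_HEADER_LEN <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from("<IIII", data, offset)
        offset += RECORD_HEADER_LEN
        frame = data[offset:offset + incl_len]
        if len(frame) < incl_len:
            break  # truncated capture: stop rather than guess at the missing bytes
        offset += incl_len
        yield ts_sec + ts_usec / 1_000_000, frame


if __name__ == "__main__":
    # Build a tiny synthetic capture with one 3-byte "frame" for demonstration.
    header = struct.pack("<IHHiIII", PCAP_MAGIC_LE, 2, 4, 0, 0, 65535, 1)
    record = struct.pack("<IIII", 10, 500000, 3, 3) + b"abc"
    for timestamp, frame in iter_pcap_records(header + record):
        print(timestamp, frame)
```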

Command Injection for Easy Pwn

Gaining a web authentication bypass is interesting, but we are researchers. We see a box with an Ethernet cable attached to it, and we want to run code on it. Therefore, we looked for a vulnerability that would allow us to execute code from the web interface.

Now with the authentication bypass, we have a new broad attack surface available to us and we can explore the routes that do require authentication.

After reviewing the implementation of most routes, we found a command injection vulnerability (CVE-2023-0351) in the “call log” page:

http://IP/fcgi/do?id=5&id=1:

The actual vulnerability resides in libservlets.so library, CPhoneBookModel::SetContDataByString function:

Path not sanitized snippet — libservlets.so: command injection in SetContDataByString that parses input data from the user when contact is added to the contact list.

The system function executes the following:

system("busybox mv /app/resources/www/htdocs/download/FILE_NAME.jpg /mnt/sdcard/profiles/FILE_NAME.jpg")

FILE_NAME is the filename of the profile picture provided by the user. You cannot see it in the UI because this functionality is hidden, but it becomes visible when we look at the implementation.

By reviewing the code, we see that the FILE_NAME is not sanitized, so in theory we can pass any string we want and it will be concatenated to the rest of the command to be executed.
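The pattern can be illustrated with a short Python stand-in. The device code is native, so this is not the vendor's implementation; the two directory paths come from the command shown above, while the function names and the validation rule are assumptions made for the example. The first function reproduces the risky string concatenation; the second shows one common way to avoid handing user input to a shell.

```python
from typing import List

SRC_DIR = "/app/resources/www/htdocs/download"
DST_DIR = "/mnt/sdcard/profiles"


def unsafe_mv_command(file_name: str) -> str:
    """Mimic the vulnerable pattern: the name is pasted straight into a shell string."""
    return f"busybox mv {SRC_DIR}/{file_name}.jpg {DST_DIR}/{file_name}.jpg"


def safer_mv_argv(file_name: str) -> List[str]:
    """Build an argument vector instead, so that no shell ever parses the name."""
    if not file_name or "/" in file_name or "\x00" in file_name:
        raise ValueError("file name must be a single, non-empty path component")
    return ["busybox", "mv", f"{SRC_DIR}/{file_name}.jpg", f"{DST_DIR}/{file_name}.jpg"]


if __name__ == "__main__":
    # With a shell, $(id) would be executed before mv ever runs.
    print(unsafe_mv_command("$(id)"))
    # As a plain argument, the same text is just an odd-looking file name.
    print(safer_mv_argv("$(id)"))
```

An argument vector like the second one would be passed to an exec-style call (in Python, subprocess.run with a list and without shell=True), so metacharacters in the name are never interpreted.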

There are some limitations:

FILE_NAME/COMMAND cannot include slashes or spaces: This is easy to work around. We hex-escape each slash as \x2f and use command substitution with brace expansion to avoid spaces. We also need some way to review the output, so we redirect the output of the command we want to execute to the “/tmp/download” folder, which can be accessed from the web at http://IP/download/FILE. For example, the command: id > /tmp/download/id_result will become: $(id>$({echo,'\x2f'tmp'\x2f'download'\x2f'id_result}))
FILE_NAME/COMMAND.jpg file must exist in /app/resources/www/htdocs/download:

Let's find a place in which we can upload a file to /app/resources/www/htdocs/download.

Luckily we have found an option to upload a JPG file to ContactProfile module:

Fortunately for us, the file upload functionality does not sanitize the filename, and the file is uploaded to /app/resources/www/htdocs/download.

Now that the file exists, we can issue the request with the command injection:

Let's wrap it all in a Python script:

We can now execute arbitrary code on all accessible devices.

More Unprotected Routes

As a bonus, we have discovered another (well-documented) web interface that shows camera footage in real time—no authentication mechanism is implemented for this interface (CVE-2023-0349):

http://IP:8080/video.cgi

http://IP:8080/jpeg.cgi

http://IP:8080/picture.cgi

Camera Sample — Team82's personal helper showcasing our just-arrived Akuvox device.

Live stream Akuvox — A snippet from Akuvox documentation demonstrating the live stream feature.

This means that anyone with network access (public or LAN) to the device can use this route to see the real time video captured by the intercom. In sensitive areas such as a healthcare facility, this would be a privacy issue that would violate patient privacy regulations, for example.
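Administrators who want to check whether their own device exposes these endpoints could start from a sketch like the following. The three paths and port 8080 come from the URLs listed above; the function names, the timeout, and the rule that a plain HTTP 200 means "exposed" are assumptions for illustration. It should only be pointed at devices you are responsible for.

```python
import urllib.error
import urllib.request
from typing import List, Optional

CAMERA_PATHS = ("/video.cgi", "/jpeg.cgi", "/picture.cgi")


def camera_urls(ip: str, port: int = 8080) -> List[str]:
    """Return the camera endpoint URLs named in the article for one device."""
    return [f"http://{ip}:{port}{path}" for path in CAMERA_PATHS]


def probe_status(url: str, timeout: float = 3.0) -> Optional[int]:
    """Return the HTTP status for a URL, or None if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 401 or 403 when access control is in place
    except (urllib.error.URLError, OSError):
        return None


def looks_unauthenticated(status_code: Optional[int]) -> bool:
    """Assumed rule of thumb: a 200 without credentials means the feed is exposed."""
    return status_code == 200


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address; replace it with your own device.
    for url in camera_urls("192.0.2.10"):
        print(url, looks_unauthenticated(probe_status(url)))
```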

Discovery Services

Once our Akuvox device arrived, it was time to explore its architecture:

We’ve already described our findings about the local web configuration server and web interface; now we need to look at its IP Scanner discovery tool.

IP scanner is a Windows-based application. The download is available here. Its purpose is to search the local network for Akuvox devices:

By looking at Wireshark output, we can see that this is not a familiar discovery protocol. In addition, it looks like the payload is encrypted. To understand the protocol structure and how it is encrypted, we must look into either the Windows application or the binary that responds to the protocol at the device itself. We’ve chosen the latter.

First, let’s see the encryption implementation:

Once again, we see hard-coded passwords used to decrypt the packet data. This time, however, standard AES is used:

So we have everything to decrypt the packet contents. Let’s do it:

This is an XML-formatted message. From its structure, it looks like multiple message types exist. Let’s continue reviewing the binary to see if there are more interesting message types.

Look at this one:

Can it be? A message used for command execution? Without authentication?

We have the code that parses the message, so we can construct it! This is what we came up with:

So we wrote a quick PoC script to send our command via the proprietary protocol.

It works! We have arbitrary command execution on every device within the local network. Cool.

Let’s move to the cloud.

Vulnerability Research: Cloud Environment

Pictures and Cloud: What Could Go Wrong?

SmartPlus, a mobile application, allows a user to control the intercom remotely.

Another feature is the activities screen:

We can view all the interesting "activities" within the app through this feature. If enabled, movement near the intercom's camera can be considered activity. As soon as someone walks past the intercom, the device takes a picture and uploads it to a remote address. That’s how we can see all the activities in the app. Of course, our lab intercom isn't connected to a real door and the individuals are my stuffed animals.

Where do those pictures come from and where are they stored?

Remember, there exists a functionality to sniff the traffic on the intercom. By turning it on, we can see what happens when a picture is taken:

Ok, so this is bad…

Every time a door is opened on any Akuvox door phone in the world, an image is sent to the company’s FTP server. A single user, “akuvox,” is used, and thus a single password for authentication. In addition, the images are stored in the root directory of the server. What is more, the name of each image includes the MAC address, which is a device identifier.

Ok maybe it is not that bad. Can we list the directory?

We can, meaning that an attacker can access the FTP server from any FTP client and see the names of the images that are uploaded, and therefore can download the specific images from the server by name.
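To show why a device identifier in a file name matters, the sketch below pulls a MAC address out of a listing entry. The article does not give the actual naming scheme, so the example file name is entirely hypothetical, as is the function name; the point is only that anyone who can read the names can group images by device.

```python
import re
from typing import Optional

# Six hex byte pairs, optionally separated by ":" or "-", not embedded in a longer hex run.
_MAC_RE = re.compile(
    r"(?<![0-9A-Fa-f])([0-9A-Fa-f]{2}(?:[:-]?[0-9A-Fa-f]{2}){5})(?![0-9A-Fa-f])"
)


def extract_mac(file_name: str) -> Optional[str]:
    """Return the first MAC-like token in a file name as AA:BB:CC:DD:EE:FF, or None."""
    match = _MAC_RE.search(file_name)
    if match is None:
        return None
    digits = re.sub(r"[:-]", "", match.group(1)).upper()
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


if __name__ == "__main__":
    # Hypothetical file name; the real naming convention is not documented here.
    print(extract_mac("001122AABBCC_20230309.jpg"))
```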

SIP and Cross-Account Abuse

Using the FTP vulnerability, we can see pictures from arbitrary devices, but is it possible to trigger this functionality and turn on specific cameras? Remember, we have arbitrary code execution that allows us to take pictures from internet-exposed devices and devices on the local network, but what about the devices behind NAT?

The best place to look for the possibility of turning on a specific camera was the Session Initiation Protocol (SIP). SIP is a communication protocol used for real-time communication sessions between two or more participants over IP networks. SIP controls multimedia communication sessions such as voice and video calls, instant messaging, and online games.

SIP is also an open standard protocol and is widely used for voice over IP (VoIP) applications. It operates on a request-response model and is based on a client-server architecture. SIP clients can initiate communication sessions by sending SIP requests to a SIP server, which will then forward the requests to the appropriate destination.

SIP establishes multimedia sessions involving multiple participants through the use of SIP proxies and SIP servers, which manage communication and routing of data between the participants.
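As a concrete picture of the request-response model, here is a minimal sketch that formats a SIP INVITE request and parses a response status line, following the general message layout of RFC 3261. The addresses, tags, and function names are placeholders chosen for the example, and a real call would also carry an SDP body describing the audio and video streams, which is omitted here.

```python
from typing import Tuple


def build_sip_invite(caller: str, callee: str, call_id: str, local_host: str) -> str:
    """Format a bare-bones SIP INVITE (no SDP body) as a CRLF-delimited string."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        # RFC 3261 branch parameters start with the magic cookie z9hG4bK.
        f"Via: SIP/2.0/UDP {local_host};branch=z9hG4bK-sketch",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag=sketch",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_status_line(line: str) -> Tuple[int, str]:
    """Split a response status line such as 'SIP/2.0 180 Ringing' into (code, reason)."""
    version, code, reason = line.split(" ", 2)
    if version != "SIP/2.0":
        raise ValueError("not a SIP/2.0 status line")
    return int(code), reason


if __name__ == "__main__":
    # example.com names and 192.0.2.1 are reserved for documentation use.
    print(build_sip_invite("lab@example.com", "door@example.com", "abc123", "192.0.2.1"))
    print(parse_status_line("SIP/2.0 180 Ringing"))
```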

This is roughly how it works:

One person calls another, and they can exchange both voice and video over IP. In the context of the Akuvox E11, an administrator can make a call to an intercom they own with the mobile app:

We wanted to know, however, what happens if they call another Akuvox intercom that is not associated with their account.

We tested this using the intercom at our lab and another one at the office entrance. Each intercom is associated with different accounts and different parties. We were, in fact, able to activate the camera and microphone by making a SIP call from the lab’s account to the intercom at the door.

The issue stems from a missing authorization check. The platform does not verify that the caller is the owner of the edge device, and therefore it is possible to place a SIP call to any intercom and, as a consequence, receive its video and audio feed (CVE-2023-0348). This is a similar class of bug to a 2019 Apple FaceTime vulnerability that allowed users to hear audio from the iPhone they were calling before the FaceTime call was accepted.
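The kind of check that appears to be missing can be sketched in a few lines. This is not Akuvox's platform code; the ownership map, the account and device identifiers, and the function name are all hypothetical, and a real service would look up ownership in its own account database before routing the call.

```python
from typing import Dict, Set


def is_call_authorized(caller_account: str, device_id: str,
                       owners: Dict[str, Set[str]]) -> bool:
    """Allow a call only if the caller is registered as an owner of the target device."""
    # Unknown devices have no owners, so calls to them are refused by default.
    return caller_account in owners.get(device_id, set())


if __name__ == "__main__":
    # Hypothetical ownership records: device identifier -> accounts allowed to call it.
    ownership = {
        "lab-intercom": {"lab-account"},
        "office-door-intercom": {"office-account"},
    }
    print(is_call_authorized("lab-account", "lab-intercom", ownership))
    print(is_call_authorized("lab-account", "office-door-intercom", ownership))
```

The design choice worth noting is the default: when no ownership record exists, the call is refused rather than allowed.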

This is where we stopped our research and decided to disclose the vulnerabilities. Unfortunately, the coordination between Team82 and Akuvox did not go as planned, as you can see by the timeline below.

Akuvox Disclosure Timeline

Jan. 23, 2022: Initial disclosure efforts begin; Akuvox has not published a secure email address or product security webpage.
Jan. 24, 2022: Email sent to support@akuvox.com asking for contact details to disclose vulnerabilities.
Jan. 24, 2022: Support ticket number #10125 opened
Jan. 25, 2022: Our support ticket #10125 was closed without explanation
Jan. 26, 2022: We sent another email to support@akuvox.com and explained we are not seeking a bug bounty, but simply are trying to responsibly disclose vulnerabilities. We also added a high-level description of the vulnerabilities and asked for details on how best we should report to Akuvox.
Jan. 26, 2022: Another support ticket was opened
Jan. 27, 2022: The new support ticket was closed again without explanation. In addition, Akuvox blocked our account and we were prevented from opening new tickets.
Jan. 27, 2022: We submitted a report through CERT/CC
Feb. 8, 2022: CERT/CC told us they were attempting to contact the vendor
Feb. 22, 2022: CERT/CC told us multiple attempts to contact the vendor were made without response.
March 3, 2022: A detailed email was sent to Akuvox explaining we tried to contact them multiple times and elaborated our intent. No response.
March 22, 2022: CERT/CC recommended we try to contact an email address associated with the FCC ID (/FCC-ID/2AHCR-VPR49G/). We emailed this address and got no response.
Dec. 5, 2022: The case was transferred to ICS-CERT. They also tried to contact Akuvox with no success.
March 3, 2023: ICS-CERT and CERT/CC confirmed they made multiple attempts to contact the vendor, but received no response.
March 9, 2023: Public disclosure.

Note: It seems that Akuvox fixed the FTP server permissions issue. They disabled the ability to list its content, so malicious actors can no longer enumerate files.

Mitigations

Despite Akuvox’s failure to acknowledge the numerous disclosure attempts made by Team82 and others, we still recommend a number of mitigation measures.

First would be to ensure an organization’s Akuvox device is not exposed to the internet in order to shut off the current remote attack vector available to threat actors. Administrators would, however, likely lose their ability to remotely interact with the device over the SmartPlus mobile app.

Within the local area network, organizations are advised to segment and isolate the Akuvox device from the rest of the enterprise network. This prevents any lateral movement an attacker with access to the device might gain. Not only should the device reside on its own network segment, but communication to this segment should be limited to a minimal list of endpoints. Furthermore, only ports needed to configure the device should be opened; we also recommend disabling UDP port 8500 for incoming traffic, as the device’s discovery protocol is not needed.

Finally, we recommend changing the default password protecting the web interface. Right now the password is weak and included in the documentation for the device, which is publicly available.