YARA rules are an essential tool for security researchers that help them identify and classify malware samples. They do so by describing patterns and strings within malware code that can help an analyst identify known or new threats. YARA rules are also often integrated within commercial detection tools, or used internally to detect misbehaving binaries on the enterprise network.
But what if we don't have the malicious file we are writing rules for? What if we want to create a "malware" file using YARA rules as input? Today, Team82 is making freely available via our GitHub repository a tool we call Arya that does just that. Arya can be used to generate custom-made, pseudo-malware files that trigger antivirus (AV) and endpoint detection and response (EDR) tools, just like the good old EICAR test file. Arya has a number of use cases, including malware research, YARA rule QA testing, and pressure testing a network with code samples built from YARA rules.
Arya is a first-of-its-kind tool; it produces pseudo-malicious files meant to trigger YARA rules. The tool reads the given YARA (.yar suffix) files, parses their syntax using Avast's yaramod package—the YARA parsing engine used in this research—and builds a pseudo "malware" file, carefully placing desired bytes from the YARA rules to trigger the input rules.
The goal of the tool is to generate a tailor-made pseudo-malicious file that detection sensors such as AV or EDR will identify as the malware file an input YARA rule is meant to detect. To achieve this goal, we not only add the necessary signatures, strings, and bytes from the input YARA rules, but also add some "touches" such as real PE headers, increased output-file entropy, x86 bytecode, and function prologue/epilogue assembly code. All of this helps trigger AV/EDR tools and gets past some of the heuristic checks they might apply.
Researchers may also use Arya to generate pseudo-malicious files using YARA rules as building blocks. The resulting files will be identified as specific malware. For example, suppose you don't have a Zeus malware sample but want to check how your AV reacts to one. No problem: load the Zeus YARA rules into Arya and generate your own "Zeus"-like pseudo-malware that AV/EDR tools will identify as Zeus.
Arya can also be used as part of incident response training—similar to purple-teaming—where pseudo-malicious files can be sent across the network to pressure test sensors and detectors in the network.
Arya currently supports the following:
Types:
- Strings (ASCII and Wide)
- Hex Streams (including Jumps and Alternations)
Arya functionalities:
- At operator
- Int Functions (uint32, int16be, etc.)
- Of operator (e.g. all of ($s*))
- RegEx type support
- Base64 type support
- FileSize Operator
- Range Operator
- String Count Operator
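To make the string types above concrete, here is a minimal sketch of how a plain string and a hex stream could be turned into bytes that satisfy them. It is not Arya's code: the helper names `encode_plain` and `materialize_hex` are hypothetical, and the sketch only handles whole-byte wildcards, jumps, and non-nested alternations.

```python
import random
import re


def encode_plain(text: str, wide: bool = False) -> bytes:
    """Encode a plain YARA string as ASCII, or as 'wide' (each byte followed by 0x00)."""
    data = text.encode("ascii")
    if wide:
        return b"".join(bytes([b, 0]) for b in data)
    return data


def materialize_hex(pattern: str, rng: random.Random) -> bytes:
    """Produce one concrete byte sequence that matches a hex pattern.

    Handles whole-byte wildcards (??), jumps ([n] or [n-m]) and non-nested
    alternations ((A | B)). Nibble wildcards such as 4? are not handled.
    """
    # Resolve each alternation by keeping its first branch.
    pattern = re.sub(r"\(([^()|]*)\|[^()]*\)", lambda m: m.group(1), pattern)
    out = bytearray()
    for token in re.findall(r"\[\d+(?:-\d+)?\]|\?\?|[0-9A-Fa-f]{2}", pattern):
        if token == "??":
            # A wildcard matches any byte, so any value will do.
            out.append(rng.randrange(256))
        elif token.startswith("["):
            # A jump matches any bytes; emit the shortest allowed run.
            low = int(token[1:-1].split("-")[0])
            out.extend(rng.randrange(256) for _ in range(low))
        else:
            out.append(int(token, 16))
    return bytes(out)
```

For instance, `encode_plain("MZ", wide=True)` yields the two characters interleaved with zero bytes, and `materialize_hex("6A 40 ?? [2-4] (AA BB | CC) 5D", random.Random(0))` yields eight bytes that begin with `6A 40` and end with `AA BB 5D`.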
YARA files (.yar extension) are text files that contain one or more YARA rules. In this project, we used yaramod to turn YARA rules into a list of rules represented by Python objects, which can then be used to access the internal contents of a rule, such as strings, types, conditions, and more. The yaramod package parses YARA rules into an abstract syntax tree (AST).
Traversal of the abstract syntax tree is done by using a combination of the Observer and Visitor design patterns. For every node the code visits, it determines which bytes and strings it needs to place, and where, in order to trigger the condition in the subtree it traverses. As a result, the traversal produces a mapping of strings to the possible offsets at which they can be placed; it can also reserve some of those offsets in the file. This mapping is passed to the placer mechanism for further processing.
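The traversal idea can be sketched with a toy AST and a small visitor. This is an illustration only: the node classes (`StringRef`, `StringAt`, `StringIn`, `And`), the `Placement` record, and `MappingVisitor` are hypothetical stand-ins and not yaramod's or Arya's actual types.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class StringRef:  # condition: $a
    identifier: str


@dataclass
class StringAt:  # condition: $a at 0
    identifier: str
    offset: int


@dataclass
class StringIn:  # condition: $a in (100..200)
    identifier: str
    low: int
    high: int


@dataclass
class And:  # condition: <left> and <right>
    left: "Node"
    right: "Node"


Node = Union[StringRef, StringAt, StringIn, And]


@dataclass
class Placement:
    """One mapping record: a string plus the offset window it may occupy."""
    identifier: str
    min_offset: Optional[int] = None
    max_offset: Optional[int] = None


class MappingVisitor:
    """Walks the tree and records where each string has to be placed."""

    def __init__(self) -> None:
        self.placements: List[Placement] = []

    def visit(self, node: Node) -> None:
        # Dispatch on the node's class name, as a classic visitor does.
        getattr(self, "visit_" + type(node).__name__)(node)

    def visit_StringRef(self, node: StringRef) -> None:
        self.placements.append(Placement(node.identifier))

    def visit_StringAt(self, node: StringAt) -> None:
        self.placements.append(Placement(node.identifier, node.offset, node.offset))

    def visit_StringIn(self, node: StringIn) -> None:
        self.placements.append(Placement(node.identifier, node.low, node.high))

    def visit_And(self, node: And) -> None:
        # Both sides must hold, so both subtrees contribute placements.
        self.visit(node.left)
        self.visit(node.right)
```

Visiting `And(StringAt("$mz", 0), And(StringRef("$a"), StringIn("$b", 100, 200)))` collects three records: `$mz` pinned to offset 0, `$a` with no constraint, and `$b` limited to offsets 100 through 200.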
For example, below is a simple YARA rule with a few conditions:
In the AST parsing engine, this will be represented as:
And finally the mapping will look something like:
The table above closely mirrors the mapping representation in the code. The only difference is that internal yaramod types are used instead of the string representations. These types make it easy to get the YARA string type (e.g. Hex, Base64, Plain Wide), and also provide a parsed version of the string which is easier to handle.
After the table is produced for every rule in the file, it is passed to the placer, which decides the best location for each mapping record, considering the minimum and maximum offsets, the spots pre-reserved during the AST traversal, the type of the string, and the YARA action in the condition. The placer has its own data structure that manages the byte stream of the final file, reserves spots, and fills the empty sections in the file.
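A minimal sketch of such a placer follows. The `Placer` class, its first-fit search, and the repeating filler are assumptions made for illustration; Arya's real placer may choose positions differently.

```python
from typing import Optional


class Placer:
    """Keeps the output byte stream plus a map of which bytes are reserved."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)
        self.used = [False] * size

    def _fits(self, start: int, length: int) -> bool:
        end = start + length
        return start >= 0 and end <= len(self.buf) and not any(self.used[start:end])

    def place(self, data: bytes, min_offset: int = 0,
              max_offset: Optional[int] = None) -> int:
        """Write data at the first free offset in [min_offset, max_offset]."""
        last = len(self.buf) - len(data)
        if max_offset is not None:
            last = min(last, max_offset)
        for start in range(min_offset, last + 1):
            if self._fits(start, len(data)):
                end = start + len(data)
                self.buf[start:end] = data
                self.used[start:end] = [True] * len(data)
                return start
        raise ValueError("no free spot satisfies the offset constraints")

    def fill_holes(self, filler: bytes) -> None:
        """Fill every unreserved byte from a (repeating) non-empty filler."""
        for i, taken in enumerate(self.used):
            if not taken:
                self.buf[i] = filler[i % len(filler)]
```

In this sketch, records pinned to a single offset should be placed before unconstrained ones, so that a floating string never occupies a position another record requires.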
Arya's main purpose is to trigger malware detection engines. In order to do that, we have to consider some things that matter in the internal workings of these AV and EDR engines.
The first consideration is native bytecode entropy. This is important because antivirus software currently measures the entropy of a file's code to check whether it is packed, encrypted, or obfuscated. Sometimes entropy is also used for comparisons with other malicious items.
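Entropy in this context usually means Shannon entropy measured in bits per byte, which ranges from 0 (a single repeated byte value) to 8 (all byte values equally likely). The sketch below computes it; the function name `shannon_entropy` is ours, and the exact measure any given AV engine uses is an assumption.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A buffer of identical bytes scores 0, while a buffer containing each of the 256 byte values exactly once scores 8. Compiled x86 code typically falls between the two, whereas packed or encrypted data sits close to 8.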
The "empty holes" in the output file, where none of the rules specify what to place, are filled with x86 bytecode taken from another malware sample. The user may specify any malware sample or other file to take the bytecode from (using the -m option followed by the file name).
Another consideration is x86 functions, in particular function counts. Today, a wide variety of antivirus and detection software might check the number of functions in a potentially malicious file. A file with more functions than a standard number could be considered malicious. Therefore, we decided to add as many function prologues and epilogues as possible, while retaining their correct order in the code.
Example function x86 prologue and epilogue:
We decided to consider them in the following way:
Where the two question marks (??) stand for a randomly generated number divisible by 4, encoded as a single byte.
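As a rough sketch of this step, the code below wraps a body of bytes in a function prologue and epilogue. It assumes the common 32-bit frame setup (`push ebp; mov ebp, esp; sub esp, imm8`) and teardown (`mov esp, ebp; pop ebp; ret`); the exact byte sequences Arya emits may differ, and `make_function_stub` is a hypothetical name.

```python
import random

# push ebp ; mov ebp, esp ; sub esp, <imm8 follows>
PROLOGUE = bytes([0x55, 0x8B, 0xEC, 0x83, 0xEC])
# mov esp, ebp ; pop ebp ; ret
EPILOGUE = bytes([0x8B, 0xE5, 0x5D, 0xC3])


def make_function_stub(body: bytes, rng: random.Random) -> bytes:
    """Wrap body in a prologue/epilogue pair with a random frame size."""
    # The ?? byte: a random stack-frame size divisible by 4. It stays below
    # 0x80 because the imm8 operand of `sub esp` is sign-extended.
    frame_size = rng.randrange(4, 0x80, 4)
    return PROLOGUE + bytes([frame_size]) + body + EPILOGUE
```

Concatenating several such stubs keeps every prologue ahead of its matching epilogue, which preserves the ordering the text describes.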
Both of these additions make the file look more like a real malware file. As a result, Arya's pseudo-malicious output files trigger more antivirus software and detection engines, which detect them as real malware.
Today's release of Arya gives security researchers, network analysts, and incident response teams an effective tool to test YARA rules, their software and themselves. YARA rules are an essential means of classifying and identifying malware samples. And Arya is a means by which organizations can test the security of their networks, train their IR teams and also improve their defense tools and software.
We invite you to download Arya from our GitHub repository, and to join our Team82 Research Slack community to discuss it and share best practices and success stories with our research team and peers.