From 1st to 3rd November 2023, another CTF event took place: EKOPARTY CTF. It was part of the EKOPARTY Security Conference in Buenos Aires, but the CTF was also available online. It had a genuine retro theme, complete with IRC and a Gopher server. One challenge by Kaspersky was especially interesting to me, because it combined network traffic analysis, exploitation, malware and reverse engineering. I would like to share my solution to this very nice challenge.

We were provided with a single pcap containing captured network traffic from a hacked device. The task was obvious: analyze the evidence and find the flag. The pcap itself was not big, approximately 350 kilobytes, so Wireshark can handle its analysis comfortably. There were several sessions, mostly encrypted with TLS. However, two unencrypted TCP streams were particularly interesting - they look like login attempts with a token to some corporate server. Instead of a token, the client (or rather, the attacker) sent a long string of ‘A’ characters followed by a couple of nonprintable characters and long sequences of ‘U’:

Fig. 1: TCP Stream with login to the corporate server

This is a typical example of shellcode delivered through a buffer overflow exploit. Such exploits usually contain similar filler sequences that write past the bounds of the allocated memory structures. Bytes such as ‘A’, ‘B’, ‘C’ or 0x90 are often used because they are harmless when they are later interpreted as instructions instead of data (where possible, some exploits also try to make the stack executable, so that its content can be run as code). On 32-bit x86, the characters ‘A’, ‘B’ and ‘C’ (0x41, 0x42, 0x43) decode as instructions incrementing the registers ECX, EDX and EBX, while the nonprintable byte 0x90 is the NOP (no operation) instruction.

Let’s analyze the shellcode extracted from the TCP stream. If we want to use IDA Freeware, we first have to convert the shellcode to an .exe file, because the freeware version is restricted to the ELF and Portable Executable formats and refuses to analyze raw binary shellcode. Alternatively, we can use Cutter, an open source reverse engineering platform powered by Rizin (a Radare2 fork). In Cutter, one simple loop is clearly visible: it XORs the rest of the payload with the character ‘U’.

Fig. 2: XORing the rest of payload with 'U'

Now we can repeat the same step ourselves and XOR the payload with ‘U’ (0x55). In CyberChef, we get a payload whose header carries the “ELF” magic - the format used for executable Linux binaries and libraries.

Fig. 3: Decrypting the ELF payload with CyberChef
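The same decoding can be reproduced outside CyberChef with a few lines of Python. This is only a minimal sketch: the function names are my own, and the sample bytes are a stand-in built for illustration, not the real data from the pcap. It also hints at why the encoded payload may contain long runs of ‘U’: any zero byte XORed with 0x55 becomes 0x55.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def xor_single_byte(data: bytes, key: int = 0x55) -> bytes:
    """XOR every byte of data with the same one-byte key."""
    return bytes(b ^ key for b in data)


def looks_like_elf(data: bytes) -> bool:
    """Check for the ELF magic at the start of the decoded payload."""
    return data.startswith(ELF_MAGIC)


if __name__ == "__main__":
    # Illustrative stand-in: an ELF magic plus zero padding, encoded with 'U'.
    encoded = xor_single_byte(ELF_MAGIC + b"\x00" * 8)
    print(encoded)                      # the zero bytes show up as 'U'
    decoded = xor_single_byte(encoded)  # XOR is its own inverse
    print(looks_like_elf(decoded))
```

With the real capture, `encoded` would be the part of the TCP stream that follows the decoder loop, saved from Wireshark as raw bytes.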

After dumping the decrypted payload from CyberChef, we have an ELF file, which IDA Freeware supports. When we load it into IDA, we can see that it is only partially obfuscated - the code is pretty readable, but the strings are obfuscated. It turns out that the strings are XORed again, this time with the character ‘3’ (value 0x33).

Fig. 4: Strings in dumped ELF payload are obfuscated and xored with '3'
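To get a quick overview of such strings without decoding them one by one in IDA, one option is a "strings after XOR" pass over the dumped file. The sketch below is my own helper, not part of the challenge; it assumes that every obfuscated string uses the same one-byte key and that the terminating NUL bytes were XORed as well (so they appear as 0x33 in the file). The sample blob is made up for illustration.

```python
import re
from typing import List


def xor_strings(blob: bytes, key: int = 0x33, min_len: int = 4) -> List[str]:
    """XOR the whole blob with key and return printable ASCII runs."""
    decoded = bytes(b ^ key for b in blob)
    # Printable ASCII range is 0x20..0x7e; keep only runs of at least min_len.
    pattern = re.compile(b"[\x20-\x7e]{" + str(min_len).encode() + b",}")
    return [m.group().decode("ascii") for m in pattern.finditer(decoded)]


if __name__ == "__main__":
    # Made-up example: two obfuscated strings separated by XORed NUL bytes.
    enc = lambda s: bytes(c ^ 0x33 for c in s)
    blob = enc(b"hello world") + b"\x33" + enc(b"ab") + b"\x33"
    print(xor_strings(blob))
```

Run against the real ELF, this will also print noise from XORed code and headers, so the output still needs a manual look.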

With this finding, we can decrypt a couple of strings at the beginning of the main function - the IP address of the attacker (the same IP address we observed in Wireshark) and the path to the db file on the victim's corporate server.

Fig. 5: IP address of the attacker and path of victim's database

After the ELF payload is executed, it maps the db file into memory (with an mmap call) and encrypts the data from the db file with a derived secret key. It then decrypts the attacker's IP address and connects to that address on port 4919 (0x1337 in hex). Finally, it sends the encrypted data from the db file to the attacker - see Figure 6.

Fig. 6: Encrypting the data from db file and sending to the attacker

What about the derivation of the secret key? There is a function for it, but in the disassembler it looks ugly - a lot of low-level arithmetic operations. This usually means that the original code contained some kind of integer division or modulo, and the compiler replaced it with bit-level tricks for better performance. Thanks to the IDA Free cloud decompiler, we can turn this mess into readable C pseudocode, see Figure 7. Once again, XOR with 0x33 deobfuscates the hardcoded secret string “5P3R_S3CR3T_K3Y”, which is then concatenated with the current date. And yes, there is an integer division, or rather a modulo 100 applied to the current year. So we know the secret string, but what about the date? We have the pcap with the captured network traffic, and it contains timestamps - the traffic was captured on 26th October 2023, so this was the current date when the malware was executed. This means the derived secret key was “5P3R_S3CR3T_K3Y23/10/26”.

Fig. 7: Derivation of the secret key based on hardcoded string and date
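The key derivation can be written down in Python as a small sketch. The YY/MM/DD layout matches the key recovered above; the zero padding of single-digit months and days is my assumption, since the capture date does not exercise it. The helper `candidate_keys` is also my own addition: it produces keys for the neighbouring days, which may be handy if the clocks of the victim and of the capturing machine disagree around midnight.

```python
from datetime import date, timedelta
from typing import List

SECRET = "5P3R_S3CR3T_K3Y"  # hardcoded string recovered from the sample


def derive_key(day: date, secret: str = SECRET) -> str:
    """Append the date as YY/MM/DD (year modulo 100) to the secret."""
    # Assumption: month and day are zero-padded to two digits.
    return f"{secret}{day.year % 100:02d}/{day.month:02d}/{day.day:02d}"


def candidate_keys(day: date, spread: int = 1) -> List[str]:
    """Keys for the given day and its neighbours, earliest first."""
    return [derive_key(day + timedelta(days=d)) for d in range(-spread, spread + 1)]


if __name__ == "__main__":
    print(derive_key(date(2023, 10, 26)))
    print(candidate_keys(date(2023, 10, 26)))
```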

At this point, we are just one step away from finishing the analysis of this sample and decrypting the data transmitted by the malware. We need to figure out which encryption algorithm is used. Let’s look into the function my_encrypt_data from Figure 6. It contains one call that initializes an array for the key stream, followed by a loop that XORs the data with the key stream. Based on the constants (0xff) and loops in the initialization function, and on the operations in the encryption loop, this is clearly RC4. You can compare the decompiled pseudocode in Figure 8 with the pseudocode on Wikipedia.

Fig. 8: RC4 encryption
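For comparison with the decompiled code, here is textbook RC4 as a short Python sketch. It is the standard algorithm, not a transcription of the sample, so variable names and structure differ from Figure 8. The two parts to look for in the pseudocode are the 256-entry state initialization (key scheduling) and the loop that swaps two state bytes and XORs one key stream byte into each data byte.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    # Key-scheduling algorithm: permute the state 0..255 based on the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation: one key stream byte per data byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


if __name__ == "__main__":
    # Well-known test vector: key "Key", plaintext "Plaintext".
    print(rc4(b"Key", b"Plaintext").hex())
    # Round trip with the key derived in this write-up.
    key = b"5P3R_S3CR3T_K3Y23/10/26"
    print(rc4(key, rc4(key, b"round trip")))
```

The same function can decrypt the bytes dumped from the exfiltration stream, given the derived key as a byte string.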

And finally, we are ready for the last step - dump the transmitted data from Wireshark (display filter tcp.port == 4919, the port we found in the analyzed ELF sample) and decrypt it in CyberChef using RC4 with the key “5P3R_S3CR3T_K3Y23/10/26”.

Fig. 9: Encrypted data transmitted by analyzed malware
Fig. 10: Decryption of the RC4-encrypted data in CyberChef

Success! Here is the decrypted content of the exfiltrated db file with the flag: EKO{W0ULD_Y0U_PL34S3_73LL_M3_WH1CH_D4T3_AR3_W3?}.

Used Tools