Wrestling with Python
Introduction
Python is a high-level, dynamically typed, portable, interpreted language that is often used for scripting. Python 2 was discontinued with version 2.7.18. Python 3 is the current line; at the time of writing, 3.10.5 is the latest release.
When Python source code is executed, it is first compiled to byte code, which is often stored in files with the .pyc extension. If Python is not able to write to the machine, the byte code is generated in memory and discarded when the program exits.
Once the byte code is created, it is executed by the Python Virtual Machine (PVM), which is essentially a big loop that iterates through the byte code instructions and executes them one by one. The PVM is the runtime engine of Python and is part of every Python installation.
It is important to note that byte code is a Python-specific representation and is platform-independent. However, byte code instructions can change from one Python version to another. You can read about byte code instructions here: https://docs.python.org/3/library/dis.html#python-bytecode-instructions
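To make the compile-then-execute flow concrete, here is a minimal sketch using the standard `dis` module. The function `add` is a made-up example, and the exact instructions printed will differ between Python versions.

```python
import dis


def add(a, b):
    """Tiny example function whose byte code we want to inspect."""
    return a + b


# The compiled byte code lives on the function's code object.
code = add.__code__
print(code.co_code)      # raw byte code bytes; contents vary by Python version
print(code.co_varnames)  # names of the local variables

# dis.dis prints one human-readable line per byte code instruction.
dis.dis(add)

# The same instructions can also be collected programmatically.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
```

Running the same script under two different Python versions is a quick way to see how the instruction set changes.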
Implementations of Python
A few popular implementations of Python:
- CPython
- Jython
- IronPython
- PyPy
Creating executables
- PyInstaller (the most popular one)
- py2exe (good for Python 2)
- py2app (if the target is OS X)
- bbfreeze
- cx_Freeze
Reversing an executable created using PyInstaller
Identifying an executable created using PyInstaller
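One quick check can be sketched as follows: PyInstaller appends an archive to the executable, and that archive carries the marker bytes `MEI\014\013\012\013\016`. The function names below are made up for illustration, and a missing marker does not prove the file was not built with PyInstaller (a modified bootloader may change it).

```python
# Marker that PyInstaller writes into the archive appended to the executable.
PYINSTALLER_MAGIC = b"MEI\014\013\012\013\016"


def contains_pyinstaller_cookie(data):
    """Return True if the raw bytes contain the PyInstaller archive marker."""
    return PYINSTALLER_MAGIC in data


def looks_like_pyinstaller(path):
    """Read a file from disk and check it for the PyInstaller marker."""
    with open(path, "rb") as fh:
        return contains_pyinstaller_cookie(fh.read())
```

Calling `looks_like_pyinstaller` on a suspect sample gives a first hint before any extraction tool is run.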
Extracting files from executables
A few words about .pyc files
Reading and disassembling .pyc files
- https://nowave.it/python-bytecode-analysis-1.html
- https://opensource.com/article/18/4/introduction-python-bytecode
- https://betterprogramming.pub/analysis-of-compiled-python-files-629d8adbe787
- https://www.synopsys.com/blogs/software-security/understanding-python-bytecode/
- https://florian-dahlitz.de/articles/disassemble-your-python-code
- https://www.tutorialspoint.com/python-code-objects
- https://www.codeguage.com/courses/python/functions-code-objects
- http://pymotw.com/2/dis/
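As a minimal sketch of what these articles describe: a .pyc file is a short header followed by a marshalled code object. The code below assumes the Python 3.7+ layout (16-byte header) and that the file was produced by the same Python version that runs the script, since `marshal` is not compatible across versions. The file name `hello.py` and the function `load_pyc` are made up for the demo.

```python
import dis
import importlib.util
import marshal
import os
import py_compile
import tempfile


def load_pyc(path):
    """Return (magic, code_object) from a .pyc file (Python 3.7+ layout)."""
    with open(path, "rb") as fh:
        magic = fh.read(4)  # version-specific magic number
        fh.read(12)         # flags, then mtime and size (or a source hash)
        code = marshal.loads(fh.read())
    return magic, code


# Demo: compile a throwaway source file, then read the .pyc back.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.py")
    with open(src, "w") as fh:
        fh.write("def greet(name):\n    return 'hello ' + name\n")
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, "hello.pyc"))
    magic, code = load_pyc(pyc)

print(magic == importlib.util.MAGIC_NUMBER)  # magic matches the running interpreter
dis.dis(code)                                # disassemble the module-level code
```

For a .pyc extracted from a sample, the magic number is the first thing to check, because it tells you which Python version is needed to unmarshal and disassemble the rest.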
Decompiling .pyc files
- uncompyle6
- decompile3 (updated and refactored version of uncompyle6)
- Decompyle++ (pycdc)
- Easy Python Decompiler (for Python 1.0 - 3.4)
- These tools do not work 100% of the time.
- Currently, there is no support for Python versions 3.9 and 3.10. Decompyle++ has partial support.
- The output from uncompyle6 and decompile3 will be similar, if not identical, in most cases. However, feel free to try both if one of them fails.
Output from Decompyle++
Output from uncompyle6/decompile3
Extracting compressed and encrypted bytecode
To decrypt the file, a key is needed; it is stored in the file pyimod00_crypto_key.pyc. Decompile that file using uncompyle6 to find the decryption key.
Next, the decryption and decompression routine is required; it is present in the file pyimod02_archive.pyc. Decompile that file using uncompyle6 to get the routine shown in the following code snippet:
The class ZlibArchiveReader has an extract method which, when reimplemented in a Python script, can be used with the key found earlier to obtain the decrypted and decompressed passwordstealer file.
Finally, decompiling passwordstealer.pyc with uncompyle6 gives the following:
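A rough sketch of such a script is shown below. It rests on assumptions about the decompiled routine that should be checked against the actual pyimod02_archive.pyc of the sample: that the key string is truncated or zero-padded to 16 characters, that the first 16 bytes of each entry are the IV, and that the decrypted data is zlib-compressed. The AES step itself is passed in as a callable (`aes_decrypt`, a hypothetical name), because the AES library and mode depend on the PyInstaller version that built the sample.

```python
import zlib

CRYPT_BLOCK_SIZE = 16  # assumed AES block size, also used as the IV length


def normalize_key(key):
    """Truncate the key string to 16 characters or left-pad it with zeros (assumed scheme)."""
    return key[:CRYPT_BLOCK_SIZE].zfill(CRYPT_BLOCK_SIZE)


def extract_blob(blob, key, aes_decrypt):
    """Decrypt and then decompress one encrypted archive entry.

    aes_decrypt(key_bytes, iv, ciphertext) -> bytes must be supplied by the
    caller, using whichever AES library and mode the decompiled routine uses.
    """
    iv = blob[:CRYPT_BLOCK_SIZE]
    ciphertext = blob[CRYPT_BLOCK_SIZE:]
    compressed = aes_decrypt(normalize_key(key).encode(), iv, ciphertext)
    return zlib.decompress(compressed)
```

The bytes returned for a module entry are expected to be a marshalled code object, so a .pyc header matching the sample's Python version has to be prepended before a decompiler will accept the result.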
From reading the decompiled code, it becomes clear that the sample is a stealer malware.
Final points:
- This blog does not cover Nuitka, a source-to-source compiler that compiles Python source code to C. It also has a commercial offering that claims to be effective against reverse engineering. Perhaps executables created using this project will be examined in a future blog.
- Similarly, Pyarmor is a project used for creating obfuscated Python scripts, which will be examined in a future blog.
- Currently, decompilation tools for Python bytecode are not fully mature. Deducing the nature and functionality of the sample from disassembled bytecode is the best option.
Have a good day!