What is Python Pickle Module?


Vinay Khatri
Last updated on September 21, 2024

    We all love pickles, and every pickle is made by preserving its main ingredient for a long period of time. Preservation is the idea behind every type of pickle. Similarly, Python uses pickling to preserve, or store, Python objects. The process is also referred to as serialization, marshaling, or flattening.

    What is Serialization in Python?

    When we want to save data to disk or transmit it from one process or machine to another, we need a format that can be written to a file or sent over a network. A Python object lives in memory and cannot be stored or transmitted as it is, so we first convert it into a stream of bytes. This conversion is called serialization.
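    As a minimal sketch of this idea, the snippet below turns a dictionary into bytes with pickle.dumps(). The record variable and its contents are made up for illustration only.

```python
import pickle

# Hypothetical sample data, used only for illustration.
record = {"name": "Alice", "scores": [90, 85], "active": True}

# pickle.dumps() returns the serialized object as a bytes value
# instead of writing it to a file.
payload = pickle.dumps(record)

print(type(payload))  # <class 'bytes'>
```

    The resulting bytes value can be written to a file or sent over a socket like any other binary data.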

    Python provides several modules to serialize data, such as json, marshal, and pickle. In this tutorial, we will learn about the Python pickle module and see how to serialize and deserialize Python objects with it.

    What is de-serialization in Python?

    De-serialization is the opposite of serialization. Data that was converted into a byte stream (using serialization) needs to be turned back into a Python object before a program can work with it again. This process of converting the serialized byte stream back into a Python object is known as de-serialization.
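    The round trip can be sketched with pickle.dumps() and pickle.loads(). The sample list is hypothetical; the point is that the restored object is equal to the original but is a separate object in memory.

```python
import pickle

original = ["Mango", 3, 2.5, None]   # hypothetical sample data

payload = pickle.dumps(original)     # serialize: object -> bytes
restored = pickle.loads(payload)     # de-serialize: bytes -> object

print(restored == original)  # True: same value
print(restored is original)  # False: a new list was built
```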

    What is Pickle in Python?

    Pickle is one of Python's standard library modules and is used to serialize and de-serialize Python objects. Pickle can serialize most Python objects into a byte stream, including lists, strings, tuples, dictionaries, and instances of your own classes, and turn them back into Python objects using de-serialization. Functions and classes themselves are pickled by reference (their module and name), not by copying their code.

    Many other programming languages support serialization as well. What makes pickling different is that it is specific to Python: it handles a much wider range of Python objects than general-purpose formats such as JSON, but the resulting byte stream is meant to be read only by Python. That is why "pickling" usually refers specifically to serialization with the pickle module, rather than to serialization in general.

    JSON vs Pickle

    JSON (JavaScript Object Notation) is one of the most widely used serialization formats for sending data over networks. Almost every programming language has a library to serialize its data objects to JSON. Python also has a standard json module that can serialize Python objects to JSON and deserialize JSON back into Python objects. The main differences between json and pickle are:

      1. json is limited to a few Python types. By default it can only serialize dictionaries, lists, tuples (as arrays), strings, numbers, booleans, and None. pickle, on the other hand, can serialize most Python objects.
      2. JSON is a text format that does not depend on the Python version, or on Python at all. Pickle data is tied to a pickle protocol version, so data pickled with a newer protocol cannot be read by older Python versions that do not support it, and unpickling an instance requires its class to be importable.
      3. json is the more suitable format for exchanging data with other systems over a network, while pickle is more suitable for storing Python objects on disk or passing them between Python programs you trust.
      4. With large or complex Python data, pickle's binary format can be more efficient than json, although the difference depends on the data.
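    The first difference can be seen in a short sketch. The sample dictionary is hypothetical; it contains a tuple and a set, two types that JSON has no direct equivalent for.

```python
import json
import pickle

data = {"point": (1, 2), "tags": {"a", "b"}}  # hypothetical sample data

# JSON has no set type, so the default encoder raises TypeError.
try:
    json.dumps(data)
    json_error = None
except TypeError as exc:
    json_error = exc

# json accepts tuples, but they come back as lists.
json_roundtrip = json.loads(json.dumps({"point": (1, 2)}))

# pickle restores both the tuple and the set with their original types.
pickle_roundtrip = pickle.loads(pickle.dumps(data))

print(type(json_error).__name__)  # TypeError
print(json_roundtrip)             # {'point': [1, 2]}
print(pickle_roundtrip == data)   # True
```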

    Python Objects that can be Pickled and Unpickled

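    Here is a list of common Python objects that can be serialized and deserialized using the Python pickle module.

    1. Built-in data types (Integer, Float, String, Boolean, Bytes, None)
    2. Python data containers (List, Tuple, Dictionary, Set), as long as everything they contain can be pickled
    3. Python functions and classes defined at the top level of a module (lambda functions cannot be pickled), and instances of such classes.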
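    The list above can be checked with a quick sketch. To keep it self-contained, it uses math.sqrt and collections.Counter from the standard library as stand-ins for "a function" and "a class"; the sample values are hypothetical.

```python
import math
import pickle
from collections import Counter

samples = [
    42, 3.14, "text", True, b"raw", None,   # built-in data types
    [1, 2], (1, 2), {"k": "v"}, {1, 2},     # containers
    Counter("banana"),                      # instance of an importable class
]

# Every sample survives a dumps()/loads() round trip unchanged.
all_equal = all(pickle.loads(pickle.dumps(obj)) == obj for obj in samples)

# Functions and classes are pickled by reference (module + name),
# so unpickling hands back the very same object.
same_function = pickle.loads(pickle.dumps(math.sqrt)) is math.sqrt
same_class = pickle.loads(pickle.dumps(Counter)) is Counter

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(all_equal, same_function, same_class, lambda_pickled)
```

    Objects that wrap operating system resources, such as open files or sockets, also cannot be pickled.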

    Pickle Python Objects

    Pickle Python List

    Let's begin by pickling (serializing) a Python list into a file named pickle_file.pkl. Create a Python script named pickle_list.py and code along.

    #pickle_list.py

    import pickle
    
    # list object to serialize
    fruits_list = ['Apples', 'Mangoes', 'Grapes', 'Peaches', 'Oranges']
    
    filename = 'pickle_file.pkl'
    # create pickle_file.pkl and write the
    # serialized list to it as binary data
    with open(filename, 'wb') as pickle_file:
        pickle.dump(fruits_list, pickle_file)
    
    print(f"A file by name {filename} has been created with serialized data")
    

    Break the code

    In the first line, we import the pickle module with the import pickle statement. Pickle is part of the Python standard library, so we do not need to install it separately and can use it directly in our Python script. Then we define a list object named fruits_list, which is a list of fruits.

    The filename identifier holds the file name of the pickled list. Using a Python context manager (the with statement), we open the file filename in write binary mode 'wb' as the file object pickle_file.

    It is important to open the file in write binary mode 'wb' when storing pickled (serialized) data, because pickle produces bytes, not text. The pickle.dump(data_object, file_object) function takes two required arguments, data_object and file_object. It serializes data_object and writes the result to the file that file_object refers to.

    Now execute the program

    python pickle_list.py
    
    A file by name pickle_file.pkl has been created with serialized data

    After executing the program, you will find a new file named pickle_file.pkl in the directory from which you ran the script.

    The pickle_file.pkl file contains the serialized Python list in pickle's binary format. If you open it in a text editor, most of the content will not be human-readable. To get the fruits_list data back, we need to open the file in binary mode and deserialize it.
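    To get a feel for what the binary data looks like, here is a small sketch that serializes the same list in memory with pickle.dumps() and inspects the raw bytes. It relies on the fact that pickles written with protocol 2 or later start with the byte 0x80 followed by the protocol number.

```python
import pickle

fruits_list = ['Apples', 'Mangoes', 'Grapes', 'Peaches', 'Oranges']

# Same serialization as pickle.dump(), but kept in memory as bytes.
raw = pickle.dumps(fruits_list)

# First byte: protocol marker (0x80). Second byte: protocol number.
print(raw[:1], raw[1])

# The text of each string is still inside the stream,
# surrounded by binary opcodes that only pickle understands.
print(b'Apples' in raw)  # True
```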

    Unpickle Python List

    Now let's create a new Python script named unpickle_list.py to deserialize and read the data that we stored in pickle_file.pkl in the above example.

    #unpickle_list.py

    import pickle
    
    filename = 'pickle_file.pkl'
    
    # load pickle_file.pkl
    # and de-serialize the Python object stored in it
    
    with open(filename, 'rb') as pickle_file:
        fruits_list = pickle.load(pickle_file)
    
    print(fruits_list)
    

    Output

    ['Apples', 'Mangoes', 'Grapes', 'Peaches', 'Oranges']

    Break the code

    To deserialize the pickle_file.pkl file, we first open it in read binary mode 'rb'. Then the pickle.load() function reads the serialized data from the pickle_file file object and rebuilds the original list.
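    One detail worth knowing is that each pickle.load() call reads exactly one pickled object from the file. The sketch below, which uses a temporary directory and hypothetical data so that it does not touch pickle_file.pkl, writes two objects to one file and reads them back in the same order.

```python
import os
import pickle
import tempfile

fruits = ['Apples', 'Mangoes']
prices = {'Apples': 1.2, 'Mangoes': 2.5}   # hypothetical sample data

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'two_objects.pkl')

    # Two dump() calls append two separate pickles to the same file.
    with open(path, 'wb') as pickle_file:
        pickle.dump(fruits, pickle_file)
        pickle.dump(prices, pickle_file)

    # Each load() call returns the next object, in the order written.
    with open(path, 'rb') as pickle_file:
        first = pickle.load(pickle_file)
        second = pickle.load(pickle_file)
        try:
            pickle.load(pickle_file)   # nothing left to read
            reached_end = False
        except EOFError:
            reached_end = True

print(first, second, reached_end)
```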

    Wrapping Up

    Now let's wrap up our article on the Python pickle module. In this article, we learned what serialization and deserialization are, what the pickle module in Python is, how to store serialized Python objects, and how to deserialize Python objects using the pickle module.

    To serialize and store a Python object, we first need to open a file in binary write mode "wb" and then serialize the object using the pickle dump() function. The dump() function serializes the Python object and writes the serialized data into the .pkl file in a binary format that can be read back or transmitted later. To deserialize the data from the .pkl file, we first need to open the file in binary read mode "rb" and then deserialize the data using the pickle load() function.

    Pickled data is easy to store or to send over a network, but do not confuse pickling with compression. Compression encodes data to reduce its size, whereas serialization only translates objects into a byte stream for storage or transmission.
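    The difference can be sketched by treating compression as a separate step applied to the pickled bytes. The example below uses the standard gzip module and a deliberately repetitive, made-up list so that the effect is easy to see.

```python
import gzip
import pickle

# Hypothetical, highly repetitive sample data.
fruits = ['Apples', 'Mangoes', 'Grapes', 'Peaches', 'Oranges'] * 1000

pickled = pickle.dumps(fruits)          # serialization only
compressed = gzip.compress(pickled)     # compression as a second step

# Reverse the two steps in the opposite order to get the list back.
restored = pickle.loads(gzip.decompress(compressed))

print(len(pickled), len(compressed))
print(restored == fruits)  # True
```

    How much smaller the compressed data is depends entirely on the data being pickled.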


    FAQs


    The Python pickle module is one of the ways to serialize and deserialize Python objects. Unlike the json module, it serializes objects in a binary format that is not human-readable.

    The primary benefit of using the Python pickle module is that it enables you to serialize most Python objects without the need to write extra lines of code.

    JSON is a lightweight text format and can be faster than pickle for small, simple data, although the relative speed depends on the data and on the pickle protocol used. In addition, pickle comes with a security risk: unpickling data from an untrusted source can execute arbitrary code. Parsing JSON does not execute code, which makes it the safer choice for untrusted input.
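    One way to reduce this risk, adapted from the approach described in the Python documentation, is to subclass pickle.Unpickler and override find_class() so that only an allow-list of names can be loaded. The names SAFE_BUILTINS, RestrictedUnpickler, and restricted_loads are illustrative choices, and this is a sketch rather than a complete defense; the safest rule is still to unpickle only data you trust.

```python
import builtins
import io
import pickle

# Allow-list of builtins that unpickled data may reference (illustrative).
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only permit the explicitly allowed builtins.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse everything else, e.g. os.system.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads(), but using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Harmless data still loads normally.
safe_value = restricted_loads(pickle.dumps([1, 2, range(15)]))

# A hand-written pickle that would run a shell command with plain pickle.loads().
malicious = b"cos\nsystem\n(S'echo hello world'\ntR."
try:
    restricted_loads(malicious)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(safe_value, blocked)
```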

    To read a Python pickle file, you can use pickle.load(). You can also use the pandas library, which provides the read_pickle() function.

    You can pickle most objects in Python so that you can save them to disk. Pickle serializes the object into bytes, which are then written to a file.
