What is Python Pickle Module?

Vinay Khatri
Last updated on August 31, 2022

    We all love pickles, which are made by preserving the main ingredient for a long period of time. The idea of preservation is behind every type of pickle. Similarly, Python uses pickling to preserve, or store, Python objects. The process is also referred to as serialization, marshaling, or flattening.

    What is Serialization in Python?

    When we want to transmit data from one program to another, or from one machine to another over a network, we need a format that can be easily stored and transmitted. A Python object that lives in memory cannot be transmitted directly, so we convert it into a stream of bytes using a process called serialization. Python provides several standard modules to serialize data, such as json, marshal, and pickle. In this tutorial, we will learn about the Python pickle module and see how to serialize and deserialize Python objects with it.

    What is de-serialization in Python?

    De-serialization is the opposite of serialization. Data that was converted into a transmittable byte stream (using serialization) needs to be turned back into a Python object before a program can work with it. This process of converting the serialized byte stream back into a Python object is known as de-serialization.
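
    The two steps can be sketched in memory, without any file, using pickle.dumps() and pickle.loads(). This is only a minimal illustration; the record dictionary is hypothetical sample data.

```python
import pickle

# Hypothetical sample data, used only for illustration.
record = {"name": "Apples", "quantity": 12, "tags": ("fresh", "red")}

# Serialization: Python object -> bytes.
payload = pickle.dumps(record)

# De-serialization: bytes -> an equal, but separate, Python object.
restored = pickle.loads(payload)

print(type(payload).__name__)   # bytes
print(restored == record)       # True
print(restored is record)       # False
```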

    What is Pickle in Python?

    Pickle is one of Python's standard library modules, used to serialize and de-serialize Python objects. Pickle can serialize most Python objects into a byte stream, including lists, strings, tuples, dictionaries, and instances of user-defined classes, and can convert them back into Python objects using de-serialization. Functions and classes are pickled by reference, that is, by their qualified name rather than by their code, so they must be importable when the data is unpickled. Serialization is supported by many other programming languages, but what makes pickling different is that it is specific to Python and covers far more Python object types than general-purpose formats, which can only serialize a limited set of data types. That is why Python developers use the term pickling for this Python-specific form of serialization rather than treating the two terms as fully interchangeable.

    JSON vs Pickle

    JSON (JavaScript Object Notation) is one of the most widely used serialization formats for sending data over networks. Almost every programming language has a library to serialize its data objects to JSON. Python also has a standard json module that can serialize Python objects to JSON and deserialize JSON back into Python objects. The differences between JSON and Pickle are:

      1. json is limited to a few Python data types: it can only serialize JSON-like data such as strings, numbers, booleans, None, lists, and dictionaries. pickle, in contrast, can serialize most Python objects.
      2. JSON output can be read from any Python version and from other languages. pickle is specific to Python, and a file written with a newer pickle protocol may fail to load in an older Python version.
      3. json provides a text format that is better suited to sending data over a network, especially between different systems. pickle is better suited to storing and sharing data between Python programs that trust each other, because unpickling data from an untrusted source can execute arbitrary code.
      4. With large or complex Python data sets, pickle is often more efficient than json .
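
    The first difference in the list above can be seen with a small, hedged comparison. The data dictionary is hypothetical; it was chosen because it contains a tuple and a set, two types that JSON does not have.

```python
import json
import pickle

# Hypothetical sample data containing a tuple and a set.
data = {"point": (1, 2), "tags": {"a", "b"}}

# pickle round-trips the tuple and the set unchanged.
assert pickle.loads(pickle.dumps(data)) == data

# json has no set type, so encoding the same dict raises TypeError.
try:
    json.dumps(data)
    json_error = None
except TypeError as exc:
    json_error = exc

# Without the set, json works, but the tuple comes back as a list.
via_json = json.loads(json.dumps({"point": (1, 2)}))
print(via_json)  # {'point': [1, 2]}
```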

    Python Objects that can be Pickled and Unpickled

    Here is a list of Python data objects that can be serialized and deserialized using the Python Pickle library.

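    1. Python's basic data types (integer, float, string, boolean, bytes, None)
    2. Python data containers (list, tuple, dictionary, set) whose contents are themselves picklable
    3. Python functions and classes defined at the top level of a module (lambdas cannot be pickled)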
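
    A quick way to check the list above is to round-trip a few objects through pickle.dumps() and pickle.loads(). The sketch below uses only standard library objects; math.sqrt stands in for "a function that lives in a module", and the lambda shows one case that pickle rejects.

```python
import math
import pickle
from datetime import date

# Built-in types and containers round-trip by value.
samples = [42, 3.14, "text", True, b"raw", None,
           [1, 2], (1, 2), {"k": "v"}, {1, 2}, date(2022, 8, 31)]
for obj in samples:
    assert pickle.loads(pickle.dumps(obj)) == obj

# A module-level function is pickled by reference (its qualified name),
# so unpickling returns the very same function object.
same_function = pickle.loads(pickle.dumps(math.sqrt)) is math.sqrt

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(same_function, lambda_pickled)  # True False
```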

    Pickle Python Objects

    Pickle Python List

    Let's begin by pickling (serializing) a Python list into a file named pickle_file.pkl . Create a Python script named pickle_list.py and code along.

    #pickle_list.py

    import pickle
    
    #list object
    fruits_list = ['Apples', 'Mangoes', 'Grapes', 'Peaches', 'Oranges']
    
    filename = 'pickle_file.pkl'
    # create pickle_file.pkl
    # and serialize the Python object into it as binary data
    with open(filename, 'wb') as pickle_file:
        pickle.dump(fruits_list, pickle_file)
    
    print(f"A file by name {filename} has been created with serialized data")
    

    Break the code

    In the first line, we import the pickle module using the import pickle statement. Pickle is part of the Python standard library, so we do not need to install it separately and can use it directly in our script. Then we define a list object named fruits_list , which is a list of fruits. The filename identifier holds the file name for the pickled list. Using a Python context manager, we open the file filename in write binary mode 'wb' as the file object pickle_file . It is important to open the file in write binary mode 'wb' when storing pickled (serialized) data, because pickle writes bytes, not text. The pickle.dump(data_object, file_object) function accepts two required arguments, data_object and file_object . It serializes data_object and writes the result to the file behind file_object . Now execute the program:

    python pickle_list.py
    
    A file by name pickle_file.pkl has been created with serialized data

    After executing the program, a file named pickle_file.pkl will be created in the directory from which you ran pickle_list.py .

    The pickle_file.pkl file contains the serialized Python list object in binary format. We can open this newly created .pkl file directly, but its raw content is a sequence of bytes, not a readable list. To get fruits_list back, we need to read the file in binary mode and deserialize the data.
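
    To see what "binary format" means here, the following sketch writes the same list to a file and then reads the raw bytes back without unpickling them. It is an illustration only: it writes into a temporary directory (an assumption made so the example leaves no files behind) instead of the script's own directory.

```python
import os
import pickle
import tempfile

fruits_list = ['Apples', 'Mangoes', 'Grapes', 'Peaches', 'Oranges']

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'pickle_file.pkl')
    with open(path, 'wb') as pickle_file:
        pickle.dump(fruits_list, pickle_file)

    # Reading in binary mode returns the raw pickle bytes, not a list.
    with open(path, 'rb') as pickle_file:
        raw_bytes = pickle_file.read()

print(type(raw_bytes).__name__)   # bytes
print(b'Apples' in raw_bytes)     # True
print(raw_bytes[-1:] == b'.')     # True: a pickle ends with the STOP opcode
```

    The strings are visible inside the byte stream, but they are surrounded by pickle opcodes, which is why the data has to go through pickle.load() or pickle.loads() to become a list again.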

    Unpickle Python List

    Now let's create a new Python script named unpickle_list.py to deserialize and read the data that we stored in pickle_file.pkl in the example above.

    #unpickle_list.py

    import pickle
    
    filename = 'pickle_file.pkl'
    
    # load pickle_file.pkl
    # and de-serialize the Python object
    
    with open(filename, 'rb') as pickle_file:
        fruits_list = pickle.load(pickle_file)
    
    print(fruits_list)
    

    Output

    ['Apples', 'Mangoes', 'Grapes', 'Peaches', 'Oranges']

    Break the code

    To deserialize the 'pickle_file.pkl' file, we first open it in read binary mode using 'rb' . Then the pickle.load() function reads the serialized data from the pickle_file file object and returns the reconstructed Python list.
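
    Each call to pickle.load() reads exactly one pickled object from the file object. As a hedged sketch of what that implies, the example below dumps two objects into the same stream and loads them back in order. An io.BytesIO buffer stands in for a file opened in 'wb' / 'rb' mode; that substitution is an assumption made to keep the example free of files on disk.

```python
import io
import pickle

# An in-memory binary buffer stands in for a file opened with 'wb' / 'rb'.
buffer = io.BytesIO()
pickle.dump(['Apples', 'Mangoes'], buffer)
pickle.dump({'count': 2}, buffer)

# Rewind, then load objects back in the order they were written.
buffer.seek(0)
loaded = []
while True:
    try:
        loaded.append(pickle.load(buffer))
    except EOFError:
        break  # no more pickled objects in the stream

print(loaded)  # [['Apples', 'Mangoes'], {'count': 2}]
```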

    Wrapping Up

    Now let's wrap up our article on the Python Pickle module. In this article, we learned what serialization and deserialization are, what the Pickle module in Python is, how to store serialized Python objects, and how to deserialize them using the Pickle module. To serialize and store a Python object, we first open a file in write binary mode "wb" and then serialize the object using the Pickle dump() function. The dump() function serializes the Python object and writes the serialized data into a .pkl file in a binary format that can be read and transmitted later. To deserialize and read data from a .pkl file, we first open the file in read binary mode "rb" and then deserialize the data using the Pickle load() function. Although a pickled file makes it easy to store data or send it over a network, do not confuse pickling with compression. Compression encodes data to reduce its size, whereas serialization only translates data into a form that can be stored or transmitted.
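
    The difference between pickling and compression can be illustrated by doing them as two separate steps. The sketch below uses the standard gzip module on top of pickle; the repetitive data list is a hypothetical example chosen so that compression has a visible effect.

```python
import gzip
import pickle

# Hypothetical, highly repetitive data so that compression has an effect.
data = ['Apples'] * 10_000

pickled = pickle.dumps(data)          # serialization only
compressed = gzip.compress(pickled)   # compression is a separate step

# Decompress first, then unpickle, to get the original list back.
restored = pickle.loads(gzip.decompress(compressed))

print(len(compressed) < len(pickled))  # True
print(restored == data)                # True
```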
