Parallel Processing with tqdm
tqdm is a popular library for displaying progress bars, and many open-source Python ML libraries depend on it. As a result, it is often already installed as a dependency when you work on machine learning projects.
pip show tqdm
Required-by: datasets, dvc, evaluate, huggingface-hub, openai, sentence-transformers, spacy, transformers
For example, consider a task where we loop over a list of websites and need to fetch the status code for each.
Naive loop
import requests

def ping(url):
    # Return the HTTP status code of a HEAD request to the URL
    return requests.head(url).status_code

urls = ['https://amitness.com'] * 10
statuses = [ping(url) for url in urls]
To get a progress bar, simply wrap the urls list with the tqdm class. If tqdm is not already installed, install it first:
pip install tqdm
import requests
from tqdm.auto import tqdm

def ping(url):
    return requests.head(url).status_code

urls = ['https://amitness.com'] * 10
statuses = [ping(url) for url in tqdm(urls)]
1. Import the tqdm object. Importing from tqdm.auto is preferred, as it automatically selects the best progress bar (Jupyter-compatible or console-based).
2. Simply wrap the list of items with tqdm(...) and you get a progress bar.
While this use of tqdm as a progress bar library is well known, tqdm also has two lesser-known features for showing progress bars during concurrent or parallel processing.
Running Concurrent Threads
You can execute a function on the list concurrently with multiple threads using the thread_map function. It takes the function to run as the first argument and a list of items as the second argument, and returns the results as a list in the same order as the inputs.
import requests
from tqdm.contrib.concurrent import thread_map

def ping(url):
    return requests.head(url).status_code

urls = ['https://amitness.com'] * 500
statuses = thread_map(ping, urls, max_workers=4)
1. The number of worker threads can be specified using the max_workers parameter.
This is useful for speeding up IO-bound tasks such as scraping a website, calling a remote third-party API, or querying a remote database.
Internally, thread_map uses ThreadPoolExecutor from the concurrent.futures module of the standard library [1].
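To make the relationship with ThreadPoolExecutor concrete, here is a minimal offline sketch. The fake_fetch function is a hypothetical stand-in for a network call (it only sleeps), and the manual version is an approximation of what thread_map does, not tqdm's actual source.

```python
import time
from concurrent.futures import ThreadPoolExecutor

from tqdm.auto import tqdm
from tqdm.contrib.concurrent import thread_map


def fake_fetch(item):
    # Hypothetical stand-in for an IO-bound call such as requests.head(url)
    time.sleep(0.01)
    return item * 2


items = list(range(50))

# One call: a thread pool plus a progress bar
results = thread_map(fake_fetch, items, max_workers=4)

# Roughly equivalent manual version (an approximation, not tqdm's exact code)
with ThreadPoolExecutor(max_workers=4) as executor:
    manual = list(tqdm(executor.map(fake_fetch, items), total=len(items)))
```

Both versions return the results in the same order as the input items, even though the calls themselves overlap in time.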
Running Parallel Processes
For compute-bound tasks, tqdm provides a process_map function with a similar API to process the list in parallel using multiple child processes.
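import requests
from tqdm.contrib.concurrent import process_map

def ping(url):
    return requests.head(url).status_code

# The guard keeps worker processes from re-running this block on platforms that spawn them
if __name__ == '__main__':
    urls = ['https://amitness.com'] * 500
    statuses = process_map(ping, urls, max_workers=4)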
1. The number of processes to use can be specified using the max_workers parameter.
This is particularly useful when the task involves heavy computation such as generating sentence embeddings for a large dataset or running batch model inference on CPU.
Internally, process_map uses ProcessPoolExecutor from the concurrent.futures module of the standard library [2].
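Since a network call is not really compute-bound, a CPU-heavy function may illustrate process_map better. The sketch below assumes a hypothetical count_primes helper (deliberately slow trial division) and small illustrative input sizes. The function lives at module level so it can be pickled and sent to the worker processes.

```python
from tqdm.contrib.concurrent import process_map


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this module
    limits = [5_000] * 16  # illustrative workload, not from the original post
    counts = process_map(count_primes, limits, max_workers=4, chunksize=1)
    print(counts[0])
```

For a very long list of small tasks, a larger chunksize can reduce the overhead of passing work between processes. The best value depends on the workload.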
Conclusion
Thus, thread_map and process_map are useful tools to add to your toolbox when dealing with parallel processing. As tqdm is often already installed via other libraries you use in ML, it's a quick and easy way to add parallel processing to your program logic.
Footnotes
[1] Source code for thread_map
[2] Source code for process_map