Quick and easy Python concurrency
This post serves as a simple reminder to myself that multiprocessing.Pool
has map
and starmap
functions that are a quick and easy way to take a loop and parallelise it in Python.
Imagine some code like this:
urls = ["https://example.org",]
def take_screenshot(url):
# ...
for url in urls:
take_screenshot(url)
To make this run across four separate processes, you can do this:
pool = multiprocessing.Pool(4)
pool.map(take_screenshot, urls)
Where does starmap
come into it? Well, it lets you invoke the function with multiple arguments by expanding the list you pass in.
urls = [
("https://example.org", "example.png")
]
def take_screenshot(url, filename):
# ...
pool = multiprocessing.Pool(4)
pool.starmap(take_screenshot, urls)
This is super handy for quick scripts, but as always there's trade-offs here. Check out this more detailed post to see other ways of managing concurrency in Python.