performance - Python: parallelizing any/all statements
I am running a Python program and I've noticed a bottleneck in a line doing the following:

all(foo(s) for s in l)

What I am wondering is: what is the best way to make this computation parallel? foo(s) is a thread-safe method that inspects s and returns True/False with respect to some criteria. No data structure is changed by foo.
So my question is:

How do I test in parallel whether all elements of a list l have the property foo, exiting as soon as one element of l does not satisfy foo?
Edit: adding more context. I am not sure what kind of context you are looking for, but in this scenario s is a graph and foo(s) computes a graph-theoretical invariant (for example the average distance, or something similar).
This sort of depends on what foo(s) is doing. If it is I/O bound, waiting on blocking calls, then using threads will help. The easiest way is to create a pool of threads and use pool.map:

from multiprocessing.pool import ThreadPool
pool = ThreadPool(10)
all(pool.map(foo, l))
If, however, the function is CPU bound, using a lot of processor power, this will not help you. Instead you need to use a multiprocessing pool:

from multiprocessing import Pool
pool = Pool(4)
all(pool.map(foo, l))
This will use separate processes instead of threads, allowing multiple CPU cores to be used. If the function foo is quick, though, the overhead may eliminate the advantage of parallel processing, so you will need to test it to make sure you get the results you expect.

See: https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers
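One caveat: pool.map computes foo for every element before all() sees a single result, so neither version above exits early at the first failing element, which is part of what the question asks for. A minimal sketch of an early-exit variant using imap_unordered, which yields results as they complete (the foo shown here is a hypothetical stand-in predicate):

```python
from multiprocessing.pool import ThreadPool

def foo(s):
    # hypothetical predicate standing in for the real check:
    # here, "is the number even?"
    return s % 2 == 0

def parallel_all(items, workers=4):
    """Return False as soon as any foo(item) is falsy."""
    with ThreadPool(workers) as pool:
        # imap_unordered yields results in completion order,
        # so we can stop consuming at the first False
        for result in pool.imap_unordered(foo, items):
            if not result:
                return False
    return True
```

Leaving the with-block terminates the pool, so outstanding work is abandoned once a failure is seen. Swap ThreadPool for multiprocessing.Pool for CPU-bound work (foo must then be a picklable top-level function).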
Edit: I've assumed you're using Python 2.7.x. If you're using Python 3 you have more advanced concurrency features in concurrent.futures, including ThreadPoolExecutor and ProcessPoolExecutor.
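For Python 3, a sketch of the same early-exit idea with concurrent.futures (again, this foo is a hypothetical placeholder predicate):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def foo(s):
    # hypothetical predicate: "is the value positive?"
    return s > 0

def parallel_all(items, workers=4):
    """Return False as soon as any submitted check fails."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(foo, item) for item in items]
        # as_completed yields futures in the order they finish
        for future in as_completed(futures):
            if not future.result():
                # cancel any checks that haven't started yet
                for f in futures:
                    f.cancel()
                return False
    return True
```

ProcessPoolExecutor has the same interface, so switching a CPU-bound workload over is a one-word change (subject to the usual picklability requirements).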
I recommend using parallel processing for CPU-bound problems, and the asyncio library for I/O-bound ones.
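If foo can be written as a coroutine (e.g. it makes network calls), the asyncio approach might look like this sketch; the async foo here is an assumed placeholder that simulates an I/O-bound check:

```python
import asyncio

async def foo(s):
    # hypothetical async predicate; sleep(0) stands in for real I/O
    await asyncio.sleep(0)
    return s != 0

async def async_all(items):
    # run every check concurrently, then combine the results
    results = await asyncio.gather(*(foo(s) for s in items))
    return all(results)

result = asyncio.run(async_all([1, 2, 3]))
```

Note that gather waits for all checks to finish; for true short-circuiting you would consume asyncio.as_completed instead and cancel the remaining tasks on the first failure.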