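Pool uses multiple processes, not multiple threads, so if the first argument to Pool (the number of worker processes) is larger than the number of CPU cores, efficiency may actually drop.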
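As a rough sketch of the sizing advice above, the requested pool size can be capped at the core count reported by os.cpu_count(). The helper pick_pool_size and the task square are hypothetical names used only for illustration; they are not part of multiprocessing.

```python
import os
from multiprocessing import Pool


def pick_pool_size(requested):
    """Cap the requested worker count at the number of CPU cores (hypothetical helper)."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(requested, cores))


def square(n):
    # Stand-in for a CPU-bound task
    return n * n


if __name__ == '__main__':
    size = pick_pool_size(64)
    with Pool(size) as pool:
        print(size, pool.map(square, range(5)))
```

Calling Pool() with no argument has a similar effect, since the default size is the number of cores.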
1. apply_async
Example:
import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = mp(5)
    for i in range(3):
        pool.apply_async(run, (i,))
    print('non-blocking~~~~')
    print('end')
Output:
start
non-blocking~~~~
end
num is 0
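Explanation:
Process switching is controlled by the operating system and is preemptive. apply_async only submits the task and returns at once, so the main process, which has just a few lines left to run, reaches the end almost immediately. Pool workers are daemon processes, so they are terminated when the main process exits. The child processes therefore get little or no chance to run: here only "num is 0" made it out, and the exact output depends on timing.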
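To make this concrete, here is a minimal sketch (not from the original example) that keeps the AsyncResult objects returned by apply_async and calls get() on each one. get() blocks until that task has finished, so the main process can no longer run off the end before the workers are done. The worker double is a hypothetical stand-in.

```python
import time
from multiprocessing import Pool


def double(num):
    # Hypothetical worker: sleep briefly, then return a value
    time.sleep(0.1)
    return num * 2


if __name__ == '__main__':
    pool = Pool(3)
    # apply_async returns an AsyncResult immediately; the task runs in a worker
    results = [pool.apply_async(double, (i,)) for i in range(3)]
    # get() blocks until the corresponding task is done
    print([r.get() for r in results])  # prints [0, 2, 4]
    pool.close()
    pool.join()
```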
Example:
import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = mp(5)
    for i in range(3):
        pool.apply_async(run, (i,))
    print('non-blocking~~~~')
    pool.close()
    pool.join()
    print('end')
Output:
start
non-blocking~~~~
num is 0
num is 1
1 is end
num is 2
2 is end
0 is end
end
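Explanation:
pool.close()
pool.join()
close() tells the pool that no more tasks will be submitted, and join() makes the main process wait until all child processes have finished before it runs the rest, i.e. the code after pool.join().
Note: join() must be called after close() (or terminate()).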
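The ordering rule can be checked directly. The sketch below assumes a recent Python 3, where calling join() on a pool that is still running raises ValueError (some older versions used an assert instead, hence the second exception type). The helper join_too_early is a hypothetical name.

```python
from multiprocessing import Pool


def join_too_early(pool):
    """Return True if join() is rejected because close() has not been called yet."""
    try:
        pool.join()
    except (ValueError, AssertionError):
        # Recent Python 3 raises ValueError; some older versions used an assert
        return True
    return False


if __name__ == '__main__':
    pool = Pool(2)
    print(join_too_early(pool))  # join() before close() is rejected
    pool.close()                 # no more tasks may be submitted
    pool.join()                  # now waits for the workers to exit
```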
2. map_async
Example:
import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = mp(5)
    num_list = [0, 1, 2]
    pool.map_async(run, num_list)
    print('non-blocking~~~~')
    pool.close()
    pool.join()
    print('end')
Output:
start
non-blocking~~~~
num is 0
num is 1
1 is end
num is 2
2 is end
0 is end
end
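When the return values matter, map_async also hands back an AsyncResult. A minimal sketch, with a hypothetical worker square, of collecting the results with get():

```python
from multiprocessing import Pool


def square(num):
    # Hypothetical worker that returns a value instead of printing
    return num * num


if __name__ == '__main__':
    with Pool(3) as pool:
        async_result = pool.map_async(square, [0, 1, 2])
        # get() blocks until every item is processed; the order matches the input list
        print(async_result.get(timeout=10))  # prints [0, 1, 4]
```

The result list follows the order of the input, even though the tasks themselves may finish in a different order.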
1. apply
Example:
import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = mp(5)
    for i in range(3):
        pool.apply(run, (i,))
    print('blocking~~~~')
    pool.close()
    pool.join()
    print('end')
Output:
start
num is 0
0 is end
num is 1
1 is end
num is 2
2 is end
blocking~~~~
end
2. map
Example:
import time
from multiprocessing import Pool as mp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    num_list = [0, 1, 2]
    pool = mp(5)
    pool.map(run, num_list)
    print('blocking~~~~')
    pool.close()
    pool.join()
    print('end')
Output:
start
num is 0
num is 1
1 is end
num is 2
2 is end
0 is end
blocking~~~~
end
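The two outputs above differ because apply blocks on every single call, while map blocks only once for the whole list. A rough timing sketch of that difference follows; nap, time_apply and time_map are hypothetical names, and the exact numbers will vary by machine.

```python
import time
from multiprocessing import Pool


def nap(seconds):
    # Hypothetical task that only sleeps
    time.sleep(seconds)
    return seconds


def time_apply(pool, delays):
    """apply blocks per call, so the tasks run one after another."""
    start = time.time()
    for d in delays:
        pool.apply(nap, (d,))
    return time.time() - start


def time_map(pool, delays):
    """map blocks once, while the tasks are spread across the workers."""
    start = time.time()
    pool.map(nap, delays)
    return time.time() - start


if __name__ == '__main__':
    delays = [0.5, 0.5, 0.5]
    pool = Pool(3)
    print('apply: {:.2f}s'.format(time_apply(pool, delays)))
    print('map:   {:.2f}s'.format(time_map(pool, delays)))
    pool.close()
    pool.join()
```

With three workers, the apply loop should take roughly the sum of the delays and map roughly the longest single delay.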
The thread pool (multiprocessing.dummy.Pool) supports the same four methods.
Example:
import time
from multiprocessing.dummy import Pool as tp

def run(num):
    print('num is {}'.format(num))
    if num == 0:
        time.sleep(5)
    print('{} is end'.format(num))

if __name__ == '__main__':
    print('start')
    pool = tp(5)
    num_list = [0, 1, 2]
    pool.map_async(run, num_list)
    print('non-blocking~~~~')
    pool.close()
    pool.join()
    print('end')
Result:
start
non-blocking~~~~
num is 0
num is 1
num is 2
2 is end
1 is end
0 is end
end
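One quick way to confirm that multiprocessing.dummy really uses threads is to compare process IDs. In this small sketch (worker_pid is a hypothetical helper), every task reports the PID of the parent process:

```python
import os
from multiprocessing.dummy import Pool as tp


def worker_pid(_):
    # Threads run inside the parent process, so the PID is always the same
    return os.getpid()


if __name__ == '__main__':
    pool = tp(3)
    pids = pool.map(worker_pid, range(3))
    pool.close()
    pool.join()
    print(set(pids) == {os.getpid()})  # prints True
```

Swapping the import for multiprocessing.Pool would instead give the PIDs of the worker processes.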
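1. For I/O-bound work, multithreading is recommended; for CPU-bound work, multiprocessing is recommended.
2. With multiple processes, the operating system schedules the processes across the cores. A core only does work while a process is computing, so if the tasks involve a lot of I/O, the processes spend a long time paused while they wait. In that case it makes sense to start more processes than there are cores, e.g. more than 8 on an 8-core machine. Otherwise 8 processes, or fewer, is generally enough.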
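The effect of pool size on I/O-bound work can be sketched with a thread pool, using time.sleep as a stand-in for a network or disk wait. The names fake_io and elapsed are hypothetical, and the timings are only indicative.

```python
import time
from multiprocessing.dummy import Pool as tp


def fake_io(_):
    # time.sleep stands in for a network or disk wait
    time.sleep(0.1)


def elapsed(workers, tasks=8):
    """Time how long `tasks` simulated I/O waits take with `workers` threads."""
    start = time.time()
    pool = tp(workers)
    pool.map(fake_io, range(tasks))
    pool.close()
    pool.join()
    return time.time() - start


if __name__ == '__main__':
    for workers in (1, 4, 8):
        print('{} workers: {:.2f}s'.format(workers, elapsed(workers)))
```

Because the workers spend their time waiting rather than computing, adding workers keeps shortening the total time here, which would not be the case for CPU-bound tasks.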
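multiprocessing.Manager().list() creates a list that can be used to share data between processes.
Note: each call to multiprocessing.Manager() starts a new process that manages the shared data, so creating n shared lists with n separate multiprocessing.Manager().list() calls results in n manager processes. Creating all the lists from a single Manager object keeps it to one.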
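A minimal sketch of sharing a list between pool workers while reusing a single Manager, so only one manager process is started. The worker record and the variable names are hypothetical.

```python
from multiprocessing import Manager, Pool


def record(shared, num):
    # append() is forwarded to the manager process through a proxy object
    shared.append(num * num)


if __name__ == '__main__':
    with Manager() as manager:       # starts one manager process
        squares = manager.list()     # both lists live in that same process
        other = manager.list()
        with Pool(3) as pool:
            pool.starmap(record, [(squares, i) for i in range(5)])
        print(sorted(squares))       # prints [0, 1, 4, 9, 16]
```

The result is sorted before printing because the order in which workers append is not guaranteed.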
1. Several pitfalls in Python multithreading and multiprocessing:
https://www.findhao.net/easycoding/2410
2. Sharing data between processes:
https://blog.csdn.net/houyanhua1/article/details/78244288