
Python Notes: Threads and Processes

April 4, 2020 • Read: 553 • Python


First, here is a good explanation of the difference between threads and processes:

What is the difference between a thread and a process? - biaodianfu's answer - Zhihu

Processes (multiprocessing)

Process

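On UNIX and Linux, the operating system provides a fork() system call. It is unusual in that a single call returns twice: the kernel copies the current process (the parent) to create a child process, and fork() then returns in both of them. In the child it always returns 0; in the parent it returns the child's process ID. Windows, however, has no fork() system call. To support multiple processes on Windows as well, Python provides a cross-platform module, multiprocessing, whose Process class represents a process object.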
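Before turning to multiprocessing, the fork() behaviour described above can be sketched directly with os.fork(). This is only a minimal illustration and runs on UNIX-like systems only; the helper name fork_once and the exit status 7 are made up for this example.

```python
import os


def fork_once(child_status=7):
    """Fork once and return (child_pid, child_exit_status) as seen by the parent."""
    pid = os.fork()
    if pid == 0:
        # We are in the child: fork() returned 0 here.
        # Exit immediately with a recognizable status (hypothetical value).
        os._exit(child_status)
    # We are in the parent: fork() returned the child's process ID.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)


if __name__ == '__main__':
    if hasattr(os, 'fork'):
        child_pid, exit_status = fork_once()
        print('parent %s forked child %s, which exited with status %s'
              % (os.getpid(), child_pid, exit_status))
    else:
        print('os.fork() is not available on this platform')
```

The multiprocessing example that follows gets the same parent/child split without calling fork() directly, which is what makes it portable to Windows.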

'''
@Author: Mr.Sen
@LastEditTime: 2020-03-27 21:37:54
@Website1: 449293786.site
@Website2: sluowanx.cn
'''
import os 
from multiprocessing import Process

#child_process
def run_child(process_name):
    print('child process name is %s, id is %s' % (process_name,os.getpid()))
    
if __name__=='__main__':
    print('Parent process id is %s' % os.getpid())
    p = Process(target=run_child,args=('child',))
    print('start')
    p.start()    # start the child process
    p.join()     # wait until the child process has ended
    print('ended')

The output is as follows:

Parent process id is 4720
start
child process name is child, id is 17876
ended

In the code above, os.getpid() returns the ID of the current process.

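The Process constructor takes the function to run through the target keyword and that function's arguments through the args keyword. args must be a tuple, so a single argument needs a trailing comma: ('child',). Without the comma, ('child') is just the string 'child', which is unpacked character by character and passed as five separate arguments:

# Error output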
Parent process id is 1588
start
Process Process-1:
Traceback (most recent call last):
  File "C:\Users\kunxucai\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 313, in _bootstrap
    self.run()
  File "C:\Users\kunxucai\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
TypeError: run_child() takes 1 positional argument but 5 were given
ended
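The same TypeError can be reproduced without starting a process at all, since Process simply calls target(*args). The sketch below reuses the run_child name from the example above; call_with is a hypothetical helper added only for this illustration.

```python
def run_child(process_name):
    return 'child process name is %s' % process_name


not_a_tuple = ('child')   # parentheses only: this is just the str 'child'
one_tuple = ('child',)    # trailing comma: a tuple with a single element


def call_with(args):
    """Call run_child(*args) the way Process does, reporting any TypeError."""
    try:
        return run_child(*args)
    except TypeError as exc:
        return 'TypeError: %s' % exc


if __name__ == '__main__':
    print(type(not_a_tuple).__name__, len(not_a_tuple))
    print(type(one_tuple).__name__, len(one_tuple))
    print(call_with(one_tuple))
    print(call_with(not_a_tuple))
```

Unpacking the five-character string yields five positional arguments, which matches the "5 were given" in the traceback.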

Calling .start() launches the child process, and .join() blocks until that child process has finished; it is commonly used to synchronize processes.
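Since join() waits for one specific child, waiting for several children means joining each of them. The following is a rough sketch of that pattern; child_task and the exit statuses 0 and 3 are invented for the example. It also prints is_alive() and exitcode before start() and after join().

```python
import os
import sys
from multiprocessing import Process


def child_task(exit_status):
    # Hypothetical worker: report its pid, then exit with the given status.
    print('child %s exiting with status %s' % (os.getpid(), exit_status))
    sys.exit(exit_status)


if __name__ == '__main__':
    children = [Process(target=child_task, args=(status,)) for status in (0, 3)]
    for p in children:
        # Before start(): the process is not alive and has no exit code yet.
        print('before start:', p.name, p.is_alive(), p.exitcode)
        p.start()
    for p in children:
        p.join()  # blocks until this particular child has finished
        print('after join:', p.name, p.is_alive(), p.exitcode)
```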

Pool (process pool)

Creating a single child process is simple, but what if we need many of them?

In that case we can use a Pool (process pool) to create child processes in bulk.

'''
@Author: Mr.Sen
@LastEditTime: 2020-03-27 21:37:54
@Website1: 449293786.site
@Website2: sluowanx.cn
'''
from multiprocessing import Pool
import os,time,random

def long_time_task(name):
    print('Run task %s (%s)...' % (name,os.getpid()))
    start = time.time()
    time.sleep(random.random()*3)
    end = time.time()
    print('Task %s run %0.2f seconds.' % (name,(end-start)))

if __name__=='__main__':
    print('Parent process %s.' % os.getpid())
    p=Pool(4)
    # Set up four processes
    for i in range(5):
        p.apply_async(long_time_task,args=(i,))
        # submit the task asynchronously (non-blocking)
    print('Wait for all subprocess done..')
    p.close()
    p.join()
    print('All subprocess done.')

For more on p.apply_async(long_time_task,args=(i,)) in the code above, see: The difference between apply and apply_async in Python multiprocessing
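As a rough illustration of that difference: apply() blocks until the single call has finished and returns its result, while apply_async() returns an AsyncResult right away, whose get() can be called later. The square function, the pool size of 2, and the timeout are assumptions made for this sketch.

```python
import time
from multiprocessing import Pool


def square(x):
    # Hypothetical task: pretend to work briefly, then return a result.
    time.sleep(0.2)
    return x * x


if __name__ == '__main__':
    with Pool(2) as pool:
        # apply() blocks until this one call has finished.
        print('apply:', pool.apply(square, args=(3,)))

        # apply_async() returns an AsyncResult immediately, so all four
        # tasks are submitted before any result is collected.
        pending = [pool.apply_async(square, args=(i,)) for i in range(4)]
        # get() blocks until that particular result is ready.
        print('apply_async:', [r.get(timeout=10) for r in pending])
```

Unlike the long_time_task example, this version keeps the AsyncResult objects, which is the usual way to get return values back from the workers.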

The program prints the following:

Parent process 7952.
Wait for all subprocess done..
Run task 0 (9504)...
Task 0 run 0.90 seconds.
Run task 1 (16724)...
Task 1 run 0.28 seconds.
Run task 4 (16724)...
Task 4 run 1.27 seconds.
Run task 2 (18100)...
Task 2 run 2.65 seconds.
Run task 3 (9676)...
Task 3 run 2.93 seconds.
All subprocess done.

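The code above submits five tasks to a pool limited to four worker processes. Because the pool runs at most four tasks at once, the fifth task has to wait until one of the first four finishes (five trains, four tracks); in the output, task 4 starts only after task 1 has finished, in the same worker process (16724). Called without an argument, Pool() creates as many worker processes as the machine has CPU cores.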
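One way to observe this reuse of workers is to have each task return the pid of the process that ran it. This is a small sketch, not part of the original example: which_worker and count_workers are hypothetical helpers, and os.cpu_count() is printed only to show the value the default pool size is based on (newer Python versions may use os.process_cpu_count() instead).

```python
import os
import time
from multiprocessing import Pool


def which_worker(task_id):
    # Hypothetical task: report which process handled it.
    time.sleep(0.1)
    return task_id, os.getpid()


def count_workers(results):
    """Number of distinct worker pids among (task_id, pid) pairs."""
    return len({pid for _, pid in results})


if __name__ == '__main__':
    print('os.cpu_count():', os.cpu_count())
    with Pool(4) as pool:
        results = pool.map(which_worker, range(5))
    print(results)
    # Five tasks, but never more than four distinct worker processes.
    print('distinct workers used:', count_workers(results))
```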

Inter-process communication

The multiprocessing module provides Queue, Pipe, and other mechanisms for exchanging data between processes. Here we focus on Queue.

'''
@Author: Mr.Sen
@LastEditTime: 2020-03-27 22:57:54
@Website1: 449293786.site
@Website2: sluowanx.cn
'''
from multiprocessing import Process, Queue
import os, time, random

# write data
def write(q):
    # q is the Queue instance created by the parent process
    print('Process to write: %s' % os.getpid())
    for value in ['A', 'B', 'C']:
        print('Put %s to queue...' % value)
        q.put(value)
        time.sleep(random.random())

# read data
def read(q):
    # q is the Queue instance created by the parent process
    print('Process to read: %s' % os.getpid())
    while True:
        value = q.get(True)
        print('Get %s from queue.' % value)

if __name__=='__main__':
    # The parent process creates the Queue 
    # and passes it to each child process
    q = Queue()
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q,))
    # Start child process pw and write:
    pw.start()
    # Start child process pr and read:
    pr.start()
    # Wait for pw to finish
    pw.join()
    # pr loops forever, so it has to be terminated explicitly
    pr.terminate()

The output is as follows:

Process to read: 2312
Process to write: 6004
Put A to queue...
Get A from queue.
Put B to queue...
Get B from queue.
Put C to queue...
Get C from queue.
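The reader in the example has to be stopped with terminate() because its loop never ends. A common alternative, sketched here under the assumption that None is never a real payload, is for the writer to put a sentinel value on the queue so the reader can exit on its own. SENTINEL, read_and_print, and the 10-second timeout are choices made for this illustration only.

```python
from multiprocessing import Process, Queue

SENTINEL = None  # hypothetical end-of-stream marker, assumed never to be real data


def write(q, values):
    for value in values:
        q.put(value)
    q.put(SENTINEL)  # tell the reader that nothing more is coming


def read(q):
    received = []
    while True:
        value = q.get(timeout=10)
        if value is SENTINEL:
            break
        received.append(value)
    return received


def read_and_print(q):
    print('Got from queue:', read(q))


if __name__ == '__main__':
    q = Queue()
    pw = Process(target=write, args=(q, ['A', 'B', 'C']))
    pr = Process(target=read_and_print, args=(q,))
    pw.start()
    pr.start()
    pw.join()
    pr.join()  # returns normally; no terminate() needed
```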

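Because of how some Python shells handle output (IDLE, for example), the child processes may print nothing when the Queue example above is run from inside such a shell, so run the script from a command prompt (cmd) instead.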
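The other mechanism mentioned at the start of this section, Pipe, is not covered above. As a brief sketch: Pipe() returns two connection objects, and whatever is sent on one end can be received on the other. The child function and the upper-casing reply are invented for this example.

```python
from multiprocessing import Pipe, Process


def child(conn):
    # Hypothetical child: receive one message, reply in upper case, then close.
    message = conn.recv()
    conn.send(message.upper())
    conn.close()


if __name__ == '__main__':
    parent_conn, child_conn = Pipe()  # two ends of a duplex connection
    p = Process(target=child, args=(child_conn,))
    p.start()
    parent_conn.send('hello')
    print('reply from child:', parent_conn.recv())
    p.join()
```

A Pipe connects exactly two endpoints, whereas a Queue can be shared by several producers and consumers.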

Threads

//TODO

//To be continued

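References:

Liao Xuefeng's official website

What is the difference between a thread and a process? - biaodianfu's answer - Zhihu

The difference between apply and apply_async in Python multiprocessing

Python parallel computing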
