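For CPU-bound tasks, frequently switching between threads on the processor does not necessarily improve efficiency. If each task is instead run by an interpreter started in a new process, then on a multi-core machine the work has a chance of being distributed across the cores and run in parallel.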
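- Multi-processing:
  - Passing data between processes is more complex and slower, because under the operating system's management one process cannot access another process's memory.
  - Suited to CPU-bound work such as heavy computation in loops, where the bottleneck is the computation itself. When multiple cores are available, consider multiple processes to improve efficiency.
  - Can run on multiple CPU cores.
- Multi-threading:
  - Passing data is simple because threads share memory. The most direct approach is a global variable shared by all threads, but for that very reason you must guard against race conditions.
  - Suited to I/O-bound work, such as a web crawler that spends most of its time waiting for request responses.
  - If your program does a lot of data or network I/O, use threads: the bottleneck is the I/O rather than the GIL, and the lower overhead of threads makes them more practical than processes.
  - If your program has a GUI, use threads to keep long-running work off the UI thread so the interface stays responsive. The GIL protects the interpreter's internal state, but it does not by itself prevent deadlocks or race conditions in your own code.
  - In CPython, the GIL lets only one thread execute Python bytecode at a time, so threads do not run Python code in parallel across cores. The interpreter switches between threads and releases the GIL during blocking I/O.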
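
The shared-global approach to passing data between threads, and the race condition it invites, can be sketched as follows. This is a minimal illustration rather than code from the original notes: the names `counter`, `counter_lock`, `add_many`, and `run` are hypothetical, and the thread and iteration counts are arbitrary.

```python
import threading

counter = 0                      # global state shared by every thread
counter_lock = threading.Lock()  # guards each update to `counter`


def add_many(times: int) -> None:
    """Increment the shared counter `times` times."""
    global counter
    for _ in range(times):
        # `counter += 1` is a read-modify-write; holding the lock makes
        # sure no other thread updates the counter in between.
        with counter_lock:
            counter += 1


def run(n_threads: int = 4, times: int = 10_000) -> int:
    """Reset the counter, run `n_threads` threads, and return the total."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(times,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == '__main__':
    print(run())
```

Without the `with counter_lock:` line, two threads could read the same old value and one increment could be lost. The lock trades a little speed for a total that is always `n_threads * times`.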
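
```python
import sys


def foo(filename: str) -> int:
    """Count the characters in the file whose uppercase form is 'B'."""
    with open(filename) as f:
        text = f.read()
    ct = 0
    for ch in text:
        # Deliberately CPU-heavy per-character work: ord('B') + 1 == 67.
        n = ord(ch.upper()) + 1
        if n == 67:
            ct += 1
    return ct


# Sequential baseline: process the files named on the command line one by one.
count = 0
for filename in sys.argv[1:]:
    count += foo(filename)
print(count)
```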
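
```python
import sys
from multiprocessing import Process, Queue


def foo(filename: str, queue: Queue) -> None:
    """Count 'b'/'B' characters in the file and put the count on the queue."""
    with open(filename) as f:
        text = f.read()
    ct = 0
    for ch in text:
        n = ord(ch.upper()) + 1
        if n == 67:
            ct += 1
    queue.put(ct)


if __name__ == '__main__':
    queue: Queue = Queue()
    # One process per file named on the command line.
    ps = [Process(target=foo, args=(filename, queue)) for filename in sys.argv[1:]]
    for p in ps:
        p.start()
    # Each result is a small int, so joining before draining the queue is fine.
    # With large payloads, drain the queue before calling join().
    for p in ps:
        p.join()
    count = 0
    while not queue.empty():
        count += queue.get()
    print(count)
```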
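
```python
import multiprocessing
import sys


def foo(filename: str) -> int:
    """Count the characters in the file whose uppercase form is 'B'."""
    with open(filename) as f:
        text = f.read()
    ct = 0
    for ch in text:
        n = ord(ch.upper()) + 1
        if n == 67:
            ct += 1
    return ct


if __name__ == '__main__':
    filenames = sys.argv[1:]
    # A pool of 2 worker processes shared by all files.
    with multiprocessing.Pool(2) as pool:
        results = [pool.apply_async(foo, (filename,)) for filename in filenames]
        # Collect the results inside the `with` block: leaving it terminates the pool.
        count = sum(result.get() for result in results)
    print(count)
```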
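
For contrast with the CPU-bound examples above, the I/O-bound case mentioned earlier (a crawler waiting on responses) might look like the sketch below. It makes no real network calls: `fake_request` is a hypothetical stand-in that simulates the wait with `time.sleep`, and `fetch_all`, the URLs, and the delay are illustrative assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_request(url: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for a network request: wait, then return a string."""
    time.sleep(delay)  # blocking wait; other threads can run in the meantime
    return f'response from {url}'


def fetch_all(urls: List[str], workers: int = 5) -> List[str]:
    """Run fake_request for every URL on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(fake_request, urls))


if __name__ == '__main__':
    urls = [f'https://example.com/page/{i}' for i in range(5)]
    start = time.perf_counter()
    responses = fetch_all(urls)
    elapsed = time.perf_counter() - start
    # The waits overlap, so the elapsed time should be close to one delay
    # rather than the sum of all five.
    print(len(responses), f'{elapsed:.2f}s')
```

Because the threads spend nearly all their time waiting rather than executing Python bytecode, the GIL is not the bottleneck here, which is why threads are usually the cheaper choice for this kind of work.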

With multiprocessing, it is best not to share state between processes. When sharing is unavoidable, use a Queue or a Lock.
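
Demo without a lock (the two lines printed by one process may be interleaved with output from other processes):

```python
from multiprocessing import Process


def f(i: int) -> None:
    # Nothing stops another process from printing between these two lines.
    print('hello world', i)
    print('hello world', i + 1)


if __name__ == '__main__':
    for num in range(100):
        Process(target=f, args=(num,)).start()
```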

Demo with a lock:

```python
import multiprocessing
from multiprocessing.synchronize import Lock


def f(lock: Lock, i: int) -> None:
    # Holding the lock keeps the two lines printed by one process together.
    with lock:
        print('hello world', i)
        print('hello world', i + 1)


if __name__ == '__main__':
    lock: Lock = multiprocessing.Lock()
    for num in range(100):
        multiprocessing.Process(target=f, args=(lock, num)).start()
```
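
When processes really do need to share state, one possible approach is a shared counter guarded by a lock. The sketch below is not from the original notes: it assumes `multiprocessing.Value`, which by default is created together with a lock reachable through `get_lock()`, and the function name `add_many` and the counts are hypothetical.

```python
import multiprocessing


def add_many(counter, times: int) -> None:
    """Increment the shared counter `times` times."""
    for _ in range(times):
        # `counter.value += 1` is a read-modify-write, so hold the
        # counter's own lock around it.
        with counter.get_lock():
            counter.value += 1


if __name__ == '__main__':
    # A C int in shared memory, created together with its lock.
    counter = multiprocessing.Value('i', 0)
    ps = [multiprocessing.Process(target=add_many, args=(counter, 1000))
          for _ in range(4)]
    for p in ps:
        p.start()
    for p in ps:
        p.join()
    print(counter.value)  # 4 processes * 1000 increments each
```

Taking the lock on every increment is slow. Where possible, having each process return its own partial result through a Queue or a Pool, as in the earlier examples, avoids sharing state altogether.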
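
- Python 3.7 技術手冊 (Python 3.7 Technical Handbook), 林信良, 碁峯
- 【Python教學】淺談 Multi-processing & Multi-threading 使用方法 ([Python Tutorial] A Brief Look at How to Use Multi-processing & Multi-threading)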