A Quick Look at Python Threads

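When I wrote multithreaded programs in the past, I usually wrapped each thread in its own class and used a global lock to control access to contended resources. Today I came across this piece of code: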

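# Python 2 code: urllib2 and Queue were renamed urllib.request and queue in Python 3
import time
import urllib2
from threading import Thread, Lock
from Queue import Queue

class Fetcher:
    def __init__(self,threads):
        self.opener = urllib2.build_opener(urllib2.HTTPHandler)
        self.lock = Lock() # guards self.running
        self.q_req = Queue() # task queue
        self.q_ans = Queue() # queue of finished results
        self.threads = threads
        self.running = 0 # set before the workers start, since they update it
        for i in range(threads):
            t = Thread(target=self.threadget)
            t.setDaemon(True)
            t.start()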
    def __del__(self): # on destruction, wait for both queues to drain
        time.sleep(0.5)
        self.q_req.join()
        self.q_ans.join()
    def taskleft(self):
        return self.q_req.qsize()+self.q_ans.qsize()+self.running
    def push(self,req):
        self.q_req.put(req)
    def pop(self):
        ans = self.q_ans.get()
        self.q_ans.task_done() # otherwise q_ans.join() in __del__ never returns
        return ans
    def threadget(self):
        while True:
            req = self.q_req.get()
            with self.lock: # the increment must be atomic, so enter the critical section
                self.running += 1
            try:
                ans = self.opener.open(req).read()
            except Exception as what:
                ans = '' # on failure, return an empty body
                print(what)
            self.q_ans.put((req,ans))
            with self.lock:
                self.running -= 1
            self.q_req.task_done()
            time.sleep(0.1) # don't spam

Original article: http://www.pythonclub.org/python-network-application/observer-spider

I learned quite a bit from that article.

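Treating a single class as the thread pool, and acquiring and releasing the lock from inside it, makes the code much clearer than scattering a global lock across separate thread classes.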
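
To check my understanding, here is a minimal Python 3 sketch of the same pattern. It is not the article's code: WorkerPool, func, and run_demo are names I made up, and an arbitrary callable stands in for the HTTP fetch so that the example runs without a network. The point is only that the lock and both queues live inside the class, so the caller never touches them.

```python
import queue
import threading


class WorkerPool:
    """Hypothetical pool that owns its threads, queues, and lock."""

    def __init__(self, num_threads, func):
        self.func = func              # work applied to each task (assumption)
        self.lock = threading.Lock()  # guards self.running only
        self.q_req = queue.Queue()    # pending tasks
        self.q_ans = queue.Queue()    # finished (task, result) pairs
        self.running = 0              # tasks currently being processed
        for _ in range(num_threads):
            threading.Thread(target=self._worker, daemon=True).start()

    def push(self, task):
        self.q_req.put(task)

    def pop(self):
        # Blocks until some worker has produced a result.
        return self.q_ans.get()

    def _worker(self):
        while True:
            task = self.q_req.get()
            with self.lock:           # the counter update must not interleave
                self.running += 1
            try:
                result = self.func(task)
            except Exception as exc:  # hand the error back instead of dying
                result = exc
            self.q_ans.put((task, result))
            with self.lock:
                self.running -= 1
            self.q_req.task_done()


def run_demo():
    # Square ten numbers on four worker threads.
    pool = WorkerPool(4, lambda n: n * n)
    for n in range(10):
        pool.push(n)
    # Results arrive in completion order, so collect them keyed by task.
    return dict(pool.pop() for _ in range(10))
```

The caller only sees push and pop. The demo pops exactly as many results as it pushed rather than polling a task counter, because a counter read between a worker's get and its increment can briefly look too low.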