Quickly Recognizing CAPTCHAs with Python + Tesseract

Installation

Install Tesseract with Homebrew (brew):

brew install tesseract

Install the following dependencies with pip:

  • requests
  • pytesseract
  • pillow

pip install requests pytesseract pillow

Of course, creating an isolated environment with virtualenv first is even better.
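
Before moving on, it can help to confirm that everything listed above is actually visible from the current environment. The following is a minimal sketch using only the standard library; the function name check_environment is hypothetical, and it assumes the Tesseract binary is on PATH under the name tesseract (which is what the Homebrew install normally provides).

```python
import importlib.util
import shutil


def check_environment():
    """Return a dict mapping each requirement to True if it appears available."""
    # Assumption: the OCR engine is installed as an executable named "tesseract".
    status = {"tesseract": shutil.which("tesseract") is not None}
    # pip names and import names can differ: pillow is imported as PIL.
    for pip_name, import_name in [
        ("requests", "requests"),
        ("pytesseract", "pytesseract"),
        ("pillow", "PIL"),
    ]:
        status[pip_name] = importlib.util.find_spec(import_name) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Anything reported as MISSING should be installed before running the script in the next section.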

Usage

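# -*- coding: utf-8 -*-
import io
import requests
import pytesseract
from PIL import Image


url = 'CAPTCHA image URL goes here'

# Use a session so the same cookies can be reused when submitting the answer
# (CAPTCHAs are typically tied to the session that requested them)
session = requests.Session()
response = session.get(url, timeout=10)
response.raise_for_status()  # fail early on an HTTP error

# Decode the image in memory and keep a copy on disk for inspection
captcha_image = Image.open(io.BytesIO(response.content))
captcha_image.save('Captcha.png', 'PNG')

# Run Tesseract OCR and strip the surrounding whitespace from the result
captcha = pytesseract.image_to_string(captcha_image).strip()
print('Captcha:', captcha)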
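
The string returned by image_to_string may contain spaces, line breaks, or stray symbols that are not part of the CAPTCHA. A minimal cleanup sketch follows; the helper name clean_captcha is hypothetical, and the default alphabet (ASCII letters and digits) is an assumption that should be adjusted to the CAPTCHA actually being recognized.

```python
import re


def clean_captcha(raw, allowed=r"A-Za-z0-9"):
    """Drop whitespace and every character outside the assumed alphabet."""
    # Assumption: the CAPTCHA consists only of ASCII letters and digits.
    return re.sub(f"[^{allowed}]", "", raw)


print(clean_captcha(" a B3\n\x0c"))  # → aB3
```

In the script above, this could be applied as captcha = clean_captcha(pytesseract.image_to_string(captcha_image)).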