Quick CAPTCHA Recognition with Python + Tesseract

Installation

Install tesseract with brew:

brew install tesseract

Install the following dependencies with pip:

  • requests
  • pytesseract
  • pillow

pip install requests pytesseract pillow

Of course, creating an isolated environment with virtualenv first is even better.
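
pytesseract is only a wrapper around the tesseract command-line program, so the script will fail if the binary is not on the PATH. A minimal sketch of a pre-flight check using only the standard library follows; the helper name `find_tesseract` is made up for this illustration and is not part of pytesseract.

```python
import shutil


def find_tesseract():
    """Return the full path of the tesseract binary, or None if it is not on PATH.

    Hypothetical helper for illustration; it only looks the command up,
    it does not run it.
    """
    return shutil.which("tesseract")


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found; install it first (e.g. brew install tesseract)")
    else:
        print("tesseract found at", path)
```

If the binary lives somewhere outside the PATH, pytesseract also lets you point to it by setting `pytesseract.pytesseract.tesseract_cmd` to the full path.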

Usage

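# -*- coding: utf-8 -*-
import io

import requests
import pytesseract
from PIL import Image

# Replace with the URL of the CAPTCHA image
url = 'CAPTCHA image URL goes here'

# Download the image; using a session lets later requests reuse the same cookies
session = requests.session()
response = session.get(url)
response.raise_for_status()

# Load the downloaded bytes into a Pillow image and save a copy for inspection
captcha_image = Image.open(io.BytesIO(response.content))
captcha_image.save('Captcha.png', 'PNG')

# Run Tesseract OCR and strip the surrounding whitespace from the result
captcha = pytesseract.image_to_string(captcha_image).strip()
print('Captcha:', captcha)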
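
The text returned by `image_to_string` often contains stray whitespace, line breaks, or punctuation that is not part of the CAPTCHA. A rough sketch of a cleanup step is shown below. It assumes the CAPTCHA consists only of ASCII letters and digits, which the original post does not state; `normalize_captcha` and its `expected_length` parameter are hypothetical names introduced for this example.

```python
import string

# Assumption: the CAPTCHA uses only ASCII letters and digits
ALLOWED_CHARS = set(string.ascii_letters + string.digits)


def normalize_captcha(raw, expected_length=None):
    """Drop every character outside ALLOWED_CHARS from the OCR output.

    If expected_length is given and the cleaned text has a different
    length, return None so the caller can fetch a new CAPTCHA and retry.
    """
    cleaned = "".join(ch for ch in raw if ch in ALLOWED_CHARS)
    if expected_length is not None and len(cleaned) != expected_length:
        return None
    return cleaned
```

In the script above this could be applied as `captcha = normalize_captcha(pytesseract.image_to_string(captcha_image))`. Returning None on a length mismatch is one possible design; it makes a failed recognition explicit instead of submitting a guess that is known to be wrong.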