Recently I needed to capture traffic from a page that is only reachable over a VPN, and found that with the VPN connected directly, Fiddler and Charles could not capture anything. Searching around turned up a couple of leads: a GitHub post on Fiddler/Charles failing to capture under a direct VPN connection (tried it, didn't seem to work), and an article on fixing the "no network" problem when setting up Charles capture under a VPN on Windows. The latter pointed me in the right direction: under Charles's Proxy -> External Proxy Settings you can hand traffic off to the VPN client's own proxy port.
1. Find the VPN client's proxy port
I'm using vmess; the port can be checked under Options -> Parameter Settings. The two things you need are the port and the protocol: in my case, 10808 and SOCKS.
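If you want to sanity-check that the client is really listening on that port before touching Charles, here is a minimal sketch (my addition, assuming the 127.0.0.1:10808 value above):

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Should print True while the VPN client is running and listening on 10808.
print(port_is_open("127.0.0.1", 10808))
```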

2. Configure Charles
Under Proxy -> External Proxy Settings, first enable the external proxy option, then fill in the port and protocol you just looked up in vmess (for SOCKS, that means the SOCKS proxy fields: 127.0.0.1:10808 in my case).

3. Setup complete, start capturing
And that's a wrap!
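To confirm the whole chain works (script -> Charles -> vmess SOCKS proxy), one quick check, sketched below under the assumption that Charles is listening on its default HTTP proxy port 8888, is to route a request through Charles and watch it appear in the session list:

```python
import requests

# Point requests at Charles's local HTTP proxy (default 8888; adjust if yours differs).
charles = {
    "http": "http://127.0.0.1:8888",
    "https": "http://127.0.0.1:8888",
}

# verify=False only for this quick test, since Charles re-signs HTTPS traffic
# with its own certificate; trust the Charles root cert for real work.
r = requests.get("https://cn.v2ex.com/about", proxies=charles, verify=False)
print(r.status_code)  # the same request should show up inside Charles
```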
Appendix:
Using a proxy with requests
Note: SOCKS proxy support in requests needs the PySocks extra (pip install requests[socks]).

```python
import requests

# Cookies and headers copied from the browser devtools for cn.v2ex.com.
cookies = {
    'PB3_SESSION': '"2|1:0|10:1650810241|11:PB3_SESSION|40:djJleDo1Mi4xNDAuMjAxLjIxMTo1OTQ4NjM0Mg==|f661892137fd704b91fa09d8c58fd641a15ab9e83f94c69981dbeed7980fc9e4"',
    'V2EX_LANG': 'zhcn',
}

headers = {
    'authority': 'cn.v2ex.com',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8',
    'cache-control': 'no-cache',
    'pragma': 'no-cache',
    'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="100", "Google Chrome";v="100"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'document',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-site': 'none',
    'sec-fetch-user': '?1',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36',
}

# socks5h (note the "h") resolves DNS through the proxy as well.
http_proxy = "socks5h://127.0.0.1:10808"
https_proxy = "socks5h://127.0.0.1:10808"
proxies = {
    "https": https_proxy,
    "http": http_proxy,
}

response = requests.get('https://cn.v2ex.com/about', cookies=cookies, headers=headers, proxies=proxies)
```
Note: I first ran this in Sublime and kept getting an encoding error on response.text, even though I had confirmed via the page's Content-Type and meta charset that the encoding was fine. It then occurred to me that the console itself might be unable to display the encoding, so I ran it in PyCharm instead, and it worked!
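If you hit the same thing, here is a workaround sketch that keeps you in the original editor (my addition; it assumes the error is a UnicodeEncodeError raised while printing to a console whose encoding can't represent the page's characters, not a failure to decode the response):

```python
import sys

text = response.text  # already decoded by requests

# Replace anything the console can't display instead of crashing on it.
enc = sys.stdout.encoding or "utf-8"
print(text.encode(enc, errors="replace").decode(enc))

# Or bypass the console entirely and inspect the file instead.
with open("page.html", "w", encoding="utf-8") as f:
    f.write(text)
```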
Using a SOCKS proxy with aiohttp
From: https://pypi.org/project/aiohttp-socks/ and https://www.cnblogs.com/john-xiong/p/13812567.html
- Install the connector: pip install aiohttp_socks
- Hook a ProxyConnector into the aiohttp ClientSession:
```python
import asyncio
import aiohttp
from aiohttp_socks import ProxyConnector

async def getDataByChromeDriver(url):
    # Build the connector inside the coroutine so each session owns its own.
    connector = ProxyConnector.from_url('socks5://127.0.0.1:10808')
    async with aiohttp.ClientSession(connector=connector) as session:
        async with session.get(url) as response:
            return await response.text()

if __name__ == '__main__':
    # Placeholder mapping of titles to URLs; the original post used its own
    # project-specific title_list here.
    title_list = {'example': 'https://cn.v2ex.com/about'}
    loop = asyncio.get_event_loop()
    loop.run_until_complete(asyncio.gather(
        *[getDataByChromeDriver(index) for title, index in title_list.items()]))
```
 
- Run it; the requests now go out through the SOCKS proxy.
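One more detail worth knowing (my addition, based on the aiohttp-socks README): the connector can also resolve DNS through the proxy, like requests' socks5h scheme, via the rdns flag:

```python
from aiohttp_socks import ProxyConnector, ProxyType

# rdns=True asks the SOCKS5 proxy to resolve hostnames remotely
# (the aiohttp equivalent of "socks5h://" in requests).
connector = ProxyConnector(
    proxy_type=ProxyType.SOCKS5,
    host='127.0.0.1',
    port=10808,
    rdns=True,
)
```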
Using a proxy with requests-html
See: "One requests_html module is all a Python crawler needs! (supports JS rendering & async requests)"
```python
from typing import Union
from requests_html import AsyncHTMLSession

# Minimal UA header; reuse the fuller browser headers from the requests example if needed.
headers = {'user-agent': 'Mozilla/5.0'}

http_proxy = "socks5h://127.0.0.1:10808"
https_proxy = "socks5h://127.0.0.1:10808"
proxies = {
    "https": https_proxy,
    "http": http_proxy,
}

session = AsyncHTMLSession()

async def getDataByChromeDriver(index: Union[int, str]):
    response = await session.get('https://www.qkl123.com/sector/{}'.format(index),
                                 headers=headers, proxies=proxies)
    return response
```
Async usage of requests-html
```python
from requests_html import AsyncHTMLSession

asession = AsyncHTMLSession()

# Note: run() calls each function with no arguments, hence no parameters here.
async def get_pyclock():
    r = await asession.get('http://httpbin.org/get')
    await r.html.arender()  # render JavaScript (downloads Chromium on first use)
    return r

# run() schedules the callables concurrently and collects their results.
results = asession.run(get_pyclock, get_pyclock, get_pyclock)
print(results)
```
See also: https://cloud.tencent.com/developer/article/1575104
The problem of asession.run not accepting arguments
My fix was to modify requests_html.AsyncHTMLSession so that it supports a url parameter.
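The post doesn't reproduce that modification, so here is an alternative sketch that avoids patching the library at all: since AsyncHTMLSession.run simply calls each callable with no arguments, you can bind the URL ahead of time with functools.partial (fetch below is my own hypothetical helper, not from the original post):

```python
from functools import partial
from requests_html import AsyncHTMLSession

asession = AsyncHTMLSession()

async def fetch(url):
    r = await asession.get(url)
    return r

# run() invokes each callable with no arguments, so pre-bind the url;
# a plain lambda (lambda: fetch(url)) works the same way.
urls = ['http://httpbin.org/get', 'http://httpbin.org/ip']
results = asession.run(*[partial(fetch, url) for url in urls])
print([r.url for r in results])
```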