<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Cloudflare on 猫猫鱼的小窝</title>
    <link>https://csdn.fjh1997.top/tags/cloudflare/</link>
    <description>Recent content from 猫猫鱼的小窝</description>
    <generator>Hugo</generator>
    <language>zh-CN</language>
    
    <managingEditor>xxx@example.com (catcatyu)</managingEditor>
    <webMaster>xxx@example.com (catcatyu)</webMaster>
    
    <copyright>Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting.</copyright>
    
    <lastBuildDate>Mon, 09 Mar 2026 15:00:00 +0800</lastBuildDate>
    
    
    <atom:link href="https://csdn.fjh1997.top/tags/cloudflare/atom.xml" rel="self" type="application/rss&#43;xml" />
    

    
    

    <item>
      <title>Python DrissionPage: Bypassing Cloudflare Verification to Download Images</title>
      <link>https://csdn.fjh1997.top/posts/24102.html</link>
      <pubDate>Mon, 09 Mar 2026 15:00:00 &#43;0800</pubDate>
      <author>xxx@example.com (catcatyu)</author>
      <guid>https://csdn.fjh1997.top/posts/24102.html</guid>
      <description>
        <![CDATA[<h1>Python DrissionPage: Bypassing Cloudflare Verification to Download Images</h1><p>Author: catcatyu (xxx@example.com)</p>
        
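          <p>Image resources on some sites (such as linux.do) are protected by Cloudflare, so requesting them directly with requests gets blocked. This post shows how to use DrissionPage to drive a real browser through the verification, extract its cookies, and then batch-download the images with urllib.</p>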
<h2 id="思路">
<a class="header-anchor" href="#%e6%80%9d%e8%b7%af"></a>
Approach
</h2><ol>
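<li>Launch a Chromium browser with DrissionPage and open the target page</li>
<li>Wait for the Cloudflare &ldquo;Just a moment&rdquo; verification page to pass on its own</li>
<li>Extract the browser cookies over CDP (Chrome DevTools Protocol)</li>
<li>Build request headers from the extracted cookies and download the images with urllib</li>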
</ol>
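<p>As a rough illustration of step 4, the sketch below builds a urllib request that replays the values captured from the browser. The helper name <code>build_image_request</code>, the example.com URLs, and the placeholder cookie and User-Agent strings are all assumptions made for this example; nothing is sent over the network.</p>

```python
import urllib.request


def build_image_request(image_url: str, cookie_header: str, user_agent: str, referer: str) -> urllib.request.Request:
    """Build a GET request that replays the browser's cookies and User-Agent."""
    headers = {
        "Cookie": cookie_header,
        "User-Agent": user_agent,
        "Referer": referer,
    }
    return urllib.request.Request(image_url, headers=headers)


# Placeholder values for illustration only; real ones come from the browser session.
request = build_image_request(
    "https://example.com/uploads/sample.png",
    "cf_clearance=PLACEHOLDER",
    "Mozilla/5.0 (placeholder)",
    "https://example.com/",
)
print(request.get_method(), request.full_url)
```

<p>The request object would then be passed to an opener's <code>open()</code> method to perform the actual download.</p>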
<h2 id="关键代码">
<a class="header-anchor" href="#%e5%85%b3%e9%94%ae%e4%bb%a3%e7%a0%81"></a>
Key code
</h2><h3 id="1-配置浏览器选项">
<a class="header-anchor" href="#1-%e9%85%8d%e7%bd%ae%e6%b5%8f%e8%a7%88%e5%99%a8%e9%80%89%e9%a1%b9"></a>
1. Configure browser options
</h3><div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">from</span> <span class="nn">DrissionPage</span> <span class="kn">import</span> <span class="n">Chromium</span><span class="p">,</span> <span class="n">ChromiumOptions</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">build_browser_options</span><span class="p">(</span><span class="n">proxy_url</span><span class="p">:</span> <span class="nb">str</span> <span class="o">|</span> <span class="kc">None</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">ChromiumOptions</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">options</span> <span class="o">=</span> <span class="n">ChromiumOptions</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">options</span><span class="o">.</span><span class="n">set_argument</span><span class="p">(</span><span class="s2">&#34;--no-first-run&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">options</span><span class="o">.</span><span class="n">set_argument</span><span class="p">(</span><span class="s2">&#34;--disable-features=Translate&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">proxy_url</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">options</span><span class="o">.</span><span class="n">set_argument</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;--proxy-server=</span><span class="si">{</span><span class="n">proxy_url</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">options</span><span class="o">.</span><span class="n">auto_port</span><span class="p">()</span>  <span class="c1"># auto-assign a debugging port to avoid conflicts</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">options</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p><code>auto_port()</code> makes each launch use a different port, so several instances can run in parallel.</p>
<h3 id="2-等待-cloudflare-验证通过">
<a class="header-anchor" href="#2-%e7%ad%89%e5%be%85-cloudflare-%e9%aa%8c%e8%af%81%e9%80%9a%e8%bf%87"></a>
2. Wait for the Cloudflare verification to pass
</h3><div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">time</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">PAGE_WAIT_SECONDS</span> <span class="o">=</span> <span class="mi">20</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">wait_for_page</span><span class="p">(</span><span class="n">tab</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kc">None</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">wait_rounds</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">PAGE_WAIT_SECONDS</span> <span class="o">//</span> <span class="mi">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">_</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">wait_rounds</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">        <span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">title</span> <span class="o">=</span> <span class="p">(</span><span class="n">tab</span><span class="o">.</span><span class="n">title</span> <span class="ow">or</span> <span class="s2">&#34;&#34;</span><span class="p">)</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">            <span class="n">lowered</span> <span class="o">=</span> <span class="n">title</span><span class="o">.</span><span class="n">lower</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">            <span class="c1"># The Cloudflare verification page title contains these keywords</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="n">title</span> <span class="ow">and</span> <span class="s2">&#34;just a moment&#34;</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">lowered</span> <span class="ow">and</span> <span class="s2">&#34;checking your browser&#34;</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">lowered</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="k">return</span>
</span></span><span class="line"><span class="cl">        <span class="k">except</span> <span class="ne">Exception</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">pass</span>
</span></span><span class="line"><span class="cl">    <span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
</span></span></code></pre></td></tr></table>
</div>
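</div><p>The principle is simple: the <code>&lt;title&gt;</code> of a Cloudflare verification page is usually &ldquo;Just a moment&hellip;&rdquo; or &ldquo;Checking your browser&hellip;&rdquo;, so once polling sees the title change to normal content, the verification has passed. If the title has not changed after roughly 20 seconds, the function returns anyway.</p>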
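<p>The title check can be isolated into a small predicate and tried out without a browser. This is only a sketch: the name <code>is_challenge_title</code> and the sample titles are made up for illustration, and the two marker strings are the ones used in <code>wait_for_page</code>.</p>

```python
CHALLENGE_MARKERS = ("just a moment", "checking your browser")


def is_challenge_title(title):
    """Return True while the title is empty or still looks like a Cloudflare interstitial."""
    lowered = (title or "").strip().lower()
    if not lowered:
        return True  # an empty title is treated as "not ready yet", as in wait_for_page
    return any(marker in lowered for marker in CHALLENGE_MARKERS)


# Sample titles are illustrative, not captured from a real site.
for sample in ["Just a moment...", "", "A normal page title"]:
    print(repr(sample), is_challenge_title(sample))
```

<p>Matching on the page title is a heuristic; a site whose own title contains one of these phrases would be misclassified.</p>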
<h3 id="3-通过-cdp-提取-cookie">
<a class="header-anchor" href="#3-%e9%80%9a%e8%bf%87-cdp-%e6%8f%90%e5%8f%96-cookie"></a>
3. Extract cookies over CDP
</h3><p>This is the core step: use <code>Network.getAllCookies</code> from the Chrome DevTools Protocol to get every cookie in the browser:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">domain_matches</span><span class="p">(</span><span class="n">host</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">domain</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">bool</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="s2">&#34;&#34;&#34;Return True if host matches the domain field of a cookie.&#34;&#34;&#34;</span>
</span></span><span class="line"><span class="cl">    <span class="n">normalized</span> <span class="o">=</span> <span class="n">domain</span><span class="o">.</span><span class="n">lstrip</span><span class="p">(</span><span class="s2">&#34;.&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="nb">bool</span><span class="p">(</span><span class="n">normalized</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="n">host</span> <span class="o">==</span> <span class="n">normalized</span> <span class="ow">or</span> <span class="n">host</span><span class="o">.</span><span class="n">endswith</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;.</span><span class="si">{</span><span class="n">normalized</span><span class="si">}</span><span class="s2">&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">extract_cookie_header</span><span class="p">(</span><span class="n">browser</span><span class="p">:</span> <span class="n">Chromium</span><span class="p">,</span> <span class="n">tab</span><span class="p">,</span> <span class="n">host</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">str</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">cookies</span><span class="p">:</span> <span class="nb">list</span><span class="p">[</span><span class="nb">dict</span><span class="p">]</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Method 1: get all cookies over CDP (including HttpOnly ones)</span>
</span></span><span class="line"><span class="cl">    <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">cookies</span><span class="o">.</span><span class="n">extend</span><span class="p">(</span><span class="n">tab</span><span class="o">.</span><span class="n">run_cdp</span><span class="p">(</span><span class="s2">&#34;Network.getAllCookies&#34;</span><span class="p">)</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;cookies&#34;</span><span class="p">,</span> <span class="p">[]))</span>
</span></span><span class="line"><span class="cl">    <span class="k">except</span> <span class="ne">Exception</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">pass</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Method 2: get cookies through the DrissionPage API</span>
</span></span><span class="line"><span class="cl">    <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">cookies</span><span class="o">.</span><span class="n">extend</span><span class="p">(</span><span class="n">browser</span><span class="o">.</span><span class="n">cookies</span><span class="p">(</span><span class="n">all_info</span><span class="o">=</span><span class="kc">True</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">    <span class="k">except</span> <span class="ne">Exception</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">pass</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Deduplicate and keep only cookies for the target domain</span>
</span></span><span class="line"><span class="cl">    <span class="n">seen</span><span class="p">:</span> <span class="nb">set</span><span class="p">[</span><span class="nb">tuple</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">str</span><span class="p">,</span> <span class="nb">str</span><span class="p">]]</span> <span class="o">=</span> <span class="nb">set</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">host_cookies</span><span class="p">:</span> <span class="nb">list</span><span class="p">[</span><span class="nb">dict</span><span class="p">]</span> <span class="o">=</span> <span class="p">[]</span>
</span></span><span class="line"><span class="cl">    <span class="k">for</span> <span class="n">cookie</span> <span class="ow">in</span> <span class="n">cookies</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">domain</span> <span class="o">=</span> <span class="n">cookie</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;domain&#34;</span><span class="p">,</span> <span class="s2">&#34;&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="ow">not</span> <span class="n">domain_matches</span><span class="p">(</span><span class="n">host</span><span class="p">,</span> <span class="n">domain</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">            <span class="k">continue</span>
</span></span><span class="line"><span class="cl">        <span class="n">key</span> <span class="o">=</span> <span class="p">(</span><span class="n">cookie</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;name&#34;</span><span class="p">,</span> <span class="s2">&#34;&#34;</span><span class="p">),</span> <span class="n">domain</span><span class="p">,</span> <span class="n">cookie</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;path&#34;</span><span class="p">,</span> <span class="s2">&#34;/&#34;</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">key</span> <span class="ow">in</span> <span class="n">seen</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="k">continue</span>
</span></span><span class="line"><span class="cl">        <span class="n">seen</span><span class="o">.</span><span class="n">add</span><span class="p">(</span><span class="n">key</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">host_cookies</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">cookie</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1"># Join into the HTTP Cookie header format</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="s2">&#34;; &#34;</span><span class="o">.</span><span class="n">join</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">        <span class="sa">f</span><span class="s2">&#34;</span><span class="si">{</span><span class="n">cookie</span><span class="p">[</span><span class="s1">&#39;name&#39;</span><span class="p">]</span><span class="si">}</span><span class="s2">=</span><span class="si">{</span><span class="n">cookie</span><span class="p">[</span><span class="s1">&#39;value&#39;</span><span class="p">]</span><span class="si">}</span><span class="s2">&#34;</span>
</span></span><span class="line"><span class="cl">        <span class="k">for</span> <span class="n">cookie</span> <span class="ow">in</span> <span class="n">host_cookies</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">cookie</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;name&#34;</span><span class="p">)</span> <span class="ow">and</span> <span class="n">cookie</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;value&#34;</span><span class="p">)</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl">    <span class="p">)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The two sources complement each other: <code>Network.getAllCookies</code> over CDP returns cookies flagged <code>HttpOnly</code> (the Cloudflare <code>cf_clearance</code> cookie is one of them), and the built-in DrissionPage API serves as a fallback.</p>
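<p>To see what the filtering and deduplication do, the same logic can be run on hand-written cookie dictionaries. The helper name <code>merge_cookies</code>, the hosts, and the cookie values below are invented for this sketch; only the matching and dedup rules come from the code above.</p>

```python
def domain_matches(host, domain):
    """Return True if host equals the cookie domain or is a subdomain of it."""
    normalized = domain.lstrip(".")
    return bool(normalized) and (host == normalized or host.endswith(f".{normalized}"))


def merge_cookies(cookies, host):
    """Filter cookies by host, drop (name, domain, path) duplicates, join as a Cookie header."""
    seen = set()
    pairs = []
    for cookie in cookies:
        domain = cookie.get("domain", "")
        if not domain_matches(host, domain):
            continue
        key = (cookie.get("name", ""), domain, cookie.get("path", "/"))
        if key in seen:
            continue
        seen.add(key)
        if cookie.get("name") and cookie.get("value") is not None:
            pairs.append(f"{cookie['name']}={cookie['value']}")
    return "; ".join(pairs)


# Hand-written sample data; the second entry mimics the same cookie arriving from both sources.
sample = [
    {"name": "cf_clearance", "value": "abc", "domain": ".example.com", "path": "/"},
    {"name": "cf_clearance", "value": "abc", "domain": ".example.com", "path": "/"},
    {"name": "_t", "value": "xyz", "domain": "forum.example.com", "path": "/"},
    {"name": "other", "value": "1", "domain": "unrelated.test", "path": "/"},
]
header = merge_cookies(sample, "forum.example.com")
print(header)
```

<p>Because the first occurrence wins, a cookie returned by CDP takes precedence over the same cookie returned by the second source.</p>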
<h3 id="4-收集会话信息完整流程">
<a class="header-anchor" href="#4-%e6%94%b6%e9%9b%86%e4%bc%9a%e8%af%9d%e4%bf%a1%e6%81%af%e5%ae%8c%e6%95%b4%e6%b5%81%e7%a8%8b"></a>
4. Collect session information (full flow)
</h3><p>Chain the steps above together: open the browser → wait for verification → poll for cookies → check that the cookies work:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
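<pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">urllib.parse</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">COOKIE_WAIT_SECONDS</span> <span class="o">=</span> <span class="mi">180</span>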
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">collect_site_session</span><span class="p">(</span>
</span></span><span class="line"><span class="cl">    <span class="n">site_url</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">proxy_url</span><span class="p">:</span> <span class="nb">str</span> <span class="o">|</span> <span class="kc">None</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">    <span class="n">validation_url</span><span class="p">:</span> <span class="nb">str</span> <span class="o">|</span> <span class="kc">None</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span>
</span></span><span class="line"><span class="cl"><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">tuple</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">str</span><span class="p">]:</span>
</span></span><span class="line"><span class="cl">    <span class="n">host</span> <span class="o">=</span> <span class="n">urllib</span><span class="o">.</span><span class="n">parse</span><span class="o">.</span><span class="n">urlparse</span><span class="p">(</span><span class="n">site_url</span><span class="p">)</span><span class="o">.</span><span class="n">hostname</span> <span class="ow">or</span> <span class="s2">&#34;&#34;</span>
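</span></span><span class="line"><span class="cl">    <span class="n">options</span> <span class="o">=</span> <span class="n">build_browser_options</span><span class="p">(</span><span class="n">proxy_url</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">opener</span> <span class="o">=</span> <span class="n">build_opener</span><span class="p">(</span><span class="n">proxy_url</span><span class="p">)</span>  <span class="c1"># used by probe_session below; build_opener is defined in step 5</span>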
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">browser</span> <span class="o">=</span> <span class="kc">None</span>
</span></span><span class="line"><span class="cl">    <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">browser</span> <span class="o">=</span> <span class="n">Chromium</span><span class="p">(</span><span class="n">options</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">tab</span> <span class="o">=</span> <span class="n">browser</span><span class="o">.</span><span class="n">latest_tab</span>
</span></span><span class="line"><span class="cl">        <span class="n">tab</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">site_url</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">        <span class="n">wait_for_page</span><span class="p">(</span><span class="n">tab</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">        <span class="n">deadline</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span> <span class="o">+</span> <span class="n">COOKIE_WAIT_SECONDS</span>
</span></span><span class="line"><span class="cl">        <span class="k">while</span> <span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span> <span class="o">&lt;</span> <span class="n">deadline</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">time</span><span class="o">.</span><span class="n">sleep</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="n">cookie_header</span> <span class="o">=</span> <span class="n">extract_cookie_header</span><span class="p">(</span><span class="n">browser</span><span class="p">,</span> <span class="n">tab</span><span class="p">,</span> <span class="n">host</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="ow">not</span> <span class="n">cookie_header</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="k">continue</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="c1"># Read the real User-Agent from the browser</span>
</span></span><span class="line"><span class="cl">            <span class="k">try</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="n">user_agent</span> <span class="o">=</span> <span class="n">tab</span><span class="o">.</span><span class="n">run_js</span><span class="p">(</span><span class="s2">&#34;return navigator.userAgent&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">            <span class="k">except</span> <span class="ne">Exception</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="n">user_agent</span> <span class="o">=</span> <span class="s2">&#34;Mozilla/5.0 ...&#34;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="c1"># If a validation URL is given, try downloading one image first to check the cookies</span>
</span></span><span class="line"><span class="cl">            <span class="k">if</span> <span class="n">validation_url</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                <span class="n">ok</span><span class="p">,</span> <span class="n">reason</span> <span class="o">=</span> <span class="n">probe_session</span><span class="p">(</span><span class="n">opener</span><span class="p">,</span> <span class="n">validation_url</span><span class="p">,</span> <span class="n">cookie_header</span><span class="p">,</span> <span class="n">user_agent</span><span class="p">,</span> <span class="n">site_url</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">                <span class="k">if</span> <span class="n">ok</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">                    <span class="k">return</span> <span class="n">cookie_header</span><span class="p">,</span> <span class="n">user_agent</span>
</span></span><span class="line"><span class="cl">                <span class="k">continue</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">            <span class="k">return</span> <span class="n">cookie_header</span><span class="p">,</span> <span class="n">user_agent</span>
</span></span><span class="line"><span class="cl">    <span class="k">finally</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="k">if</span> <span class="n">browser</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">            <span class="n">browser</span><span class="o">.</span><span class="n">quit</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="s2">&#34;&#34;</span><span class="p">,</span> <span class="s2">&#34;Mozilla/5.0 ...&#34;</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Key points:</p>
<ul>
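<li>The timeout is 180 seconds, leaving enough time to click through a CAPTCHA by hand</li>
<li>Cookies are polled every 3 seconds; when a validation URL is given, <code>probe_session</code> immediately tries to download one image to confirm the cookies work</li>
<li><code>tab.run_js(&quot;return navigator.userAgent&quot;)</code> reads the real UA from the browser, so later requests use the same UA as the session that obtained the cookies</li>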
</ul>
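<p>The deadline-plus-interval pattern in the loop above can be factored into a generic helper. The sketch below is an assumption-laden illustration: <code>poll_until</code> and <code>fake_fetch</code> are hypothetical names, the short timings are chosen so the demo finishes quickly, and unlike the original loop it checks before sleeping rather than after.</p>

```python
import time


def poll_until(fetch, timeout_seconds, interval_seconds):
    """Call fetch() repeatedly until it returns a truthy value or the deadline passes."""
    deadline = time.monotonic() + timeout_seconds
    while deadline > time.monotonic():
        result = fetch()
        if result:
            return result
        time.sleep(interval_seconds)
    return None  # timed out without a usable result


attempts = []


def fake_fetch():
    # Stand-in for extract_cookie_header: empty twice, then a placeholder cookie string.
    attempts.append(1)
    return "cf_clearance=PLACEHOLDER" if len(attempts) >= 3 else ""


result = poll_until(fake_fetch, timeout_seconds=2.0, interval_seconds=0.01)
print(result, len(attempts))
```

<p>With the values from this post, the call would use a 180-second timeout and a 3-second interval.</p>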
<h3 id="5-带-cookie-下载图片">
<a class="header-anchor" href="#5-%e5%b8%a6-cookie-%e4%b8%8b%e8%bd%bd%e5%9b%be%e7%89%87"></a>
5. Download images with the cookies
</h3><div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">ssl</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">urllib.parse</span>
</span></span><span class="line"><span class="cl"><span class="kn">import</span> <span class="nn">urllib.request</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">build_opener</span><span class="p">(</span><span class="n">proxy_url</span><span class="p">:</span> <span class="nb">str</span> <span class="o">|</span> <span class="kc">None</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">urllib</span><span class="o">.</span><span class="n">request</span><span class="o">.</span><span class="n">OpenerDirector</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">ssl_context</span> <span class="o">=</span> <span class="n">ssl</span><span class="o">.</span><span class="n">create_default_context</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="n">ssl_context</span><span class="o">.</span><span class="n">check_hostname</span> <span class="o">=</span> <span class="kc">False</span>
</span></span><span class="line"><span class="cl">    <span class="n">ssl_context</span><span class="o">.</span><span class="n">verify_mode</span> <span class="o">=</span> <span class="n">ssl</span><span class="o">.</span><span class="n">CERT_NONE</span>  <span class="c1"># skip TLS certificate verification (insecure)</span>
</span></span><span class="line"><span class="cl">    <span class="n">handlers</span> <span class="o">=</span> <span class="p">[</span><span class="n">urllib</span><span class="o">.</span><span class="n">request</span><span class="o">.</span><span class="n">HTTPSHandler</span><span class="p">(</span><span class="n">context</span><span class="o">=</span><span class="n">ssl_context</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="n">proxy_url</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">handlers</span><span class="o">.</span><span class="n">insert</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">urllib</span><span class="o">.</span><span class="n">request</span><span class="o">.</span><span class="n">ProxyHandler</span><span class="p">({</span>
</span></span><span class="line"><span class="cl">            <span class="s2">&#34;http&#34;</span><span class="p">:</span> <span class="n">proxy_url</span><span class="p">,</span> <span class="s2">&#34;https&#34;</span><span class="p">:</span> <span class="n">proxy_url</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">        <span class="p">}))</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="n">urllib</span><span class="o">.</span><span class="n">request</span><span class="o">.</span><span class="n">build_opener</span><span class="p">(</span><span class="o">*</span><span class="n">handlers</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">download_image</span><span class="p">(</span><span class="n">opener</span><span class="p">,</span> <span class="n">url</span><span class="p">,</span> <span class="n">output_path</span><span class="p">,</span> <span class="n">cookie_header</span><span class="p">,</span> <span class="n">user_agent</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">bool</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="n">request</span> <span class="o">=</span> <span class="n">urllib</span><span class="o">.</span><span class="n">request</span><span class="o">.</span><span class="n">Request</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">request</span><span class="o">.</span><span class="n">add_header</span><span class="p">(</span><span class="s2">&#34;User-Agent&#34;</span><span class="p">,</span> <span class="n">user_agent</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">request</span><span class="o">.</span><span class="n">add_header</span><span class="p">(</span><span class="s2">&#34;Accept&#34;</span><span class="p">,</span> <span class="s2">&#34;image/webp,image/apng,image/*,*/*;q=0.8&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">request</span><span class="o">.</span><span class="n">add_header</span><span class="p">(</span><span class="s2">&#34;Referer&#34;</span><span class="p">,</span> <span class="sa">f</span><span class="s2">&#34;</span><span class="si">{</span><span class="n">urllib</span><span class="o">.</span><span class="n">parse</span><span class="o">.</span><span class="n">urlparse</span><span class="p">(</span><span class="n">url</span><span class="p">)</span><span class="o">.</span><span class="n">scheme</span><span class="si">}</span><span class="s2">://</span><span class="si">{</span><span class="n">urllib</span><span class="o">.</span><span class="n">parse</span><span class="o">.</span><span class="n">urlparse</span><span class="p">(</span><span class="n">url</span><span class="p">)</span><span class="o">.</span><span class="n">netloc</span><span class="si">}</span><span class="s2">/&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="n">request</span><span class="o">.</span><span class="n">add_header</span><span class="p">(</span><span class="s2">&#34;Cookie&#34;</span><span class="p">,</span> <span class="n">cookie_header</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">with</span> <span class="n">opener</span><span class="o">.</span><span class="n">open</span><span class="p">(</span><span class="n">request</span><span class="p">,</span> <span class="n">timeout</span><span class="o">=</span><span class="mi">30</span><span class="p">)</span> <span class="k">as</span> <span class="n">response</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">        <span class="n">data</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
</span></span><span class="line"><span class="cl">        <span class="n">content_type</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">headers</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">&#34;Content-Type&#34;</span><span class="p">,</span> <span class="s2">&#34;&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="k">if</span> <span class="ow">not</span> <span class="n">data</span> <span class="ow">or</span> <span class="s2">&#34;text/html&#34;</span> <span class="ow">in</span> <span class="n">content_type</span><span class="o">.</span><span class="n">lower</span><span class="p">():</span>
</span></span><span class="line"><span class="cl">        <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="sa">f</span><span class="s2">&#34;download failed, content-type=</span><span class="si">{</span><span class="n">content_type</span><span class="si">!r}</span><span class="s2">&#34;</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="n">output_path</span><span class="o">.</span><span class="n">write_bytes</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="k">return</span> <span class="kc">True</span>
</span></span></code></pre></td></tr></table>
</div>
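</div><p>Remember to set the <code>Referer</code> header, since some CDNs validate the request origin. After downloading, check <code>Content-Type</code>: if the response is HTML rather than an image, the cookie has expired or the request was blocked.</p>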
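<p>As a complementary check, the sketch below inspects the first bytes of the payload instead of relying on <code>Content-Type</code> alone. The helper name <code>looks_like_image</code> and the signature table are illustrative assumptions, not part of the original script, and the table covers only a few common formats.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"># Hypothetical helper, not part of the original script.
# Magic-number prefixes for a few common image formats.
IMAGE_SIGNATURES = (
    b"\x89PNG\r\n\x1a\n",  # PNG
    b"\xff\xd8\xff",  # JPEG
    b"GIF87a",  # GIF (1987 variant)
    b"GIF89a",  # GIF (1989 variant)
)


def looks_like_image(data):
    """Return True if data starts with a known image signature."""
    if data.startswith(IMAGE_SIGNATURES):
        return True
    # WebP layout: "RIFF", a 4-byte size field, then "WEBP".
    return data[:4] == b"RIFF" and data[8:12] == b"WEBP"
</code></pre></div>
<p>One way to use it would be to call it on <code>data</code> inside <code>download_image</code> just before <code>write_bytes</code>, raising the same <code>ValueError</code> when it returns <code>False</code>.</p>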
<h2 id="安装依赖">
<a class="header-anchor" href="#%e5%ae%89%e8%a3%85%e4%be%9d%e8%b5%96"></a>
Install dependencies
</h2><div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">pip install DrissionPage
</span></span></code></pre></td></tr></table>
</div>
</div><p>DrissionPage automatically locates a Chromium-based browser already installed on the system (Chrome, Edge, etc.), so there is no need to install chromedriver separately.</p>
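<p>Before running the script, a quick preflight check can help. The sketch below uses only the standard library; the function name <code>preflight</code> and the list of executable names are assumptions made for illustration. It is not how DrissionPage itself locates the browser, and on Windows or macOS the browser is often installed outside <code>PATH</code>, so a <code>None</code> result here does not mean DrissionPage will fail.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">import importlib.util
import shutil

# Assumed executable names; real installs may use other names or
# live outside PATH entirely.
CANDIDATE_BROWSERS = (
    "google-chrome",
    "chromium",
    "chromium-browser",
    "chrome",
    "msedge",
    "microsoft-edge",
)


def preflight():
    """Report whether DrissionPage is importable and which browser is on PATH."""
    has_package = importlib.util.find_spec("DrissionPage") is not None
    # Take the first candidate that shutil.which can resolve, if any.
    browser = next(
        (path for path in map(shutil.which, CANDIDATE_BROWSERS) if path),
        None,
    )
    return {"DrissionPage": has_package, "browser": browser}
</code></pre></div>
<p>Calling <code>print(preflight())</code> shows both results at a glance.</p>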
<h2 id="总结">
<a class="header-anchor" href="#%e6%80%bb%e7%bb%93"></a>
Summary
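</h2><p>The whole approach boils down to one sentence: <strong>use a real browser to pass the Cloudflare check, grab the cookies via CDP (Chrome DevTools Protocol), then put those cookies into urllib requests to download the resources</strong>. Compared with options such as undetected-chromedriver, DrissionPage's advantages are a concise API and built-in support for CDP operations.</p>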
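<p>The step of putting cookies into the urllib request can be sketched as follows. The helper name <code>build_cookie_header</code> and its <code>domain_suffix</code> parameter are hypothetical; the sketch assumes the cookies arrive as a list of dicts with <code>name</code>, <code>value</code> and optionally <code>domain</code> keys, which is the general shape of CDP cookie objects.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def build_cookie_header(cookies, domain_suffix=""):
    """Join cookie dicts into a single Cookie header value.

    cookies: iterable of dicts with "name" and "value" keys and an
        optional "domain" key (assumed shape, see the note above).
    domain_suffix: if given, keep only cookies whose domain ends with it.
    """
    pairs = []
    for cookie in cookies:
        domain = cookie.get("domain", "")
        if domain_suffix and not domain.endswith(domain_suffix):
            continue  # cookie belongs to another site, skip it
        pairs.append(f"{cookie['name']}={cookie['value']}")
    return "; ".join(pairs)
</code></pre></div>
<p>The returned string is what <code>download_image</code> expects as <code>cookie_header</code>, for example <code>build_cookie_header(cookies, "linux.do")</code>.</p>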

        
        <hr><p>First published on 2026-03-09 at <a href='https://csdn.fjh1997.top/'>猫猫鱼的小窝</a>; last modified on 2026-03-09.</p>]]>
      </description>
      
    </item>
    
  </channel>
</rss>
