<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <author>
    <name>insidentally</name>
  </author>
  <generator uri="https://hexo.io/">Hexo</generator>
  <id>https://www.insidentally.com/</id>
  <link href="https://www.insidentally.com/" rel="alternate"/>
  <link href="https://www.insidentally.com/atom.xml" rel="self"/>
  <rights>All rights reserved 2026, insidentally</rights>
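  <subtitle>Thunder rolls beneath heaven, and all things are without falsehood; thus the ancient kings, abundant in virtue and in step with the seasons, nurtured all beings. Start from your own circumstances, harbor no undue ambitions, keep your feet on the ground, work diligently, watch your conduct, and guard against unexpected misfortune. Do not fret over gain and loss; pursue your aim sincerely, act when the moment is right, and the undertaking will succeed.</subtitle>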
  <title>无妄当自持</title>
  <updated>2026-06-02T09:44:03.955Z</updated>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="Hermes Agent" scheme="https://www.insidentally.com/tags/Hermes-Agent/"/>
    <category term="Firecrawl" scheme="https://www.insidentally.com/tags/Firecrawl/"/>
    <category term="SearXNG" scheme="https://www.insidentally.com/tags/SearXNG/"/>
    <category term="Docker" scheme="https://www.insidentally.com/tags/Docker/"/>
    <category term="自托管" scheme="https://www.insidentally.com/tags/%E8%87%AA%E6%89%98%E7%AE%A1/"/>
    <content>
      <![CDATA[<p>When you run Hermes Agent as a local AI assistant, web search and content extraction are must-haves. SearXNG aggregates results from multiple search engines, Firecrawl handles JS-rendered scraping, and both are deployed together with Docker Compose and exposed to Hermes Agent at <code>127.0.0.1:3002</code>.</p><span id="more"></span><h2 id="为什么需要两层？">Why two layers?</h2><p>The short answer: SearXNG alone cannot retrieve the content of dynamically rendered pages, and Firecrawl's search feature on its own depends on an external service. Combined, they cover each other's gaps:</p><table><thead><tr><th>Capability</th><th>SearXNG</th><th>Firecrawl</th></tr></thead><tbody><tr><td>Aggregated search</td><td>✓</td><td>✓ (via SearXNG)</td></tr><tr><td>JS-rendered scraping</td><td>✗</td><td>✓ (Playwright)</td></tr><tr><td>Batch crawling</td><td>✗</td><td>✓</td></tr><tr><td>Structured extraction</td><td>✗</td><td>✓</td></tr></tbody></table><h2 id="架构">Architecture</h2><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">Hermes Agent</span><br><span class="line">  ├── web_search → Firecrawl API (127.0.0.1:3002/v2/search)</span><br><span class="line">  │                    └── SearXNG (in-container searxng:8080)</span><br><span class="line">  │                         ├── Google (via proxy)</span><br><span class="line">  │                         ├── Bing (direct, cn.bing.com)</span><br><span class="line">  │                         └── Baidu (direct)</span><br><span class="line">  │</span><br><span class="line">  └── web_extract → Firecrawl API (127.0.0.1:3002/v1/scrape)</span><br><span class="line">                        └── Playwright renders JS pages</span><br></pre></td></tr></table></figure><p>The containers involved:</p><table><thead><tr><th>Component</th><th>Role</th><th>Port</th></tr></thead><tbody><tr><td>Firecrawl API</td><td>Web scraping + search proxy</td><td>3002 (localhost)</td></tr><tr><td>Firecrawl Playwright</td><td>JS rendering</td><td>container-internal 3000</td></tr><tr><td>SearXNG</td><td>Metasearch engine</td><td>container-internal 8080 (not published to the host by the compose file below)</td></tr><tr><td>Redis</td><td>Firecrawl job queue</td><td>container-internal 
6379</td></tr><tr><td>RabbitMQ</td><td>Firecrawl message queue</td><td>container-internal 5672</td></tr><tr><td>PostgreSQL</td><td>Firecrawl data storage</td><td>container-internal 5432</td></tr></tbody></table><h2 id="部署">Deployment</h2><h3 id="1-创建目录">1. Create the directories</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">mkdir -p /vol1/1000/docker/firecrawl</span><br><span class="line">mkdir -p /vol1/1000/docker/searxng</span><br></pre></td></tr></table></figure><h3 id="2-docker-compose-yaml">2. docker-compose.yaml</h3><p>Create <code>/vol1/1000/docker/firecrawl/docker-compose.yaml</code>:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span 
class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br><span class="line">100</span><br><span class="line">101</span><br></pre></td><td 
class="code"><pre><span class="line"><span class="attr">x-common-service:</span> <span class="string">&amp;common-service</span></span><br><span class="line">  <span class="attr">restart:</span> <span class="string">unless-stopped</span></span><br><span class="line">  <span class="attr">networks:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">backend</span></span><br><span class="line"></span><br><span class="line"><span class="attr">x-common-env:</span> <span class="string">&amp;common-env</span></span><br><span class="line">  <span class="attr">REDIS_URL:</span> <span class="string">$&#123;REDIS_URL:-redis://redis:6379&#125;</span></span><br><span class="line">  <span class="attr">REDIS_RATE_LIMIT_URL:</span> <span class="string">$&#123;REDIS_URL:-redis://redis:6379&#125;</span></span><br><span class="line">  <span class="attr">PLAYWRIGHT_MICROSERVICE_URL:</span> <span class="string">$&#123;PLAYWRIGHT_MICROSERVICE_URL:-http://playwright-service:3000/scrape&#125;</span></span><br><span class="line">  <span class="attr">POSTGRES_USER:</span> <span class="string">$&#123;POSTGRES_USER:-postgres&#125;</span></span><br><span class="line">  <span class="attr">POSTGRES_PASSWORD:</span> <span class="string">&quot;$&#123;POSTGRES_PASSWORD:-postgres&#125;&quot;</span></span><br><span class="line">  <span class="attr">POSTGRES_DB:</span> <span class="string">$&#123;POSTGRES_DB:-postgres&#125;</span></span><br><span class="line">  <span class="attr">POSTGRES_HOST:</span> <span class="string">$&#123;POSTGRES_HOST:-nuq-postgres&#125;</span></span><br><span class="line">  <span class="attr">POSTGRES_PORT:</span> <span class="string">$&#123;POSTGRES_PORT:-5432&#125;</span></span><br><span class="line">  <span class="attr">USE_DB_AUTHENTICATION:</span> <span class="string">$&#123;USE_DB_AUTHENTICATION:-false&#125;</span></span><br><span class="line">  <span class="attr">NUM_WORKERS_PER_QUEUE:</span> <span 
class="string">$&#123;NUM_WORKERS_PER_QUEUE:-8&#125;</span></span><br><span class="line">  <span class="attr">CRAWL_CONCURRENT_REQUESTS:</span> <span class="string">$&#123;CRAWL_CONCURRENT_REQUESTS:-10&#125;</span></span><br><span class="line">  <span class="attr">MAX_CONCURRENT_JOBS:</span> <span class="string">$&#123;MAX_CONCURRENT_JOBS:-5&#125;</span></span><br><span class="line">  <span class="attr">BROWSER_POOL_SIZE:</span> <span class="string">$&#123;BROWSER_POOL_SIZE:-5&#125;</span></span><br><span class="line">  <span class="attr">BULL_AUTH_KEY:</span> <span class="string">$&#123;BULL_AUTH_KEY&#125;</span></span><br><span class="line">  <span class="attr">TEST_API_KEY:</span> <span class="string">$&#123;TEST_API_KEY&#125;</span></span><br><span class="line">  <span class="attr">SEARXNG_ENDPOINT:</span> <span class="string">$&#123;SEARXNG_ENDPOINT&#125;</span></span><br><span class="line"></span><br><span class="line"><span class="attr">networks:</span></span><br><span class="line">  <span class="attr">backend:</span></span><br><span class="line">    <span class="attr">driver:</span> <span class="string">bridge</span></span><br><span class="line"></span><br><span class="line"><span class="attr">services:</span></span><br><span class="line">  <span class="attr">playwright-service:</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">ghcr.io/firecrawl/playwright-service:latest</span></span><br><span class="line">    <span class="attr">environment:</span></span><br><span class="line">      <span class="attr">PORT:</span> <span class="number">3000</span></span><br><span class="line">      <span class="attr">MAX_CONCURRENT_PAGES:</span> <span class="string">$&#123;CRAWL_CONCURRENT_REQUESTS:-10&#125;</span></span><br><span class="line">    <span class="attr">networks:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">backend</span></span><br><span class="line">    <span 
class="attr">restart:</span> <span class="string">unless-stopped</span></span><br><span class="line"></span><br><span class="line">  <span class="attr">api:</span></span><br><span class="line">    <span class="string">&lt;&lt;:</span> <span class="string">*common-service</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">ghcr.io/firecrawl/firecrawl-api:latest</span></span><br><span class="line">    <span class="attr">environment:</span></span><br><span class="line">      <span class="string">&lt;&lt;:</span> <span class="string">*common-env</span></span><br><span class="line">      <span class="attr">HOST:</span> <span class="string">&quot;0.0.0.0&quot;</span></span><br><span class="line">      <span class="attr">PORT:</span> <span class="string">$&#123;INTERNAL_PORT:-3002&#125;</span></span><br><span class="line">      <span class="attr">EXTRACT_WORKER_PORT:</span> <span class="string">$&#123;EXTRACT_WORKER_PORT:-3004&#125;</span></span><br><span class="line">      <span class="attr">WORKER_PORT:</span> <span class="string">$&#123;WORKER_PORT:-3005&#125;</span></span><br><span class="line">    <span class="attr">ports:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;127.0.0.1:3002:3002&quot;</span></span><br><span class="line">    <span class="attr">depends_on:</span></span><br><span class="line">      <span class="attr">redis:</span></span><br><span class="line">        <span class="attr">condition:</span> <span class="string">service_healthy</span></span><br><span class="line">      <span class="attr">rabbitmq:</span></span><br><span class="line">        <span class="attr">condition:</span> <span class="string">service_healthy</span></span><br><span class="line"></span><br><span class="line">  <span class="attr">redis:</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">redis:alpine</span></span><br><span class="line">    <span 
class="attr">restart:</span> <span class="string">unless-stopped</span></span><br><span class="line">    <span class="attr">networks:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">backend</span></span><br><span class="line">    <span class="attr">healthcheck:</span></span><br><span class="line">      <span class="attr">test:</span> [<span class="string">&quot;CMD&quot;</span>, <span class="string">&quot;redis-cli&quot;</span>, <span class="string">&quot;ping&quot;</span>]</span><br><span class="line">      <span class="attr">interval:</span> <span class="string">10s</span></span><br><span class="line">      <span class="attr">timeout:</span> <span class="string">5s</span></span><br><span class="line">      <span class="attr">retries:</span> <span class="number">5</span></span><br><span class="line"></span><br><span class="line">  <span class="attr">rabbitmq:</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">rabbitmq:3-management</span></span><br><span class="line">    <span class="attr">restart:</span> <span class="string">unless-stopped</span></span><br><span class="line">    <span class="attr">networks:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">backend</span></span><br><span class="line">    <span class="attr">healthcheck:</span></span><br><span class="line">      <span class="attr">test:</span> <span class="string">rabbitmq-diagnostics</span> <span class="string">-q</span> <span class="string">ping</span></span><br><span class="line">      <span class="attr">interval:</span> <span class="string">30s</span></span><br><span class="line">      <span class="attr">timeout:</span> <span class="string">30s</span></span><br><span class="line">      <span class="attr">retries:</span> <span class="number">3</span></span><br><span class="line"></span><br><span class="line">  <span class="attr">nuq-postgres:</span></span><br><span 
class="line">    <span class="attr">image:</span> <span class="string">ghcr.io/mendableai/nuq-postgres:latest</span></span><br><span class="line">    <span class="attr">restart:</span> <span class="string">unless-stopped</span></span><br><span class="line">    <span class="attr">environment:</span></span><br><span class="line">      <span class="attr">POSTGRES_PASSWORD:</span> <span class="string">$&#123;POSTGRES_PASSWORD:-postgres&#125;</span></span><br><span class="line">      <span class="attr">POSTGRES_USER:</span> <span class="string">$&#123;POSTGRES_USER:-postgres&#125;</span></span><br><span class="line">      <span class="attr">POSTGRES_DB:</span> <span class="string">$&#123;POSTGRES_DB:-postgres&#125;</span></span><br><span class="line">    <span class="attr">networks:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">backend</span></span><br><span class="line">    <span class="attr">healthcheck:</span></span><br><span class="line">      <span class="attr">test:</span> [<span class="string">&quot;CMD-SHELL&quot;</span>, <span class="string">&quot;pg_isready -U postgres&quot;</span>]</span><br><span class="line">      <span class="attr">interval:</span> <span class="string">10s</span></span><br><span class="line">      <span class="attr">timeout:</span> <span class="string">5s</span></span><br><span class="line">      <span class="attr">retries:</span> <span class="number">5</span></span><br><span class="line"></span><br><span class="line">  <span class="attr">searxng:</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">searxng/searxng:latest</span></span><br><span class="line">    <span class="attr">restart:</span> <span class="string">unless-stopped</span></span><br><span class="line">    <span class="attr">volumes:</span></span><br><span class="line">      <span class="bullet">-</span> <span 
class="string">/vol1/1000/docker/searxng/settings.yml:/etc/searxng/settings.yml:ro</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">/vol1/1000/docker/searxng/limiter.toml:/etc/searxng/limiter.toml:ro</span></span><br><span class="line">    <span class="attr">networks:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">backend</span></span><br><span class="line">    <span class="attr">extra_hosts:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;host.docker.internal:host-gateway&quot;</span></span><br></pre></td></tr></table></figure><h3 id="3-环境变量">3. Environment variables</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line">cd /vol1/1000/docker/firecrawl</span><br><span class="line"></span><br><span class="line">POSTGRES_PASSWORD=$(openssl rand -hex 16)</span><br><span class="line">BULL_AUTH_KEY=$(openssl rand -hex 16)</span><br><span class="line">TEST_API_KEY=$(openssl rand -hex 16)</span><br><span class="line"></span><br><span 
class="line">cat &gt; .env &lt;&lt; EOF</span><br><span class="line">PORT=3002</span><br><span class="line">HOST=0.0.0.0</span><br><span class="line">REDIS_URL=redis://redis:6379</span><br><span class="line">PLAYWRIGHT_MICROSERVICE_URL=http://playwright-service:3000/scrape</span><br><span class="line">POSTGRES_USER=postgres</span><br><span class="line">POSTGRES_PASSWORD=$POSTGRES_PASSWORD</span><br><span class="line">POSTGRES_DB=postgres</span><br><span class="line">POSTGRES_HOST=nuq-postgres</span><br><span class="line">POSTGRES_PORT=5432</span><br><span class="line">USE_DB_AUTHENTICATION=false</span><br><span class="line">NUM_WORKERS_PER_QUEUE=8</span><br><span class="line">CRAWL_CONCURRENT_REQUESTS=10</span><br><span class="line">MAX_CONCURRENT_JOBS=5</span><br><span class="line">BROWSER_POOL_SIZE=5</span><br><span class="line">BULL_AUTH_KEY=$BULL_AUTH_KEY</span><br><span class="line">TEST_API_KEY=$TEST_API_KEY</span><br><span class="line">SEARXNG_ENDPOINT=http://searxng:8080</span><br><span class="line">EOF</span><br><span class="line"></span><br><span class="line">echo &quot;TEST_API_KEY: $TEST_API_KEY&quot;</span><br></pre></td></tr></table></figure><h3 id="4-SearXNG-配置">4. 
SearXNG configuration</h3><p><code>/vol1/1000/docker/searxng/settings.yml</code>:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">use_default_settings:</span> <span class="literal">true</span></span><br><span class="line"><span 
class="attr">general:</span></span><br><span class="line">  <span class="attr">instance_name:</span> <span class="string">&quot;Firecrawl SearXNG&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="attr">search:</span></span><br><span class="line">  <span class="attr">safe_search:</span> <span class="number">0</span></span><br><span class="line">  <span class="attr">autocomplete:</span> <span class="string">&quot;&quot;</span></span><br><span class="line">  <span class="attr">default_lang:</span> <span class="string">&quot;auto&quot;</span></span><br><span class="line">  <span class="attr">formats:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">html</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">json</span></span><br><span class="line"></span><br><span class="line"><span class="attr">server:</span></span><br><span class="line">  <span class="attr">bind_address:</span> <span class="string">&quot;0.0.0.0&quot;</span></span><br><span class="line">  <span class="attr">secret_key:</span> <span class="string">&quot;$(openssl rand -hex 32)&quot;</span>  <span class="comment"># placeholder: YAML does not run shell commands; paste the output of openssl rand -hex 32 here</span></span><br><span class="line">  <span class="attr">limiter:</span> <span class="literal">false</span></span><br><span class="line">  <span class="attr">image_proxy:</span> <span class="literal">true</span></span><br><span class="line"></span><br><span class="line"><span class="attr">ui:</span></span><br><span class="line">  <span class="attr">static_use_hash:</span> <span class="literal">true</span></span><br><span class="line"></span><br><span class="line"><span class="attr">engines:</span></span><br><span class="line">  <span class="comment"># Engines that need a proxy</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">google</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">false</span></span><br><span 
class="line">    <span class="attr">proxies:</span></span><br><span class="line">      <span class="attr">all://:</span></span><br><span class="line">        <span class="bullet">-</span> <span class="string">http://host.docker.internal:7890</span></span><br><span class="line"></span><br><span class="line">  <span class="comment"># Engines reachable directly</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">bing</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">false</span></span><br><span class="line">    <span class="attr">base_url:</span> <span class="string">https://cn.bing.com/</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">baidu</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">false</span></span><br><span class="line"></span><br><span class="line">  <span class="comment"># Disable engines that are unavailable</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">duckduckgo</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">true</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">brave</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">true</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">startpage</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">true</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">wikipedia</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">true</span></span><br><span class="line">  <span 
class="bullet">-</span> <span class="attr">name:</span> <span class="string">wikidata</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">true</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">qwant</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">true</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">mojeek</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">true</span></span><br><span class="line">  <span class="bullet">-</span> <span class="attr">name:</span> <span class="string">yahoo</span></span><br><span class="line">    <span class="attr">disabled:</span> <span class="literal">true</span></span><br></pre></td></tr></table></figure><p><code>/vol1/1000/docker/searxng/limiter.toml</code>：</p><figure class="highlight toml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="section">[botdetection]</span></span><br><span class="line"><span class="attr">ipv4_prefix</span> = <span class="number">32</span></span><br><span class="line"><span class="attr">ipv6_prefix</span> = <span 
class="number">48</span></span><br><span class="line"><span class="attr">trusted_proxies</span> = [</span><br><span class="line">  <span class="string">&#x27;127.0.0.0/8&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;::1&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;172.16.0.0/12&#x27;</span>,</span><br><span class="line">]</span><br><span class="line"></span><br><span class="line"><span class="section">[botdetection.ip_limit]</span></span><br><span class="line"><span class="attr">filter_link_local</span> = <span class="literal">false</span></span><br><span class="line"><span class="attr">link_token</span> = <span class="literal">false</span></span><br><span class="line"></span><br><span class="line"><span class="section">[botdetection.ip_lists]</span></span><br><span class="line"><span class="attr">block_ip</span> = []</span><br><span class="line"><span class="attr">pass_ip</span> = [</span><br><span class="line">  <span class="string">&#x27;127.0.0.0/8&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;::1&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;172.16.0.0/12&#x27;</span>,</span><br><span class="line">  <span class="string">&#x27;10.0.0.0/8&#x27;</span>,</span><br><span class="line">]</span><br></pre></td></tr></table></figure><h3 id="5-启动">5. 
Start the stack</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">cd /vol1/1000/docker/firecrawl</span><br><span class="line">docker compose up -d</span><br><span class="line">docker compose ps</span><br></pre></td></tr></table></figure><p>Verify. The SearXNG check assumes port 8080 of the <code>searxng</code> service is reachable on the host; the compose file above does not publish it, so add a port mapping such as <code>127.0.0.1:8080:8080</code> to that service first if needed:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Firecrawl API</span></span><br><span class="line">curl -s http://127.0.0.1:3002/</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">SearXNG search</span></span><br><span class="line">curl -s &#x27;http://127.0.0.1:8080/search?q=test&amp;format=json&#x27; | head -c 200</span><br></pre></td></tr></table></figure><h3 id="6-配置-Hermes-Agent">6. 
Configure Hermes Agent</h3><p><code>~/.hermes/.env</code>:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">FIRECRAWL_API_URL=http://127.0.0.1:3002</span><br></pre></td></tr></table></figure><p><code>~/.hermes/config.yaml</code>:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">web:</span></span><br><span class="line">  <span class="attr">backend:</span> <span class="string">firecrawl</span></span><br></pre></td></tr></table></figure><p>Restart the gateway:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes gateway restart</span><br></pre></td></tr></table></figure><h2 id="踩坑记录">Pitfalls</h2><h3 id="SearXNG">SearXNG</h3><table><thead><tr><th>Problem</th><th>Cause</th><th>Fix</th></tr></thead><tbody><tr><td>JSON API returns 403</td><td><code>format=json</code> is disabled by default</td><td>Add <code>json</code> to <code>search.formats</code></td></tr><tr><td>Crash on startup with <code>limiter.toml schema invalid</code></td><td>limiter.toml is malformed</td><td>Use the format shown above and mount it at <code>/etc/searxng/limiter.toml</code></td></tr><tr><td>bing returns 0 results</td><td><code>www.bing.com</code> 302-redirects to <code>cn.bing.com</code>, and httpx does not follow the redirect</td><td><code>base_url: https://cn.bing.com/</code></td></tr><tr><td>duckduckgo triggers a CAPTCHA</td><td>DDG is aggressive toward automated requests</td><td>Disable it; there is no reliable workaround</td></tr><tr><td>Lots of timeouts</td><td><code>use_default_settings: true</code> enables all the default engines</td><td>Explicitly disable the ones you do not need</td></tr><tr><td><code>secret_key</code> too short</td><td>32+ characters are required</td><td><code>openssl rand -hex 32</code></td></tr></tbody></table><h3 id="Firecrawl">Firecrawl</h3><table><thead><tr><th>Problem</th><th>Cause</th><th>Fix</th></tr></thead><tbody><tr><td>Search returns empty results</td><td><code>SEARXNG_ENDPOINT</code> is not passed into the container</td><td><code>x-common-env</code> must include 
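<code>SEARXNG_ENDPOINT</code></td></tr><tr><td>Changed environment variables have no effect</td><td>Environment variables are read only when a container is created; <code>docker compose restart</code> does not reload them</td><td><code>docker compose up -d --force-recreate</code></td></tr><tr><td><code>USE_DB_AUTHENTICATION</code> error</td><td>Firecrawl does not recognize the configured value</td><td>Set it to <code>false</code></td></tr><tr><td><a href="http://ghcr.io">ghcr.io</a> image pulls are slow</td><td>Network conditions inside mainland China</td><td>Configure a registry mirror, then <code>docker tag</code> the pulled image back to its original name</td></tr></tbody></table><h3 id="引擎可用性测试">Engine Availability Test</h3><p>After deployment, check each engine individually:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">for engine in google bing baidu; do</span><br><span class="line">  count=$(curl -s &quot;http://127.0.0.1:8080/search?q=test&amp;format=json&amp;engines=$engine&quot; \</span><br><span class="line">    | python3 -c &quot;import sys,json; print(len(json.load(sys.stdin).get(&#x27;results&#x27;,[])))&quot;)</span><br><span class="line">  echo &quot;$engine: $count results&quot;</span><br><span class="line">done</span><br></pre></td></tr></table></figure><h2 id="总结">Summary</h2><p>In this Firecrawl + SearXNG setup, SearXNG aggregates search results from Google, Bing, and Baidu, while Firecrawl handles JS rendering and page scraping. The services run in separate containers and talk to each other over the internal Docker network; the only endpoint Hermes Agent needs is <code>127.0.0.1:3002</code> (SearXNG's port 8080 is also bound locally and is used here only for verification). Most of the deployment pitfalls are in SearXNG's engine configuration: leaving too many engines enabled causes timeouts, bing needs <code>cn.bing.com</code> to avoid the redirect problem, and DuckDuckGo is best disabled outright.</p><p>The whole setup runs locally and exposes nothing to the public internet, which suits scenarios with data-privacy requirements.</p>]]>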
    </content>
    <id>https://www.insidentally.com/articles/000047/</id>
    <link href="https://www.insidentally.com/articles/000047/"/>
    <published>2026-06-02T04:30:00.000Z</published>
    <summary>
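      <![CDATA[<p>When Hermes Agent serves as a local AI assistant, web search and content scraping are must-haves. SearXNG aggregates results from multiple search engines, Firecrawl handles JS-rendered scraping, and both are deployed together with Docker Compose and exposed to Hermes Agent at <code>127.0.0.1:3002</code>.</p>]]>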
    </summary>
    <title>Local Search and Web Scraping for Hermes Agent with Firecrawl + SearXNG</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="NAS" scheme="https://www.insidentally.com/tags/NAS/"/>
    <category term="AI Agent" scheme="https://www.insidentally.com/tags/AI-Agent/"/>
    <category term="自托管" scheme="https://www.insidentally.com/tags/%E8%87%AA%E6%89%98%E7%AE%A1/"/>
    <category term="obsidian" scheme="https://www.insidentally.com/tags/obsidian/"/>
    <category term="mcp" scheme="https://www.insidentally.com/tags/mcp/"/>
    <content>
      <![CDATA[<p>Obsidian is my main note-taking tool; locally it manages the files directly through plugins, and the experience is great. There is one problem: an AI agent (Hermes) runs on my server and needs to read and write my vault, but the Linux server has no desktop environment, so the Obsidian desktop client cannot run there. Fast Note Sync's MCP interface solves this: the AI agent operates on the notes directly over MCP, I keep using Obsidian locally, and the data syncs between the two sides in real time.</p><span id="more"></span><h2 id="两条路径，一个笔记库">Two Paths, One Vault</h2><p>The logic of the whole setup is simple:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">Local (with Obsidian)           Server (no Obsidian)</span><br><span class="line">    ↓                           ↓</span><br><span class="line">Obsidian + FNS plugin           AI Agent (Hermes)</span><br><span class="line">    ↓ WebSocket                 ↓ MCP protocol</span><br><span class="line">Fast Note Sync Service (NAS, Docker)</span><br><span class="line">    ↓ SQLite</span><br><span class="line">Local storage</span><br></pre></td></tr></table></figure><p>Locally, I manage notes directly in the Obsidian client, which syncs to the service in real time. The AI agent on the server reads and writes the same vault over MCP. Both paths operate on the same data, kept consistent in real time over WebSocket.</p><h2 id="为什么选-Fast-Note-Sync">Why Fast Note Sync</h2><table><thead><tr><th>Option</th><th>Sync latency</th><th>Self-hosted</th><th>AI-operable</th><th>Local experience</th></tr></thead><tbody><tr><td>Obsidian Sync</td><td>Milliseconds</td><td>No</td><td>No</td><td>Full</td></tr><tr><td>iCloud / Nutstore (坚果云)</td><td>Seconds to minutes</td><td>No</td><td>No</td><td>Full</td></tr><tr><td>Git sync</td><td>Manual</td><td>Yes</td><td>No</td><td>Full</td></tr><tr><td>Fast Note Sync</td><td>Milliseconds</td><td>Yes</td><td>Native MCP</td><td>Full</td></tr></tbody></table><p>The key difference: among the options compared here, Fast Note Sync is the only one that is self-hosted, syncs in real time, and is AI-operable all at once. Native MCP support means no extra glue code is needed; it works once configured.</p><h2 id="部署服务端">Deploying the Server</h2><p>The server is written in Go, and Docker is the easiest way to deploy it.</p><h3 id="Docker-Compose">Docker Compose</h3><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span 
class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># /vol1/docker/fast-note-sync/docker-compose.yml</span></span><br><span class="line"><span class="attr">services:</span></span><br><span class="line">  <span class="attr">fast-note-sync-service:</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">ghcr.io/haierkeys/fast-note-sync-service:latest</span></span><br><span class="line">    <span class="attr">container_name:</span> <span class="string">fast-note-sync-service</span></span><br><span class="line">    <span class="attr">restart:</span> <span class="string">always</span></span><br><span class="line">    <span class="attr">ports:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;9000:9000&quot;</span></span><br><span class="line">    <span class="attr">volumes:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">./storage:/fast-note-sync/storage</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">./config:/fast-note-sync/config</span></span><br></pre></td></tr></table></figure><h3 id="服务端配置">Server Configuration</h3><p>Key parameters in <code>config/config.yaml</code>:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span 
class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">server:</span></span><br><span class="line">  <span class="attr">run-mode:</span> <span class="string">release</span></span><br><span class="line">  <span class="attr">http-port:</span> <span class="string">&quot;:9000&quot;</span></span><br><span class="line">  <span class="attr">ext-api-url:</span> <span class="string">&quot;https://your-nas.example.com:&lt;custom-port&gt;&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="attr">app:</span></span><br><span class="line">  <span class="attr">soft-delete-retention-time:</span> <span class="string">&quot;7d&quot;</span></span><br><span class="line">  <span class="attr">sync-log-retention-time:</span> <span class="string">&quot;30d&quot;</span></span><br><span class="line">  <span class="attr">history-keep-versions:</span> <span class="number">100</span></span><br><span class="line">  <span class="attr">ws-compression-enabled:</span> <span class="literal">true</span></span><br><span class="line">  <span class="attr">fts-bleve-enabled:</span> <span class="literal">true</span>  <span class="comment"># full-text search</span></span><br><span class="line"></span><br><span class="line"><span class="attr">database:</span></span><br><span class="line">  <span class="attr">type:</span> <span class="string">sqlite</span></span><br><span class="line">  <span class="attr">path:</span> <span class="string">storage/database/db.sqlite3</span></span><br><span class="line">  <span class="attr">auto-migrate:</span> <span class="literal">true</span></span><br><span class="line"></span><br><span class="line"><span class="attr">security:</span></span><br><span class="line">  <span class="attr">token-expiry:</span> <span 
class="string">&quot;365d&quot;</span></span><br><span class="line">  <span class="attr">webgui-login-token-bind-ip:</span> <span class="literal">false</span>  <span class="comment"># must be off behind a reverse proxy</span></span><br></pre></td></tr></table></figure><blockquote><p><code>webgui-login-token-bind-ip: false</code> is the key setting: behind a reverse proxy the client IP can change, and leaving this option on causes frequent logouts.</p></blockquote><h2 id="反向代理与-SSL">Reverse Proxy and SSL</h2><p>On fnOS (飞牛 OS), ports 80 and 443 are taken by trim_nginx, so Caddy listens on a free port and terminates TLS there:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">&#123;</span><br><span class="line">    auto_https off  # required, otherwise Caddy grabs port 80</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">your-nas.example.com:&lt;custom-port&gt; &#123;</span><br><span class="line">    tls /path/to/certs/fullchain.crt /path/to/certs/cert.key</span><br><span class="line"></span><br><span class="line">    @websockets &#123;</span><br><span class="line">        header Connection *Upgrade*</span><br><span class="line">        header Upgrade    websocket</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    reverse_proxy @websockets localhost:9000 &#123;</span><br><span class="line">        header_up X-Real-IP &#123;remote_host&#125;</span><br><span class="line">    &#125;</span><br><span 
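class="line"></span><br><span class="line">    reverse_proxy localhost:9000 &#123;</span><br><span class="line">        header_up X-Real-IP &#123;remote_host&#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><strong>Two easy-to-miss pitfalls</strong>:</p><ol><li><strong><code>auto_https off</code> is required</strong>: otherwise Caddy tries to listen on port 80 and conflicts with the existing service.</li><li><strong>WebSocket upgrades need their own matcher</strong>: FNS uses WebSocket on <code>/api/user/sync</code>, and Caddy needs the <code>@websockets</code> matcher to forward the <code>Upgrade</code> header correctly.</li></ol><h3 id="证书自动续期">Automatic Certificate Renewal</h3><p>The SSL certificate path on fnOS contains a timestamped directory that changes on every renewal. A symlink plus a systemd timer handles this: the job reads the currently active certificate path from the system configuration, checks it every 6 hours, and when the certificate has been renewed it rebuilds the symlink and tells Caddy to reload.</p><h2 id="本地配置-Obsidian-客户端">Configuring the Local Obsidian Client</h2><p>This is the main path for day-to-day use:</p><ol><li>Open <code>https://your-nas.example.com:&lt;custom-port&gt;/webgui</code> and register the administrator account</li><li>In the Web UI, under "笔记库" (vaults), create or select the default vault</li><li>Click "一键授权 Obsidian" (one-click Obsidian authorization) to get the authorization config</li><li>Install the Fast Note Sync plugin in Obsidian on each device and paste in the authorization config</li></ol><p>The plugin watches every note creation, update, and deletion in the vault and syncs them in real time over WebSocket. Changes made while offline are merged automatically after reconnecting, so no data is lost. Day-to-day note-taking and knowledge-base upkeep all happen inside Obsidian, and syncing goes unnoticed.</p><h2 id="接入-AI-Agent（MCP）">Connecting the AI Agent (MCP)</h2><p>This is the key to the server-side problem: the server has no Obsidian, but the AI agent needs to read and write notes. Fast Note Sync supports MCP natively over the StreamableHTTP transport and works out of the box.</p><h3 id="获取-API-Token">Getting an API Token</h3><p>Web GUI → Settings (设置) → API Token → create a new token (创建新 Token), with the permission set to <code>p:rest</code>.</p><h3 id="安装-MCP-SDK">Installing the MCP SDK</h3><p>The Hermes Agent venv does not include the MCP SDK by default:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">uv pip install --python ~/.hermes/hermes-agent/venv/bin/python mcp</span><br></pre></td></tr></table></figure><h3 id="配置-Hermes">Configuring Hermes</h3><p>Edit <code>~/.hermes/config.yaml</code> and add:</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span 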
class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">mcp_servers:</span></span><br><span class="line">  <span class="attr">fns:</span></span><br><span class="line">    <span class="attr">url:</span> <span class="string">&quot;http://127.0.0.1:9000/api/mcp&quot;</span></span><br><span class="line">    <span class="attr">headers:</span></span><br><span class="line">      <span class="attr">Authorization:</span> <span class="string">&quot;Bearer &lt;API_Token&gt;&quot;</span></span><br><span class="line">      <span class="attr">X-Default-Vault-Name:</span> <span class="string">&quot;wiki&quot;</span></span><br><span class="line">    <span class="attr">timeout:</span> <span class="number">180</span></span><br></pre></td></tr></table></figure><p>After a restart, 25 <code>mcp_fns_*</code> tools are injected automatically:</p><table><thead><tr><th>Category</th><th>Example tools</th></tr></thead><tbody><tr><td>Note read/write</td><td>note_list, note_get, note_create_or_update, note_append</td></tr><tr><td>File operations</td><td>file_list, file_read, file_write</td></tr><tr><td>Vault management</td><td>vault_list, vault_create_or_update</td></tr><tr><td>Recycle bin</td><td>note_restore, file_restore</td></tr></tbody></table><h3 id="使用效果">In Practice</h3><p>Once configured, the AI agent can operate on the vault directly:</p><ul><li>“List all notes in the wiki vault”</li><li>“Read a given note and summarize its key points”</li><li>“Create today's schedule in the daily directory”</li><li>“Append this article to the end of the project log”</li></ul><p>Notes modified over MCP sync to every Obsidian client in real time, so opening Obsidian locally shows what the AI wrote. Conversely, notes edited locally can be read by the AI.</p><h2 id="注意事项">Notes</h2><ul><li>MCP tools are injected when a session starts, so open a new session after finishing the configuration</li><li>Local Obsidian edits and server-side MCP operations are two independent paths that act on the same data</li><li>The WebSocket endpoint <code>/api/user/sync</code> needs its Upgrade header handled correctly by the reverse proxy</li><li>The token is used only locally and is never exposed to the public internet</li><li>In the <code>user-database</code> config, the <code>port</code> field must be an integer; an empty string causes a startup error</li></ul><h2 id="总结">Summary</h2><p>The core idea: manage notes locally in Obsidian (the full experience), let the AI operate on them from the server over MCP (no desktop environment needed), and use Fast Note Sync as the middle layer that provides real-time sync. Each path plays to its strengths, and the data stays consistent.</p><h2 id="参考资料">References</h2><ul><li><a href="https://github.com/haierkeys/obsidian-fast-note-sync">Fast Note 
Sync Plugin</a>: Obsidian plugin source code</li><li><a href="https://github.com/haierkeys/fast-note-sync-service">Fast Note Sync Service</a>: server source code and deployment docs</li><li><a href="https://community.obsidian.md/plugins/fast-note-sync">Fast Note Sync - Obsidian Community Plugins</a>: official plugin page</li></ul>]]>
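<![CDATA[<p>The 6-hour certificate check described under automatic renewal could be scheduled with a systemd timer along the following lines. This is only a sketch: the unit names and the script path <code>/usr/local/bin/fns-cert-relink.sh</code> are hypothetical, and the script itself (reading the active certificate path, rebuilding the symlink, reloading Caddy) is not shown.</p><figure class="highlight ini"><table><tr><td class="code"><pre><span class="line"># /etc/systemd/system/fns-cert-relink.timer (hypothetical name)</span><br><span class="line">[Unit]</span><br><span class="line">Description=Check every 6 hours whether the TLS certificate was renewed</span><br><span class="line"></span><br><span class="line">[Timer]</span><br><span class="line"># First run shortly after boot, then 6 hours after each previous run</span><br><span class="line">OnBootSec=5min</span><br><span class="line">OnUnitActiveSec=6h</span><br><span class="line"></span><br><span class="line">[Install]</span><br><span class="line">WantedBy=timers.target</span><br><span class="line"></span><br><span class="line"># /etc/systemd/system/fns-cert-relink.service (hypothetical name)</span><br><span class="line">[Unit]</span><br><span class="line">Description=Relink the active certificate and reload Caddy</span><br><span class="line"></span><br><span class="line">[Service]</span><br><span class="line">Type=oneshot</span><br><span class="line"># Hypothetical script, not part of the original post</span><br><span class="line">ExecStart=/usr/local/bin/fns-cert-relink.sh</span><br></pre></td></tr></table></figure><p>Because a timer activates the service of the same name by default, <code>systemctl enable --now fns-cert-relink.timer</code> would be enough to start the schedule under these assumptions.</p>]]>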
    </content>
    <id>https://www.insidentally.com/articles/000048/</id>
    <link href="https://www.insidentally.com/articles/000048/"/>
    <published>2026-06-02T02:00:00.000Z</published>
    <summary>
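      <![CDATA[<p>Obsidian is my main note-taking tool; locally it manages the files directly through plugins, and the experience is great. There is one problem: an AI agent (Hermes) runs on my server and needs to read and write my vault, but the Linux server has no desktop environment, so the Obsidian desktop client cannot run there. Fast Note Sync's MCP interface solves this: the AI agent operates on the notes directly over MCP, I keep using Obsidian locally, and the data syncs between the two sides in real time.</p>]]>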
    </summary>
    <title>Letting a Server-Side AI Read and Write Your Obsidian Vault: From Self-Hosted Sync to MCP Integration</title>
    <updated>2026-06-02T10:59:21.525Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="llama.cpp" scheme="https://www.insidentally.com/tags/llama-cpp/"/>
    <category term="CUDA" scheme="https://www.insidentally.com/tags/CUDA/"/>
    <category term="Qwen3.5" scheme="https://www.insidentally.com/tags/Qwen3-5/"/>
    <category term="本地大模型" scheme="https://www.insidentally.com/tags/%E6%9C%AC%E5%9C%B0%E5%A4%A7%E6%A8%A1%E5%9E%8B/"/>
    <content>
      <![CDATA[<p>I previously wrote up how to compile llama.cpp with CUDA support on Fedora 44. This post records the full experience of running the Qwen 3.5 9B model with those binaries, including the pitfalls and the performance results.</p><span id="more"></span><h2 id="硬件与模型">Hardware and Model</h2><table><thead><tr><th>Item</th><th>Configuration</th></tr></thead><tbody><tr><td>CPU</td><td>AMD Ryzen 7 7840HS</td></tr><tr><td>GPU</td><td>NVIDIA RTX 4060 Max-Q 8GB</td></tr><tr><td>Memory</td><td>32GB DDR5</td></tr><tr><td>OS</td><td>Fedora 44 (Linux 7.0.9)</td></tr><tr><td>Model</td><td>Qwopus3.5-9B-coder-Exp-IQ4_XS.gguf (5.2GB)</td></tr><tr><td>Vision projector</td><td>mmproj.gguf (921MB)</td></tr></tbody></table><p>The model is a 9B-parameter coding variant of the Qwen 3.5 family, about 5.2GB after IQ4_XS quantization. Qwen 3.5 is a hybrid architecture that combines standard Transformer attention with recurrent, state-space-style layers (Gated DeltaNet linear attention, similar in spirit to a Mamba SSM). Only some layers (3, 7, 11, 15, 19, 23, 27, and 31) use full attention; the rest are recurrent layers. This design makes the KV cache far smaller than in a pure Transformer model.</p><p>The build output lives in <code>/home/insidentally/Documents/shell/llama-cpp/bin/</code>, and the model files in <code>/home/insidentally/Documents/shell/llama-cpp/model/</code>.</p><h2 id="启动服务">Starting the Server</h2><p>The most basic launch command:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">cd /home/insidentally/Documents/shell/llama-cpp/bin</span><br><span class="line">export LD_LIBRARY_PATH=&quot;$(pwd):/usr/local/lib/ollama/cuda_v13:$LD_LIBRARY_PATH&quot;</span><br><span class="line"></span><br><span class="line">./llama-server \</span><br><span class="line">  -m /home/insidentally/Documents/shell/llama-cpp/model/Qwopus3.5-9B-coder-Exp-IQ4_XS.gguf \</span><br><span class="line">  -ngl 99 \</span><br><span class="line">  --host 0.0.0.0 \</span><br><span class="line">  --port 8080 \</span><br><span class="line">  -c 4096 \</span><br><span class="line">  --flash-attn 
on</span><br></pre></td></tr></table></figure><h3 id="踩坑-1：CUDA-runtime-找不到">Pitfall 1: CUDA runtime not found</h3><p>The compiled llama-server depends on <code>libcudart.so.13</code>, but no standalone CUDA Toolkit is installed on the system. The CUDA libraries on this machine come from the copy bundled with Ollama, located in <code>/usr/local/lib/ollama/cuda_v13/</code>, so that path has to be added to <code>LD_LIBRARY_PATH</code> explicitly.</p><h3 id="踩坑-2：-flash-attn-参数格式变了">Pitfall 2: the <code>--flash-attn</code> argument format changed</h3><p>In earlier llama.cpp versions, <code>--flash-attn</code> was a boolean switch that took no argument. The current version (b9161) requires an explicit <code>on</code>, <code>off</code>, or <code>auto</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Old form, now rejected</span></span><br><span class="line">--flash-attn</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">New form</span></span><br><span class="line">--flash-attn on</span><br></pre></td></tr></table></figure><h2 id="上下文长度的权衡">Context Length Trade-offs</h2><p>The model natively supports a context length of 262144 (256K), but 8GB of VRAM is a hard constraint. I tested several configurations:</p><table><thead><tr><th>Config</th><th>Context</th><th>Vision model</th><th>GPU memory used</th><th>Runs?</th></tr></thead><tbody><tr><td>A</td><td>4096</td><td>Loaded</td><td>6534 MiB</td><td>OK</td></tr><tr><td>B</td><td>65536 (64K)</td><td>Not loaded</td><td>7336 MiB</td><td>OK</td></tr><tr><td>C</td><td>65536 (64K)</td><td>Loaded</td><td>-</td><td>OOM crash</td></tr></tbody></table><p>A 64K context together with the mmproj vision model exceeds the available VRAM. I settled on configuration B: 64K context with the vision model not loaded.</p><p>For the hybrid Qwen 3.5 architecture, the recurrent (SSM-style) layers use O(1) memory (a fixed-size state), and only the attention layers' KV cache grows linearly with context length. That is what makes a 64K context feasible in 8GB of VRAM. A pure-Transformer 9B model would probably top out at around 8K-16K with the same VRAM.</p><h2 id="性能测试">Performance Test</h2><p>Tested through the OpenAI-compatible API:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span 
class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">curl -s http://localhost:8080/v1/chat/completions \</span><br><span class="line">  -H &quot;Content-Type: application/json&quot; \</span><br><span class="line">  -d &#x27;&#123;</span><br><span class="line">    &quot;model&quot;: &quot;qwen3.5-9b&quot;,</span><br><span class="line">    &quot;messages&quot;: [&#123;&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;你好，请用中文简单介绍一下你自己，50字以内。&quot;&#125;],</span><br><span class="line">    &quot;max_tokens&quot;: 500,</span><br><span class="line">    &quot;temperature&quot;: 0.7</span><br><span class="line">  &#125;&#x27;</span><br></pre></td></tr></table></figure><p>Results:</p><table><thead><tr><th>Metric</th><th>Value</th></tr></thead><tbody><tr><td>Prompt processing speed</td><td>433.6 tokens/s</td></tr><tr><td>Token generation speed</td><td>46.3 tokens/s</td></tr><tr><td>Per-token generation latency</td><td>21.6 ms</td></tr><tr><td>Prompt processing latency</td><td>2.3 ms/token</td></tr></tbody></table><p>A generation speed of 46 tokens/s is a good result for a laptop GPU; in everyday chat the latency is barely noticeable.</p><h2 id="注意事项">Notes</h2><p>Qwen 3.5 enables thinking mode by default: the model first writes its reasoning to the <code>reasoning_content</code> field and then the actual reply to the <code>content</code> field. If <code>max_tokens</code> is set too low, every token may be spent on thinking and the reply comes back empty. A value of at least 300-500 is recommended.</p><h2 id="服务管理">Service Management</h2><p>Once started, the server can be used in the following ways:</p><ul><li>Chat page: <code>http://localhost:8080</code> (the built-in WebUI; it is unavailable if your llama.cpp build was compiled without the WebUI)</li><li>OpenAI-compatible API: <code>http://localhost:8080/v1/chat/completions</code></li><li>Health check: <code>http://localhost:8080/health</code></li></ul><p>To run it in the background, use <code>nohup</code> or a systemd user service.</p><h2 id="总结">Summary</h2><p>Running a 9B-parameter hybrid-architecture model on a laptop with 8GB of VRAM went better than expected. IQ4_XS quantization strikes a good balance between size and quality, and the memory efficiency of the hybrid architecture makes a 64K context possible. The main limitations are that the vision model and a large context cannot be enabled together, and that the quantization level should not go any lower (quality drops noticeably).</p><p>If vision is needed, switch back to the 4K context + mmproj configuration. Each setup has its trade-offs; choose according to actual needs.</p>]]>
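<![CDATA[<p>For the systemd user service mentioned under service management, a minimal sketch could look like the unit below. It reuses the paths and flags from this post with configuration B (64K context, no vision model); the unit file name <code>~/.config/systemd/user/llama-server.service</code>, the description text, and the restart policy are assumptions for illustration rather than the setup actually used here.</p><figure class="highlight ini"><table><tr><td class="code"><pre><span class="line"># ~/.config/systemd/user/llama-server.service (hypothetical file name)</span><br><span class="line">[Unit]</span><br><span class="line">Description=llama.cpp server for Qwen 3.5 9B (64K context, no mmproj)</span><br><span class="line"></span><br><span class="line">[Service]</span><br><span class="line">WorkingDirectory=/home/insidentally/Documents/shell/llama-cpp/bin</span><br><span class="line"># Same library path as the manual launch, so libcudart.so.13 resolves</span><br><span class="line">Environment=LD_LIBRARY_PATH=/home/insidentally/Documents/shell/llama-cpp/bin:/usr/local/lib/ollama/cuda_v13</span><br><span class="line">ExecStart=/home/insidentally/Documents/shell/llama-cpp/bin/llama-server -m /home/insidentally/Documents/shell/llama-cpp/model/Qwopus3.5-9B-coder-Exp-IQ4_XS.gguf -ngl 99 --host 0.0.0.0 --port 8080 -c 65536 --flash-attn on</span><br><span class="line"># Assumed policy: restart when the server exits abnormally</span><br><span class="line">Restart=on-failure</span><br><span class="line"></span><br><span class="line">[Install]</span><br><span class="line">WantedBy=default.target</span><br></pre></td></tr></table></figure><p>Under those assumptions, <code>systemctl --user daemon-reload</code> followed by <code>systemctl --user enable --now llama-server.service</code> would start the server, and a request to <code>http://localhost:8080/health</code> confirms it is up.</p>]]>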
    </content>
    <id>https://www.insidentally.com/articles/000045/</id>
    <link href="https://www.insidentally.com/articles/000045/"/>
    <published>2026-05-24T15:30:00.000Z</published>
    <summary>
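      <![CDATA[<p>I previously wrote up how to compile llama.cpp with CUDA support on Fedora 44. This post records the full experience of running the Qwen 3.5 9B model with those binaries, including the pitfalls and the performance results.</p>]]>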
    </summary>
    <title>Running Qwen 3.5 9B on an RTX 4060 Laptop: A llama.cpp Local Deployment Log</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="Fedora" scheme="https://www.insidentally.com/tags/Fedora/"/>
    <category term="AI" scheme="https://www.insidentally.com/tags/AI/"/>
    <category term="llama.cpp" scheme="https://www.insidentally.com/tags/llama-cpp/"/>
    <category term="CUDA" scheme="https://www.insidentally.com/tags/CUDA/"/>
    <category term="编译" scheme="https://www.insidentally.com/tags/%E7%BC%96%E8%AF%91/"/>
    <content>
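      <![CDATA[<p>llama.cpp is the go-to inference engine for running large models locally, but the official prebuilt releases for Linux do not include CUDA support. This post documents the complete process of compiling llama.cpp with CUDA 13.2 from source on Fedora 44 inside a Toolbox container, covering environment setup, dependency fixes, and performance tuning.</p><span id="more"></span><h3 id="为什么选择自己编译？">Why Compile It Yourself?</h3><p>After trying several approaches, I decided to compile llama.cpp from source, mainly for the following reasons:</p><table><thead><tr><th>Option</th><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td>llama.cpp prebuilt</td><td>No compilation</td><td><strong>No official Linux CUDA build</strong></td></tr><tr><td>Ollama</td><td>One-step install, automatic CUDA</td><td>A wrapper layer, somewhat heavy</td></tr><tr><td>Vulkan backend</td><td>No CUDA Toolkit needed</td><td>Roughly 20-30% performance loss</td></tr><tr><td>Build from source</td><td>Best performance, customizable</td><td>Dependency issues to resolve</td></tr></tbody></table><p><strong>Key finding</strong>: the official llama.cpp Linux releases ship only CPU, Vulkan, ROCm, and SYCL backends, with <strong>no CUDA backend</strong>. Prebuilt CUDA binaries exist only for Windows. So running llama.cpp on an NVIDIA GPU under Linux means either compiling from source or accepting the Vulkan performance loss.</p><h3 id="本机配置">Machine Configuration</h3><p>My development environment:</p><table><thead><tr><th>Item</th><th>Configuration</th></tr></thead><tbody><tr><td>OS</td><td>Fedora 44 Workstation (kernel 7.0.9-204.fc44.x86_64)</td></tr><tr><td>CPU</td><td>AMD Ryzen 7 7840HS (Zen 4, 8 cores / 16 threads)</td></tr><tr><td>GPU</td><td>NVIDIA GeForce RTX 4060 Laptop (8GB GDDR6)</td></tr><tr><td>Memory</td><td>32GB DDR5</td></tr><tr><td>CUDA driver</td><td>595.71.05 (supports CUDA 13.2)</td></tr></tbody></table><p><strong>Key constraints</strong>: 8GB of VRAM dictates the model quantization choice, and CUDA 13.2 requires a specific compiler version.</p><h3 id="编译环境选择：为什么用-Toolbox？">Choosing the Build Environment: Why Toolbox?</h3><p>Compiling directly on the host pollutes the system, while a Docker container is heavier than needed. Fedora Toolbox is a good middle ground:</p><ul><li><strong>Package isolation</strong>: system packages are installed inside the container, which avoids dependency conflicts with the host</li><li><strong>File sharing</strong>: the host <code>$HOME</code> directory is mounted automatically, so model files are easy to reach</li><li><strong>GPU passthrough</strong>: the NVIDIA driver is passed through automatically; only the CUDA runtime needs to be set up</li><li><strong>Lightweight</strong>: built on Podman, quick to start and light on resources</li></ul><h3 id="详细编译步骤">Detailed Build Steps</h3><h4 id="1-创建-Toolbox-容器">1. 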
Create the Toolbox Container</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Create a Fedora 44 toolbox container</span></span><br><span class="line">toolbox create --assumeyes --image registry.fedoraproject.org/fedora-toolbox:44 --container fedora-toolbox-44-cuda</span><br><span class="line"></span><br><span class="line"><span class="comment"># Enter the container</span></span><br><span class="line">toolbox enter --container fedora-toolbox-44-cuda</span><br></pre></td></tr></table></figure><h4 id="2-安装基础开发工具">2. Install Basic Development Tools</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Bring installed packages up to date</span></span><br><span class="line"><span class="built_in">sudo</span> dnf distro-sync</span><br><span class="line"></span><br><span class="line"><span class="comment"># Install the build toolchain</span></span><br><span class="line"><span class="built_in">sudo</span> dnf install @c-development @development-tools cmake</span><br></pre></td></tr></table></figure><h4 id="3-配置-CUDA-Toolkit-13-2">3. 
Set Up CUDA Toolkit 13.2</h4><h5 id="3-1-添加-NVIDIA-CUDA-仓库">3.1 Add the NVIDIA CUDA Repository</h5><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Add the Fedora 43 CUDA repository (compatible with Fedora 44)</span></span><br><span class="line"><span class="built_in">sudo</span> dnf config-manager addrepo --from-repofile=https://developer.download.nvidia.com/compute/cuda/repos/fedora43/x86_64/cuda-fedora43.repo</span><br><span class="line"></span><br><span class="line"><span class="comment"># Sync repository metadata</span></span><br><span class="line"><span class="built_in">sudo</span> dnf distro-sync</span><br></pre></td></tr></table></figure><h5 id="3-2-处理-NVIDIA-驱动透传">3.2 Handle NVIDIA Driver Passthrough</h5><p>Because the host already has the NVIDIA driver installed, <code>libcuda.so.1</code> is mounted into the container automatically. To avoid file conflicts, we only register the driver packages in the RPM database without actually installing their files:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Download the driver RPM packages</span></span><br><span class="line"><span class="built_in">sudo</span> dnf download --destdir=/tmp/nvidia-driver-libs --resolve --<span class="built_in">arch</span> x86_64 \</span><br><span class="line">    nvidia-driver-cuda nvidia-driver-libs nvidia-driver-cuda-libs nvidia-persistenced</span><br><span class="line"></span><br><span class="line"><span class="comment"># Update the RPM database only (no files are installed)</span></span><br><span class="line"><span class="built_in">sudo</span> rpm --install --verbose --<span class="built_in">hash</span> --justdb /tmp/nvidia-driver-libs/*</span><br></pre></td></tr></table></figure><h5 id="3-3-安装-CUDA-Toolkit">3.3 Install the CUDA Toolkit</h5><figure 
class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Install the CUDA Toolkit metapackage</span></span><br><span class="line"><span class="built_in">sudo</span> dnf install cuda-toolkit</span><br><span class="line"></span><br><span class="line"><span class="comment"># Set up environment variables</span></span><br><span class="line"><span class="built_in">sudo</span> sh -c <span class="string">&#x27;echo &quot;export PATH=\$PATH:/usr/local/cuda/bin&quot; &gt;&gt; /etc/profile.d/cuda.sh&#x27;</span></span><br><span class="line"><span class="built_in">sudo</span> <span class="built_in">chmod</span> +x /etc/profile.d/cuda.sh</span><br><span class="line"><span class="built_in">source</span> /etc/profile.d/cuda.sh</span><br></pre></td></tr></table></figure><p>Verify the installation:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">nvcc --version</span><br><span class="line"><span class="comment"># The output should include: Cuda compilation tools, release 13.2</span></span><br></pre></td></tr></table></figure><h4 id="4-解决-GCC-版本兼容性问题">4. 
Fix the GCC Version Incompatibility</h4><p>CUDA 13.2 does not support the default GCC 16, so GCC 15 has to be installed:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Install GCC 15 and G++ 15</span></span><br><span class="line"><span class="built_in">sudo</span> dnf install gcc15 gcc15-c++</span><br><span class="line"></span><br><span class="line"><span class="comment"># Make NVCC use GCC 15 as its host compiler</span></span><br><span class="line"><span class="built_in">export</span> NVCC_CCBIN=<span class="string">&#x27;g++-15&#x27;</span></span><br><span class="line"><span class="built_in">echo</span> <span class="string">&#x27;export NVCC_CCBIN=&quot;g++-15&quot;&#x27;</span> &gt;&gt; ~/.bashrc</span><br></pre></td></tr></table></figure><h4 id="5-获取并编译-llama-cpp">5. Fetch and Compile llama.cpp</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Clone the source (a shallow clone is faster)</span></span><br><span class="line">git <span class="built_in">clone</span> --depth 1 https://github.com/ggml-org/llama.cpp.git</span><br><span class="line"><span class="built_in">cd</span> llama.cpp</span><br><span class="line"></span><br><span class="line"><span class="comment"># Create the build directory</span></span><br><span class="line"><span class="built_in">mkdir</span> build &amp;&amp; 
<span class="built_in">cd</span> build</span><br><span class="line"></span><br><span class="line"><span class="comment"># Configure CMake (CUDA enabled, WebUI disabled)</span></span><br><span class="line">cmake .. \</span><br><span class="line">    -DGGML_CUDA=ON \</span><br><span class="line">    -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc \</span><br><span class="line">    -DLLAMA_BUILD_WEBUI=OFF</span><br><span class="line"></span><br><span class="line"><span class="comment"># Build (using all CPU cores)</span></span><br><span class="line">cmake --build . --config Release -j$(<span class="built_in">nproc</span>)</span><br></pre></td></tr></table></figure><blockquote><p>The dependencies the WebUI needs download very slowly, so I simply disabled the WebUI.</p></blockquote><h4 id="6-验证安装">6. Verify the Build</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Check the generated binaries</span></span><br><span class="line"><span class="built_in">ls</span> -la ./bin/llama-*</span><br><span class="line"></span><br><span class="line"><span class="comment"># Check for CUDA/GPU options</span></span><br><span class="line">./bin/llama-cli --<span class="built_in">help</span> | grep -i <span class="string">&quot;gpu\|cuda&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># Verify that the CUDA libraries are linked</span></span><br><span class="line">ldd ./bin/libggml-cuda.so.0.11.1 | grep -i cuda</span><br></pre></td></tr></table></figure><h3 id="常见问题与解决方案">Common Problems and Fixes</h3><h4 id="1-GCC-版本不兼容">1. Incompatible GCC Version</h4><p><strong>Error</strong>: <code>#error -- unsupported GNU version! gcc versions later than 15 are not supported!</code></p><p><strong>Fix</strong>: install GCC 15 and set <code>NVCC_CCBIN='g++-15'</code></p><h4 id="2-驱动透传问题">2. 
Driver passthrough issues</h4><p><strong>Symptom</strong>: the GPU is not accessible inside the container</p><p><strong>Fix</strong>: make sure the host NVIDIA driver is working, then restart the container: <code>podman restart fedora-toolbox-44-cuda</code></p><h4 id="3-WebUI-下载失败">3. WebUI download failure</h4><p><strong>Error</strong>: <code>WebUI: failed to download assets from HF Bucket</code></p><p><strong>Fix</strong>: add <code>-DLLAMA_BUILD_WEBUI=OFF</code> when configuring the build to disable the WebUI</p><h4 id="4-模型加载错误">4. Model loading error</h4><p><strong>Error</strong>: <code>missing tensor 'blk.32.ssm_conv1d.weight'</code></p><p><strong>Cause</strong>: some models (such as Qwopus3.5-9B-Coder-MTP) use an MTP (multi-token prediction) architecture and need a specific version of llama.cpp.</p><p><strong>Fix</strong>: use a branch that supports MTP, or pick a non-MTP variant of the model.</p><h3 id="性能对比：容器内-vs-主机">Performance comparison: container vs. host</h3><table><thead><tr><th>Metric</th><th>Built in a container</th><th>Built directly on the host</th></tr></thead><tbody><tr><td>Environment isolation</td><td>✅ Fully isolated</td><td>❌ Depends on system packages</td></tr><tr><td>Driver compatibility</td><td>✅ Automatic passthrough</td><td>✅ Native support</td></tr><tr><td>Build speed</td><td>Slightly slower (container overhead)</td><td>Fast</td></tr><tr><td>Impact on the system</td><td>None</td><td>May pollute the system</td></tr></tbody></table><h3 id="使用建议">Usage recommendations</h3><h4 id="1-模型推理">1. Model inference</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Start the API server</span></span><br><span class="line">./bin/llama-server \</span><br><span class="line">    -m /path/to/model.gguf \</span><br><span class="line">    -ngl 99 \</span><br><span class="line">    --host 0.0.0.0 \</span><br><span class="line">    --port 8080</span><br><span class="line"></span><br><span class="line"><span class="comment"># Command-line inference</span></span><br><span class="line">./bin/llama-cli \</span><br><span class="line">    -m /path/to/model.gguf
\</span><br><span class="line">    -p <span class="string">&quot;your prompt&quot;</span> \</span><br><span class="line">    -ngl 99</span><br></pre></td></tr></table></figure><h4 id="2-性能调优">2. Performance tuning</h4><ul><li><strong>VRAM management</strong>: adjust <code>-ngl</code> to fit your VRAM (20-30 layers is recommended for an RTX 4060 with 8GB)</li><li><strong>Context length</strong>: use <code>-c 4096</code> to set a suitable context length</li><li><strong>Flash Attention</strong>: enable <code>-fa on</code> to speed up inference</li><li><strong>Quantization</strong>: Q4_K_M is a good balance between quality and size for 8GB of VRAM</li></ul><h4 id="3-容器管理">3. Container management</h4><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Re-enter quickly after leaving the container</span></span><br><span class="line">toolbox enter --container fedora-toolbox-44-cuda</span><br><span class="line"></span><br><span class="line"><span class="comment"># Stop the container</span></span><br><span class="line">podman stop fedora-toolbox-44-cuda</span><br><span class="line"></span><br><span class="line"><span class="comment"># Start it again</span></span><br><span class="line">podman start fedora-toolbox-44-cuda</span><br></pre></td></tr></table></figure><h3 id="总结">Summary</h3><p>This build gave me:</p><ol><li><strong>Fully compatible CUDA 13.2 support</strong> that makes full use of the RTX 4060</li><li><strong>A clean, isolated development environment</strong> that does not affect the stability of the host system</li><li><strong>The latest llama.cpp</strong>, with support for the newest models and features</li><li><strong>A reproducible build process</strong> that makes later updates and maintenance easier</li></ol><p><strong>Key takeaways</strong>:</p><ul><li>In my experience, a Toolbox container is the most practical way to do CUDA development on Fedora</li><li>Version compatibility problems (such as the GCC version) need to be sorted out up front</li><li>Building it yourself takes more steps, but you get the best performance and the latest features</li></ul><p>For anyone who wants to run large models on their own machine, I strongly recommend building llama.cpp from source. The process is somewhat involved, but the gains in performance and flexibility are worth it.</p><blockquote><p>References:<br><a href="https://github.com/ggml-org/llama.cpp">llama.cpp official documentation</a><br><a href="https://docs.nvidia.com/cuda/cuda-installation-guide-linux/">NVIDIA
CUDA Installation Guide</a><br><a href="https://docs.fedoraproject.org/en-US/fedora-silverblue/toolbox/">Fedora Toolbox documentation</a><br><a href="https://www.datacamp.com/tutorial/multi-token-prediction-llama-cpp">Multi-Token Prediction Tutorial</a></p></blockquote>]]>
    </content>
    <id>https://www.insidentally.com/articles/000044/</id>
    <link href="https://www.insidentally.com/articles/000044/"/>
    <published>2026-05-24T14:00:00.000Z</published>
    <summary>
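      <![CDATA[<p>llama.cpp is a go-to inference engine for running large models locally, but the official prebuilt Linux releases do not ship with CUDA support. This post documents the full process of building llama.cpp from source with CUDA 13.2 support on Fedora 44 inside a Toolbox container, covering environment setup, dependency fixes, and performance tuning.</p>]]>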
    </summary>
    <title>Building llama.cpp with CUDA Support on Fedora 44: A Complete Guide</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="AI" scheme="https://www.insidentally.com/tags/AI/"/>
    <category term="Hermes" scheme="https://www.insidentally.com/tags/Hermes/"/>
    <category term="WebUI" scheme="https://www.insidentally.com/tags/WebUI/"/>
    <category term="安装部署" scheme="https://www.insidentally.com/tags/%E5%AE%89%E8%A3%85%E9%83%A8%E7%BD%B2/"/>
    <content>
      <![CDATA[<h2 id="背景">Background</h2><p>Hermes Agent is driven from the terminal by default, but a terminal is not intuitive for many people. <a href="https://github.com/nesquena/hermes-webui">hermes-webui</a> is a community-maintained, dedicated web frontend that offers streaming chat, tool-call cards, session management, file browsing, and more. It is currently the most polished browser client for Hermes.</p><p>This post covers installing, configuring, and using hermes-webui day to day.</p><span id="more"></span><h2 id="hermes-webui-是什么">What hermes-webui is</h2><p>First, let's distinguish three things:</p><table><thead><tr><th></th><th>hermes-webui</th><th>hermes dashboard</th><th>Open WebUI</th></tr></thead><tbody><tr><td><strong>Positioning</strong></td><td>Dedicated chat frontend</td><td>Built-in admin panel</td><td>General-purpose LLM frontend</td></tr><tr><td><strong>Repository</strong></td><td><a href="https://github.com/nesquena/hermes-webui">nesquena/hermes-webui</a></td><td>Built-in command <code>hermes dashboard</code></td><td><a href="https://github.com/open-webui/open-webui">open-webui/open-webui</a></td></tr><tr><td><strong>Chat experience</strong></td><td>Streaming output, tool cards, Mermaid diagrams</td><td>Embedded terminal TUI</td><td>Generic chat interface</td></tr><tr><td><strong>Hermes-specific features</strong></td><td>Tool-call cards, approval flow, subagent cards</td><td>Config editing, Cron, skills, logs</td><td>None (it is unaware of Hermes)</td></tr><tr><td><strong>Best for</strong></td><td>Everyday chat + workspace</td><td>Administration/operations/configuration</td><td>Comparing multiple models</td></tr></tbody></table><p><strong>Recommended combination</strong>: hermes-webui for chat, <code>hermes dashboard</code> for administration. The two complement each other.</p><h2 id="安装">Installation</h2><h3 id="前置条件">Prerequisites</h3><ul><li>Python 3 (the one that ships with the system is fine)</li><li>Hermes Agent already installed (<code>hermes doctor</code> runs cleanly)</li></ul><h3 id="手动安装（推荐）">Manual installation (recommended)</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span> ~</span><br><span class="line">git <span class="built_in">clone</span> https://github.com/nesquena/hermes-webui.git hermes-webui</span><br><span class="line"><span class="built_in">cd</span> hermes-webui</span><br><span class="line">python3 -m venv .venv</span><br><span class="line">.venv/bin/pip install pyyaml python-dotenv
requests</span><br></pre></td></tr></table></figure><p>There are only three dependencies and no build step; the frontend is plain JavaScript.</p><h3 id="启动">Starting the server</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">python3 server.py</span><br></pre></td></tr></table></figure><p>It listens on <code>http://127.0.0.1:8787</code> by default.</p><h3 id="关于-bootstrap-py">About bootstrap.py</h3><p>The repository provides a one-step script, <code>python3 bootstrap.py</code>, which tries to detect or install Hermes Agent automatically. If Hermes is already installed on the system, the script may hang at the agent discovery stage. <strong>The manual steps above are more reliable.</strong></p><h2 id="配置">Configuration</h2><p>Behavior is controlled through environment variables:</p><table><thead><tr><th>Variable</th><th>Default</th><th>Description</th></tr></thead><tbody><tr><td><code>HERMES_WEBUI_HOST</code></td><td>127.0.0.1</td><td>Listen address</td></tr><tr><td><code>HERMES_WEBUI_PORT</code></td><td>8787</td><td>Port</td></tr><tr><td><code>HERMES_WEBUI_STATE_DIR</code></td><td>~/.hermes/webui</td><td>State/data directory</td></tr><tr><td><code>HERMES_WEBUI_PASSWORD</code></td><td>(none)</td><td>Authentication password (recommended when exposed to a network)</td></tr><tr><td><code>HERMES_WEBUI_AGENT_DIR</code></td><td>Auto-detected</td><td>Path to the hermes-agent source</td></tr></tbody></table><p>Example: change the port and set a password</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">HERMES_WEBUI_PORT=9000 \</span><br><span class="line">HERMES_WEBUI_PASSWORD=mysecret \</span><br><span class="line">python3 server.py</span><br></pre></td></tr></table></figure><h2 id="后台运行">Running in the background</h2><h3 id="ctl-sh-守护进程">ctl.sh daemon</h3><p>The repository ships with a <code>ctl.sh</code> script:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cd</span>
~/hermes-webui</span><br><span class="line">./ctl.sh start        <span class="comment"># start in the background; the PID is written to ~/.hermes/webui.pid</span></span><br><span class="line">./ctl.sh status       <span class="comment"># show PID, uptime, port, and health status</span></span><br><span class="line">./ctl.sh logs --lines 100</span><br><span class="line">./ctl.sh restart</span><br><span class="line">./ctl.sh stop</span><br></pre></td></tr></table></figure><h3 id="systemd-用户服务（开机自启）">systemd user service (autostart)</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="built_in">cat</span> &gt; ~/.config/systemd/user/hermes-webui.service &lt;&lt; <span class="string">&#x27;EOF&#x27;</span></span><br><span class="line">[Unit]</span><br><span class="line">Description=Hermes WebUI</span><br><span class="line">After=network.target</span><br><span class="line"></span><br><span class="line">[Service]</span><br><span class="line">Type=simple</span><br><span class="line">WorkingDirectory=%h/hermes-webui</span><br><span class="line">ExecStart=%h/hermes-webui/.venv/bin/python3 server.py</span><br><span class="line">Restart=on-failure</span><br><span class="line">RestartSec=5</span><br><span class="line"></span><br><span class="line">[Install]</span><br><span class="line">WantedBy=default.target</span><br><span class="line">EOF</span><br><span class="line"></span><br><span class="line">systemctl
--user daemon-reload</span><br><span class="line">systemctl --user <span class="built_in">enable</span> --now hermes-webui</span><br></pre></td></tr></table></figure><p>Check the status:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">systemctl --user status hermes-webui</span><br><span class="line">journalctl --user -u hermes-webui -f</span><br></pre></td></tr></table></figure><h2 id="功能概览">Feature overview</h2><h3 id="聊天界面">Chat interface</h3><ul><li><strong>Streaming output</strong>: replies appear as they are generated rather than all at once at the end</li><li><strong>Tool-call cards</strong>: when Hermes runs tools such as terminal, file, or web, the input and output are shown as cards instead of a pile of raw JSON</li><li><strong>Subagent cards</strong>: subtasks spawned by <code>delegate_task</code> get their own progress display</li><li><strong>Mermaid diagrams</strong>: Mermaid syntax produced by the AI is rendered directly as a diagram</li><li><strong>Approval flow</strong>: the confirmation step before a dangerous command runs can be handled right in the WebUI</li></ul><h3 id="会话管理">Session management</h3><ul><li>Search past sessions (FTS5 full-text search)</li><li>Pin, archive, and export sessions</li><li>Carry context across sessions</li></ul><h3 id="文件浏览器">File browser</h3><ul><li>Browse workspace files</li><li>Detect Git repository status automatically</li><li>View file contents directly</li></ul><h3 id="语音输入">Voice input</h3><p>Supports the Web Speech API for in-browser speech-to-text.</p><h2 id="内置-Dashboard">Built-in Dashboard</h2><p>Hermes Agent ships with its own admin panel, which complements hermes-webui:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">hermes dashboard              <span class="comment"># http://127.0.0.1:9119</span></span><br><span class="line">hermes dashboard --port 8080  <span class="comment"># custom port</span></span><br><span class="line">hermes dashboard --no-open    <span class="comment"># do not open the browser automatically</span></span><br></pre></td></tr></table></figure><p>It needs extra dependencies:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pip install <span
class="string">&#x27;hermes-agent[web,pty]&#x27;</span></span><br></pre></td></tr></table></figure><p>The Dashboard includes: Status, Chat (embedded TUI), Config (visual editing of 150+ fields), API Keys, Sessions, Logs, Analytics, Cron, and Skills.</p><h2 id="踩坑记录">Pitfalls</h2><h3 id="端口被占用">Port already in use</h3><p>If you previously ran <code>server.py</code> by hand, the port may not have been released:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">fuser -k 8787/tcp</span><br></pre></td></tr></table></figure><p>Then start the systemd service.</p><h3 id="缺少依赖">Missing dependencies</h3><p><code>server.py</code> imports <code>run_agent</code>, which needs the three packages <code>pyyaml</code>, <code>python-dotenv</code>, and <code>requests</code>. If any of them is missing, the service still starts but the agent features fail silently. Make sure they are all installed:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">.venv/bin/pip install pyyaml python-dotenv requests</span><br></pre></td></tr></table></figure><h3 id="暴露到网络">Exposing to the network</h3><p>By default it listens only on 127.0.0.1, which is safe. To reach it from other devices:</p><ol><li>Set <code>HERMES_WEBUI_HOST=0.0.0.0</code></li><li><strong>You must set <code>HERMES_WEBUI_PASSWORD</code></strong>, otherwise anyone who can reach the port can read your sessions</li></ol><h2 id="总结">Summary</h2><p>hermes-webui is currently the most polished web frontend for Hermes Agent:</p><ol><li><strong>Lightweight</strong>: three Python dependencies and no build step</li><li><strong>Purpose-built</strong>: tool cards, the approval flow, and subagent display are all specific to Hermes</li><li><strong>Easy to deploy</strong>: ctl.sh or systemd, whichever you prefer</li></ol><p>Paired with the built-in <code>hermes dashboard</code> for administration, it covers both everyday use and operations.</p><h2 id="参考资料">References</h2><ul><li><a href="https://github.com/nesquena/hermes-webui">hermes-webui GitHub repository</a></li><li><a href="https://hermes-agent.nousresearch.com/docs/">Hermes Agent official documentation</a></li><li><a href="https://github.com/NousResearch/hermes-agent">Hermes Agent GitHub repository</a></li></ul>]]>
    </content>
    <id>https://www.insidentally.com/articles/000046/</id>
    <link href="https://www.insidentally.com/articles/000046/"/>
    <published>2026-05-24T06:30:00.000Z</published>
    <summary>
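      <![CDATA[<h2 id="背景">Background</h2>
<p>Hermes Agent is driven from the terminal by default, but a terminal is not intuitive for many people. <a href="https://github.com/nesquena/hermes-webui">hermes-webui</a> is a community-maintained, dedicated web frontend that offers streaming chat, tool-call cards, session management, file browsing, and more. It is currently the most polished browser client for Hermes.</p>
<p>This post covers installing, configuring, and using hermes-webui day to day.</p>]]>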
    </summary>
    <title>Hermes WebUI Installation and Usage Guide</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="Hermes Agent" scheme="https://www.insidentally.com/tags/Hermes-Agent/"/>
    <category term="AI" scheme="https://www.insidentally.com/tags/AI/"/>
    <category term="开源" scheme="https://www.insidentally.com/tags/%E5%BC%80%E6%BA%90/"/>
    <category term="Agent" scheme="https://www.insidentally.com/tags/Agent/"/>
    <content>
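      <![CDATA[<p>Hermes Agent is an open-source AI agent framework from Nous Research. It is model-agnostic and runs on multiple platforms. Unlike conventional AI chat tools, it is self-evolving: the longer you use it, the more skills and memories it accumulates. This post focuses on its core mechanisms and the command tricks I use day to day.</p><span id="more"></span><h3 id="核心机制：它凭什么-越用越聪明">Core mechanism: why it &quot;gets smarter the more you use it&quot;</h3><p>The essence of Hermes fits in one sentence: <strong>let the AI evolve on its own instead of you babysitting it.</strong></p><p>With conventional AI tools, you write the rules, tune the parameters, and add the skills, and the whole process depends on your continued input. Hermes automates rule generation: it distills rules from experience and writes them into the system. You just use it, and it builds up its own structure from that usage.</p><p>At the heart of this &quot;self-evolution&quot; is a five-step closed loop:</p><p><strong>Remember → Summarize → Form a skill → Use the skill → Revise based on feedback</strong></p><p>This loop is not triggered occasionally; it runs after every round of conversation. Each time you use it, it reviews what happened. The review is not &quot;logging the chat&quot; but &quot;distilling experience&quot;: it keeps only what is useful and organizes it.</p><h3 id="三层记忆：不是-存聊天记录-那么简单">Three memory layers: more than &quot;storing chat logs&quot;</h3><p>Many people assume AI memory just means storing chat logs. Hermes uses a three-layer structure, and each layer solves one problem:</p><table><thead><tr><th>Layer</th><th>What it stores</th><th>Where it lives</th></tr></thead><tbody><tr><td>Session memory</td><td>What happened (conversation history)</td><td>SQLite database, loaded on demand</td></tr><tr><td>Persistent memory</td><td>Who you are (preferences, habits)</td><td><code>MEMORY.md</code> / <code>USER.md</code></td></tr><tr><td>Skill memory</td><td>How to do things (task workflows)</td><td><code>~/.hermes/skills/</code></td></tr></tbody></table><p>Session memory is not loaded into the context wholesale; it is queried when needed. A long history therefore does not slow things down, and an overlong context does not degrade the answers.</p><p>Persistent memory distills your habits, preferences, and ways of working. Which coding style you like and which structures you dislike are things it gradually works out.</p><p>Skill memory is the most valuable part. It turns a task workflow into a reusable method that can be invoked directly next time, like going from &quot;solving one problem&quot; to &quot;mastering a class of problems&quot;.</p><h3 id="会话管理命令">Session management commands</h3><p>Starting and resuming sessions are the most basic operations:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span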
class="language-bash">Start an interactive session</span></span><br><span class="line">hermes</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Continue the last conversation</span></span><br><span class="line">hermes --continue</span><br><span class="line">hermes -c</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Resume a specific session</span></span><br><span class="line">hermes --resume &lt;session_id&gt;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">One-off question (without entering interactive mode)</span></span><br><span class="line">hermes chat -q &quot;Write a quicksort algorithm&quot;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">One-off question with a specific model</span></span><br><span class="line">hermes chat -q &quot;Explain the pathophysiology of shock&quot; -m deepseek/deepseek-chat</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Preload a skill at startup</span></span><br><span class="line">hermes -s github-pr-workflow</span><br></pre></td></tr></table></figure><p>Sessions can also be managed from the CLI:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">hermes sessions list          # list all sessions</span><br><span class="line">hermes sessions browse        # browse interactively</span><br><span class="line">hermes sessions rename ID NEW_NAME</span><br><span class="line">hermes sessions export FILENAME  # export a session</span><br><span class="line">hermes sessions delete ID     # delete a session</span><br></pre></td></tr></table></figure><h3
id="对话中的斜杠命令">Slash commands in a conversation</h3><p>Once inside a conversation, slash commands are the most common way to interact. Commands are case-insensitive.</p><h4 id="会话控制">Session control</h4><table><thead><tr><th>Command</th><th>Description</th></tr></thead><tbody><tr><td><code>/new</code></td><td>Start a new conversation</td></tr><tr><td><code>/retry</code></td><td>Retry the last reply</td></tr><tr><td><code>/undo</code></td><td>Undo the last step</td></tr><tr><td><code>/compress</code></td><td>Compress the context manually (triggered automatically when the context is nearly full)</td></tr><tr><td><code>/rollback</code></td><td>Roll back file changes (requires starting with <code>--checkpoints</code>)</td></tr><tr><td><code>/title &lt;title&gt;</code></td><td>Set the session title</td></tr></tbody></table><h4 id="模型与推理">Model and reasoning</h4><table><thead><tr><th>Command</th><th>Description</th></tr></thead><tbody><tr><td><code>/model</code></td><td>Show the current model</td></tr><tr><td><code>/model &lt;provider:model&gt;</code></td><td>Switch models temporarily</td></tr><tr><td><code>/reasoning high</code></td><td>Raise the reasoning effort (for complex tasks)</td></tr><tr><td><code>/reasoning off</code></td><td>Turn reasoning off (saves tokens)</td></tr></tbody></table><h4 id="记忆管理">Memory management</h4><table><thead><tr><th>Command</th><th>Description</th></tr></thead><tbody><tr><td><code>/memory_add &lt;content&gt;</code></td><td>Save information to memory permanently</td></tr><tr><td><code>/memory_remove &lt;content&gt;</code></td><td>Delete a memory</td></tr><tr><td><code>/search_sessions &lt;keyword&gt;</code></td><td>Search past sessions</td></tr><tr><td><code>/recent</code></td><td>Show recent sessions</td></tr></tbody></table><blockquote><p>The value of memory is that you repeat yourself less. Each correction you make is remembered, so the longer you use it, the fewer corrections it needs.</p></blockquote><h4 id="信息查看">Viewing information</h4><table><thead><tr><th>Command</th><th>Description</th></tr></thead><tbody><tr><td><code>/help</code></td><td>Help</td></tr><tr><td><code>/usage</code></td><td>Show token usage and a cost breakdown</td></tr><tr><td><code>/tools</code></td><td>List available tools</td></tr><tr><td><code>/skills</code></td><td>List loaded skills</td></tr><tr><td><code>/insights --days 7</code></td><td>Generate a weekly report: skills learned, most frequent calls, recurring task patterns</td></tr><tr><td><code>/verbose</code></td><td>Cycle the display mode (off → new → all → verbose)</td></tr></tbody></table><blockquote><p><code>/insights</code> works like a &quot;performance review&quot; for your AI employee and lets you watch how it grows.</p></blockquote><h3
id="文件与终端操作">File and terminal operations</h3><table><thead><tr><th>Command</th><th>Description</th></tr></thead><tbody><tr><td><code>/read &lt;file-path&gt;</code></td><td>Read a file (with line numbers)</td></tr><tr><td><code>/write &lt;file-path&gt; &lt;content&gt;</code></td><td>Overwrite a file</td></tr><tr><td><code>/patch &lt;file-path&gt; &lt;old-text&gt; &lt;new-text&gt;</code></td><td>Find and replace</td></tr><tr><td><code>/search &lt;pattern&gt;</code></td><td>Search file contents or file names</td></tr><tr><td><code>/term &lt;command&gt;</code></td><td>Run a shell command</td></tr><tr><td><code>/exec &lt;python-code&gt;</code></td><td>Run Python code</td></tr></tbody></table><h3 id="浏览器操作">Browser operations</h3><table><thead><tr><th>Command</th><th>Description</th></tr></thead><tbody><tr><td><code>/nav &lt;URL&gt;</code></td><td>Navigate to a web page</td></tr><tr><td><code>/click &lt;element&gt;</code></td><td>Click a page element</td></tr><tr><td><code>/type &lt;element&gt; &lt;text&gt;</code></td><td>Type text into an input field</td></tr><tr><td><code>/snap</code></td><td>Take a page snapshot</td></tr><tr><td><code>/vision &lt;question&gt;</code></td><td>Run visual analysis on the current page</td></tr><tr><td><code>/browser connect</code></td><td>Connect to a local browser</td></tr></tbody></table><h3 id="技能系统：会自己进化的-经验文档">Skill system: &quot;experience documents&quot; that evolve on their own</h3><p>A skill is not a static template but an experience document that is revised automatically based on usage feedback.</p><p>A conventional skill is written and edited by people; if nobody touches it, it never changes. A Hermes skill revises itself from your feedback. Say &quot;this part is wrong&quot; and it does not just fix this one instance, it also updates how it will act in the future. A coding skill after one week of use and after three weeks of use are at completely different levels.</p><p>Skill management commands:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">In a conversation</span></span><br><span class="line">/skills                       # list loaded skills</span><br><span class="line">/skill &lt;skill-name&gt;           #
show skill details</span><br><span class="line">/skill_create &lt;skill-name&gt; &lt;content&gt;   # create a custom skill</span><br><span class="line">/skill_patch &lt;skill-name&gt; &lt;old&gt; &lt;new&gt; # update a skill</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">In the CLI</span></span><br><span class="line">hermes skills browse           # browse all available skills</span><br><span class="line">hermes skills search &quot;keyword&quot;   # search skills</span><br><span class="line">hermes skills install SKILL_NAME    # install a skill</span><br><span class="line">hermes skills list             # list installed skills</span><br><span class="line">hermes skills update           # update installed skills</span><br><span class="line">hermes skills publish          # publish your own skill</span><br></pre></td></tr></table></figure><p>Skills cover 16 areas, including GitHub workflows, academic research, creative generation, machine learning, and productivity tools. The most useful habit is to run <code>/skills</code> before starting a task to check whether a ready-made skill already exists.</p><h3 id="定时任务（Cron）">Scheduled tasks (Cron)</h3><p>Hermes has a built-in cron subsystem that accepts task descriptions in natural language:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Create one in a conversation</span></span><br><span class="line">/cron_create &quot;Every day at 8:30 AM, read ~/reports/daily.csv and summarize anomalies&quot;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Manage them from the CLI</span></span><br><span class="line">hermes cron list              # list all scheduled tasks</span><br><span class="line">hermes cron run &lt;job-id&gt;       # run immediately</span><br><span class="line">hermes cron pause &lt;job-id&gt;     # pause</span><br><span class="line">hermes cron resume &lt;job-id&gt;    #
resume</span><br></pre></td></tr></table></figure><p>The time format is flexible: <code>'30m'</code> (30 minutes), <code>'every 2h'</code> (every 2 hours), <code>'0 9 * * *'</code> (9:00 every day).</p><blockquote><p>Safety constraint: a scheduled task cannot recursively create new scheduled tasks, which keeps things from running away.</p></blockquote><h3 id="子代理与并行任务">Subagents and parallel tasks</h3><p>Complex tasks can be delegated to subagents that work in parallel:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">/delegate &lt;goal&gt;           # delegate a single task</span><br><span class="line">/delegate_batch &lt;task-array&gt;  # delegate in bulk (up to 3 in parallel)</span><br></pre></td></tr></table></figure><p>Each subagent has its own context and terminal session, so they do not interfere with each other. At most 3 subagents run concurrently, and their tool permissions are restricted to keep things under control.</p><h3 id="其他实用命令">Other useful commands</h3><h4 id="后台任务">Background tasks</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">/background &lt;prompt&gt;   # run a task in a separate background session</span><br></pre></td></tr></table></figure><p>This suits long-running research tasks: you can do other things while waiting for the result.</p><h4 id="人格切换">Switching personality</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">/personality pirate    # switch to a pirate style</span><br><span class="line">/personality technical # switch to a technical style</span><br><span class="line">/personality default   # restore the default</span><br></pre></td></tr></table></figure><p>You can also edit <code>~/.hermes/SOUL.md</code> to customize the agent's tone and personality.</p><h4 id="语音模式">Voice mode</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Install voice support</span></span><br><span class="line">pip install &quot;hermes-agent[voice]&quot;</span><br><span class="line"><span class="meta
prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Use it in a conversation</span></span><br><span class="line">/voice on    # start voice input</span><br><span class="line">/voice tts   # read replies aloud</span><br></pre></td></tr></table></figure><h4 id="配置管理">Configuration management</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">hermes config              # show the current configuration</span><br><span class="line">hermes config edit         # edit the configuration file</span><br><span class="line">hermes config set KEY VALUE</span><br><span class="line">hermes doctor              # check the environment</span><br><span class="line">hermes doctor --fix        # check and fix automatically</span><br></pre></td></tr></table></figure><h4 id="多实例隔离">Multi-instance isolation</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">hermes profile create NAME   # create an isolated space (config, memory, and skills are kept separate)</span><br><span class="line">hermes profile use NAME      # switch</span><br></pre></td></tr></table></figure><h3 id="键盘快捷键">Keyboard shortcuts</h3><table><thead><tr><th>Key</th><th>Description</th></tr></thead><tbody><tr><td><code>Enter</code></td><td>Send the message</td></tr><tr><td><code>Alt+Enter</code> / <code>Ctrl+J</code></td><td>New line (multi-line input)</td></tr><tr><td><code>Ctrl+C</code></td><td>Interrupt the agent (press twice within 2 seconds to force quit)</td></tr><tr><td><code>Ctrl+D</code></td><td>Exit</td></tr><tr><td><code>Tab</code></td><td>Accept an autocomplete suggestion or slash command</td></tr></tbody></table><h3 id="状态栏说明">Status bar</h3><p>The status bar at the bottom of the chat screen shows, in real time:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">mimo-v2.5-pro | 12.4K/200K | [██████░░░░] 6% | $0.06 | 15m</span><br></pre></td></tr></table></figure><p>From left to right: model name / token usage / context progress bar / estimated cost /
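session duration.</p><p>What the colors of the context progress bar mean:</p><ul><li>Green (&lt; 50%): plenty of room</li><li>Yellow (50-80%): filling up</li><li>Orange (80-95%): close to the limit</li><li>Red (≥ 95%): needs compression; use <code>/compress</code></li></ul><h3 id="实用技巧总结">Practical tips</h3><ol><li><strong>Skills first</strong>: run <code>/skills</code> before starting a task to check for existing skills</li><li><strong>Persist memory</strong>: use <code>/memory_add</code> to save key information for reuse across sessions</li><li><strong>Search sessions</strong>: use <code>/search_sessions</code> to find earlier solutions</li><li><strong>Delegate sensibly</strong>: hand complex reasoning tasks to <code>/delegate</code> and do simple operations yourself</li><li><strong>Automate on a schedule</strong>: set up recurring work with <code>/cron_create</code></li><li><strong>File rollback</strong>: start with <code>--checkpoints</code> so a git commit is made automatically before changes, and you can <code>/rollback</code> at any time</li><li><strong>Use <code>/yolo</code> with care</strong>: it skips confirmation for dangerous commands, so be sure to turn it off after debugging</li><li><strong>Docker isolation</strong>: <code>hermes config set terminal.backend docker</code> runs commands in a container, which is safer</li></ol><h3 id="需要注意的问题">Things to watch out for</h3><p>Hermes is highly automated, but it will not &quot;become brilliant with zero supervision&quot;.</p><p>The quality of its evolution depends on how you use it. Vague feedback leads to vague learning; clear requirements make it evolve quickly. Its memory can record things wrongly, misunderstand them, or even form bad habits, so it needs to be checked and corrected now and then.</p><p>This is not fully autonomous driving but &quot;high automation + human supervision&quot;. It is a system that grows, but you are still the one steering.</p><blockquote><p>References:<br><a href="https://hermes-agent.nousresearch.com/docs">Hermes Agent official documentation</a><br><a href="https://github.com/NousResearch/hermes-agent">GitHub - NousResearch/hermes-agent</a><br><a href="https://www.jdon.com/91357-Hermes-Agent-From-Zero-to-Hero.html">Hermes Agent: From Zero to Hero</a><br><a href="https://www.analyticsvidhya.com/blog/2026/05/hermes-agent-guide/">Hermes Agent Guide - Analytics Vidhya</a><br><a href="https://dev.to/truongpx396/hermes-agent-deep-dive-build-your-own-guide-1pcc">Hermes Agent Deep Dive - DEV Community</a><br><a href="https://zhuanlan.zhihu.com/p/2027128115831260939">Hermes Agent Complete Guide - Zhihu</a><br><a href="https://zhuanlan.zhihu.com/p/2029680669119246545">80+ Commands and Usages - Zhihu</a><br><a href="https://www.runoob.com/ai-agent/hermes-agent-cli.html">Hermes Agent CLI - Runoob</a></p></blockquote>]]>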
    </content>
    <id>https://www.insidentally.com/articles/000043/</id>
    <link href="https://www.insidentally.com/articles/000043/"/>
    <published>2026-05-18T08:30:00.000Z</published>
    <summary>
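      <![CDATA[<p>Hermes Agent is an open-source AI agent framework from Nous Research. It is model-agnostic and runs on multiple platforms. Unlike conventional AI chat tools, it is self-evolving: the longer you use it, the more skills and memories it accumulates. This post focuses on its core mechanisms and the command tricks I use day to day.</p>]]>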
    </summary>
    <title>Hermes Agent in Practice: Core Mechanisms and Command Tips</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="AI" scheme="https://www.insidentally.com/tags/AI/"/>
    <category term="大模型" scheme="https://www.insidentally.com/tags/%E5%A4%A7%E6%A8%A1%E5%9E%8B/"/>
    <category term="本地部署" scheme="https://www.insidentally.com/tags/%E6%9C%AC%E5%9C%B0%E9%83%A8%E7%BD%B2/"/>
    <category term="MiMo" scheme="https://www.insidentally.com/tags/MiMo/"/>
    <category term="Ollama" scheme="https://www.insidentally.com/tags/Ollama/"/>
    <content>
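      <![CDATA[<blockquote><p>References:<br><a href="https://huggingface.co/XiaomiMiMo">XiaomiMiMo on Hugging Face</a><br><a href="https://ollama.com">Ollama official documentation</a><br><a href="https://arxiv.org/abs/2505.07608">MiMo-7B-RL technical report</a></p></blockquote><p>Xiaomi's open-source MiMo model family shows strong reasoning ability, with MiMo-7B-RL reportedly even outperforming DeepSeek R1 on mathematical reasoning tasks. This article documents the full process of deploying a MiMo model locally on a consumer laptop: hardware analysis, model selection, a comparison of deployment options, and the problems encountered along the way with their fixes.</p><span id="more"></span><h3 id="硬件环境">Hardware</h3><p>The machine is configured as follows:</p><table><thead><tr><th>Item</th><th>Configuration</th></tr></thead><tbody><tr><td>OS</td><td>Fedora 43 Workstation</td></tr><tr><td>CPU</td><td>x86_64</td></tr><tr><td>GPU</td><td>NVIDIA GeForce RTX 4060 Laptop (8GB VRAM)</td></tr><tr><td>RAM</td><td>16GB</td></tr><tr><td>Disk</td><td>128GB (82GB free)</td></tr><tr><td>CUDA driver</td><td>580.142 (CUDA 13.0)</td></tr></tbody></table><p>The key constraint is <strong>8GB of VRAM</strong>. It directly determines which models can run and which quantization scheme to use.</p><h3 id="模型选型">Model selection</h3><p>Xiaomi has open-sourced several models in the MiMo family:</p><table><thead><tr><th>Model</th><th>Parameters</th><th>Type</th><th>Local deployment feasibility</th></tr></thead><tbody><tr><td>MiMo-7B-RL</td><td>7B</td><td>Text-only reasoning</td><td>✅ Runs on a single GPU</td></tr><tr><td>MiMo-7B-RL-0530</td><td>7B</td><td>Text-only reasoning (enhanced)</td><td>✅ Runs on a single GPU</td></tr><tr><td>MiMo-VL-7B-RL</td><td>7B</td><td>Vision-language</td><td>✅ Runs on a single GPU</td></tr><tr><td>MiMo-Audio-7B-Instruct</td><td>7B</td><td>Audio understanding</td><td>✅ Runs on a single GPU</td></tr><tr><td>MiMo-V2-Flash</td><td>309B (15B active)</td><td>MoE, text-only</td><td>❌ Needs a multi-GPU server</td></tr><tr><td>MiMo-V2.5</td><td>310B (15B active)</td><td>MoE, multimodal</td><td>❌ Needs a multi-GPU server</td></tr></tbody></table><p>V2-Flash and V2.5 are more capable, but even though the MoE design activates only 15B of the 309B parameters, the full weights still need 600GB+ of memory at 16-bit precision, which is far beyond consumer hardware.</p><p><strong>Final choice: MiMo-7B-RL</strong>, for these reasons:</p><ol><li>A 7B model fits comfortably in 8GB of VRAM once quantized</li><li>The RL (reinforcement learning) version has the strongest reasoning ability, scoring 68.2 on the AIME 2024 math competition</li><li>Text-only is enough for this use case; multimodal support is not needed</li></ol><h3 id="量化版本分析">Quantization analysis</h3><p>Architecture parameters of MiMo-7B-RL:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span 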
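class="line">num_attention_heads: 32</span><br><span class="line">num_key_value_heads: 8  (GQA 4:1)</span><br><span class="line">num_hidden_layers: 36</span><br><span class="line">hidden_size: 4096</span><br></pre></td></tr></table></figure><p>The community provides GGUF quantized builds:</p><table><thead><tr><th>Quantization</th><th>Model size</th><th>Quality</th></tr></thead><tbody><tr><td>Q4_K_M</td><td>4.4GB</td><td>Good (recommended)</td></tr><tr><td>Q6_K</td><td>5.8GB</td><td>Better</td></tr><tr><td>Q8_0</td><td>7.5GB</td><td>Close to the original</td></tr><tr><td>FP16</td><td>14.2GB</td><td>Lossless</td></tr></tbody></table><p><strong>KV cache size at 64K context</strong>:</p><p>With GQA, each token stores keys and values for 8 KV heads of dimension 128 (4096 / 32) in each of the 36 layers, which is 2 × 36 × 8 × 128 × 2 bytes = 144 KiB per token at FP16. For a 64K context the KV cache therefore comes to:</p><ul><li>FP16 KV: about 9.0GB → model + KV = 13.4GB, <strong>out of VRAM</strong></li><li>Q8 KV: about 4.5GB → model + KV = 8.9GB, <strong>too tight</strong></li><li>Q4 KV: about 2.3GB → model + KV = 6.7GB, <strong>workable</strong></li></ul><p>Conclusion: <strong>the Q4_K_M model plus a Q4-quantized KV cache</strong> is the only safe way to run a 64K context in 8GB of VRAM.</p><h3 id="部署方案对比">Deployment options compared</h3><p>Four options were considered at first:</p><table><thead><tr><th>Option</th><th>Pros</th><th>Cons</th></tr></thead><tbody><tr><td>llama.cpp built from source</td><td>Best performance, customizable</td><td>Requires the CUDA Toolkit, which is a hassle on Fedora</td></tr><tr><td>llama.cpp prebuilt binaries</td><td>No compilation</td><td><strong>No official Linux CUDA build</strong></td></tr><tr><td>Ollama</td><td>One-step install, automatic CUDA</td><td>A wrapper layer, somewhat heavier</td></tr><tr><td>Transformers + bitsandbytes</td><td>Good Python integration</td><td>Slower inference</td></tr></tbody></table><p><strong>The catch with prebuilt llama.cpp</strong>: I checked every release asset for build b9041. For Linux there are only CPU, Vulkan, ROCm, and SYCL builds and <strong>no CUDA build</strong>. Prebuilt CUDA binaries exist only for Windows. So to run llama.cpp on an NVIDIA GPU under Linux you either build from source or use Vulkan (at some performance cost).</p><p>The final choice was <strong>Ollama</strong>, for these reasons:</p><ol><li>The Fedora 43 repositories ship an <code>ollama</code> package, so <code>dnf install</code> is all it takes</li><li>It detects the NVIDIA GPU automatically and has built-in CUDA support</li><li>Model management is convenient: one command downloads a model</li></ol><h3 id="部署过程">Deployment</h3><h4 id="1-安装-Ollama">1. 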
Install Ollama</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install -y ollama</span><br></pre></td></tr></table></figure><p>The package creates a systemd service automatically, and the GPU was detected correctly.</p><h4 id="2-下载并导入模型">2. Download and import the model</h4><p>Pull the model straight from Hugging Face with <code>ollama pull</code>:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ollama pull hf.co/jedisct1/MiMo-7B-RL-GGUF</span><br></pre></td></tr></table></figure><blockquote><p>If <code>ollama pull</code> runs into network problems, you can download the GGUF file manually with <code>curl</code> and then import it with <code>ollama create</code>.</p></blockquote><h4 id="3-配置-64K-上下文">3. Configure a 64K context</h4><p>Create a Modelfile that sets the context length and other parameters. The <code>FROM</code> line here points at a manually downloaded GGUF file, so adjust the path to match your own:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">cat &gt; /tmp/Modelfile &lt;&lt; &#x27;EOF&#x27;</span><br><span class="line">FROM /home/youruser/MiMo-7B-RL-Q4_K_M.gguf</span><br><span class="line">PARAMETER num_ctx 65536</span><br><span class="line">PARAMETER temperature 0.6</span><br><span class="line">PARAMETER top_p 0.95</span><br><span class="line">EOF</span><br><span class="line"></span><br><span class="line">ollama create mimo-7b-rl -f /tmp/Modelfile</span><br></pre></td></tr></table></figure><h3 id="测试验证">Testing</h3><p>Three tests were run after deployment. Prompts and replies are shown as recorded, in Chinese.</p><p><strong>Test 1: simple question answering</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">&gt;&gt;&gt; 1+1等于几？请直接回答数字</span><br><span 
class="line">2</span><br></pre></td></tr></table></figure><p><strong>Test 2: Chinese knowledge question answering</strong></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">&gt;&gt;&gt; 请解释什么是休克，用3句话概括</span><br><span class="line">休克是一种生理状态，涉及全身血液循环量急剧减少，导致器官供氧不足。</span><br><span class="line">1. 循环血量骤减：外伤、失血或血管收缩等因素导致有效循环血量显著下降</span><br><span class="line">2. 组织灌注不足：血压降低引发重要器官缺血，代谢紊乱加剧</span><br><span class="line">3. 代谢代偿失效：身体通过快速心率等机制短暂维持供氧，但持续失衡将导致多器官衰竭</span><br></pre></td></tr></table></figure><p><strong>Test 3: code generation</strong></p><p>The generated quicksort implementation was complete and usable, and it included test code.</p><p><strong>GPU status</strong>:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">VRAM usage: 5241 MiB / 8188 MiB (64%)</span><br><span class="line">GPU utilization: fully loaded during inference, 0% when idle</span><br></pre></td></tr></table></figure><h3 id="常用命令">Common commands</h3><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Interactive chat</span></span><br><span class="line">ollama run mimo-7b-rl</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span 
class="language-bash">List installed models</span></span><br><span class="line">ollama list</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">API call</span></span><br><span class="line">curl http://localhost:11434/api/generate -d &#x27;&#123;</span><br><span class="line">  &quot;model&quot;: &quot;mimo-7b-rl&quot;,</span><br><span class="line">  &quot;prompt&quot;: &quot;your question&quot;,</span><br><span class="line">  &quot;stream&quot;: false</span><br><span class="line">&#125;&#x27;</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Delete the model</span></span><br><span class="line">ollama rm mimo-7b-rl</span><br></pre></td></tr></table></figure><h3 id="总结">Summary</h3><p>Deploying MiMo-7B-RL locally on an RTX 4060 with 8GB of VRAM is entirely feasible, and Q4_K_M quantization with a 64K context is the best trade-off. The biggest pitfall along the way was that llama.cpp offers no official prebuilt Linux CUDA package; Ollama got around that.</p><blockquote><p>The point of running a 7B model locally is not to replace large cloud models but to keep private data on the machine, work offline, and pay no API fees. For data-sensitive fields such as healthcare and education, that is a hard requirement.</p></blockquote>]]>
    </content>
    <id>https://www.insidentally.com/articles/000042/</id>
    <link href="https://www.insidentally.com/articles/000042/"/>
    <published>2026-05-06T10:00:00.000Z</published>
    <summary>
      <![CDATA[<blockquote>
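<p>References:<br>
<a href="https://huggingface.co/XiaomiMiMo">XiaomiMiMo on Hugging Face</a><br>
<a href="https://ollama.com">Ollama official documentation</a><br>
<a href="https://arxiv.org/abs/2505.07608">MiMo-7B-RL technical report</a></p>
</blockquote>
<p>Xiaomi's open-source MiMo model family shows strong reasoning ability, with MiMo-7B-RL reportedly even outperforming DeepSeek R1 on mathematical reasoning tasks. This article documents the full process of deploying a MiMo model locally on a consumer laptop: hardware analysis, model selection, a comparison of deployment options, and the problems encountered along the way with their fixes.</p>]]>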
    </summary>
    <title>Running Xiaomi MiMo Models Locally: From Model Selection to Pitfalls</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="Hermes Agent" scheme="https://www.insidentally.com/tags/Hermes-Agent/"/>
    <category term="AI Agent" scheme="https://www.insidentally.com/tags/AI-Agent/"/>
    <category term="终端工具" scheme="https://www.insidentally.com/tags/%E7%BB%88%E7%AB%AF%E5%B7%A5%E5%85%B7/"/>
    <category term="LLM" scheme="https://www.insidentally.com/tags/LLM/"/>
    <content>
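      <![CDATA[<blockquote><p>References:<br><a href="https://hermes-agent.nousresearch.com/docs/">Hermes Agent official documentation</a><br><a href="https://github.com/NousResearch/hermes-agent">Hermes Agent GitHub repository</a></p></blockquote><p><a href="https://hermes-agent.nousresearch.com/">Hermes Agent</a> is an open-source AI agent framework developed by Nous Research that runs in the terminal, on messaging platforms, and in IDEs. It belongs to the same class of autonomous coding and task-execution agents as Claude Code (Anthropic) and Codex (OpenAI), and it interacts with the system through tool calls. Hermes works with any LLM provider (OpenRouter, Anthropic, OpenAI, DeepSeek, local models, and others, 15+ in total) and runs on Linux, macOS, and WSL.</p><span id="more"></span><p>Compared with other AI agents, Hermes Agent has several distinctive features:</p><ul><li><strong>Self-improvement through skills</strong>: Hermes learns from experience by saving reusable procedures as Skills. When it solves a complex problem, discovers a workflow, or gets corrected, it can persist that knowledge as a skill document and load it in future sessions. Skills accumulate over time, so the agent keeps getting better at specific tasks and environments.</li><li><strong>Persistent memory across sessions</strong>: It remembers who you are, your preferences, environment details, and lessons learned. Pluggable memory backends (built-in, Honcho, Mem0, and others) let you choose how memory works.</li><li><strong>Multi-platform gateway</strong>: The same agent can run on 10+ platforms such as Telegram, Discord, Slack, WhatsApp, Signal, Matrix, and Email, with full tool access.</li><li><strong>Provider-agnostic</strong>: You can switch models and providers at any point in a workflow, and the credential pool rotates automatically across multiple API keys.</li><li><strong>Extensible</strong>: It supports plugins, MCP servers, custom tools, webhook triggers, scheduled tasks, and the full Python ecosystem.</li></ul><p>People use Hermes for software development, research, system administration, data analysis, content creation, home automation, and any other AI agent scenario that needs persistent context and full system access.</p><h3 id="安装">Installation</h3><p>Installing Hermes Agent takes a single command:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash</span><br></pre></td></tr></table></figure><p>Once it finishes, check that the installation succeeded:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes --version</span><br></pre></td></tr></table></figure><h3 id="初始配置">Initial configuration</h3><p>After installation, run the setup wizard for the initial configuration:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes 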
setup</span><br></pre></td></tr></table></figure><p>The wizard walks you through configuring the model, terminal, gateway, tools, and agent settings.</p><h4 id="配置模型提供商">Configure a model provider</h4><p>Hermes supports 20+ model providers. Use the following command to pick a model and provider interactively:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes model</span><br></pre></td></tr></table></figure><p>Commonly used providers include:</p><table><thead><tr><th>Provider</th><th>Environment variable</th></tr></thead><tbody><tr><td>OpenRouter</td><td><code>OPENROUTER_API_KEY</code></td></tr><tr><td>Anthropic</td><td><code>ANTHROPIC_API_KEY</code></td></tr><tr><td>OpenAI</td><td><code>OPENAI_API_KEY</code></td></tr><tr><td>Google Gemini</td><td><code>GOOGLE_API_KEY</code></td></tr><tr><td>DeepSeek</td><td><code>DEEPSEEK_API_KEY</code></td></tr><tr><td>Xiaomi MiMo</td><td><code>XIAOMI_API_KEY</code></td></tr></tbody></table><p>Set the corresponding API key in the <code>~/.hermes/.env</code> file.</p><h4 id="检查健康状态">Check health status</h4><p>Use the following command to check that the configuration and dependencies are correct:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes doctor</span><br></pre></td></tr></table></figure><p>If it reports problems, you can try an automatic fix:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes doctor --fix</span><br></pre></td></tr></table></figure><h3 id="基本使用">Basic usage</h3><h4 id="交互式聊天">Interactive chat</h4><p>Run the <code>hermes</code> command on its own to enter interactive chat mode:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes</span><br></pre></td></tr></table></figure><p>From there you can talk to Hermes as you would to any AI chat tool. The difference is that Hermes can also run commands, read and write files, search the web, and more.</p><h4 id="单次查询">One-off queries</h4><p>If you only need to ask a single question, you can skip interactive mode:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes chat -q &quot;什么是 LAMP 
环境？&quot;</span><br></pre></td></tr></table></figure><h4 id="指定模型">Specify a model</h4><p>You can choose the model at run time:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes chat -m anthropic/claude-sonnet-4 -q &quot;Explain the basic concepts of Docker&quot;</span><br></pre></td></tr></table></figure><h3 id="技能系统">Skills system</h3><p>Skills are one of Hermes Agent's core features. A skill is a reusable procedure document that Hermes creates and updates automatically as it learns from experience.</p><h4 id="查看已安装技能">List installed skills</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes skills list</span><br></pre></td></tr></table></figure><h4 id="搜索技能">Search for skills</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes skills search &quot;docker&quot;</span><br></pre></td></tr></table></figure><h4 id="安装技能">Install a skill</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes skills install &lt;skill-id&gt;</span><br></pre></td></tr></table></figure><h3 id="工具系统">Tools system</h3><p>Hermes manages the available tools through toolsets. Use the following command to view and manage tools:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes tools list</span><br></pre></td></tr></table></figure><p>Commonly used toolsets include:</p><ul><li><code>terminal</code>: shell commands and process management</li><li><code>file</code>: reading, writing, and searching files</li><li><code>web</code>: web search and content extraction</li><li><code>browser</code>: browser automation</li><li><code>vision</code>: image analysis</li><li><code>memory</code>: persistent memory across sessions</li></ul><p>Enable or disable tools:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">hermes tools enable browser</span><br><span class="line">hermes 
tools disable vision</span><br></pre></td></tr></table></figure><blockquote><p>Tool changes only take effect in a new session; use the <code>/reset</code> command to reset the session.</p></blockquote><h3 id="会话管理">Session management</h3><p>Hermes saves session history automatically, so you can resume an earlier session at any time.</p><h4 id="列出最近会话">List recent sessions</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes sessions list</span><br></pre></td></tr></table></figure><h4 id="恢复会话">Resume a session</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes --continue</span><br></pre></td></tr></table></figure><p>Or resume a specific session:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes --resume &lt;session-id&gt;</span><br></pre></td></tr></table></figure><h3 id="定时任务">Scheduled tasks</h3><p>Hermes can create scheduled tasks, similar to cron:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes cron create &quot;0 9 * * *&quot; -q &quot;Check today's weather&quot;</span><br></pre></td></tr></table></figure><p>List scheduled tasks:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes cron list</span><br></pre></td></tr></table></figure><h3 id="网关（消息平台）">Gateway (messaging platforms)</h3><p>Through the gateway, Hermes can connect to various messaging platforms so that you can talk to it directly on Telegram, Discord, and others.</p><h4 id="配置网关">Configure the gateway</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes gateway setup</span><br></pre></td></tr></table></figure><h4 id="启动网关">Start the gateway</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span 
class="line">hermes gateway install</span><br><span class="line">hermes gateway start</span><br></pre></td></tr></table></figure><h4 id="查看状态">Check status</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">hermes gateway status</span><br></pre></td></tr></table></figure><h3 id="会话内命令">In-session commands</h3><p>Inside an interactive chat session you can use slash commands:</p><ul><li><code>/help</code>: show help</li><li><code>/new</code>: start a new session</li><li><code>/model</code>: view or switch the model</li><li><code>/skills</code>: search for and install skills</li><li><code>/tools</code>: manage tools</li><li><code>/config</code>: view the configuration</li><li><code>/quit</code>: exit</li></ul><h3 id="配置文件">Configuration files</h3><p>Hermes keeps its configuration in:</p><ul><li>Main configuration: <code>~/.hermes/config.yaml</code></li><li>Environment variables: <code>~/.hermes/.env</code></li></ul><p>You can edit these files directly or use the commands:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">hermes config edit</span><br><span class="line">hermes config set &lt;key&gt; &lt;value&gt;</span><br></pre></td></tr></table></figure><h3 id="总结">Summary</h3><p>Hermes Agent is a capable AI agent framework. Beyond conversation, it can run commands, manage files, search the web, and even connect to various messaging platforms. Through its skills system, Hermes keeps learning and improving until it becomes a genuinely useful AI assistant.</p><p>This article covers only basic installation and usage. For advanced features such as MCP servers, webhooks, and profiles, see the <a href="https://hermes-agent.nousresearch.com/docs/">official documentation</a>.</p>]]>
    </content>
    <id>https://www.insidentally.com/articles/000041/</id>
    <link href="https://www.insidentally.com/articles/000041/"/>
    <published>2026-05-05T09:00:00.000Z</published>
    <summary>
      <![CDATA[<blockquote>
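<p>References:<br>
<a href="https://hermes-agent.nousresearch.com/docs/">Hermes Agent official documentation</a><br>
<a href="https://github.com/NousResearch/hermes-agent">Hermes Agent GitHub repository</a></p>
</blockquote>
<p><a href="https://hermes-agent.nousresearch.com/">Hermes Agent</a> is an open-source AI agent framework developed by Nous Research that runs in the terminal, on messaging platforms, and in IDEs. It belongs to the same class of autonomous coding and task-execution agents as Claude Code (Anthropic) and Codex (OpenAI), and it interacts with the system through tool calls. Hermes works with any LLM provider (OpenRouter, Anthropic, OpenAI, DeepSeek, local models, and others, 15+ in total) and runs on Linux, macOS, and WSL.</p>]]>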
    </summary>
    <title>Introducing Hermes Agent and First Steps</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="Linux" scheme="https://www.insidentally.com/tags/Linux/"/>
    <category term="浏览器缓存" scheme="https://www.insidentally.com/tags/%E6%B5%8F%E8%A7%88%E5%99%A8%E7%BC%93%E5%AD%98/"/>
    <category term="tmpfs" scheme="https://www.insidentally.com/tags/tmpfs/"/>
    <category term="内存" scheme="https://www.insidentally.com/tags/%E5%86%85%E5%AD%98/"/>
    <content>
      <![CDATA[<p>If you use your browser heavily, its cache generates a lot of disk I/O. To reduce that I/O, which both protects the disk and speeds up the browser, you can keep the cache in RAM. A cache that lives only in RAM is lost on reboot, though, so this article uses scripts that restore the cache from disk into RAM at login and write it back to disk at logout.</p><p>On Linux, the cache location differs from browser to browser:</p><ul><li>Microsoft Edge defaults to ~/.cache/microsoft-edge</li><li>Google Chrome defaults to ~/.cache/google-chrome</li><li>Mozilla Firefox defaults to ~/.cache/mozilla/firefox/XXXXXXXX.default-release/cache2</li></ul><blockquote><p>This article uses Microsoft Edge as the example.</p></blockquote><blockquote><p>In the Firefox cache path, XXXXXXXX is an eight-character random code that differs per user; look up the location of your own cache directory.</p></blockquote><h3 id="1-缓存同步（打包解包）脚本">1. Cache sync (pack and unpack) script</h3><p>First install tar and lzop using your distribution's package manager.</p><p>Then create the core script wherever you like and make it executable:</p><blockquote><p>It is best to put the script somewhere under your home directory, because this article runs it from a systemd unit with ordinary user privileges. The same goes for the other scripts in this article.</p></blockquote><p>Its contents:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">#</span><span class="language-bash">!/bin/sh</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Run this script after login and before logout</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Make sure tar and lzop are installed on your system</span></span><br><span 
class="line"></span><br><span class="line">case &quot;$1&quot; in</span><br><span class="line"> import)</span><br><span class="line">   cd /dev/shm || exit 1</span><br><span class="line">   tar --lzop -pxf /home/your-username/.cache/edgecache-backup.tar.lzop</span><br><span class="line">   ;;</span><br><span class="line"> dump)</span><br><span class="line">   cd /dev/shm || exit 1</span><br><span class="line"><span class="meta prompt_">   # </span><span class="language-bash">Delete cache files larger than 5MB</span></span><br><span class="line">   find ./microsoft-edge/ -type f -size +5M -exec rm &#123;&#125; \;</span><br><span class="line">   tar --lzop -pcf /home/your-username/.cache/edgecache-backup.tar.lzop microsoft-edge/</span><br><span class="line">   ;;</span><br><span class="line"> *)</span><br><span class="line">   echo &quot;Usage: $0 &#123;import|dump&#125;&quot;</span><br><span class="line">   exit 1</span><br><span class="line">   ;;</span><br><span class="line">esac</span><br><span class="line"></span><br><span class="line">exit 0</span><br></pre></td></tr></table></figure><h3 id="2-登录导入脚本">2. 
Login import script</h3><p>Next, create the login import script wherever you like. At login it sets up the cache path and imports the cache from the archive.</p><p>Its contents:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">#</span><span class="language-bash">!/bin/sh</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Point the Microsoft Edge cache path at RAM</span></span><br><span class="line">/bin/rm -rf ~/.cache/microsoft-edge</span><br><span class="line">/bin/mkdir -p /dev/shm/microsoft-edge</span><br><span class="line">/bin/ln -sf /dev/shm/microsoft-edge ~/.cache/microsoft-edge</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Import the browser cache into RAM</span></span><br><span class="line">echo [`date +&quot;%Y-%m-%d %H:%M&quot;`] On login - Importing caches to ram &gt;&gt; /home/your-username/edgecache_sync.log</span><br><span class="line">/path/to/core-script import &gt;&gt; /home/your-username/edgecache_sync.log</span><br><span class="line">echo [`date +&quot;%Y-%m-%d %H:%M&quot;`] On login - Caches imported to ram &gt;&gt; /home/your-username/edgecache_sync.log</span><br></pre></td></tr></table></figure><h3 id="3-登出前导出缓存到硬盘">3. 
Export the cache to disk before logout</h3><p>Next, create the logout export script wherever you like. At logout it syncs the cache to disk.</p><p>Its contents:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">#</span><span class="language-bash">!/bin/sh</span></span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Sync the browser cache to disk</span></span><br><span class="line">echo [`date +&quot;%Y-%m-%d %H:%M&quot;`] On logout - Dumping caches to disk &gt;&gt; /home/your-username/edgecache_sync.log</span><br><span class="line">/path/to/core-script dump &gt;&gt; /home/your-username/edgecache_sync.log</span><br><span class="line">echo [`date +&quot;%Y-%m-%d %H:%M&quot;`] On logout - Caches dumped to disk &gt;&gt; /home/your-username/edgecache_sync.log</span><br><span class="line"><span class="meta prompt_"></span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Wait 4 seconds</span></span><br><span class="line">sleep 4</span><br></pre></td></tr></table></figure><h3 id="4-让-systemd-在登录和登出时自动执行上述脚本">4. 
Have systemd run these scripts automatically at login and logout</h3><p>Create the file edgecache.service in the ~/.config/systemd/user/ directory:</p><p>Its contents:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">[Unit]</span><br><span class="line">Description=Synchronize edge caches to disk</span><br><span class="line">PartOf=graphical-session.target</span><br><span class="line">DefaultDependencies=no</span><br><span class="line">Before=umount.target shutdown.target reboot.target halt.target</span><br><span class="line"></span><br><span class="line">[Service]</span><br><span class="line">Type=simple</span><br><span class="line">RemainAfterExit=true</span><br><span class="line">ExecStart=/path/to/login-import-script</span><br><span class="line">ExecStop=/path/to/logout-export-script</span><br><span class="line"></span><br><span class="line">[Install]</span><br><span class="line">WantedBy=graphical-session.target</span><br><span class="line"></span><br></pre></td></tr></table></figure><h3 id="5-启用-systemd">5. Enable the systemd unit</h3><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">systemctl --user daemon-reload</span><br><span class="line">systemctl --user enable edgecache.service</span><br></pre></td></tr></table></figure><p>Reboot for the change to take effect.</p><h3 id="6-参考资料">6. 
References</h3><p><a href="https://www.cnblogs.com/dylanchu/p/9750494.html">Putting the Chrome cache in RAM on Linux and syncing it at boot and shutdown</a></p><p><a href="https://wiki.archlinux.org/index.php/Tmpfs">Arch Linux: tmpfs</a></p><p><a href="https://wiki.archlinux.org/index.php/Chromium/Tips_and_tricks#Cache_in_tmpfs">Arch Linux: Chromium tips and tricks</a></p><p><a href="http://docs.observium.org/persistent_ramdisk/">Syncing a RAM disk with the hard disk</a></p><p><a href="https://askubuntu.com/questions/416299/">Running a command before shutdown with systemd</a></p><p><a href="https://zhuanlan.zhihu.com/p/386255961">Managing regular-user services with systemd</a></p><p><a href="http://blog.csdn.net/kai165416/article/details/79449638">Deleting files larger than a given size</a></p><p><a href="https://www.omgubuntu.co.uk/2010/11/move-google-chrome-cache-to-ramdisk">Moving the Chrome browser cache to RAM</a></p><p><a href="https://www.freedesktop.org/software/systemd/man/bootup.html">systemd boot sequence</a></p>]]>
    </content>
    <id>https://www.insidentally.com/articles/000040/</id>
    <link href="https://www.insidentally.com/articles/000040/"/>
    <published>2024-04-05T15:07:42.000Z</published>
    <summary>
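      <![CDATA[<p>If you use your browser heavily, its cache generates a lot of disk I/O. To reduce that I/O, which both protects the disk and speeds up the browser, you can keep the cache in RAM. A cache that lives only in RAM is lost on reboot, though, so this article uses scripts that restore the cache from disk into RAM at login and write it back to disk at logout.</p>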
<p>linux 下不]]>
    </summary>
    <title>Keeping the Browser Cache in RAM on Linux and Syncing It at Login and Logout</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="Linux" scheme="https://www.insidentally.com/tags/Linux/"/>
    <category term="镜像源" scheme="https://www.insidentally.com/tags/%E9%95%9C%E5%83%8F%E6%BA%90/"/>
    <category term="Fedora" scheme="https://www.insidentally.com/tags/Fedora/"/>
    <content>
      <![CDATA[<p>By default Fedora uses Metalink to supply a recommended mirror list. This ensures that the mirrors you use are fresh enough and that security updates arrive as soon as possible, which improves security. The default configuration is therefore usually fine and the configuration files need no changes.</p><p>Because Metalink has to fetch metadata from Fedora project servers abroad, it does not work in special cases such as campus intranets or networks without international access. In those cases you can modify the configuration files as follows.</p><blockquote><p>These commands were tested on Fedora 36 through Fedora 39.</p></blockquote><span id="more"></span><h3 id="更改-Fedora-镜像源">Change the Fedora mirror</h3><p>Fedora can have several repository configuration files. The default fedora repository is configured in <code>/etc/yum.repos.d/fedora.repo</code>, and the default updates repository in <code>/etc/yum.repos.d/fedora-updates.repo</code>. There are also the corresponding modular repositories.</p><h4 id="备份文件">Back up the files</h4><p>Back up the repository configuration files to the <code>/etc/yum.repos.d/backup</code> folder.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">cd /etc/yum.repos.d/</span><br><span class="line">sudo mkdir backup/</span><br><span class="line">sudo cp fedora.repo backup/</span><br><span class="line">sudo cp fedora-modular.repo backup/</span><br><span class="line">sudo cp fedora-updates.repo backup/</span><br><span class="line">sudo cp fedora-updates-modular.repo backup/</span><br></pre></td></tr></table></figure><h4 id="更换清华源">Switch to the Tsinghua mirror</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">sudo sed -e &#x27;s|^metalink=|#metalink=|g&#x27; \</span><br><span class="line">    -e &#x27;s|^#baseurl=http://download.example/pub/fedora/linux|baseurl=https://mirrors.tuna.tsinghua.edu.cn/fedora|g&#x27; \</span><br><span class="line">    -i.bak \</span><br><span class="line">    /etc/yum.repos.d/fedora.repo \</span><br><span class="line">    
/etc/yum.repos.d/fedora-modular.repo \</span><br><span class="line">    /etc/yum.repos.d/fedora-updates.repo \</span><br><span class="line">    /etc/yum.repos.d/fedora-updates-modular.repo</span><br></pre></td></tr></table></figure><h4 id="更新本地缓存">更新本地缓存</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf makecache</span><br></pre></td></tr></table></figure><h3 id="安装-RPM-Fusion-并更换清华源">安装 RPM Fusion 并更换清华源</h3><p>RPM Fusion 为 Fedora/RHEL 提供额外的大量 RPM 软件包的第三方软件源。</p><h4 id="安装并启动-RPM-Fusion-软件源">安装并启动 RPM Fusion 软件源</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install --nogpgcheck https://mirrors.tuna.tsinghua.edu.cn/rpmfusion/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm https://mirrors.tuna.tsinghua.edu.cn/rpmfusion/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm</span><br></pre></td></tr></table></figure><h4 id="备份文件-2">备份文件</h4><p>将仓库配置文件备份到 <code>/etc/yum.repos.d/backup</code> 文件夹下。</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">cd /etc/yum.repos.d/</span><br><span class="line">sudo cp rpmfusion-* backup/</span><br></pre></td></tr></table></figure><h4 id="修改-rpmfusion-为清华源">修改 rpmfusion 为清华源</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">sudo sed -e &#x27;s|^metalink=|#metalink=|g&#x27; \</span><br><span class="line">         -e 
&#x27;s|^#baseurl=http://download1.rpmfusion.org|baseurl=https://mirrors.tuna.tsinghua.edu.cn/rpmfusion|g&#x27; \</span><br><span class="line">         /etc/yum.repos.d/rpmfusion-free.repo \</span><br><span class="line">         /etc/yum.repos.d/rpmfusion-free-updates.repo \</span><br><span class="line">         /etc/yum.repos.d/rpmfusion-free-updates-testing.repo \</span><br><span class="line">         /etc/yum.repos.d/rpmfusion-nonfree.repo \</span><br><span class="line">         /etc/yum.repos.d/rpmfusion-nonfree-updates.repo \</span><br><span class="line">         /etc/yum.repos.d/rpmfusion-nonfree-updates-testing.repo</span><br></pre></td></tr></table></figure><h4 id="更新本地缓存-2">更新本地缓存</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf makecache</span><br></pre></td></tr></table></figure><h3 id="补充内容">补充内容</h3><h4 id="安装多媒体补充包">安装多媒体补充包</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install gstreamer1-plugins-&#123;bad-\*,good-\*,base&#125; gstreamer1-plugin-openh264 gstreamer1-libav --exclude=gstreamer1-plugins-bad-free-devel</span><br><span class="line">sudo dnf install ffmpeg</span><br></pre></td></tr></table></figure><h4 id="其他国内镜像源">其他国内镜像源</h4><ol><li>清华源：<a href="https://mirrors.tuna.tsinghua.edu.cn">https://mirrors.tuna.tsinghua.edu.cn</a></li><li>中科大源：<a href="https://mirrors.ustc.edu.cn">https://mirrors.ustc.edu.cn</a></li><li>阿里云源：<a href="https://mirrors.aliyun.com">https://mirrors.aliyun.com</a></li><li>腾讯云：<a href="https://mirrors.cloud.tencent.com">https://mirrors.cloud.tencent.com</a></li></ol>]]>
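      <![CDATA[<p>The two <code>sed</code> substitutions used for the Fedora repositories can be tried on a throwaway copy before touching <code>/etc/yum.repos.d</code>. The following is only a minimal sketch under stated assumptions: <code>switch_repo_mirror</code> is a hypothetical helper name that does not appear in the commands above, and the sample <code>fedora.repo</code> is a simplified stand-in for the real file.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original commands): rewrite one
# repo file in place so that it uses a fixed mirror instead of Metalink.
# The mirror defaults to the TUNA URL used in this post.
switch_repo_mirror() {
    local repo_file="$1"
    local mirror="${2:-https://mirrors.tuna.tsinghua.edu.cn/fedora}"
    # Comment out metalink= lines, then enable the commented baseurl= lines
    # and point them at the mirror. -i.bak keeps the original as *.bak.
    sed -e 's|^metalink=|#metalink=|g' \
        -e "s|^#baseurl=http://download.example/pub/fedora/linux|baseurl=${mirror}|g" \
        -i.bak "$repo_file"
}

# Demo on a temporary file; the content is a simplified, assumed example.
demo_dir="$(mktemp -d)"
cat > "${demo_dir}/fedora.repo" <<'EOF'
[fedora]
name=Fedora $releasever - $basearch
#baseurl=http://download.example/pub/fedora/linux/releases/$releasever/Everything/$basearch/os/
metalink=https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=$basearch
enabled=1
EOF

switch_repo_mirror "${demo_dir}/fedora.repo"
cat "${demo_dir}/fedora.repo"
```

<p>If the printed file shows an active <code>baseurl=</code> line and a commented <code>#metalink=</code> line, the same substitutions should behave the same way on the real repository files.</p>]]>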
    </content>
    <id>https://www.insidentally.com/articles/000039/</id>
    <link href="https://www.insidentally.com/articles/000039/"/>
    <published>2024-03-18T11:07:42.000Z</published>
    <summary>
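      <![CDATA[<p>By default, Fedora uses Metalink to supply a recommended mirror list. This keeps the mirrors in use reasonably fresh and delivers security updates quickly, which is better for security. In most cases the default configuration is fine and the repository files do not need to be changed.</p>
<p>Metalink has to fetch metadata from Fedora Project servers outside China, so it is a poor fit for special cases such as a campus intranet or a network without international access. In those cases the repository files can be modified as follows.</p>
<blockquote>
<p>These commands were tested on Fedora 36 through Fedora 39.</p>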
</blockquote>]]>
    </summary>
    <title>Configuring mirrors in China on Fedora with a script</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="Fedora" scheme="https://www.insidentally.com/tags/Fedora/"/>
    <category term="Nextcloud" scheme="https://www.insidentally.com/tags/Nextcloud/"/>
    <category term="手动部署" scheme="https://www.insidentally.com/tags/%E6%89%8B%E5%8A%A8%E9%83%A8%E7%BD%B2/"/>
    <category term="LAMP" scheme="https://www.insidentally.com/tags/LAMP/"/>
    <content>
      <![CDATA[<blockquote><p>Reference:<br><a href="https://docs.nextcloud.com/server/latest/admin_manual/installation/index.html"> <em>Nextcloud Installation and server configuration</em> </a></p></blockquote><p>This post shows how to deploy a LAMP stack and Nextcloud on Fedora Server to build a private cloud. <a href="https://nextcloud.com/">Nextcloud</a> is a free, open-source private cloud storage project. Most tutorials online rely on the BT Panel (宝塔面板) or on one-click scripts, which I cannot accept as someone who prefers a clean setup. This post walks through a fully manual Nextcloud deployment.</p><span id="more"></span><p>This post mainly follows the official Nextcloud documentation. I want recent software and support for the latest hardware, so I chose Fedora Server. Deployment on other RPM-based distributions should be similar.</p><h3 id="环境准备">Preparing the environment</h3><p>Nextcloud needs a LAMP environment first. LAMP stands for Linux + <a href="https://httpd.apache.org/">Apache</a> + <a href="https://www.mysql.com/cn/">MySQL</a>/<a href="https://mariadb.org/">MariaDB</a> + <a href="https://www.php.net/">PHP</a>. Fedora's repositories are comprehensive and reasonably up to date, so everything in this post is installed from the repositories and nothing has to be downloaded and compiled separately.</p><h4 id="从源中安装环境">Install the stack from the repositories</h4><p>Install the stack with the following command:</p><blockquote><p>Before installing, it is a good idea to enable the <a href="https://mirrors.tuna.tsinghua.edu.cn/help/rpmfusion/">rpmfusion repositories</a>.</p></blockquote><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install httpd mariadb mariadb-server php mod_ssl php-gmp php-bcmath php-pecl-apcu php-pecl-imagick php-mcrypt php-mysqlnd php-curl php-gd php-xml php-zip php-intl</span><br></pre></td></tr></table></figure><h4 id="配置-Mariadb">Configure MariaDB</h4><p>After installation, set the character set.</p><p>Edit /etc/my.cnf.d/mariadb-server.cnf and set your character set under the [mysqld] section.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">[mysqld]</span><br><span class="line">character-set-server=utf8</span><br></pre></td></tr></table></figure><p>Then start the mariadb service and enable it at boot:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">sudo 
systemctl start mariadb</span><br><span class="line">sudo systemctl enable mariadb</span><br></pre></td></tr></table></figure><p>Run the initial MariaDB setup, for example to set the root password and disable remote root login:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo mysql_secure_installation </span><br></pre></td></tr></table></figure><p>Answer the prompts as follows:</p><ul><li>Enter current password for root: leave empty and press Enter</li><li>Set root password? [Y/n] Y, then enter a password</li><li>Remove anonymous users? [Y/n] Y</li><li>Remove test database and access to it? [Y/n] Y</li><li>Reload privilege tables now? [Y/n] Y</li></ul><blockquote><p>This root account is the MariaDB root account, not the system root account.</p></blockquote><p>Create the database for Nextcloud:</p><p>First log in to MariaDB. Run the following command and enter the root password.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">mysql -u root -p</span><br></pre></td></tr></table></figure><p>Then run the following SQL statements to create the user and database for Nextcloud.</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">CREATE</span> DATABASE database_name;</span><br><span class="line"></span><br><span class="line"><span class="keyword">GRANT</span> <span class="keyword">ALL</span> PRIVILEGES <span class="keyword">ON</span> database_name.<span class="operator">*</span> <span class="keyword">TO</span> <span class="string">&#x27;username&#x27;</span>@<span class="string">&#x27;localhost&#x27;</span> IDENTIFIED <span class="keyword">BY</span> &quot;user_password&quot;;</span><br><span class="line"></span><br><span class="line">FLUSH PRIVILEGES;</span><br></pre></td></tr></table></figure><p>If you need to reach MariaDB over the internet, grant remote access in MariaDB and open the firewall port yourself. This post uses local database access only.</p><h4 id="配置-PHP">Configure PHP</h4><p>Edit /etc/php.ini to configure PHP.</p><p>Find the date.timezone line under [Date], remove the leading ; and set the time zone.</p><figure 
class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">date.timezone = Asia/Shanghai</span><br></pre></td></tr></table></figure><p>Find the memory_limit line and change 128M to 512M or more. (Unless your server has very little memory, a limit of at least 512M is recommended.)</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">memory_limit = 512M</span><br></pre></td></tr></table></figure><h4 id="配置-apache-httpd">Configure Apache httpd</h4><p>Prepare a directory for Nextcloud. This post uses <code>/srv/nextcloud/</code> as the example.</p><p>Change the owner of the nextcloud directory to apache.</p><blockquote><p>Check which User and Group are set in <code>/etc/httpd/conf/httpd.conf</code>, and give ownership of the nextcloud directory to that user and group.</p></blockquote><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo chown apache:apache /srv/nextcloud/</span><br></pre></td></tr></table></figure><p>Download setup-nextcloud.php from <a href="https://download.nextcloud.com/server/installer/setup-nextcloud.php">https://download.nextcloud.com/server/installer/setup-nextcloud.php</a> into the <code>/srv/nextcloud/</code> directory.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">cd /srv/nextcloud/</span><br><span class="line">sudo -u apache wget https://download.nextcloud.com/server/installer/setup-nextcloud.php</span><br></pre></td></tr></table></figure><p>Create the httpd configuration file <code>/etc/httpd/conf.d/nextcloud.conf</code> and define the virtual host.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span 
class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line">ServerName your-domain-or-ip-address</span><br><span class="line">&lt;VirtualHost *:80&gt;</span><br><span class="line">    DocumentRoot &quot;/srv/nextcloud&quot;</span><br><span class="line"></span><br><span class="line">    &lt;IfModule mod_headers.c&gt;</span><br><span class="line">        Header always set Strict-Transport-Security &quot;max-age=15552000; includeSubDomains&quot;</span><br><span class="line">    &lt;/IfModule&gt;</span><br><span class="line"></span><br><span class="line">    &lt;Directory /srv/nextcloud/&gt;</span><br><span class="line">        Require all granted</span><br><span class="line">        AllowOverride All</span><br><span class="line">        Options FollowSymLinks MultiViews</span><br><span class="line">        &lt;IfModule mod_dav.c&gt;</span><br><span class="line">            Dav off</span><br><span class="line">        &lt;/IfModule&gt;</span><br><span class="line">    &lt;/Directory&gt;</span><br><span class="line">&lt;/VirtualHost&gt;</span><br></pre></td></tr></table></figure><p>If you have a domain name and an SSL certificate, put the certificate in a suitable location and use the following configuration instead.</p><blockquote><p>If SELinux is enabled, remember to set the correct security context on the certificate files: <code>sudo chcon -u system_u -r object_r -t httpd_sys_content_t certificate-file</code></p></blockquote><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span 
class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line">ServerName your-domain-or-ip-address</span><br><span class="line"></span><br><span class="line">&lt;VirtualHost *:80&gt;</span><br><span class="line">  DocumentRoot &quot;/srv/nextcloud&quot;</span><br><span class="line">  RewriteEngine on</span><br><span class="line">  RewriteCond %&#123;SERVER_PORT&#125; !^443$</span><br><span class="line">  RewriteRule ^/?(.*)$ https://%&#123;SERVER_NAME&#125;/$1 [L,R]</span><br><span class="line">&lt;/VirtualHost&gt;</span><br><span class="line"></span><br><span class="line">&lt;VirtualHost *:443&gt;</span><br><span class="line">    DocumentRoot &quot;/srv/nextcloud&quot;</span><br><span class="line">    SSLEngine on</span><br><span class="line">    # Path to the certificate file</span><br><span class="line">    SSLCertificateFile /path/to/certificate</span><br><span class="line">    # Path to the private key file</span><br><span class="line">    SSLCertificateKeyFile /path/to/private-key</span><br><span class="line">    # Path to the certificate chain file</span><br><span class="line">    SSLCertificateChainFile /path/to/certificate-chain</span><br><span class="line"></span><br><span class="line">    &lt;IfModule mod_headers.c&gt;</span><br><span class="line">        Header always set Strict-Transport-Security &quot;max-age=15552000; 
includeSubDomains&quot;</span><br><span class="line">    &lt;/IfModule&gt;</span><br><span class="line"></span><br><span class="line">    &lt;Directory /srv/nextcloud/&gt;</span><br><span class="line">        Require all granted</span><br><span class="line">        AllowOverride All</span><br><span class="line">        Options FollowSymLinks MultiViews</span><br><span class="line"></span><br><span class="line">        &lt;IfModule mod_dav.c&gt;</span><br><span class="line">            Dav off</span><br><span class="line">        &lt;/IfModule&gt;</span><br><span class="line">    &lt;/Directory&gt;</span><br><span class="line">&lt;/VirtualHost&gt;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>Now start httpd and move on to the next stage.</p><blockquote><p>If SELinux is enabled, remember to allow the port first: <code>semanage port -a -t http_port_t -p tcp your-port</code></p></blockquote><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo systemctl enable --now httpd</span><br></pre></td></tr></table></figure><h3 id="Nextcloud-配置精灵">Nextcloud setup wizard</h3><p>Open <code>http://your-domain/setup-nextcloud.php</code> in a browser to start the Nextcloud setup wizard. It looks like this:</p> <img src="/images/000038/01.png" width="600" alt='Nextcloud setup wizard' align=center /><p>Once the dependency check passes, the wizard asks for the Nextcloud installation directory. Enter a single <code>.</code> to install into the current directory.</p> <img src="/images/000038/02.png" width="600" alt='Choosing the Nextcloud installation directory' align=center /><p>Click Next and wait a moment for the installation to finish, then click Next again.</p><img src="/images/000038/03.png" width="600" alt='Nextcloud installed successfully' align=center /><p>Next, enter the user name and the database details. Choose any user name and password you like. The first user created becomes the administrator.</p><img src="/images/000038/04.png" width="600" alt='Nextcloud user name and password' align=center /><p>Select MariaDB as the database. The database name, user name and password are the ones created earlier in <a href="#%E9%85%8D%E7%BD%AE-Mariadb">Configure MariaDB</a>.</p><img src="/images/000038/05.png" width="600" alt='Nextcloud database configuration' align=center /><p>Click Install, wait for it to finish, and you are done.</p><h3 
id="Nextcloud-常见问题汇总">Common Nextcloud issues</h3><p>After logging in to Nextcloud, click your avatar and choose the administration settings. This opens the <code>Security &amp; setup warnings</code> section, which automatically checks your configuration for remaining problems.</p><h4 id="Server-has-no-maintenance-window-start-time-configured。">Server has no maintenance window start time configured.</h4><p>Edit config/config.php under your Nextcloud directory and add this line before <code>);</code>:</p><figure class="highlight php"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&#x27;maintenance_window_start&#x27;</span> =&gt; <span class="number">1</span>,</span><br></pre></td></tr></table></figure><h4 id="该实例缺少一些推荐的-PHP-模块">This instance is missing some recommended PHP modules</h4><p>Most of the modules are available in the Fedora repositories. Install whichever ones are reported as missing.</p><h4 id="PHP-的组件-OPcache-没有正确配置">The PHP OPcache module is not configured correctly</h4><p>Edit <code>/etc/php.d/10-opcache.ini</code> and set the following parameters to values like these:</p><figure class="highlight ini"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">opcache.enable</span>=<span class="number">1</span></span><br><span class="line"><span class="attr">opcache.enable_cli</span>=<span class="number">1</span></span><br><span class="line"><span class="attr">opcache.interned_strings_buffer</span>=<span class="number">8</span></span><br><span class="line"><span class="attr">opcache.max_accelerated_files</span>=<span class="number">10000</span></span><br><span class="line"><span class="attr">opcache.memory_consumption</span>=<span class="number">128</span></span><br><span class="line"><span class="attr">opcache.save_comments</span>=<span class="number">1</span></span><br><span class="line"><span class="attr">opcache.revalidate_freq</span>=<span class="number">1</span></span><br></pre></td></tr></table></figure><h4 
id="HTTP的请求头-“Strict-Transport-Security”-未设置为至少-“15552000”-秒">The “Strict-Transport-Security” HTTP response header is not set to at least “15552000” seconds.</h4><p>Add the following lines to the Apache httpd nextcloud.conf configuration:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">&lt;IfModule mod_headers.c&gt;</span><br><span class="line">    Header always set Strict-Transport-Security &quot;max-age=15552000; includeSubDomains&quot;</span><br><span class="line">&lt;/IfModule&gt;</span><br></pre></td></tr></table></figure><blockquote><p>The configuration <a href="#%E9%85%8D%E7%BD%AE-apache-httpd">earlier in this post</a> already includes these lines.</p></blockquote><h4 id="你还没有设置或验证你的电子邮件服务器配置。">You have not set or verified your email server configuration yet.</h4><p>Configure it in the settings under Basic settings, Email server. Note that you first have to fill in your own email address in your personal info.</p><h4 id="内存缓存未配置，为了提升使用体验，请尽量配置内存缓存。">No memory cache has been configured. To improve performance, please configure a memory cache.</h4><p>Edit config/config.php under your Nextcloud directory and add these lines before <code>);</code>:</p><figure class="highlight php"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&#x27;memcache.local&#x27;</span> =&gt; <span class="string">&#x27;\\OC\\Memcache\\APCu&#x27;</span>,</span><br><span class="line"><span class="string">&#x27;memcache.locking&#x27;</span> =&gt; <span class="string">&#x27;\\OC\\Memcache\\APCu&#x27;</span>,</span><br></pre></td></tr></table></figure><blockquote><p>For a personal Nextcloud instance, APCu is sufficient.</p></blockquote><h4 id="您的安装没有设置默认的电话区域。">Your installation has no default phone region set.</h4><p>Edit config/config.php under your Nextcloud directory and add this line before <code>);</code>:</p><figure class="highlight php"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">&#x27;default_phone_region&#x27;</span> =&gt; <span class="string">&#x27;CN&#x27;</span>,</span><br></pre></td></tr></table></figure><h4 id="apcu-报错。">APCu errors</h4><p>Edit 
/etc/php.d/40-apcu.ini, set apc.enable_cli=1, and remove the leading ; to uncomment the line.</p><h4 id="配置-cron-php-定时任务。">Set up the cron.php scheduled job</h4><img src="/images/000038/06.png" width="800" alt='Nextcloud background jobs' align=center /><p>Select Cron as the background job mode, then create the file /etc/systemd/system/nextcloudcron.service on the host:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">[Unit]</span><br><span class="line">Description=Nextcloud cron.php job</span><br><span class="line"></span><br><span class="line">[Service]</span><br><span class="line">User=apache</span><br><span class="line">ExecCondition=php -f /path/to/nextcloud/occ status -e</span><br><span class="line">ExecStart=/usr/bin/php -f /path/to/nextcloud/cron.php</span><br><span class="line">KillMode=process</span><br></pre></td></tr></table></figure><blockquote><p>User=apache: <a href="#%E9%85%8D%E7%BD%AE-apache-httpd">as mentioned earlier</a>, use the User name set in <code>/etc/httpd/conf/httpd.conf</code>. Replace <code>/path/to/nextcloud</code> with your Nextcloud installation directory.</p></blockquote><p>Also create the file /etc/systemd/system/nextcloudcron.timer:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">[Unit]</span><br><span class="line">Description=Run Nextcloud cron.php every 5 minutes</span><br><span class="line"></span><br><span class="line">[Timer]</span><br><span class="line">OnBootSec=5min</span><br><span class="line">OnUnitActiveSec=5min</span><br><span 
class="line">Unit=nextcloudcron.service</span><br><span class="line"></span><br><span class="line">[Install]</span><br><span class="line">WantedBy=timers.target</span><br></pre></td></tr></table></figure><p>Then run:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">sudo systemctl daemon-reload</span><br><span class="line">sudo systemctl enable --now nextcloudcron.timer</span><br></pre></td></tr></table></figure><p>That is how to set up Nextcloud on Fedora Server 39. I hope it helps.</p><blockquote><p>SELinux proved too hard: partway through the setup I ended up disabling it. Learning how to work with SELinux properly is left for next time.</p></blockquote>]]>
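      <![CDATA[<p>The two <code>php.ini</code> edits described in the PHP configuration step (uncommenting <code>date.timezone</code> and raising <code>memory_limit</code>) can also be scripted. The following is a minimal sketch under stated assumptions: <code>set_ini_value</code> is a hypothetical helper, not a PHP or Fedora tool, and the demo runs on a temporary file rather than on <code>/etc/php.ini</code>.</p>

```shell
#!/usr/bin/env bash
# Hypothetical helper: set "key = value" in a php.ini-style file.
# An existing line for the key is replaced, including one that is
# commented out with a leading ';'. A backup is kept as FILE.bak.
set_ini_value() {
    local file="$1" key="$2" value="$3" key_re
    # Escape dots so that "date.timezone" only matches literally.
    key_re="$(printf '%s' "$key" | sed 's/[.]/\\./g')"
    sed -i.bak -E \
        "s|^;?[[:space:]]*${key_re}[[:space:]]*=.*|${key} = ${value}|" \
        "$file"
}

# Demo on a temporary file with assumed, simplified content.
demo_ini="$(mktemp)"
printf '%s\n' '[PHP]' 'memory_limit = 128M' '[Date]' ';date.timezone =' > "$demo_ini"

set_ini_value "$demo_ini" date.timezone Asia/Shanghai
set_ini_value "$demo_ini" memory_limit 512M
cat "$demo_ini"
```

<p>For the real file the call would need root privileges, and httpd (or php-fpm, if that is what serves PHP on your system) has to be restarted before the new values take effect.</p>]]>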
    </content>
    <id>https://www.insidentally.com/articles/000038/</id>
    <link href="https://www.insidentally.com/articles/000038/"/>
    <published>2024-03-09T13:34:28.000Z</published>
    <summary>
      <![CDATA[<blockquote>
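<p>Reference:<br>
<a href="https://docs.nextcloud.com/server/latest/admin_manual/installation/index.html"> <em>Nextcloud Installation and server configuration</em> </a></p>
</blockquote>
<p>This post shows how to deploy a LAMP stack and Nextcloud on Fedora Server to build a private cloud. <a href="https://nextcloud.com/">Nextcloud</a> is a free, open-source private cloud storage project. Most tutorials online rely on the BT Panel (宝塔面板) or on one-click scripts, which I cannot accept as someone who prefers a clean setup. This post walks through a fully manual Nextcloud deployment.</p>]]>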
    </summary>
    <title>Deploying Nextcloud fully by hand on Fedora Server 39</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="gnome" scheme="https://www.insidentally.com/tags/gnome/"/>
    <category term="systemd" scheme="https://www.insidentally.com/tags/systemd/"/>
    <category term="背景壁纸" scheme="https://www.insidentally.com/tags/%E8%83%8C%E6%99%AF%E5%A3%81%E7%BA%B8/"/>
    <category term="定时任务" scheme="https://www.insidentally.com/tags/%E5%AE%9A%E6%97%B6%E4%BB%BB%E5%8A%A1/"/>
    <content>
      <![CDATA[<blockquote><p>本文参考：<br><a href="https://coda.world/gnome-wallpaper-slideshow/"> <em>定时替换 Gnome 壁纸</em> </a></p></blockquote><p>Gnome 的壁纸更换功能需要自己编写 2 个 xml 文件，xml 文件要手动将所有图片的地址写进去非常的麻烦。虽然 Gnome 下也有不少 Extensions 可以做到更换壁纸的效果，但是总体而言并不好用。</p><span id="more"></span><h3 id="换壁纸的思路">换壁纸的思路</h3><ol><li><p>使用 find 命令生成包含所有图片地址的列表。</p></li><li><p>从列表中随机挑选一张图片。</p></li><li><p>使用 gsettings 设置壁纸。</p></li><li><p>使用 systemd 定期执行脚本。</p></li></ol><h3 id="Bash-脚本">Bash 脚本</h3><p>首先写一个 Bash 脚本，实现更换壁纸的目的，同时为了响应速度和硬盘寿命着想，所有相关文件都保存在 <code>$XDG_RUNTIME_DIR</code>。</p><p><code>$XDG_RUNTIME_DIR</code> 是一个变量，后面将使用 systemd 传入你存放壁纸文件夹的路径这个变量。</p><h4 id="生成地址列表">生成地址列表</h4><p>查找 $1 下面的图片，并且生成列表到 $XDG_RUNTIME_DIR/bg_db，如果已经生成过不需要重复生成。</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">if [[ ! -f &quot;$&#123;XDG_RUNTIME_DIR&#125;/bg_db&quot; ]]; then</span><br><span class="line">    find &quot;$&#123;1&#125;&quot; \( -iname &#x27;*.jpg&#x27; -o -iname &#x27;*.jpeg&#x27; -o -iname &#x27;*.png&#x27; \) -print &gt; &quot;$&#123;XDG_RUNTIME_DIR&#125;/bg_db&quot;</span><br><span class="line">fi</span><br></pre></td></tr></table></figure><h4 id="随机挑选一张图片">随机挑选一张图片</h4><p>使用 shuf 命令挑选一张图片，如果文件不存在则重新挑选，注意如果上一步没有找到任何图片会陷入死循环。</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">while [[ ! 
-f &quot;$&#123;TARGET&#125;&quot; ]]; do</span><br><span class="line">    TARGET=$(shuf -n 1 &quot;$&#123;XDG_RUNTIME_DIR&#125;/bg_db&quot;)</span><br><span class="line">done</span><br></pre></td></tr></table></figure><p>假如需要跳过重复壁纸只需要每次挑选图片后删掉相应的行，然后在地址列表为空时重新扫描即可。</p><h4 id="设置壁纸">设置壁纸</h4><p>通过 gsettings 来进行设置。</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">gsettings set org.gnome.desktop.background picture-uri &quot;file://$&#123;TARGET&#125;/&quot; || true</span><br><span class="line">gsettings set org.gnome.desktop.background picture-uri-dark &quot;file://$&#123;TARGET&#125;&quot; || true</span><br></pre></td></tr></table></figure><p><code>picture-uri</code> 是亮色模式下的壁纸，<code>picture-uri-dark</code> 是暗色模式下的壁纸。我这里将两个模式下的壁纸换为同一个。</p><h4 id="合体后的脚本-background-sh">合体后的脚本 <a href="http://background.sh">background.sh</a></h4><p>运行这个脚本就可以自动更换壁纸了。</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">#</span><span class="language-bash">!/usr/bin/env bash</span></span><br><span class="line">set -Eeo pipefail</span><br><span class="line"></span><br><span class="line">if [[ ! 
-f &quot;$&#123;XDG_RUNTIME_DIR&#125;/bg_db&quot; ]]; then</span><br><span class="line">    find &quot;$&#123;1&#125;&quot; \( -iname &#x27;*.jpg&#x27; -o -iname &#x27;*.jpeg&#x27; -o -iname &#x27;*.png&#x27; \) -print &gt; &quot;$&#123;XDG_RUNTIME_DIR&#125;/bg_db&quot;</span><br><span class="line">fi</span><br><span class="line"></span><br><span class="line">while [[ ! -f &quot;$&#123;TARGET&#125;&quot; ]]; do</span><br><span class="line">    TARGET=$(shuf -n 1 &quot;$&#123;XDG_RUNTIME_DIR&#125;/bg_db&quot;)</span><br><span class="line">done</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">gsettings set org.gnome.desktop.background picture-uri &quot;file://$&#123;TARGET&#125;/&quot; || true</span><br><span class="line">gsettings set org.gnome.desktop.background picture-uri-dark &quot;file://$&#123;TARGET&#125;&quot; || true</span><br></pre></td></tr></table></figure><p>记得给这个文件添加可执行权限：<code>chmod +x background.sh</code>。</p><p>运行时记得传入你壁纸文件夹的路径： <code>background.sh ./your/background/PATH</code>。</p><h3 id="Systemd">Systemd</h3><p>使用 systemd --user 定期运行脚本。systemd 会给每个用户生成一个 systemd 实例，用户可以在这个实例下管理服务，启动、停止、启用以及禁用他们自己的单元。这个能力大大方便了那些通常在特定用户下运行的守护进程和服务。详细的玩法可以参见 <a href="https://wiki.archlinuxcn.org/wiki/Systemd/%E7%94%A8%E6%88%B7">ArchWiki</a></p><h4 id="background-service">background@.service</h4><p>创建文件 background@.service，内容如下：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">[Unit]</span><br><span class="line">Description=Select a random background from %I</span><br><span class="line"></span><br><span class="line">[Service]</span><br><span class="line">Type=oneshot</span><br><span class="line">ExecStart=%h/path/to/background.sh 
&quot;%I&quot;</span><br></pre></td></tr></table></figure><p>其中 %h 对于 systemd --user 而言相当于 $HOME 的作用，而 %I 则用于传递参数也就是存储的图片文件夹路径。</p><h4 id="background-timer">background@.timer</h4><p>创建文件 background@.service，内容如下：</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">[Unit]</span><br><span class="line">Description=Select a random background from %I every 1 min</span><br><span class="line">PartOf=graphical-session.target</span><br><span class="line"></span><br><span class="line">[Timer]</span><br><span class="line">OnStartupSec=1</span><br><span class="line">OnUnitActiveSec=5min</span><br><span class="line">AccuracySec=1s</span><br><span class="line"></span><br><span class="line">[Install]</span><br><span class="line">WantedBy=gnome-session.target</span><br></pre></td></tr></table></figure><p>使用 AccuracySec=1s 避免随机延迟，另外 PartOf 以及 WantedBy 确保了只有处于 Gnome 桌面环境时才会触发。</p><h3 id="启动-Timer">启动 Timer</h3><p>将 background@.service 和 background@.timer 放入 <code>~/.config/systemd/user/</code> 下，并启动 systemd 定时任务。</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">install -m640 background@.service background@.timer ~/.config/systemd/user/</span><br><span class="line">systemctl --user daemon-reload</span><br><span class="line">BGPATH=$(systemd-escape &quot;path/to/background/directory&quot;)</span><br><span class="line">systemctl --user enable --now 
&quot;background@$BGPATH.timer&quot;</span><br></pre></td></tr></table></figure><p>On the third line, replace path/to/background/directory with the absolute path of your wallpaper folder.</p><p>Note that the path after the @ must be escaped with systemd-escape.</p><h3 id="查看当前壁纸">Finding the current wallpaper</h3><p>Sometimes you want to know where the original image of the current wallpaper lives. Add the following line to <code>background.sh</code>, after TARGET has been chosen:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">echo &quot;$TARGET&quot; &gt; &quot;$XDG_RUNTIME_DIR/bg_path&quot;</span><br></pre></td></tr></table></figure><p>That enables a few other tricks:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Print the location of the current wallpaper</span></span><br><span class="line">cat &quot;$XDG_RUNTIME_DIR/bg_path&quot;</span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash">Open the original image in the desktop environment</span></span><br><span class="line">xdg-open &quot;$(cat &quot;$XDG_RUNTIME_DIR/bg_path&quot;)&quot;</span><br></pre></td></tr></table></figure>]]>
    </content>
    <id>https://www.insidentally.com/articles/000037/</id>
    <link href="https://www.insidentally.com/articles/000037/"/>
    <published>2024-02-04T08:34:28.000Z</published>
    <summary>
      <![CDATA[<blockquote>
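<p>References:<br>
<a href="https://coda.world/gnome-wallpaper-slideshow/"> <em>定时替换 Gnome 壁纸</em> </a></p>
</blockquote>
<p>GNOME's built-in wallpaper slideshow requires writing two XML files by hand, and listing the path of every image in them is tedious. GNOME also has a number of extensions that can rotate wallpapers, but overall they are not pleasant to use.</p>]]>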
    </summary>
    <title>Rotating the GNOME wallpaper on a timer with a shell script and systemd</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="字体" scheme="https://www.insidentally.com/tags/%E5%AD%97%E4%BD%93/"/>
    <category term="fontconfig" scheme="https://www.insidentally.com/tags/fontconfig/"/>
    <category term="等宽" scheme="https://www.insidentally.com/tags/%E7%AD%89%E5%AE%BD/"/>
    <content>
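      <![CDATA[<blockquote><p>References:<br><a href="https://catcat.cc/post/2021-03-07/"> <em>用 fontconfig 治理 Linux 中的字体</em> </a><br><a href="https://www.jinbuguo.com/gui/linux_fontconfig.html"> <em>Linux字体美化实战(Fontconfig配置)</em> </a><br><a href="https://blog.lilydjwg.me/2023/3/5/linux-fonts.216591.html"> <em>Linux 上的字体配置与故障排除</em> </a></p></blockquote><p>This post configures fonts on Linux with fontconfig. I share my own setup and try to handle the usual problems properly.</p><span id="more"></span><h3 id="字体的分类">Font categories</h3><p>There are thousands upon thousands of fonts, but what a computer displays mostly falls into the following three categories:</p><ol><li>monospace [fixed-width]</li></ol><p>Monospace fonts give every character the same width and are used where characters must align strictly, such as consoles and source code. Fonts whose characters vary in width are called proportional fonts (the other two categories are both proportional). For Chinese glyphs the distinction does not really apply, since all Chinese characters have the same width. A "monospace" Chinese font is one whose Latin part is monospaced, with two letters as wide as one Chinese character.</p><ol start="2"><li>sans-serif [without serifs]</li></ol><p>Fonts whose strokes have no decorations (serifs) at their ends, typically used for on-screen display. The Chinese 黑体 (Heiti) and 圆体 (Yuanti) styles belong here.</p><ol start="3"><li>serif [with serifs]</li></ol><p>Fonts whose strokes end in decorations (serifs), typically used for print. The Chinese 宋体 (Songti) and 仿宋 (Fangsong) styles belong here.</p><p>The font configuration in this post targets these three categories.</p><h3 id="选字体">Choosing fonts</h3><p>With the goal set, the next step is to pick fonts you like. For Chinese, more and more fonts are free for commercial use: besides the Google-led Noto family and the Adobe-led Source Han (思源) family, there are Alibaba PuHuiTi (阿里巴巴普惠体), OPPO Sans, HarmonyOS Sans, and MiSans.</p><p>For programming fonts there is far more to choose from, such as Source Code Pro, Consolas, and Menlo. In the end I went with the widely praised Fira Code.</p><p>To sum up, my choices are:</p><ul><li>System UI: MiSans</li><li>Sans-serif: DejaVu Sans for Latin, MiSans for Chinese</li><li>Serif: DejaVu Serif for Latin, 方正书宋 (FZShuSong) for Chinese</li><li>Monospace: Fira Code for Latin, 方正中等线 for Chinese</li></ul><p>Note that 方正书宋 and 方正中等线 are free for personal use only, not for commercial use; you need to obtain a license yourself from the <a href="https://www.foundertype.com/">FounderType website</a>. <a href="https://hyperos.mi.com/font/zh/">MiSans</a>, by contrast, is Xiaomi's free font.</p><p>For emoji I use Noto emoji, which Debian installs by default.</p><p>The much-loved icon font Nerd Fonts is also a must. I downloaded and use FiraCode Nerd Fonts.</p><p>At this point you may be wondering: how do you make Latin and Chinese text use different fonts?</p><p>On Windows, we can use a composite font, which bundles several fonts into one. Sarasa Gothic (更纱黑体), for example, combines Source Han Sans (思源黑体) with the Latin font Iosevka. The advantage is convenience: you just select the font and use it. The drawback is just as obvious: building the bundle is a hassle. Adding Iosevka means one build, and adding Nerd Fonts support means another. If someone else provides the composite font, you simply download it or install it from a package repository; building it yourself is a huge amount of work.</p><p>On Linux, we only need to configure fontconfig, and any combination we like becomes possible. Sounds pretty cool, doesn't it 😎. Unfortunately, some programs do not fully support fontconfig, so we do not always get the result we want. (Looking at you, Chrome 😠)</p><h3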
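id="fontconfig">fontconfig</h3><p>Before configuring anything, it helps to know a few basics. I only give a brief introduction here; for more depth, see the articles by <a href="https://catcat.cc/post/2021-03-07/">双猫</a> and <a href="https://www.jinbuguo.com/gui/linux_fontconfig.html">金步国</a>, which explain in detail how fontconfig works on Linux.</p><h4 id="字体的属性">Font properties</h4><p>Fonts have many properties; the common ones are family, slant, and weight. The latter two together are called the style.</p><p>The family is simply the font's name. One font file can provide several family names. For example, after a Debian user installs fonts-noto, the system gains font files such as NotoSans-Regular.ttf, and each file provides a set of names that all refer to the same font. The <code>fc-list</code> command-line tool that ships with fontconfig lists the installed fonts and their family names.</p><p>Slant is whether the font is slanted: Roman, Italic, or Oblique. Italic is a dedicated slanted design (closer to handwriting), while Oblique simply tilts the regular design.</p><p>Weight is even simpler: it is the thickness of the strokes. Common values include Regular, Normal, Medium, Bold, Semibold, Black, Thin, Light, and Extralight.</p><h4 id="通用字族名">Generic family names</h4><p>Most of the time a program does not care which exact font the user has. Listing the common fonts of every platform, the way many websites do in their CSS, is clumsy and error-prone. That is why "generic family names" exist: sans-serif (sans), serif, and monospace (mono). They are not real fonts; they tell the program to use a sans-serif, serif, or monospace font respectively. How does a desktop program know which actual fonts to use? It simply asks fontconfig. Because every such request goes through fontconfig's lookup before a font is used, we can control precisely which fonts programs end up with through the fontconfig configuration.</p><h4 id="如何调试">How to debug</h4><p>Set the environment variable FC_DEBUG=4, for example:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">FC_DEBUG=4 firefox</span><br></pre></td></tr></table></figure><p>fontconfig then prints debug information, including output like this:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">FcConfigSubstitute editPattern has 8 elts (size 16)</span><br><span class="line">family: &quot;sans-serif&quot;(w)</span><br><span class="line">pixelsize: 26(f)(s)</span><br><span class="line">antialias: True(w)</span><br><span class="line">hintstyle: 1(i)(w)</span><br><span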
class="line">rgba: 5(i)(w)</span><br><span class="line">lang: &quot;zh-CN&quot;(w)</span><br><span class="line">lcdfilter: 1(i)(w)</span><br><span class="line">prgname: &quot;firefox&quot;(s)</span><br></pre></td></tr></table></figure><p>Besides launching a program to watch its font lookups, we can also query fontconfig by hand. For example, to see which font monospace resolves to on this system, run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">FC_DEBUG=4 fc-match &#x27;monospace&#x27;</span><br></pre></td></tr></table></figure><p>The debug output is long; these are the parts worth reading:</p><p>First, Add Rule: the rules loaded from the configuration files. This includes the config file in your home directory, so you can check how it was parsed.</p><p>Second, after Add Rule comes the most important part, FcConfigSubstitute Pattern, which contains the font pattern. (s) and (w) denote strong and weak binding respectively; prgname is the program name, here fc-match. As for lang, no language was given to fc-match, so it defaults to en.</p><p>Next come many FcConfigSubstitute editPattern entries, each a substitution applied to the font pattern. An editPattern only runs when its rule matches, that is, when the Rule Set does not report No match. How should an editPattern be read? Mainly look at family, because family gives the order in which fonts are matched. These entries correspond to the <code>&lt;edit&gt;</code> operations inside <code>&lt;match target=&quot;pattern&quot;&gt;</code> in the config file.</p><p>Finally, look at FcConfigSubstitute donePattern, the result after fontconfig has applied all substitutions.</p><h4 id="配置文件">The configuration file</h4><p>The whole configuration file is made up of the following parts, in order:</p><ul><li>Directory settings (<code>&lt;dir&gt;</code>,<code>&lt;cachedir&gt;</code>,<code>&lt;include&gt;</code>)</li><li>Miscellaneous settings (<code>&lt;config&gt;</code>)</li><li>Scan phase (<code>&lt;match target=&quot;scan&quot;&gt;</code>)</li><li>Match phase (<code>&lt;alias&gt;</code>, <code>&lt;match target=&quot;pattern&quot;&gt;</code>)</li><li>Render phase (<code>&lt;match target=&quot;font&quot;&gt;</code>)</li></ul><p>To get the effect of a composite font, the simplest idea, and the one this post is built on, is to have programs use a generic font family name instead of a specific font. For example, have programs use sans-serif, the default sans-serif font.</p><p>The part we care about is the fourth one, the match phase. The fontconfig configuration looks like this:</p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span
class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;pattern&quot;</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">test</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>sans-serif<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;prepend&quot;</span> <span class="attr">binding</span>=<span class="string">&quot;strong&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>MiSans<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>DejaVu Sans Book<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>Noto Color Emoji<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">match</span>&gt;</span></span><br></pre></td></tr></table></figure><p>With this kind of font stack, programs try the fonts in the following order:</p><blockquote><p>MiSans -&gt; DejaVu Sans Book -&gt; Noto Color Emoji</p></blockquote><p>Here,
<code>&lt;test&gt;</code> is the condition check, <code>mode=&quot;prepend&quot;</code> inserts the fonts in front of the existing list, and <code>binding=&quot;strong&quot;</code> requests strong binding.</p><h4 id="开始配置">Writing the configuration</h4><p>The idea is to redefine the default generic families so that they resolve to the fonts we want, then set the font in every program to a generic family name: sans-serif, serif, or monospace.</p><p>For a single user the config file is ~/.config/fontconfig/fonts.conf; for all users it is /etc/fonts/local.conf.</p><p>Set the default fonts:</p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">&lt;!-- Default system-ui fonts --&gt;</span></span><br><span
class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;pattern&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">test</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>system-ui<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;prepend&quot;</span> <span class="attr">binding</span>=<span class="string">&quot;strong&quot;</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>sans-serif<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">match</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="comment">&lt;!-- Default sans-serif fonts--&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;pattern&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">test</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>sans-serif<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span 
class="line">    <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;prepend&quot;</span> <span class="attr">binding</span>=<span class="string">&quot;strong&quot;</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>MiSans<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>DejaVu Sans Book<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>Noto Color Emoji<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">match</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="comment">&lt;!-- Default serif fonts--&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;pattern&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">test</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>serif<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">edit</span> <span 
class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;prepend&quot;</span> <span class="attr">binding</span>=<span class="string">&quot;strong&quot;</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>方正书宋_GBK<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>FZShuSong<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>DejaVu Serif Book<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>Noto Color Emoji<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">match</span>&gt;</span></span><br><span class="line"></span><br><span class="line"><span class="comment">&lt;!-- Default monospace fonts--&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;pattern&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">test</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>monospace<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">edit</span> 
<span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;prepend&quot;</span> <span class="attr">binding</span>=<span class="string">&quot;strong&quot;</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>FiraCode Nerd Font<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>方正中等线_GBK<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">        <span class="tag">&lt;<span class="name">string</span>&gt;</span>Noto Color Emoji<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">match</span>&gt;</span></span><br></pre></td></tr></table></figure><p>This sets the preferred fonts for system-ui, sans-serif, serif, and monospace. Here I make system-ui default to sans-serif. Note that the system-ui rule must come first: fontconfig applies operations on the font pattern in order, so system-ui has to be mapped to sans-serif before the sans-serif rule runs.</p><h4 id="覆盖西文字体">Overriding Latin glyphs</h4><p>If you look at the Chinese font Noto Sans CJK, you will find that its Latin glyphs actually differ from those of Noto Sans, even though both carry the Noto name. The Latin characters bundled with a Chinese font can be quite poor; SimHei (中易黑体), which ships with Windows, is a notable example. Microsoft YaHei (微软雅黑) also offers very few weights, which is unfriendly to designers, whereas popular Latin fonts come in many weights.</p><p>The rule below puts DejaVu Sans Book ahead of MiSans only when the requested language is English.</p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span
class="comment">&lt;!-- Replace english fonts--&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;pattern&quot;</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">test</span> <span class="attr">name</span>=<span class="string">&quot;lang&quot;</span> <span class="attr">compare</span>=<span class="string">&quot;contains&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>en<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">test</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">compare</span>=<span class="string">&quot;contains&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>MiSans<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;prepend&quot;</span> <span class="attr">binding</span>=<span class="string">&quot;strong&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>DejaVu Sans Book<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span 
class="name">match</span>&gt;</span></span><br></pre></td></tr></table></figure><h4 id="浏览器字体问题">Browser font problems</h4><p>Some programs, mainly browsers, use only the first font in the font pattern result. Chrome (and other Chromium-based browsers) is one: it accepts the Latin font we specify but ignores the Chinese font right after it, even with strong binding! The Chinese text then falls back to who knows what, which may or may not be the font you wanted. We can write a rule that targets a specific program.</p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">&lt;!-- Replace browser fonts --&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;pattern&quot;</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">test</span> <span class="attr">name</span>=<span class="string">&quot;prgname&quot;</span> <span class="attr">compare</span>=<span class="string">&quot;not_eq&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>msedge<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">test</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">compare</span>=<span class="string">&quot;contains&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>MiSans<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span
class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;prepend&quot;</span> <span class="attr">binding</span>=<span class="string">&quot;strong&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>FiraCode Nerd Font<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">match</span>&gt;</span></span><br></pre></td></tr></table></figure><p>The rule names msedge because I normally use Edge rather than Chrome. Unfortunately, Chromium has so many small problems on Linux that it is better to just stick with Firefox.</p><p>In every case except when the program name is msedge, Fira Code is preferred for Latin text and MiSans then handles Chinese. I cannot get msedge to use Fira Code, but it will at least use MiSans for Chinese.</p><h4 id="替换任意字体">Replacing arbitrary fonts</h4><p>What if the system already has fonts you do not need, but you do not want to delete or block them? Just replace them in the font pattern.</p><p>Here I replace the ordinary 宋体 and 新宋体 (SimSun and NSimSun) with 方正书宋.</p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;pattern&quot;</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">test</span> <span
class="attr">qual</span>=<span class="string">&quot;any&quot;</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>宋体<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span> <span class="attr">binding</span>=<span class="string">&quot;same&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>方正书宋_GBK<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">match</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;pattern&quot;</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">test</span> <span class="attr">qual</span>=<span class="string">&quot;any&quot;</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>新宋体<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">test</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;family&quot;</span> <span 
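class="attr">mode</span>=<span class="string">&quot;assign&quot;</span> <span class="attr">binding</span>=<span class="string">&quot;same&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">string</span>&gt;</span>方正书宋_GBK<span class="tag">&lt;/<span class="name">string</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">match</span>&gt;</span></span><br></pre></td></tr></table></figure><p>Font rendering parameters:</p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">&lt;!--rendering options--&gt;</span></span><br><span class="line"><span class="tag">&lt;<span class="name">match</span> <span class="attr">target</span>=<span class="string">&quot;font&quot;</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;autohint&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">bool</span>&gt;</span>false<span class="tag">&lt;/<span class="name">bool</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;hinting&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">bool</span>&gt;</span>true<span class="tag">&lt;/<span class="name">bool</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;hintstyle&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">const</span>&gt;</span>hintslight<span class="tag">&lt;/<span class="name">const</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;antialias&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">bool</span>&gt;</span>true<span class="tag">&lt;/<span class="name">bool</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;lcdfilter&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span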
class="name">bool</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;hinting&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">bool</span>&gt;</span>true<span class="tag">&lt;/<span class="name">bool</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;hintstyle&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">const</span>&gt;</span>hintslight<span class="tag">&lt;/<span class="name">const</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;antialias&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">bool</span>&gt;</span>true<span class="tag">&lt;/<span class="name">bool</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;lcdfilter&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span 
class="name">const</span>&gt;</span>lcddefault<span class="tag">&lt;/<span class="name">const</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;<span class="name">edit</span> <span class="attr">name</span>=<span class="string">&quot;rgba&quot;</span> <span class="attr">mode</span>=<span class="string">&quot;assign&quot;</span>&gt;</span></span><br><span class="line">    <span class="tag">&lt;<span class="name">const</span>&gt;</span>rgb<span class="tag">&lt;/<span class="name">const</span>&gt;</span></span><br><span class="line">  <span class="tag">&lt;/<span class="name">edit</span>&gt;</span></span><br><span class="line"><span class="tag">&lt;/<span class="name">match</span>&gt;</span></span><br></pre></td></tr></table></figure><p>这里主要设置了一些字体的渲染方式：</p><p>autohint：优先使用内嵌微调<br>hinting：开启微调<br>hintstyle：微调的程度，轻微<br>antialias：开启抗锯齿功能<br>lcdfilter：LCD filter 的风格，默认<br>rgba：LCD 子像素的排列顺序，rgb<br>这里就直接抄作业了。</p><h3 id="不能解决的问题">不能解决的问题</h3><p>Linux 不强迫程序必须使用特定的依赖，而是程序主动选择了约定俗成的依赖。老话重谈，程序可以自由选择完全遵守 fontconfig，也可以选择部分使用 fontconfig 的配置，或者完全不遵守它。这也导致了对一些程序无法实现字体的修改。以及上面提到的 chrome 对 fontconfig 并不是很好，或许面对这种程序，就需要合成字体的出场了。</p>]]>
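      <![CDATA[<p>For programs that do follow fontconfig, the rendering options above can be checked from a terminal. The commands below are a minimal sketch: <code>sans-serif</code> is only an example pattern, and the exact output format depends on the fontconfig version.</p><figure class="highlight shell"><table><tr><td class="code"><pre><span class="line"># Rebuild the font cache so that edited configuration files are picked up</span><br><span class="line">fc-cache -f</span><br><span class="line"># Print the matched font in verbose form and keep only the rendering properties</span><br><span class="line">fc-match -v sans-serif | grep -E 'autohint|hinting|hintstyle|antialias|lcdfilter|rgba'</span><br></pre></td></tr></table></figure><p>If the listed properties do not reflect the configured values, the configuration file is probably not being read, or a later rule overrides it.</p>]]>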
    </content>
    <id>https://www.insidentally.com/articles/000036/</id>
    <link href="https://www.insidentally.com/articles/000036/"/>
    <published>2024-01-31T08:34:28.000Z</published>
    <summary>
      <![CDATA[<blockquote>
<p>References for this article:<br>
<a href="https://catcat.cc/post/2021-03-07/"> <em>用 fontconfig 治理 Linux 中的字体</em> </a><br>
<a href="https://www.jinbuguo.com/gui/linux_fontconfig.html"> <em>Linux字体美化实战(Fontconfig配置)</em> </a><br>
<a href="https://blog.lilydjwg.me/2023/3/5/linux-fonts.216591.html"> <em>Linux 上的字体配置与故障排除</em> </a></p>
</blockquote>
<p>Configuring fonts on Linux with fontconfig. I share my own configuration, which tries to handle the various problems properly.</p>]]>
    </summary>
    <title>Font configuration on Linux</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="VirtualBox" scheme="https://www.insidentally.com/tags/VirtualBox/"/>
    <category term="虚拟机" scheme="https://www.insidentally.com/tags/%E8%99%9A%E6%8B%9F%E6%9C%BA/"/>
    <category term="静态 ip" scheme="https://www.insidentally.com/tags/%E9%9D%99%E6%80%81-ip/"/>
    <content>
      <![CDATA[<p>VirtualBox now supports headless start, so we usually reach the virtual machine over ssh directly from the host. But if the host frequently moves between networks and has no fixed IP address (for example, a laptop carried between the office and the bedroom), the virtual machine's IP address keeps changing too, which makes ssh access a nuisance. This article shows how to give a VirtualBox virtual machine an IP address that stays fixed relative to the host.</p><span id="more"></span><p>The host in this article runs Windows 11, the virtualization software is VirtualBox 7.0, and the guest runs Debian 13. The approach should normally work for all operating systems.</p>
<h3 id="网络连接种类">Network connection types</h3><p>VirtualBox offers many network connection modes, but these three are the ones we use most:</p><ol><li>Network Address Translation (NAT);</li><li>Bridged Adapter;</li><li>Host-Only network.</li></ol><img src="/images/000035/01.png" width="600" alt='VirtualBox network connection modes' align=center /><p>Each mode has its own characteristics; the main ones are listed below.</p><table><thead><tr><th>Mode</th><th>Characteristics</th></tr></thead><tbody><tr><td>Network Address Translation (NAT)</td><td>The virtual machine can reach external networks, but external networks cannot reach the virtual machine</td></tr><tr><td>Bridged Adapter</td><td>The virtual machine fully shares the host's network; when the host's network changes, the virtual machine follows and its IP changes too</td></tr><tr><td>Host-Only network</td><td>The host can reach the virtual machine and the web services on it, but the virtual machine cannot reach external networks</td></tr></tbody></table>
<h3 id="选择网络连接方式">Choosing a connection mode</h3><p>As described above, there are several modes to choose from.</p><p>The simplest choice is a bridged adapter, which shares the host's network directly; host and virtual machine can reach each other without trouble.</p><p>However, neither home nor office networks usually give us a fixed IP. The host's address can change at any time and so does the virtual machine's, which is inconvenient. We want the virtual machine's IP to stay fixed so that connecting to it and using its services is easy.</p><p>So the final choice is a combination: NAT + Host-Only network.</p>
<h3 id="操作步骤">Steps</h3><h4 id="新增-仅主机-Host-Only-网络">Add a Host-Only network</h4><p>In VirtualBox, choose: File =&gt; Tools =&gt; Network Manager</p><img src="/images/000035/02.png" width="600" alt='VirtualBox Network Manager' align=center /><p>Click Create to add a Host-Only network. In the adapter settings below, choose to configure the adapter manually and enter an IP address of your choice with the correct subnet mask (this article uses 192.168.56.1 with subnet mask 255.255.255.0), then click Apply. Leave the DHCP server disabled.</p><p>The network then shows up on the Windows host under Settings =&gt; Network &amp; Internet =&gt; Advanced network settings as VirtualBox Host-Only Ethernet Adapter.</p><img src="/images/000035/03.png" width="600" alt='VirtualBox Host-Only Ethernet Adapter' align=center /><p>The IP of the Host-Only network is now configured.</p>
<h4 id="配置虚拟机网络">Configure the virtual machine's network</h4><p>Select the installed Debian virtual machine: Settings =&gt; Network.</p><p>Adapter 1: enable the network adapter and attach it to NAT.</p><img src="/images/000035/04.png" width="600" alt='Adapter 1 configuration' align=center /><p>Adapter 2: enable the network adapter and attach it to the Host-Only network.</p><img src="/images/000035/05.png" width="600" alt='Adapter 2 configuration' align=center /><p>The order of the adapters here matters: it determines the order in which the virtual machine detects the two network cards.</p>
<h4 id="虚拟机内部设置">Settings inside the virtual machine</h4><p>Boot Debian.</p><p>Run</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">ip address</span><br></pre></td></tr></table></figure><p>You will see output similar to the following:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">1: lo: &lt;LOOPBACK,UP,LOWER_UP&gt; mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000</span><br><span class="line">    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00</span><br><span class="line">    inet 127.0.0.1/8 scope host lo</span><br><span class="line">       valid_lft forever preferred_lft forever</span><br><span class="line">    inet6 ::1/128 scope host noprefixroute</span><br><span class="line">       valid_lft forever preferred_lft forever</span><br><span class="line">2: enp0s3: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc fq_codel state UP group default qlen 1000</span><br><span class="line">    link/ether 08:00:27:6e:2d:62 brd ff:ff:ff:ff:ff:ff</span><br><span class="line">    inet 10.0.2.15/24 brd 10.0.2.255 scope global dynamic enp0s3</span><br><span class="line">       valid_lft 77591sec preferred_lft 77591sec</span><br><span class="line">    inet6 fe80::a00:27ff:fe6e:2d62/64 scope link proto kernel_ll</span><br><span class="line">       valid_lft forever preferred_lft forever</span><br><span class="line">3: enp0s8: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc fq_codel state UP group default qlen 1000</span><br><span class="line">    link/ether 08:00:27:ad:06:3c brd ff:ff:ff:ff:ff:ff</span><br></pre></td></tr></table></figure><p>Adapter 1 is detected as enp0s3 and has obtained an IP address automatically (this is usually set up during installation; if it was not, you need to configure it by hand). Adapter 2 is detected as enp0s8 and currently has no IP address.</p><p>We need to edit the network configuration file by hand. It is usually <code>/etc/network/interfaces</code>. Use whichever editor you prefer.</p><p>Edit the file following the example below. Note that adapter 2 (enp0s8) must be set to static, with an address in the same subnet as the Host-Only network but different from the Host-Only network's own address. Since we set the Host-Only network's IP to 192.168.56.1 above, we can set address to 192.168.56.2 here. Also, do not set a <strong>gateway</strong> for adapter 2, otherwise the virtual machine will lose its internet access.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"># This file describes the network interfaces available on your system</span><br><span class="line"># and how to activate them. For more information, see interfaces(5).</span><br><span class="line"></span><br><span class="line">source /etc/network/interfaces.d/*</span><br><span class="line"></span><br><span class="line"># The loopback network interface</span><br><span class="line">auto lo</span><br><span class="line">  iface lo inet loopback</span><br><span class="line"></span><br><span class="line"># The primary network interface (adapter 1)</span><br><span class="line">allow-hotplug enp0s3</span><br><span class="line">  iface enp0s3 inet dhcp</span><br><span class="line"></span><br><span class="line"># The second network interface (adapter 2)</span><br><span class="line">allow-hotplug enp0s8</span><br><span class="line">  iface enp0s8 inet static</span><br><span class="line">    address 192.168.56.2</span><br><span class="line">    netmask 255.255.255.0</span><br></pre></td></tr></table></figure><p>After editing, restart networking:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">systemctl restart networking</span><br></pre></td></tr></table></figure><p>Now run <code>ip address</code> to check the interfaces: adapter 2 (enp0s8) has a fixed IP, and <code>ping www.baidu.com</code> confirms that the virtual machine can still reach the internet.</p><p>The virtual machine is now set up, and we can happily manage it over ssh.</p>]]>
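      <![CDATA[<p>With the address fixed, an entry in the host's ssh client configuration saves typing it every time. The fragment below is a minimal sketch for <code>~/.ssh/config</code> (on Windows, the <code>.ssh\config</code> file in your user profile). It assumes the guest address 192.168.56.2 used in this article; the alias <code>debian-vm</code> and the user name <code>user</code> are placeholders, not values from the original setup.</p><figure class="highlight plaintext"><table><tr><td class="code"><pre><span class="line"># Hypothetical alias for the virtual machine on the Host-Only network</span><br><span class="line">Host debian-vm</span><br><span class="line">    HostName 192.168.56.2</span><br><span class="line">    User user</span><br></pre></td></tr></table></figure><p>After that, <code>ssh debian-vm</code> should connect to the virtual machine regardless of which network the host is on.</p>]]>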
    </content>
    <id>https://www.insidentally.com/articles/000035/</id>
    <link href="https://www.insidentally.com/articles/000035/"/>
    <published>2023-10-19T08:34:28.000Z</published>
    <summary>
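      <![CDATA[<p>VirtualBox now supports headless start, so we usually reach the virtual machine over ssh directly from the host. But if the host frequently moves between networks and has no fixed IP address (for example, a laptop carried between the office and the bedroom), the virtual machine's IP address keeps changing too, which makes ssh access a nuisance. This article shows how to give a VirtualBox virtual machine an IP address that stays fixed relative to the host.</p>]]>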
    </summary>
    <title>Setting a static IP for a VirtualBox virtual machine for easy access from the host</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="Fedora" scheme="https://www.insidentally.com/tags/Fedora/"/>
    <category term="Nvidia" scheme="https://www.insidentally.com/tags/Nvidia/"/>
    <category term="内核模块" scheme="https://www.insidentally.com/tags/%E5%86%85%E6%A0%B8%E6%A8%A1%E5%9D%97/"/>
    <category term="Secure Boot" scheme="https://www.insidentally.com/tags/Secure-Boot/"/>
    <category term="安全启动" scheme="https://www.insidentally.com/tags/%E5%AE%89%E5%85%A8%E5%90%AF%E5%8A%A8/"/>
    <content>
      <![CDATA[<p>Newly shipped computers now have <strong>Secure Boot</strong> enabled by default in UEFI. Secure Boot guards against malware: if the boot loader has been modified by a virus, the firmware warns you and refuses to boot, preventing further damage. It also stops some Linux kernels that are not signed by Microsoft from booting. You can simply turn Secure Boot off in the firmware settings to avoid the whole problem, but the minimum hardware requirements Microsoft recently published for Windows 11 show that Microsoft treats Secure Boot as increasingly important. If your machine dual-boots Windows and Linux, it is better to make Linux itself work with Secure Boot.</p><span id="more"></span><p>Fedora, one of the most pleasant distributions to use, leans towards open-source drivers. Fedora itself supports Secure Boot, but when you install the official NVIDIA driver from rpmfusion, the driver's kernel modules are unsigned. Because Secure Boot verifies signatures during Linux boot, those modules are refused and the graphics card is not driven properly. Those who have used Ubuntu will know that, with Secure Boot enabled, the Ubuntu installer automatically signs the NVIDIA kernel modules with a self-signed key and enrolls that key into the MOK list (the Machine Owner Key list, that is, the keys trusted by the machine's owner) during boot. Fedora only guarantees that its own kernel is validly signed and does nothing about third-party kernel modules from rpmfusion, so the NVIDIA driver fails to load.</p><p>This article shows how to sign the NVIDIA kernel modules automatically on Fedora.</p>
<h3 id="准备工作">Preparation</h3><p>Before Fedora 36, signing kernel modules automatically the way Ubuntu does was somewhat difficult. From that release on, it takes only a few simple steps.</p><p>Before starting, confirm that the following prerequisites are met:</p><ol><li>Secure Boot is enabled;</li><li>The NVIDIA driver is not installed yet (<strong>critical</strong>: if you have already installed the proprietary NVIDIA driver, you may need to reinstall the system);</li><li>Fedora 36 or later is installed.</li></ol><p>This guide draws mainly on the following sources:</p><ol><li><a href="https://rpmfusion.org/Howto/NVIDIA">rpmfusion's official NVIDIA documentation</a></li><li><a href="https://rpmfusion.org/Howto/Secure%20Boot">rpmfusion's official Secure Boot documentation</a></li><li><a href="https://blog.monosoul.dev/2022/05/17/automatically-sign-nvidia-kernel-module-in-fedora-36/">Andrei Nevedomskii's blog tutorial</a></li></ol><p>Readers who want more than this article offers can study them further.</p>
<h3 id="具体步骤">Steps</h3><h4 id="1-安装自动签名所需的工具">1. Install the tools needed for automatic signing</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install kmodtool akmods mokutil openssl</span><br></pre></td></tr></table></figure><h4 id="2-生成签名密钥">2. Generate a signing key</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo kmodgenca -a</span><br></pre></td></tr></table></figure><p>This command generates the key under <code>/etc/pki/akmods/certs/</code>; when it runs correctly it prints nothing.</p><h4 id="3-启动密钥注册">3. Start key enrollment</h4><p>This makes the Linux kernel trust drivers signed with your key.</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo mokutil --import /etc/pki/akmods/certs/public_key.der</span><br></pre></td></tr></table></figure><p>You will be asked to enter a password. Remember it; you will need it again in step 5.</p><h4 id="4-重启以注册密钥">4. Reboot to enroll the key</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo reboot</span><br></pre></td></tr></table></figure><h4 id="5-注册密钥">5. Enroll the key</h4><p>After the reboot you will see the blue MOK manager screen. Do not panic; follow the steps below to enroll the key.</p><blockquote><p>If you have installed the NVIDIA driver on Ubuntu with Secure Boot enabled, you may have seen this screen before.</p></blockquote><ol><li>First, press any key promptly to enter MOK management (if you do not enter MOK management in time, the system will reboot).</li></ol><img src="/images/000034/01.png" width="600" alt='MOK management screen 1' align=center /><ol start="2"><li><p>Next, select “Enroll MOK”.</p></li><li><p>Then select “Continue”.</p></li><li><p>Select “Yes”, type the password from step 3, and press Enter (<strong>the password is not shown in the input field; just type it and press Enter</strong>).</p></li></ol><img src="/images/000034/02.png" width="600" alt='MOK management screen 5' align=center /><ol start="5"><li>The key is now enrolled. Select “Reboot” and the machine will restart once more.</li></ol><img src="/images/000034/03.png" width="600" alt='MOK management screen 6' align=center /><h4 id="6-安装英伟达驱动程序">6. Install the NVIDIA driver</h4><p>Now just install the NVIDIA driver as usual.</p><blockquote><p>You need to set up the rpmfusion repositories beforehand. See the <a href="https://mirrors.tuna.tsinghua.edu.cn/help/rpmfusion/">guide</a> from the Tsinghua open-source mirror site.</p></blockquote><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install gcc kernel-headers kernel-devel akmod-nvidia xorg-x11-drv-nvidia xorg-x11-drv-nvidia-libs</span><br></pre></td></tr></table></figure><h4 id="7-确保内核模块已编译">7. Make sure the kernel modules are built</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo akmods --force</span><br></pre></td></tr></table></figure><h4 id="8-确保启动镜像也已更新">8. Make sure the boot image is updated as well</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dracut --force</span><br></pre></td></tr></table></figure><h4 id="9-重启设备">9. Reboot the machine</h4><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo reboot</span><br></pre></td></tr></table></figure>
<h3 id="确认是否成功">Verify that it worked</h3><p>After the reboot, run the following command to confirm that the driver is loaded:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">lsmod | grep -i nvidia</span><br></pre></td></tr></table></figure><p>If you get output similar to the following, congratulations: everything went well and you are all set!</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">lsmod | grep -i nvidia</span></span><br><span class="line"></span><br><span class="line">nvidia_drm             94208  2</span><br><span class="line">nvidia_modeset       1560576  2 nvidia_drm</span><br><span class="line">nvidia_uvm           3493888  0</span><br><span class="line">nvidia              62517248  118 nvidia_uvm,nvidia_modeset</span><br><span class="line">video                  73728  3 asus_wmi,i915,nvidia_modeset</span><br></pre></td></tr></table></figure><p>You can now happily use your NVIDIA graphics card with Secure Boot enabled.</p><p>If you use Debian, see Debian's <a href="https://wiki.debian.org/SecureBoot#MOK_-_Machine_Owner_Key">official guide</a> to Secure Boot. I have not tried this on Debian, so I cannot offer help there.</p><p>I hope this article helps.</p>]]>
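      <![CDATA[<p>As a complement to the <code>lsmod</code> check above, the signature chain itself can be inspected. The commands below are a sketch that assumes the certificate path used in step 3; the exact wording of the output differs between systems, so none is shown here.</p><figure class="highlight shell"><table><tr><td class="code"><pre><span class="line"># Report whether Secure Boot is currently enabled</span><br><span class="line">mokutil --sb-state</span><br><span class="line"># Check whether the akmods certificate is already enrolled in the MOK list</span><br><span class="line">mokutil --test-key /etc/pki/akmods/certs/public_key.der</span><br><span class="line"># Show who signed the installed nvidia kernel module</span><br><span class="line">modinfo -F signer nvidia</span><br></pre></td></tr></table></figure><p>If the last command prints nothing, the module is likely unsigned, in which case rebuilding it with <code>sudo akmods --force</code> after the key has been enrolled is the first thing to try.</p>]]>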
    </content>
    <id>https://www.insidentally.com/articles/000034/</id>
    <link href="https://www.insidentally.com/articles/000034/"/>
    <published>2023-07-20T05:45:48.000Z</published>
    <summary>
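      <![CDATA[<p>Newly shipped computers now have <strong>Secure Boot</strong> enabled by default in UEFI. Secure Boot guards against malware: if the boot loader has been modified by a virus, the firmware warns you and refuses to boot, preventing further damage. It also stops some Linux kernels that are not signed by Microsoft from booting. You can simply turn Secure Boot off in the firmware settings to avoid the whole problem, but the minimum hardware requirements Microsoft recently published for Windows 11 show that Microsoft treats Secure Boot as increasingly important. If your machine dual-boots Windows and Linux, it is better to make Linux itself work with Secure Boot.</p>]]>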
    </summary>
    <title>Installing the NVIDIA driver on Fedora with Secure Boot enabled</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="摄影摄像" scheme="https://www.insidentally.com/categories/%E6%91%84%E5%BD%B1%E6%91%84%E5%83%8F/"/>
    <category term="日喀则" scheme="https://www.insidentally.com/tags/%E6%97%A5%E5%96%80%E5%88%99/"/>
    <category term="旅游" scheme="https://www.insidentally.com/tags/%E6%97%85%E6%B8%B8/"/>
    <category term="曲登尼玛" scheme="https://www.insidentally.com/tags/%E6%9B%B2%E7%99%BB%E5%B0%BC%E7%8E%9B/"/>
    <category term="冰川" scheme="https://www.insidentally.com/tags/%E5%86%B0%E5%B7%9D/"/>
    <content>
      <![CDATA[<p>The full name of the Qudengnima Glacier (曲登尼玛冰川), “Duoji Qudengnima”, means “Diamond Sun Pagoda”. It lies in Gamba County, Shigatse. Some other glaciers may strike you as nothing more than a dark, grimy block of ice, but Qudengnima lives up to everything you imagine a glacier to be: crystal clear, glowing blue, approachable up close, with sacred lakes below it where glacier ice floats all year round, a monastery nearby, pilgrims, and many 8000 m+ peaks in the distance. It is the most typical sketch of Tibetan scenery.</p><span id="more"></span><p>To reach the glacier you hike about 3 km, which takes roughly an hour and a half. There are no stone steps and no paved surface; the trail was simply worn by the people walking it. It is narrow and winding, with constant ups and downs.</p><p>The hike costs energy, time, and patience, but the stunning scenery will not let you down!</p>
<div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/01.jpg","alt":"Hiking trail to Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/02.jpg","alt":"An eagle at Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/03.jpg","alt":"Pigeons below the West Sacred Lake at Qudengnima Glacier","title":""}]</div>  </div><p>Below the West Sacred Lake there is a row of simple huts and pools for bathing; it is said that bathing in this water washes away a lifetime of sins.</p>
<div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/04.jpg","alt":"Ice spring at Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/05.jpg","alt":"Snow-covered trail below Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/06.jpg","alt":"Hiking trail below Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/07.jpg","alt":"Cliffs on both sides of Qudengnima Glacier","title":""}]</div>  </div><p>The sacred lakes below Qudengnima Glacier are all “vision lakes”: by gazing at the surface, an observer is said to see the place where they will be reborn after death.</p>
<div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/08.jpg","alt":"West Sacred Lake at Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/09.jpg","alt":"Hiking trail along the West Sacred Lake at Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/10.jpg","alt":"West Sacred Lake at Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/11.jpg","alt":"Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/12.jpg","alt":"Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/13.jpg","alt":"Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/14.jpg","alt":"Ice pinnacles beside Qudengnima Glacier","title":""}]</div>  </div><div class="gallery-container" data-type="data" data-button="" data-limit="10" data-first="10">    <div class="gallery-items">[{"url":"/images/000033/15.jpg","alt":"Qudengnima Glacier","title":""}]</div>  </div>]]>
    </content>
    <id>https://www.insidentally.com/articles/000033/</id>
    <link href="https://www.insidentally.com/articles/000033/"/>
    <published>2023-05-02T15:45:48.000Z</published>
    <summary>
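      <![CDATA[<p>The full name of the Qudengnima Glacier (曲登尼玛冰川), “Duoji Qudengnima”, means “Diamond Sun Pagoda”. It lies in Gamba County, Shigatse. Some other glaciers may strike you as nothing more than a dark, grimy block of ice, but Qudengnima lives up to everything you imagine a glacier to be: crystal clear, glowing blue, approachable up close, with sacred lakes below it where glacier ice floats all year round, a monastery nearby, pilgrims, and many 8000 m+ peaks in the distance. It is the most typical sketch of Tibetan scenery.</p>]]>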
    </summary>
    <title>Qudengnima Glacier</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="医学探索" scheme="https://www.insidentally.com/categories/%E5%8C%BB%E5%AD%A6%E6%8E%A2%E7%B4%A2/"/>
    <category term="人工神经网络" scheme="https://www.insidentally.com/tags/%E4%BA%BA%E5%B7%A5%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C/"/>
    <category term="机器学习" scheme="https://www.insidentally.com/tags/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/"/>
    <category term="PyTorch" scheme="https://www.insidentally.com/tags/PyTorch/"/>
    <category term="神经生物学" scheme="https://www.insidentally.com/tags/%E7%A5%9E%E7%BB%8F%E7%94%9F%E7%89%A9%E5%AD%A6/"/>
    <content>
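      <![CDATA[<blockquote><p>This article is based on:<br><a href="https://doi.org/10.1016/j.neuron.2020.09.005"> <em>Artificial Neural Networks for Neuroscientists: A Primer</em> </a></p></blockquote><p>Artificial neural networks (ANNs) are essential tools in machine learning that have drawn increasing attention in neuroscience. Besides offering powerful techniques for data analysis, ANNs give neuroscientists a new way to build models of complex behaviors, heterogeneous neural activity, and circuit connectivity, and to explore optimization in neural systems, in ways that traditional models are not designed for. In this article, we introduce ANNs and show how they have been fruitfully used to study neuroscientific questions. We first discuss the basic concepts and methods of ANNs. Then, with a focus on bringing this mathematical framework closer to neurobiology, we detail how to customize the analysis, structure, and learning of ANNs to better address a wide range of challenges in brain research. To help readers gain hands-on experience, this article is accompanied by tutorial-style code in PyTorch and Jupyter Notebook, covering the major topics.</p>
<h2 id="1-神经科学中的人工神经网络">1. Artificial neural networks in neuroscience</h2><p>Learning with artificial neural networks (ANNs), or deep learning, has become the dominant framework in machine learning today, leading to breakthroughs across a wide range of applications, including computer vision, natural language processing, and strategy games. Some key ideas in this field can be traced back to brain research: supervised learning rules have their roots in the theory of training perceptrons, which in turn was inspired by the brain, and hierarchical architectures and convolutional principles are closely linked to what we know about the primate visual system. Today, the exchange of ideas from neuroscience to artificial intelligence continues.</p><p>At the same time, machine learning provides new and powerful tools for systems neuroscience. One use of the deep learning framework is to analyze neuroscientific data (Figure 1). Indeed, advances in computer vision, convolutional neural networks in particular, have revolutionized the processing of image and video data. For instance, uninstructed behaviors over time, such as the micro-movements of animals in a laboratory experiment, can now be tracked and quantified efficiently with the help of deep neural networks. Innovative neurotechnologies keep producing large volumes of data in brain connectomics, transcriptomics, and neurophysiology, and the analysis of these data can benefit from machine learning. Examples include detailed micrometer-scale image segmentation, reconstruction of connectivity in neural microcircuits, and estimation of neural firing rates from spiking data.</p><img src="/images/000032/02.jpg" width="500" alt="Figure 1. Reasons for using artificial neural networks in neuroscience research" align=center /><blockquote><p>(Top left) Neural/behavioral data analysis. ANNs can serve as image-processing tools for efficient pose estimation.<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>1</sup></a><br>(Top right) Modeling complex behaviors. ANNs can perform object-discrimination tasks involving challenging naturalistic visual objects.<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>2</sup></a><br>(Bottom left) Illustration that ANNs can be used to model complex neural activity/connectivity patterns (blue lines).<br>(Bottom right) Understanding neural circuits from an optimization perspective. In this view, functional neural networks (stars) are the results of optimizing (arrows) an objective function in an abstract space of models constrained by the neural network architecture (colored space).</p></blockquote><p>This article does not focus on data analysis. Our main aim is instead to present the basic concepts and methods for developing ANN models of biological neural circuits in computational neuroscience. It is worth noting that ANNs should not be confused with neural network models in general. Mathematical models are all “artificial” in the sense that they are not biological. By ANNs we mean specifically models that are partly inspired by neuroscience but for which biological plausibility is not the primary concern, in contrast to other types of models that strive to be built on quantitative data from the two pillars of neuroscience, neuroanatomy and neurophysiology. The use of ANNs in neuroscience and cognitive science dates back to the early days of ANNs. In recent years, ANNs have become an increasingly common model system in neuroscience. ANNs, or deep learning models, have been and are likely to remain particularly useful to neuroscientists, for three reasons.</p><p>First, new modeling approaches are needed to meet new challenges in brain research. Over the past few decades, computational neuroscience has made great strides and become an integral part of systems neuroscience. Many insights have been gained by combining experiment and theory, including the ideas of excitation-inhibition balance and normalization. Progress has also been made in developing models of basic cognitive functions, such as simple decision-making. However, real-life problems can be very complex, and the underlying brain systems are often hard to capture with “hand-built” computational models. For example, object classification in the brain is carried out through many layers of complex linear-nonlinear processing. Building functional models of the visual system that achieve near-human behavioral performance remains a formidable challenge, not only for neuroscientists but also for computer vision researchers. By directly training neural network models on complex tasks and behaviors, deep learning offers an efficient way to generate candidate models of brain functions that would otherwise be nearly impossible to model (Figure 1). By learning to perform a variety of complex animal behaviors, ANNs can serve as potential model systems for biological neural networks, complementing non-human animal models in understanding the human brain.</p><p>A second reason for advocating deep networks in systems neuroscience is the recognition that relatively simple models often cannot account for the wide diversity of activity patterns in heterogeneous neural populations (Figure 1). One can rightly argue that simplicity is a virtue rather than a defect, because simplicity and generality are hallmarks of good theories. However, complex neural signals also tell us that existing models may be insufficient to unravel the mysteries of the brain. This may be especially true of the prefrontal cortex, where neurons often show complex mixed selectivity to various task variables. Such complex patterns are usually not easy to interpret and understand with hand-built models, which strive for simplicity by design. ANNs hold promise for capturing the complexity of neural activity.</p><p>Third, besides providing mechanistic models of biological systems, machine learning can be used to explore “why” questions in neuroscience. The brain is a biological machine that evolved under pressure to compute robustly and efficiently. Even once we understand how a system works, we may still ask why it works that way. Much like biological systems that evolve to survive, ANNs are trained to optimize an objective function given various structural constraints (the number of neurons, the economy of circuit wiring, and so on) (Figure 1). By identifying the particular objectives and constraints that lead to brain-like ANNs, we may gain insight into the evolutionary pressures faced by biological systems.</p><p>In this article, we discuss how ANNs can benefit neuroscientists in the three ways above. In Section 2, we first introduce the key ingredients common to ANN research. In Section 3, we describe two main applications of ANNs as neuroscience models: convolutional networks as models of sensory, especially visual, systems, and recurrent neural networks as models of cognitive and motor systems. In Sections 4 and 5, we outline how the analysis and architectural design of ANNs can be customized to better address a wide range of neuroscience questions. To help readers gain hands-on experience, we provide tutorial-style <a href="https://github.com/gyyang/nn-brain">code</a> in PyTorch and Jupyter notebooks, covering all the major topics.</p><h2 id="2-ANN-的基本成分和变化">2. 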
ANN 的基本成分和变化</h2><p>在本节中，我们将介绍 ANN 中的基本概念及其常见变体。如果读者熟悉 ANN 和深度学习，可以跳过本节。读者可以参考麻省理工的书籍<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>3</sup></a>。</p><h3 id="2-1-基本成分：学习问题、体系结构和算法">2.1 基本成分：学习问题、体系结构和算法</h3><p>使用深度网络的典型研究包括三个基本要素：学习问题、网络架构和训练算法。神经网络中单元或神经元之间的连接权重受网络架构的约束，但其特定值在初始化时随机分配。这些权重构成了大量参数，统称为<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>θ</mi></mrow><annotation encoding="application/x-tex">θ</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span></span></span></span>，还包括其他模型参数（见下文），将使用算法进行训练。训练算法指定连接权重如何变化以更好地解决学习问题，例如拟合数据集或执行任务。我们将讨论一个简单的示例，其中多层感知器（MLP）被训练为使用监督学习执行简单的数字分类任务。</p><h4 id="学习问题">学习问题</h4><p>在监督学习中，系统学习拟合包含一组输入 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">{</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">}</mo><mo separator="true">,</mo><mtext> </mtext><mi>i</mi><mo>=</mo><mn>1</mn><mo separator="true">,</mo><mtext> </mtext><mo>…</mo><mo>…</mo><mo separator="true">,</mo><mtext> </mtext><mi>N</mi></mrow><annotation encoding="application/x-tex">\lbrace x^{(i)} \rbrace,\ i=1,\ ……,\ N</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.138em;vertical-align:-0.25em;"></span><span class="mopen">{</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen 
mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">}</span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal">i</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.8778em;vertical-align:-0.1944em;"></span><span class="mord">1</span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner">……</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.109em;">N</span></span></span></span> 的数据集。每个输入 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">x^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> 与目标输出 <span 
class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msubsup></mrow><annotation encoding="application/x-tex">y_{target}^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.4267em;vertical-align:-0.3819em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3819em;"><span></span></span></span></span></span></span></span></span></span> 配对。粗体符号表示向量（默认情况下为列向量）。目标是学习神经网络函数 $F(⋅, θ) $ 的参数 $ θ $ ，该函数预测给定输入的目标输出，<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>y</mi><mrow><mo 
stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo>=</mo><mi>F</mi><mo stretchy="false">(</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo separator="true">,</mo><mi>θ</mi><mo stretchy="false">)</mo><mo>≈</mo><msubsup><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msubsup></mrow><annotation encoding="application/x-tex">y^{(i)}=F(x^{(i)},θ)≈y_{target}^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0824em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.138em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">F</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span 
class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">≈</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.4267em;vertical-align:-0.3819em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3819em;"><span></span></span></span></span></span></span></span></span></span> 。在简单数字分类任务 MNIST 中，每个输入是包含单个数字的图像，而目标输出是由十维向量或简单地对应于该对象类别的整数给出的所有类别 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo 
stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mn>1</mn><mo separator="true">,</mo><mo>…</mo><mo separator="true">,</mo><mn>9</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(0,1,…,9)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord">1</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner">…</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord">9</span><span class="mclose">)</span></span></span></span> 的概率分布。</p><p>更准确地说，系统被训练为优化目标函数的值，或者通常最小化损失函数的值 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>L</mi><mo>=</mo><mfrac><mn>1</mn><mi>N</mi></mfrac><msub><mo>∑</mo><mi>i</mi></msub><mi>L</mi><mo stretchy="false">(</mo><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo separator="true">,</mo><msubsup><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msubsup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">L = \frac{1}{N} \sum_{i}L(y^{(i)},y_{target}^{(i)})</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal">L</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span 
class="strut" style="height:1.4267em;vertical-align:-0.3819em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8451em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.109em;">N</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:0em;">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.162em;"><span style="top:-2.4003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2997em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal">L</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t"><span 
class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3819em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span> ，其中 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>L</mi><mo stretchy="false">(</mo><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo 
stretchy="false">)</mo></mrow></msup><mo separator="true">,</mo><msubsup><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msubsup><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">L(y^{(i)},y_{target}^{(i)})</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.4267em;vertical-align:-0.3819em;"></span><span class="mord mathnormal">L</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span><span 
style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3819em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span> 量化了目标输出 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msubsup></mrow><annotation encoding="application/x-tex">y^{(i)}_{target}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.4267em;vertical-align:-0.3819em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen 
mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3819em;"><span></span></span></span></span></span></span></span></span></span> 和实际输出 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">y^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0824em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> 之间的差异。</p><h4 id="网络体系结构">网络体系结构</h4><p>ANN 具有难以置信的多功能性，包括广泛的架构。在所有架构中，最基本的架构是 MLP（图2A）。MLP 由多层神经元组成，其中第 l 层的神经元仅接收来自第 (l−1) 层的输入，并且仅投射到第 (l+1) 层。</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup><mo>=</mo><mi>x</mi></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(1)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">r^{(1)}=x \tag{1}</annotation></semantics></math></span><span class="katex-html" 
aria-hidden="true"><span class="base"><span class="strut" style="height:0.938em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span><span class="tag"><span class="strut" style="height:1.188em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">1</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup><mo>=</mo><mi>f</mi><mo stretchy="false">(</mo><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup><mo>+</mo><msup><mi>b</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo><mo separator="true">,</mo><mtext> </mtext><mn>1</mn><mo>&lt;</mo><mi>l</mi><mo>&lt;</mo><mi>N</mi></mrow></mtd><mtd 
width="50%"></mtd><mtd><mtext>(2)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">r^{(l)}=f(W^{(l)}r^{(l-1)}+b^{(l)}),\ 1&lt;l&lt;N \tag{2}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.938em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.188em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" 
style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">−</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.188em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord">1</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.7335em;vertical-align:-0.0391em;"></span><span class="mord mathnormal" style="margin-right:0.0197em;">l</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" 
style="margin-right:0.109em;">N</span></span><span class="tag"><span class="strut" style="height:1.188em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">2</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>y</mi><mo>=</mo><mi>f</mi><mo stretchy="false">(</mo><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>N</mi><mo stretchy="false">)</mo></mrow></msup><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>N</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup><mo>+</mo><msup><mi>b</mi><mrow><mo stretchy="false">(</mo><mi>N</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(3)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">y=f(W^{(N)}r^{(N-1)}+b^{(N)}) \tag{3}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.188em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing 
reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.109em;">N</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.109em;">N</span><span class="mbin mtight">−</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.188em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.109em;">N</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:1.188em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">3</span></span><span class="mord">)</span></span></span></span></span></span></p><img 
src="/images/000032/03.jpg" width="600" alt="图2 常见神经网络结构示意图" align=center /><blockquote><p>（A） 多层感知器（MLP）。<br>（B） 递归神经网络（中间）接收输入流（左侧）。训练后，输出单元（右侧）应产生期望的输出。<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>4</sup></a><br>（C） 递归神经网络可在时间上展开为前馈系统，每一层对应一个时间步的网络状态。c<sub>t</sub> 和 r<sub>t</sub> 分别表示时间 t 的网络状态和输出活动。c<sub>t</sub> 是 r<sub>t-1</sub> 和输入 x<sub>t</sub> 的函数。<br>（D） 用于处理图像的卷积神经网络。每层包含多个通道（第 1 层中有四个，第 2 层中有六个）。通道（由正方形表示）由按空间排列的神经元组成，每个神经元接收来自具有相似空间偏好的神经元的连接。这些连接的空间范围由卷积核大小来描述。<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>5</sup></a></p></blockquote><p>这里， <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span> 是外部输入， <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">r^{(l)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> 表示第 l 层神经元的神经活动， <span 
class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">W^{(l)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> 是从第 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(l-1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0197em;">l</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)</span></span></span></span> 层到第 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>l</mi></mrow><annotation 
encoding="application/x-tex">l</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.0197em;">l</span></span></span></span> 层的连接矩阵。 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo stretchy="false">(</mo><mo>⋅</mo><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">f(⋅)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord">⋅</span><span class="mclose">)</span></span></span></span> 是模型神经元的一个（通常是非线性的）激活函数。网络的输出通过连接 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>N</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">W^{(N)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.109em;">N</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> 读出。参数 <span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>b</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">b^{(l)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> 和 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>b</mi><mrow><mo stretchy="false">(</mo><mi>N</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">b^{(N)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.109em;">N</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> 分别是模型神经元和输出单元的偏置。如果训练网络用于分类，输出通常被归一化为 <span class="katex"><span 
class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mo>∑</mo><mi>j</mi></msub><msub><mi>y</mi><mi>j</mi></msub><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">\sum_{j}y_j=1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.1858em;vertical-align:-0.4358em;"></span><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:0em;">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.162em;"><span style="top:-2.4003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.4358em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span 
class="mord">1</span></span></span></span> ，其中 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>y</mi><mi>j</mi></msub></mrow><annotation encoding="application/x-tex">y_j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> 表示类 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>j</mi></mrow><annotation encoding="application/x-tex">j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.854em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0572em;">j</span></span></span></span> 的预测概率。</p><p>当每层有足够的神经元时，理论上 MLP 可以近似任意函数。然而，在实践中，网络规模有限，即使存在良好的解决方案，也可能无法通过训练找到。MLP 通常与更现代的神经网络架构结合使用或作为其一部分。</p><h4 id="训练算法">训练算法</h4><p>深度学习中训练的标志性方法是随机梯度下降（SGD）。可训练参数（统称为 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>θ</mi></mrow><annotation encoding="application/x-tex">θ</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span></span></span></span> ）在损耗梯度的相反方向上更新，即：<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>θ</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{\partial L}{\partial θ}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2251em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8801em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">θ</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight">L</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span>。直观地说，如果成本函数 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>L</mi></mrow><annotation encoding="application/x-tex">L</annotation></semantics></math></span><span 
class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal">L</span></span></span></span> 随着第 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>j</mi></mrow><annotation encoding="application/x-tex">j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.854em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0572em;">j</span></span></span></span> 个参数 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>θ</mi><mi>j</mi></msub></mrow><annotation encoding="application/x-tex">θ_j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9805em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> 的增加而增加，则应通过训练来减少该参数。对于训练的每一步，因为使用整个训练集评估损失通常太昂贵，所以使用少量 M 个随机选择的训练示例（小批量）来计算损失，以 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="double-struck">B</mi><mo>=</mo><mo stretchy="false">{</mo><msub><mi>k</mi><mn>1</mn></msub><mo separator="true">,</mo><mo>…</mo><mo>…</mo><mo 
separator="true">,</mo><msub><mi>k</mi><mi>M</mi></msub><mo stretchy="false">}</mo></mrow><annotation encoding="application/x-tex">\mathbb{B}=\lbrace k_1,……,k_M\rbrace</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6889em;"></span><span class="mord mathbb">B</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">{</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0315em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner">……</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.0315em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.109em;">M</span></span></span></span><span class="vlist-s">​</span></span><span 
class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">}</span></span></span></span> 为索引，</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>L</mi><mrow><mi>b</mi><mi>a</mi><mi>t</mi><mi>c</mi><mi>h</mi></mrow></msub><mo>=</mo><mfrac><mn>1</mn><mi>M</mi></mfrac><munder><mo>∑</mo><mrow><mi>k</mi><mo>∈</mo><mi mathvariant="double-struck">B</mi></mrow></munder><mi>L</mi><mo stretchy="false">(</mo><msup><mi>y</mi><mrow><mo stretchy="false">(</mo><mi>k</mi><mo stretchy="false">)</mo></mrow></msup><mo separator="true">,</mo><msubsup><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow><mrow><mo stretchy="false">(</mo><mi>k</mi><mo stretchy="false">)</mo></mrow></msubsup><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(4)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">L_{batch}=\frac{1}{M}\sum_{k∈\mathbb{B}}L(y^{(k)},y^{(k)}_{target}) \tag{4}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">L</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ba</span><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">c</span><span class="mord mathnormal mtight">h</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span 
class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.6509em;vertical-align:-1.3295em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">M</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8479em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0315em;">k</span><span class="mrel mtight">∈</span><span class="mord mathbb mtight">B</span></span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.3295em;"><span></span></span></span></span></span><span 
class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal">L</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0315em;">k</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0315em;">k</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.3819em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:2.6509em;vertical-align:-1.3295em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">4</span></span><span class="mord">)</span></span></span></span></span></span></p><p>因此称为“随机”。为简单起见，我们假设小批次大小为 1，并在以下等式中省略批次（ <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>L</mi><mrow><mi>b</mi><mi>a</mi><mi>t</mi><mi>c</mi><mi>h</mi></mrow></msub></mrow><annotation encoding="application/x-tex">L_{batch}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">L</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ba</span><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">c</span><span class="mord mathnormal mtight">h</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> 将被称为 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>L</mi></mrow><annotation encoding="application/x-tex">L</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal">L</span></span></span></span> ）。梯度 <span 
class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>θ</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{\partial L}{\partial θ}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2251em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8801em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">θ</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight">L</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> 是参数变化的方向，参数小小的变化就能导致损失函数较大的变化。为了减少损失，在梯度的相反方向上更新可训练参数，其大小与学习率 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>η</mi></mrow><annotation encoding="application/x-tex">η</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span></span></span></span> 成正比：</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi mathvariant="normal">Δ</mi><mi>θ</mi><mo>=</mo><mo>−</mo><mi>η</mi><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>θ</mi></mrow></mfrac></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(方程 5)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">Δθ=−η\frac{∂L}{∂θ} \tag{方程 5}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord">Δ</span><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord">−</span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" 
style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span><span class="tag"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord cjk_fallback">方程</span><span class="mord"> 5</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi></mrow><annotation encoding="application/x-tex">W</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span></span></span> 和 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>b</mi></mrow><annotation encoding="application/x-tex">b</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">b</span></span></span></span> 等参数通常是可训练的。其他参数由建模者设置，称为超参数，例如学习率 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>η</mi></mrow><annotation encoding="application/x-tex">η</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span></span></span></span>。计算梯度的一个关键要求是可微性，即模型中函数的导数定义良好。</p><p>对于没有任何中间（隐藏）层的前馈网络:</p><p><span class="katex-display"><span 
class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>y</mi><mo>=</mo><mi>W</mi><mi>x</mi><mo>+</mo><mi>b</mi><mo separator="true">,</mo><mspace width="1em"/><mtext>or equivalently</mtext><mo separator="true">,</mo><mspace width="1em"/><msub><mi>y</mi><mi>i</mi></msub><mo>=</mo><munder><mo>∑</mo><mi>j</mi></munder><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><msub><mi>x</mi><mi>j</mi></msub><mo>+</mo><msub><mi>b</mi><mi>i</mi></msub></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(6)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">y=Wx+b, \quad \text{or equivalently},\quad y_i=\sum_{j}W_{ij}x_{j}+b_i \tag{6}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.7667em;vertical-align:-0.0833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal">b</span><span class="mpunct">,</span><span class="mspace" style="margin-right:1em;"></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord text"><span class="mord">or equivalently</span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:1em;"></span><span class="mspace" 
style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.4638em;vertical-align:-1.4138em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8723em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.4138em;"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span 
class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8444em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:2.4638em;vertical-align:-1.4138em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">6</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Computing the gradient is straightforward:</p><p><span 
class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub></mrow></mfrac><mo>=</mo><munder><mo>∑</mo><mi>k</mi></munder><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>y</mi><mi>k</mi></msub></mrow></mfrac><mfrac><mrow><mi mathvariant="normal">∂</mi><msub><mi>y</mi><mi>k</mi></msub></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub></mrow></mfrac><mo>=</mo><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>y</mi><mi>i</mi></msub></mrow></mfrac><msub><mi>x</mi><mi>j</mi></msub></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(7)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{∂L}{∂W_{ij}}=\sum_{k}\frac{∂L}{∂y_k}\frac{∂y_k}{∂W_{ij}}=\frac{∂L}{∂y_i}x_j \tag{7}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.3435em;vertical-align:-0.9721em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" 
style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.9721em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.6736em;vertical-align:-1.3021em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8479em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0315em;">k</span></span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.3021em;"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span 
class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0315em;">k</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.8804em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span 
style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0315em;">k</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.9721em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.2519em;vertical-align:-0.8804em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" 
style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.8804em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span><span class="tag"><span class="strut" 
style="height:2.6736em;vertical-align:-1.3021em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">7</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Here, when <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>k</mi><mo>=</mo><mi>i</mi></mrow><annotation encoding="application/x-tex">k=i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6595em;"></span><span class="mord mathnormal">i</span></span></span></span>, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><msub><mi>y</mi><mi>k</mi></msub></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{∂y_k}{∂W_{ij}}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.4745em;vertical-align:-0.5423em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.9322em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span><span 
class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3281em;"><span style="top:-2.357em;margin-left:-0.1389em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2819em;"><span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.4461em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3488em;margin-left:-0.0359em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0315em;">k</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1512em;"><span></span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.5423em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> equals <span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>j</mi></msub></mrow><annotation encoding="application/x-tex">x_j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>; otherwise it is 0. In vector notation:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>W</mi></mrow></mfrac><mo>=</mo><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>y</mi></mrow></mfrac><msup><mi>x</mi><mo>⊺</mo></msup></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(8)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{∂L}{∂W}=\frac{∂L}{∂y}x^⊺ \tag{8}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span 
class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.2519em;vertical-align:-0.8804em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.8804em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord"><span 
class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7144em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:2.2519em;vertical-align:-0.8804em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">8</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Here we follow the convention that <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>W</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{∂L}{∂W}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2251em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8801em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span 
class="mord mathnormal mtight">L</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>y</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{∂L}{∂y}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.3612em;vertical-align:-0.4811em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8801em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">y</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight">L</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.4811em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> have the same shape as <span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi></mrow><annotation encoding="application/x-tex">W</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>y</mi></mrow><annotation encoding="application/x-tex">y</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span></span></span></span>, respectively. Suppose:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>L</mi><mo>=</mo><mfrac><mn>1</mn><mn>2</mn></mfrac><mi mathvariant="normal">∥</mi><mi>y</mi><mo>−</mo><msub><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow></msub><msup><mi mathvariant="normal">∥</mi><mn>2</mn></msup><mo>=</mo><mfrac><mn>1</mn><mn>2</mn></mfrac><munder><mo>∑</mo><mi>j</mi></munder><mo stretchy="false">(</mo><msub><mi>y</mi><mi>j</mi></msub><mo>−</mo><msub><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow></msub><mo separator="true">,</mo><mtext> </mtext><mi>j</mi><msup><mo stretchy="false">)</mo><mn>2</mn></msup></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(9)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">L=\frac{1}{2}\Vert y−y_{target}\Vert^2=\frac{1}{2}\sum_j(y_j−y_{target},\ j)^{2} \tag{9}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.6833em;"></span><span class="mord mathnormal">L</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.0074em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord">2</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord">∥</span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.1502em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" 
style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord">∥</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8641em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.7352em;vertical-align:-1.4138em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord">2</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" 
style="height:1.05em;"><span style="top:-1.8723em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.4138em;"><span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.1502em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" 
style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.0572em;">j</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8641em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:2.7352em;vertical-align:-1.4138em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">9</span></span><span class="mord">)</span></span></span></span></span></span></p><p>This gives:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>W</mi></mrow></mfrac><mo>=</mo><mo stretchy="false">(</mo><mi>y</mi><mo>−</mo><msub><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow></msub><mo stretchy="false">)</mo><msup><mi>x</mi><mo>⊺</mo></msup></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(10)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{∂L}{∂W}=(y−y_{target})x^⊺ 
\tag{10}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span 
style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7144em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">10</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi mathvariant="normal">Δ</mi><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo>∝</mo><mo>−</mo><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub></mrow></mfrac><mo>=</mo><mo 
stretchy="false">(</mo><msub><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow></msub><mo separator="true">,</mo><mtext> </mtext><mi>i</mi><mo>−</mo><msub><mi>y</mi><mi>i</mi></msub><mo stretchy="false">)</mo><msub><mi>x</mi><mi>j</mi></msub></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(11)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">ΔW_{ij}\propto−\frac{∂L}{∂W_{ij}}=(y_{target},\ i−y_i)x_j \tag{11}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord">Δ</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">∝</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.3435em;vertical-align:-0.9721em;"></span><span class="mord">−</span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" 
style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.9721em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" 
style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal">i</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.2861em;"><span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:2.3435em;vertical-align:-0.9721em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">11</span></span><span class="mord">)</span></span></span></span></span></span></p><p>This update depends only on information local to each connection, namely the activity of its input and output units. Therefore, if <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow></msub><mo separator="true">,</mo><mtext> </mtext><mi>i</mi><mo>&gt;</mo><msub><mi>y</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">y_{target},\ i&gt;y_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9456em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord 
mathnormal">i</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&gt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, then <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">W_{ij}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> should change so as to increase the net input, so <span class="katex"><span 
class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">Δ</mi><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">ΔW_{ij}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord">Δ</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> has the same sign as <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>j</mi></msub></mrow><annotation encoding="application/x-tex">x_j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span 
class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>. If <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow></msub><mo separator="true">,</mo><mtext> </mtext><mi>i</mi><mo>&lt;</mo><msub><mi>y</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">y_{target},\ i&lt;y_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9456em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal">i</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span 
class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, the opposite holds.</p><p>For multilayer networks, the gradient is computed with the backpropagation algorithm. To compute the loss <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>L</mi></mrow><annotation encoding="application/x-tex">L</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal">L</span></span></span></span>, the network is first run in the forward direction (Equations 1, 2, and 3). Next, to compute the exact gradient <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>θ</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{∂L}{∂θ}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2251em;vertical-align:-0.345em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8801em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight" 
style="margin-right:0.0278em;">θ</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight">L</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.345em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> efficiently, information about the loss must be passed backward through the network, in the opposite direction, which is why the method is called backpropagation.</p><p>To illustrate the idea, consider an N-layer linear feedforward network (Equations 1, 2, and 3, with <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo><mo>=</mo><mi>x</mi></mrow><annotation encoding="application/x-tex">f(x)=x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span>). To compute <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo 
stretchy="false">)</mo></mrow></msup></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{∂L}{∂W^{(l)}}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2655em;vertical-align:-0.3854em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8801em;"><span style="top:-2.6146em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.822em;"><span style="top:-2.822em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5357em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight">L</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3854em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span>, we first need to compute <span class="katex"><span 
class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{∂L}{∂r^{(l)}}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2655em;vertical-align:-0.3854em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8801em;"><span style="top:-2.6146em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.822em;"><span style="top:-2.822em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5357em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight">L</span></span></span></span></span><span 
class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3854em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span>. From <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup><mo>=</mo><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup><mo>+</mo><msup><mi>b</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">r^{(l+1)}=W^{(l+1)}r^{(l)}+b^{(l+1)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.9713em;vertical-align:-0.0833em;"></span><span class="mord"><span class="mord mathnormal" 
style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose 
mtight">)</span></span></span></span></span></span></span></span></span></span></span></span>, the chain rule gives:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msubsup><mi>r</mi><mi>i</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msubsup></mrow></mfrac><mo>=</mo><munder><mo>∑</mo><mi>j</mi></munder><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msubsup><mi>r</mi><mi>j</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msubsup></mrow></mfrac><mfrac><mrow><mi mathvariant="normal">∂</mi><msubsup><mi>r</mi><mi>j</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msubsup></mrow><mrow><mi mathvariant="normal">∂</mi><msubsup><mi>r</mi><mi>i</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msubsup></mrow></mfrac><mo>=</mo><munder><mo>∑</mo><mi>j</mi></munder><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msubsup><mi>r</mi><mi>j</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msubsup></mrow></mfrac><msubsup><mi>W</mi><mrow><mi>j</mi><mi>i</mi></mrow><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msubsup><mo>=</mo><munder><mo>∑</mo><mi>j</mi></munder><mo stretchy="false">[</mo><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup><msubsup><mo stretchy="false">]</mo><mrow><mi>i</mi><mi>j</mi></mrow><mo lspace="0em" rspace="0em">⊺</mo></msubsup><mfrac><mrow><mi 
mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msubsup><mi>r</mi><mi>j</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msubsup></mrow></mfrac></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(12)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{∂L}{∂r_{i}^{(l)}}=\sum_j \frac{∂L}{∂r_{j}^{(l+1)}} \frac{∂r_{j}^{(l+1)}}{∂r_{i}^{(l)}}=\sum_j \frac{∂L}{∂r_{j}^{(l+1)}}W_{ji}^{(l+1)}=\sum_j[W^{(l+1)}]_{ij}^{⊺} \frac{∂L}{∂r_{j}^{(l+1)}} \tag{12}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.5831em;vertical-align:-1.2117em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.11em;"><span class="pstrut" style="height:3.0448em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.2769em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.2748em;"><span class="pstrut" style="height:3.0448em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.7218em;"><span class="pstrut" style="height:3.0448em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2117em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:3.2615em;vertical-align:-1.4138em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8723em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.4138em;"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.11em;"><span class="pstrut" style="height:3.0448em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" 
style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.413em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.2748em;"><span class="pstrut" style="height:3.0448em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.7218em;"><span class="pstrut" style="height:3.0448em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.3478em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.8478em;"><span style="top:-2.11em;"><span class="pstrut" style="height:3.0448em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" 
style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2769em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.2748em;"><span class="pstrut" style="height:3.0448em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.8478em;"><span class="pstrut" style="height:3.0448em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" 
style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.413em;"><span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2117em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.7852em;vertical-align:-1.4138em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8723em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.4138em;"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.11em;"><span class="pstrut" style="height:3.0448em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span 
class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.413em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.2748em;"><span class="pstrut" style="height:3.0448em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.7218em;"><span class="pstrut" style="height:3.0448em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.3478em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mord mathnormal 
mtight">i</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.413em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.7852em;vertical-align:-1.4138em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8723em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.4138em;"><span></span></span></span></span></span><span class="mopen">[</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" 
style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose"><span class="mclose">]</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.7823em;"><span style="top:-2.4231em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span><span style="top:-3.1809em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord amsrm mtight">⊺</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.413em;"><span></span></span></span></span></span></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.296em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.0572em;">j</span></span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.814em;"><span style="top:-2.989em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span 
class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.8984em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span><span class="tag"><span class="strut" style="height:3.2615em;vertical-align:-1.4138em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">12</span></span><span class="mord">)</span></span></span></span></span></span></p><p>In vector notation:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup></mrow></mfrac><mo>=</mo><mo stretchy="false">[</mo><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup><msup><mo stretchy="false">]</mo><mo>⊺</mo></msup><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup></mrow></mfrac><mo>=</mo><mo stretchy="false">[</mo><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo 
stretchy="false">)</mo></mrow></msup><msup><mo stretchy="false">]</mo><mo>⊺</mo></msup><mo stretchy="false">[</mo><msup><mi>W</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>2</mn><mo stretchy="false">)</mo></mrow></msup><msup><mo stretchy="false">]</mo><mo>⊺</mo></msup><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>2</mn><mo stretchy="false">)</mo></mrow></msup></mrow></mfrac><mo>=</mo><mo>⋯</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(13)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{∂L}{∂r^{(l)}}=[W^{(l+1)}]^⊺ \frac{∂L}{∂r^{(l+1)}}=[W^{(l+1)}]^⊺[W^{(l+2)}]^⊺ \frac{∂L}{∂r^{(l+2)}}=⋯ \tag{13}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.0754em;vertical-align:-0.704em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.296em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.814em;"><span style="top:-2.989em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" 
style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.704em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.0754em;vertical-align:-0.704em;"></span><span class="mopen">[</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose"><span class="mclose">]</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7144em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span></span></span></span></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.296em;"><span class="pstrut" 
style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.814em;"><span style="top:-2.989em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.704em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.0754em;vertical-align:-0.704em;"></span><span class="mopen">[</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span 
class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose"><span class="mclose">]</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7144em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span></span></span></span></span><span class="mopen">[</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">2</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose"><span class="mclose">]</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7144em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span></span></span></span></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.296em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" 
style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.814em;"><span style="top:-2.989em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">2</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.704em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.313em;"></span><span class="minner">⋯</span></span><span class="tag"><span class="strut" style="height:2.0754em;vertical-align:-0.704em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">13</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Therefore, starting from <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi 
mathvariant="normal">∂</mi><mi>y</mi></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{∂L}{∂y}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.3612em;vertical-align:-0.4811em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8801em;"><span style="top:-2.655em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">y</span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight">L</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.4811em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> and stepping through <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>l</mi><mo>=</mo><mi>N</mi><mo>−</mo><mn>1</mn><mo separator="true">,</mo><mtext> </mtext><mo>…</mo><mo separator="true">,</mo><mtext> </mtext><mn>1</mn></mrow><annotation encoding="application/x-tex">l=N−1,\ …,\ 1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.0197em;">l</span><span class="mspace" 
style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.7667em;vertical-align:-0.0833em;"></span><span class="mord mathnormal" style="margin-right:0.109em;">N</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8389em;vertical-align:-0.1944em;"></span><span class="mord">1</span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner">…</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord">1</span></span></span></span>, we can use <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{∂L}{∂r^{(l+1)}}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2655em;vertical-align:-0.3854em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8801em;"><span style="top:-2.6146em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.822em;"><span style="top:-2.822em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5357em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">+</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight">L</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3854em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span> to recursively compute <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msup><mi>r</mi><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msup></mrow></mfrac></mrow><annotation encoding="application/x-tex">\frac{∂L}{∂r^{(l)}}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2655em;vertical-align:-0.3854em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" 
style="height:0.8801em;"><span style="top:-2.6146em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.822em;"><span style="top:-2.822em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5357em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.394em;"><span class="pstrut" style="height:3em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal mtight">L</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3854em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span>. This computation runs in the opposite direction to the forward pass and is called the backward pass. More generally, backpropagation applies to neural networks built from arbitrary differentiable components.</p><p>Computing exact gradients through backpropagation is considered biologically unrealistic, because updating the connections of each layer requires precise, non-local information about the connection weights of downstream layers (in the form of the transposed connection matrices in Eq. 13).</p><h3 id="2-2-学习问题-目标函数的变化">2.2 Variations of learning problems/objective functions</h3><p>In this section and the two that follow (2.3 and 2.4), we introduce common variations of learning problems, network architectures, and training algorithms.</p><p>Traditionally, learning problems are divided into three kinds: supervised, reinforcement, and unsupervised learning. They differ in their goal, or objective. In supervised learning, every input is paired with a target, and the system learns to produce outputs that match the targets. In reinforcement learning, the system receives a sequence of scalar rewards instead of explicit (high-dimensional) targets, and it learns to produce outputs (actions) that maximize the total reward. Unsupervised learning covers a family of problems in which the system has no explicit targets or rewards. Given limited space, this article focuses mainly on neural networks trained with supervised learning.</p><h4 id="监督学习">Supervised learning</h4><p>As mentioned above, a supervised learning task provides pairs of inputs and target outputs <span 
class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">{</mo><mo stretchy="false">(</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo separator="true">,</mo><msubsup><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msubsup><mo stretchy="false">)</mo><mo stretchy="false">}</mo></mrow><annotation encoding="application/x-tex">\lbrace(x^{(i)}, y_{target}^{(i)})\rbrace</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.4267em;vertical-align:-0.3819em;"></span><span class="mopen">{(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord 
mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3819em;"><span></span></span></span></span></span></span><span class="mclose">)}</span></span></span></span> 目标是最小化网络预测的目标输出和实际输出之间的差异。在许多常见的监督学习问题中，目标输出是行为输出。例如，在典型的对象分类任务中，每个输入都是包含单个对象的图像，而目标输出是与该对象的类别（例如，狗、猫等）相对应的整数。在某种情况下，目标输出可以直接是神经记录数据。</p><p>具有随机点运动的经典感知决策任务可以被表述为监督学习问题，因为存在正确答案。在该任务中，动物观察随机移动的点，并通过选择两个备选方案 A 或 B 中的一个来报告点的总体运动方向。该任务可以简化为在第 i 次试验的每个时间点 t 接收噪声输入流 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>x</mi><mi>t</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msubsup></mrow><annotation encoding="application/x-tex">x_t^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2906em;vertical-align:-0.2458em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span 
class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2458em;"><span></span></span></span></span></span></span></span></span></span>, which can represent the net evidence in favor of A and against B. The system should learn to report the sign of the average input, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msubsup><mo>=</mo><mi>s</mi><mi>i</mi><mi>g</mi><mi>n</mi><mo stretchy="false">(</mo><msub><mrow><mo fence="true">⟨</mo><msubsup><mi>x</mi><mi>t</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msubsup><mo fence="true">⟩</mo></mrow><mi>t</mi></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">y_{target}^{(i)}=sign(\left\langle x^{(i)}_{t} \right\rangle _{t})</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.4267em;vertical-align:-0.3819em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span><span 
style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3819em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.8497em;vertical-align:-0.6997em;"></span><span class="mord mathnormal">s</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="mord mathnormal">n</span><span class="mopen">(</span><span class="minner"><span class="minner"><span class="mopen delimcenter" style="top:0em;"><span class="delimsizing size2">⟨</span></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4542em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2458em;"><span></span></span></span></span></span></span><span class="mclose delimcenter" 
style="top:0em;"><span class="delimsizing size2">⟩</span></span></span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:-0.2692em;"><span style="top:-2.0003em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.6997em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span>, with +1 for option A and -1 for option B.</p><h4 id="强化学习">Reinforcement learning</h4><p>In reinforcement learning, a model (the agent) interacts with an environment, such as a (virtual) maze. At time step t, the agent receives an observation <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>o</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">o_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">o</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> from the environment, moves the environment to state <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>s</mi><mrow><mi>t</mi><mo>+</mo><mn>1</mn></mrow></msub></mrow><annotation encoding="application/x-tex">s_{t+1}</annotation></semantics></math></span><span class="katex-html" 
aria-hidden="true"><span class="base"><span class="strut" style="height:0.6389em;vertical-align:-0.2083em;"></span><span class="mord"><span class="mord mathnormal">s</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">+</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span></span></span></span> by producing an action <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>a</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">a_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">a</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, and receives a scalar reward <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">r_t</annotation></semantics></math></span><span 
class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> (negative in the case of a punishment). For example, a model navigating a virtual maze can receive pixel-based visual inputs as observations, produce actions that move it through the maze, and receive a reward when it exits the maze. The goal is to choose appropriate actions based on past and present observations so as to maximize the cumulative reward <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mo>∑</mo><mi>t</mi></msub><msub><mi>r</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">\sum_t r_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0497em;vertical-align:-0.2997em;"></span><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:0em;">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1308em;"><span style="top:-2.4003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2997em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" 
style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>. In many classical reinforcement learning problems, the observation <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>o</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">o_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">o</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> equals the environment state <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>s</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">s_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">s</span><span class="msupsub"><span 
class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, which contains complete information about the environment.</p><p>Reinforcement learning (without neural networks) has been widely used by neuroscientists and cognitive scientists to study value-based learning and decision-making tasks. For example, in a multi-armed bandit task, the agent chooses repeatedly among several options, each of which yields a reward with a certain probability. Reinforcement learning theory can model how the agent's behavior changes over time and help neuroscientists study the neural mechanisms of value-based behavior.</p><p>Deep reinforcement learning trains deep neural networks with reinforcement learning, which makes it applicable to many more complex problems. In principle, deep reinforcement learning can be used to study most tasks performed by laboratory animals, because animals are usually motivated to perform tasks through rewards. Although many such tasks can also be formulated as supervised learning problems when a correct choice exists (for example, perceptual decision-making), many other tasks can only be described as reinforcement learning tasks because the answer is subjective. For example, a perceptual decision-making task with a correct answer (A rather than B) can be extended to assess the animal's confidence in its choice. In addition to the two choices, where the correct one yields a large reward and the incorrect one yields no reward, the monkey is also offered a sure-bet option that guarantees a small reward. Because a small reward is better than no reward, subjects are more likely to choose the sure-bet option when they are less confident in their perceptual judgment. Reinforcement learning is necessary here because there is no ground-truth choice output: the best choice depends on the animal's own confidence in its perceptual decision.</p><h4 id="无监督学习">Unsupervised learning</h4><p>In unsupervised learning, only inputs <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">{</mo><msup><mi>x</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">}</mo></mrow><annotation encoding="application/x-tex">\lbrace x^{(i)} \rbrace</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.138em;vertical-align:-0.25em;"></span><span class="mopen">{</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal 
mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">}</span></span></span></span> are provided; the objective function <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>L</mi><mo stretchy="false">(</mo><mi>x</mi><mo separator="true">,</mo><mtext> </mtext><mi>θ</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">L(x,\ θ)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal">L</span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span><span class="mclose">)</span></span></span></span> is defined using only the inputs and the network parameters (with no targets or rewards). For example, finding the first component in principal component analysis (PCA) can be formulated as unsupervised learning in a simple neural network. Reading out from a group of input neurons <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span>, a single neuron <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>y</mi></mrow><annotation encoding="application/x-tex">y</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span></span></span></span> <span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mi>y</mi><mo>=</mo><msup><mi>ω</mi><mo>⊺</mo></msup><mi>x</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(y=\omega ^⊺x)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">ω</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span></span></span></span></span><span class="mord mathnormal">x</span><span class="mclose">)</span></span></span></span> can learn to extract the first principal component by maximizing its variance <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>V</mi><mi>a</mi><mi>r</mi><mo stretchy="false">(</mo><mi>y</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">Var(y)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.2222em;">V</span><span class="mord mathnormal">a</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span 
class="mclose">)</span></span></span></span> while keeping its connection weights normalized <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mi mathvariant="normal">∥</mi><mi>ω</mi><mi mathvariant="normal">∥</mi><mo>=</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(\Vert \omega\Vert=1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord">∥</span><span class="mord mathnormal" style="margin-right:0.0359em;">ω</span><span class="mord">∥</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)</span></span></span></span>.</p><p>Unsupervised learning is particularly relevant for modeling the development of sensory cortices. Although supervised learning is widely used in machine learning, the labeled data it requires (for example, labeled objects in images) are rare for most animals. Unsupervised learning has long been used to explain neural responses in visual areas and, in more recent work, in higher visual areas.</p><p>Compared with reinforcement learning and unsupervised learning, supervised learning can be particularly effective because the network receives more informative feedback in the form of high-dimensional target outputs. It is therefore common to formulate a reinforcement or unsupervised learning problem (or part of one) as a supervised learning problem. For example, consider an unsupervised learning problem of compressing a high-dimensional input <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span> into a low-dimensional representation <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>z</mi></mrow><annotation encoding="application/x-tex">z</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.044em;">z</span></span></span></span> while retaining as much information about the input as possible (not necessarily in the information-theoretic sense). One way to solve this problem is to train an autoencoder network with supervised learning. An autoencoder consists of an encoder that maps the input <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span> to a low-dimensional latent representation <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>z</mi><mo>=</mo><msub><mi>f</mi><mrow><mi>e</mi><mi>n</mi><mi>c</mi><mi>o</mi><mi>d</mi><mi>e</mi></mrow></msub><mo stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">z=f_{encode}(x)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.044em;">z</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.1076em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">n</span><span class="mord mathnormal mtight">co</span><span class="mord mathnormal mtight">d</span><span 
class="mord mathnormal mtight">e</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mclose">)</span></span></span></span> and a decoder that maps the latent representation back to a high-dimensional representation <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>y</mi><mo>=</mo><msub><mi>f</mi><mrow><mi>d</mi><mi>e</mi><mi>c</mi><mi>o</mi><mi>d</mi><mi>e</mi></mrow></msub><mo stretchy="false">(</mo><mi>z</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">y=f_{decode}(z)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.1076em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">d</span><span class="mord mathnormal mtight">eco</span><span class="mord mathnormal mtight">d</span><span class="mord mathnormal mtight">e</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord 
mathnormal" style="margin-right:0.044em;">z</span><span class="mclose">)</span></span></span></span>. To ensure that <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>z</mi></mrow><annotation encoding="application/x-tex">z</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.044em;">z</span></span></span></span> contains information about <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span>, the autoencoder uses the original input as the supervised learning target, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>y</mi><mrow><mi>t</mi><mi>a</mi><mi>r</mi><mi>g</mi><mi>e</mi><mi>t</mi></mrow></msub><mo>=</mo><mi>x</mi></mrow><annotation encoding="application/x-tex">y_{target}=x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">a</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord 
mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span>.</p><h3 id="2-3-网络结构的变化">2.3 Variations in network architecture</h3><h4 id="递归神经网络">Recurrent neural networks</h4><p>Besides the MLP, another fundamental ANN architecture is the recurrent neural network (RNN), which processes information over time (Figure 2B). In a &quot;vanilla&quot; or Elman RNN<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>6</sup></a>, the activity of the model neurons at time <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em;"></span><span class="mord mathnormal">t</span></span></span></span>, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">r_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord 
mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, is driven by the recurrent connections <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mi>r</mi></msub></mrow><annotation encoding="application/x-tex">W_r</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> and by the input <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">x_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal 
mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> through the connections <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mi>x</mi></msub></mrow><annotation encoding="application/x-tex">W_x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">x</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>. The network output is read out through the connections <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mi>y</mi></msub></mrow><annotation encoding="application/x-tex">W_y</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.0359em;">y</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>.</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>c</mi><mi>t</mi></msub><mo>=</mo><msub><mi>W</mi><mi>r</mi></msub><msub><mi>r</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><msub><mi>W</mi><mi>x</mi></msub><msub><mi>x</mi><mi>t</mi></msub><mo>+</mo><msub><mi>b</mi><mi>r</mi></msub></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(14)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">c_t=W_rr_{t-1}+W_xx_t+b_r \tag{14}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.8917em;vertical-align:-0.2083em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span 
class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal 
mtight">x</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8444em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">14</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable 
width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>r</mi><mi>t</mi></msub><mo>=</mo><mi>f</mi><mo stretchy="false">(</mo><msub><mi>c</mi><mi>t</mi></msub><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(15)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">r_t=f(c_t) \tag{15}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span 
class="mclose">)</span></span><span class="tag"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">15</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>y</mi><mi>t</mi></msub><mo>=</mo><msub><mi>W</mi><mi>y</mi></msub><msub><mi>r</mi><mi>t</mi></msub><mo>+</mo><msub><mi>b</mi><mi>y</mi></msub></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(16)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">y_t=W_yr_t+b_y \tag{16}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" 
style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">y</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.9805em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">y</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord text"><span 
class="mord">(</span><span class="mord"><span class="mord">16</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Here, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>c</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">c_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> denotes the cell state, analogous to a membrane potential or an input current, while <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">r_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span 
class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> denotes the neuronal activity. An RNN can be unrolled in time (Figure 2C) and viewed as a special form of MLP.</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>r</mi><mi>t</mi></msub><mo>=</mo><mi>f</mi><mo stretchy="false">(</mo><msub><mi>W</mi><mi>r</mi></msub><msub><mi>r</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><msub><mi>W</mi><mi>x</mi></msub><msub><mi>x</mi><mi>t</mi></msub><mo>+</mo><msub><mi>b</mi><mi>r</mi></msub><mo stretchy="false">)</mo><mo separator="true">,</mo><mspace width="1em"/><mi>f</mi><mi>o</mi><mi>r</mi><mtext> </mtext><mi>t</mi><mo>=</mo><mn>1</mn><mo separator="true">,</mo><mo>⋯</mo><mo separator="true">,</mo><mi>T</mi></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(17)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">r_t=f(W_rr_{t−1}+W_xx_t+b_r),\quad for\ t=1,⋯,T \tag{17}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span 
class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span 
style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">x</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:1em;"></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span 
class="mord mathnormal" style="margin-right:0.0278em;">or</span><span class="mspace"> </span><span class="mord mathnormal">t</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.8778em;vertical-align:-0.1944em;"></span><span class="mord">1</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner">⋯</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">T</span></span><span class="tag"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">17</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Here, the neurons of layer <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em;"></span><span class="mord mathnormal">t</span></span></span></span>, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">r_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span 
style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, receive input from layer <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mi>t</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(t-1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)</span></span></span></span>, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub></mrow><annotation encoding="application/x-tex">r_{t-1}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6389em;vertical-align:-0.2083em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 
mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span></span></span></span>, together with additional input from outside the recurrent network, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">x_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>. Unlike in a regular MLP, the connections from each layer to the next are shared across time.</p><p>Backpropagation also applies to RNNs. Whereas backpropagation in an MLP propagates gradient information backward from the last layer (Eq. 13), computing the gradients of an RNN requires propagating information backward in time (backpropagation through time, or BPTT). Assuming that the loss is computed from the output at the last time point <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi></mrow><annotation encoding="application/x-tex">T</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">T</span></span></span></span> and that the activation function is linear, the key step of BPTT, analogous to Eq. 13, is:</p><p><span class="katex-display"><span class="katex"><span 
class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>r</mi><mi>t</mi></msub></mrow></mfrac><mo>=</mo><msubsup><mi>W</mi><mi>r</mi><mo lspace="0em" rspace="0em">⊺</mo></msubsup><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>r</mi><mrow><mi>t</mi><mo>+</mo><mn>1</mn></mrow></msub></mrow></mfrac><mo>=</mo><mo stretchy="false">[</mo><msubsup><mi>W</mi><mi>r</mi><mo lspace="0em" rspace="0em">⊺</mo></msubsup><msup><mo stretchy="false">]</mo><mn>2</mn></msup><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>L</mi></mrow><mrow><mi mathvariant="normal">∂</mi><msub><mi>r</mi><mrow><mi>t</mi><mo>+</mo><mn>2</mn></mrow></msub></mrow></mfrac><mo>=</mo><mo>⋯</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(18)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{∂L}{∂r_t}=W_{r}^{⊺}\frac{∂L}{∂r_{t+1}}=[W_{r}^{⊺}]^{2} \frac{∂L}{∂r_{t+2}}=⋯ \tag{18}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.2074em;vertical-align:-0.836em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing 
reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.836em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.2658em;vertical-align:-0.8943em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.7144em;"><span style="top:-2.453em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord amsrm mtight">⊺</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span><span class="mord"><span class="mopen 
nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">+</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.8943em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.2658em;vertical-align:-0.8943em;"></span><span class="mopen">[</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span 
class="vlist" style="height:0.7144em;"><span style="top:-2.453em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord amsrm mtight">⊺</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span><span class="mclose"><span class="mclose">]</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8641em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">+</span><span class="mord 
mtight">2</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">L</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.8943em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.313em;"></span><span class="minner">⋯</span></span><span class="tag"><span class="strut" style="height:2.2658em;vertical-align:-0.8943em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">18</span></span><span class="mord">)</span></span></span></span></span></span></p><p>As the number of time steps in an RNN grows, the weight update involves a product of many matrices (Eq. 18). A similar problem arises in very deep feedforward networks (for example, networks with more than 10 layers). Consider the norm of this matrix product, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">∥</mi><mo stretchy="false">[</mo><msubsup><mi>W</mi><mi>r</mi><mo>⊺</mo></msubsup><msup><mo stretchy="false">]</mo><mi>T</mi></msup><mi mathvariant="normal">∥</mi></mrow><annotation encoding="application/x-tex">\Vert [W^⊺_r]^T\Vert</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0913em;vertical-align:-0.25em;"></span><span class="mord">∥</span><span class="mopen">[</span><span class="mord"><span 
class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-2.453em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span><span class="mclose"><span class="mclose">]</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8413em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">T</span></span></span></span></span></span></span></span><span class="mord">∥</span></span></span></span>: if <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mi>r</mi></msub></mrow><annotation encoding="application/x-tex">W_r</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing 
reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> is large (more precisely, when its largest eigenvalue exceeds 1, written here as <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mi>r</mi></msub><mo>&gt;</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">W_r&gt;1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&gt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span>), it can grow exponentially with <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>T</mi></mrow><annotation encoding="application/x-tex">T</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" 
style="margin-right:0.1389em;">T</span></span></span></span>; if <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mi>r</mi></msub></mrow><annotation encoding="application/x-tex">W_r</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> is small, it vanishes toward zero, which historically made recurrent networks hard to train. These exploding and vanishing gradient problems can be greatly alleviated by a combination of modern techniques, including network architectures and initial connectivity that tend to preserve the norm of the backpropagated gradient.</p><h4 id="卷积神经网络">Convolutional neural networks</h4><p>A particularly important type of network architecture is the convolutional neural network (Figure 2D). Using convolution means that a group of neurons processes their respective inputs with the same function, in other words, with the same set of connection weights. In a typical convolutional neural network that processes visual input, neurons are organized into <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mrow><mi>c</mi><mi>h</mi><mi>a</mi><mi>n</mi><mi>n</mi><mi>e</mi><mi>l</mi></mrow></msub></mrow><annotation encoding="application/x-tex">N_{channel}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span 
style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">c</span><span class="mord mathnormal mtight">hann</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> “channels”, or “feature maps”. 每个通道包含具有不同空间选择性的 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mrow><mi>h</mi><mi>e</mi><mi>i</mi><mi>g</mi><mi>h</mi><mi>t</mi></mrow></msub><mo>×</mo><msub><mi>N</mi><mrow><mi>w</mi><mi>i</mi><mi>d</mi><mi>t</mi><mi>h</mi></mrow></msub></mrow><annotation encoding="application/x-tex">N_{height}×N_{width}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">h</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">i</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">h</span><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0269em;">w</span><span class="mord mathnormal mtight">i</span><span class="mord mathnormal mtight">d</span><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">h</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> 神经元。卷积层中的每个神经元都由一个元组 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi><mo>=</mo><mo stretchy="false">(</mo><msub><mi>i</mi><mi>C</mi></msub><mo separator="true">,</mo><msub><mi>i</mi><mi>H</mi></msub><mo separator="true">,</mo><msub><mi>i</mi><mi>W</mi></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">i=(i_C,i_H,i_W)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6595em;"></span><span class="mord mathnormal">i</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" 
style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span> 来索引，代表通道索引（<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>i</mi><mi>C</mi></msub></mrow><annotation encoding="application/x-tex">i_C</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8095em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>）和空间偏好索引（<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>i</mi><mi>H</mi></msub><mo separator="true">,</mo><mtext> </mtext><msub><mi>i</mi><mi>W</mi></msub></mrow><annotation encoding="application/x-tex">i_H,\ i_W</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.854em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.0813em;">H</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>）。第 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>l</mi></mrow><annotation encoding="application/x-tex">l</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.0197em;">l</span></span></span></span> 层的第 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi></mrow><annotation encoding="application/x-tex">i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6595em;"></span><span class="mord mathnormal">i</span></span></span></span> 个神经元通常由前一层的神经元驱动（偏置项和激活函数省略）。</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd 
width="50%"></mtd><mtd><mrow><msubsup><mi>r</mi><mrow><msub><mi>i</mi><mi>C</mi></msub><msub><mi>i</mi><mi>H</mi></msub><msub><mi>i</mi><mi>W</mi></msub></mrow><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msubsup><mo>=</mo><munder><mo>∑</mo><mrow><msub><mi>j</mi><mi>C</mi></msub><msub><mi>j</mi><mi>H</mi></msub><msub><mi>j</mi><mi>W</mi></msub></mrow></munder><msubsup><mi>W</mi><mrow><msub><mi>i</mi><mi>C</mi></msub><msub><mi>i</mi><mi>H</mi></msub><msub><mi>i</mi><mi>W</mi></msub><mo separator="true">,</mo><mtext> </mtext><msub><mi>j</mi><mi>C</mi></msub><msub><mi>j</mi><mi>H</mi></msub><msub><mi>j</mi><mi>W</mi></msub></mrow><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msubsup><msubsup><mi>r</mi><mrow><msub><mi>j</mi><mi>C</mi></msub><msub><mi>j</mi><mi>H</mi></msub><msub><mi>j</mi><mi>W</mi></msub></mrow><mrow><mo stretchy="false">(</mo><mi>l</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msubsup></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(19)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">r_{i_{C}i_{H}i_{W}}^{(l)}=\sum_{j_{C}j_{H}j_{W}}W_{i_{C}i_{H}i_{W},\ j_{C}j_{H}j_{W}}^{(l)}r_{j_{C}j_{H}j_{W}}^{(l−1)} \tag{19}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.422em;vertical-align:-0.3772em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" 
style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal 
mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3772em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.4638em;vertical-align:-1.4138em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8723em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.0813em;">H</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span></span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.4138em;"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing 
reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mpunct mtight">,</span><span class="mspace mtight"><span class="mtight"> </span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" 
style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" 
style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.413em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mbin mtight">−</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.413em;"><span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:2.4638em;vertical-align:-1.4138em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">19</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Importantly, in a convolutional network the connection weights do not depend on the absolute spatial position of neuron <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi></mrow><annotation encoding="application/x-tex">i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6595em;"></span><span class="mord 
mathnormal">i</span></span></span></span>; instead, they depend only on the spatial displacement between the presynaptic and postsynaptic neurons (<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>i</mi><mi>H</mi></msub><mo>−</mo><msub><mi>j</mi><mi>H</mi></msub><mo separator="true">,</mo><mtext> </mtext><msub><mi>i</mi><mi>W</mi></msub><mo>−</mo><msub><mi>j</mi><mi>W</mi></msub></mrow><annotation encoding="application/x-tex">i_H-j_H,\ i_W-j_W</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8095em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.854em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.0572em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span 
class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.854em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.0572em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>):</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd 
width="50%"></mtd><mtd><mrow><msubsup><mi>W</mi><mrow><msub><mi>i</mi><mi>C</mi></msub><msub><mi>i</mi><mi>H</mi></msub><msub><mi>i</mi><mi>W</mi></msub><mo separator="true">,</mo><mtext> </mtext><msub><mi>j</mi><mi>C</mi></msub><msub><mi>j</mi><mi>H</mi></msub><msub><mi>j</mi><mi>W</mi></msub></mrow><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msubsup><mo>=</mo><msubsup><mi>W</mi><mrow><msub><mi>i</mi><mi>C</mi></msub><mo separator="true">,</mo><mtext> </mtext><msub><mi>j</mi><mi>C</mi></msub></mrow><mrow><mo stretchy="false">(</mo><mi>l</mi><mo stretchy="false">)</mo></mrow></msubsup><mo stretchy="false">(</mo><msub><mi>i</mi><mi>H</mi></msub><mo>−</mo><msub><mi>j</mi><mi>H</mi></msub><mo separator="true">,</mo><mtext> </mtext><msub><mi>i</mi><mi>W</mi></msub><mo>−</mo><msub><mi>j</mi><mi>W</mi></msub><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(20)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">W_{i_{C}i_{H}i_{W},\ j_{C}j_{H}j_{W}}^{(l)}=W_{i_C,\ j_C}^{(l)}(i_H−j_H,\ i_W−j_W) \tag{20}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.4578em;vertical-align:-0.413em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing 
reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mpunct mtight">,</span><span class="mspace mtight"><span class="mtight"> </span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" 
style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" 
style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.413em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.4578em;vertical-align:-0.413em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.0448em;"><span style="top:-2.4231em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span><span class="mpunct mtight">,</span><span class="mspace mtight"><span class="mtight"> </span></span><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3448em;"><span style="top:-2.3567em;margin-left:-0.0572em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 
mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em;">C</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.1433em;"><span></span></span></span></span></span></span></span></span></span><span style="top:-3.2198em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0197em;">l</span><span class="mclose mtight">)</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.413em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.854em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.0572em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 
mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.0572em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:1.4578em;vertical-align:-0.413em;"></span><span class="mord text"><span 
class="mord">(</span><span class="mord"><span class="mord">20</span></span><span class="mord">)</span></span></span></span></span></span></p><p>All neurons within a single channel therefore process different parts of the input space with the same shared connection weights, so they have the same stimulus selectivity, with receptive fields at different spatial locations. In addition, a neuron only receives input from other neurons with similar spatial preferences, that is, when <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">∥</mi><msub><mi>i</mi><mi>H</mi></msub><mo>−</mo><msub><mi>j</mi><mi>H</mi></msub><mi mathvariant="normal">∥</mi></mrow><annotation encoding="application/x-tex">\Vert i_H-j_H\Vert</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">∥</span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0813em;">H</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.0572em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.0813em;">H</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord">∥</span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">∥</mi><msub><mi>i</mi><mi>W</mi></msub><mo>−</mo><msub><mi>j</mi><mi>W</mi></msub><mi mathvariant="normal">∥</mi></mrow><annotation encoding="application/x-tex">\Vert i_W-j_W\Vert</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">∥</span><span class="mord"><span class="mord mathnormal">i</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0572em;">j</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.0572em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord 
mathnormal mtight" style="margin-right:0.1389em;">W</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord">∥</span></span></span></span> are small (Figure 2D).</p><p>This reuse of weights not only greatly reduces the number of trainable parameters but also imposes invariance on the processing. For visual processing, convolutional networks typically impose spatial invariance: an object is processed with the same set of weights regardless of its spatial position.</p><p>In a typical convolutional network, the number of neurons per channel (<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mrow><mi>h</mi><mi>e</mi><mi>i</mi><mi>g</mi><mi>h</mi><mi>t</mi></mrow></msub><mo>×</mo><msub><mi>N</mi><mrow><mi>w</mi><mi>i</mi><mi>d</mi><mi>t</mi><mi>h</mi></mrow></msub></mrow><annotation encoding="application/x-tex">N_{height}×N_{width}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">h</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight">i</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span><span class="mord mathnormal mtight">h</span><span class="mord mathnormal mtight">t</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">×</span><span class="mspace" 
style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0269em;">w</span><span class="mord mathnormal mtight">i</span><span class="mord mathnormal mtight">d</span><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight">h</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>) decreases from layer to layer (as the spatial resolution becomes coarser), while more features are extracted (as the number of channels increases). A classifier is usually placed at the end of the system to learn a particular task, such as classifying visual objects.</p><h4 id="激活功能">Activation functions</h4><p>Most neurons in ANNs, like their biological counterparts, perform nonlinear computations on their inputs. These neurons are usually point neurons with a single nonlinear activation function <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo stretchy="false">(</mo><mo>⋅</mo><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">f(⋅)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord">⋅</span><span class="mclose">)</span></span></span></span> that links the summed input to the output activity. This nonlinearity is essential to what ANNs can compute: without it, a stack of layers would collapse into a single linear (affine) map. A commonly used activation function is the rectified linear unit (ReLU) function, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo stretchy="false">(</mo><mi>x</mi><mo 
stretchy="false">)</mo><mo>=</mo><mi>m</mi><mi>a</mi><mi>x</mi><mo stretchy="false">(</mo><mi>x</mi><mo separator="true">,</mo><mtext> </mtext><mn>0</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">f(x)=max(x,\ 0)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal">ma</span><span class="mord mathnormal">x</span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord">0</span><span class="mclose">)</span></span></span></span><a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>7</sup></a>. The derivative of ReLU at <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">x=0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">0</span></span></span></span> is mathematically undefined, but in practice it is conventionally set to 0. ReLU and its variants are typically used in feedforward networks, whereas the hyperbolic tangent (tanh) function is often used in recurrent networks. ReLU 
and similar activation functions are asymmetric and do not saturate at high values. Although biological neurons eventually saturate at high firing rates, they often operate in a non-saturating regime. For this reason, traditional neural circuit models with rate units also frequently use non-saturating activation functions.</p><h4 id="归一化">Normalization</h4><p>Normalization methods are an important component of many ANNs, especially very deep networks. Like normalization in biological neural circuits, normalization methods in ANNs keep the inputs or outputs of neurons within a desirable range. For example, for the input <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span> to a layer (e.g., a stimulus), layer normalization amounts to a form of “z-scoring” across units, so that the actual input to the <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi></mrow><annotation encoding="application/x-tex">i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6595em;"></span><span class="mord mathnormal">i</span></span></span></span>-th neuron, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mover accent="true"><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="true">^</mo></mover></mrow><annotation encoding="application/x-tex">\widehat{x_i}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8206em;vertical-align:-0.15em;"></span><span class="mord accent"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.6706em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span 
class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span><span class="svg-align" style="top:-3.4306em;"><span class="pstrut" style="height:3em;"></span><span style="height:0.24em;"><svg xmlns="http://www.w3.org/2000/svg" width="100%" height="0.24em" viewBox="0 0 1062 239" preserveAspectRatio="none"><path d="M529 0h5l519 115c5 1 9 5 9 10 0 1-1 2-1 3l-4 22c-1 5-5 9-11 9h-2L532 67 19 159h-2c-5 0-9-4-11-9l-5-22c-1-6 2-12 8-13z"/></svg></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span>, is:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mover accent="true"><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="true">^</mo></mover><mo>=</mo><mi>γ</mi><mo>⋅</mo><mfrac><mrow><msub><mi>x</mi><mi>i</mi></msub><mo>−</mo><mi>μ</mi></mrow><mi>σ</mi></mfrac><mo>+</mo><mi>β</mi></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(21)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\widehat{x_i}=γ⋅\frac{x_i−μ}{σ}+β \tag{21}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8206em;vertical-align:-0.15em;"></span><span class="mord accent"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.6706em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span 
style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span><span class="svg-align" style="top:-3.4306em;"><span class="pstrut" style="height:3em;"></span><span style="height:0.24em;"><svg xmlns="http://www.w3.org/2000/svg" width="100%" height="0.24em" viewBox="0 0 1062 239" preserveAspectRatio="none"><path d="M529 0h5l519 115c5 1 9 5 9 10 0 1-1 2-1 3l-4 22c-1 5-5 9-11 9h-2L532 67 19 159h-2c-5 0-9-4-11-9l-5-22c-1-6 2-12 8-13z"/></svg></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6389em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0556em;">γ</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">⋅</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.9463em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.2603em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span 
style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord mathnormal">μ</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0528em;">β</span></span><span class="tag"><span class="strut" style="height:1.9463em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">21</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>μ</mi><mo>=</mo><mo stretchy="false">⟨</mo><msub><mi>x</mi><mi>j</mi></msub><mo stretchy="false">⟩</mo></mrow></mtd><mtd 
width="50%"></mtd><mtd><mtext>(22)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">μ=\langle x_j \rangle \tag{22}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal">μ</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mopen">⟨</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mclose">⟩</span></span><span class="tag"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">22</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>σ</mi><mo>=</mo><msqrt><mrow><mo stretchy="false">⟨</mo><mo stretchy="false">(</mo><msub><mi>x</mi><mi>j</mi></msub><mo>−</mo><mi>μ</mi><msup><mo stretchy="false">)</mo><mn>2</mn></msup><mo stretchy="false">⟩</mo><mo>+</mo><mi>ϵ</mi></mrow></msqrt></mrow></mtd><mtd 
width="50%"></mtd><mtd><mtext>(23)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">σ=\sqrt{ \langle (x_j−μ)^2 \rangle+ϵ} \tag{23}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.84em;vertical-align:-0.5742em;"></span><span class="mord sqrt"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.2658em;"><span class="svg-align" style="top:-3.8em;"><span class="pstrut" style="height:3.8em;"></span><span class="mord" style="padding-left:1em;"><span class="mopen">⟨(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord mathnormal">μ</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7401em;"><span style="top:-2.989em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord 
mtight">2</span></span></span></span></span></span></span></span><span class="mclose">⟩</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord mathnormal">ϵ</span></span></span><span style="top:-3.2258em;"><span class="pstrut" style="height:3.8em;"></span><span class="hide-tail" style="min-width:1.02em;height:1.88em;"><svg xmlns="http://www.w3.org/2000/svg" width="400em" height="1.88em" viewBox="0 0 400000 1944" preserveAspectRatio="xMinYMin slice"><path d="M983 90l0 -0c4,-6.7,10,-10,18,-10 H400000v40H1013.1s-83.4,268,-264.1,840c-180.7,572,-277,876.3,-289,913c-4.7,4.7,-12.7,7,-24,7s-12,0,-12,0c-1.3,-3.3,-3.7,-11.7,-7,-25c-35.3,-125.3,-106.7,-373.3,-214,-744c-10,12,-21,25,-33,39s-32,39,-32,39c-6,-5.3,-15,-14,-27,-26s25,-30,25,-30c26.7,-32.7,52,-63,76,-91s52,-60,52,-60s208,722,208,722c56,-175.3,126.3,-397.3,211,-666c84.7,-268.7,153.8,-488.2,207.5,-658.5c53.7,-170.3,84.5,-266.8,92.5,-289.5zM1001 80h400000v40h-400000z"/></svg></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.5742em;"><span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:1.84em;vertical-align:-0.5742em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">23</span></span><span class="mord">)</span></span></span></span></span></span></p><p>where <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">⟨</mo><msub><mi>x</mi><mi>j</mi></msub><mo stretchy="false">⟩</mo></mrow><annotation encoding="application/x-tex">\langle x_j \rangle</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mopen">⟨</span><span class="mord"><span 
class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mclose">⟩</span></span></span></span> denotes the average over all units in the same layer; <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>μ</mi></mrow><annotation encoding="application/x-tex">μ</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal">μ</span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>σ</mi></mrow><annotation encoding="application/x-tex">σ</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span></span></span></span> are the mean and standard deviation of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span>. After normalization, different external inputs leave <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mover 
accent="true"><mi>x</mi><mo stretchy="true">^</mo></mover></mrow><annotation encoding="application/x-tex">\widehat{x}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6706em;"></span><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6706em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord mathnormal">x</span></span><span class="svg-align" style="width:calc(100% - 0.0556em);margin-left:0.0556em;top:-3.4306em;"><span class="pstrut" style="height:3em;"></span><span style="height:0.24em;"><svg xmlns="http://www.w3.org/2000/svg" width="100%" height="0.24em" viewBox="0 0 1062 239" preserveAspectRatio="none"><path d="M529 0h5l519 115c5 1 9 5 9 10 0 1-1 2-1 3l-4 22c-1 5-5 9-11 9h-2L532 67 19 159h-2c-5 0-9-4-11-9l-5-22c-1-6 2-12 8-13z"/></svg></span></span></span></span></span></span></span></span></span> with the same mean and variance, which are set by the trainable parameters <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>γ</mi></mrow><annotation encoding="application/x-tex">γ</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0556em;">γ</span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>β</mi></mrow><annotation encoding="application/x-tex">β</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0528em;">β</span></span></span></span>. The small constant <span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>ϵ</mi></mrow><annotation encoding="application/x-tex">ϵ</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">ϵ</span></span></span></span> ensures that <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>σ</mi></mrow><annotation encoding="application/x-tex">σ</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span></span></span></span> does not become vanishingly small.</p><h3 id="2-4-训练算法的变体">2.4 Variants of training algorithms</h3><h4 id="基于SGD的方法的变种">Variants of SGD-based methods</h4><p>Supervised, reinforcement, and unsupervised learning tasks can all be trained with SGD-based methods. Partly because the estimated gradient is stochastic, applying SGD directly (Equation 5) often leads to poor training performance. Gradually decaying the learning rate <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>η</mi></mrow><annotation encoding="application/x-tex">η</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span></span></span></span> during training usually improves performance, because a smaller learning rate late in training encourages finer adjustments to the parameters. Various SGD-based optimization methods are used to improve learning. One simple and effective technique is momentum: at step <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>j</mi></mrow><annotation encoding="application/x-tex">j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.854em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0572em;">j</span></span></span></span>, the temporally smoothed gradient <span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>v</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">v^{(j)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">v</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> is used to update the parameters by <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">Δ</mi><msup><mi>θ</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">Δθ^{(j)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord">Δ</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mclose 
mtight">)</span></span></span></span></span></span></span></span></span></span></span></span>.</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msup><mi>v</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo stretchy="false">)</mo></mrow></msup><mo>=</mo><mi>μ</mi><msup><mi>v</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo>−</mo><mn>1</mn><mo stretchy="false">)</mo></mrow></msup><mo>+</mo><mfrac><mrow><mi mathvariant="normal">∂</mi><msup><mi>L</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo stretchy="false">)</mo></mrow></msup></mrow><mrow><mi mathvariant="normal">∂</mi><mi>θ</mi></mrow></mfrac><mo separator="true">,</mo><mspace width="2em"/><mn>0</mn><mo>&lt;</mo><mi>μ</mi><mo>&lt;</mo><mn>1</mn></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(24)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">v^{(j)}=μv^{(j-1)}+\frac{∂L^{(j)}}{∂θ},\qquad 0&lt;μ&lt;1 \tag{24}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.938em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">v</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" 
style="height:1.1324em;vertical-align:-0.1944em;"></span><span class="mord mathnormal">μ</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">v</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mbin mtight">−</span><span class="mord mtight">1</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:2.251em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.565em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal">L</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 
mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:2em;"></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord">0</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.7335em;vertical-align:-0.1944em;"></span><span class="mord mathnormal">μ</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span><span class="tag"><span class="strut" style="height:2.251em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">24</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi mathvariant="normal">Δ</mi><msup><mi>θ</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo stretchy="false">)</mo></mrow></msup><mo>=</mo><mo>−</mo><mi>η</mi><msup><mi>v</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo stretchy="false">)</mo></mrow></msup></mrow></mtd><mtd 
width="50%"></mtd><mtd><mtext>(25)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">Δθ^{(j)}=−ηv^{(j)} \tag{25}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.938em;"></span><span class="mord">Δ</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.1324em;vertical-align:-0.1944em;"></span><span class="mord">−</span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">v</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:1.188em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span 
class="mord">25</span></span><span class="mord">)</span></span></span></span></span></span></p><p>In addition, adaptive learning-rate methods adjust the learning rate of each individual parameter according to the statistics (such as the mean and variance) of its gradient over training steps. For example, in the Adam method<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>8</sup></a>, the update to a parameter is magnified if its gradient has been consistent across steps (low variance). Adaptive learning-rate methods can be viewed as approximately taking into account the curvature of the loss function.</p><h4 id="正则化">Regularization</h4><p>Regularization techniques are important during training for improving the generalization performance of deep networks. Adding to the loss function an L2 regularization term <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>L</mi><mrow><mi>r</mi><mi>e</mi><mi>g</mi></mrow></msub><mo>=</mo><mi>λ</mi><msub><mo>∑</mo><mrow><mi>i</mi><mi>j</mi></mrow></msub><msubsup><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow><mn>2</mn></msubsup></mrow><annotation encoding="application/x-tex">L_{reg}=λ\sum_{ij}W_{ij}^2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">L</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.2499em;vertical-align:-0.4358em;"></span><span class="mord mathnormal">λ</span><span class="mspace" 
style="margin-right:0.1667em;"></span><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:0em;">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.162em;"><span style="top:-2.4003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.4358em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8141em;"><span style="top:-2.4413em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3948em;"><span></span></span></span></span></span></span></span></span></span> (equivalent to weight decay) discourages the network from using large connection weights, which can improve generalization by implicitly limiting model complexity. At each training step, dropout silences a randomly selected subset of neurons. This reduces the network's reliance on particular neurons or on precise combinations of neurons. Dropout can be thought of as loosely approximating spiking noise.</p><p>The choice of hyperparameters (learning rate, batch size, network initialization, etc.) is usually guided by a combination of theory, empirical evidence, and hardware constraints. For applications in neuroscience, it is important that scientific conclusions do not depend critically on the choice of hyperparameters. If they do, the dependence should be clearly documented.</p><h2 id="3-构建-ANN-以解决神经科学问题的例子">3. 
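Examples of building ANNs to address neuroscience questions</h2><p>In this section, we outline two common uses of ANNs in addressing neuroscience questions.</p><h3 id="3-1-视觉系统的卷积网络">3.1 Convolutional networks for the visual system</h3><p>Deep convolutional neural networks are currently the standard tool in computer vision research and applications. These networks typically consist of tens, and sometimes hundreds, of layers of convolutional processing. Training deep feedforward networks effectively used to be difficult. This trainability problem has been greatly alleviated by a combination of innovations in several areas. Without the rapid development of hardware such as general-purpose GPUs (graphics processing units) and TPUs (tensor processing units), modern deep networks would be too large, and therefore too slow, to run, let alone train. Deep convolutional networks are usually trained on large natural image datasets containing millions of labeled high-resolution images, using training methods with adaptive learning rates. Beyond the default use of convolution, a series of architectural innovations has also improved performance, including the ReLU activation function, normalization methods, and residual connections, which provide an architectural shortcut directly from a layer's input to its output.</p><p>Deep convolutional networks have been proposed as computational models of the visual system, particularly of the ventral visual stream, the pathway that processes visual object information (Figure 3). These models are typically trained with supervised learning on the same image classification tasks used in computer vision research, and in many cases they are exactly the same convolutional networks developed in computer vision. In contrast, classical models of the visual system typically rely on hand-designed features (synaptic weights), such as Gabor filters, or are trained with unsupervised learning based on the efficient coding principle. Although classical models have succeeded in explaining various features of lower-level visual areas, deep convolutional networks substantially outperform them in explaining neural activity in higher-level visual areas of both monkeys and humans. Besides being trained to classify objects, convolutional networks can also be trained to directly reproduce the patterns of neural activity recorded in various visual areas.</p><img src="/images/000032/04.jpg" width="700" alt="Figure 3. Comparing the visual system and deep convolutional neural networks" align=center /><blockquote><p>The same image is passed through the monkey visual cortex (top) and a deep convolutional neural network (bottom), allowing a side-by-side comparison between the biological network and the ANN. Neural responses from IT are best predicted by responses of the network's final layer, whereas neural responses from V4 are better predicted by an intermediate layer (green dashed arrows).<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>9</sup></a></p></blockquote><p>In a classic study comparing convolutional networks with higher visual areas, Yamins and colleagues trained thousands of convolutional networks with different architectures on a visual categorization task. To study how similar the artificial and biological visual systems are, they quantified how well a network's responses to natural images could be used to linearly predict responses in the inferior temporal (IT) cortex of monkeys viewing the same images. They found that this neural predictivity was highly correlated with accuracy on the categorization task, suggesting that better models of IT can be built by developing models that perform better on challenging natural image classification tasks. They further found that, unlike IT, neural responses from the relatively lower visual area V4 were best predicted by intermediate layers of the network (Figure 3).<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>10</sup></a></p><p>As computational models of the visual system, convolutional networks can model the complex, high-dimensional inputs to downstream areas, which is useful for large-scale models that take pixel-based visual input. Because many pretrained networks are readily available in standard deep learning frameworks such as PyTorch and TensorFlow, this process has become particularly straightforward.</p><h3 id="3-2-认知和运动系统的-RNN">3.2 RNNs for cognitive and motor systems</h3><p>RNNs are a common machine learning tool for processing sequences such as speech and text. In neuroscience, they have been used to model various aspects of the cognitive, motor, and navigation systems. Unlike the convolutional networks used to model the visual system, recurrent networks are usually trained on the specific cognitive or motor tasks that neuroscientists are studying. By training RNNs on the same tasks that animals or humans perform, RNNs and brains can be compared side by side. Comparisons can be made at several levels, including single-neuron activity and selectivity, population decoding, state-space dynamics, and network responses to perturbations. We describe how to analyze RNNs in more detail in the next section.</p>
<p>An influential study that used RNNs to model cognition involved a monkey experiment on context-dependent perceptual decision-making. In this task, a fraction of random dots (called the motion coherence) move in the same direction (left or right); independently, a fraction of the dots (the color coherence) are red and the rest are green. On a given trial, a context cue instructs the subject to perform either the motion task (judge whether the net motion direction is right or left) or the color task (judge whether there are more red dots than green dots). Monkeys performed the task by integrating over time the evidence for the behaviorally relevant feature (for example, color) while ignoring the irrelevant feature (motion direction in the color task). Neurons recorded in the prefrontal cortex of behaving animals showed complex activity patterns in which the irrelevant feature was still strongly represented, even though it had little influence on the behavioral choice. An RNN nevertheless captured these counterintuitive activity patterns. Examining the RNN dynamics revealed a novel mechanism by which the irrelevant feature is represented but selectively filtered out during evidence accumulation, so that it is not integrated over time.</p><p>To allow a better comparison of neural dynamics between RNNs and biological systems, RNNs used in neuroscience usually treat time differently from RNNs in machine learning. RNNs in machine learning are almost always discrete-time systems, in which a mapping from the state at time step <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">t−1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6984em;vertical-align:-0.0833em;"></span><span class="mord mathnormal">t</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span> yields the state at time step <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>t</mi></mrow><annotation encoding="application/x-tex">t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6151em;"></span><span class="mord mathnormal">t</span></span></span></span> (Equations 14 and 15). Using a discrete-time system means that stimuli separated by several seconds in real life can be presented to the network at consecutive time points. To achieve more realistic neural dynamics, RNNs used in neuroscience are often based on continuous-time dynamical systems, such as</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>τ</mi><mfrac><mrow><mi>d</mi><mi>r</mi></mrow><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><mo>=</mo><mo>−</mo><mi>r</mi><mo 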
stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>f</mi><mo stretchy="false">(</mo><msub><mi>W</mi><mi>r</mi></msub><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><msub><mi>W</mi><mi>x</mi></msub><mi>x</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><msub><mi>b</mi><mi>r</mi></msub><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(26)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">τ\frac{dr}{dt}=−r(t)+f(W_rr(t)+W_xx(t)+b_r) \tag{26}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal">t</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span 
class="mord">−</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">x</span></span></span></span><span 
class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathnormal">x</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">26</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Here, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>τ</mi></mrow><annotation encoding="application/x-tex">τ</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span></span></span></span> is the time scale of a single unit. This continuous-time system can then be discretized using the Euler method with a time step of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">Δ</mi><mi>t</mi><mo stretchy="false">(</mo><mo>&lt;</mo><mi>τ</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">Δt(&lt;τ)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">Δ</span><span class="mord mathnormal">t</span><span class="mopen">(</span><span class="mrel">&lt;</span><span class="mspace" 
style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span><span class="mclose">)</span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo>+</mo><mi mathvariant="normal">Δ</mi><mi>t</mi><mo stretchy="false">)</mo><mo>≈</mo><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mfrac><mrow><mi mathvariant="normal">Δ</mi><mi>t</mi></mrow><mi>τ</mi></mfrac><mo stretchy="false">[</mo><mo>−</mo><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>f</mi><mo stretchy="false">(</mo><msub><mi>W</mi><mi>r</mi></msub><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><msub><mi>W</mi><mi>x</mi></msub><mi>x</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>b</mi><mi>r</mi><mo stretchy="false">)</mo><mo stretchy="false">]</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(27)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">r(t+Δt)≈r(t)+\frac{Δt}{τ}[−r(t)+f(W_rr(t)+W_xx(t)+br)] \tag{27}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span 
class="mord">Δ</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">≈</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:2.0463em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3603em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord">Δ</span><span class="mord mathnormal">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mopen">[</span><span class="mord">−</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span 
class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">x</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathnormal">x</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span 
class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal">b</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mclose">)]</span></span><span class="tag"><span class="strut" style="height:2.0463em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">27</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Besides gradient descent through back-propagation, a different family of algorithms has been used in neuroscience to train RNN models. These algorithms build on the idea of harnessing chaotic systems with weak perturbations. In particular, the FORCE algorithm enables rapid learning by using a recursive least-squares algorithm to modify the output connections of an RNN so that its output matches the target<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>11</sup></a>. The network output <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">y(t)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span></span></span></span> (assumed here to be one-dimensional) is routed through <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>w</mi><mrow><mi>f</mi><mi>b</mi></mrow></msub></mrow><annotation encoding="application/x-tex">w_{fb}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0269em;">w</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span 
style="top:-2.55em;margin-left:-0.0269em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span><span class="mord mathnormal mtight">b</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> back into the RNN:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>τ</mi><mfrac><mrow><mi>d</mi><mi>r</mi></mrow><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><mo>=</mo><mo>−</mo><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>f</mi><mo stretchy="false">(</mo><msub><mi>W</mi><mi>r</mi></msub><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><msub><mi>W</mi><mi>x</mi></msub><mi>x</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><msub><mi>w</mi><mrow><mi>f</mi><mi>b</mi></mrow></msub><mi>y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>b</mi><mi>r</mi><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(28)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">τ\frac{dr}{dt}=−r(t)+f(W_rr(t)+W_xx(t)+w_{fb}y(t)+br) \tag{28}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" 
style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal">t</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">−</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">x</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathnormal">x</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0269em;">w</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.0269em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span 
class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span><span class="mord mathnormal mtight">b</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal">b</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">28</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>y</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>=</mo><msubsup><mi>w</mi><mi>y</mi><mo>⊺</mo></msubsup><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(29)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">y(t)=w_y^⊺r(t) \tag{29}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span 
class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.1331em;vertical-align:-0.3831em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0269em;">w</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.7144em;"><span style="top:-2.453em;margin-left:-0.0269em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">y</span></span></span><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3831em;"><span></span></span></span></span></span></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:1.1331em;vertical-align:-0.3831em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">29</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Modifying the output connections therefore amounts to a low-rank modification of the recurrent connection matrix (<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>w</mi><mrow><mi>f</mi><mi>b</mi></mrow></msub><msubsup><mi>w</mi><mi>y</mi><mo>⊺</mo></msubsup></mrow><annotation encoding="application/x-tex">w_{fb}w_y^⊺</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:1.0475em;vertical-align:-0.3831em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0269em;">w</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.0269em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span><span class="mord mathnormal mtight">b</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0269em;">w</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-2.453em;margin-left:-0.0269em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">y</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3831em;"><span></span></span></span></span></span></span></span></span></span>):</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>τ</mi><mfrac><mrow><mi>d</mi><mi>r</mi></mrow><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><mo>=</mo><mo>−</mo><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo 
stretchy="false">)</mo><mo>+</mo><mi>f</mi><mo stretchy="false">(</mo><mo stretchy="false">[</mo><msub><mi>W</mi><mi>r</mi></msub><mo>+</mo><msub><mi>w</mi><mrow><mi>f</mi><mi>b</mi></mrow></msub><msubsup><mi>w</mi><mi>y</mi><mo>⊺</mo></msubsup><mo stretchy="false">]</mo><mi>r</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><msub><mi>W</mi><mi>x</mi></msub><mi>x</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>b</mi><mi>r</mi><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(30)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">τ\frac{dr}{dt}=−r(t)+f([W_r+w_{fb}w_y^⊺]r(t)+W_xx(t)+br) \tag{30}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord mathnormal" style="margin-right:0.1132em;">τ</span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal">t</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" 
style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">−</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">([</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.1331em;vertical-align:-0.3831em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0269em;">w</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.0269em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span><span class="mord mathnormal mtight">b</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0269em;">w</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.7144em;"><span 
style="top:-2.453em;margin-left:-0.0269em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">y</span></span></span><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.3831em;"><span></span></span></span></span></span></span><span class="mclose">]</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">x</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathnormal">x</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" 
style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal">b</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">30</span></span><span class="mord">)</span></span></span></span></span></span></p><h2 id="4-分析和理解-ANNs">4. Analyzing and understanding ANNs</h2><p>Common ANNs used in machine learning or neuroscience are not easily interpretable. For many neuroscience problems, they may serve better as model systems that await further analysis. Successfully training an ANN on a task does not mean knowing how the system works. Therefore, unlike in most machine learning applications, a trained ANN is not the end goal but merely the prerequisite for analyzing that network to gain understanding.</p><p>Most systems neuroscience techniques for studying biological neural circuits can be applied directly to artificial networks. To facilitate side-by-side comparison between artificial and biological neural networks, the activity of ANNs can be visualized and analyzed with the same dimensionality reduction tools (such as PCA) used for biological recordings. To understand the causal relationship from neurons to behavior, an arbitrary set of neurons can be lesioned or inactivated for a short duration, resembling optogenetic manipulation in physiological experiments. Similarly, connections between two selected groups of neurons can be lesioned to understand the causal contribution of cross-population interactions.</p><p>In this section, we focus on methods that are particularly useful for analyzing ANNs. These include optimization-based tuning analysis, fixed-point-based dynamical systems analysis, quantitative comparison between models and experimental data, and insights from the perspective of biological evolution.</p><h3 id="相似性比较">Similarity comparison</h3><p>Analysis methods such as visualization, lesioning, tuning, and fixed-point analysis can offer detailed intuition about the neural mechanisms of individual networks. However, because training ANNs is relatively easy, it is possible to train a large number of neural networks for the same task or dataset. With such a volume of data, high-throughput quantitative methods are needed to compare different models at scale. Similarity comparison methods compute a scalar similarity score between the neural activity of two networks performing the same task. These methods are agnostic to the form and size of the networks and apply equally to artificial and biological networks.</p><p>Consider two networks (or two populations of neurons) of sizes <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">N_1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span 
class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mn>2</mn></msub></mrow><annotation encoding="application/x-tex">N_2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, respectively. Their neural activity in response to the same <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>D</mi></mrow><annotation encoding="application/x-tex">D</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">D</span></span></span></span> task conditions can be summarized by a D-by-<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">N_1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord 
mathnormal" style="margin-right:0.109em;">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> matrix <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>R</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">R_1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0077em;">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> and a D-by-<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>N</mi><mn>2</mn></msub></mrow><annotation encoding="application/x-tex">N_2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" 
style="margin-right:0.109em;">N</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> matrix <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>R</mi><mn>2</mn></msub></mrow><annotation encoding="application/x-tex">R_2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0077em;">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> (Figure 4A). Representational similarity analysis (RSA) first computes the dissimilarity, or distance, between neural responses to different task conditions within each network, yielding a D-by-D dissimilarity matrix for each network (Figure 4B). Next, the correlation between the dissimilarity matrices of the two networks is computed. A higher correlation corresponds to more similar representations.</p><img src="/images/000032/05.jpg" width="600" alt="Figure 4: Responses and tuning in a convolutional neural network" align=center /><blockquote><p>(A) Neural responses to images in a convolutional neural network trained to classify handwritten digits. The network consists of two convolutional layers followed by two fully connected layers.<br>(B) Dissimilarity matrices (each D-by-D) assess how similar or dissimilar the neural responses to different input images are. The dissimilarity matrices are computed for neurons in layers 1 and 4 of the network. D = 50. Images are organized by class (0, 1, etc.), with five images per class. Neural responses to images of the same class are more similar in layer 4 (right) than in layer 1 (left); that is, the neural representation is more category-based.<br>(C) Preferred image stimuli found through gradient-based optimization for sample neurons from each layer. Layers 1 and 2 
are convolutional, so their neurons have localized preferred stimuli. In contrast, neurons in layers 3 and 4 have non-local preferred stimuli.</p></blockquote><p>A related approach uses linear regression to predict <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>R</mi><mn>2</mn></msub></mrow><annotation encoding="application/x-tex">R_2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0077em;">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> from a linear transformation of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>R</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">R_1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0077em;">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>R</mi><mn>2</mn></msub><mo>≈</mo><mi>W</mi><msub><mi>R</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">R_2≈WR_1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0077em;">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span 
class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">≈</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0077em;">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>. The similarity is then the correlation between <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>R</mi><mn>2</mn></msub></mrow><annotation encoding="application/x-tex">R_2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0077em;">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> and its predicted value <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi><msub><mi>R</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">WR_1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0077em;">R</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0077em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>.</p><h3 id="复杂调谐分析">Complex tuning analysis</h3><p>Studying the tuning properties of single neurons has long been one of the most important analysis techniques in neuroscience. Classically, tuning properties in sensory areas are studied by presenting stimuli parameterized in a low-dimensional space (such as oriented bars or gratings in vision). This approach is most effective when the neurons under study have relatively simple response properties. A newer class of methods treats the mapping of tuning as a high-dimensional optimization problem and searches directly for the stimulus that most strongly activates a neuron. Gradient-free methods such as genetic algorithms have been used to study complex tuning in biological neurons. In deep neural networks, gradient-based methods can be used. For a neuron with activity <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>r</mi><mo stretchy="false">(</mo><mi>x</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">r(x)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mopen">(</span><span class="mord mathnormal">x</span><span class="mclose">)</span></span></span></span> 
给定输入 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span>，梯度上升优化从一个随机的 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mn>0</mn></msub></mrow><annotation encoding="application/x-tex">x_0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> 开始，并通过更新输入 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi></mrow><annotation encoding="application/x-tex">x</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span></span></span></span> 来进行</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd 
width="50%"></mtd><mtd><mrow><mi>x</mi><mo>→</mo><mi>x</mi><mo>+</mo><mi mathvariant="normal">Δ</mi><mi>x</mi><mo separator="true">;</mo><mspace width="1em"/><mi mathvariant="normal">Δ</mi><mi>x</mi><mo>=</mo><mi>η</mi><mfrac><mrow><mi mathvariant="normal">∂</mi><mi>r</mi></mrow><mrow><mi mathvariant="normal">∂</mi><mi>x</mi></mrow></mfrac></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(31)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">x→x+Δx; \quad Δx=η\frac{∂r}{∂x} \tag{31}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">→</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6667em;vertical-align:-0.0833em;"></span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8778em;vertical-align:-0.1944em;"></span><span class="mord">Δ</span><span class="mord mathnormal">x</span><span class="mpunct">;</span><span class="mspace" style="margin-right:1em;"></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord">Δ</span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" 
style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal">x</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span><span class="tag"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">31</span></span><span class="mord">)</span></span></span></span></span></span></p><p>This method can be used to search for the preferred input of any neuron, or any population of neurons, in a deep network. It is especially useful for studying higher-layer neurons with more complex tuning properties.</p><p>The space of x may be too high-dimensional (for example, pixel space) to search effectively, particularly for gradient-free methods. In that case, one can use a lower-dimensional space that is still highly expressive. A generative model learns a function that maps a low-dimensional latent space to a high-dimensional space such as pixel space. The search can then be carried out in the low-dimensional latent space.</p><p>ANNs can be used to build models of complex behaviors that are otherwise hard to model, opening new possibilities such as studying the encoding of more abstract forms of information. For example, Yang et al. (2019) studied neural tuning to task structure, rather than to stimuli, in rule-guided problem solving<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>12</sup></a>. An ANN was trained to perform many different cognitive tasks commonly used in animal experiments, including perceptual decision making, working memory, inhibitory control, and categorization. Complex network organization emerged through training, with recurrent neurons showing selectivity for subsets of tasks (Figure 5).</p><img src="/images/000032/06.jpg" width="350" alt="Figure 5: Analyzing the tuning properties of a neural network trained to perform 20 cognitive tasks" align=center /><blockquote><p>In a network trained on multiple cognitive tasks, the tuning of model units to individual tasks can be quantified. X axis: recurrent units; Y axis: different tasks. Color measures how strongly each unit is engaged in a task (between 0 and 1). Twelve clusters were identified using a hierarchical clustering method (bottom, colored bars). For example, cluster 3 is highly selective for the pro-response versus anti-response tasks (Anti), which involve inhibitory control; clusters 10 and 11 are engaged in delayed match-to-sample (DMS) and delayed non-match-to-sample (DNMS), respectively; cluster 12 is tuned to DMC.<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>12</sup></a></p></blockquote><h3 
id="动态系统分析">Dynamical systems analysis</h3><p>Tuning properties provide a largely static view of neural representation and computation. To understand how a neural network computes and processes information over time, it is useful to study the dynamics of RNNs.</p><p>A useful way to understand the dynamics is to study fixed points and the network dynamics around them. Consider a generic dynamical system:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mfrac><mrow><mi>d</mi><mi>r</mi></mrow><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><mo>=</mo><mi>F</mi><mo stretchy="false">(</mo><mi>r</mi><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(32)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{dr}{dt}=F(r) \tag{32}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal">t</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" 
style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">F</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">32</span></span><span class="mord">)</span></span></span></span></span></span></p><p>A fixed point <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub></mrow><annotation encoding="application/x-tex">r_{ss}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> is a steady state at which the state does not change over time, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi><mo stretchy="false">(</mo><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub><mo stretchy="false">)</mo><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">F(r_{ss})=0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span 
class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">F</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">0</span></span></span></span>. The network dynamics at a state <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>r</mi><mo>=</mo><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub><mo>+</mo><mi mathvariant="normal">Δ</mi><mi>r</mi></mrow><annotation encoding="application/x-tex">r=r_{ss}+Δr</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.7333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span 
class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord">Δ</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span></span></span> near the fixed point <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub></mrow><annotation encoding="application/x-tex">r_{ss}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> are approximately linear,</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mfrac><mrow><mi>d</mi><mi>r</mi></mrow><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><mo>=</mo><mi>F</mi><mo stretchy="false">(</mo><mi>r</mi><mo stretchy="false">)</mo><mo>=</mo><mi>F</mi><mo stretchy="false">(</mo><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub><mo>+</mo><mi mathvariant="normal">Δ</mi><mi>r</mi><mo stretchy="false">)</mo><mo>≈</mo><mi>F</mi><mo stretchy="false">(</mo><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub><mo stretchy="false">)</mo><mo>+</mo><mi>J</mi><mo stretchy="false">(</mo><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub><mo stretchy="false">)</mo><mi mathvariant="normal">Δ</mi><mi>r</mi><mo separator="true">,</mo><mspace width="1em"/><mfrac><mrow><mi>d</mi><mi mathvariant="normal">Δ</mi><mi>r</mi></mrow><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><mo>=</mo><mi>J</mi><mo stretchy="false">(</mo><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub><mo stretchy="false">)</mo><mi mathvariant="normal">Δ</mi><mi>r</mi></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(33)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{dr}{dt}=F(r)=F(r_{ss}+Δr)≈F(r_{ss})+J(r_{ss})Δr,\quad \frac{dΔr}{dt}=J(r_{ss})Δr \tag{33}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal">t</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" 
style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">F</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">F</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" 
style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">Δ</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">≈</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">F</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord mathnormal" style="margin-right:0.0962em;">J</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span 
class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mord">Δ</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mpunct">,</span><span class="mspace" style="margin-right:1em;"></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal">t</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord">Δ</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.0962em;">J</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span 
style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mord">Δ</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span><span class="tag"><span class="strut" style="height:2.0574em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">33</span></span><span class="mord">)</span></span></span></span></span></span></p><p>where <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>J</mi></mrow><annotation encoding="application/x-tex">J</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.0962em;">J</span></span></span></span> is the Jacobian matrix of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>F</mi></mrow><annotation encoding="application/x-tex">F</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">F</span></span></span></span>, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>J</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo>=</mo><mi mathvariant="normal">∂</mi><msub><mi>F</mi><mi>i</mi></msub><mi mathvariant="normal">/</mi><mi 
mathvariant="normal">∂</mi><msub><mi>r</mi><mi>j</mi></msub></mrow><annotation encoding="application/x-tex">J_{ij}=∂F_{i}/∂r_j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0962em;">J</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0962em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">F</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord">/</span><span class="mord" style="margin-right:0.0556em;">∂</span><span class="mord"><span 
class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>, evaluated at <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub></mrow><annotation encoding="application/x-tex">r_{ss}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>. This is a linear system, which is easier to understand, for example by studying the eigenvectors and eigenvalues of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>J</mi><mo stretchy="false">(</mo><msub><mi>r</mi><mrow><mi>s</mi><mi>s</mi></mrow></msub><mo stretchy="false">)</mo></mrow><annotation 
encoding="application/x-tex">J(r_{ss})</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.0962em;">J</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">ss</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span>. In ANNs, these fixed points can be found by gradient-based optimization,</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>a</mi><mi>r</mi><mi>g</mi><mi>m</mi><mi>i</mi><msub><mi>n</mi><mi>r</mi></msub><mi mathvariant="normal">∥</mi><mi>F</mi><mo stretchy="false">(</mo><mi>r</mi><mo stretchy="false">)</mo><msup><mi mathvariant="normal">∥</mi><mn>2</mn></msup></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(34)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">argmin_r\Vert F(r)\Vert ^2 \tag{34}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.1141em;vertical-align:-0.25em;"></span><span class="mord mathnormal">a</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span 
class="mord mathnormal">mi</span><span class="mord"><span class="mord mathnormal">n</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord">∥</span><span class="mord mathnormal" style="margin-right:0.1389em;">F</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mclose">)</span><span class="mord"><span class="mord">∥</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8641em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:1.1141em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">34</span></span><span class="mord">)</span></span></span></span></span></span></p><p>固定点对于理解网络如何存储记忆、积累信息以及在离散状态之间过渡特别有用。这一点可以在一个被训练来执行参数化工作记忆任务的网络中说明<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>13</sup></a>。在这项任务中，一个频率为 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">f_1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.1076em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> 的振动触觉刺激样本呈现，随后是几秒钟的延迟期；然后是频率为 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mn>2</mn></msub></mrow><annotation encoding="application/x-tex">f_2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.1076em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> 的测试刺激呈现，受试者必须决定 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mn>2</mn></msub></mrow><annotation encoding="application/x-tex">f_2</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.1076em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> 是比 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">f_1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.1076em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> 高还是低 (图 6A)。在延迟期间，行为猴的前额叶皮层的神经元表现出持续的活动，其速率随 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>f</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">f_1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.1076em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> 的变化而单调变化。This parametric working-memory code emerges in trained RNNs (Figure 6B): in the state space of the network, neural trajectories during the delay period converge to different fixed points depending on the stored value. These fixed points form an approximate line attractor during the delay period (Figure 6C).</p><img src="/images/000032/07.jpg" width="600" alt="Figure 6. Understanding network computation through state-space and dynamical-systems analysis" align=center /><blockquote><p>(A–C) In a simple parametric working-memory task, the network must remember the (frequency) value of a stimulus across a delay period (A). The network can implement this parametric working memory by developing a line attractor (B and C).<br>(B) Trial-averaged neural activity during the delay period for different stimulus values, shown in PCA space. Triangles mark the start of the delay period.<br>(C) Fixed points found through optimization (orange crosses). The direction of the line attractor can be estimated by finding the eigenvector whose eigenvalue is close to 0. The orange line shows the line attractor estimated around one of the fixed points.<br>(D–G) Recurrent neural networks and monkeys trained on a delayed match-to-category task. The task is to decide whether the test and sample stimuli (visual motion patterns) belong to the same category (D). The two categories are defined by the motion direction of the stimulus (red, category 1; blue, category 2) (E). In an ANN trained on this categorization task, the model's recurrent units show wide heterogeneity in the onset time of category selectivity, similar to single neurons recorded from monkey posterior parietal cortex (lateral intraparietal area, LIP) during the task (F). Neural dynamics of a recurrent neural network performing the DMC task (G). The final decisions, match (AA or BB) or non-match (AB or BA), correspond to distinct attractor states located at different positions in state space. Similar population activity trajectories were also found in the experimental data.</p></blockquote><blockquote><p>This figure is from <a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE">14</a></p></blockquote><p>Computational neuroscience has few examples that illustrate not just a single aspect of neural representation or dynamics but the whole sequence of computations that accomplishes a complex task. ANNs offer a new tool for tackling this difficulty. Chaisangmongkon et al. (2017) used this approach to build a model of a delayed match-to-category (DMC) task<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>14</sup></a>. A DMC task (Figures 6D and 6E) starts with a sample stimulus, such as a visual moving pattern, one feature of which (motion direction, an analog quantity from 0° to 360°) is divided into two categories (A in red, B in blue). After a memory delay period, a test stimulus is shown, and the task is to decide whether the test has the same category membership as the sample. After being trained on this task, a recurrent neural network shows diverse neural activity patterns similar to those of parietal neurons in monkeys performing the same task (Figure 6F). The trajectories of the recurrent neural population in state space reveal how the computation unfolds across task epochs (Figure 6G).</p><h3 id="从目标、结构和训练中理解神经回路">Understanding neural circuits from objectives, architecture, and training</h3><p>All of the approaches above seek a mechanistic understanding of ANNs after training. A more integrative view links the three basic ingredients of deep learning, namely the learning problem (task/objective), the network architecture, and the training algorithm, to the solution obtained after training. This approach resembles an evolutionary or developmental perspective in biology, which links the environment to the functions of an organism. It can help explain the computational benefit or necessity of observed structures or functions. For example, compared with purely feedforward networks, recurrently connected deep networks better predict the responses of neurons in higher visual areas to behaviorally challenging images of cluttered scenes. This suggests that recurrent connections contribute to the classification of difficult images in the brain.</p><p>Although rerunning the biological processes of development and evolution may be difficult, retraining networks with different objectives, architectures, and algorithms is fairly straightforward thanks to recent advances in machine learning. Whenever training an ANN leads to a conclusion, it is good practice to vary the hyperparameters that describe the basic ingredients (within reason) to explore the necessary and sufficient conditions for that conclusion.</p><p>The link from these three ingredients to the network solution is usually not rigorous. In certain simplified cases, however, the link can be firmly established by solving the training process analytically.</p><h2 id="5-生物拟真的网络架构和学习">5. Biologically realistic network architectures and learning</h2><p>Although neuroscientists and cognitive scientists have had considerable success with the standard neural network architectures (vanilla RNNs) and training algorithms (e.g., SGD) used in machine learning, for many neuroscience questions it is essential to build network architectures and use learning algorithms that are biologically plausible. In this section, we outline methods for building networks with more biologically realistic structures, canonical computations, and plasticity rules.</p><h3 id="5-1-结构化连接">5.1. Structured connections</h3><p>Modern neurophysiological experiments routinely record from multiple brain areas and/or multiple cell types in the same animal during behavior. Computational efforts to model these findings can be greatly facilitated by incorporating basic biological structures into neural networks, such as currently known cell-type-specific connectivity and long-range connections across model areas/layers.</p><p>In a vanilla recurrent network, the default connectivity is all-to-all. In contrast, both local and long-range connections in biological neural systems are usually sparse. One way to obtain a sparse connectivity matrix <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi></mrow><annotation encoding="application/x-tex">W</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span></span></span> is to multiply a trainable matrix <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mover accent="true"><mi>W</mi><mo>~</mo></mover></mrow><annotation encoding="application/x-tex">\tilde{W}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9202em;"></span><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.9202em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span><span style="top:-3.6023em;"><span class="pstrut" style="height:3em;"></span><span class="accent-body" 
style="left:-0.25em;"><span class="mord">~</span></span></span></span></span></span></span></span></span></span> element-wise with a non-trainable sparse mask <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>M</mi></mrow><annotation encoding="application/x-tex">M</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.109em;">M</span></span></span></span>, i.e., <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi><mo>=</mo><mover accent="true"><mi>W</mi><mo>~</mo></mover><mo>⊙</mo><mi>M</mi></mrow><annotation encoding="application/x-tex">W=\tilde{W}\odot M</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0035em;vertical-align:-0.0833em;"></span><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.9202em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span><span style="top:-3.6023em;"><span class="pstrut" style="height:3em;"></span><span class="accent-body" style="left:-0.25em;"><span class="mord">~</span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">⊙</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" 
style="margin-right:0.109em;">M</span></span></span></span>. To encourage sparsity without strictly imposing it, an L1 regularization term <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>β</mi><msub><mo>∑</mo><mrow><mi>i</mi><mi>j</mi></mrow></msub><mi mathvariant="normal">∣</mi><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mi mathvariant="normal">∣</mi></mrow><annotation encoding="application/x-tex">β\sum_{ij}\vert W_{ij} \vert</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.1858em;vertical-align:-0.4358em;"></span><span class="mord mathnormal" style="margin-right:0.0528em;">β</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:0em;">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.162em;"><span style="top:-2.4003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.4358em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord">∣</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mord">∣</span></span></span></span> can be added to the loss function. The scalar coefficient <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>β</mi></mrow><annotation encoding="application/x-tex">β</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0528em;">β</span></span></span></span> controls the strength of the sparsity constraint. To model cell-type-specific findings, it is necessary to build neural networks with multiple cell types. A vanilla recurrent network (Equations 14, 15, and 16), or any other network, can be easily modified to obey Dale's law by separating excitatory and inhibitory neurons.</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mfrac><mrow><mi>d</mi><msup><mi>r</mi><mi>E</mi></msup></mrow><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><mo>=</mo><mo>−</mo><msup><mi>r</mi><mi>E</mi></msup><mo>+</mo><msub><mi>f</mi><mi>E</mi></msub><mo stretchy="false">(</mo><msub><mi>W</mi><mrow><mi>E</mi><mi>E</mi></mrow></msub><msup><mi>r</mi><mi>E</mi></msup><mo>−</mo><msub><mi>W</mi><mrow><mi>E</mi><mi>I</mi></mrow></msub><msup><mi>r</mi><mi>I</mi></msup><mo>+</mo><msub><mi>W</mi><mrow><mi>E</mi><mi>x</mi></mrow></msub><mi>x</mi><mo>+</mo><msup><mi>b</mi><mi>E</mi></msup><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(35)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{dr^E}{dt}=−r^E+f_E(W_{EE}r^E−W_{EI}r^I+W_{Ex}x+b^E) \tag{35}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.2043em;vertical-align:-0.686em;"></span><span class="mord"><span 
class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.5183em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal">t</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8413em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span></span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.9747em;vertical-align:-0.0833em;"></span><span class="mord">−</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.0576em;">E</span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.1413em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1076em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" 
style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.0413em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span 
class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span><span class="mord mathnormal mtight">x</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.1413em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span></span></span></span></span></span></span></span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:2.2043em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">35</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd 
width="50%"></mtd><mtd><mrow><mfrac><mrow><mi>d</mi><msup><mi>r</mi><mi>I</mi></msup></mrow><mrow><mi>d</mi><mi>t</mi></mrow></mfrac><mo>=</mo><mo>−</mo><msup><mi>r</mi><mi>I</mi></msup><mo>+</mo><msub><mi>f</mi><mi>I</mi></msub><mo stretchy="false">(</mo><msub><mi>W</mi><mrow><mi>I</mi><mi>E</mi></mrow></msub><msup><mi>r</mi><mi>E</mi></msup><mo>−</mo><msub><mi>W</mi><mrow><mi>I</mi><mi>I</mi></mrow></msub><msup><mi>r</mi><mi>I</mi></msup><mo>+</mo><msub><mi>W</mi><mrow><mi>I</mi><mi>x</mi></mrow></msub><mi>x</mi><mo>+</mo><msup><mi>b</mi><mi>I</mi></msup><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(36)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\frac{dr^I}{dt}=−r^I+f_I(W_{IE}r^E−W_{II}r^I+W_{Ix}x+b^I) \tag{36}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.2043em;vertical-align:-0.686em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.5183em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord mathnormal">t</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord mathnormal">d</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8413em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.0785em;">I</span></span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.9747em;vertical-align:-0.0833em;"></span><span class="mord">−</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.1413em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1076em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" 
style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.0413em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span></span></span></span></span><span 
class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8913em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span><span class="mord mathnormal mtight">x</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.1413em;vertical-align:-0.25em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" 
style="height:0.8913em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0785em;">I</span></span></span></span></span></span></span></span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:2.2043em;vertical-align:-0.686em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">36</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Here the absolute-value function <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">∣</mi><mi mathvariant="normal">.</mi><mi mathvariant="normal">∣</mi></mrow><annotation encoding="application/x-tex">\vert . \vert</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">∣.∣</span></span></span></span> constrains the sign of the connection weights, for example <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mrow><mi>E</mi><mi>E</mi></mrow></msub><mo>=</mo><mi mathvariant="normal">∣</mi><msub><mover accent="true"><mi>W</mi><mo>~</mo></mover><mrow><mi>E</mi><mi>E</mi></mrow></msub><mi mathvariant="normal">∣</mi></mrow><annotation encoding="application/x-tex">W_{EE}=\vert \tilde{W}_{EE}\vert</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span 
class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.1702em;vertical-align:-0.25em;"></span><span class="mord">∣</span><span class="mord"><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.9202em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span><span style="top:-3.6023em;"><span class="pstrut" style="height:3em;"></span><span class="accent-body" style="left:-0.25em;"><span class="mord">~</span></span></span></span></span></span></span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span><span class="mord mathnormal mtight" style="margin-right:0.0576em;">E</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord">∣</span></span></span></span>.
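</p><p>As a rough illustration of the sign constraint described here, the following Python sketch builds a weight matrix that obeys Dale's law from an unconstrained one. The helper name <code>dale_weights</code>, the one-column-per-presynaptic-neuron layout, and the toy numbers are assumptions made for this example and are not taken from the original text; folding the minus sign of the inhibitory terms into the matrix is simply one equivalent way of writing the same constraint.</p>

```python
def dale_weights(w_free, is_excitatory):
    """Return a weight matrix whose signs follow Dale's law.

    w_free: unconstrained weights, one row per postsynaptic neuron and
        one column per presynaptic neuron (hypothetical layout).
    is_excitatory: one boolean per presynaptic neuron.
    """
    constrained = []
    for row in w_free:
        # abs() plays the role of |.|: it keeps the magnitude and drops the
        # free sign; the presynaptic cell type then decides the final sign.
        constrained.append(
            [abs(w) if exc else -abs(w) for w, exc in zip(row, is_excitatory)]
        )
    return constrained


# Toy example: neurons 0 and 1 are excitatory, neuron 2 is inhibitory.
w_free = [[0.5, -1.2, 0.3],
          [-0.7, 0.4, -0.9],
          [1.1, -0.2, 0.6]]
is_excitatory = [True, True, False]
w = dale_weights(w_free, is_excitatory)
for row in w:
    print(row)
```

<p>In this sketch every excitatory column comes out non-negative and every inhibitory column non-positive, whatever signs the unconstrained weights take during training.</p><p>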
After training an ANN on the classical &quot;random dot&quot; motion-direction discrimination task, one can &quot;open the black box&quot; and examine the &quot;wiring diagram&quot; of the resulting recurrent connectivity (Figure 7). With Dale's law incorporated, the connectivity that emerges from training is a heterogeneous version of a biologically based structured network model of decision-making, which shows that machine learning brought closer to the brain's hardware can indeed yield insights into biological neural networks.</p><img src="/images/000032/08.jpg" width="350" alt="Figure 7: Training a network with Dale's law" align=center /><blockquote><p>Connectivity matrix of a recurrent network trained on a perceptual decision-making task. The network obeys Dale's law, with separate groups of excitatory (blue) and inhibitory (red) neurons. Only connections between neurons with high stimulus selectivity are shown. Neurons are sorted by their stimulus selectivity for choice 1 and choice 2. Recurrent excitatory connections between neurons selective for the same choice are marked by the two black squares.</p></blockquote><p>The extensive long-range connectivity across brain areas can also be included in ANNs. In classical convolutional neural networks, each layer receives feedforward input only from the immediately preceding layer. In some more recent networks, however, each layer also receives feedforward input from much earlier layers. In convolutional recurrent networks, the neurons in each layer additionally receive feedback input from later layers as well as local recurrent connections.</p><h3 id="5-2-经典计算">5.2 Canonical computations</h3><p>Neuroscientists have identified several canonical computations that are carried out across a wide range of brain areas, including attention, normalization, and gating. Here we discuss how such canonical computations can be introduced into neural networks. They act as modular architectural components that can be plugged into many networks. Interestingly, each of the canonical computations mentioned above has a counterpart in machine-learning-based neural networks. We highlight the differences and similarities between purely machine-learning implementations and more biological ones.</p><h4 id="归一化-2">Normalization</h4><p>Divisive normalization is widely observed in biological neural systems. In divisive normalization, the activation of a neuron <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">r_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> is no longer determined by its immediate input <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>I</mi><mi>i</mi></msub></mrow><annotation 
encoding="application/x-tex">I_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0785em;">I</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0785em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> alone, as in <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mi>i</mi></msub><mo>=</mo><mi>f</mi><mo stretchy="false">(</mo><msub><mi>I</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">r_i=f(I_i)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span 
class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0785em;">I</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0785em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span>. Instead, it is normalized by the summed input <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mo>∑</mo><mi>j</mi></msub><msub><mi>I</mi><mi>j</mi></msub></mrow><annotation encoding="application/x-tex">\sum_j I_j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.1858em;vertical-align:-0.4358em;"></span><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:0em;">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.162em;"><span style="top:-2.4003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.4358em;"><span></span></span></span></span></span></span><span class="mspace" 
style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0785em;">I</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0785em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> to a broader pool of neurons, called the normalization pool.</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>r</mi><mi>i</mi></msub><mo>=</mo><mi>f</mi><mo stretchy="false">(</mo><mi>γ</mi><mfrac><msub><mi>I</mi><mi>i</mi></msub><mrow><munder><mo>∑</mo><mi>j</mi></munder><msub><mi>I</mi><mi>j</mi></msub><mo>+</mo><mi>σ</mi></mrow></mfrac><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(37)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">r_i=f(γ \frac{I_i}{\sum_j I_j+σ}) \tag{37}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span 
class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.4821em;vertical-align:-1.1218em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0556em;">γ</span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3603em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:0em;">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.162em;"><span style="top:-2.4003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.4358em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0785em;">I</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0785em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span 
class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.0785em;">I</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0785em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.1218em;"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:2.4821em;vertical-align:-1.1218em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">37</span></span><span class="mord">)</span></span></span></span></span></span></p><p>The specific choice of normalization pool depends on the system being studied. Biologically, although synaptic inputs add up in driving a neuron, feedback inhibition can effectively produce normalization. This form of divisive normalization is differentiable, so it can be incorporated directly into ANNs.
</p><p>Normalization is also a key part of many neural networks in machine learning. Like divisive normalization, machine-learning-based normalization methods aim to bring neuronal responses into a range suitable for processing by downstream areas. Unlike divisive normalization, the mean input to a pool of neurons is usually subtracted from the immediate input rather than dividing it (Eq. 21). These methods also compute the standard deviation of the inputs to the normalization pool, a step that may not be biologically plausible. Different machine-learning-based normalization methods are distinguished by their choice of normalization pool.</p><h4 id="注意">Attention</h4><p>Attention has been studied extensively in neuroscience. Computational models can capture various aspects of bottom-up and top-down attention. In computational models, top-down attention usually takes the form of a multiplicative gain field applied to the activity of a specific group of neurons. In the case of spatial attention, consider a group of neurons, each with a preferred spatial location <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">x_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> and a pre-attention activity <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mover accent="true"><mi>r</mi><mo>~</mo></mover><mo stretchy="false">(</mo><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">\tilde{r}(x_i)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6679em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord mathnormal" 
style="margin-right:0.0278em;">r</span></span><span style="top:-3.35em;"><span class="pstrut" style="height:3em;"></span><span class="accent-body" style="left:-0.1944em;"><span class="mord">~</span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span> in response to a given stimulus. The attended spatial location <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>q</mi></msub></mrow><annotation encoding="application/x-tex">x_q</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">q</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> gives rise to attention weights <span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mi>i</mi></msub><mo stretchy="false">(</mo><msub><mi>x</mi><mi>q</mi></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">α_i(x_q)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">q</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span>, which are larger when <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>q</mi></msub></mrow><annotation encoding="application/x-tex">x_q</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">q</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> is similar to <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">x_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>. The attention weights can then be used to modulate the response of neuron <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi></mrow><annotation encoding="application/x-tex">i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6595em;"></span><span class="mord 
mathnormal">i</span></span></span></span>, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>r</mi><mi>i</mi></msub><mo stretchy="false">(</mo><msub><mi>x</mi><mi>q</mi></msub><mo stretchy="false">)</mo><mo>=</mo><msub><mi>α</mi><mi>i</mi></msub><mo stretchy="false">(</mo><msub><mi>x</mi><mi>q</mi></msub><mo stretchy="false">)</mo><mover accent="true"><mi>r</mi><mo>~</mo></mover><mo stretchy="false">(</mo><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">r_i(x_q)=α_i(x_q)\tilde{r}(x_i)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">q</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">q</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6679em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span><span style="top:-3.35em;"><span class="pstrut" style="height:3em;"></span><span class="accent-body" style="left:-0.1944em;"><span 
class="mord">~</span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span>. Similarly, feature attention strengthens the activity of neurons that are selective for the attended feature (for example, a specific color). Such top-down spatial and feature attention can be included in convolutional neural networks.</p><p>Meanwhile, attention has become widely used in machine learning and is a standard component of recent natural language processing models. Although machine-learning attention mechanisms look rather different from attention models in neuroscience, the two are in fact closely related, as we show below.</p><p>在深度学习中，注意力可以被看作是一个可微的字典检索过程。一个普通的字典存储了一些键值对（例如，单词-解释对） <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">{</mo><mo stretchy="false">(</mo><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo separator="true">,</mo><msup><mi>v</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo stretchy="false">)</mo><mo stretchy="false">}</mo></mrow><annotation encoding="application/x-tex">\{ (k^{(i)},v^{(i)})\}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.138em;vertical-align:-0.25em;"></span><span class="mopen">{(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span 
class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">v</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose">)}</span></span></span></span>，类似于查找一个词 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo fence="true">(</mo><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo fence="true">)</mo></mrow><annotation encoding="application/x-tex">\left(k^{(i)}\right)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.238em;vertical-align:-0.35em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;"><span class="delimsizing size1">(</span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose 
mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em;"><span class="delimsizing size1">)</span></span></span></span></span></span> 的解释 <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo fence="true">(</mo><msup><mi>v</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo fence="true">)</mo></mrow><annotation encoding="application/x-tex">\left(v^{(i)}\right)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.238em;vertical-align:-0.35em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;"><span class="delimsizing size1">(</span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">v</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em;"><span class="delimsizing size1">)</span></span></span></span></span></span>。For a given query <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi></mrow><annotation encoding="application/x-tex">q</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">q</span></span></span></span>, using the dictionary involves searching for a key <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">k^{(j)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> that matches <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mo separator="true">,</mo><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo stretchy="false">)</mo></mrow></msup><mo>=</mo><mi>q</mi></mrow><annotation encoding="application/x-tex">q,k^{(j)}=q</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0824em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">q</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">q</span></span></span></span>, and retrieving the corresponding value, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>y</mi><mo>=</mo><msup><mi>v</mi><mrow><mo stretchy="false">(</mo><mi>j</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">y=v^{(j)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">v</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span>.
This process can be viewed as modulating each value <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>v</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">v^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">v</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> by an attention weight <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">α_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> that measures the similarity between the key <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">k^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> and the query <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi></mrow><annotation encoding="application/x-tex">q</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">q</span></span></span></span>. In the simple binary case:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>α</mi><mi>i</mi></msub><mo>=</mo><mrow><mo fence="true">{</mo><mtable rowspacing="0.36em" columnalign="left left" columnspacing="1em"><mtr><mtd><mstyle scriptlevel="0" displaystyle="false"><mrow><mn>1</mn><mo separator="true">,</mo><mtext>  </mtext><mi>i</mi><mi>f</mi><mspace width="1em"/><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo>=</mo><mi>q</mi></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="false"><mrow><mn>0</mn><mo separator="true">,</mo><mtext>  </mtext><mi>o</mi><mi>t</mi><mi>h</mi><mi>e</mi><mi>r</mi><mi>w</mi><mi>i</mi><mi>s</mi><mi>e</mi></mrow></mstyle></mtd></mtr></mtable></mrow></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(38)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">\alpha _i = \begin{cases} 1,\; if \quad k^{(i)}=q\\ 0,\; otherwise \end{cases} \tag{38}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span
class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:3em;vertical-align:-1.25em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;"><span class="delimsizing size4">{</span></span><span class="mord"><span class="mtable"><span class="col-align-l"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.69em;"><span style="top:-3.69em;"><span class="pstrut" style="height:3.008em;"></span><span class="mord"><span class="mord">1</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mspace" style="margin-right:1em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">q</span></span></span><span style="top:-2.25em;"><span class="pstrut" style="height:3.008em;"></span><span class="mord"><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal">o</span><span class="mord mathnormal">t</span><span class="mord 
mathnormal">h</span><span class="mord mathnormal" style="margin-right:0.0278em;">er</span><span class="mord mathnormal" style="margin-right:0.0269em;">w</span><span class="mord mathnormal">i</span><span class="mord mathnormal">se</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.19em;"><span></span></span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span><span class="tag"><span class="strut" style="height:3em;vertical-align:-1.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">38</span></span><span class="mord">)</span></span></span></span></span></span></p><p>The weights then modulate the output as:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>y</mi><mo>=</mo><munder><mo>∑</mo><mi>i</mi></munder><msub><mi>α</mi><mi>i</mi></msub><msup><mi>v</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(39)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">y=\sum_iα_iv^{(i)} \tag{39}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.3277em;vertical-align:-1.2777em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8723em;margin-left:0em;"><span class="pstrut"
style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2777em;"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">v</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:2.3277em;vertical-align:-1.2777em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">39</span></span><span class="mord">)</span></span></span></span></span></span></p><p>In the spatial attention case above, the <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi></mrow><annotation encoding="application/x-tex">i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6595em;"></span><span class="mord mathnormal">i</span></span></span></span>-th key-value pair is <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo fence="true">(</mo><msub><mi>x</mi><mi>i</mi></msub><mo separator="true">,</mo><mover accent="true"><mi>r</mi><mo>~</mo></mover><mo stretchy="false">(</mo><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="false">)</mo><mo fence="true">)</mo></mrow><annotation encoding="application/x-tex">\left(x_i, \tilde{r}(x_i)\right)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6679em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span></span><span style="top:-3.35em;"><span class="pstrut" style="height:3em;"></span><span class="accent-body" style="left:-0.1944em;"><span class="mord">~</span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mclose delimcenter" style="top:0em;">)</span></span></span></span></span>, and the query is the attended spatial location <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>q</mi></msub></mrow><annotation encoding="application/x-tex">x_q</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">q</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>.
Each neuron's response is modulated according to how similar its preferred spatial location (its key) <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">x_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> is to the attended location (the query) <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mi>q</mi></msub></mrow><annotation encoding="application/x-tex">x_q</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">q</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>.</p><p>The use of attention in machine learning makes the query-key comparison and the value retrieval differentiable.
The query is compared with each key vector <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">k^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.888em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span> to obtain the attention weights (normalized similarity scores) <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>α</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">α_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>:</p><p><span class="katex-display"><span class="katex"><span
class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>c</mi><mi>i</mi></msub><mo>=</mo><mi>s</mi><mi>c</mi><mi>o</mi><mi>r</mi><mi>e</mi><mrow><mo fence="true">(</mo><mi>q</mi><mo separator="true">,</mo><mtext> </mtext><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo fence="true">)</mo></mrow></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(40)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">c_i=score \left(q,\ k^{(i)}\right) \tag{40}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.8em;vertical-align:-0.65em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">scor</span><span class="mord mathnormal">e</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;"><span class="delimsizing size2">(</span></span><span class="mord mathnormal" style="margin-right:0.0359em;">q</span><span class="mpunct">,</span><span 
class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.938em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em;"><span class="delimsizing size2">)</span></span></span></span><span class="tag"><span class="strut" style="height:1.8em;vertical-align:-0.65em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">40</span></span><span class="mord">)</span></span></span></span></span></span></p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>α</mi><mn>1</mn></msub><mo separator="true">,</mo><mo>⋯</mo><mo separator="true">,</mo><msub><mi>α</mi><mi>N</mi></msub><mo>=</mo><mi>n</mi><mi>o</mi><mi>r</mi><mi>m</mi><mi>a</mi><mi>l</mi><mi>i</mi><mi>z</mi><mi>e</mi><mtext> </mtext><mo stretchy="false">(</mo><msub><mi>c</mi><mn>1</mn></msub><mo separator="true">,</mo><mo>⋯</mo><mo separator="true">,</mo><msub><mi>c</mi><mi>N</mi></msub><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(41)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">α_1,⋯,α_N=normalize\ (c_1,⋯,c_N) \tag{41}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span 
class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner">⋯</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.109em;">N</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal">n</span><span class="mord mathnormal" style="margin-right:0.0278em;">or</span><span class="mord mathnormal">ma</span><span class="mord mathnormal" style="margin-right:0.0197em;">l</span><span class="mord mathnormal">i</span><span class="mord mathnormal" 
style="margin-right:0.044em;">z</span><span class="mord mathnormal">e</span><span class="mspace"> </span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner">⋯</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.109em;">N</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">41</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Here, the similarity scoring function can be a simple inner product, score<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mrow><mo fence="true">(</mo><mi>q</mi><mo
separator="true">,</mo><mtext> </mtext><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup><mo fence="true">)</mo></mrow><mo>=</mo><msup><mi>q</mi><mo>⊺</mo></msup><msup><mi>k</mi><mrow><mo stretchy="false">(</mo><mi>i</mi><mo stretchy="false">)</mo></mrow></msup></mrow><annotation encoding="application/x-tex">\left(q,\ k^{(i)}\right)=q^⊺k^{(i)}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.238em;vertical-align:-0.35em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;"><span class="delimsizing size1">(</span></span><span class="mord mathnormal" style="margin-right:0.0359em;">q</span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em;"><span class="delimsizing size1">)</span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0824em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">q</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span
style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0315em;">k</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.888em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mopen mtight">(</span><span class="mord mathnormal mtight">i</span><span class="mclose mtight">)</span></span></span></span></span></span></span></span></span></span></span></span>, and the normalization function can be the softmax function,</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>α</mi><mi>i</mi></msub><mo>=</mo><mfrac><msup><mi>e</mi><msub><mi>c</mi><mi>i</mi></msub></msup><mrow><munder><mo>∑</mo><mi>j</mi></munder><msup><mi>e</mi><msub><mi>c</mi><mi>j</mi></msub></msup></mrow></mfrac><mo separator="true">,</mo><mtext> 于是 </mtext><munder><mo>∑</mo><mi>i</mi></munder><msub><mi>α</mi><mi>i</mi></msub><mo>=</mo><mn>1</mn></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(42)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">α_i=\frac{e^{c_i}}{\sum_je^{c_j}},\ \text{于是}\  \sum_iα_i=1 \tag{42}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span
style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.6191em;vertical-align:-1.2777em;"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3414em;"><span style="top:-2.314em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mop"><span class="mop op-symbol small-op" style="position:relative;top:0em;">∑</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.162em;"><span style="top:-2.4003em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.4358em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal">e</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6065em;"><span style="top:-3.0051em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight">c</span><span 
class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3281em;"><span style="top:-2.357em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2819em;"><span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span><span style="top:-3.23em;"><span class="pstrut" style="height:3em;"></span><span class="frac-line" style="border-bottom-width:0.04em;"></span></span><span style="top:-3.677em;"><span class="pstrut" style="height:3em;"></span><span class="mord"><span class="mord"><span class="mord mathnormal">e</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mord mathnormal mtight">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3281em;"><span style="top:-2.357em;margin-left:0em;margin-right:0.0714em;"><span class="pstrut" style="height:2.5em;"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.143em;"><span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.1218em;"><span></span></span></span></span></span><span class="mclose 
nulldelimiter"></span></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord text"><span class="mord cjk_fallback">于是</span></span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8723em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2777em;"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0037em;">α</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0037em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span><span class="tag"><span class="strut" style="height:2.6191em;vertical-align:-1.2777em;"></span><span class="mord text"><span class="mord">(</span><span 
class="mord"><span class="mord">42</span></span><span class="mord">)</span></span></span></span></span></span></p><p>The normalization function is essential, because it effectively forces the network to focus on a few key vectors (in the case of spatial attention, a few attended locations).</p><h4 id="门控">Gating</h4><p>An important computation in biological neural systems is gating. Gating refers to the idea of controlling the flow of information without necessarily distorting its content. Gating in biological systems can be implemented through various mechanisms. Attentional modulation multiplies the inputs to neurons by a gain factor, providing a graded gating mechanism at the level of sensory systems. Another form of gating may involve several types of inhibitory neurons. At the behavioral level, gating often appears to be <strong>all</strong> or <strong>none</strong>, as in effects such as inattentional blindness.</p><p>In deep learning, multiplicative gating is essential to popular recurrent network architectures such as LSTM (long short-term memory) networks (Equation 43) and GRU (gated recurrent unit) networks. Gated networks are generally easier to train and more powerful than vanilla RNNs. Gating variables dynamically control the flow of information in these networks through multiplicative interactions. An LSTM network has three types of gating variables. The input and output gates, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>g</mi><mi>t</mi><mi>i</mi></msubsup></mrow><annotation encoding="application/x-tex">g_t^i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0717em;vertical-align:-0.247em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8247em;"><span style="top:-2.453em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>g</mi><mi>t</mi><mi>o</mi></msubsup></mrow><annotation 
encoding="application/x-tex">g_t^o</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9114em;vertical-align:-0.247em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-2.453em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">o</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span></span></span></span>, control the inputs to and outputs of the cell state <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>c</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">c_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>, while the forget gate 
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>g</mi><mi>t</mi><mi>f</mi></msubsup></mrow><annotation encoding="application/x-tex">g_t^f</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2128em;vertical-align:-0.2458em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.967em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.1809em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2458em;"><span></span></span></span></span></span></span></span></span></span> controls whether the cell state <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>c</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">c_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal 
mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> retains its memory <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>c</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub></mrow><annotation encoding="application/x-tex">c_{t−1}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6389em;vertical-align:-0.2083em;"></span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span></span></span></span>.</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mo fence="true">{</mo><mtable rowspacing="0.36em" columnalign="left left" columnspacing="1em"><mtr><mtd><mstyle scriptlevel="0" displaystyle="false"><mrow><msubsup><mi>g</mi><mi>t</mi><mi>f</mi></msubsup><mo>=</mo><msub><mi>σ</mi><mi>g</mi></msub><mrow><mo fence="true">(</mo><msub><mi>W</mi><mi>f</mi></msub><msub><mi>x</mi><mi>t</mi></msub><mo>+</mo><msub><mi>U</mi><mi>f</mi></msub><msub><mi>r</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><msub><mi>b</mi><mi>f</mi></msub><mo 
fence="true">)</mo></mrow><mo separator="true">,</mo></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="false"><mrow><msubsup><mi>g</mi><mi>t</mi><mi>i</mi></msubsup><mo>=</mo><msub><mi>σ</mi><mi>g</mi></msub><mrow><mo fence="true">(</mo><msub><mi>W</mi><mi>i</mi></msub><msub><mi>x</mi><mi>t</mi></msub><mo>+</mo><msub><mi>U</mi><mi>i</mi></msub><msub><mi>r</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><msub><mi>b</mi><mi>i</mi></msub><mo fence="true">)</mo></mrow><mo separator="true">,</mo></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="false"><mrow><msubsup><mi>g</mi><mi>t</mi><mi>o</mi></msubsup><mo>=</mo><msub><mi>σ</mi><mi>g</mi></msub><mrow><mo fence="true">(</mo><msub><mi>W</mi><mi>o</mi></msub><msub><mi>x</mi><mi>t</mi></msub><mo>+</mo><msub><mi>U</mi><mi>o</mi></msub><msub><mi>r</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><msub><mi>b</mi><mi>o</mi></msub><mo fence="true">)</mo></mrow><mo separator="true">,</mo></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="false"><mrow><msub><mi>c</mi><mi>t</mi></msub><mo>=</mo><msubsup><mi>g</mi><mi>t</mi><mi>f</mi></msubsup><mo>⊙</mo><msub><mi>c</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><msubsup><mi>g</mi><mi>t</mi><mi>i</mi></msubsup><mo>⊙</mo><msub><mi>σ</mi><mi>g</mi></msub><mrow><mo fence="true">(</mo><msub><mi>W</mi><mi>c</mi></msub><msub><mi>x</mi><mi>t</mi></msub><mo>+</mo><msub><mi>U</mi><mi>c</mi></msub><msub><mi>r</mi><mrow><mi>t</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>+</mo><msub><mi>b</mi><mi>c</mi></msub><mo fence="true">)</mo></mrow><mo separator="true">,</mo></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="false"><mrow><msub><mi>r</mi><mi>t</mi></msub><mo>=</mo><msubsup><mi>g</mi><mi>t</mi><mi>o</mi></msubsup><mo>⊙</mo><msub><mi>σ</mi><mi>r</mi></msub><mrow><mo fence="true">(</mo><msub><mi>c</mi><mi>t</mi></msub><mo 
fence="true">)</mo></mrow><mi mathvariant="normal">.</mi></mrow></mstyle></mtd></mtr></mtable></mrow><annotation encoding="application/x-tex">\begin{cases} g_t^f=\sigma _g \left ( W_fx_t+U_fr_{t-1}+b_f \right ),  \\ g_t^i=\sigma _g \left ( W_ix_t+U_ir_{t-1}+b_i \right ), \\ g_t^o=\sigma _g \left ( W_ox_t+U_or_{t-1}+b_o \right ), \\ c_t=g_t^f\odot c_{t-1}+g_t^i \odot \sigma _g \left ( W_cx_t+U_cr_{t-1}+b_c \right ), \\ r_t=g_t^o\odot \sigma _r\left (c_t \right ). \end{cases}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:7.2em;vertical-align:-3.35em;"></span><span class="minner"><span class="mopen"><span class="delimsizing mult"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:3.85em;"><span style="top:-1.366em;"><span class="pstrut" style="height:3.816em;"></span><span class="delimsizinginner delim-size4"><span>⎩</span></span></span><span style="top:-1.358em;"><span class="pstrut" style="height:3.816em;"></span><span style="height:1.816em;width:0.8889em;"><svg xmlns="http://www.w3.org/2000/svg" width="0.8889em" height="1.816em" style="width:0.8889em" viewBox="0 0 888.89 1816" preserveAspectRatio="xMinYMin"><path d="M384 0 H504 V1816 H384z M384 0 H504 V1816 H384z"/></svg></span></span><span style="top:-3.816em;"><span class="pstrut" style="height:3.816em;"></span><span class="delimsizinginner delim-size4"><span>⎨</span></span></span><span style="top:-4.958em;"><span class="pstrut" style="height:3.816em;"></span><span style="height:1.816em;width:0.8889em;"><svg xmlns="http://www.w3.org/2000/svg" width="0.8889em" height="1.816em" style="width:0.8889em" viewBox="0 0 888.89 1816" preserveAspectRatio="xMinYMin"><path d="M384 0 H504 V1816 H384z M384 0 H504 V1816 H384z"/></svg></span></span><span style="top:-6.766em;"><span class="pstrut" style="height:3.816em;"></span><span class="delimsizinginner 
delim-size4"><span>⎧</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:3.35em;"><span></span></span></span></span></span></span><span class="mord"><span class="mtable"><span class="col-align-l"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:3.85em;"><span style="top:-5.85em;"><span class="pstrut" style="height:3.008em;"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.967em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.1809em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2458em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">U</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" 
style="margin-right:0.1076em;">f</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em;">)</span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span></span></span><span style="top:-4.41em;"><span class="pstrut" style="height:3.008em;"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span 
class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8247em;"><span style="top:-2.453em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord 
mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">U</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord 
mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em;">)</span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span></span></span><span style="top:-2.97em;"><span class="pstrut" style="height:3.008em;"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-2.453em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">o</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span><span class="mspace" 
style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">o</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">U</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">o</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord 
mathnormal mtight">o</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em;">)</span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span></span></span><span style="top:-1.53em;"><span class="pstrut" style="height:3.008em;"></span><span class="mord"><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.967em;"><span style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.1809em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2458em;"><span></span></span></span></span></span></span><span 
class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">⊙</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8247em;"><span style="top:-2.453em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">⊙</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span><span class="msupsub"><span 
class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">c</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal" 
style="margin-right:0.109em;">U</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">c</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mbin mtight">−</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2083em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal">b</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">c</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em;">)</span></span><span 
class="mspace" style="margin-right:0.1667em;"></span><span class="mpunct">,</span></span></span><span style="top:-0.09em;"><span class="pstrut" style="height:3.008em;"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-2.453em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">o</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">⊙</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span><span class="msupsub"><span class="vlist-t 
vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="minner"><span class="mopen delimcenter" style="top:0em;">(</span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mclose delimcenter" style="top:0em;">)</span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord">.</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:3.35em;"><span></span></span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span></span></p><p>Here, the symbol ⊙ denotes the element-wise product of two vectors of the same length (the Hadamard product, i.e. <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>z</mi><mo>=</mo><mi>x</mi><mo>⊙</mo><mi>y</mi></mrow><annotation encoding="application/x-tex">z=x⊙y</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.4306em;"></span><span class="mord mathnormal" style="margin-right:0.044em;">z</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6667em;vertical-align:-0.0833em;"></span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">⊙</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">y</span></span></span></span> means <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>z</mi><mi>i</mi></msub><mo>=</mo><msub><mi>x</mi><mi>i</mi></msub><msub><mi>y</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">z_i=x_iy_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.044em;">z</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.044em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" 
style="height:0.625em;vertical-align:-0.1944em;"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span>). The gating variables are bounded between 0 and 1 by the sigmoid function <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>σ</mi><mi>g</mi></msub></mrow><annotation encoding="application/x-tex">σ_g</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">σ</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span 
class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>, which can be viewed as a smooth, differentiable approximation of a binary step function. A gate is open when its value is close to 1 and closed when it is close to 0. All the weights (the W and U matrices) are trained. By introducing these gates, an LSTM can in principle hold a memory in its cell state <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>c</mi><mi>t</mi></msub></mrow><annotation encoding="application/x-tex">c_t</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.5806em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal">c</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.2806em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span></span></span></span> by setting the forget gate <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>g</mi><mi>t</mi><mi>f</mi></msubsup><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">g_t^f=1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.2128em;vertical-align:-0.2458em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.967em;"><span 
style="top:-2.4542em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.1809em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2458em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">1</span></span></span></span> and the input gate <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>g</mi><mi>t</mi><mi>i</mi></msubsup><mo>=</mo><mn>0</mn></mrow><annotation encoding="application/x-tex">g_t^i=0</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0717em;vertical-align:-0.247em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.8247em;"><span style="top:-2.453em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span 
class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6444em;"></span><span class="mord">0</span></span></span></span> (Figure 8). In addition, the network can choose when to read from the memory by setting the output gate <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msubsup><mi>g</mi><mi>t</mi><mi>o</mi></msubsup><mo>=</mo><mn>0</mn><mtext> or </mtext><mn>1</mn></mrow><annotation encoding="application/x-tex">g_t^o=0\ \text{or}\ 1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9114em;vertical-align:-0.247em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em;">g</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.6644em;"><span style="top:-2.453em;margin-left:-0.0359em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">t</span></span></span><span style="top:-3.063em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">o</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.247em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord">0</span><span class="mspace"> </span><span class="mord text"><span 
class="mord">or</span></span><span class="mspace"> </span><span class="mord">1</span></span></span></span>. Although LSTMs have had a large impact on machine learning, LSTMs (and GRUs) are not easily related to biological neural circuits. Modifications to LSTMs have been proposed so that the gating process can be better explained in neurobiological terms.</p><img src="/images/000032/09.jpg" width="400" alt="Figure 8: Visualizing LSTM activity in a simple memory task" align=center /><blockquote><p>(A–C) A simple memory task.<br>(A) The network receives a stream of input stimuli whose value is sampled randomly and independently at each time point.<br>(B) When the “memory input” (red) is active, the network must remember the current value of the stimulus (A) and output that value the next time the “report input” (blue) is active.<br>(C) After training, a single-unit LSTM performs the task almost perfectly for modest memory durations.<br>(D) When the memory input is active, the network opens the input gate (letting the input in) and closes the forget gate (forgetting the previous memory). When the report input is active, it opens the output gate.</p></blockquote><p>Although attention and gating both rely on multiplicative interactions, a key difference is that in attention the neural modulation is normalized (Equation 42), whereas in gating it is not. Neural attention therefore tends to have a single focus, while neural gating can open or close the gates to all neurons uniformly. An important lesson from machine learning is that gating should be plastic, which should motivate neuroscientists to study how gating is learned in the brain.</p><h4 id="预测性编码">Predictive coding</h4><p>Another canonical computation proposed for the brain is computing predictions. In predictive coding, a neural system constantly tries to make inferences about the external world. Brain areas selectively propagate unpredicted or surprising information while suppressing responses to expected stimuli. To implement predictive coding in an ANN, the feedback connections from higher layers can be trained with a separate loss that compares the output of the feedback connections with the neural activity in lower layers. In this way, the feedback connections learn to predict the activity of lower-level areas. The feedback input is then used to suppress neural activity in the lower layers.</p><h3 id="5-3-学习和可塑性">5.3. Learning and plasticity</h3><p>Biological neural systems are products of evolution, development, and learning. In contrast, conventional ANNs are trained with SGD-based rules, mostly from scratch. The backpropagation algorithm used to compute the gradients for gradient descent is well known to be biologically implausible. Incorporating more realistic learning processes can help us build better models of the brain.</p><h4 id="选择性训练和持续学习">Selective training and continual learning</h4><p>In a typical neural network, all connections are trained. In biological neural systems, however, synapses are not all equally modifiable. Many synapses can remain stable for years. To train connections selectively, the effective connection matrix <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi></mrow><annotation encoding="application/x-tex">W</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span></span></span> can be written as the sum of a sparse trainable synaptic weight matrix and a non-trainable one, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi><mo>=</mo><msub><mi>W</mi><mrow><mi>t</mi><mi>r</mi><mi>a</mi><mi>i</mi><mi>n</mi></mrow></msub><mo>+</mo><msub><mi>W</mi><mrow><mi>f</mi><mi>i</mi><mi>x</mi></mrow></msub></mrow><annotation 
encoding="application/x-tex">W=W_{train}+W_{fix}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">t</span><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight">ain</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span><span 
class="mord mathnormal mtight">i</span><span class="mord mathnormal mtight">x</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>. More generally, selective training can be imposed softly by adding a regularization term <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>L</mi><mrow><mi>r</mi><mi>e</mi><mi>g</mi></mrow></msub></mrow><annotation encoding="application/x-tex">L_{reg}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">L</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> to the loss, which makes it harder to change the weights of certain connections:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>L</mi><mrow><mi>r</mi><mi>e</mi><mi>g</mi></mrow></msub><mo>=</mo><mi>β</mi><munder><mo>∑</mo><mrow><mi>i</mi><mi>j</mi></mrow></munder><msub><mi>M</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo 
stretchy="false">(</mo><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo>−</mo><msub><mi>W</mi><mrow><mi>f</mi><mi>i</mi><mi>x</mi></mrow></msub><mo separator="true">,</mo><mtext> </mtext><mi>i</mi><mi>j</mi><msup><mo stretchy="false">)</mo><mn>2</mn></msup></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(44)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">L_{reg}=β\sum_{ij}M_{ij}(W_{ij}−W_{fix},\ ij)^2 \tag{44}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">L</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0278em;">r</span><span class="mord mathnormal mtight">e</span><span class="mord mathnormal mtight" style="margin-right:0.0359em;">g</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:2.4638em;vertical-align:-1.4138em;"></span><span class="mord mathnormal" style="margin-right:0.0528em;">β</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em;"><span style="top:-1.8723em;margin-left:0em;"><span class="pstrut" style="height:3.05em;"></span><span class="sizing reset-size6 size3 
mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span><span style="top:-3.05em;"><span class="pstrut" style="height:3.05em;"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.4138em;"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">M</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span 
class="base"><span class="strut" style="height:1.1502em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span><span class="mord mathnormal mtight">i</span><span class="mord mathnormal mtight">x</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace"> </span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.0572em;">ij</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.8641em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:2.4638em;vertical-align:-1.4138em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">44</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Here, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>M</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">M_{ij}</annotation></semantics></math></span><span class="katex-html" 
aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.109em;">M</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.109em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> determines how strongly the connection <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">W_{ij}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span> should stay close to the value <span class="katex"><span class="katex-mathml"><math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>W</mi><mrow><mi>f</mi><mi>i</mi><mi>x</mi><mo separator="true">,</mo><mi>i</mi><mi>j</mi></mrow></msub></mrow><annotation encoding="application/x-tex">W_{fix,ij}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9694em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em;"><span style="top:-2.55em;margin-left:-0.1389em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.1076em;">f</span><span class="mord mathnormal mtight">i</span><span class="mord mathnormal mtight">x</span><span class="mpunct mtight">,</span><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>.</p><p>Selective training of connections through this kind of soft constraint has been used by continual-learning techniques to combat catastrophic forgetting. Catastrophic forgetting is commonly observed when ANNs learn new tasks: they tend to rapidly forget previously learned tasks that are not revisited. One major class of continual-learning methods addresses this problem by selectively training the synaptic connections deemed unimportant for previously learned tasks or knowledge, while protecting the important ones.</p><p>Hebbian plasticity</p><p>The dominant idea in biological learning is Hebbian plasticity<a href="#%E5%8F%82%E8%80%83%E6%96%87%E7%8C%AE"><sup>15</sup></a> and its variants. Hebbian plasticity is an unsupervised learning method that drives the learning of connection weights without target outputs or rewards. It is essential to classical associative memory models such as Hopfield networks, and it is deeply connected to modern neural network architectures with explicit long-term memory modules.</p><p>Supervised learning techniques, especially those based on SGD, can be combined with Hebbian plasticity to develop ANNs that are both more powerful on certain tasks and more biologically plausible. There are two ways to combine Hebbian plasticity with SGD. In the first, the effective connection matrix <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>W</mi><mo>=</mo><mover accent="true"><mi>W</mi><mo 
stretchy="true">~</mo></mover><mo>+</mo><mi>A</mi></mrow><annotation encoding="application/x-tex">W=\widetilde{W}+A</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0267em;vertical-align:-0.0833em;"></span><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.9433em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span><span class="svg-align" style="top:-3.6833em;"><span class="pstrut" style="height:3em;"></span><span style="height:0.26em;"><svg xmlns="http://www.w3.org/2000/svg" width="100%" height="0.26em" viewBox="0 0 600 260" preserveAspectRatio="none"><path d="M200 55.538c-77 0-168 73.953-177 73.953-3 0-7-2.175-9-5.437L2 97c-1-2-2-4-2-6 0-4 2-7 5-9l20-12C116 12 171 0 207 0c86 0 114 68 191 68 78 0 168-68 177-68 4 0 7 2 9 5l12 19c1 2.175 2 4.35 2 6.525 0 4.35-2 7.613-5 9.788l-19 13.05c-92 63.077-116.937 75.308-183 76.128-68.267.847-113-73.952-191-73.952z"/></svg></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal">A</span></span></span></span> is the sum of two connection matrices: <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mover accent="true"><mi>W</mi><mo stretchy="true">~</mo></mover></mrow><annotation 
encoding="application/x-tex">\widetilde{W}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.9433em;"></span><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.9433em;"><span style="top:-3em;"><span class="pstrut" style="height:3em;"></span><span class="mord mathnormal" style="margin-right:0.1389em;">W</span></span><span class="svg-align" style="top:-3.6833em;"><span class="pstrut" style="height:3em;"></span><span style="height:0.26em;"><svg xmlns="http://www.w3.org/2000/svg" width="100%" height="0.26em" viewBox="0 0 600 260" preserveAspectRatio="none"><path d="M200 55.538c-77 0-168 73.953-177 73.953-3 0-7-2.175-9-5.437L2 97c-1-2-2-4-2-6 0-4 2-7 5-9l20-12C116 12 171 0 207 0c86 0 114 68 191 68 78 0 168-68 177-68 4 0 7 2 9 5l12 19c1 2.175 2 4.35 2 6.525 0 4.35-2 7.613-5 9.788l-19 13.05c-92 63.077-116.937 75.308-183 76.128-68.267.847-113-73.952-191-73.952z"/></svg></span></span></span></span></span></span></span></span></span> is trained by SGD, while A is driven by Hebbian plasticity:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><mi>A</mi><mo stretchy="false">(</mo><mi>t</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo><mo>=</mo><mi>λ</mi><mi>A</mi><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>η</mi><mi>r</mi><msup><mi>r</mi><mo>⊺</mo></msup></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(45)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">A(t+1)=λA(t)+ηrr^⊺ \tag{45}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal">A</span><span class="mopen">(</span><span class="mord 
mathnormal">t</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord mathnormal">λ</span><span class="mord mathnormal">A</span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.9088em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.7144em;"><span style="top:-3.113em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mbin amsrm mtight">⊺</span></span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">45</span></span><span class="mord">)</span></span></span></span></span></span></p><p>or, in component form:</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable 
width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>A</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo stretchy="false">(</mo><mi>t</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo><mo>=</mo><mi>λ</mi><msub><mi>A</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>η</mi><msub><mi>r</mi><mi>i</mi></msub><msub><mi>r</mi><mi>j</mi></msub></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(46)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">A_{ij}(t+1)=λA_{ij}(t)+ηr_ir_j \tag{46}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">A</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span 
class="mord mathnormal">λ</span><span class="mord"><span class="mord mathnormal">A</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing 
reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span><span class="tag"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">46</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Besides training a separate matrix, SGD can also be used to learn the plasticity rule itself; this is the second approach. The plasticity rule is a trainable function of presynaptic and postsynaptic activity,</p><p><span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable width="100%"><mtr><mtd width="50%"></mtd><mtd><mrow><msub><mi>A</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo stretchy="false">(</mo><mi>t</mi><mo>+</mo><mn>1</mn><mo stretchy="false">)</mo><mo>=</mo><mi>λ</mi><msub><mi>A</mi><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>+</mo><mi>f</mi><mo stretchy="false">(</mo><msub><mi>r</mi><mi>i</mi></msub><mo separator="true">,</mo><msub><mi>r</mi><mi>j</mi></msub><mo separator="true">,</mo><mi>θ</mi><mo stretchy="false">)</mo></mrow></mtd><mtd width="50%"></mtd><mtd><mtext>(47)</mtext></mtd></mtr></mtable><annotation encoding="application/x-tex">A_{ij}(t+1)=λA_{ij}(t)+f(r_i,r_j,θ) \tag{47}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord"><span class="mord mathnormal">A</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span 
class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">1</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord mathnormal">λ</span><span class="mord"><span class="mord mathnormal">A</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">ij</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em;"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em;"></span></span><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span 
class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span><span class="mclose">)</span></span><span class="tag"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord text"><span class="mord">(</span><span class="mord"><span class="mord">47</span></span><span class="mord">)</span></span></span></span></span></span></p><p>Because the system is differentiable, the parameters θ, which collectively describe the plasticity rule, can be updated with SGD-based methods. In its simplest form, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mo stretchy="false">(</mo><msub><mi>r</mi><mi>i</mi></msub><mo 
separator="true">,</mo><msub><mi>r</mi><mi>j</mi></msub><mo separator="true">,</mo><mi>θ</mi><mo stretchy="false">)</mo><mo>=</mo><mi>η</mi><msub><mi>r</mi><mi>i</mi></msub><msub><mi>r</mi><mi>j</mi></msub></mrow><annotation encoding="application/x-tex">f(r_i,r_j,θ)=ηr_ir_j</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.0361em;vertical-align:-0.2861em;"></span><span class="mord mathnormal" style="margin-right:0.1076em;">f</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em;"></span><span class="mord mathnormal" 
style="margin-right:0.0278em;">θ</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:0.7167em;vertical-align:-0.2861em;"></span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em;"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0278em;">r</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em;"><span style="top:-2.55em;margin-left:-0.0278em;margin-right:0.05em;"><span class="pstrut" style="height:2.7em;"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0572em;">j</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.2861em;"><span></span></span></span></span></span></span></span></span></span>, where <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>θ</mi><mo>=</mo><mo stretchy="false">{</mo><mi>η</mi><mo stretchy="false">}</mo></mrow><annotation encoding="application/x-tex">θ=\{η\}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" 
style="height:0.6944em;"></span><span class="mord mathnormal" style="margin-right:0.0278em;">θ</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">{</span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span><span class="mclose">}</span></span></span></span>. Here the system can learn to be Hebbian <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mi>η</mi><mo>&gt;</mo><mn>0</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(η&gt;0)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&gt;</span><span class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">0</span><span class="mclose">)</span></span></span></span> or anti-Hebbian <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mi>η</mi><mo>&lt;</mo><mn>0</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(η&lt;0)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0359em;">η</span><span class="mspace" style="margin-right:0.2778em;"></span><span class="mrel">&lt;</span><span 
class="mspace" style="margin-right:0.2778em;"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em;"></span><span class="mord">0</span><span class="mclose">)</span></span></span></span>. Learning the plasticity rule is a form of meta-learning, in which one algorithm (here SGD) optimizes an inner learning rule (here Hebbian plasticity).</p><p>Such Hebbian plasticity networks can be extended to include more complex synapses with multiple hidden variables, as in &quot;cascade models&quot; of synaptic plasticity. In theory, properly designed complex synapses can substantially increase a neural network's memory capacity. Models of such complex synapses are differentiable and can therefore be incorporated into ANNs.</p><h2 id="参考文献">References</h2><ol><li><p><a href="https://pubmed.ncbi.nlm.nih.gov/31227823/">Nath T, Mathis A, Chen AC, Patel A, Bethge M, Mathis MW. Using DeepLabCut for 3D markerless pose estimation across species and behaviors[J]. Nat Protoc. 2019 Jul;14(7):2152-2176.</a></p></li><li><p><a href="https://pubmed.ncbi.nlm.nih.gov/31036945/">Kar K, Kubilius J, Schmidt K, Issa EB, DiCarlo JJ. Evidence that recurrent circuits are critical to the ventral stream’s execution of core object recognition behavior[J]. Nat Neurosci. 2019 Jun;22(6):974-983.</a></p></li><li><p><a href="https://mitpress.mit.edu/9780262035613/deep-learning/">Goodfellow I, Bengio Y, Courville A. Deep Learning[M]. MIT Press, 2016.</a></p></li><li><p><a href="http://www.nature.com/articles/nature12742">Mante V, Sussillo D, Shenoy K V, et al. Context-dependent computation by recurrent dynamics in prefrontal cortex[J]. Nature, 2013, 503(7474):78-84.</a></p></li><li><p><a href="https://ieeexplore.ieee.org/document/726791/">LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11):2278-2324.</a></p></li><li><p><a href="https://onlinelibrary.wiley.com/doi/abs/10.1207/s15516709cog1402_1">Elman J L. Finding Structure in Time[J]. Cognitive Science, 1990, 14(2):179-211.</a></p></li><li><p><a href="http://www.researchgate.net/profile/Antoine_Bordes/publication/215616967_Deep_Sparse_Rectifier_Neural_Networks/links/0a85e537a7f4b21bb1000000">Glorot X, Bordes A, Bengio Y. Deep Sparse Rectifier Neural Networks[J]. 
Journal of Machine Learning Research, 2011, 15:315-323.</a></p></li><li><p><a href="http://www.oalib.com/paper/4068193">Kingma D, Ba J. Adam: A Method for Stochastic Optimization[J]. Computer Science, 2014.</a></p></li><li><p><a href="https://doi.org/10.1038/nn.4244">Yamins DL, DiCarlo JJ. Using goal-driven deep learning models to understand sensory cortex[J]. Nat Neurosci. 2016 Mar;19(3):356-65.</a></p></li><li><p><a href="http://doi.org/10.1073/pnas.1403112111">Yamins D L K, Hong H, Cadieu C F, et al. Performance-optimized hierarchical models predict neural responses in higher visual cortex[J]. Proceedings of the National Academy of Sciences, 2014, 111(23):8619-24.</a></p></li><li><p><a href="http://www.sciencedirect.com/science/article/pii/S0896627309005479">Sussillo D, Abbott L F. Generating Coherent Patterns of Activity from Chaotic Neural Networks[J]. Neuron, 2009, 63(4):544-557.</a></p></li><li><p><a href="https://doi.org/10.1038/s41593-018-0310-2">Yang GR, Joglekar MR, Song HF. Task representations in neural networks trained to perform many cognitive tasks[J]. Nat Neurosci. 2019;22(2):297-306.</a></p></li><li><p><a href="https://doi.org/10.1038/20939">Romo R, Brody CD, Hernández A, Lemus L. Neuronal correlates of parametric working memory in the prefrontal cortex[J]. Nature. 1999;399(6735):470-473.</a></p></li><li><p><a href="http://www.sciencedirect.com/science/article/pii/S089662731730185X">Chaisangmongkon W, Swaminathan S K, Freedman D J, et al. Computing by Robust Transience: How the Fronto-Parietal Network Performs Sequential, Category-Based Decisions[J]. Neuron, 2017, 93(6):1504-1517.e4.</a></p></li><li><p><a href="https://www.cell.com/servlet/linkout?suffix=e_1_5_1_2_64_2&amp;dbid=16&amp;doi=10.1016/j.neuron.2020.09.005&amp;key=10.4324%2F9781410612403&amp;cf=">Hebb D O. The Organization of Behavior: A Neuropsychological Theory[M]. John Wiley, Chapman &amp; Hall, 2013.</a></p></li></ol>]]>
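<![CDATA[<p>As a rough numerical illustration of the Hebbian update in Eqs. (45) and (46) and of the sum of the SGD-trained matrix and A, here is a minimal pure-Python sketch. The function names, the activity vector <code>r</code>, and the values of λ and η are hypothetical choices made only for this example; they do not come from the referenced paper.</p>

```python
def outer(u, v):
    """Outer product u v^T as a nested list."""
    return [[ui * vj for vj in v] for ui in u]


def hebbian_step(A, r, lam, eta):
    """One step of Eq. (46): A_ij(t+1) = lam * A_ij(t) + eta * r_i * r_j."""
    rr = outer(r, r)
    n = len(r)
    return [[lam * A[i][j] + eta * rr[i][j] for j in range(n)] for i in range(n)]


def effective_weights(W_tilde, A):
    """Effective matrix W = W_tilde + A (SGD-trained part plus Hebbian part)."""
    return [[w + a for w, a in zip(w_row, a_row)]
            for w_row, a_row in zip(W_tilde, A)]


def zeros(n):
    """n x n matrix of zeros."""
    return [[0.0] * n for _ in range(n)]


# Hypothetical activity pattern and hyperparameters (illustration only).
r = [1.0, 0.0, -1.0]

A1 = hebbian_step(zeros(3), r, lam=0.5, eta=0.25)       # Hebbian: eta > 0
A2 = hebbian_step(A1, r, lam=0.5, eta=0.25)             # old trace decays by lam
A_anti = hebbian_step(zeros(3), r, lam=0.5, eta=-0.25)  # anti-Hebbian: eta < 0

# W_tilde would normally come from SGD; a constant matrix stands in for it here.
W = effective_weights([[1.0] * 3 for _ in range(3)], A1)
```

<p>With η &gt; 0, units that are co-active with the same sign strengthen their connection and units with opposite signs weaken it; flipping the sign of η reverses both effects. The factor λ controls how quickly older co-activity is forgotten.</p>]]>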
    </content>
    <id>https://www.insidentally.com/articles/000032/</id>
    <link href="https://www.insidentally.com/articles/000032/"/>
    <published>2022-12-27T12:11:33.000Z</published>
    <summary>
      <![CDATA[<blockquote>
<p>This article is based on:<br>
<a href="https://doi.org/10.1016/j.neuron.2020.09.005"> <em>Artificial Neural Networks for Neuroscientists: A Prim]]>
    </summary>
    <title>Artificial Neural Networks for Neuroscientists</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="文件系统" scheme="https://www.insidentally.com/tags/%E6%96%87%E4%BB%B6%E7%B3%BB%E7%BB%9F/"/>
    <category term="ntfs3" scheme="https://www.insidentally.com/tags/ntfs3/"/>
    <category term="ntfs-3g" scheme="https://www.insidentally.com/tags/ntfs-3g/"/>
    <category term="分区" scheme="https://www.insidentally.com/tags/%E5%88%86%E5%8C%BA/"/>
    <content>
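      <![CDATA[<blockquote><p>References:<br><a href="https://docs.kernel.org/filesystems/ntfs3.html"> <em>NTFS3 — The Linux Kernel documentation</em> </a><br><a href="https://wiki.archlinux.org/title/NTFS_(%E7%AE%80%E4%BD%93%E4%B8%AD%E6%96%87)"> <em>NTFS (简体中文) - ArchWiki</em> </a></p></blockquote><p><ruby>NTFS<rt>New Technology File System</rt></ruby> is the disk file system used by the Windows NT family of operating systems, designed with management and security features such as networking support, disk quotas, and file encryption. NTFS3 is a full-featured read/write NTFS driver that supports NTFS versions up to 3.1.</p><span id="more"></span><h2 id="简介">Introduction</h2><p>For a long time the Linux kernel had no full-featured native NTFS support, and NTFS-3G from Tuxera has been the mainstream solution, although it has its share of small problems in practice. NTFS-3G implements NTFS in user space on top of FUSE (Filesystem in Userspace), so all of its NTFS access logic runs as user-space code.</p><p>Before NTFS3, the main problem with NTFS on Linux was the lack of stable, full-featured read/write support.</p><p>In 2020 Paragon Software made a surprising decision: it would try to get its NTFS3 driver, until then a commercial product, into the mainline kernel. Linux kernel 5.15 eventually merged the NTFS3 driver provided by Paragon, which offers better performance and more features.</p><ul><li><p>The driver implements read/write support for normal, sparse, and compressed files on NTFS file systems.</p></li><li><p>It supports native journal replay.</p></li><li><p>It supports NFS export of mounted NTFS volumes.</p></li><li><p>It supports extended attributes. Predefined extended attributes:</p><ul><li><p>system.ntfs_security gets/sets security</p><p>Descriptor: SECURITY_DESCRIPTOR_RELATIVE</p></li><li><p>system.ntfs_attrib gets/sets ntfs file/dir attributes.</p></li></ul><blockquote><p>Note: applied to empty files, this attribute allows switching the file type between sparse (0x200), compressed (0x800), and normal.</p></blockquote></li></ul><h2 id="挂载">Mounting</h2><p>The file system type to use when mounting is ntfs3.</p><h3 id="手动挂载">Manual mounting</h3><p>To mount manually, run:</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># mount -t ntfs3 /dev/sdxY /mnt</span></span><br></pre></td></tr></table></figure><p><code>-t</code> specifies the file system type. <code>/dev/sdxY</code> is the path of your partition, which you can look up with <code>lsblk</code>. <code>/mnt</code> is the directory to mount it on.</p><h3 id="开机自动挂载">Mounting automatically at boot</h3><p>Edit <code>/etc/fstab</code> and add the line:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">UUID=**** /data ntfs3 iocharset=utf8,umask=0,prealloc 0 0</span><br></pre></td></tr></table></figure>
<p><code>UUID=****</code> identifies the partition by its UUID. The advantage of a <code>UUID</code> is that it does not depend on disk order. If you change the order of your storage devices in the BIOS, re-plug a device, or have a BIOS that may reorder devices on its own, referring to the partition by <code>UUID</code> is more reliable. Use the <code>blkid</code> command to look up the <code>UUID</code>.</p><p><code>/data</code> is the mount point. This example uses <code>/data</code>; create the directory beforehand.</p><p>The options that follow are mount options, described in detail <a href="#%E6%8C%82%E8%BD%BD%E5%8F%82%E6%95%B0">below</a>.</p><p>The final two fields, <code>0 0</code>, control whether the file system is backed up by dump and whether it is checked at boot. <code>0 0</code> means no backup and no check.</p><h2 id="挂载参数">Mount options</h2><table>    <tr>        <th>Option</th>        <th>Description</th>    </tr>    <tr>        <td>iocharset=name</td>        <td>Tells the driver how to interpret path strings and translate them to Unicode and back. If this option is not set, the default codepage (CONFIG_NLS_DEFAULT) is used. Example: iocharset=utf8</td>    </tr>    <tr>        <td>uid=</td>        <td>User ID assigned as the owner of files on the mounted volume.</td>    </tr>    <tr>        <td>gid=</td>        <td>Group ID assigned as the group of files on the mounted volume.</td>    </tr>    <tr>        <td>umask=</td>        <td>Controls the default permissions for files and directories created after the NTFS volume is mounted.</td>    </tr>    <tr>        <td>dmask=</td>        <td rowspan='2'> Instead of a umask that applies to both files and directories, fmask applies only to files and dmask only to directories.</td>    </tr>    <tr>        <td>fmask=</td>    </tr>    <tr>        <td rowspan='3'>noacsrules</td>        <td>The "no access rules" mount option sets access rights for files and folders to 777 and the owner/group to root. This option overrides all other permissions.</td>    </tr>    <tr>        <td>Permission changes for files and folders are reported as successful, but the mode remains 777.</td>    </tr>    <tr>        <td>Owner/group changes are reported as successful, but the owner and group remain root.</td>    </tr>    <tr>        <td>nohidden</td>        <td>Files with the Windows-specific hidden attribute (FILE_ATTRIBUTE_HIDDEN) are not shown under Linux.</td>    </tr>    <tr>        <td>sys_immutable</td>        <td>Files with the Windows-specific system attribute (FILE_ATTRIBUTE_SYSTEM) are marked as system immutable files.</td>    </tr>    <tr>        <td>discard</td>        <td>Enables the TRIM command to improve the performance of delete operations. Recommended for solid-state drives (SSD).</td>    </tr>    <tr>        <td>force</td>        <td>Forces the driver to mount the partition even if the volume is marked dirty. Not recommended.</td>    </tr>    <tr>        <td>sparse</td>        <td>Creates new files as sparse files.</td>    </tr>    <tr>        <td>showmeta</td>        <td>Shows all meta-files (system files) on the mounted NTFS partition. By default, all meta-files are hidden.</td>    </tr>    <tr>        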
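<td>prealloc</td>        <td>Preallocates extra space for a file as its size grows during writes. This reduces fragmentation when several files are written in parallel.</td>    </tr>    <tr>        <td>acl</td>        <td>Enables POSIX ACLs (access control lists). Effective only if the kernel supports them. Not to be confused with NTFS ACLs.</td>    </tr></table><h2 id="NTFS3-的优点">Advantages of NTFS3</h2><p>NTFS3 is an in-kernel driver, and ntfs3 does considerably better than ntfs-3g in both speed and system load.</p><p>Several users have already published benchmarks:</p><ul><li><p><a href="https://biluohc.github.io/posts/ntfs3gvsntfs3/">A simple performance test of ntfs-3g versus the Linux 5.15+ ntfs3 driver</a></p></li><li><p><a href="https://bbs.deepin.org/post/236260">NTFS3 performance review on the Linux 5.15 kernel</a></p></li></ul><p>Besides better performance, NTFS3 also lets you set the owner and the file permissions of the mounted volume through the uid, gid, and umask options.</p><p>NTFS3 also supports the prealloc option, which can greatly reduce file fragmentation.</p><h2 id="关于-NTFS3-驱动无人维护的问题">On concerns that the NTFS3 driver was unmaintained</h2><p>After the driver was finally mainlined in Linux 5.15 in 2021, almost a year went by without any significant bug fixes being submitted for it.</p><p>Some speculated that this was because the driver's maintainer, Konstantin Komarov, is based in Russia and was affected by the Russia-Ukraine war.</p><p>Many developers, Linus Torvalds among them, then voiced their concern and offered to contribute.</p><p>Konstantin Komarov of Paragon Software has since become active on the kernel mailing list again, after time away for a break and other matters. On 3 June 2022 he submitted a batch of NTFS3 fixes for the Linux 5.19 merge window.</p><p>I believe ntfs3 will keep getting better. It is already the most usable NTFS driver on Linux, and it is well worth a try.</p>]]>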
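      <![CDATA[<p>As a sketch of how the options in the table above combine, the following <code>/etc/fstab</code> entry mounts the volume so that a regular user owns its files. The UID and GID <code>1000</code> and the two mask values are assumptions for a typical single-user desktop, not values from the original article; adjust them to your system.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"># Assumed values: uid/gid 1000, directories 755, files 644</span><br><span class="line">UUID=**** /data ntfs3 iocharset=utf8,uid=1000,gid=1000,dmask=022,fmask=133,prealloc 0 0</span><br></pre></td></tr></table></figure><p>Here <code>dmask=022</code> makes directories appear with mode 755 and <code>fmask=133</code> makes files appear with mode 644, which avoids marking every file on the volume as executable the way <code>umask=0</code> does.</p>]]>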
    </content>
    <id>https://www.insidentally.com/articles/000029/</id>
    <link href="https://www.insidentally.com/articles/000029/"/>
    <published>2022-06-27T08:12:33.000Z</published>
    <summary>
      <![CDATA[<blockquote>
<p>References:<br>
<a href="https://docs.kernel.org/filesystems/ntfs3.html"> <em>NTFS3 — The Linux Kernel documentation</em> </a><br>
<a href="https://wiki.archlinux.org/title/NTFS_(%E7%AE%80%E4%BD%93%E4%B8%AD%E6%96%87)"> <em>NTFS (简体中文) - ArchWiki</em> </a></p>
</blockquote>
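<p><ruby>NTFS<rt>New Technology File System</rt></ruby> is the disk file system used by the Windows NT family of operating systems, designed with management and security features such as networking support, disk quotas, and file encryption. NTFS3 is a full-featured read/write NTFS driver that supports NTFS versions up to 3.1.</p>]]>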
    </summary>
    <title>Mounting Windows NTFS partitions with the ntfs3 driver instead of ntfs-3g</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="gnome" scheme="https://www.insidentally.com/tags/gnome/"/>
    <category term="镜像源" scheme="https://www.insidentally.com/tags/%E9%95%9C%E5%83%8F%E6%BA%90/"/>
    <category term="删除旧内核" scheme="https://www.insidentally.com/tags/%E5%88%A0%E9%99%A4%E6%97%A7%E5%86%85%E6%A0%B8/"/>
    <category term="Fedora" scheme="https://www.insidentally.com/tags/Fedora/"/>
    <category term="dnf" scheme="https://www.insidentally.com/tags/dnf/"/>
    <content>
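      <![CDATA[<p>Fedora is the most aggressive of the Red Hat family of distributions. Many people regard Fedora users as Red Hat's guinea pigs, but Fedora's rapid update pace is also a real convenience for developers. This article walks through a few simple settings to apply after installing Fedora 36 that make the system easier to use.</p><span id="more"></span><h2 id="1-设置软件源">1. Set up the software repositories</h2><p>By default Fedora uses Metalink to obtain a recommended mirror list. This ensures that the mirrors in use are sufficiently up to date and that security updates arrive quickly, which gives better security. In most cases the default configuration is fine and the configuration files need no changes.</p><p>However, Metalink has to fetch metadata from Fedora project servers outside China, so it does not work in special cases such as campus intranets or networks without international access. In that case, follow the <a href="https://mirrors.tuna.tsinghua.edu.cn/help/fedora/">instructions</a> from the Tsinghua University TUNA group to change the repositories.</p><h2 id="2-更新系统">2. Update the system</h2><p>An aggressive distribution deserves to be used aggressively, so the first thing to do after configuring the repositories is to refresh the repository metadata and update the system.</p><p>You can do this from GNOME Software or from a terminal.</p><p>In a terminal, run:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf update</span><br></pre></td></tr></table></figure><p>A reboot may be required to finish the update.</p><h2 id="3-删除旧的内核以及其他不需要的旧软件包">3. Remove old kernels and other unneeded packages</h2><p>A system update will most likely install a new kernel and leave some dependencies unused. Reboot into the new kernel, confirm that it works properly, and then remove the old kernels and the unused dependencies.</p><p>The following command removes unused dependencies automatically:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf autoremove</span><br></pre></td></tr></table></figure><p>Fedora's kernel is updated frequently, and DNF keeps several kernel versions installed side by side (three by default, controlled by <code>installonly_limit</code>), which takes up disk space. Older tutorials remove old kernels by searching for them first and then removing specific versions, which means typing out version numbers. A single command removes all of the old kernels:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf remove --oldinstallonly</span><br></pre></td></tr></table></figure><img src="/images/000028/02.jpg" width="700" alt="Fedora: removing old kernels with one command" align=center /><h2 id="4-启用-RPM-Fusion-软件源">4. 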
Enable the RPM Fusion repositories</h2><p>The Fedora installer asks whether you want to enable third-party repositories.</p><p>The repositories enabled that way cover only a few items such as the NVIDIA driver, Google Chrome, and Steam. The full RPM Fusion repositories are not enabled, so software such as VLC and MPV is still unavailable.</p><p>I recommend enabling the full RPM Fusion repositories. Users in mainland China are better off doing so through the Tsinghua mirror:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo yum install --nogpgcheck https://mirrors.tuna.tsinghua.edu.cn/rpmfusion/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm https://mirrors.tuna.tsinghua.edu.cn/rpmfusion/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm</span><br></pre></td></tr></table></figure><p>After installation, edit the files in <code>/etc/yum.repos.d/</code> whose names start with <code>rpmfusion</code> and end with <code>.repo</code>. Specifically, in each line beginning with <code>baseurl=</code>, replace <code>http://download1.rpmfusion.org/</code> in the URL after the equals sign with <code>https://mirrors.tuna.tsinghua.edu.cn/rpmfusion/</code>. The edited file looks similar to this:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line">[rpmfusion-free]</span><br><span class="line">name=RPM Fusion for Fedora $releasever - Free</span><br><span 
class="line">baseurl=https://mirrors.tuna.tsinghua.edu.cn/rpmfusion/free/fedora/releases/$releasever/Everything/$basearch/os/</span><br><span class="line">mirrorlist=http://mirrors.rpmfusion.org/mirrorlist?repo=free-fedora-$releasever&amp;arch=$basearch</span><br><span class="line">enabled=1</span><br><span class="line">metadata_expire=7d</span><br><span class="line">gpgcheck=1</span><br><span class="line">gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-rpmfusion-free-fedora-$releasever</span><br><span class="line"></span><br><span class="line">[rpmfusion-free-debuginfo]</span><br><span class="line">name=RPM Fusion for Fedora $releasever - Free - Debug</span><br><span class="line">mirrorlist=http://mirrors.rpmfusion.org/mirrorlist?repo=free-fedora-debug-$releasever&amp;arch=$basearch</span><br><span class="line">enabled=0</span><br><span class="line">metadata_expire=7d</span><br><span class="line">gpgcheck=1</span><br><span class="line">gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-rpmfusion-free-fedora-$releasever</span><br><span class="line"></span><br><span class="line">[rpmfusion-free-source]</span><br><span class="line">name=RPM Fusion for Fedora $releasever - Free - Source</span><br><span class="line">baseurl=https://mirrors.tuna.tsinghua.edu.cn/rpmfusion/free/fedora/releases/$releasever/Everything/source/SRPMS/</span><br><span class="line">mirrorlist=http://mirrors.rpmfusion.org/mirrorlist?repo=free-fedora-source-$releasever&amp;arch=$basearch</span><br><span class="line">enabled=0</span><br><span class="line">metadata_expire=7d</span><br><span class="line">gpgcheck=1</span><br><span class="line">gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-rpmfusion-free-fedora-$releasever</span><br></pre></td></tr></table></figure><h2 id="5-添加-Flathub-存储库">5. 
Add the Flathub repository</h2><p>Fedora enables Flatpak by default, but the preconfigured remote is filtered by Fedora.</p><p>To get access to the full catalog of Flatpak applications, add the Flathub repository from a terminal:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">flatpak remote-add --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo</span><br></pre></td></tr></table></figure><h2 id="6-配置-DNF-以更快地下载包">6. Configure DNF for faster downloads</h2><p>Fedora offers several ways to speed up package downloads. Selecting the fastest mirror helps, and if your Internet connection is fast enough you can also raise the number of parallel downloads.</p><p>Both are set in the DNF configuration file, <code>/etc/dnf/dnf.conf</code>.</p><p>Append the following lines to <code>/etc/dnf/dnf.conf</code>, then save and exit:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">fastestmirror=true</span><br><span class="line">deltarpm=true</span><br><span class="line">max_parallel_downloads=10</span><br></pre></td></tr></table></figure><ul><li><code>fastestmirror</code> selects the fastest mirror. You do not need it if you have pointed the repositories at a specific mirror by hand.</li><li><code>deltarpm</code> works like an incremental download: only the changed part of a package is downloaded and combined with the installed package to rebuild the new one, much like app updates on Android.</li><li><code>max_parallel_downloads</code> sets the maximum number of parallel downloads.</li></ul><h2 id="7-安装后更改主机名">7. 
Change the hostname after installation</h2><p>After installation, the default hostname is <code>fedora</code>.</p><p>To personalize it, set a new hostname with the following command:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo hostnamectl set-hostname &lt;your-hostname&gt;</span><br></pre></td></tr></table></figure><p>Replace <code>&lt;your-hostname&gt;</code> with your hostname (without the <code>&lt;</code> and <code>&gt;</code>). An FQDN, that is, a fully qualified hostname that includes the domain, is recommended.</p><p>Then edit <code>/etc/hosts</code> and append your hostname to both the <code>127.0.0.1</code> and the <code>::1</code> entries, like this:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"># Loopback entries; do not change.</span><br><span class="line"># For historical reasons, localhost precedes localhost.localdomain:</span><br><span class="line">127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4 &lt;your-hostname&gt;</span><br><span class="line">::1         localhost localhost.localdomain localhost6 localhost6.localdomain6 &lt;your-hostname&gt;</span><br><span class="line"># See hosts(5) for proper format and other examples:</span><br><span class="line"># 192.168.1.10 foo.mydomain.org foo</span><br><span class="line"># 192.168.1.13 bar.mydomain.org bar</span><br></pre></td></tr></table></figure><h2 id="8-安装-Gnome-优化和扩展应用程序">8. 
Install GNOME Tweaks and the Extensions app</h2><p>To adjust the look and feel of GNOME, install GNOME Tweaks and the Extensions app, either from GNOME Software or from a terminal with the following command:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install gnome-tweaks gnome-extensions-app</span><br></pre></td></tr></table></figure><p>You can then pick extensions from the <a href="https://extensions.gnome.org/">GNOME Shell Extensions site</a>.</p><p>A few good GNOME extensions can noticeably improve the desktop experience. For reasons of space, this article does not go further into GNOME extensions.</p><h2 id="9-用于电池健康管理的-TLP">9. TLP for battery health management</h2><p>TLP is a handy utility that helps optimize laptop battery life. It comes with a range of command-line options for tuning power settings and for viewing power consumption reports.</p><p>TLP is very easy to use: install it and forget about it. It needs no setup or configuration and works out of the box with its default settings.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install tlp tlp-rdw</span><br></pre></td></tr></table></figure><p>Then remove the conflicting power-profiles-daemon package:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf remove power-profiles-daemon</span><br></pre></td></tr></table></figure><p>Enable the TLP service at boot:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo systemctl enable tlp.service</span><br></pre></td></tr></table></figure><p>You should also mask the following services to avoid conflicts and to make sure that TLP's switching options for radio devices (Bluetooth, Wi-Fi, and so on) work correctly:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo systemctl mask systemd-rfkill.service systemd-rfkill.socket</span><br></pre></td></tr></table></figure><p>With TLP installed, a laptop can run noticeably longer on battery.</p><h2 id="10-安装和配置主题">10. 
Install and configure themes</h2><p>How to style the GNOME desktop is a matter of taste.</p><p>My approach is to use what is already available in the repositories.</p><p>Install the theme:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install flat-remix-theme</span><br></pre></td></tr></table></figure><p>Install the icons:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install numix-icon-theme-circle</span><br></pre></td></tr></table></figure><p>Install the cursors:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo dnf install breeze-cursor-theme</span><br></pre></td></tr></table></figure><p>Then enable the "User Themes" extension in the Extensions app.</p><img src="/images/000028/03.jpg" width="500" alt="GNOME extension management" align=center /><p>Next, open the Appearance settings in GNOME Tweaks and select the theme, icons, and cursor you just installed. You can change the fonts there as well.</p><img src="/images/000028/04.jpg" width="500" alt="GNOME Tweaks appearance settings" align=center /><h2 id="11-配置-NTP-以获得准确的时间">11. 
Configure NTP for accurate time</h2><p>Network Time Protocol (NTP) is a protocol for synchronizing computer clocks. It lets a computer synchronize with a server or clock source and provides highly accurate time correction.</p><p>Fedora uses chrony for time synchronization by default.</p><p>Edit <code>/etc/chrony.conf</code>.</p><p>Set the <code>pool</code> directive to one of the following:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"># China NTP fast time service (ntp.org.cn)</span><br><span class="line">pool cn.ntp.org.cn</span><br><span class="line"></span><br><span class="line"># Alibaba Cloud NTP</span><br><span class="line">pool ntp.aliyun.com</span><br><span class="line"></span><br><span class="line"># Tencent Cloud NTP</span><br><span class="line">pool ntp.tencent.com</span><br></pre></td></tr></table></figure><p>Then restart chrony:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">sudo systemctl restart chronyd.service</span><br></pre></td></tr></table></figure><p>That's it. Enjoy Fedora.</p>]]>
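      <![CDATA[<p>For reference, a complete <code>/etc/dnf/dnf.conf</code> after the change in section 6 might look like the sketch below. The first five settings under <code>[main]</code> are assumed to be the stock Fedora defaults and may differ on your system; only the last three lines come from this article. All of them must sit under the <code>[main]</code> section header.</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">[main]</span><br><span class="line"># Assumed stock defaults; keep whatever your file already contains</span><br><span class="line">gpgcheck=True</span><br><span class="line">installonly_limit=3</span><br><span class="line">clean_requirements_on_remove=True</span><br><span class="line">best=False</span><br><span class="line">skip_if_unavailable=True</span><br><span class="line"># Lines added in section 6</span><br><span class="line">fastestmirror=true</span><br><span class="line">deltarpm=true</span><br><span class="line">max_parallel_downloads=10</span><br></pre></td></tr></table></figure><p><code>installonly_limit</code> is the setting that decides how many kernel versions DNF keeps installed, which is why old kernels accumulate as described in section 3.</p>]]>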
    </content>
    <id>https://www.insidentally.com/articles/000028/</id>
    <link href="https://www.insidentally.com/articles/000028/"/>
    <published>2022-06-04T02:10:33.000Z</published>
    <summary>
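      <![CDATA[<p>Fedora is the most aggressive of the Red Hat family of distributions. Many people regard Fedora users as Red Hat's guinea pigs, but Fedora's rapid update pace is also a real convenience for developers. This article walks through a few simple settings to apply after installing Fedora 36 that make the system easier to use.</p>]]>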
    </summary>
    <title>A few simple post-install settings for Fedora 36 for users in China</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
  <entry>
    <author>
      <name>insidentally</name>
    </author>
    <category term="技术分享" scheme="https://www.insidentally.com/categories/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/"/>
    <category term="Linux" scheme="https://www.insidentally.com/tags/Linux/"/>
    <category term="Debian sid" scheme="https://www.insidentally.com/tags/Debian-sid/"/>
    <category term="镜像源" scheme="https://www.insidentally.com/tags/%E9%95%9C%E5%83%8F%E6%BA%90/"/>
    <content>
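      <![CDATA[<p>Strictly speaking, Debian sid is not an official release. It is more like a rolling development version of Debian that contains the newest packages introduced into the Debian system. It is generally used only by hardcore developers and testers. Its packages are extremely new, and correspondingly they may be unstable.</p><h2 id="安装-Debian">Install Debian</h2><p>You can download the latest Debian installation image from the Debian <a href="https://www.debian.org/CD/">website</a>. Users in mainland China can also download it from the mirrors run by major universities and companies. Good mirror download locations include <a href="https://mirrors.tuna.tsinghua.edu.cn/debian-cd/current/amd64/iso-cd/">Tsinghua</a>, <a href="https://mirrors.ustc.edu.cn/debian-cd/current/amd64/iso-cd/">USTC</a>, <a href="https://mirrors.cloud.tencent.com/debian-cd/current/amd64/iso-cd/">Tencent Cloud</a>, and <a href="https://mirrors.aliyun.com/debian-cd/current/amd64/iso-cd/">Alibaba Cloud</a>.</p><p>Install the system by following the <a href="https://www.debian.org/releases/stable/amd64/index.zh-cn.html">official installation guide</a>.</p><h2 id="切换软件源">Switch the package sources</h2><p>After installation you have a regular Debian system. Changing the package sources is all it takes to turn it into Debian sid.</p><p>Edit <code>/etc/apt/sources.list</code> with administrator privileges. Delete everything in the file and replace it with:</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">deb https://mirrors.tuna.tsinghua.edu.cn/debian/ sid main contrib non-free</span><br><span class="line"># deb-src https://mirrors.tuna.tsinghua.edu.cn/debian/ sid main contrib non-free</span><br></pre></td></tr></table></figure><p>sid receives new packages directly and has no separate <code>updates</code>, <code>backports</code>, or <code>security</code> suites, so none of those three sources is needed.</p><p>Then run:</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_"># </span><span class="language-bash"><span class="built_in">sudo</span> apt update</span></span><br><span class="line"><span class="meta prompt_"># </span><span class="language-bash"><span class="built_in">sudo</span> apt full-upgrade</span></span><br></pre></td></tr></table></figure><p>This updates the system and converts it to Debian sid. <code>full-upgrade</code> is used because switching suites may require installing or removing packages, which a plain <code>upgrade</code> will not do.</p><h2 id="需要注意的点">Things to keep in mind</h2><p>Keep in mind that running Debian sid effectively means taking part in Debian development. You are expected to know your way around Linux, Debian, and the Debian packaging system, and to take an interest in tracking and fixing bugs.</p><p>Be careful with every update: check what is being updated and which changes need manual attention.</p><p>Keep a working live CD/USB at hand so that you can repair the system if something breaks.</p><p>Back up the system from time to time so that you can restore it when something goes wrong.</p><h2 id="使用体验">Experience</h2><p>I have been running Debian sid on my Lenovo Xiaoxin Pro 13 for some time now. Debian sid is nominally the unstable distribution, but in practice it has been fairly stable, with no major problems.</p><p>Packages in Debian sid are very new, though not as new as in Arch Linux or Fedora.</p><p>The newer kernel supports recent hardware well. Intel's latest wireless cards work out of the box, so you get Wi-Fi 6 speeds.</p><p>Because sid is never released, there are no major version upgrades to worry about; it is effectively a rolling distribution. Geeks of all kinds are welcome to give this unique rolling distribution a try.</p>]]>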
    </content>
    <id>https://www.insidentally.com/articles/000027/</id>
    <link href="https://www.insidentally.com/articles/000027/"/>
    <published>2022-05-27T02:10:33.000Z</published>
    <summary>
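      <![CDATA[<p>Strictly speaking, Debian sid is not an official release. It is more like a rolling development version of Debian that contains the newest packages introduced into the Debian system. It is generally used only by hardcore developers and testers. Its packages are extremely new, and correspondingly they may be unstable.</p>]]>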
    </summary>
    <title>How to switch to a unique rolling distribution: Debian sid</title>
    <updated>2026-06-02T09:44:03.955Z</updated>
  </entry>
</feed>
