pytorch中如何只让指定变量向后传播梯度?
(或者说如何让指定变量不参与后向传播?)
有以下公式,假如要让L对xvar求导:
(1)中,L对xvar的求导将同时计算out1部分和out2部分;
(2)中,L对xvar的求导只计算out2部分,因为out1的requires_grad=False;
(3)中,L对xvar的求导只计算out1部分,因为out2的requires_grad=False;
验证如下:
#!/usr/bin/env python2 # -*- coding: utf-8 -*- """ Created on Wed May 23 10:02:04 2018 @author: hy """ import torch from torch.autograd import Variable print("Pytorch version: {}".format(torch.__version__)) x=torch.Tensor([1]) xvar=Variable(x,requires_grad=True) y1=torch.Tensor([2]) y2=torch.Tensor([7]) y1var=Variable(y1) y2var=Variable(y2) #(1) print("For (1)") print("xvar requres_grad: {}".format(xvar.requires_grad)) print("y1var requres_grad: {}".format(y1var.requires_grad)) print("y2var requres_grad: {}".format(y2var.requires_grad)) out1 = xvar*y1var print("out1 requres_grad: {}".format(out1.requires_grad)) out2 = xvar*y2var print("out2 requres_grad: {}".format(out2.requires_grad)) L=torch.pow(out1-out2,2) L.backward() print("xvar.grad: {}".format(xvar.grad)) xvar.grad.data.zero_() #(2) print("For (2)") print("xvar requres_grad: {}".format(xvar.requires_grad)) print("y1var requres_grad: {}".format(y1var.requires_grad)) print("y2var requres_grad: {}".format(y2var.requires_grad)) out1 = xvar*y1var print("out1 requres_grad: {}".format(out1.requires_grad)) out2 = xvar*y2var print("out2 requres_grad: {}".format(out2.requires_grad)) out1 = out1.detach() print("after out1.detach(), out1 requres_grad: {}".format(out1.requires_grad)) L=torch.pow(out1-out2,2) L.backward() print("xvar.grad: {}".format(xvar.grad)) xvar.grad.data.zero_() #(3) print("For (3)") print("xvar requres_grad: {}".format(xvar.requires_grad)) print("y1var requres_grad: {}".format(y1var.requires_grad)) print("y2var requres_grad: {}".format(y2var.requires_grad)) out1 = xvar*y1var print("out1 requres_grad: {}".format(out1.requires_grad)) out2 = xvar*y2var print("out2 requres_grad: {}".format(out2.requires_grad)) #out1 = out1.detach() out2 = out2.detach() print("after out2.detach(), out2 requres_grad: {}".format(out1.requires_grad)) L=torch.pow(out1-out2,2) L.backward() print("xvar.grad: {}".format(xvar.grad)) xvar.grad.data.zero_()
pytorch中,将变量的requires_grad设为False,即可让变量不参与梯度的后向传播;
但是不能直接将out1.requires_grad=False;
其实,Variable类型提供了detach()方法,所返回变量的requires_grad为False。
注意:如果out1和out2的requires_grad都为False的话,那么xvar.grad就出错了,因为梯度没有传到xvar
补充:
volatile=True表示这个变量不计算梯度, 参考:Volatile is recommended for purely inference mode, when you're sure you won't be even calling .backward(). It's more efficient than any other autograd setting - it will use the absolute minimal amount of memory to evaluate the model. volatile also determines that requires_grad is False.
以上这篇在pytorch中实现只让指定变量向后传播梯度就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持。
免责声明:本站文章均来自网站采集或用户投稿,网站不提供任何软件下载或自行开发的软件! 如有用户或公司发现本站内容信息存在侵权行为,请邮件告知! 858582#qq.com
RTX 5090要首发 性能要翻倍!三星展示GDDR7显存
三星在GTC上展示了专为下一代游戏GPU设计的GDDR7内存。
首次推出的GDDR7内存模块密度为16GB,每个模块容量为2GB。其速度预设为32 Gbps(PAM3),但也可以降至28 Gbps,以提高产量和初始阶段的整体性能和成本效益。
据三星表示,GDDR7内存的能效将提高20%,同时工作电压仅为1.1V,低于标准的1.2V。通过采用更新的封装材料和优化的电路设计,使得在高速运行时的发热量降低,GDDR7的热阻比GDDR6降低了70%。
更新日志
- lol全球总决赛lck一号种子是谁 S14全球总决赛lck一号种子队伍
- BradMehldau-ApresFaure(2024)[24-96]FLAC
- IlCannone-FrancescaDegoPlaysPaganinisViolin(2021)[24-96]FLAC
- Tchaikovsky,Babajanian-PianoTrios-Gluzman,Moser,Sudbin[FLAC+CUE]
- 费玉清.1987-费玉清十周年旧曲情怀4CD【东尼】【WAV+CUE】
- 群星.2024-春花焰电视剧影视原声带【TME】【FLAC分轨】
- 方力申.2008-我的最爱新曲+精丫金牌大风】【WAV+CUE】
- 群星 《2024好听新歌35》十倍音质 U盘音乐 [WAV分轨][1.1G]
- 群星《烧透你的耳朵1》DXD金佰利 [低速原抓WAV+CUE][1.2G]
- 莫文蔚《超级金曲精选2CD》SONY [WAV+CUE][1.6G]
- 【RR】加尼克奥尔森GarrickOhlsso《贝多芬钢琴协奏曲全集》原声母带WAV
- 彭芳《纯色角1》[WAV+CUE]
- 李蔓《山顶的月亮—李蔓动态情歌》
- 梁咏琪.1999-新鲜【EEI】【WAV+CUE】
- 张琍敏.1979-悲之秋【海山】【FLAC分轨】