<metaname="twitter:image:src"content="https://avatars2.githubusercontent.com/u/21003710?s=400&v=4"><metaname="twitter:site"content="@github"><metaname="twitter:card"content="summary"><metaname="twitter:title"content="pytorch/pytorch"><metaname="twitter:description"content="Tensors and Dynamic neural networks in Python with strong GPU acceleration - pytorch/pytorch">
<metaproperty="og:image"content="https://avatars2.githubusercontent.com/u/21003710?s=400&v=4"><metaproperty="og:site_name"content="GitHub"><metaproperty="og:type"content="object"><metaproperty="og:title"content="pytorch/pytorch"><metaproperty="og:url"content="https://github.com/pytorch/pytorch"><metaproperty="og:description"content="Tensors and Dynamic neural networks in Python with strong GPU acceleration - pytorch/pytorch">
<inputtype="text"class="form-control input-sm header-search-input jump-to-field js-jump-to-field js-site-search-focus js-site-search-field is-clearable"data-hotkey="s,/"name="q"placeholder="Search or jump to…"data-unscoped-placeholder="Search or jump to…"data-scoped-placeholder="Search or jump to…"autocapitalize="off"aria-autocomplete="list"aria-controls="jump-to-results"aria-label="Search or jump to…"data-jump-to-suggestions-path="/_graphql/GetSuggestedNavigationDestinations"spellcheck="false"autocomplete="off">
<aaria-label="You have unread notifications"class="Header-link notification-indicator position-relative tooltipped tooltipped-sw js-socket-channel js-notification-indicator"data-hotkey="g n"data-ga-click="Header, go to notifications, icon:unread"data-channel="eyJjIjoibm90aWZpY2F0aW9uLWNoYW5nZWQ6MjE4OTkzMCIsInQiOjE1OTI5MjY0MjR9--88a2dfa2cb93da487c2c5e04b90d6dda85ad67f9cb472ba9c47ff45695f4e8e6"href="https://github.com/notifications">
<arole="menuitem"class="dropdown-item"href="https://github.com/pytorch/pytorch/issues/new/choose"data-ga-click="Header, create new issue"data-skip-pjax="">
<divclass="header-nav-current-user css-truncate"><arole="menuitem"class="no-underline user-profile-link px-3 pt-2 pb-2 mb-n2 mt-n1 d-block"href="https://github.com/mslacken"data-ga-click="Header, go to profile, text:Signed in as">Signed in as <strongclass="css-truncate-target">mslacken</strong></a></div>
<inputtype="text"autocomplete="off"data-no-org-url="/autocomplete/user-suggestions"data-org-url="/suggestions?mention_suggester=1"data-maxlength="80"class="d-table-cell width-full form-control js-user-status-message-field js-characters-remaining-field"placeholder="What's happening?"name="message"aria-label="What is your current status?">
</text-expander>
<divclass="error">Could not update your status, please try again.</div>
<inputtype="checkbox"name="limited_availability"value="1"class="js-user-status-limited-availability-checkbox"data-default-message="I may be slow to respond."aria-describedby="limited-availability-help-text-truncate-true-compact-true"id="limited-availability-truncate-true-compact-true">
<arole="menuitem"class="dropdown-item"href="https://github.com/mslacken"data-ga-click="Header, go to profile, text:your profile"data-hydro-click="{"event_type":"global_header.user_menu_dropdown.click","payload":{"request_url":"https://github.com/pytorch/pytorch/releases","target":"YOUR_PROFILE","originating_url":"https://github.com/pytorch/pytorch/releases","user_id":2189930}}"data-hydro-click-hmac="596244f29312bddf019af45a1261bcee05dc3916f0b865106553bd7c9ca8f61c">Your profile</a>
<arole="menuitem"class="dropdown-item"href="https://github.com/mslacken?tab=repositories"data-ga-click="Header, go to repositories, text:your repositories"data-hydro-click="{"event_type":"global_header.user_menu_dropdown.click","payload":{"request_url":"https://github.com/pytorch/pytorch/releases","target":"YOUR_REPOSITORIES","originating_url":"https://github.com/pytorch/pytorch/releases","user_id":2189930}}"data-hydro-click-hmac="a728aebe357ded31d008ea3b7f6e41e0e87015c6191c920d80f24f42010e6929">Your repositories</a>
<arole="menuitem"class="dropdown-item"href="https://github.com/settings/organizations"data-ga-click="Header, go to organizations, text:your organizations"data-hydro-click="{"event_type":"global_header.user_menu_dropdown.click","payload":{"request_url":"https://github.com/pytorch/pytorch/releases","target":"YOUR_ORGANIZATIONS","originating_url":"https://github.com/pytorch/pytorch/releases","user_id":2189930}}"data-hydro-click-hmac="a49959cd987b0cbfbe9666df66301950b0042287568cc680772fe75f93dbf427">Your organizations</a>
<arole="menuitem"class="dropdown-item"href="https://github.com/mslacken?tab=projects"data-ga-click="Header, go to projects, text:your projects"data-hydro-click="{"event_type":"global_header.user_menu_dropdown.click","payload":{"request_url":"https://github.com/pytorch/pytorch/releases","target":"YOUR_PROJECTS","originating_url":"https://github.com/pytorch/pytorch/releases","user_id":2189930}}"data-hydro-click-hmac="9a2fe35ddcd414987d943d72a2db76a742079aaf4cb082de00eb3a33f921d43d">Your projects</a>
<arole="menuitem"class="dropdown-item"href="https://github.com/mslacken?tab=stars"data-ga-click="Header, go to starred repos, text:your stars"data-hydro-click="{"event_type":"global_header.user_menu_dropdown.click","payload":{"request_url":"https://github.com/pytorch/pytorch/releases","target":"YOUR_STARS","originating_url":"https://github.com/pytorch/pytorch/releases","user_id":2189930}}"data-hydro-click-hmac="3a9029fa49dff517cd46daecf6e2eead4991b49385eaaea42ff309516c27cdad">Your stars</a>
<arole="menuitem"class="dropdown-item"href="https://gist.github.com/mine"data-ga-click="Header, your gists, text:your gists"data-hydro-click="{"event_type":"global_header.user_menu_dropdown.click","payload":{"request_url":"https://github.com/pytorch/pytorch/releases","target":"YOUR_GISTS","originating_url":"https://github.com/pytorch/pytorch/releases","user_id":2189930}}"data-hydro-click-hmac="ebedd3947c448f521eb69f0d92f11b566e5b23e535686265efcf9d35f1918c04">Your gists</a>
<arole="menuitem"class="dropdown-item"href="https://help.github.com/"data-ga-click="Header, go to help, text:help"data-hydro-click="{"event_type":"global_header.user_menu_dropdown.click","payload":{"request_url":"https://github.com/pytorch/pytorch/releases","target":"HELP","originating_url":"https://github.com/pytorch/pytorch/releases","user_id":2189930}}"data-hydro-click-hmac="7b3da1dc5c5ef58d2ddc3b566d893696264f52e4d9f876813397cf2034765f51">Help</a>
<arole="menuitem"class="dropdown-item"href="https://github.com/settings/profile"data-ga-click="Header, go to settings, icon:settings"data-hydro-click="{"event_type":"global_header.user_menu_dropdown.click","payload":{"request_url":"https://github.com/pytorch/pytorch/releases","target":"SETTINGS","originating_url":"https://github.com/pytorch/pytorch/releases","user_id":2189930}}"data-hydro-click-hmac="7e1e26943ae2892f8fcb8d15c0e24b5697ded34e809f38906b178bc562905eab">Settings</a>
<aclass="social-count"href="https://github.com/pytorch/pytorch/network/dependents?package_id=UGFja2FnZS01MjY1MjIxNQ%3D%3D"aria-label="33218 repositories depend on this package">
<summaryclass="btn btn-sm btn-with-count"title="Fork your own copy of pytorch/pytorch to your account"data-hydro-click="{"event_type":"repository.click","payload":{"target":"FORK_BUTTON","repository_id":65600975,"originating_url":"https://github.com/pytorch/pytorch/releases","user_id":2189930}}"data-hydro-click-hmac="9646ca24c40d508fa509f6addc0c8bf9efbdab2df43d06b479c35617ddf1eddd"data-ga-click="Repository, show fork modal, action:releases#index; text:Fork"role="button">
### Autograd: Operations that return integer-type tensors now always return tensors that don't require grad ([#37789](https://github.com/pytorch/pytorch/pull/37789))

This most notably affects `torch.argmin`, `torch.argmax`, and `torch.argsort`. This change is BC-breaking because in 1.5.0 one could obtain an integer-type tensor that requires grad. However, such tensors were not usable by autograd; calling `.backward()` on them resulted in an error, so most users are unlikely to have been relying on this behavior.
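A minimal sketch of the new behavior (the tensor values are illustrative):

```python
import torch

x = torch.randn(3, requires_grad=True)
idx = torch.argmax(x)          # integer-typed result
print(idx.requires_grad)       # False, even though x requires grad
```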
### When using multiprocessing, PyTorch 1.5.1 and 1.5.0 may error out with complaints about incompatibility between MKL and libgomp ([#37377](https://github.com/pytorch/pytorch/issues/37377))

You may see error messages like the following when using the `torch.multiprocessing` package. This bug has primarily affected users with AMD CPUs.
```
Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library.
Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.
```
You can get rid of the error and the error message by setting the environment variable `MKL_THREADING_LAYER=GNU`. This can be done either by including the following in your Python code:
```python
import os
os.environ['MKL_THREADING_LAYER'] = 'GNU'
```
or by specifying the environment variable when running your script, for example `MKL_THREADING_LAYER=GNU python your_script.py`.

To learn more about what triggers this bug and other workarounds if the above isn't working, please [read this comment on the issue](https://github.com/pytorch/pytorch/issues/37377#issuecomment-629610327).
# Critical Fixes
<h3><ahref="https://pytorch.org/docs/stable/torch.html#torch.multinomial"rel="nofollow"><code>torch.multinomial</code>:</a> Fixed a bug where CUDA <code>multinomial</code> generated the same sequence over and over again with a shift of 4. (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="614324506"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/38046"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/38046/hovercard"href="https://github.com/pytorch/pytorch/pull/38046">#38046</a>)</h3>
<h3><ahref="https://pytorch.org/docs/stable/nn.html#conv2d"rel="nofollow"><code>nn.Conv2d</code></a>: Fixed a bug where circular padding applied padding across the wrong dimension (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="612903601"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/37881"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/37881/hovercard"href="https://github.com/pytorch/pytorch/pull/37881">#37881</a>)</h3>
<h3>Fixed bug where asserts in CUDA kernels were mistakingly disabled, leading to many silent kernel errors. (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="623515934"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/38943"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/38943/hovercard"href="https://github.com/pytorch/pytorch/pull/38943">#38943</a>, <aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="625349578"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/39047"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/39047/hovercard"href="https://github.com/pytorch/pytorch/pull/39047">#39047</a>, <aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="626980560"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/39218"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/39218/hovercard"href="https://github.com/pytorch/pytorch/pull/39218">#39218</a>)</h3>
<h3><ahref="https://pytorch.org/docs/stable/torch.html#torch.gather"rel="nofollow"><code>torch.gather</code></a>, <ahref="https://pytorch.org/docs/stable/torch.html#torch.scatter"rel="nofollow"><code>torch.scatter</code></a>: added checks for illegal input dtypes that caused silently incorrect behaviors (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="614209545"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/38025"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/38025/hovercard"href="https://github.com/pytorch/pytorch/pull/38025">#38025</a>, <aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="620213360"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/38646"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/38646/hovercard"href="https://github.com/pytorch/pytorch/pull/38646">#38646</a>)</h3>
<h3><ahref="https://pytorch.org/docs/stable/torch.html#torch.argmin"rel="nofollow"><code>torch.argmin</code></a>, <ahref="https://pytorch.org/docs/stable/torch.html?highlight=argmax#torch.argmax"rel="nofollow"><code>torch.argmax</code></a>: Fixed silently incorrect result for inputs with more than 2^32 elements (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="626925816"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/39212"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/39212/hovercard"href="https://github.com/pytorch/pytorch/pull/39212">#39212</a>)</h3>
<h3>C++ Custom Operators: fixed a bug where custom operators stopped working with autograd and ignored the <code>requires_grad=True</code> flag. (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="607783672"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/37355"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/37355/hovercard"href="https://github.com/pytorch/pytorch/pull/37355">#37355</a>)</h3>
## Crashes and Error Fixes
### Fixed CUDA reduction operations on inputs with more than 2^32 elements ([#37788](https://github.com/pytorch/pytorch/pull/37788))

Previously, such reductions could fail with an error like:

```
RuntimeError: sub_iter.strides(0)[0] == 0 INTERNAL ASSERT FAILED at /pytorch/aten/src/ATen/native/cuda/Reduce.cuh:706, please report a bug to PyTorch.
```
### Fixed pickling of PyTorch operators ([#38033](https://github.com/pytorch/pytorch/pull/38033))
### [`nn.LeakyReLU`](https://pytorch.org/docs/stable/nn.html?highlight=leaky#torch.nn.LeakyReLU): Fixed a bug where using autograd with in-place `nn.LeakyReLU` with a slope of 0 incorrectly errored out ([#37453](https://github.com/pytorch/pytorch/pull/37453), [#37559](https://github.com/pytorch/pytorch/pull/37559))
### [`torch.as_strided`](https://pytorch.org/docs/stable/torch.html#torch.as_strided): Fixed a crash when passed `sizes` and `strides` of different lengths ([#39301](https://github.com/pytorch/pytorch/pull/39301))
### [`nn.SyncBatchNorm.convert_sync_batchnorm`](https://pytorch.org/docs/stable/nn.html#torch.nn.SyncBatchNorm.convert_sync_batchnorm): Fixed a bug where it did not respect the devices of the original BatchNorm module, resulting in device mismatch errors ([#39344](https://github.com/pytorch/pytorch/pull/39344))
### [`nn.utils.clip_grad_norm_`](https://pytorch.org/docs/stable/nn.html#torch.nn.utils.clip_grad_norm_): Fixed the ability to operate on tensors on different devices ([#38615](https://github.com/pytorch/pytorch/pull/38615))
### [`torch.min`](https://pytorch.org/docs/stable/torch.html#torch.min), [`torch.max`](https://pytorch.org/docs/stable/torch.html#torch.max): Added a check for illegal output dtypes ([#38850](https://github.com/pytorch/pytorch/pull/38850))
### macOS: Fixed an `import torch` error ([#36941](https://github.com/pytorch/pytorch/issues/36941))
### C++ extensions: Fixed a compilation error when building with older versions of nvcc ([#37221](https://github.com/pytorch/pytorch/pull/37221))
This bug mainly affected users of Ubuntu 16.04. We're certain it affected the following configurations:

- Ubuntu 16.04 + CUDA 9.2 + gcc 5
- Ubuntu 16.04 + CUDA 9.2 + gcc 7
- Ubuntu 16.04 + CUDA 10.0 + gcc 5
### C++ extensions: Fixed the ability to compile with paths that include spaces ([#38860](https://github.com/pytorch/pytorch/pull/38860), [#38670](https://github.com/pytorch/pytorch/pull/38670))
### C++ extensions: Fixed the ability to compile with relative `include_dirs` for ahead-of-time compilation ([#38264](https://github.com/pytorch/pytorch/pull/38264))
# Other Fixes
<h3><ahref="https://pytorch.org/docs/stable/nn.html#conv1d"rel="nofollow"><code>nn.Conv1d</code></a>, <ahref="https://pytorch.org/docs/stable/nn.html#conv2d"rel="nofollow"><code>nn.Conv2d</code></a>, <ahref="https://pytorch.org/docs/stable/nn.html#conv3d"rel="nofollow"><code>nn.Conv3d</code></a>: Fixed a bug where convolutions were using more memory than previous versions of PyTorch. (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="620464528"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/38674"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/38674/hovercard"href="https://github.com/pytorch/pytorch/pull/38674">#38674</a>)</h3>
<h3>Fixed in-place floor division magic method (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="620599440"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/38695"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/38695/hovercard"href="https://github.com/pytorch/pytorch/pull/38695">#38695</a>)</h3>
In 1.5.0, the in-place floor division magic method mistakenly performed the floor division out-of-place. We've fixed this in 1.5.1.
### Documentation: Fixed the link to the Java docs ([#39039](https://github.com/pytorch/pytorch/pull/39039))
### Quantization: Fixed weight quantization inaccuracies for LSTM ([#35961](https://github.com/pytorch/pytorch/pull/35961))
Weight quantization was done incorrectly for LSTMs: the statistics for all weights (across layers) were combined in the observer. This meant that weights for later layers in an LSTM would use sub-optimal scales, impacting accuracy. The problem gets worse as the number of layers increases.
### DistributedDataParallel: Fixed the single-process multi-GPU use case ([#36503](https://github.com/pytorch/pytorch/pull/36503))
### RPC: Fixed future callbacks not capturing and restoring the autograd context id ([#38512](https://github.com/pytorch/pytorch/pull/38512))
### TorchScript: Fixed support for [`torch.unique`](https://pytorch.org/docs/stable/torch.html#torch.unique) ([#38156](https://github.com/pytorch/pytorch/pull/38156))
### ONNX: Fixed `pow` operator export ([#39791](https://github.com/pytorch/pytorch/pull/39791))
<preclass="text-small text-gray">[ONNX] Fix pow op export [1.5.1] (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="636433693"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/39791"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/39791/hovercard"href="https://github.com/pytorch/pytorch/pull/39791">#39791</a>)
* [ONNX] Fix pow op export (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="614386399"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/38065"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/38065/hovercard"href="https://github.com/pytorch/pytorch/pull/38065">#38065</a>)
Summary:
Fix pow type cast for opset 9 and update opset 12
Pull Request <spanclass="issue-keyword tooltipped tooltipped-se"aria-label="This commit closes pull request #38065.">resolved</span>: <aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="614386399"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/38065"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/38065/hovercard"href="https://github.com/pytorch/pytorch/pull/38065">#38065</a>
* Update ort-nighly version as suggested in <aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="634995729"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/39685"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/39685/hovercard?comment_id=641452470&comment_type=issue_comment"href="https://github.com/pytorch/pytorch/pull/39685#issuecomment-641452470">#39685 (comment)</a>
* Apply changes from <aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="612806570"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/37846"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/37846/hovercard"href="https://github.com/pytorch/pytorch/pull/37846">#37846</a> to `test_topk_smallest_unsorted`
# PyTorch 1.5.0

This release includes several major new API additions and improvements. These include new APIs for autograd allowing for easy computation of Hessians and Jacobians, a significant update to the C++ frontend, 'channels last' memory format for more performant computer vision models, a stable release of the distributed RPC framework used for model-parallel training, and a new API that allows for the creation of custom C++ classes that was inspired by pybind11. Additionally, `torch_xla` 1.5 is now available and tested with the PyTorch 1.5 release, providing a mature Cloud TPU experience.
### C++ Frontend API [Now Stable]

The C++ frontend API is now at parity with Python, and the feature overall has been moved to 'stable' (it was previously tagged as experimental). Some of the major highlights include:
- C++ `torch::nn` modules/functionals are now at ~100% parity with the Python API, with appropriate documentation. Users can now easily translate their model from the Python API to the C++ API, making the model authoring experience much smoother.
- C++ optimizers now behave identically to the Python API. In the past, optimizers in C++ had deviated from the Python equivalent: C++ optimizers couldn't take parameter groups as input while the Python ones could. Also, step function implementations were not exactly the same. With the 1.5 release, C++ optimizers will always behave the same as the Python equivalent.
- New C++ tensor multi-dim indexing API which looks and behaves similarly to the Python API. The previous workaround was to use a combination of `narrow` / `select` / `index_select` / `masked_select`, which is clunky and error-prone compared to the Python API's elegant `tensor[:, 0, ..., mask]` syntax. With the 1.5 release users can use `tensor.index({Slice(), 0, "...", mask})` to achieve the same result.
### Channels last memory format for computer vision models [Experimental]

Channels last memory format is an alternative way of ordering NCHW tensors in memory while preserving the NCHW semantic dimension ordering. Channels last tensors are ordered in memory in such a way that channels become the densest dimension (i.e., storing images pixel-per-pixel).

Channels last memory format unlocks the ability to use performance-efficient convolution algorithms and hardware (NVIDIA's Tensor Cores, FBGEMM, QNNPACK). Additionally, it was designed to automatically propagate through the operators, which allows easy switching between memory layouts.
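As a minimal sketch of switching layouts (the shape is illustrative; the performance benefit depends on the backend):

```python
import torch

x = torch.randn(8, 3, 224, 224)                              # NCHW tensor
y = x.to(memory_format=torch.channels_last)                  # same shape, channels-dense strides
print(y.shape)                                               # torch.Size([8, 3, 224, 224])
print(y.is_contiguous(memory_format=torch.channels_last))    # True
```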
Learn more [here](https://github.com/pytorch/pytorch/wiki/Writing-memory-format-aware-operators) about how to write memory-format-aware operators.
### Custom C++ Classes [Experimental]

This release adds a new API for binding custom C++ classes into TorchScript and Python simultaneously. This API is almost identical in syntax to [pybind11](https://pybind11.readthedocs.io/en/stable/). It allows users to expose their C++ class and its methods to the TorchScript type system and runtime system such that they can instantiate and manipulate arbitrary C++ objects from TorchScript and Python.

### Distributed RPC framework APIs [Now Stable]

The `torch.distributed.rpc` package aims at supporting a wide range of distributed training paradigms that do not fit into `DistributedDataParallel`. Examples include parameter server training, distributed model parallelism, and distributed pipeline parallelism. Features in the `torch.distributed.rpc` package can be categorized into four main sets of APIs.
- The **RPC** API allows running a function on a specified destination worker with given arguments, and fetches the return value or creates a distributed reference to the return value (see the sketch after this list).
- The **RRef** (Remote REFerence) serves as a reference to an object on another worker. A worker holding an RRef can explicitly request copies of the object, and it can also share the lightweight RRef with other workers without worrying about reference counting. This is especially useful when multiple workers need to repeatedly access different versions of the same remote object.
- With **Distributed Autograd**, applications can automatically compute gradients even if a model is split on multiple workers using RPC. This is achieved by stitching together local autograd graphs at RPC boundaries in the forward pass and reaching out to participants to transparently launch local autograd in the backward pass.
- The **Distributed Optimizer** uses gradients computed by Distributed Autograd to update model parameters. Its constructor takes a local optimizer (e.g., `SGD`, `Adagrad`, etc.) and a list of parameter RRefs, and its `step()` function automatically uses the local optimizer to update parameters on all distinct RRef owner workers.
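A minimal sketch of the RPC and RRef APIs (the worker names, world size, and `add_one` helper are illustrative assumptions; a peer process must call `init_rpc` as `"worker1"`):

```python
import torch
import torch.distributed.rpc as rpc

def add_one(t):                      # illustrative remote function
    return t + 1

# On the caller (rank 0 of a 2-process group):
rpc.init_rpc("worker0", rank=0, world_size=2)

# RPC: run a function on a destination worker and fetch the result.
result = rpc.rpc_sync("worker1", add_one, args=(torch.ones(2),))

# RRef: keep the result on worker1 and hold a lightweight reference to it.
rref = rpc.remote("worker1", add_one, args=(torch.ones(2),))
value = rref.to_here()               # copy the value back only when needed

rpc.shutdown()
```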
Learn more [here](https://pytorch.org/docs/stable/rpc.html).
### torch_xla 1.5 now available

[torch_xla](http://pytorch.org/xla/) is a Python package that uses the [XLA linear algebra compiler](https://www.tensorflow.org/xla) to accelerate the [PyTorch deep learning framework](https://pytorch.org/) on [Cloud TPUs](https://cloud.google.com/tpu/) and [Cloud TPU Pods](https://cloud.google.com/tpu/docs/tutorials/pytorch-pod). torch_xla aims to give PyTorch users the ability to do everything they can do on GPUs on Cloud TPUs as well, while minimizing changes to the user experience. This release of [torch_xla](http://pytorch.org/xla/) is aligned and tested with PyTorch 1.5 to reduce friction for developers and to provide a stable and mature PyTorch/XLA stack for training models using Cloud TPU hardware. You can [try it for free](https://medium.com/pytorch/get-started-with-pytorch-cloud-tpus-and-colab-a24757b8f7fc) in your browser on an 8-core Cloud TPU device with [Google Colab](https://colab.research.google.com/), and you can use it at a much larger scale [on Google Cloud](https://cloud.google.com/gcp).

See the full torch_xla release notes [here](https://github.com/pytorch/xla/releases) and the full docs [here](https://pytorch.org/xla/).
### New High level autograd API [Experimental]

PyTorch 1.5 brings new functions including `jacobian`, `hessian`, `jvp`, `vjp`, `hvp` and `vhp` to the `torch.autograd.functional` submodule. This feature builds on the current API and allows the user to easily perform these computations.
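For example, a minimal sketch using the new functional API (the function and input are illustrative):

```python
import torch
from torch.autograd.functional import jacobian, hessian

def f(x):                         # illustrative scalar-valued function
    return (x ** 3).sum()

x = torch.randn(3)
print(jacobian(f, x))             # same values as 3 * x**2
print(hessian(f, x))              # diagonal matrix with 6 * x on the diagonal
```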
See the full docs [here](https://pytorch.org/docs/stable/autograd.html#functional-higher-level-api).
### Python 2 no longer supported

For PyTorch 1.5.0 we will no longer support Python 2, specifically version 2.7. Going forward, support for Python will be limited to Python 3, specifically Python 3.5, 3.6, 3.7 and 3.8 (first enabled in PyTorch 1.4.0).
# Known Issues
### `torch.nn.parallel.DistributedDataParallel` does not work in Single-Process Multi-GPU mode

`DistributedDataParallel` (DDP) used to support two modes:

1. Single-Process Multi-GPU (SPMG): In this mode, each DDP process replicates the input `module` to all specified devices and trains on all `module` replicas. This mode is enabled when the application passes in a `device_ids` argument that contains multiple devices, or when `device_ids` is not provided, in which case DDP tries to use all available devices.
2. Multi-Process Single-GPU (MPSG): This is the **recommended** mode, as it is faster than SPMG. In this mode, each DDP process works directly on the provided `module` without creating additional replicas. This mode is enabled when `device_ids` contains only a single device or when there is only one visible device (e.g., by setting `CUDA_VISIBLE_DEVICES`).
A recent change ([#33907](https://github.com/pytorch/pytorch/pull/33907)) in `torch.nn.parallel.replicate` breaks DDP's assumption about replicated modules and leads to failures in SPMG mode. However, since SPMG is known to be slower due to GIL contention and the additional overhead caused by scattering input and gathering output, we are planning to retire this mode in future releases and make MPSG the only supported mode in DDP. The code below shows an example of the recommended way to construct DDP.
```python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP
```
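Building on those imports, a minimal sketch of the recommended one-process-per-GPU (MPSG) construction (the `MyModel` module, rendezvous settings, and `nccl` backend are illustrative assumptions):

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

class MyModel(nn.Module):                      # illustrative model
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 10)

    def forward(self, x):
        return self.fc(x)

def run(rank, world_size):
    # Illustrative rendezvous settings; adjust for your cluster.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)

    torch.cuda.set_device(rank)                # one process per GPU
    model = MyModel().to(rank)
    ddp_model = DDP(model, device_ids=[rank])  # single device -> MPSG mode
    # ... training loop ...
    dist.destroy_process_group()
```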
<p>See <ahref="https://github.com/pytorch/pytorch/issues/36268"data-hovercard-type="issue"data-hovercard-url="/pytorch/pytorch/issues/36268/hovercard">#36268</a> for more discussion.</p>
### `Tensor.exponential_(0)` used to return `Inf`; now it incorrectly returns `0`

Previously, in 1.4, `x.exponential_(0)` gave a tensor full of `inf`. In 1.5.0, it wrongly gives a tensor full of zeros.
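A small illustration of the regression (the commented outputs show what each version produces):

```python
import torch

x = torch.empty(3)
x.exponential_(0)
# 1.4.0: tensor([inf, inf, inf])
# 1.5.0: tensor([0., 0., 0.])   <- incorrect
```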
<p>See <ahref="https://github.com/pytorch/pytorch/issues/36798"data-hovercard-type="issue"data-hovercard-url="/pytorch/pytorch/issues/36798/hovercard">#36798</a> for more details</p>
# Backwards Incompatible Changes

## Python
### `Tensor.clone`, `Tensor.to`, `Tensor.empty_like`, and similar functions preserve stride information instead of returning contiguous tensors

The `clone`, `to`, `type`, `cuda`, `cpu`, `byte`, `char`, `double`, `bool`, `half`, `int`, `long`, `short`, `float`, `bfloat16`, `empty_like`, `full_like`, `ones_like`, `zeros_like`, `rand_like`, `randn_like`, and `randint_like` operators now propagate the memory format (roughly, the strides) of the input tensor to the output tensor.

Since PyTorch operators generally support non-contiguous tensors, this should have no functional effect on most PyTorch programs.
The most common incompatibility with Python programs is with the `view` operator, which has specific stride requirements. If these requirements are no longer met as a result of this change, you will get an error message indicating that you should use `reshape` instead, i.e.: "RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead."
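As an illustrative sketch of how this can surface (the shapes are arbitrary):

```python
import torch

x = torch.randn(4, 3).t()    # non-contiguous (transposed) tensor
y = x.clone()                # 1.5: clone preserves x's strides instead of returning a contiguous copy
# y.view(12)                 # may now raise the "view size is not compatible..." error
z = y.reshape(12)            # reshape handles non-viewable layouts
```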
Another possible incompatibility is if you have a (usually C++) operator implementation that works directly on memory (i.e., calls `data_ptr` and relies on the strides being contiguous).

In the following example, we go through the implementation of a simple `clone` operation and see how it needs to change between versions.
### The inferred dtype of `np.float_` / `np.float64` scalars in tensor constructors (e.g., `torch.tensor(...)`, `torch.as_tensor(...)`) is now `torch.float64` instead of the default dtype (usually `torch.float32`) ([#30486](https://github.com/pytorch/pytorch/pull/30486))
Please explicitly pass in the desired dtype when constructing tensors with NumPy float64 scalars to get the old behavior.

This can cause your program to execute in `torch.float64`, potentially slowing down your program, or can lead to errors for operators that don't support `torch.float64` or mixed dtypes.
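A minimal sketch of the change and the workaround:

```python
import numpy as np
import torch

torch.tensor(np.float64(1.0)).dtype                   # 1.5: torch.float64 (was torch.float32)
torch.tensor(np.float64(1.0), dtype=torch.float32)    # explicitly pass dtype for the old behavior
```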
### NumPy integer scalars are now treated as integers for the purposes of type promotion ([#30486](https://github.com/pytorch/pytorch/pull/30486))

Previously, in 1.4.0, they were mistakenly treated as floats (so, for example, `torch.ones(3) * np.int64(3)` would return a float32 tensor). In 1.5.0, we've fixed that behavior; `torch.ones(3) * np.int64(3)` returns an int32 tensor.

This can cause your code to fail if you performed operations between PyTorch tensors and NumPy scalars and then passed the result into an operation that does not support integral types or mixed types. To fix your code, please cast the resulting tensor to the desired dtype.
### `torch.autograd.Function`: dropped support for old-style Functions ([#33956](https://github.com/pytorch/pytorch/pull/33956))

In previous versions of PyTorch, there were two ways to write autograd Functions. We deprecated one of them in 1.3.0 and dropped support for it entirely in 1.5.0. Old-style autograd Functions will no longer work in user code.

These Functions can be identified by not having `staticmethod` `forward` and `backward` methods (see the example below). Please see [the current documentation](https://pytorch.org/docs/stable/autograd.html#torch.autograd.Function) for how to write new-style Functions.
```python
# Version 1.4.0
class Exp(torch.autograd.Function):
    def forward(self, i):
        result = i.exp()
        self.save_for_backward(result)
        return result

    def backward(self, grad_output):
        result, = self.saved_tensors
        return grad_output * result

Exp()(torch.tensor(1.))
```
```python
# Version 1.5.0
class Exp(torch.autograd.Function):
    @staticmethod
    def forward(ctx, i):
        result = i.exp()
        ctx.save_for_backward(result)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        result, = ctx.saved_tensors
        return grad_output * result

Exp.apply(torch.tensor(1.))
```
### `torch.optim` optimizers changed to fix in-place checks for the changes made by the optimizer ([#33640](https://github.com/pytorch/pytorch/pull/33640), [#34211](https://github.com/pytorch/pytorch/pull/34211))

If this causes your code to fail, there are two possible reasons:

Reason 1: The value of that parameter was actually saved and used, and we were computing incorrect gradients in previous versions of PyTorch. This would result in an error message mentioning incorrect version numbers. You should replace code that uses `self.my_param` with `self.my_param.clone()` to make sure the saved version is different from the one that is modified by the optimizer. For example:

Before 1.5.0, the following may have worked.
```python
import torch
import torch.optim as optim

def model(input, target, param):
    return (input * param ** 2 - target).norm()

param = torch.randn(2, requires_grad=True)
input = torch.randn(2)
target = torch.randn(2)
sgd = optim.SGD([param], lr=0.001)
loss = model(input, target, param)
loss.backward(retain_graph=True)
sgd.step()
loss.backward()
param.grad
```
If, after upgrading to 1.5.0, the above fails due to a version counter error, then that means the gradient computed was incorrect. To remedy this, clone `param` before using it in the model:
```python
import torch
import torch.optim as optim

def model(input, target, param):
    return (input * param ** 2 - target).norm()

param = torch.randn(2, requires_grad=True)
input = torch.randn(2)
target = torch.randn(2)
sgd = optim.SGD([param], lr=0.001)
loss = model(input, target, param.clone())
loss.backward(retain_graph=True)
sgd.step()
loss.backward()
param.grad
```
Reason 2: You know what you're doing and change the values back to the right thing before the next backward. However, you're running into an error because the version counter cannot be decremented. Open an issue with your particular use case and we will help you work around the version counter issue.
### `utils.cpp_extensions` now uses `ninja` as the default compilation backend ([#32495](https://github.com/pytorch/pytorch/pull/32495))

`ninja` enables parallel compilation of your C++ extension, greatly speeding up compilation. This change will not break most user code; if you do not have `ninja` installed, we fall back to the old `distutils` backend.

However, if you do have `ninja` installed, it is possible that this change will cause your C++ extension build to fail by oversubscribing your system with too many worker processes. There are two potential workarounds for this.

Method 1: If a previously succeeding `python setup.py install` now fails, try setting the `MAX_JOBS` environment variable.
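As a hedged illustration for JIT-compiled extensions (the extension name and source file are assumptions), the same `MAX_JOBS` knob can be set from Python before building:

```python
import os
os.environ["MAX_JOBS"] = "4"        # cap ninja's parallelism before building

from torch.utils.cpp_extension import load
my_ext = load(name="my_ext", sources=["my_ext.cpp"])   # hypothetical extension
```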
### `torch.optim.Adam`, `torch.optim.SGD` changed to not modify gradients in-place ([#30257](https://github.com/pytorch/pytorch/pull/30257))

In previous versions of PyTorch, the Adam and SGD optimizers modified gradients (e.g., `param.grad`) in-place via in-place addition of `param.grad += weight_decay * param`. To make this consistent with the behavior of other optimizers, and to prevent surprises about the behavior, we've changed them to stop modifying gradients in-place.

This should not have an effect on most PyTorch programs unless they relied on this behavior. The easiest way to replicate the old behavior is to create a custom optimizer that implements it.
### `torch.masked_select` now always returns a 1D tensor ([#29923](https://github.com/pytorch/pytorch/pull/29923))

The behavior of `torch.masked_select` when both `self` and `mask` are 0-dimensional was changed. In previous versions of PyTorch, this would return a 0-dimensional tensor. Now, we return a 1-dimensional tensor to be consistent with other input sizes and our documentation.
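A minimal sketch of the new behavior:

```python
import torch

out = torch.masked_select(torch.tensor(5.0), torch.tensor(True))
print(out)        # 1.5: tensor([5.]), a 1-D tensor (earlier versions returned a 0-D tensor)
```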
### `torch.index_select` on a 0-d tensor now returns a 0-d tensor ([#30790](https://github.com/pytorch/pytorch/pull/30790))

In previous versions of PyTorch, the output of `torch.index_select` on a 0-dimensional input tensor produced a 1-dimensional tensor. This was inconsistent with our documentation, which stated "The returned tensor has the same number of dimensions as the original tensor (input)." Now, we return a 0-dimensional tensor.
### `nn.MultiLabelMarginLoss`: 'none' reduction on a 1D tensor now returns a 0D tensor ([#30768](https://github.com/pytorch/pytorch/pull/30768))

In previous versions of PyTorch, the output of `nn.MultiLabelMarginLoss` on 1D and 0D tensors incorrectly produced 1D tensors. Now, those cases return a 0D tensor to be consistent with the 2D tensor case.
### `nn.MultiMarginLoss`: 'none' reduction on a 1D target now returns a 1D tensor ([#30826](https://github.com/pytorch/pytorch/pull/30826))

In previous versions of PyTorch, the output of `nn.MultiMarginLoss` on a 1D `target` tensor produced a 0D output. We changed this to return a 1D output to make it consistent with other input sizes, which return an output that matches the target shape.
### `Tensor.exponential_(lambda)` no longer supports `lambda < 0` ([#32501](https://github.com/pytorch/pytorch/pull/32501))

`lambda`, the rate parameter of the exponential distribution, mathematically should be greater than 0. We've disabled support for `lambda < 0` to be mathematically correct; most users will not have used a lambda less than zero.
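A small sketch of the new behavior (the exact error message may differ):

```python
import torch

x = torch.empty(3)
x.exponential_(1.5)      # lambda > 0: still supported
# x.exponential_(-1.5)   # 1.5.0: lambda < 0 now raises an error
```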
### `nn.BCELoss`, `nn.functional.binary_cross_entropy` no longer accept inputs with the same number of elements that are not broadcastable ([#31365](https://github.com/pytorch/pytorch/pull/31365))

Previously, we supported accepting inputs with the same number of elements. However, this behavior was deprecated and we removed it in 1.5.0. In order to replicate the old behavior, please explicitly `reshape` your input and target tensors to have the same shape.
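For instance, a minimal sketch (the shapes are illustrative):

```python
import torch
import torch.nn.functional as F

pred = torch.rand(4, 1)     # probabilities
target = torch.rand(4)      # same number of elements, different shape: no longer accepted as-is
loss = F.binary_cross_entropy(pred, target.reshape(pred.shape))
```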
### The `torch.normal` `out` argument is now required to have the same size as the computed output ([#32031](https://github.com/pytorch/pytorch/pull/32031))

Previously, on CPU devices, `torch.normal(mean, std, out=out)` would resize `out` to the correct size. To be consistent with the CUDA implementation, we've changed it so that `out` must either already have the correct size, or be an empty tensor with size `[0]`. To work around this, please ensure that your `out` tensor has the correct size.
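A minimal sketch of the new requirement:

```python
import torch

mean, std = torch.zeros(5), torch.ones(5)

out = torch.empty(5)              # already the correct size
torch.normal(mean, std, out=out)

out = torch.empty(0)              # an empty tensor of size [0] is also accepted
torch.normal(mean, std, out=out)
```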
### `Tensor.geometric_` no longer supports integral Tensors ([#31878](https://github.com/pytorch/pytorch/pull/31878))

Previously, on CPU devices, `Tensor.geometric_` supported Tensors with integral dtype. Now, it only supports floating point. We removed this support because it doesn't make sense for `geometric_` to operate on integral dtypes.
### Changed the `torch.floor_divide` `input` positional argument name to `self` ([#34552](https://github.com/pytorch/pytorch/pull/34552))

Before PyTorch 1.5, `torch.floor_divide` took two positional arguments: `torch.floor_divide(input, other)`. We've changed the name of the `input` argument to `self`; this will break code that called `torch.floor_divide` via keyword argument. For example:
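A hedged sketch of the kind of call that breaks and how to fix it (the values are arbitrary):

```python
import torch

a, b = torch.tensor([7.0]), torch.tensor([2.0])

# torch.floor_divide(input=a, other=b)   # keyword `input` is no longer accepted in 1.5
torch.floor_divide(a, b)                 # positional call keeps working
torch.floor_divide(other=b, self=a)      # or use the new argument name
```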
## C++ API

### RNN / GRU / LSTM layers

- Instead of returning `RNNOutput`, the RNN / GRU `forward` method now returns `std::tuple<Tensor, Tensor>`, and the LSTM `forward` method now returns `std::tuple<Tensor, std::tuple<Tensor, Tensor>>`, matching the Python API.
- The LSTM forward method's hidden state parameter now has type `torch::optional<std::tuple<Tensor, Tensor>>`, matching the Python API.
- RNN / LSTM / GRU layers now have a `forward_with_packed_input` method which accepts a `PackedSequence` as input and optionally a hidden state, matching the `forward(PackedSequence, ...)` variant in the Python API.
- RNN / LSTM / GRU layers no longer have these fields: `w_ih` / `w_hh` / `b_ih` / `b_hh`. Instead, to access the weights and biases of the gates, users should do e.g. `rnn->named_parameters()["weight_ih_l0"]`, which mirrors the Python API `rnn.weight_ih_l0`.
- In `RNNOptions`
  - `tanh()` / `relu()` / `activation` are removed. Instead, `nonlinearity` is added, which takes either `torch::kTanh` or `torch::kReLU`
  - `layers` is renamed to `num_layers`
  - `with_bias` is renamed to `bias`
- In `LSTMOptions`
  - `layers` is renamed to `num_layers`
  - `with_bias` is renamed to `bias`
- In `GRUOptions`
  - `layers` is renamed to `num_layers`
  - `with_bias` is renamed to `bias`
<h3>Upsample layer / F::interpolate function (<ahref="https://github.com/pytorch/pytorch/pull/35025"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35025/hovercard">#35025</a>)</h3>
<ul>
<li>There are changes to <code>UpsampleOptions</code> and <code>InterpolateFuncOptions</code>:
<ul>
<li><code>size</code> is changed from <code>std::vector<int64_t></code> to <code>c10::optional<std::vector<int64_t>></code>. If you want to pass a list of <code>int64_t</code> to this argument, you must pass it as <code>std::vector<int64_t></code>.</li>
<li><code>scale_factor</code> is changed from <code>std::vector<double></code> to <code>c10::optional<std::vector<double>></code>. If you want to pass a list of <code>double</code> to this argument, you must pass it as <code>std::vector<double></code>.</li>
<li><code>torch::nn::functional::MultiLabelMarginLossFuncOptions</code> is renamed to <code>torch::nn::functional::MultilabelMarginLossFuncOptions</code></li>
<li><code>torch::nn::functional::MultiLabelSoftMarginLossFuncOptions</code> is renamed to <code>torch::nn::functional::MultilabelSoftMarginLossFuncOptions</code></li>
<li>The deprecated <code>torch::nn::BatchNorm</code> is removed in favor of <code>torch::nn::BatchNorm{1,2,3}d</code></li>
<li>The deprecated <code>torch::nn::FeatureDropout</code> is removed in favor of <code>torch::nn::Dropout{2,3}d</code></li>
<li>The deprecated <code>torch::nn::modules_ordered_dict</code> is removed. User should do <code>Sequential sequential({{"m1", MyModule(1)}, {"m2", MyModule(2)}})</code> instead.</li>
<li>The deprecated <code>torch::nn::init::Nonlinearity</code> is removed, in favor of these enums: <code>torch::kLinear </code>/ <code>torch::kConv1D</code> / <code>torch::kConv2D</code> / <code>torch::kConv3D</code> / <code>torch::kConvTranspose1D</code> / <code>torch::kConvTranspose2D</code> / <code>torch::kConvTranspose3D</code> / <code>torch::kSigmoid</code> / <code>torch::kTanh</code> / <code>torch::kReLU</code> / <code>torch::kLeakyReLU</code></li>
<li>The deprecated <code>torch::nn::init::FanMode</code> is removed, in favor of these enums: <code>torch::kFanIn</code> / <code>torch::kFanOut</code></li>
</ul>
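<p>A minimal sketch of the new optional arguments; the input shape and option values are illustrative:</p>
<pre><code>#include <torch/torch.h>
namespace F = torch::nn::functional;

void interpolate_example() {
  auto input = torch::randn({1, 3, 8, 8});  // (N, C, H, W)

  // size / scale_factor are now c10::optional<std::vector<...>>, so the
  // value must be passed as an explicit std::vector.
  auto out = F::interpolate(
      input,
      F::InterpolateFuncOptions()
          .scale_factor(std::vector<double>({2.0, 2.0}))
          .mode(torch::kNearest));
}
</code></pre>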
<h3>Optimizers</h3>
<ul>
<li><code>Optimizer::step</code> now accepts a closure function as an optional input and returns a tensor, and <code>LossClosureOptimizer</code> is removed (<a href="https://github.com/pytorch/pytorch/pull/34790">#34790</a>, <a href="https://github.com/pytorch/pytorch/pull/34957">#34957</a>). If you had a custom optimizer class defined as:</li>
</ul>
<pre><code>struct MyOptimizer : Optimizer {
using Optimizer::Optimizer;
void step() override {...}
};
</code></pre>
<p>You would need to update your optimizer class definition as follows (a sketch based on the new <code>step</code> signature described above; <code>LossClosure</code> is the optional closure type taken by the C++ optimizers):</p>
<pre><code>struct MyOptimizer : Optimizer {
  using Optimizer::Optimizer;
  // step now optionally takes a loss closure and returns the loss as a Tensor
  torch::Tensor step(LossClosure closure = nullptr) override {...}
};
</code></pre>
<h3>Removed <code>AutoGIL/AutoNoGIL</code> in favor of <code>pybind11::gil_scoped_*</code> functions (#<ahref="https://github.com/pytorch/pytorch/pull/34301"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34301/hovercard">34301</a>)</h3>
<p>If your code released or acquired the GIL via <code>AutoNoGIL</code> or <code>AutoGIL</code>, please change the invocations to <code>pybind11::gil_scoped_release</code> or <code>pybind11::gil_scoped_acquire</code>, respectively.</p>
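<p>For example, a function that used to hold an <code>AutoNoGIL</code> guard would now look roughly like this (sketch):</p>
<pre><code>#include <pybind11/pybind11.h>

void long_running_cpp_work() {
  // Previously: AutoNoGIL no_gil;
  pybind11::gil_scoped_release no_gil;  // GIL stays released for this scope
  // ... C++ work that does not touch Python objects ...
}  // GIL is re-acquired when no_gil goes out of scope
</code></pre>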
<h3>Others</h3>
<ul>
<li><code>torch::tensor(floating-point values)</code> will always produce a tensor of the default dtype, and <code>torch::tensor(integer values)</code> will always produce a tensor of <code>torch::kLong</code> dtype, matching Python API behavior (<a href="https://github.com/pytorch/pytorch/pull/32367">#32367</a>); see the sketch after this list.</li>
<li><code>torch::Tensor::base()</code> is renamed to <code>torch::Tensor::_base()</code> , matching Python API. (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="564992741"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/33316"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33316/hovercard"href="https://github.com/pytorch/pytorch/pull/33316">#33316</a>)</li>
<li>Renamed TensorTypeId to DispatchKey (<ahref="https://github.com/pytorch/pytorch/pull/32154"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32154/hovercard">#32154</a>)</li>
<li>Throw an error if nbytes is called on a sparse tensor. (<ahref="https://github.com/pytorch/pytorch/pull/33897"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33897/hovercard">#33897</a>)</li>
</ul>
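<p>A short sketch of the new <code>torch::tensor</code> dtype behavior; the values are illustrative:</p>
<pre><code>#include <torch/torch.h>

void tensor_dtype_example() {
  auto a = torch::tensor({1, 2, 3});        // integer values -> torch::kLong
  auto b = torch::tensor({1.0, 2.0, 3.0});  // floating-point values -> default dtype (torch::kFloat unless changed)
}
</code></pre>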
<h2>JIT</h2>
<h3>Simple Executor Is Now On By Default</h3>
<p>The simple executor skips a number of fusion-related passes and
analyses that are very time-consuming. Disabling these optimizations
fixes pathologically long compilation times. Users who rely on GPU
fusion for their desired performance profile should turn the profiling
executor back on. We provide C++ and Python APIs to enable it:</p>
<ul>
<li>In Python, call <code>torch._C._jit_set_profiling_mode(True)</code> before calling your model for the first time.</li>
<li>In C++, add <code>#include <torch/csrc/jit/runtime/graph_executor.h></code> and set <code>getProfilingMode() = true</code> before invoking your model for the first time (see the sketch after this list).</li>
</ul>
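<p>A minimal C++ sketch of turning the profiling executor back on; the model file name and input shape are illustrative:</p>
<pre><code>#include <torch/script.h>
#include <torch/csrc/jit/runtime/graph_executor.h>

int main() {
  // Must be set before the module is run for the first time.
  torch::jit::getProfilingMode() = true;

  auto module = torch::jit::load("model.pt");  // illustrative path
  auto out = module.forward({torch::randn({1, 3, 224, 224})});
  return 0;
}
</code></pre>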
<h2>Quantization</h2>
<h3><strong>Remove qconfig_dict in top level eager mode quantization API</strong> (<ahref="https://github.com/pytorch/pytorch/pull/31972"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31972/hovercard">#31972</a>).</h3>
<p>In eager mode quantization, one needs to manually insert quant and
dequant stubs in a model to specify where activations are quantized.
Having a qconfig_dict that specifies the quantization configuration for
each module is not useful as one needs to manually modify the model with
quant/dequant stubs. The new API makes it explicit that the model needs
to be manually modified for quantization.</p>
<pre><code># previously qconfig_dict was an optional argument to prepare
# (sketch of the change; see the PR for the exact signatures)
model = prepare(model, qconfig_dict)

# now the qconfig is set on the manually stubbed model itself before calling prepare
model.qconfig = torch.quantization.default_qconfig
model = prepare(model)
</code></pre>
<h3>Functional API for Distributed Autograd and Distributed Optimizer</h3>
<p>Callers must now pass <code>context_id</code> to <code>torch.distributed.autograd.backward()</code> and <code>torch.distributed.optim.DistributedOptimizer.step()</code>. (<a href="https://github.com/pytorch/pytorch/pull/33711">#33711</a>)</p>
<pre><code># Before
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc
from torch import optim
from torch.distributed.optim import DistributedOptimizer

# Sketch: inside a dist_autograd context, with a DistributedOptimizer `dist_optim`,
# the calls change from
#     dist_autograd.backward([loss]);  dist_optim.step()
# to
#     dist_autograd.backward(context_id, [loss]);  dist_optim.step(context_id)
</code></pre>
<h3>Sending CUDA tensors over RPC is no longer supported</h3>
<p>The motivation is to prevent potential invalid-device errors when the
number of devices on the sender and the receiver does not match.
However, applications can always move CUDA tensors to CPU before sending
(<a href="https://github.com/pytorch/pytorch/pull/33604">#33604</a>).</p>
<h3>Added new functional autograd API (<ahref="https://github.com/pytorch/pytorch/pull/34066"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34066/hovercard">#34066</a>)</h3>
<ul>
<li>See Highlights for more details</li>
</ul>
<h3>New <code>__torch_function__</code> API Override Mechanism (<ahref="https://github.com/pytorch/pytorch/pull/30730"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30730/hovercard">#30730</a>, <ahref="https://github.com/pytorch/pytorch/pull/32194"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32194/hovercard">#32194</a>, <ahref="https://github.com/pytorch/pytorch/pull/32799"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32799/hovercard">#32799</a>, <ahref="https://github.com/pytorch/pytorch/pull/34240"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34240/hovercard">#34240</a>, <ahref="https://github.com/pytorch/pytorch/pull/34303"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34303/hovercard">#34303</a>).</h3>
<p>We introduced <code>__torch_function__</code>, an API override mechanism for subclassing <code>torch.Tensor</code> in Python. This is useful for creating custom objects that implement the <code>torch.*</code> APIs. It currently supports overriding most <code>torch.*</code> and <code>torch.nn.functional</code> APIs; we have also planned future support for subclassing <code>torch.Tensor</code> (see tracking issue <a href="https://github.com/pytorch/pytorch/issues/22402">#22402</a>).</p>
<h2>New Operators</h2>
<ul>
<li><code>torch.logical_and</code> and <code>torch.logical_or</code> operations added (<ahref="https://github.com/pytorch/pytorch/pull/30521"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30521/hovercard">#30521</a>).</li>
<li>Added PCA and SVD for low-rank matrices (<code>torch.pca_lowrank</code>, <code>torch.svd_lowrank</code>), and <code>torch.lobpcg</code> for positive-definite generalized eigenvalue problems (<a href="https://github.com/pytorch/pytorch/pull/34721">#34721</a>).</li>
<li><code>distributions.mixture_same_family</code> : Added support for mixture distributions (<ahref="https://github.com/pytorch/pytorch/pull/22742"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/22742/hovercard">#22742</a>, <ahref="https://github.com/pytorch/pytorch/pull/33408"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33408/hovercard">#33408</a>).</li>
</ul>
<h2>C++ API</h2>
<ul>
<li>Tensor indexing
<ul>
<li>Please see docs: <a href="https://pytorch.org/cppdocs/notes/tensor_indexing.html" rel="nofollow">https://pytorch.org/cppdocs/notes/tensor_indexing.html</a></li>
</ul>
</li>
<li>Operators
<ul>
<li>C++ API parity: <code>isinf</code> (<ahref="https://github.com/pytorch/pytorch/pull/31099"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31099/hovercard">#31099</a>).</li>
</ul>
</li>
<li>Autograd
<ul>
<li>Add <code>at::Tensor::retain_grad</code> API (<ahref="https://github.com/pytorch/pytorch/pull/33349"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33349/hovercard">#33349</a>).</li>
</ul>
</li>
<li>C++ extensions
<ul>
<li>Add option to use ninja to compile ahead-of-time <code>cpp_extensions</code> (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="553683326"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/32495"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32495/hovercard"href="https://github.com/pytorch/pytorch/pull/32495">#32495</a>, <ahref="https://github.com/pytorch/pytorch/pull/33084"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33084/hovercard">#33084</a>)</li>
<li>Added support for Pytorch C++ extensions to use HIP (<ahref="https://github.com/pytorch/pytorch/pull/32669"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32669/hovercard">#32669</a>).</li>
</ul>
</li>
</ul>
<h2>Distributed</h2>
<ul>
<li>Allow Python applications to create subclasses of the C++ <code>c10d.Store</code> using a pybind11 trampoline class (<a href="https://github.com/pytorch/pytorch/pull/30415">#30415</a>).</li>
</ul>
<h2>Mobile</h2>
<ul>
<li>Loading module from android asset (<ahref="https://github.com/pytorch/pytorch/pull/30378"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30378/hovercard">#30378</a>).</li>
<li>Torchscript print to logcat (<ahref="https://github.com/pytorch/pytorch/pull/31456"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31456/hovercard">#31456</a>).</li>
<li>Quantized H Tangent function (<ahref="https://github.com/pytorch/pytorch/pull/31031"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31031/hovercard">#31031</a>).</li>
<li>QNNPACK: Add support for dynamic quantization. (<ahref="https://github.com/pytorch/pytorch/pull/31896"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31896/hovercard">#31896</a>).</li>
<li>Add operator support for dynamic quant on mobile (<ahref="https://github.com/pytorch/pytorch/pull/32479"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32479/hovercard">#32479</a>).</li>
<li>FP16 dynamic quantized Linear (<ahref="https://github.com/pytorch/pytorch/pull/32331"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32331/hovercard">#32331</a>).</li>
<li>Add support for Dynamic LSTM quantization on Mobile (<ahref="https://github.com/pytorch/pytorch/pull/32757"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32757/hovercard">#32757</a>).</li>
<li>Quantized sigmoid function (<ahref="https://github.com/pytorch/pytorch/pull/31851"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31851/hovercard">#31851</a>).</li>
<li>Add the 3d avg pool for video related model (<ahref="https://github.com/pytorch/pytorch/pull/33339"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33339/hovercard">#33339</a>).</li>
<li>Add quantized ELU activation (<ahref="https://github.com/pytorch/pytorch/pull/34267"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34267/hovercard">#34267</a>).</li>
<li>Add the 3d upsample quantized op for video model (<ahref="https://github.com/pytorch/pytorch/pull/34594"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34594/hovercard">#34594</a>).</li>
<li>Add the quantized batch_norm3d and also batch_norm3d fused with relu operators (<ahref="https://github.com/pytorch/pytorch/pull/34702"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34702/hovercard">#34702</a>).</li>
<li>Add quantized implementation of hard sigmoid (<ahref="https://github.com/pytorch/pytorch/pull/34607"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34607/hovercard">#34607</a>).</li>
</ul>
<h2>RPC</h2>
<ul>
<li>[Experimental] Enable autograd profiler to work with RPC (<ahref="https://github.com/pytorch/pytorch/pull/31381"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31381/hovercard">#31381</a>, <ahref="https://github.com/pytorch/pytorch/pull/34398"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34398/hovercard">#34398</a>, <ahref="https://github.com/pytorch/pytorch/pull/30677"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30677/hovercard">#30677</a>, <ahref="https://github.com/pytorch/pytorch/pull/31346"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31346/hovercard">#31346</a>, <ahref="https://github.com/pytorch/pytorch/pull/31380"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31380/hovercard">#31380</a>).</li>
</ul>
<h2>AMD/ROCm</h2>
<ul>
<li><code>nn.RNN</code>: Ensure MIOpen is called on same stream as operator (<a href="https://github.com/pytorch/pytorch/pull/30672">#30672</a>)</li>
<li>Fixed asserts in CUDA kernels (<ahref="https://github.com/pytorch/pytorch/pull/31276"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31276/hovercard">#31276</a>, <ahref="https://github.com/pytorch/pytorch/pull/31297"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31297/hovercard">#31297</a>).</li>
<li>Enable BFloat16 support for convolutions (<ahref="https://github.com/pytorch/pytorch/pull/30948"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30948/hovercard">#30948</a>).</li>
<li>Install complete set of headers for ROCm build (<ahref="https://github.com/pytorch/pytorch/pull/32076"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32076/hovercard">#32076</a>).</li>
<li>Adjust <code>elementwise_kernel</code> settings on ROCm (<ahref="https://github.com/pytorch/pytorch/pull/32609"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32609/hovercard">#32609</a>).</li>
<li><code>nn.BatchNorm{1,2,3}d</code>: Use <code>C10_WARP_SIZE</code> to fix functionality on HIP vs CUDA for gradient computation (<ahref="https://github.com/pytorch/pytorch/pull/33098"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33098/hovercard">#33098</a>).</li>
<li>Enabled Bfloat16 type for activation functions and <code>batch_norm</code> (<ahref="https://github.com/pytorch/pytorch/pull/32065"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32065/hovercard">#32065</a>).</li>
<li>Added ability to enable/disable MIOpen at runtime (<ahref="https://github.com/pytorch/pytorch/pull/33118"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33118/hovercard">#33118</a>).</li>
<li>Enable BFloat16 type for pooling ops (<ahref="https://github.com/pytorch/pytorch/pull/34166"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34166/hovercard">#34166</a>).</li>
<li><code>torch.pdist</code>: improved precision by enabling double <code>__shfl_down</code> (<ahref="https://github.com/pytorch/pytorch/pull/34103"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34103/hovercard">#34103</a>).</li>
<li>Enabled BFloat16 type for loss functions and few misc ops required for resnet50 (<ahref="https://github.com/pytorch/pytorch/pull/34469"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34469/hovercard">#34469</a>).</li>
<li>Enabled BFloat16 type for EmbeddingBag, Index, and Sigmoid ops (<ahref="https://github.com/pytorch/pytorch/pull/34630"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34630/hovercard">#34630</a>).</li>
<li>Enabled 3D batch norms through MIOpen (<ahref="https://github.com/pytorch/pytorch/pull/33262"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33262/hovercard">#33262</a>).</li>
<li>Enabled 3D convolutions through ROCm (<ahref="https://github.com/pytorch/pytorch/pull/33067"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33067/hovercard">#33067</a>).</li>
<li><code>nn.RNN</code>: Check if weights need to be flattened (<ahref="https://github.com/pytorch/pytorch/pull/34265"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34265/hovercard">#34265</a>).</li>
</ul>
<h2>C++ API</h2>
<ul>
<li>NN modules / functionals
<ul>
<li>Allow skipping default arguments in module's forward method when module is used in <code>torch::nn::Sequential</code> (<ahref="https://github.com/pytorch/pytorch/pull/33027"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33027/hovercard">#33027</a>) (<ahref="https://github.com/pytorch/pytorch/pull/33718"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33718/hovercard">#33718</a>)</li>
<li>Make <code>torch::nn::Sequential::push_back(AnyModule)</code> methods public (<ahref="https://github.com/pytorch/pytorch/pull/34208"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34208/hovercard">#34208</a>).</li>
<li>Refactor RNN / GRU / LSTM layers to match Python API (<ahref="https://github.com/pytorch/pytorch/pull/34322"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34322/hovercard">#34322</a>).</li>
<li>For <code>Conv{1,2,3}d</code>, <code>padding_mode</code> now accepts <code>torch::kZeros</code> / <code>torch::kReflect</code> / <code>torch::kReplicate</code> / <code>torch::kCircular</code>, matching Python API behavior. (<ahref="https://github.com/pytorch/pytorch/pull/35023"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35023/hovercard">#35023</a>)</li>
<li>Fix <code>F::interpolate</code> and <code>torch::nn::Upsample</code> implementation to match Python API behavior (<ahref="https://github.com/pytorch/pytorch/pull/35025"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35025/hovercard">#35025</a>) (<ahref="https://github.com/pytorch/pytorch/pull/36274"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/36274/hovercard">#36274</a>)</li>
<li>All existing optimizers in the C++ API (Adagrad / SGD / Adam /
RMSprop / LBFGS) have the following changes to achieve parity with the
Python API: (<ahref="https://github.com/pytorch/pytorch/pull/29335"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29335/hovercard">#29335</a>) (<ahref="https://github.com/pytorch/pytorch/pull/30739"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30739/hovercard">#30739</a>) (<ahref="https://github.com/pytorch/pytorch/pull/32592"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32592/hovercard">#32592</a>) (<ahref="https://github.com/pytorch/pytorch/pull/33730"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33730/hovercard">#33730</a>) (<ahref="https://github.com/pytorch/pytorch/pull/33450"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33450/hovercard">#33450</a>) (<ahref="https://github.com/pytorch/pytorch/pull/34790"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34790/hovercard">#34790</a>) (<ahref="https://github.com/pytorch/pytorch/pull/34564"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34564/hovercard">#34564</a>) (<ahref="https://github.com/pytorch/pytorch/pull/34957"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34957/hovercard">#34957</a>) (<ahref="https://github.com/pytorch/pytorch/pull/35001"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35001/hovercard">#35001</a>) (<ahref="https://github.com/pytorch/pytorch/pull/36033"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/36033/hovercard">#36033</a>) (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="596762209"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/36245"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/36245/hovercard"href="https://github.com/pytorch/pytorch/pull/36245">#36245</a>)
<ul>
<li>step function implementation is changed to behave the same as Python equivalent</li>
<li>Constructor now accepts <code>std::vector<OptimizerParamGroup></code> as input</li>
<li><code>optimizer.add_param_group(...)</code> can be used to add a parameter group to an existing optimizer (see the sketch after this list)</li>
<li><code>optimizer.state()</code> should be used to access parameter state</li>
</ul>
</li>
</ul>
</li>
<li>autograd
<ul>
<li>Renamed <code>at::Tensor::base()</code> to <code>_base()</code>, matching Python API (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="564992741"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/33316"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33316/hovercard"href="https://github.com/pytorch/pytorch/pull/33316">#33316</a>)</li>
</ul>
</li>
</ul>
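<p>A minimal sketch of the new optimizer constructor and <code>add_param_group</code>; the tensor shapes and learning rate are illustrative:</p>
<pre><code>#include <torch/torch.h>

void optimizer_example() {
  auto w1 = torch::randn({5, 5}, torch::requires_grad());
  auto w2 = torch::randn({5, 5}, torch::requires_grad());

  // The constructor now takes a vector of OptimizerParamGroup, as in Python.
  std::vector<torch::optim::OptimizerParamGroup> groups{
      torch::optim::OptimizerParamGroup({w1})};
  torch::optim::SGD optimizer(groups, torch::optim::SGDOptions(/*lr=*/0.01));

  // Another group can be added to an existing optimizer.
  optimizer.add_param_group(torch::optim::OptimizerParamGroup({w2}));
}
</code></pre>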
<h2>Distributed</h2>
<ul>
<li>Allow TCPStore to pick a port to bind to (<ahref="https://github.com/pytorch/pytorch/pull/31674"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31674/hovercard">#31674</a>).</li>
<li>Enhance NCCL watchdog to actively abort communicators for timed out ops (<ahref="https://github.com/pytorch/pytorch/pull/32338"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32338/hovercard">#32338</a>).</li>
</ul>
<h2>Mobile</h2>
<ul>
<li>Expose setNumThreads to android api (<a href="https://github.com/pytorch/pytorch/pull/31033">#31033</a>).</li>
<li>remove unused SparseCPUType from mobile build (<ahref="https://github.com/pytorch/pytorch/pull/33517"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33517/hovercard">#33517</a>).</li>
<li>make sure mobile build work with dynamic dispatch (<ahref="https://github.com/pytorch/pytorch/pull/34038"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34038/hovercard">#34038</a>).</li>
<li>support for custom mobile build with dynamic dispatch (<ahref="https://github.com/pytorch/pytorch/pull/34055"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34055/hovercard">#34055</a>).</li>
<li>Add watchOS support (<ahref="https://github.com/pytorch/pytorch/pull/33318"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33318/hovercard">#33318</a>).</li>
<li>speed_benchmark_torch switch to log latency from dataset level to row level (<ahref="https://github.com/pytorch/pytorch/pull/34598"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34598/hovercard">#34598</a>).</li>
</ul>
<h2>ONNX</h2>
<h3><strong>Exporting More Torch Operators to ONNX</strong></h3>
<p>In PyTorch 1.5, we have added support for 10 additional operators and
also enhanced support for another set of 10+ existing operators. We
have also added support for exporting large models (> 2GB) to ONNX.
Additionally, we have made enhancements and optimizations to the export
of ScriptModules and will continue to do that in the next release. We
have also made improvements to the custom op export experience.</p>
<ul>
<li>Export dynamic unbind, split and getitem (<ahref="https://github.com/pytorch/pytorch/pull/29136"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29136/hovercard">#29136</a>).</li>
<li>Export bitwise_not for bool (<ahref="https://github.com/pytorch/pytorch/pull/28439"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28439/hovercard">#28439</a>).</li>
<li>Export logsoftmax with dim != -1 (<ahref="https://github.com/pytorch/pytorch/pull/30433"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30433/hovercard">#30433</a>).</li>
<li>Export aten::copy_ and aten::index_put to ONNX opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/26941"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26941/hovercard">#26941</a>).</li>
<li>Export bool type index mask (<ahref="https://github.com/pytorch/pytorch/pull/32445"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32445/hovercard">#32445</a>).</li>
<li>Export split with list of sizes (<ahref="https://github.com/pytorch/pytorch/pull/33161"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33161/hovercard">#33161</a>).</li>
<li>Export scalar tensor for split (<ahref="https://github.com/pytorch/pytorch/pull/32493"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32493/hovercard">#32493</a>).</li>
<li>Export flatten to accept negative indices in opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/30751"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30751/hovercard">#30751</a>).</li>
<li>Export sort with negative axes (<ahref="https://github.com/pytorch/pytorch/pull/31971"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31971/hovercard">#31971</a>).</li>
<li>Export Interpolate to support scale (<a href="https://github.com/pytorch/pytorch/pull/28324">#28324</a>, <a href="https://github.com/pytorch/pytorch/pull/31526">#31526</a>, <a href="https://github.com/pytorch/pytorch/pull/32554">#32554</a>).</li>
</ul>
<h3><strong>Enhancing the Support for ScriptModule</strong></h3>
<ul>
<li>Fixed access to element in size tensor for scripting (<ahref="https://github.com/pytorch/pytorch/pull/32652"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32652/hovercard">#32652</a>).</li>
<li>Export Conv in TorchScript module (<ahref="https://github.com/pytorch/pytorch/pull/30618"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30618/hovercard">#30618</a>).</li>
<li>Export Dim operation in TorchScript module (<ahref="https://github.com/pytorch/pytorch/pull/31928"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31928/hovercard">#31928</a>).</li>
<li>Export randnlike in TorchScript module (<ahref="https://github.com/pytorch/pytorch/pull/32830"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32830/hovercard">#32830</a>).</li>
<li>Partially support tensor lists in loop/concat/stack (<ahref="https://github.com/pytorch/pytorch/pull/30126"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30126/hovercard">#30126</a>)</li>
<li>Adding ONNX large model export support in exporter (<ahref="https://github.com/pytorch/pytorch/pull/33062"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33062/hovercard">#33062</a>).</li>
<li>Extend op registration (<ahref="https://github.com/pytorch/pytorch/pull/32943"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32943/hovercard">#32943</a>).</li>
<li>Support op registration if name starts with underscore (<ahref="https://github.com/pytorch/pytorch/pull/32017"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32017/hovercard">#32017</a>).</li>
<li>Added constant folding for ONNX mul, div, sqrt ops (<a href="https://github.com/pytorch/pytorch/pull/32077">#32077</a>).</li>
<li>Enable constant folding for Reshape (<ahref="https://github.com/pytorch/pytorch/pull/31054"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31054/hovercard">#31054</a>).</li>
</ul>
<h3><strong>Adding Utility Functions and Refactoring</strong></h3>
<ul>
<li>Added ONNX model checker to ONNX export (<ahref="https://github.com/pytorch/pytorch/pull/32298"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32298/hovercard">#32298</a>).</li>
<li>Upgrade exported ONNX IR version to 6 (<ahref="https://github.com/pytorch/pytorch/pull/31025"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31025/hovercard">#31025</a>).</li>
<li>Provide names for operator nodes in ONNX exported graph (<ahref="https://github.com/pytorch/pytorch/pull/27342"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27342/hovercard">#27342</a>).</li>
<li>Update ONNX landing page since 1.3 (<ahref="https://github.com/pytorch/pytorch/pull/32805"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32805/hovercard">#32805</a>).</li>
<li>Turn ONNX_ML into a proper build option (<ahref="https://github.com/pytorch/pytorch/pull/33424"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33424/hovercard">#33424</a>).</li>
</ul>
<h2>Operator Benchmark</h2>
<ul>
<li>Added small input shapes to test operator overhead (<ahref="https://github.com/pytorch/pytorch/pull/30617"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30617/hovercard">#30617</a>).</li>
<li>Added <code>binary_test</code> to benchmark binary ops (<ahref="https://github.com/pytorch/pytorch/pull/31326"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31326/hovercard">#31326</a>).</li>
<li>Removed option to wipe cache because it did not help with variance (<ahref="https://github.com/pytorch/pytorch/pull/31334"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31334/hovercard">#31334</a>).</li>
</ul>
<h2>Quantization</h2>
<ul>
<li>Guard against copying from quantized Tensor to non-quantized Tensor (<a href="https://github.com/pytorch/pytorch/pull/29660">#29660</a>).</li>
<li>Add assert for min, max, qmin, qmax for ChooseQuantizationParams (<ahref="https://github.com/pytorch/pytorch/pull/32739"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32739/hovercard">#32739</a>).</li>
<li>Support broadcast for quantized mul kernel (<ahref="https://github.com/pytorch/pytorch/pull/30442"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30442/hovercard">#30442</a>).</li>
<li>Make FakeQuant use <code>REGISTER_DISPATCH</code> (<ahref="https://github.com/pytorch/pytorch/pull/33682"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33682/hovercard">#33682</a>).</li>
<li>Set alias analysis kind to <code>FROM_SCHEMA</code> for qadd, qmul, qclamp, qconcat (<ahref="https://github.com/pytorch/pytorch/pull/33359"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33359/hovercard">#33359</a>).</li>
<li>Migrate <code>fake_quant_slice</code> to TensorIterator (<ahref="https://github.com/pytorch/pytorch/pull/33744"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33744/hovercard">#33744</a>).</li>
<li>Parallelize quantize and dequantize (<ahref="https://github.com/pytorch/pytorch/pull/33765"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33765/hovercard">#33765</a>).</li>
<li>Make FP16 RNN use new prepack op (<ahref="https://github.com/pytorch/pytorch/pull/34339"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34339/hovercard">#34339</a>).</li>
<li>Refactor QAT Conv module for better extensibility (<ahref="https://github.com/pytorch/pytorch/pull/30362"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30362/hovercard">#30362</a>).</li>
<li>Use non-inplace for insert observer pass (<ahref="https://github.com/pytorch/pytorch/pull/34190"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34190/hovercard">#34190</a>).</li>
</ul>
<h2>RPC</h2>
<ul>
<li>Add default arguments for <code>init_method</code> (<ahref="https://github.com/pytorch/pytorch/pull/30208"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30208/hovercard">#30208</a>).</li>
<li>By default ignore RRef leaks during shutdown (<ahref="https://github.com/pytorch/pytorch/pull/30217"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30217/hovercard">#30217</a>).</li>
<li>Robustify <code>rpc_agent</code> handlers with generic Future (<ahref="https://github.com/pytorch/pytorch/pull/31224"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31224/hovercard">#31224</a>).</li>
<li>Fix error message in incorrect <code>rref.localValue()</code> call (<ahref="https://github.com/pytorch/pytorch/pull/31199"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31199/hovercard">#31199</a>).</li>
<li>Add <code>RpcAgent::getWorkerInfos()</code> API to return all <code>WorkerInfo</code>s in the group (<a href="https://github.com/pytorch/pytorch/pull/30241">#30241</a>).</li>
<li>Add local shutdown to process group agent (<ahref="https://github.com/pytorch/pytorch/pull/30330"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30330/hovercard">#30330</a>).</li>
<li>Add <code>RRef.str()</code> API to return a string representation of the RRef (<ahref="https://github.com/pytorch/pytorch/pull/30609"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30609/hovercard">#30609</a>).</li>
<li>Adding Debug Info for RRef Context (<ahref="https://github.com/pytorch/pytorch/pull/30610"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30610/hovercard">#30610</a>).</li>
<li>Add <code>get_metrics</code> and <code>get_debug_info</code> to RPC agent (<ahref="https://github.com/pytorch/pytorch/pull/30833"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30833/hovercard">#30833</a>).</li>
<li>Adding debugging metrics to process group agent (<ahref="https://github.com/pytorch/pytorch/pull/30884"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30884/hovercard">#30884</a>).</li>
<li>Add glue code to collect debug info from all components (<ahref="https://github.com/pytorch/pytorch/pull/30888"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30888/hovercard">#30888</a>).</li>
<li>Make RRef leak detection always print a warning log (<ahref="https://github.com/pytorch/pytorch/pull/31922"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31922/hovercard">#31922</a>).</li>
<li>Allow multiple backward passes to accumulate gradients. (<ahref="https://github.com/pytorch/pytorch/pull/32506"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32506/hovercard">#32506</a>).</li>
<li>Allow RRef local creation with IValue objects (<ahref="https://github.com/pytorch/pytorch/pull/33263"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33263/hovercard">#33263</a>).</li>
<li>Improve ProcessGroup <code>RpcBackendOptions</code> Constructor API (<ahref="https://github.com/pytorch/pytorch/pull/34081"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34081/hovercard">#34081</a>).</li>
<li>Enhanced Error Reporting in Dist Autograd/RPC (<ahref="https://github.com/pytorch/pytorch/pull/34179"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34179/hovercard">#34179</a>).</li>
<li>Delete all user forks tracked in <code>RRefContext</code> before graceful shutdown (<ahref="https://github.com/pytorch/pytorch/pull/31893"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31893/hovercard">#31893</a>).</li>
<li>Best-effort Error Detection for Using Deleted UserRRefs (<ahref="https://github.com/pytorch/pytorch/pull/34673"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34673/hovercard">#34673</a>).</li>
<li>Don't run user function until all UserRRefs in the args are confirmed (<ahref="https://github.com/pytorch/pytorch/pull/34497"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34497/hovercard">#34497</a>).</li>
<li>Support using self as the destination in <code>rpc.remote</code> for builtin operators (<ahref="https://github.com/pytorch/pytorch/pull/34931"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34931/hovercard">#34931</a>).</li>
<li>Add debug info API for distributed autograd. (<ahref="https://github.com/pytorch/pytorch/pull/30642"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30642/hovercard">#30642</a>).</li>
<li>Propagate errors in <code>clearAndWaitForOutstandingRpcsAsync</code>. (<ahref="https://github.com/pytorch/pytorch/pull/32952"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32952/hovercard">#32952</a>).</li>
</ul>
<h2>Type Hints</h2>
<ul>
<li>DataLoader <code>default_collate</code> type hint added (<ahref="https://github.com/pytorch/pytorch/pull/28935"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28935/hovercard">#28935</a>).</li>
<li><code>Tensor.rsub, Tensor.rpow, Tensor.rtruediv, Tensor.map_</code> type hints were added (<ahref="https://github.com/pytorch/pytorch/pull/30576"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30576/hovercard">#30576</a>).</li>
<li><code>torch.optim</code>: added more missing type hints (<ahref="https://github.com/pytorch/pytorch/pull/31130"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31130/hovercard">#31130</a>).</li>
<li><code>torch.nn.Parameter</code> constructor type hint was fixed (<ahref="https://github.com/pytorch/pytorch/pull/32617"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32617/hovercard">#32617</a>).</li>
<li><code>nn.MultiheadAttention</code>, <code>nn.Transformer</code>: added type hints (<ahref="https://github.com/pytorch/pytorch/pull/28396"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28396/hovercard">#28396</a>).</li>
<li><code>torch.optim.LambdaLR</code> constructor type hint was fixed (<ahref="https://github.com/pytorch/pytorch/pull/33271"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33271/hovercard">#33271</a>).</li>
<li><code>torch.optim</code>: added missing default value for <code>LRScheduler.step()</code> (<ahref="https://github.com/pytorch/pytorch/pull/32411"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32411/hovercard">#32411</a>).</li>
<li>Make type of <code>Tensor.type()</code> more specific (<ahref="https://github.com/pytorch/pytorch/pull/32353"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32353/hovercard">#32353</a>).</li>
<li><code>torch.optim.optimizer.Optimizer</code> type hints were fixed (<ahref="https://github.com/pytorch/pytorch/pull/32900"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32900/hovercard">#32900</a>).</li>
<li><code>optim.AdamW</code> type hints were fixed (<ahref="https://github.com/pytorch/pytorch/pull/34299"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34299/hovercard">#34299</a>).</li>
<li><code>torch.utils.data.Sampler</code> subclasses type hints were added (<ahref="https://github.com/pytorch/pytorch/pull/33679"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33679/hovercard">#33679</a>).</li>
<li><code>nn.Sequential</code>, <code>nn.ModuleList</code>, <code>nn.ParameterList</code>, <code>nn.ParameterDict</code> type hints were fixed (<ahref="https://github.com/pytorch/pytorch/pull/33686"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33686/hovercard">#33686</a>).</li>
<li><code>Tensor.bfloat16()</code> type hint was added (<ahref="https://github.com/pytorch/pytorch/pull/33747"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33747/hovercard">#33747</a>).</li>
<li>Binary operator type hints were fixed (<ahref="https://github.com/pytorch/pytorch/pull/33748"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33748/hovercard">#33748</a>).</li>
<li><code>torch.bfloat16</code>, <code>nn.Module.training</code>, <code>Tensor.cuda</code>, and 10s of other type hints added (<ahref="https://github.com/pytorch/pytorch/pull/33762"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33762/hovercard">#33762</a>).</li>
<li><code>torch.add</code> type hint was fixed (<a href="https://github.com/pytorch/pytorch/pull/33935">#33935</a>).</li>
<li><code>Tensor.shape</code> type hint was fixed (<ahref="https://github.com/pytorch/pytorch/pull/34595"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34595/hovercard">#34595</a>).</li>
<li><code>Tensor.__radd__</code> type hint was fixed (<ahref="https://github.com/pytorch/pytorch/pull/35231"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35231/hovercard">#35231</a>)</li>
</ul>
<h2>Other</h2>
<ul>
<li><code>autograd.detect_anomaly</code>: added support for Sparse Tensors (<ahref="https://github.com/pytorch/pytorch/pull/29803"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29803/hovercard">#29803</a>).</li>
<li><code>autograd.detect_anomaly</code>: Error messages now print the current Node name (<ahref="https://github.com/pytorch/pytorch/pull/33875"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33875/hovercard">#33875</a>).</li>
<li><code>autograd.profiler</code>: added better error message when crashing while profiling multi-worker DataLoader (<ahref="https://github.com/pytorch/pytorch/pull/31473"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31473/hovercard">#31473</a>).</li>
<li><code>autograd.profiler</code> Enable using <code>torch.autograd.profiler.record_function</code> as decorator (<ahref="https://github.com/pytorch/pytorch/pull/30861"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30861/hovercard">#30861</a>).</li>
<li><code>autograd.profiler</code> Speed up <code>export_chrome_trace</code> by up to 4x (<ahref="https://github.com/pytorch/pytorch/pull/30724"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30724/hovercard">#30724</a>).</li>
<li><code>torch.autograd</code>: added better error message when attempting to fork (<ahref="https://github.com/pytorch/pytorch/pull/33885"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33885/hovercard">#33885</a>).</li>
<li><code>torch.cuda.memory.caching_allocator_alloc</code>, <code>torch.cuda.memory.caching_allocator_delete</code> exposed in Python API (<ahref="https://github.com/pytorch/pytorch/pull/33860"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33860/hovercard">#33860</a>).</li>
<li><code>torch.roll</code>: added bool tensor support (<ahref="https://github.com/pytorch/pytorch/pull/31194"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31194/hovercard">#31194</a>).</li>
<li><code>torch.flip</code>: added support for bool tensors (<ahref="https://github.com/pytorch/pytorch/pull/31267"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31267/hovercard">#31267</a>).</li>
<li><code>torch.equal</code>: added support for bfloat16 CPU scalar types (<ahref="https://github.com/pytorch/pytorch/pull/30817"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30817/hovercard">#30817</a>).</li>
<li><code>torch.save</code>, <code>torch.load</code>: added error message for minimum dill version support (<ahref="https://github.com/pytorch/pytorch/pull/30985"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30985/hovercard">#30985</a>).</li>
<li><code>torch.diagonal</code>: added named tensor support (<a href="https://github.com/pytorch/pytorch/pull/30193">#30193</a>).</li>
<li><code>torch.linspace</code>: added support for integral types on CPU (<ahref="https://github.com/pytorch/pytorch/pull/32218"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32218/hovercard">#32218</a>).</li>
<li><code>torch.eig</code>: Added autograd support in the case where eigenvalues are real (<ahref="https://github.com/pytorch/pytorch/pull/33090"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33090/hovercard">#33090</a>).</li>
<li><code>torch.no_grad</code>, <code>torch.enable_grad</code>: added support for decorating generator functions (<ahref="https://github.com/pytorch/pytorch/pull/31792"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31792/hovercard">#31792</a>).</li>
<li><code>torch.narrow</code>: added Tensor overload for <code>start</code> (<ahref="https://github.com/pytorch/pytorch/pull/34317"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34317/hovercard">#34317</a>).</li>
<li><code>Tensor.random_</code>: enabled support for half on CPU (<ahref="https://github.com/pytorch/pytorch/pull/34030"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34030/hovercard">#34030</a>).</li>
<li><code>Tensor.grad</code>: added warnings when accessing it if it won't be populated for known reasons (<ahref="https://github.com/pytorch/pytorch/pull/30531"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30531/hovercard">#30531</a>).</li>
<li><code>nn.functional.max_pool{1,2,3}d</code>: added named tensor support (<ahref="https://github.com/pytorch/pytorch/pull/31669"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31669/hovercard">#31669</a>).</li>
<li><code>nn.Module.load_state_dict</code>: Include the contents of the exception in error messages (<ahref="https://github.com/pytorch/pytorch/pull/32693"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32693/hovercard">#32693</a>).</li>
<li><code>nn.MultiheadAttention</code>: add support for 3D attention mask (<ahref="https://github.com/pytorch/pytorch/pull/31996"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31996/hovercard">#31996</a>).</li>
<li><code>nn.MSELoss</code> : Added performance warning for using CPU Half (<ahref="https://github.com/pytorch/pytorch/pull/33021"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33021/hovercard">#33021</a>).</li>
<li><code>nn.ModuleList</code>, <code>nn.ParameterDict</code>: added more descriptive error messages when attempting to call these like Modules (<a href="https://github.com/pytorch/pytorch/pull/29991">#29991</a>).</li>
<li><code>nn.init.dirac_</code>: Added <code>groups</code> option for compatibility with initializing group convolutions (<ahref="https://github.com/pytorch/pytorch/pull/32825"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32825/hovercard">#32825</a>).</li>
<li>Added error message to indicate that reduction operations are not supported for dim >= 64 (<ahref="https://github.com/pytorch/pytorch/pull/31476"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31476/hovercard">#31476</a>).</li>
<li>Type Promotion: added supports for sparse tensors and arithmetic operations (<ahref="https://github.com/pytorch/pytorch/pull/30429"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30429/hovercard">#30429</a>).</li>
<li>Enabled indexing for bfloat16 tensors (<ahref="https://github.com/pytorch/pytorch/pull/31692"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31692/hovercard">#31692</a>).</li>
<li>Add 64-bit indexing support for CUDA Tensors (<ahref="https://github.com/pytorch/pytorch/pull/33405"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33405/hovercard">#33405</a>).</li>
<li>Added warning when converting a read-only NumPy array to <code>torch.Tensor</code> (<ahref="https://github.com/pytorch/pytorch/pull/33615"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33615/hovercard">#33615</a>).</li>
<li>Set rpath for JNI library on Mac (<ahref="https://github.com/pytorch/pytorch/pull/32247"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32247/hovercard">#32247</a>).</li>
<li>Updated MAGMA to 2.5.2 for Windows (<ahref="https://github.com/pytorch/pytorch/pull/30513"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30513/hovercard">#30513</a>, <ahref="https://github.com/pytorch/pytorch/pull/34205"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34205/hovercard">#34205</a>).</li>
<li>Marked PyTorch incompatible with Python 3.6.0 (<a href="https://github.com/pytorch/pytorch/pull/34724">#34724</a>).</li>
<li>Improved dll loading logic on Windows (<ahref="https://github.com/pytorch/pytorch/pull/33856"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33856/hovercard">#33856</a>).</li>
<li>Error out if legacy <code>Tensor.new </code> is called on alternate layouts or dtypes (<ahref="https://github.com/pytorch/pytorch/pull/31485"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31485/hovercard">#31485</a>).</li>
</ul>
<h2>C++ API</h2>
<ul>
<li>NN modules / functionals
<ul>
<li><code>output_ratio</code> for <code>FractionalMaxPool{2,3}d</code> module and <code>fractional_max_pool{2,3}d</code> functional should accept double as data type (<a href="https://github.com/pytorch/pytorch/pull/33304">#33304</a>)</li>
<li>For <code>AdaptiveAvgPool{2,3}d </code>and <code>AdaptiveMaxPool{2,3}d</code>, <code>output_size</code> is changed to accept <code>c10::nullopt</code> in its elements, matching Python API behavior. (<ahref="https://github.com/pytorch/pytorch/pull/35022"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35022/hovercard">#35022</a>)</li>
<li>Fix bug in <code>fractional_max_pool3d_with_indices</code> implementation (<ahref="https://github.com/pytorch/pytorch/pull/35024"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35024/hovercard">#35024</a>)</li>
<li>Remove <code>namespace F = torch::nn::functional</code> from torch/nn/modules/batchnorm.h, so that people don't have to use <code>F</code> to alias <code>torch::nn::functional</code> if they don't want to (<a href="https://github.com/pytorch/pytorch/pull/30684">#30684</a>)</li>
</ul>
</li>
<li>autograd
<ul>
<li>For <code>AutogradContext</code>, <code>get_dirty()</code> is removed and <code>get_and_bump_dirty()</code> is added, and the latter always bumps the version counter of the returned tensors (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="561342654"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/33068"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33068/hovercard"href="https://github.com/pytorch/pytorch/pull/33068">#33068</a>)</li>
<li>Fix allow_unused checking for C++ API (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="573471032"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/34035"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34035/hovercard"href="https://github.com/pytorch/pytorch/pull/34035">#34035</a>)</li>
<li>Remove <code>using namespace torch::autograd</code> from <code>torch/csrc/api/include/torch/nn/modules/_functions.h</code> (<ahref="https://github.com/pytorch/pytorch/pull/34423"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34423/hovercard">#34423</a>)</li>
</ul>
</li>
<li>Operators
<ul>
<li><code>torch::tensor(floating-point values)</code> will always produce tensor of default dtype, and <code>torch::tensor(integer values)</code> will always produce tensor of <code>torch::kLong</code> dtype, matching Python API behavior (<ahref="https://github.com/pytorch/pytorch/pull/32367"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32367/hovercard">#32367</a>)</li>
<li>Fix <code>torch::allclose</code> to handle <code>std::numeric_limits::lowest()</code> for integral types (<ahref="https://github.com/pytorch/pytorch/pull/32978"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32978/hovercard">#32978</a>)</li>
<li>Switch <code>torch::empty_like</code> to use <code>merge_in</code> to process TensorOptions (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="567761104"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/33505"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33505/hovercard"href="https://github.com/pytorch/pytorch/pull/33505">#33505</a>)</li>
</ul>
</li>
</ul>
<h2>Distributed</h2>
<ul>
<li>Allow DDP to detect globally unused parameters (<ahref="https://github.com/pytorch/pytorch/pull/28883"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28883/hovercard">#28883</a>).</li>
<li>Accept url query when <code>rank</code> or <code>world_size</code> is specified in Process Group <code>init_method</code> URL (<ahref="https://github.com/pytorch/pytorch/pull/32016"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32016/hovercard">#32016</a>).</li>
<li>Add ability to abort NCCL communicators from the store. (<ahref="https://github.com/pytorch/pytorch/pull/32895"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32895/hovercard">#32895</a>).</li>
<li>Fix timeout support when initializing process group with TCP store (<ahref="https://github.com/pytorch/pytorch/pull/33434"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33434/hovercard">#33434</a>).</li>
<li>Abort NCCL communicators before throwing operation timed out (<ahref="https://github.com/pytorch/pytorch/pull/31128"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31128/hovercard">#31128</a>).</li>
<li>Fix logging for aborted communicators in ProcessGroupNCCL (<ahref="https://github.com/pytorch/pytorch/pull/33147"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33147/hovercard">#33147</a>).</li>
<li>Fix handling of replica parameters in DataParallel (<ahref="https://github.com/pytorch/pytorch/pull/33907"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33907/hovercard">#33907</a>).</li>
<li>Specify <code>requires_grad</code> for Parameter replica so it's not always set to True by default (<ahref="https://github.com/pytorch/pytorch/pull/32356"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32356/hovercard">#32356</a>)</li>
<li>Put sparse <code>allreduce</code> results to input tensors (<ahref="https://github.com/pytorch/pytorch/pull/32226"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32226/hovercard">#32226</a>)</li>
<li>Issue a warning when <code>zero_grad</code> is used in <code>DataParallel</code> (<ahref="https://github.com/pytorch/pytorch/pull/33064"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33064/hovercard">#33064</a>)</li>
</ul>
<h2>JIT</h2>
<ul>
<li>TorchScript compilation fixed for (<ahref="https://github.com/pytorch/pytorch/pull/33783"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33783/hovercard">#33783</a>):
<ul>
<li><code>torch.stft</code></li>
<li><code>torch.lu</code></li>
<li><code>torch.lu_unpack</code></li>
<li><code>torch.cdist</code></li>
<li><code>torch.norm</code></li>
</ul>
</li>
<li><code>tensor.tolist()</code> compilation is now supported; it requires an output type annotation (see the sketch after this list) (<ahref="https://github.com/pytorch/pytorch/pull/34554"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34554/hovercard">#33472</a>)</li>
<li><code>torch.rand_like</code> and other <code>_like</code> constructors no longer require additional arguments in TorchScript</li>
<li>Compilation for <code>nn.Module</code> APIs added <ahref="https://github.com/pytorch/pytorch/pull/29495"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29495/hovercard">(#29495)</a>:
<ul>
<li><code>children</code></li>
<li><code>named_children</code></li>
<li><code>modules</code></li>
<li><code>named_modules</code></li>
</ul>
</li>
<li>Support for ModuleList indexing with an integer literal (<ahref="https://github.com/pytorch/pytorch/pull/29236/"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29236/hovercard">#29236</a>)</li>
<li>Fixed flipped outputs for <code>PackedSequence</code> (<ahref="https://github.com/pytorch/pytorch/pull/32955"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32955/hovercard">#32955</a>)</li>
<li>Support <code>index</code> and <code>type</code> properties on <code>Device</code> (<ahref="https://github.com/pytorch/pytorch/pull/32953"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32953/hovercard">#32953</a>)</li>
<li>Fix augmented assignment to non-tensor attributes (<ahref="https://github.com/pytorch/pytorch/pull/32993"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32993/hovercard">#32993</a>)</li>
<li>Fixed type resolution for function arguments (<ahref="https://github.com/pytorch/pytorch/pull/29623"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29623/hovercard">#29623</a>)
<ul>
<li>Previously we resolved types by parsing their names directly, but
now TorchScript uses the value of the type directly from Python</li>
<li>This allows types like <code>torch.device</code> to be used</li>
</ul>
</li>
<li>Support <code>len</code> on tuples containing different types (<ahref="https://github.com/pytorch/pytorch/pull/35768"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35768/hovercard">#35768</a>)</li>
</ul>
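<p>As referenced above, a minimal sketch of scripting <code>tensor.tolist()</code>; the return type annotation is what tells the compiler the element type and nesting depth (the function name here is arbitrary):</p>
<pre><code>from typing import List

import torch

@torch.jit.script
def to_list(x: torch.Tensor) -> List[int]:
    # The List[int] annotation is required so TorchScript knows the
    # element type and dimensionality of the returned list.
    return x.tolist()

print(to_list(torch.tensor([1, 2, 3])))  # [1, 2, 3]
</code></pre>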
<h2>Mobile</h2>
<ul>
<li>Fix exception message in Java Tensor (<ahref="https://github.com/pytorch/pytorch/pull/30205"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30205/hovercard">#30205</a>).</li>
<li>Fix crashes caused by C++ being unable to find the Java class through JNI (<ahref="https://github.com/pytorch/pytorch/pull/30390"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30390/hovercard">#30390</a>).</li>
<li>Add @DoNotStrip to nativeNewTensor method. (<ahref="https://github.com/pytorch/pytorch/pull/30472"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30472/hovercard">#30472</a>).</li>
<li>GenericDict/List type use unshapedType() (<ahref="https://github.com/pytorch/pytorch/pull/30428"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30428/hovercard">#30428</a>).</li>
<li>Support tensors with a storage offset in Java (<ahref="https://github.com/pytorch/pytorch/pull/31584"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31584/hovercard">#31584</a>).</li>
<li>Fix SIGABORT caused by double exception in PyTorchStreamReader when file not found. (<ahref="https://github.com/pytorch/pytorch/pull/33243"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33243/hovercard">#33243</a>).</li>
<li>Fix for handling batch size 0. (<ahref="https://github.com/pytorch/pytorch/pull/34599"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34599/hovercard">#34599</a>).</li>
<li>Fixed AutoGradMode/AutoNonVariableTypeMode uses for mobile callsites.</li>
<li>Use <code>gettimeofday</code> on iOS (<ahref="https://github.com/pytorch/pytorch/pull/30361"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30361/hovercard">#30361</a>).</li>
</ul>
<h2>ONNX</h2>
<ul>
<li>Fix <code>weight_norm</code> export for dim=0 (<ahref="https://github.com/pytorch/pytorch/pull/31015"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31015/hovercard">#31015</a>).</li>
<li>Fix for constant folding flaky tests (<ahref="https://github.com/pytorch/pytorch/pull/32546"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32546/hovercard">#32546</a>).</li>
<li>Fix export for avg_pool with default stride (<ahref="https://github.com/pytorch/pytorch/pull/33017"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33017/hovercard">#33017</a>).</li>
<li>Fix ONNX CI by moving test data to aws (<ahref="https://github.com/pytorch/pytorch/pull/33200"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33200/hovercard">#33200</a>).</li>
<li>Fix for random generators export (<ahref="https://github.com/pytorch/pytorch/pull/33789"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33789/hovercard">#33789</a>).</li>
<li>Fix export of index_put (<ahref="https://github.com/pytorch/pytorch/pull/31552"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31552/hovercard">#31552</a>).</li>
<li>Fix for expand -1 dim value (<ahref="https://github.com/pytorch/pytorch/pull/34069"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34069/hovercard">#34069</a>).</li>
<li>Reduce ONNX test time on CI (<ahref="https://github.com/pytorch/pytorch/pull/33242"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33242/hovercard">#33242</a>).</li>
<li>ONNX Error Message on Missing Op (<ahref="https://github.com/pytorch/pytorch/pull/33593"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33593/hovercard">#33593</a>).</li>
<li>Fix exporting <code>copy_</code> with index as tensor input (<ahref="https://github.com/pytorch/pytorch/pull/32801"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32801/hovercard">#32801</a>).</li>
<li>Fix export of <code>rand_like</code> as well (<ahref="https://github.com/pytorch/pytorch/pull/33095"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33095/hovercard">#33095</a>).</li>
<li>Added torchvision tests as part of ORT tests (<ahref="https://github.com/pytorch/pytorch/pull/31835"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31835/hovercard">#31835</a>).</li>
<li>Remove non-ascii character from <code>torch/onnx/symbolic_opset11.py</code> (<ahref="https://github.com/pytorch/pytorch/pull/31814"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31814/hovercard">#31814</a>).</li>
<li>Add flag to enable script tests (<ahref="https://github.com/pytorch/pytorch/pull/32654"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32654/hovercard">#32654</a>).</li>
<li>Skip same tests in ONNX Python3 CI as in Python2 (<ahref="https://github.com/pytorch/pytorch/pull/31827"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31827/hovercard">#31827</a>).</li>
<li>Fixed <code>aten::size</code> for opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/35984"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35984/hovercard">#35984</a>)</li>
</ul>
<h2>Quantization</h2>
<ul>
<li>Bug fix: Handle missing keys in observer state dict during load (<ahref="https://github.com/pytorch/pytorch/pull/30357"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30357/hovercard">#30357</a>).</li>
<li>Fix BC for quantized linear (<ahref="https://github.com/pytorch/pytorch/pull/30481"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30481/hovercard">#30481</a>).</li>
<li>Fix mapping white list to avoid attaching qconfig for DeQuantStub (<ahref="https://github.com/pytorch/pytorch/pull/30636"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30636/hovercard">#30636</a>).</li>
<li>Fix default instantiation of dynamic quantized LSTM (<ahref="https://github.com/pytorch/pytorch/pull/31433"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31433/hovercard">#31433</a>).</li>
<li>Use default scale/zero_point in fake_quantize module instead of None (<ahref="https://github.com/pytorch/pytorch/pull/32318"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32318/hovercard">#32318</a>).</li>
<li>Fix ASAN / potential segfault in quantized Tensor memory allocations. (<ahref="https://github.com/pytorch/pytorch/pull/29882"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29882/hovercard">#29882</a>).</li>
<li>Don't serialize None values in observer (<ahref="https://github.com/pytorch/pytorch/pull/32733"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32733/hovercard">#32733</a>).</li>
<li>Enable inplace relu fusion for training (<ahref="https://github.com/pytorch/pytorch/pull/33105"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33105/hovercard">#33105</a>).</li>
<li>Bug fix in dynamic quantization kernels + better test coverage. (<ahref="https://github.com/pytorch/pytorch/pull/33320"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33320/hovercard">#33320</a>).</li>
<li>Run weight_post_process for QAT (<ahref="https://github.com/pytorch/pytorch/pull/33852"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33852/hovercard">#33852</a>).</li>
<li>Fix histogram observer to work with QAT on GPU (<ahref="https://github.com/pytorch/pytorch/pull/34232"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34232/hovercard">#34232</a>).</li>
<li>Fix the quantized batchnorm2d (<ahref="https://github.com/pytorch/pytorch/pull/34579"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34579/hovercard">#34579</a>).</li>
<li>Move QScheme ops to c10 (<ahref="https://github.com/pytorch/pytorch/pull/30134"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30134/hovercard">#30134</a>).</li>
<li>Remove incorrect fp16 dynamic linear/relu op (<ahref="https://github.com/pytorch/pytorch/pull/32774"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32774/hovercard">#32774</a>).</li>
</ul>
<h2>RPC</h2>
<ul>
<li>Don't crash callee when function does not exist on it, instead return an Exception (<ahref="https://github.com/pytorch/pytorch/pull/32726"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32726/hovercard">#32726</a>).</li>
<li>Throw the correct Exception on local client based on the <code>RemoteException</code> (<ahref="https://github.com/pytorch/pytorch/pull/32936"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32936/hovercard">#32936</a>).</li>
<li>Attach autograd edges only for tensors requiring grad. (<ahref="https://github.com/pytorch/pytorch/pull/30904"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30904/hovercard">#30904</a>).</li>
<li><code>WireSerializer</code> should check <code>has_storage()</code> (<ahref="https://github.com/pytorch/pytorch/pull/34626"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34626/hovercard">#34626</a>).</li>
<li>Fixed potential deadlock in python exception handling (<ahref="https://github.com/pytorch/pytorch/pull/35283/"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35283/hovercard">#35283</a>)</li>
</ul>
<h2>Other</h2>
<ul>
<li>
<p><code>torch.split</code>: Fixed incorrect gradient computation that assumed the output was not a view (<ahref="https://github.com/pytorch/pytorch/pull/32044"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32044/hovercard">#32044</a>).</p>
</li>
<li>
<p>Allowed numpy integer types to be used where we accept Python integers (<ahref="https://github.com/pytorch/pytorch/pull/30486"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30486/hovercard">#30486</a>).</p>
</li>
<li>
<p><code>torch.unique</code>, <code>torch.unique_consecutive</code>: fixed bug with zero-element input support (<ahref="https://github.com/pytorch/pytorch/pull/31211"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31211/hovercard">#31211</a>).</p>
</li>
<li>
<p><code>Tensor.to_sparse</code>: fixed backward in the non-contiguous tensor case (<ahref="https://github.com/pytorch/pytorch/pull/31223"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31223/hovercard">#31223</a>).</p>
</li>
<li>
<p><code>torch.index_put</code>: Added error checks for input tensors’ devices (<ahref="https://github.com/pytorch/pytorch/pull/31280"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31280/hovercard">#31280</a>).</p>
</li>
<li>
<p>Ensure we switch the CUDA stream correctly in CUDA operations (<ahref="https://github.com/pytorch/pytorch/pull/31537"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31537/hovercard">#31537</a>, <ahref="https://github.com/pytorch/pytorch/pull/31538"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31538/hovercard">#31538</a>, <ahref="https://github.com/pytorch/pytorch/pull/31541"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31541/hovercard">#31541</a>).</p>
</li>
<li>
<p><code>torch.SparseTensor</code>: ensure the legacy sparse constructor doesn't interpret Python data as tensor data. (<ahref="https://github.com/pytorch/pytorch/pull/31490"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31490/hovercard">#31490</a>).</p>
</li>
<li>
<p><code>torch.argmax</code>, <code>torch.argmin</code>: Fixed incorrect behavior on large tensors (<ahref="https://github.com/pytorch/pytorch/pull/33310"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33310/hovercard">#33310</a>).</p>
</li>
<li>
<p><code>torch.div</code>: Fixed to throw an error when dividing by integer zero on CPU (<ahref="https://github.com/pytorch/pytorch/pull/32629"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32629/hovercard">#32629</a>).</p>
</li>
<li>
<p><code>torch.cos</code>: Fixed incorrect gradient computation caused by not properly initializing temporary vectors in avx2 code (<ahref="https://github.com/pytorch/pytorch/pull/32722"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32722/hovercard">#32722</a>, <ahref="https://github.com/pytorch/pytorch/pull/34281"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34281/hovercard">#34281</a>).</p>
</li>
<li>
<p><code>torch.prod</code>: Fixed behavior when passed a <code>torch.half</code> input tensor and <code>torch.float</code> output tensor (<ahref="https://github.com/pytorch/pytorch/pull/32831"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32831/hovercard">#32831</a>).</p>
</li>
<li>
<p><code>torch.max</code>, <code>torch.min</code>: Fixed NaN handling (<ahref="https://github.com/pytorch/pytorch/pull/32541"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32541/hovercard">#32541</a>).</p>
</li>
<li>
<p><code>torch.max</code>, <code>torch.min</code>: Added error check that operand and outputs are on the same device type (<ahref="https://github.com/pytorch/pytorch/pull/32862"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32862/hovercard">#32862</a>).</p>
</li>
<li>
<p><code>torch.add</code>: Fixed memory leak on certain platforms (<ahref="https://github.com/pytorch/pytorch/pull/32478"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32478/hovercard">#32478</a>).</p>
</li>
<li>
<p><code>torch.cumsum</code>: fixed to handle inputs with zero-sized dimensions correctly (<ahref="https://github.com/pytorch/pytorch/pull/31694"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31694/hovercard">#31694</a>).</p>
</li>
<li>
<p><code>torch.cat</code>: Disallow passing <code>out</code> as one of the input tensors (<ahref="https://github.com/pytorch/pytorch/pull/30577"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30577/hovercard">#30577</a>).</p>
</li>
<li>
<p><code>torch.pdist</code>: Added support for large batch sizes (<ahref="https://github.com/pytorch/pytorch/pull/31593"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31593/hovercard">#31593</a>).</p>
</li>
<li>
<p><code>torch.stft</code>: Fixed crash when used with <code>nn.DataParallel</code> (<ahref="https://github.com/pytorch/pytorch/pull/31861"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31861/hovercard">#31861</a>).</p>
</li>
<li>
<p><code>torch.autograd</code>: Ensure the original grad mode is restored during backward (<ahref="https://github.com/pytorch/pytorch/pull/31884"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31884/hovercard">#31884</a>).</p>
</li>
<li>
<p><code>torch.autograd</code>: Fixed a race condition by locking graph_task before writing leaf_streams (<ahref="https://github.com/pytorch/pytorch/pull/31995"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31995/hovercard">#31995</a>).</p>
</li>
<li>
<p><code>torch.tensordot</code>: Fixed support for negative dimensions (<ahref="https://github.com/pytorch/pytorch/pull/31954"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31954/hovercard">#31954</a>).</p>
</li>
<li>
<p><code>torch.cumprod</code>: Fixed to handle inputs with zero-sized dimensions correctly (<ahref="https://github.com/pytorch/pytorch/pull/32070"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32070/hovercard">#32070</a>).</p>
</li>
<li>
<p><code>torch.pow</code>: Fixed the gradient computation when the base is a Tensor or Scalar of zeros (<ahref="https://github.com/pytorch/pytorch/pull/32062"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32062/hovercard">#32062</a>, <ahref="https://github.com/pytorch/pytorch/pull/32063"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32063/hovercard">#32063</a>).</p>
</li>
<li>
<p><code>torch.baddbmm</code>: Fixed bug in corner case (<ahref="https://github.com/pytorch/pytorch/pull/33538"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33538/hovercard">#33538</a>).</p>
</li>
<li>
<p><code>torch.where</code>: Added check for consistent devices (<ahref="https://github.com/pytorch/pytorch/pull/33432"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33432/hovercard">#33432</a>).</p>
</li>
<li>
<p><code>torch.cdist</code>: Fixed gradient computation for <code>p=2</code> and large inputs (<ahref="https://github.com/pytorch/pytorch/pull/31167"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31167/hovercard">#31167</a>).</p>
</li>
<li>
<p><code>torch.mv</code>: Fixed NaN handling (<ahref="https://github.com/pytorch/pytorch/pull/31666"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31666/hovercard">#31666</a>).</p>
</li>
<li>
<p><code>torch.index_put</code>: Added handling for large input tensors (<ahref="https://github.com/pytorch/pytorch/pull/33753"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33753/hovercard">#33753</a>).</p>
</li>
<li>
<p><code>torch.addmm</code>: Fixed incorrect output when using BLAS backend (<ahref="https://github.com/pytorch/pytorch/pull/33819"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33819/hovercard">#33819</a>).</p>
</li>
<li>
<p><code>torch.topk</code> fixed double backward when input has non-finite values (<ahref="https://github.com/pytorch/pytorch/pull/35253"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35253/hovercard">#35253</a>)</p>
</li>
<li>
<p><code>torch.load</code>: Avoid problematic pickle usages on Python 3.8.0 and 3.8.1 (<ahref="https://github.com/pytorch/pytorch/pull/33824"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33824/hovercard">#33824</a>).</p>
</li>
<li>
<p><code>Tensor.to</code>: Fixed race condition for gradient computation that spans CUDA devices (<ahref="https://github.com/pytorch/pytorch/pull/31930"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31930/hovercard">#31930</a>).</p>
</li>
<li>
<p><code>Tensor.random_</code> added check that <code>from</code> and <code>to</code> are within the Tensor’s dtype bounds (<ahref="https://github.com/pytorch/pytorch/pull/34033"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34033/hovercard">#34033</a>).</p>
</li>
<li>
<p><code>Tensor.copy_</code>: Fixed memory overlap check and allowed outputs to be zero-strided tensors if the size is <= 1 along that dimension (<ahref="https://github.com/pytorch/pytorch/pull/34100"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34100/hovercard">#34100</a>).</p>
</li>
<li>
<p><code>nn.BatchNorm{1,2,3}d</code>: fixed gradient computation for empty inputs (<ahref="https://github.com/pytorch/pytorch/pull/32820"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32820/hovercard">#32820</a>).</p>
</li>
<li>
<p><code>nn.BatchNorm</code>: Fixed behavior for inputs with large batch sizes (<ahref="https://github.com/pytorch/pytorch/pull/32763"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32763/hovercard">#32763</a>).</p>
</li>
<li>
<p><code>nn.Conv2d</code>: Fixed 5d weight handling with MKLDNN backend (<ahref="https://github.com/pytorch/pytorch/pull/34115"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34115/hovercard">#34115</a>).</p>
</li>
<li>
<p><code>nn.Conv{1,2,3}d</code>: added support for empty batch size (<ahref="https://github.com/pytorch/pytorch/pull/32709"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32709/hovercard">#32709</a>).</p>
</li>
<li>
<p><code>nn.Conv{1,2,3}d</code>: fixed <code>CUDNN_STATUS_NOT_SUPPORTED</code> errors by trying multiple algorithms (<ahref="https://github.com/pytorch/pytorch/pull/33073"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33073/hovercard">#33073</a>).</p>
</li>
<li>
<p><code>nn.Conv{1,2,3}d</code>: fixed padding mode support and added additional padding modes (reflection and replication) (<ahref="https://github.com/pytorch/pytorch/pull/31784"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31784/hovercard">#31784</a>).</p>
</li>
<li>
<p><code>nn.Conv2d</code>, <code>nn.Conv3d</code>, <code>nn.Conv1d</code>, <code>nn.ConvTranspose2d</code>: Fixed support for batch sizes greater than 2^32 (<ahref="https://github.com/pytorch/pytorch/pull/31383"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31383/hovercard">#31383</a>, <ahref="https://github.com/pytorch/pytorch/pull/31379"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31379/hovercard">#31379</a>, <ahref="https://github.com/pytorch/pytorch/pull/31889"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31889/hovercard">#31889</a>, <ahref="https://github.com/pytorch/pytorch/pull/34407"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34407/hovercard">#34407</a>, <ahref="https://github.com/pytorch/pytorch/pull/31510"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31510/hovercard">#31510</a>).</p>
</li>
<li>
<p><code>nn.InstanceNorm</code>, <code>nn.GroupNorm</code>: Added error check for input with exactly one element (<ahref="https://github.com/pytorch/pytorch/pull/29082"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29082/hovercard">#29082</a>).</p>
</li>
<li>
<p><code>nn.RNN</code>: Fixed moving RNNs to a device after applying weight norm (<ahref="https://github.com/pytorch/pytorch/pull/32563"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32563/hovercard">#32563</a>, <ahref="https://github.com/pytorch/pytorch/pull/32989"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32989/hovercard">#32989</a>).</p>
</li>
<li>
<p><code>nn.MultiLabelMarginLoss</code>: added support for 0-d tensors (<ahref="https://github.com/pytorch/pytorch/pull/30765"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30765/hovercard">#30765</a>).</p>
</li>
<li>
<p><code>nn.GroupNorm</code>: added support for empty batch (<ahref="https://github.com/pytorch/pytorch/pull/32401"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32401/hovercard">#32401</a>).</p>
</li>
<li>
<p><code>nn.NLLLoss</code>: fixed to support empty tensors on CUDA (<ahref="https://github.com/pytorch/pytorch/pull/31491"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31491/hovercard">#31491</a>).</p>
</li>
<li>
<p><code>nn.MultiLabelMarginLoss</code>: fixed memory leak on CUDA (<ahref="https://github.com/pytorch/pytorch/pull/30767"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30767/hovercard">#30767</a>).</p>
</li>
<li>
<p><code>nn.MultiMarginLoss</code>: fixed error checking on CUDA for the 1D case. (<ahref="https://github.com/pytorch/pytorch/pull/30825"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30825/hovercard">#30825</a>).</p>
</li>
<li>
<p><code>nn.Softmax</code>: Fixed half->float case of softmax backward (<ahref="https://github.com/pytorch/pytorch/pull/30838"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30838/hovercard">#30838</a>).</p>
</li>
<li>
<p><code>nn.Softshrink</code>: Added check that lambda is no less than zero (<ahref="https://github.com/pytorch/pytorch/pull/33201"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33201/hovercard">#33201</a>).</p>
</li>
<li>
<p><code>nn.functional.interpolate</code>: added support for empty batch size input for interpolate. (<ahref="https://github.com/pytorch/pytorch/pull/32400"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32400/hovercard">#32400</a>).</p>
</li>
<li>
<p><code>nn.functional.pad</code>: Fixed to always return a new tensor instead of sometimes returning a view (<ahref="https://github.com/pytorch/pytorch/pull/32350"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32350/hovercard">#32350</a>).</p>
</li>
<li>
<p><code>nn.functional.grid_sample</code>: Fixed gradient computation at image borders (<ahref="https://github.com/pytorch/pytorch/pull/32829"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32829/hovercard">#32829</a>).</p>
</li>
<li>
<p><code>optim.MultiStepLR</code>: Fix “unbound local variable” error by removing return value for <code>__exit__</code> (<ahref="https://github.com/pytorch/pytorch/pull/32997"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32997/hovercard">#32997</a>).</p>
</li>
<li>
<p><code>torch.autograd</code>: added new error message if incorrect usage would cause a deadlock (<ahref="https://github.com/pytorch/pytorch/pull/32295"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32295/hovercard">#32295</a>).</p>
</li>
<li>
<p><code>torch.autograd</code>: Fixed incorrect handling of functions that return multiple views (<ahref="https://github.com/pytorch/pytorch/pull/32790"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32790/hovercard">#32790</a>).</p>
</li>
<li>
<p><code>autograd.Function</code>: Fixed error if <code>Function</code> returned a view in a <code>torch.no_grad</code> block (<ahref="https://github.com/pytorch/pytorch/pull/33896"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33896/hovercard">#33896</a>).</p>
</li>
<li>
<p><code>autograd.Function</code>: Added more error checks for incorrect behavior (<ahref="https://github.com/pytorch/pytorch/pull/33069"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33069/hovercard">#33069</a>).</p>
</li>
<li>
<p><code>autograd.Function</code>: Added nice error message if missing overrides (<ahref="https://github.com/pytorch/pytorch/pull/33142"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33142/hovercard">#33142</a>).</p>
</li>
<li>
<p><code>autograd.Function</code>: Fixed version check for <code>grad_fn</code> for views (<ahref="https://github.com/pytorch/pytorch/pull/34145"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34145/hovercard">#34145</a>).</p>
</li>
<li>
<p><code>autograd.profiler</code>: Fix incorrect chrome trace formatting output for CUDA traces (<ahref="https://github.com/pytorch/pytorch/pull/33987"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33987/hovercard">#33987</a>).</p>
</li>
<li>
<p><code>multiprocessing.util.register_after_fork</code>: fixed crash on Windows (<ahref="https://github.com/pytorch/pytorch/pull/30809"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30809/hovercard">#30809</a>).</p>
</li>
<li>
<p><code>utils.data.DataLoader</code>: Fixed potential hang when exiting main process (<ahref="https://github.com/pytorch/pytorch/pull/33721"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33721/hovercard">#33721</a>).</p>
</li>
<li>
<p><code>utils.tensorboard.SummaryWriter</code> fixed <code>scale_factor</code> calculation for uint8 tensor (<ahref="https://github.com/pytorch/pytorch/pull/31778"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31778/hovercard">#31778</a>).</p>
</li>
<li>
<p><code>utils.tensorboard</code>: Fixed handling when the PyTorch model trace contains RecursiveScriptModules (<ahref="https://github.com/pytorch/pytorch/pull/30430"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30430/hovercard">#30430</a>).</p>
</li>
<li>
<p>Fixed <code>CPU_INTEL</code> flag error on Windows (<ahref="https://github.com/pytorch/pytorch/pull/30564"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30564/hovercard">#30564</a>).</p>
</li>
<li>
<p>Don't use <code>RTLD_GLOBAL</code> to load <code>_C</code>, resolving a multitude of weird segfaults and crashes<br>
when PyTorch is imported along with other packages (<ahref="https://github.com/pytorch/pytorch/pull/31162"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31162/hovercard">#31162</a>).</p>
</li>
<li>
<p>Fixed dll load logic for Python 3.8 on Windows (<ahref="https://github.com/pytorch/pytorch/pull/32215"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32215/hovercard">#32215</a>).</p>
</li>
<li>
<p><code>quasirandom.SobolEngine</code>: Fixed crash when default tensor type is CUDA (<ahref="https://github.com/pytorch/pytorch/pull/32496"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32496/hovercard">#32496</a>).</p>
</li>
<li>
<p>Fixed error message when converting NumPy array with negative strides to a <code>torch.Tensor</code> (<ahref="https://github.com/pytorch/pytorch/pull/33254"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33254/hovercard">#33254</a>).</p>
</li>
<li>
<p>Fixed crash when indexing a <code>torch.Tensor</code> with a single-element array (<ahref="https://github.com/pytorch/pytorch/pull/33456"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33456/hovercard">#33456</a>).</p>
</li>
<li>
<p>Fixed crash when converting CUDA tensors and non-strided tensors to NumPy arrays (<ahref="https://github.com/pytorch/pytorch/pull/33612"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33612/hovercard">#33612</a>).</p>
</li>
<li>
<p>Prevented crash on exit from static destructor race on Windows (<ahref="https://github.com/pytorch/pytorch/pull/33955"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33955/hovercard">#33955</a>).</p>
</li>
<li>
<p>Fixed uncaught <code>std::domain_error</code> on macOS (<ahref="https://github.com/pytorch/pytorch/pull/34301"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34301/hovercard">#34301</a>).</p>
</li>
<li>
<p>Don’t reset worker affinity when using operators that call into OpenMP (<ahref="https://github.com/pytorch/pytorch/pull/29006"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29006/hovercard">#29006</a>).</p>
</li>
<li>
<p><code>torch.backends.mkldnn</code>: changed to be usable without import (<ahref="https://github.com/pytorch/pytorch/pull/32055"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32055/hovercard">#32055</a>).</p>
</li>
</ul>
<h1>Performance</h1>
<h2>Mobile</h2>
<ul>
<li>Java Tensor hybrid, owns at::Tensor, no memcopy for java outputs. (<ahref="https://github.com/pytorch/pytorch/pull/30501"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30501/hovercard">#30501</a>).</li>
<li>Tensor prep from image in native (<ahref="https://github.com/pytorch/pytorch/pull/31426"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31426/hovercard">#31426</a>).</li>
<li>Pass to remove prepacking ops. (<ahref="https://github.com/pytorch/pytorch/pull/34319"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34319/hovercard">#34319</a>).</li>
<li>Speed up per-channel min-max observer (<ahref="https://github.com/pytorch/pytorch/pull/34118"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34118/hovercard">#34118</a>).</li>
<li>Vectorized qmul and more methods on qint data types (<ahref="https://github.com/pytorch/pytorch/pull/34376"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34376/hovercard">#34376</a>).</li>
<li>Avoid sending large unneeded data over wire in <code>ProcessGroupAgent</code>. (<ahref="https://github.com/pytorch/pytorch/pull/31357"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31357/hovercard">#31357</a>).</li>
<li>Integrate async mode for autograd engine with distributed autograd. (<ahref="https://github.com/pytorch/pytorch/pull/31508"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31508/hovercard">#31508</a>).</li>
<li>Make handling of <code>FORWARD_AUTOGRAD_REQ</code> in <code>request_callback_impl</code> nonblocking (<ahref="https://github.com/pytorch/pytorch/pull/32476"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32476/hovercard">#32476</a>).</li>
</ul>
<h2>Other</h2>
<ul>
<li>Resolved a major multithreaded performance regression when doing operator calls (<ahref="https://github.com/pytorch/pytorch/pull/30333"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30333/hovercard">#30333</a>)</li>
<li>Improved performance of comparison ops on CUDA (<ahref="https://github.com/pytorch/pytorch/pull/29743"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29743/hovercard">#29743</a>).</li>
<li><code>nn.SmoothL1Loss</code>: vectorized gradient computation on CPU. (<ahref="https://github.com/pytorch/pytorch/pull/30046"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30046/hovercard">#30046</a>).</li>
<li><code>nn.EmbeddingBag</code>: improved performance on CPU (<ahref="https://github.com/pytorch/pytorch/pull/30701"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30701/hovercard">#30701</a>, <ahref="https://github.com/pytorch/pytorch/pull/27477"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27477/hovercard">#27477</a>).</li>
<li><code>nn.LayerNorm</code>: optimized with explicit vectorization using Vec256 (<ahref="https://github.com/pytorch/pytorch/pull/31127"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31127/hovercard">#31127</a>).</li>
<li><code>Tensor.copy_</code>: fixed kernel speed regression introduced in <aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="521326231"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/29631"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29631/hovercard"href="https://github.com/pytorch/pytorch/pull/29631">#29631</a> (<ahref="https://github.com/pytorch/pytorch/pull/31279"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31279/hovercard">#31279</a>).</li>
<li>Moved a number of debug asserts to not compile in release builds (<ahref="https://github.com/pytorch/pytorch/pull/31240"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31240/hovercard">#31240</a>).</li>
<li><code>Tensor::has_names</code> sped up for unnamed tensors (<ahref="https://github.com/pytorch/pytorch/pull/31436"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31436/hovercard">#31436</a>).</li>
<li><code>torch.index_select</code>: optimized performance on CPU (<ahref="https://github.com/pytorch/pytorch/pull/30598"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30598/hovercard">#30598</a>).</li>
<li><code>nn.Conv{1,2,3}d</code>: Improved performance by refactoring <code>bias</code> handling for cuDNN backend (<ahref="https://github.com/pytorch/pytorch/pull/31524"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31524/hovercard">#31524</a>).</li>
<li><code>torch.norm</code>: Optimized case where <code>p = 2</code> (<ahref="https://github.com/pytorch/pytorch/pull/31903"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31903/hovercard">#31903</a>).</li>
<li><code>nn.utils.clip_grad_norm_</code>: Refactored the computation for more performance (<ahref="https://github.com/pytorch/pytorch/pull/32020"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32020/hovercard">#32020</a>).</li>
<li>Made an assert on a hotpath trigger only in DEBUG mode (<ahref="https://github.com/pytorch/pytorch/pull/32117"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32117/hovercard">#32117</a>).</li>
<li>First steps toward TensorIterator unrolling and vectorized load (<ahref="https://github.com/pytorch/pytorch/pull/31974"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31974/hovercard">#31974</a>).</li>
<li><code>nn.functional.normalize</code>: changed to use <code>clamp_min_</code> (<ahref="https://github.com/pytorch/pytorch/pull/32360"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32360/hovercard">#32360</a>).</li>
<li>Stopped refreshing numel on a stride update (<ahref="https://github.com/pytorch/pytorch/pull/32116"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32116/hovercard">#32116</a>).</li>
<li><code>nn.functional.softplus</code>: vectorized operator and gradient computation on CPU (<ahref="https://github.com/pytorch/pytorch/pull/32944"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32944/hovercard">#32944</a>).</li>
<li><code>torch.gather</code> regression fixed by not materializing loop vars in error message (<ahref="https://github.com/pytorch/pytorch/pull/33108"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33108/hovercard">#33108</a>).</li>
<li><code>nn.ELU</code> forward and backward vectorized on CPU (<ahref="https://github.com/pytorch/pytorch/pull/32985"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32985/hovercard">#32985</a>, <ahref="https://github.com/pytorch/pytorch/pull/32986"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32986/hovercard">#32986</a>)</li>
<li><code>torch.cat</code>: optimized performance on CPU (<ahref="https://github.com/pytorch/pytorch/pull/30806"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30806/hovercard">#30806</a>, <ahref="https://github.com/pytorch/pytorch/pull/33534"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33534/hovercard">#33534</a>).</li>
<li><code>torch.conv3d</code>: optimized Unfold3d to improve performance (<ahref="https://github.com/pytorch/pytorch/pull/33191"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33191/hovercard">#33191</a>).</li>
<li>Workaround performance bug and memory leak in GOMP for AMD CPUs (<ahref="https://github.com/pytorch/pytorch/pull/32875"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32875/hovercard">#32875</a>).</li>
<li>Bounds checking for functor execution in vectorized/unrolled kernels (<ahref="https://github.com/pytorch/pytorch/pull/33642"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33642/hovercard">#33642</a>).</li>
<li><code>nn.EmbeddingBag</code>: improved performance on CUDA (<ahref="https://github.com/pytorch/pytorch/pull/33589"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33589/hovercard">#33589</a>).</li>
<li>Remove unnecessary tensor copies while calling operators (<ahref="https://github.com/pytorch/pytorch/pull/33732"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33732/hovercard">#33732</a>).</li>
<li>clang intrinsics targeting on Windows (<ahref="https://github.com/pytorch/pytorch/pull/33958"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33958/hovercard">#33958</a>).</li>
<li><code>nn.Dropout</code>: added vectorized CUDA implementation (<ahref="https://github.com/pytorch/pytorch/pull/33879"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33879/hovercard">#33879</a>).</li>
<li><code>nn.UpSampleNearest{1, 2, 3}d</code> performance on CPU optimized (<ahref="https://github.com/pytorch/pytorch/pull/31452"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31452/hovercard">#31452</a>).</li>
<li>Remove <code>cudaMemcpy</code> on full memory overlap (<ahref="https://github.com/pytorch/pytorch/pull/34548"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34548/hovercard">#34548</a>).</li>
<li>CUDA Loops: move address computation into policy, make <code>policy.load</code> load all arguments (<ahref="https://github.com/pytorch/pytorch/pull/33720"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33720/hovercard">#33720</a>).</li>
<li>Add the build for runtime dispatch for AVX, AVX2 instruction set (<ahref="https://github.com/pytorch/pytorch/pull/26125"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26125/hovercard">#26125</a>).</li>
<li><code>nn.RReLU</code> performance improved up to 5x for inference on CPU (<ahref="https://github.com/pytorch/pytorch/pull/31094"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31094/hovercard">#31094</a>).</li>
<li><code>nn.LogSigmoid</code> performance improved up to 10x on CPU (<ahref="https://github.com/pytorch/pytorch/pull/30958"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30958/hovercard">#30958</a>).</li>
<li><code>torch.dist</code> performance improved up to 2x (<ahref="https://github.com/pytorch/pytorch/pull/29714"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29714/hovercard">#29714</a>).</li>
<li><code>torch.max</code>, <code>torch.min</code> performance improved up to 1.5x on CPU (<ahref="https://github.com/pytorch/pytorch/pull/33936"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33936/hovercard">#33936</a>).</li>
<li><code>nn.GLU</code> performance improved up to 1.5X on CPU (<ahref="https://github.com/pytorch/pytorch/pull/33179"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33179/hovercard">#33179</a>).</li>
<li><code>nn.LeakyReLU</code> performance improved up to 4x (<ahref="https://github.com/pytorch/pytorch/pull/29899"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29899/hovercard">#29899</a>).</li>
<li><code>nn.HardTanh</code> performance improved up to 5x (<ahref="https://github.com/pytorch/pytorch/pull/30152"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30152/hovercard">#30152</a>).</li>
</ul>
<h1>Documentation</h1>
<h2>Python</h2>
<ul>
<li>Added documentation for <code>nn.functional.softplus</code> (<ahref="https://github.com/pytorch/pytorch/pull/30055"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30055/hovercard">#30055</a>, <ahref="https://github.com/pytorch/pytorch/pull/32945"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32945/hovercard">#32945</a>).</li>
<li><code>torch.max</code>: Added warning about different, nondeterministic behavior on CPU and CUDA (<ahref="https://github.com/pytorch/pytorch/pull/31115"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31115/hovercard">#31115</a>).</li>
<li>Clarified the documentation for <code>nn.NLLLoss</code> (<ahref="https://github.com/pytorch/pytorch/pull/31488"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31488/hovercard">#31488</a>).</li>
<li>Exclude generated source docs from Google search indexing (<ahref="https://github.com/pytorch/pytorch/pull/31484"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31484/hovercard">#31484</a>).</li>
<li><code>torch.poisson</code> docstring added to documentation (<ahref="https://github.com/pytorch/pytorch/pull/31667"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31667/hovercard">#31667</a>).</li>
<li><code>torch.eq</code> fixed incorrect examples in documentation (<ahref="https://github.com/pytorch/pytorch/pull/32399"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32399/hovercard">#32399</a>).</li>
<li><code>optim.CosineAnnealingLR</code>: fixed the usage in examples (<ahref="https://github.com/pytorch/pytorch/pull/31358"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31358/hovercard">#31358</a>).</li>
<li>Removed legacy <code>.data</code> usages from the <code>torch.nn</code> documentation (<ahref="https://github.com/pytorch/pytorch/pull/31481"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31481/hovercard">#31481</a>).</li>
<li>Fixed description of convolution modules (<ahref="https://github.com/pytorch/pytorch/pull/30079"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30079/hovercard">#30079</a>).</li>
<li><code>Tensor.t()</code>, <code>Tensor.permute()</code>, <code>Tensor.unfold()</code>, and <code>Tensor.select()</code> clarified to note that they return views (<ahref="https://github.com/pytorch/pytorch/pull/32512"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32512/hovercard">#32512</a>).</li>
<li><code>torch.multiprocessing</code> Updated documentation indicating that start_method is ignored for <code>mp.spawn()</code> (<ahref="https://github.com/pytorch/pytorch/pull/33070"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33070/hovercard">#33070</a>).</li>
<li>Improved CPU threading documentation (<ahref="https://github.com/pytorch/pytorch/pull/33083"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33083/hovercard">#33083</a>).</li>
<li><code>nn.BCELoss</code>: documented how it avoids infinite results (<ahref="https://github.com/pytorch/pytorch/pull/33160"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33160/hovercard">#33160</a>).</li>
<li><code>nn.utils.rnn.pack_padded_sequence</code>: Improved the description of <code>enforce_sorted</code> (<ahref="https://github.com/pytorch/pytorch/pull/33617"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33617/hovercard">#33617</a>).</li>
<li>Created a Tensor View documentation page that documents all PyTorch operations that return views (<ahref="https://github.com/pytorch/pytorch/pull/32560"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32560/hovercard">#32560</a>).</li>
<li>Added grad context manager doc to top level torch module. (<ahref="https://github.com/pytorch/pytorch/pull/33877"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33877/hovercard">#33877</a>).</li>
<li>Fix <code>at::Tensor</code> docs generation and make it accessible again at <ahref="https://pytorch.org/cppdocs/api/classat_1_1_tensor.html"rel="nofollow">https://pytorch.org/cppdocs/api/classat_1_1_tensor.html</a> (<ahref="https://github.com/pytorch/pytorch/pull/34467"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34467/hovercard">#34467</a>)</li>
<li>Add docs for all <code>torch::nn modules</code> and functionals (<ahref="https://github.com/pytorch/pytorch/pull/34522"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34522/hovercard">#34522</a>) (<ahref="https://github.com/pytorch/pytorch/pull/34688"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34688/hovercard">#34688</a>) (<ahref="https://github.com/pytorch/pytorch/pull/34752"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34752/hovercard">#34752</a>)</li>
<li>Improve C++ autograd and tensor indexing docs (<ahref="https://github.com/pytorch/pytorch/pull/35919"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35919/hovercard">#35919</a>)</li>
<li>Fix example in <code>torch::nn::ModuleList</code> docs (<ahref="https://github.com/pytorch/pytorch/pull/34463"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34463/hovercard">#34463</a>)</li>
</ul>
<h2>RPC</h2>
<ul>
<li>Reorganize RPC API doc and add introduction (<ahref="https://github.com/pytorch/pytorch/pull/30491"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30491/hovercard">#30491</a>, <ahref="https://github.com/pytorch/pytorch/pull/35109"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/35109/hovercard">#35109</a>).</li>
<li>Make doc source format consistent in <code>rpc/init.cpp</code> (<ahref="https://github.com/pytorch/pytorch/pull/30515"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30515/hovercard">#30515</a>).</li>
<li>Add examples to RRef doc (<ahref="https://github.com/pytorch/pytorch/pull/30516"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30516/hovercard">#30516</a>).</li>
<li>Add more details to explain <code>rpc_backend_options</code> arg in <code>init_rpc</code> (<ahref="https://github.com/pytorch/pytorch/pull/30855"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30855/hovercard">#30855</a>).</li>
<li>Fix examples in API doc (<ahref="https://github.com/pytorch/pytorch/pull/30856"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30856/hovercard">#30856</a>).</li>
<li>Fix examples in RRef API doc (<ahref="https://github.com/pytorch/pytorch/pull/30857"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30857/hovercard">#30857</a>).</li>
<li>Document WorkerInfo and <code>RpcBackendOptions</code> structures in RPC docs. (<ahref="https://github.com/pytorch/pytorch/pull/31077"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31077/hovercard">#31077</a>).</li>
<li>Explain RPC behavior when using Tensor as arg or return value (<ahref="https://github.com/pytorch/pytorch/pull/31968"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31968/hovercard">#31968</a>).</li>
<li>Update RPC docs to reflect correct use of dist_autograd backwards and dist_optim <code>step() </code>(<ahref="https://github.com/pytorch/pytorch/pull/34670"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34670/hovercard">#34670</a>).</li>
<li>Minor doc tweak to use mp.spawn in example (<ahref="https://github.com/pytorch/pytorch/pull/30381"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30381/hovercard">#30381</a>).</li>
<li>Add info about transitive dependencies in case of using local aars (<ahref="https://github.com/pytorch/pytorch/pull/30128"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30128/hovercard">#30128</a>).</li>
<li>Update Docs for building PyTorch for Android. (<ahref="https://github.com/pytorch/pytorch/pull/32578"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32578/hovercard">#32578</a>).</li>
<li>Updates to quantization documentation (<ahref="https://github.com/pytorch/pytorch/pull/30288"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30288/hovercard">#30288</a>).</li>
<li>Fix docs so that the example works (<ahref="https://github.com/pytorch/pytorch/pull/30120"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30120/hovercard">#30120</a>).</li>
<li>Add the explicit per-tensor/per-channel quant info when we print the module (<ahref="https://github.com/pytorch/pytorch/pull/30591"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30591/hovercard">#30591</a>).</li>
<li>Fixed typos in quantization docs / docstrings (<ahref="https://github.com/pytorch/pytorch/pull/34182"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34182/hovercard">#34182</a>).</li>
<li>Docs entry for the <code>is_quantized</code> (<ahref="https://github.com/pytorch/pytorch/pull/32075"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32075/hovercard">#32075</a>).</li>
</ul>
<h1>Deprecations</h1>
<h2>Python</h2>
<h3>How to figure out which line in your code is raising a warning</h3>
<p>Attempting to use deprecated behavior will raise warnings.
Unfortunately, sometimes it is not entirely obvious what line of code
the warning corresponds to, especially if the warning comes from our
C++ backend. For example, with a file named <code>foo.py</code> with the following contents,</p>
<pre><code>import torch
# This is newly deprecated behavior, see the next section
torch.tensor(1) / torch.tensor(2)
</code></pre>
<p>running it doesn’t give us the location of the warning:</p>
<pre><code>> python foo.py
../aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true
division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
</code></pre>
<p>We can use the <code>warnings</code> module to tell us where the warning is by asking it to treat warnings as errors:</p>
<pre><code>import torch
import warnings
warnings.filterwarnings("error")  # turn warnings into errors so they come with a traceback
# This is newly deprecated behavior, see the next section
torch.tensor(1) / torch.tensor(2)
</code></pre>
<p>Running the file now tells us exactly where the warning is:</p>
<pre><code>> python foo.py
Traceback (most recent call last):
File "foo.py", line 5, in <module>
torch.tensor(1) / torch.tensor(2)
UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide
or floor_divide (// in Python) instead.
</code></pre>
<h3>Deprecated <code>torch.div</code> and <code>torch.addcdiv</code> integer floor division behavior (<ahref="https://github.com/pytorch/pytorch/pull/34570"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34570/hovercard">#34570</a>)</h3>
<p>In 1.5.0 and older PyTorch releases <code>torch.div</code> and the <code>/</code> operator perform integer floor division. In a future PyTorch release, torch.div (including the <code>/</code> operator) will perform "true" division as in Python3 and NumPy.</p>
<p>To floor divide integer tensors, please use <code>torch.floor_divide</code> instead.</p>
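<p>As a minimal sketch of the migration (not part of the original notes), <code>torch.floor_divide</code> keeps today's integer behavior while <code>torch.true_divide</code> matches the planned future behavior of <code>/</code>:</p>
<pre><code>import torch

a, b = torch.tensor(5), torch.tensor(2)

# Explicit floor division keeps the current integer `/` behavior.
torch.floor_divide(a, b)  # tensor(2)

# True division matches the planned future behavior of `/`.
torch.true_divide(a, b)   # tensor(2.5000)
</code></pre>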
<h3>Deprecated <code>torch.full</code> returning float tensors if no dtype is specified (<ahref="https://github.com/pytorch/pytorch/pull/34709"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/34709/hovercard">#34709</a>).</h3>
<p>In a future PyTorch release, <code>torch.full</code> will infer its
dtype from its fill value when the optional dtype and out parameters are
unspecified, matching NumPy's inference for <code>numpy.full</code>. For example, <code>torch.full(size, 1)</code> will return a tensor of <code>torch.long</code> dtype, unlike today where it returns a tensor of <code>torch.float</code> dtype.</p>
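<p>A hedged illustration (not from the original notes): passing an explicit <code>dtype</code> keeps the result stable regardless of which inference rule is in effect:</p>
<pre><code>import torch

# Explicit dtypes avoid relying on the inference behavior either way.
torch.full((2, 3), 1, dtype=torch.long)     # integer tensor
torch.full((2, 3), 1.0, dtype=torch.float)  # float tensor
</code></pre>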
<p>This is an internal-facing class that is not a part of our public
API. We’ve refactored some PyTorch internals to work without it and will
remove it in a future release.</p>
<h3>Deprecated positional args in multiple <code>torch</code> function signatures (<ahref="https://github.com/pytorch/pytorch/pull/32009"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32009/hovercard">#32009</a>, <ahref="https://github.com/pytorch/pytorch/pull/33428"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33428/hovercard">#33428</a>)</h3>
<p>Below please find a list of deprecated signatures and what to change them to.</p>
<ul>
<li><code>torch.add(self: Tensor, alpha: Scalar, other: Tensor)</code>, <code>torch.sub(self: Tensor, alpha: Scalar, other: Tensor)</code> please use <code>alpha</code> as a keyword-only arg instead of positional args</li>
<li><code>torch.addbmm(beta: Scalar, self: Tensor, alpha: Scalar, batch1: Tensor, batch2: Tensor)</code>: please use <code>alpha</code> and <code>beta</code> as keyword only args instead of positional args.</li>
<li><code>torch.addcdiv(self: Tensor, value: Scalar, tensor1: Tensor, tensor2: Tensor)</code>, <code>torch.addcmul(self: Tensor, value: Scalar, tensor1: Tensor, tensor2: Tensor)</code>: please use <code>value</code> as a keyword-only arg instead of a positional arg.</li>
<li><code>torch.addmm(beta: Scalar, self: Tensor, alpha: Scalar, mat1: Tensor, mat2: Tensor)</code>, <code>torch.sspaddmm(beta: Scalar, self: Tensor, alpha: Scalar, mat1: Tensor, mat2: Tensor)</code> please use <code>alpha</code> and <code>beta</code> as keyword only args instead of positional args.</li>
<li><code>torch.addmv(beta: Scalar, self: Tensor, alpha: Scalar, mat: Tensor, vec: Tensor)</code>: please use <code>alpha</code> and <code>beta</code> as keyword only args instead of positional args.</li>
<li><code>torch.addr(beta: Scalar, self: Tensor, alpha: Scalar, vec1: Tensor, vec2: Tensor)</code>: please use <code>alpha</code> and <code>beta</code> as keyword only args instead of positional args.</li>
<li><code>torch.baddbmm(beta: Scalar, self: Tensor, alpha: Scalar, batch1: Tensor, batch2: Tensor)</code>: please use <code>alpha</code> and <code>beta</code> as keyword only args instead of positional args.</li>
</ul>
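<p>For example (a sketch, not taken from the release notes), the <code>addmm</code> call below moves <code>beta</code> and <code>alpha</code> into keyword-only arguments:</p>
<pre><code>import torch

M = torch.zeros(2, 2)
mat1, mat2 = torch.ones(2, 3), torch.ones(3, 2)

# Deprecated: torch.addmm(0.5, M, 2.0, mat1, mat2)
# Preferred: pass beta and alpha as keyword-only arguments.
torch.addmm(M, mat1, mat2, beta=0.5, alpha=2.0)
</code></pre>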
<h3>Deprecated modifying in-place a view returned by a custom autograd Function (<ahref="https://github.com/pytorch/pytorch/pull/32839"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32839/hovercard">#32839</a>).</h3>
<p>Modifying in-place a view that was created by a custom Function leads
to the custom backward not being called or being called with a partial
gradient. This behavior will be removed in 1.6.</p>
<p>Please clone() the output of the Function to avoid incorrect gradient computation.</p>
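<p>A minimal sketch of the workaround using an assumed toy Function (<code>FirstRow</code> below is hypothetical, not from the notes): clone the returned view before any in-place update so the custom backward still runs.</p>
<pre><code>import torch

class FirstRow(torch.autograd.Function):
    # Hypothetical custom Function whose forward returns a view of its input.
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x[0]

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        grad_input = torch.zeros_like(x)
        grad_input[0] = grad_output
        return grad_input

x = torch.rand(2, 3, requires_grad=True)
out = FirstRow.apply(x).clone()  # clone() the view before modifying it in place
out.add_(1)
out.sum().backward()             # the custom backward above is used as intended
</code></pre>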
<h3>Deprecated modifying in-place a view created inside a no_grad block (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="557743694"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/32839"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/32839/hovercard"href="https://github.com/pytorch/pytorch/pull/32839">#32839</a>)</h3>
<p>Modifying in-place a view created inside a no_grad block is ambiguous and error-prone so we have deprecated it.</p>
<p>Here is an example of some code that we’ve deprecated. In previous
versions of PyTorch, the following code throws a non-descriptive error
message, but we've added a deprecation in 1.5.0.</p>
<pre><code>>>> base = torch.rand(10, requires_grad=True)
>>> var = torch.rand([], requires_grad=True)
>>> with torch.no_grad():
>>>     view = base[1]
>>>     view.copy_(var)
>>> torch.autograd.grad(base.sum(), var)
RuntimeError: A view was created in no_grad mode and is being modified inplace with grad mode enabled. Given that this use case is ambiguous and error-prone,
it is deprecated and will be forbidden starting 1.6 (see https://github.com/pytorch/pytorch/pull/32839 for more details about this). You can clarify your code and remove this warning by moving both the view and the inplace either both inside the no_grad block (if you don't want the inplace to be tracked) or both outside (if you want the inplace to be tracked).
</code></pre>
<p>If you want to differentiate, you should change the above code to</p>
<pre><code>>>> base = torch.rand(10, requires_grad=True)
>>> var = torch.rand([], requires_grad=True)
>>> view = base[1]
>>> view.copy_(var)
>>> torch.autograd.grad(base.sum(), var)
(tensor(1.),)
</code></pre>
<p>If you don’t want to differentiate, you should change it to</p>
<pre><code>>>> base = torch.rand(10, requires_grad=True)
>>> var = torch.rand([], requires_grad=True)
>>> with torch.no_grad():
>>>     view = base[1]
>>>     view.copy_(var)
</code></pre>
<p>Please use <code>Tensor.options()</code> instead.</p>
<h1>Miscellaneous</h1>
<ul>
<li>Part of an automated mixed-precision solution (<ahref="https://github.com/pytorch/pytorch/pull/33366"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33366/hovercard">#33366</a>, <ahref="https://github.com/pytorch/pytorch/pull/33832"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/33832/hovercard">#33832</a>).</li>
</ul>
<p>Following the experimental release of <ahref="https://pytorch.org/blog/pytorch-1-dot-3-adds-mobile-privacy-quantization-and-named-tensors/"rel="nofollow">PyTorch Mobile in the 1.3 release</a>,
PyTorch 1.4 adds additional mobile support including the ability to
customize build scripts at a fine-grained level. This allows mobile
developers to optimize library size by only including the operators used
by their models and, in the process, reduce their on-device footprint
significantly. Initial results show that, for example, a customized
MobileNetV2 is 40% to 50% smaller than the prebuilt PyTorch mobile
library. <ahref="https://pytorch.org/mobile/home/"rel="nofollow">Learn more</a> about how to create your own custom builds, and please engage with the community on the <ahref="https://discuss.pytorch.org/c/mobile"rel="nofollow">PyTorch forums</a> to provide any feedback you have.</p>
<p>For the full tutorials, see the links below:</p>
<ul>
<li><ahref="https://pytorch.org/tutorials/intermediate/rpc_tutorial.html"rel="nofollow">A full RPC tutorial</a></li>
<li><ahref="https://github.com/pytorch/examples/tree/master/distributed/rpc">Examples using model parallel training for reinforcement learning and with an LSTM</a></li>
</ul>
<p>As always, you can connect with community members and discuss more on the <ahref="https://discuss.pytorch.org/c/distributed/distributed-rpc"rel="nofollow">forums</a>.</p>
<p>Learn more about how to use PyTorch from Java <ahref="https://github.com/pytorch/java-demo">here</a>, and see the full Javadocs API documentation <ahref="https://pytorch.org/docs/stable/packages.html"rel="nofollow">here</a>.</p>
<p>To prune a tensor, first select a pruning technique among those available in <code>nn.utils.prune</code> (or implement your own by subclassing <code>BasePruningMethod</code>).</p>
<p>To prune a module, select one of the pruning functions available in <code>nn.utils.prune</code> (or implement your own) and specify which module and which parameter within that module pruning should act on.</p>
<p>Pruning reparametrizes the module by turning <code>weight</code> (in the example above) from a parameter to an attribute, and replacing it with a new parameter called <code>weight_orig</code> (i.e. appending <code>"_orig"</code> to the initial parameter <code>name</code>) that stores the unpruned version of the tensor. The pruning mask is stored as a buffer named <code>weight_mask</code> (i.e. appending <code>"_mask"</code> to the initial parameter <code>name</code>). Pruning is applied prior to each forward pass by recomputing <code>weight</code> through a multiplication with the updated mask using PyTorch's <code>forward_pre_hooks</code>.</p>
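<p>As a rough sketch of the workflow described above (the module and pruning amount below are arbitrary, not from the original notes):</p>
<pre><code>import torch
import torch.nn.utils.prune as prune

module = torch.nn.Linear(4, 2)

# Prune 30% of the entries of `weight` with the lowest L1 magnitude.
prune.l1_unstructured(module, name="weight", amount=0.3)

# `weight` is now recomputed from weight_orig * weight_mask before each forward.
print(sorted(name for name, _ in module.named_parameters()))  # ['bias', 'weight_orig']
print(sorted(name for name, _ in module.named_buffers()))     # ['weight_mask']
</code></pre>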
<p><code>nn.utils.prune</code> is easily extensible to support new pruning functions by subclassing the <code>BasePruningMethod</code> base class and implementing the <code>compute_mask</code> method with the instructions to compute the mask according to the logic of the new pruning technique.</p>
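<p>A hedged sketch of such an extension (the <code>EveryOther</code> method below is invented for illustration):</p>
<pre><code>import torch
import torch.nn.utils.prune as prune

class EveryOther(prune.BasePruningMethod):
    """Toy pruning method: zero out every other entry of the flattened tensor."""
    PRUNING_TYPE = "unstructured"

    def compute_mask(self, t, default_mask):
        mask = default_mask.clone()
        mask.view(-1)[::2] = 0
        return mask

# Apply the custom method to the `weight` parameter of an arbitrary module.
module = torch.nn.Linear(4, 2)
EveryOther.apply(module, name="weight")
</code></pre>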
<h1>Backwards Incompatible Changes</h1>
<h2>Python</h2>
<h3><code>torch.optim</code>: It is no longer supported to use <code>Scheduler.get_lr()</code> to obtain the last computed learning rate. To get the last computed learning rate, call <code>Scheduler.get_last_lr()</code> instead. (<ahref="https://github.com/pytorch/pytorch/pull/26423"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26423/hovercard">26423</a>)</h3>
<p>Learning rate schedulers are now “chainable,” as mentioned in the <em>New Features</em> section below. <code>Scheduler.get_lr</code> was sometimes used for monitoring purposes to obtain the current learning rate. But since <code>Scheduler.get_lr</code>
is also used internally for computing new learning rates, this actually
returns a value that is “one step ahead.” To get the last computed
learning rate, use <code>Scheduler.get_last_lr</code> instead.</p>
<p>Note that <code>optimizer.param_groups[0]['lr']</code> was in version 1.3.1 and remains in 1.4.0 a way of getting the current learning rate used in the optimizer.</p>
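<p>A small sketch of the new pattern (the optimizer and scheduler below are arbitrary choices, not from the original notes):</p>
<pre><code>import torch

model = torch.nn.Linear(2, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

for epoch in range(3):
    # ... training for one epoch would go here ...
    optimizer.step()
    scheduler.step()
    # Last learning rate computed by the scheduler (one value per parameter group).
    print(scheduler.get_last_lr())          # e.g. [0.05], then [0.025], ...
    print(optimizer.param_groups[0]['lr'])  # low-level access that also keeps working
</code></pre>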
<h3><code>Tensor.unfold</code> on a 0-dimensional Tensor now properly returns a 1-dimensional Tensor.</h3>
<li>Make <code>torch.jit.get_trace_graph</code> private (it is now <code>torch.jit._get_trace_graph</code>) (<ahref="https://github.com/pytorch/pytorch/pull/29149"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29149/hovercard">29149</a>)
<ul>
<li>This function was intended only for ONNX integration; use <code>traced_module.graph</code> instead, as shown in the sketch after this list.</li>
<li><code>@property</code> on <code>ScriptModule</code>s has been disabled (<ahref="https://github.com/pytorch/pytorch/pull/28395"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28395/hovercard">28395</a>)
<ul>
<li>Scripted <code>@property</code> accesses were silently broken before, where we would evaluate the <code>get</code> function once and store that as the attribute permanently. They properly error now; a workaround is to make your <code>@property</code> a regular method.</li>
</ul>
</li>
<li>Custom ops: <code>torch::jit::RegisterOperators</code> has been removed, use <code>torch::RegisterOperators</code> instead (<ahref="https://github.com/pytorch/pytorch/pull/28229"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28229/hovercard">28229</a>). The usage and behavior should remain the same.</li>
<li>Remove<code> torch.jit._register_*</code> bindings from Python (e.g. <code>torch.jit._register_attribute</code>). These were private functions that were not intended to be used. (<ahref="https://github.com/pytorch/pytorch/pull/29499"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29499/hovercard">29499</a>)</li>
</ul>
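<p>A minimal sketch of the <code>traced_module.graph</code> replacement mentioned in the first item above (the module is an arbitrary example, not from the notes):</p>
<pre><code>import torch

class MyModule(torch.nn.Module):
    def forward(self, x):
        return x * 2 + 1

traced_module = torch.jit.trace(MyModule(), torch.rand(3))
print(traced_module.graph)  # inspect the traced graph instead of torch.jit._get_trace_graph
</code></pre>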
<h2>C++</h2>
<h3>[C++] The distinction between Tensor and Variable has been eliminated at the C++ level. (<ahref="https://github.com/pytorch/pytorch/pull/28287"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28287/hovercard">28287</a>)</h3>
<p>This change is unlikely to affect user code; the most likely exceptions are:</p>
<ol>
<li>
<p><ahref="https://en.cppreference.com/w/cpp/language/adl"rel="nofollow">Argument-dependent lookup</a> for <code>torch::autograd</code> may no longer work. This can break because Variable is now defined as an alias for Tensor (<code>using Variable = Tensor;</code>). In this case, you must explicitly qualify the calls to <code>torch::autograd</code> functions.</p>
</li>
<li>
<p>Because <code>Variable</code> and <code>Tensor</code> are now the same type, code which assumes that they are different types (e.g., for the purposes of templating, or <code>std::enable_if</code> checks) will not work until you delete the (now) redundant overload/specialization.</p>
</li>
<li>
<p>Some operators may trace differently. If this happens, please <ahref="https://github.com/pytorch/pytorch/issues/new?template=bug-report.md">file a bug.</a> The most likely situations are:</p>
</li>
</ol>
<ol>
<li>There are now <em>more</em> operations in your trace than before (usually, calls to <code>aten::empty</code>)</li>
<li>There are now <em>fewer</em> operations in your trace than before (e.g., the trace complains that <code>"there is no observable dependence"</code> with the inputs)</li>
</ol>
<h3>[C++] arguments in <code>torch::nn::LinearOptions</code> are renamed to match the Python API. (<ahref="https://github.com/pytorch/pytorch/pull/27382"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27382/hovercard">27382</a>)</h3>
<h3>[C++] arguments in <code>torch::nn::Conv{1,2,3}dOptions</code> are renamed to match the Python API. (<ahref="https://github.com/pytorch/pytorch/pull/28917"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28917/hovercard">28917</a>) (<ahref="https://github.com/pytorch/pytorch/pull/29838"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29838/hovercard">29838</a>)</h3>
<h3>[C++] <code>torch::nn::Conv{1,2,3}dOptions</code> no longer has the <code>transposed</code> argument. (<ahref="https://github.com/pytorch/pytorch/pull/31005"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31005/hovercard">31005</a>)</h3>
<ul>
<li>If users have <code>transposed</code> originally set to <code>true</code> in <code>torch::nn::Conv{1,2,3}dOptions</code>, they should migrate their code to use <code>torch::nn::ConvTranspose{1,2,3}d</code> layers instead.</li>
</ul>
<h3>[C++] All Reduction enums for <code>torch::nn</code> layers and functionals are changed to have <code>torch::KEnumNAME</code> syntax. (<ahref="https://github.com/pytorch/pytorch/pull/27942"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27942/hovercard">27942</a>, <ahref="https://github.com/pytorch/pytorch/pull/26837"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26837/hovercard">26837</a>)</h3>
<ul>
<li>Example: previously, to specify “mean” as the reduction method in a torch::nn layer or functional, we would use <code>torch::Reduction::Mean</code>. Now, <code>torch::Reduction::Mean</code> has been renamed to the shorter <code>torch::kMean</code>.</li>
</ul>
<h3>[C++] <code>torch::tensor</code> constructor is improved to match Python API behavior. (<ahref="https://github.com/pytorch/pytorch/pull/28523"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28523/hovercard">28523</a>) (<ahref="https://github.com/pytorch/pytorch/pull/29632"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29632/hovercard">29632</a>) (<ahref="https://github.com/pytorch/pytorch/pull/29066"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29066/hovercard">29066</a>)</h3>
<ul>
<li>Shape checking fixes
<ul>
<li>Example 1: previously, <code>torch::tensor({{1}, {2}})</code> produced a tensor of sizes <code>{2}</code>. Now, it produces a tensor of sizes <code>{2, 1}</code>.</li>
<li>Example 2: previously, <code>torch::tensor(1.1)</code> produced a 1-dim tensor. Now it produces a 0-dim tensor.</li>
</ul>
</li>
<li>Type inference improvements
<ul>
<li>Example 1: previously, C++ <code>torch::tensor</code> with a double (e.g. <code>torch::tensor(1.1)</code>) or a (nested) braced-init-list of doubles (e.g. <code>torch::tensor({{1.1, 2.2}})</code>) produced a tensor with dtype <code>torch::kDouble</code>. Now it produces a tensor with dtype <code>torch::get_default_dtype()</code>.</li>
<li>Example 2: previously, C++ <code>torch::tensor</code> with an integer type (e.g. <code>torch::tensor(1)</code>) or a (nested) braced-init-list of integer types (e.g. <code>torch::tensor({{1, 2}})</code>) produced a tensor with the same dtype. Now it always produces a tensor of dtype <code>torch::kLong</code> (aka. <code>int64_t</code>).</li>
<li>Example 3: previously, when a <code>TensorOptions</code> without a dtype was passed to the <code>torch::tensor</code> constructor, it always produced a tensor of dtype <code>torch::get_default_dtype()</code>. Now it produces a tensor of different dtypes based on the dtype of the braced-init-list and the default dtype.</li>
</ul>
</li>
<li>Passing a <code>std::initializer_list</code> (NOT braced-init-list) to <code>torch::tensor</code> will no longer compile, and the user should pass the equivalent braced-init-list to <code>torch::tensor</code> instead. For example, write <code>torch::tensor({1.1, 1.2})</code> instead of <code>torch::tensor(std::initializer_list<double>({1.1, 1.2}))</code>.</li>
</ul>
<h3>[C++] Some activation modules’ <code>forward</code> function now takes <code>Tensor</code> instead of <code>Tensor&</code> as input. (<ahref="https://github.com/pytorch/pytorch/pull/28501"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28501/hovercard">28501</a>)</h3>
<p>This change ensures that the affected layers can be used in a <code>torch::nn::Sequential</code> module. If your C++ model uses any of these layers, you must recompile your C++ code with the new libtorch binary.</p>
<li>Add <code>allgather_coalesced</code> API to <code>ProcessGroup</code> (<ahref="https://github.com/pytorch/pytorch/pull/28634"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28634/hovercard">28634,</a><ahref="https://github.com/pytorch/pytorch/pull/29059"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29059/hovercard">29059</a>)</li>
<li>Add <code>abort</code> API in <code>ProcessGroupGloo</code> Send/Recv Work (<ahref="https://github.com/pytorch/pytorch/pull/29928"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29928/hovercard">29928</a>).</li>
<li>Add <code>--no_python</code> flag to allow using a bash script wrapper in the launch command (<ahref="https://github.com/pytorch/pytorch/pull/29144"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29144/hovercard">29144</a>).</li>
<p><code>torch.distributed.rpc</code> is a newly introduced package. It
contains basic building blocks to run functions remotely in model
training and inference, which will be useful for scenarios like
distributed model parallel or implementing parameter server frameworks.
More specifically, it contains four pillars: RPC, Remote Reference,
Distributed Autograd, and Distributed Optimizer. Please refer to the <ahref="https://pytorch.org/docs/master/rpc.html"rel="nofollow">documentation</a> and the <ahref="https://pytorch.org/tutorials/intermediate/rpc_tutorial.html"rel="nofollow">tutorial</a> for more details.</p>
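<p>A hedged, minimal two-worker sketch of the API (worker names, the port, and world size below are arbitrary, and error handling is omitted):</p>
<pre><code>import os
import torch
import torch.distributed.rpc as rpc
import torch.multiprocessing as mp

def run(rank, world_size):
    os.environ["MASTER_ADDR"] = "localhost"
    os.environ["MASTER_PORT"] = "29500"
    rpc.init_rpc(f"worker{rank}", rank=rank, world_size=world_size)
    if rank == 0:
        # Run torch.add remotely on worker1 and block on the result.
        ret = rpc.rpc_sync("worker1", torch.add, args=(torch.ones(2), torch.ones(2)))
        # remote() returns an RRef that points at a value living on worker1.
        rref = rpc.remote("worker1", torch.add, args=(torch.ones(2), 3))
        print(ret, rref.to_here())
    rpc.shutdown()

if __name__ == "__main__":
    mp.spawn(run, args=(2,), nprocs=2, join=True)
</code></pre>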
<li>Add <code>rpc_sync</code> and <code>rpc_async</code> for builtin operators and Python user functions (<ahref="https://github.com/pytorch/pytorch/pull/23228"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/23228/hovercard">23228</a>, <ahref="https://github.com/pytorch/pytorch/pull/23569"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/23569/hovercard">23569</a>, <ahref="https://github.com/pytorch/pytorch/pull/28392"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28392/hovercard">28392</a>).</li>
<li>Add <code>remote</code> and <code>RRef</code> for builtin operators and Python user functions (<ahref="https://github.com/pytorch/pytorch/pull/25169"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/25169/hovercard">25169</a>, <ahref="https://github.com/pytorch/pytorch/pull/25499"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/25499/hovercard">25499</a>).</li>
<li>Distributed Autograd - FAST mode backward pass implementation. (<ahref="https://github.com/pytorch/pytorch/pull/27022"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27022/hovercard">27022</a>, <ahref="https://github.com/pytorch/pytorch/pull/27576"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27576/hovercard">27576</a>).</li>
<li>Integrate <code>remote</code> and <code>RRef</code> with distributed autograd (<ahref="https://github.com/pytorch/pytorch/pull/28630"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28630/hovercard">28630</a>, <ahref="https://github.com/pytorch/pytorch/pull/28656"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28656/hovercard">28656</a>).</li>
<li>Add a distributed optimizer (<ahref="https://github.com/pytorch/pytorch/pull/29304"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29304/hovercard">29304</a>, <ahref="https://github.com/pytorch/pytorch/pull/30062"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30062/hovercard">30062</a>).</li>
<li>Add python API for <code>get_gradients()</code> method to retrieve gradients from distributed autograd context. (<ahref="https://github.com/pytorch/pytorch/pull/28926"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28926/hovercard">28926</a>).</li>
<li>Support creating local <code>RRef</code>s on local values and to-self <code>remote</code> calls (<ahref="https://github.com/pytorch/pytorch/pull/28948"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28948/hovercard">28948</a>, <ahref="https://github.com/pytorch/pytorch/pull/29634"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29634/hovercard">29634</a>).</li>
<li>Support custom pickler for RPC (<ahref="https://github.com/pytorch/pytorch/pull/30185"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30185/hovercard">30185</a>).</li>
<li>Add default RPC agent options based on the backend type (<ahref="https://github.com/pytorch/pytorch/pull/30201"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30201/hovercard">30201</a>).</li>
<li>Add local <code>shutdown</code> to <code>ProcessGroup</code> agent (<ahref="https://github.com/pytorch/pytorch/pull/30330"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30330/hovercard">30330</a>).</li>
</ul>
<h2>JIT</h2>
<ul>
<li><code>script::Module</code>: implement more of the nn.Module API (<ahref="https://github.com/pytorch/pytorch/pull/28828"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28828/hovercard">28828</a>)
<ul>
<li>In particular, adds the (optionally recursive) methods that iterate over submodules, parameters, etc.</li>
<li>Adds a pybind-like <code>attr()</code> method to simplify attribute access.</li>
</ul>
</li>
<li>Add support for <code>@staticmethod</code> on <code>ScriptModule</code>s (<ahref="https://github.com/pytorch/pytorch/pull/27163"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27163/hovercard">27163</a>)</li>
<li>Support Module Containers as Iterables (<ahref="https://github.com/pytorch/pytorch/pull/26465"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26465/hovercard">26465</a>)</li>
<li>Support Iterables In List Comprehensions (<ahref="https://github.com/pytorch/pytorch/pull/26768"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26768/hovercard">26768)</a></li>
<li>Dictionaries now preserve insertion order, and <code>OrderedDict</code> is supported (<ahref="https://github.com/pytorch/pytorch/pull/26465"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26465/hovercard">26465</a>)</li>
<li>Add support for <code>hasattr()</code> (<ahref="https://github.com/pytorch/pytorch/pull/29332"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29332/hovercard">29332</a>)</li>
<li>TorchScript classes can now be callable (<ahref="https://github.com/pytorch/pytorch/pull/26743"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26743/hovercard">26743</a>)</li>
<li>Add <code>clone_instance</code> for <code>ScriptModule</code>s (<ahref="https://github.com/pytorch/pytorch/pull/30168"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30168/hovercard">30168</a>)</li>
<li>Add <code>torch.memory_format</code> support to the TorchScript (<ahref="https://github.com/pytorch/pytorch/pull/28544"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28544/hovercard">28544</a>)</li>
<li>Custom <code>forward()</code> is now allowed on container modules (<ahref="https://github.com/pytorch/pytorch/pull/28988"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28988/hovercard">28988</a>)</li>
<li>Calls to submodules are now preserved in the traced graph (<ahref="https://github.com/pytorch/pytorch/pull/29261"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29261/hovercard">29261</a>)</li>
<li>Add support for module containers to be used as iterables (<ahref="https://github.com/pytorch/pytorch/pull/28255"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28255/hovercard">28255</a>)</li>
<li>Make JIT Serialization support arbitrary std::function<> IO (<ahref="https://github.com/pytorch/pytorch/pull/28039"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28039/hovercard">28039</a>)</li>
<li>Methods and functions are no longer inlined in the serialized file format (<ahref="https://github.com/pytorch/pytorch/pull/26706"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26706/hovercard">26706</a>)</li>
</ul>
<h2>Mobile</h2>
<ul>
<li>Build level customization
<ul>
<li>Add custom build script to only include selected operators (<ahref="https://github.com/pytorch/pytorch/pull/30144"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30144/hovercard">30144</a>).</li>
<li>Dump operator names used by a script module (<ahref="https://github.com/pytorch/pytorch/pull/29374"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29374/hovercard">29374</a>, <ahref="https://github.com/pytorch/pytorch/pull/30467"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30467/hovercard">30467</a>).</li>
<li>Disable JIT optimizer in Android wrapper for mobile custom build (<ahref="https://github.com/pytorch/pytorch/pull/30285"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30285/hovercard">30285</a>).</li>
<li>Add timeout support in <code>ProcessGroupNCCL</code> (<ahref="https://github.com/pytorch/pytorch/pull/27224"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27224/hovercard">27224</a>).</li>
<li>Ensure that DDP wrapped module has parameters that require gradients (<ahref="https://github.com/pytorch/pytorch/pull/25858"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/25858/hovercard">25858</a>).</li>
<li>Making <code>torch/csrc/cuda</code> NCCL usage safe for NCCL 2.5 (<ahref="https://github.com/pytorch/pytorch/pull/29014"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29014/hovercard">29014</a>).</li>
<li>Enable <code>test_distributed</code> for ROCm but only with NCCL backend (<ahref="https://github.com/pytorch/pytorch/pull/28814"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28814/hovercard">28814</a>).</li>
</ul>
<h3>RPC Improvements</h3>
<ul>
<li>Separate out RPC to <code>rpc_sync</code> and <code>rpc_async</code> APIs (<ahref="https://github.com/pytorch/pytorch/pull/26570"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26570/hovercard">26570</a>).</li>
<li>Make the Python user function serialization format consistent with builtin operators (<ahref="https://github.com/pytorch/pytorch/pull/27136"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27136/hovercard">27136</a>).</li>
<li>Clean up distributed autograd context on all participants on exit (<ahref="https://github.com/pytorch/pytorch/pull/27951"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27951/hovercard">27951</a>).</li>
<li>Improve error handling for distributed autograd engine. (<ahref="https://github.com/pytorch/pytorch/pull/27940"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27940/hovercard">27940</a>).</li>
<li>Scope pybind11 functions to <code>torch.distributed.{autograd,rpc}</code> (<ahref="https://github.com/pytorch/pytorch/pull/27529"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27529/hovercard">27529</a>).</li>
<li>Lift <code>rpc_timeout</code> to <code>RpcAgent</code> to make it reusable for other <code>RpcAgent</code> implementations. (<ahref="https://github.com/pytorch/pytorch/pull/29341"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29341/hovercard">29341</a>).</li>
<li>Support sending message to self in <code>process_group_agent</code> (<ahref="https://github.com/pytorch/pytorch/pull/29253"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29253/hovercard">29253</a>).</li>
<li>Properly shutdown RPC even in the case of <code>clean_shutdown=False</code>. (<ahref="https://github.com/pytorch/pytorch/pull/29148"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29148/hovercard">29148</a>).</li>
<li>Ensure <code>initializedContextIds_</code> map is cleaned up appropriately in distributed autograd engine. (<ahref="https://github.com/pytorch/pytorch/pull/29787"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29787/hovercard">29787</a>).</li>
<li>Add hash and equality operators for <code>WorkerInfo</code> (<ahref="https://github.com/pytorch/pytorch/pull/29958"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29958/hovercard">29958</a>).</li>
<li>Add <code>RpcAgentOptions</code> struct type to bundle arguments for different <code>RpcAgent</code>s (<ahref="https://github.com/pytorch/pytorch/pull/29972"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29972/hovercard">29972</a>).</li>
<li>Mark timeout <code>FutureMessage</code>s and throw exceptions in <code>ProcessGroupAgent</code> (<ahref="https://github.com/pytorch/pytorch/pull/29601"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29601/hovercard">29601</a>).</li>
<li>Re-throw python remote exception when using remote reference to itself (<ahref="https://github.com/pytorch/pytorch/pull/29930"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29930/hovercard">29930</a>).</li>
<li>By default ignore <code>RRef</code> leaks during shutdown (<ahref="https://github.com/pytorch/pytorch/pull/30217"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30217/hovercard">30217</a>).</li>
</ul>
<h3>Documentation</h3>
<ul>
<li>Add Design doc for Distributed Autograd Engine (<ahref="https://github.com/pytorch/pytorch/pull/29175"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29175/hovercard">29175</a>, <ahref="https://github.com/pytorch/pytorch/pull/30068"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30068/hovercard">30068</a>, <ahref="https://github.com/pytorch/pytorch/pull/29927"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29927/hovercard">29927</a>)</li>
<li>Add Design doc for Remote Reference (<ahref="https://github.com/pytorch/pytorch/pull/30066"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30066/hovercard">30066</a>).</li>
<li>Add known worker IDs to distributed autograd context (<ahref="https://github.com/pytorch/pytorch/pull/26324"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26324/hovercard">26324</a>).</li>
<li>Minor tweaks to RPC message API (<ahref="https://github.com/pytorch/pytorch/pull/28326"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28326/hovercard">28326</a>).</li>
<li>Use <code>std::shared_ptr</code> for <code>DistAutogradContext</code> (<ahref="https://github.com/pytorch/pytorch/pull/29770"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29770/hovercard">29770</a>).</li>
<li>Mark <code>c10d::~NCCLUtils</code> as noexcept (<ahref="https://github.com/pytorch/pytorch/pull/29118"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29118/hovercard">29118</a>).</li>
</ul>
<h2>JIT</h2>
<ul>
<li>Move custom passes to last optimization step (<ahref="https://github.com/pytorch/pytorch/pull/29256"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29256/hovercard">29256</a>)</li>
<li>Represent the original Python name of a module type the same way in traced and scripted modules. (<ahref="https://github.com/pytorch/pytorch/pull/29912"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29912/hovercard">29912</a>)</li>
<li>Only print original SourceRange on highlight (<ahref="https://github.com/pytorch/pytorch/pull/29708"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29708/hovercard">29708</a>)</li>
<li>Error message and ergonomic improvements:
<ul>
<li>Show full call stack in TorchScript exception even when calls were inlined. (<ahref="https://github.com/pytorch/pytorch/pull/29911"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29911/hovercard">29911</a>)</li>
<li>Reduce error context from 10 -> 3 (<ahref="https://github.com/pytorch/pytorch/pull/26765"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26765/hovercard">26765</a>)</li>
<li>Fix error report highlight for unmatched type annotation (<ahref="https://github.com/pytorch/pytorch/pull/27195"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27195/hovercard">27195</a>)</li>
<li>Make default string arguments in schemas human readable (<ahref="https://github.com/pytorch/pytorch/pull/27088"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27088/hovercard">27088</a>)</li>
<li>Print which output didn't have dependence during trace checking. (<ahref="https://github.com/pytorch/pytorch/pull/29047"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29047/hovercard">29047</a>)</li>
</ul>
</li>
<li>Improvements to save/load and serialization performance:
<ul>
<li>Modules can now share JIT types if their implementation is the same, improving save/load performance (<ahref="https://github.com/pytorch/pytorch/pull/26666"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26666/hovercard">26666</a>)</li>
<li>Pickler: convert <code>std::stringstream</code> cases for improved performance. (<ahref="https://github.com/pytorch/pytorch/pull/29351"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29351/hovercard">29351</a>)</li>
<li>Buffer to speed Unpickler (<ahref="https://github.com/pytorch/pytorch/pull/27727"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27727/hovercard">27727</a>)</li>
<li>Buffer in Pickler to improve performance. (<ahref="https://github.com/pytorch/pytorch/pull/27720"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27720/hovercard">27720</a>)</li>
<li>In <code>torch::save()</code> avoid zip compressing small header records. (<ahref="https://github.com/pytorch/pytorch/pull/28180"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28180/hovercard">28180</a>)</li>
<li>String optimizations related to serialization. (<ahref="https://github.com/pytorch/pytorch/pull/28230"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28230/hovercard">28230</a>)</li>
</ul>
</li>
<li>Clean up serialized source format (<ahref="https://github.com/pytorch/pytorch/pull/28129"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28129/hovercard">28129</a>)</li>
<li>API for finding a common ancestor block for a pair of nodes (<ahref="https://github.com/pytorch/pytorch/pull/28864"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28864/hovercard">28864</a>)</li>
<li>Better hashing for constant pool (<ahref="https://github.com/pytorch/pytorch/pull/27733"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27733/hovercard">27733</a>)</li>
<li>Improve error messages when a method or attribute is missing (<ahref="https://github.com/pytorch/pytorch/pull/27110"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27110/hovercard">27110</a>)</li>
<li>Display original source range in <code>Node::print</code> (<ahref="https://github.com/pytorch/pytorch/pull/27524"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27524/hovercard">27524</a>)</li>
<li>Always use the closure to resolve variable names (<ahref="https://github.com/pytorch/pytorch/pull/27515"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27515/hovercard">27515</a>)</li>
</ul>
<h2>Mobile</h2>
<ul>
<li>Improve Java API / JNI
<ul>
<li>Add module method to allow explicitly destructing native part (<ahref="https://github.com/pytorch/pytorch/pull/27090"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27090/hovercard">27090</a>).</li>
<li>Add methods to write image tensor content to buffer (<ahref="https://github.com/pytorch/pytorch/pull/27359"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27359/hovercard">27359</a>).</li>
<li>Various improvements to Android API (<ahref="https://github.com/pytorch/pytorch/pull/27454"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27454/hovercard">27454</a>, <ahref="https://github.com/pytorch/pytorch/pull/27455"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27455/hovercard">27455</a>).</li>
<li>Add support for PyTorch JNI build (<ahref="https://github.com/pytorch/pytorch/pull/29412"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29412/hovercard">29412</a>, <ahref="https://github.com/pytorch/pytorch/commit/42faf961c8">42faf961c8</a>, <ahref="https://github.com/pytorch/pytorch/commit/d22f61432d">d22f61432d</a>).</li>
<li>Various fixes to PyTorch JNI (<ahref="https://github.com/pytorch/pytorch/pull/29350"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29350/hovercard">29350</a>, <ahref="https://github.com/pytorch/pytorch/pull/29861"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29861/hovercard">29861</a>, <ahref="https://github.com/pytorch/pytorch/pull/30206"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30206/hovercard">30206</a>, <ahref="https://github.com/pytorch/pytorch/pull/30207"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30207/hovercard">30207</a>).</li>
</ul>
</li>
<li>Improve support for older Android NDK
<ul>
<li>Introduce math_compat.h for older Android versions (<ahref="https://github.com/pytorch/pytorch/pull/28567"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28567/hovercard">28567</a>).</li>
<li>Define std::strtoll for older Android (<ahref="https://github.com/pytorch/pytorch/pull/28603"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28603/hovercard">28603</a>).</li>
<li>Enable full error message for mobile builds (<ahref="https://github.com/pytorch/pytorch/pull/29926"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29926/hovercard">29926</a>).</li>
<li>Rename function parameters to avoid [-Werror,-Wshadow] (<ahref="https://github.com/pytorch/pytorch/pull/30276"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30276/hovercard">30276</a>).</li>
<li>Fix exception message in Java Tensor (<ahref="https://github.com/pytorch/pytorch/pull/30776"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30776/hovercard">30776</a>).</li>
</ul>
</li>
<li>Improve support for benchmark and profiling
<ul>
<li>Add Android and iOS test app for benchmark and profiling (<ahref="https://github.com/pytorch/pytorch/pull/28405"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28405/hovercard">28405</a>, <ahref="https://github.com/pytorch/pytorch/pull/28406"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28406/hovercard">28406</a>, <ahref="https://github.com/pytorch/pytorch/pull/28469"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28469/hovercard">28469</a>, <ahref="https://github.com/pytorch/pytorch/pull/28622"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28622/hovercard">28622</a>).</li>
<li>Integration with mobile benchmark in PEP (<ahref="https://github.com/pytorch/pytorch/pull/28437"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28437/hovercard">28437</a>).</li>
<li>Subscribe for record function and if android do atrace (<ahref="https://github.com/pytorch/pytorch/pull/28708"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28708/hovercard">28708</a>).</li>
</ul>
</li>
<li>Improve build / CI
<ul>
<li>Improve Android Gradle build and publishing (<ahref="https://github.com/pytorch/pytorch/pull/26833"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26833/hovercard">26833</a>, <ahref="https://github.com/pytorch/pytorch/pull/27389"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27389/hovercard">27389</a>, <ahref="https://github.com/pytorch/pytorch/pull/29262"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29262/hovercard">29262</a>, <ahref="https://github.com/pytorch/pytorch/pull/29738"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29738/hovercard">29738</a>).</li>
<li>Misc fixes to the Android test project (<ahref="https://github.com/pytorch/pytorch/pull/27453"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27453/hovercard">27453</a>).</li>
<li>Add testing code to iOS CI jobs (<ahref="https://github.com/pytorch/pytorch/pull/27593"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27593/hovercard">27593</a>, <ahref="https://github.com/pytorch/pytorch/pull/27594"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27594/hovercard">27594</a>, <ahref="https://github.com/pytorch/pytorch/pull/27784"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27784/hovercard">27784</a>, <ahref="https://github.com/pytorch/pytorch/pull/30133"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30133/hovercard">30133</a>).</li>
<li>Misc fixes to the iOS TestApp (<ahref="https://github.com/pytorch/pytorch/pull/27591"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27591/hovercard">27591</a>, <ahref="https://github.com/pytorch/pytorch/pull/28356"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28356/hovercard">28356</a>, <ahref="https://github.com/pytorch/pytorch/pull/28809"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28809/hovercard">28809</a>, <ahref="https://github.com/pytorch/pytorch/pull/29247"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29247/hovercard">29247</a>, <ahref="https://github.com/pytorch/pytorch/pull/29962"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29962/hovercard">29962</a>, <ahref="https://github.com/pytorch/pytorch/pull/29963"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29963/hovercard">29963</a>).</li>
<li>Add support for host build to pytorch_android (<ahref="https://github.com/pytorch/pytorch/pull/27662"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27662/hovercard">27662,</a><ahref="https://github.com/pytorch/pytorch/pull/27664"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27664/hovercard">27664</a>).</li>
<li>Add mobile build CI with host toolchain (<ahref="https://github.com/pytorch/pytorch/pull/30292"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30292/hovercard">30292</a>).</li>
</ul>
</li>
</ul>
<h2>Named Tensors</h2>
<ul>
<li><code>torch.addcdiv</code>, <code>torch.addcmul</code> Added named tensor support (<ahref="https://github.com/pytorch/pytorch/pull/28975"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28975/hovercard">28975</a>).</li>
<li><code>torch.{ones,zeros,full,rand,randn}_like</code> Added named tensor support (<ahref="https://github.com/pytorch/pytorch/pull/28981"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28981/hovercard">28981</a>).</li>
<li><code>torch.cdist</code> Added named tensor support (<ahref="https://github.com/pytorch/pytorch/pull/29129"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29129/hovercard">29129</a>).</li>
<li><code>torch.equal</code> Added named tensor support (<ahref="https://github.com/pytorch/pytorch/pull/29322"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29322/hovercard">29322</a>).</li>
<li>Added named tensor support for comparison ops (<ahref="https://github.com/pytorch/pytorch/pull/27162"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27162/hovercard">27162</a>).</li>
<li><code>Tensor.align_to</code> Make method-only. (<ahref="https://github.com/pytorch/pytorch/pull/27304"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27304/hovercard">27304</a>).</li>
<li><code>Tensor.align_to</code> Accept partially named tensors (<ahref="https://github.com/pytorch/pytorch/pull/27308"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27308/hovercard">27308</a>).</li>
<li><code>torch.mean(Tensor, Dimname)</code> Fixed autograd support (<ahref="https://github.com/pytorch/pytorch/pull/29199"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29199/hovercard">29199</a>).</li>
<li><code>Tensor.unflatten</code> Fix when dim is a negative integer (<aclass="issue-link js-issue-link"data-error-text="Failed to load title"data-id="537212278"data-permission-text="Title is private"data-url="https://github.com/pytorch/pytorch/issues/31208"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31208/hovercard"href="https://github.com/pytorch/pytorch/pull/31208">#31208</a>) (<ahref="https://github.com/pytorch/pytorch/pull/31432"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/31432/hovercard">31432</a>).</li>
<li>Fix type errors in examples about Named Tensor (<ahref="https://github.com/pytorch/pytorch/pull/27828"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27828/hovercard">27828</a>).</li>
<li>bfloat16 enablement (initial) on ROCm (<ahref="https://github.com/pytorch/pytorch/pull/27719"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27719/hovercard">27719</a>)</li>
</ul>
</li>
<li>Build/CI
<ul>
<li>Upgrade to ROCm 2.9 (<ahref="https://github.com/pytorch/pytorch/pull/27417"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27417/hovercard">27417</a>)</li>
<li>Upgrade ROCm CI to Python3.6 (<ahref="https://github.com/pytorch/pytorch/pull/30119"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30119/hovercard">30119</a>, <ahref="https://github.com/pytorch/pytorch/pull/27353"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27353/hovercard">27353</a>)</li>
<li>Distribute hipify scripts as part of torch package (<ahref="https://github.com/pytorch/pytorch/pull/27425"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27425/hovercard">27425</a>)</li>
<li>Build and test gfx908 architecture (<ahref="https://github.com/pytorch/pytorch/pull/27388"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27388/hovercard">27388</a>)</li>
<li><code>torch.sort/torch.topk</code> are supported in Opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/25739"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/25739/hovercard">25739</a>)</li>
<li><code>torch.size/torch.squeeze/torch.unsqueeze/torch.mm/torch.index_fill/torch.index_copy</code> are supported in Opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/27578"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27578/hovercard">27578</a>)</li>
<li><code>torch.masked_select/torch.masked_scatter</code> are supported in Opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/25949"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/25949/hovercard">25949</a>)</li>
<li><code>torch.arange</code> is supported in Opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/26875"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26875/hovercard">26875</a>)</li>
<li><code>avg_pool, constant_pad_nd, reflection_pad, replication_pad</code> Support enhanced in Opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/28225"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28225/hovercard">28225</a>)</li>
<li><code>torch.hardtanh</code> is supported in Opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/30169"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30169/hovercard">30169</a>)</li>
<li>Enable ONNX constant folding for opset 11 (<ahref="https://github.com/pytorch/pytorch/pull/29011"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29011/hovercard">29011</a>)</li>
</ul>
<h3>Exporting More Torch Operators/Models to ONNX</h3>
<ul>
<li><code>torch.remainder</code> is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/24410"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/24410/hovercard">24410</a>)</li>
<li><code>torch.unfold</code> is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/24970"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/24970/hovercard">24970</a>)</li>
<li><code>torch.slice/torch.select</code> with negative index are enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/25273"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/25273/hovercard">25273</a>, <ahref="https://github.com/pytorch/pytorch/pull/26549"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26549/hovercard">26549</a>)</li>
<li><code>torch.ones/torch.ones_like/torch.zeros/torch.zeros_like/torch.full/torch.full_like</code> with default dtype are enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/27577"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27577/hovercard">27577</a>)</li>
<li><code>torch.unbind</code> is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/27247"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27247/hovercard">27247</a>)</li>
<li><code>torch.nn.functional.interpolate</code> export is enhanced (<ahref="https://github.com/pytorch/pytorch/pull/27179"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27179/hovercard">27179</a>, <ahref="https://github.com/pytorch/pytorch/pull/27566"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27566/hovercard">27566</a>, <ahref="https://github.com/pytorch/pytorch/pull/28560"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28560/hovercard">28560</a>, <ahref="https://github.com/pytorch/pytorch/pull/29489"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29489/hovercard">29489</a>)</li>
<li><code>torch.det</code> is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/26958"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26958/hovercard">26958</a>)</li>
<li><code>torch.group_norm</code> is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/27071"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27071/hovercard">27071</a>)</li>
<li><code>torch.meshgrid</code> is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/26037"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26037/hovercard">26037</a>)</li>
<li><code>torch.randn/torch.randn_like</code> are enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/28470"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28470/hovercard">28470</a>, <ahref="https://github.com/pytorch/pytorch/pull/29354"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29354/hovercard">29354</a>)</li>
<li><code>torch.weight_norm</code> enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/28618"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28618/hovercard">28618</a>)</li>
<li><code>torch.scalar_tensor</code> is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/28713"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28713/hovercard">28713</a>)</li>
<li><code>torch.logdet</code> is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/29767"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29767/hovercard">29767</a>)</li>
<li><code>torch.batch_norm</code> 2D with affine=False is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/29458"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29458/hovercard">29458</a>)</li>
<li><code>torch.bitshift</code> is enabled in exporter (<ahref="https://github.com/pytorch/pytorch/pull/28210"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28210/hovercard">28210</a>)</li>
</ul>
<h3>Enhancing Export/Test Infra</h3>
<ul>
<li>Use deepcopy inputs in ONNX ORT test cases (<ahref="https://github.com/pytorch/pytorch/pull/27186"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27186/hovercard">27186</a>)</li>
<li>Return NotImplemented from all binary math ops (<ahref="https://github.com/pytorch/pytorch/pull/27423"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27423/hovercard">27423</a>).</li>
<li>Disable ONNX IR v4 semantics for opset 8 or lower (<ahref="https://github.com/pytorch/pytorch/pull/28990"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28990/hovercard">28990</a>)</li>
<li>Add ONNX tests for torchvision models (<ahref="https://github.com/pytorch/pytorch/pull/30121"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30121/hovercard">30121</a>)</li>
<li>Keep output type information while exporting ONNX graph (<ahref="https://github.com/pytorch/pytorch/pull/25906"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/25906/hovercard">25906</a>)</li>
</ul>
<h2>Quantization</h2>
<ul>
<li>Quantized Tensor support copy (<ahref="https://github.com/pytorch/pytorch/pull/28612"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28612/hovercard">28612</a>).</li>
<li>Add quantized torch mean implementation (<ahref="https://github.com/pytorch/pytorch/pull/27675"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27675/hovercard">27675</a>).</li>
<li>Add quantized avg_pool2d for pytorch mobile (<ahref="https://github.com/pytorch/pytorch/pull/27631"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27631/hovercard">27631</a>).</li>
<li>PackedSequence support for quantized LSTM (<ahref="https://github.com/pytorch/pytorch/pull/29585"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29585/hovercard">29585</a>).</li>
<li>Improve legacy QuantizedLinear functions to reduce overhead (<ahref="https://github.com/pytorch/pytorch/pull/29773"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29773/hovercard">29773</a>).</li>
<li>Add support for quantized operator conversion from PT to C2 via ONNX (<ahref="https://github.com/pytorch/pytorch/pull/29694"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29694/hovercard">29694</a>).</li>
<li>Enable per-channel dynamic quantization (<ahref="https://github.com/pytorch/pytorch/pull/30122"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30122/hovercard">30122</a>).</li>
</ul>
</li>
<li>Scripting support:
<ul>
<li>Make PerChannelMinMaxObserver scriptable using <code>torch.jit.ignore</code> (<ahref="https://github.com/pytorch/pytorch/pull/29416"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29416/hovercard">29416</a>).</li>
<li>Make HistogramObserver scriptable with <code>@torch.jit.ignore</code> (<ahref="https://github.com/pytorch/pytorch/pull/27950"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27950/hovercard">27950</a>).</li>
<li>Fix tracing for dynamic quantized LSTM (<ahref="https://github.com/pytorch/pytorch/pull/29331"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29331/hovercard">29331</a>).</li>
<li>Support logging embedding for TensorBoard visualizations to generic filesystem (<ahref="https://github.com/pytorch/pytorch/pull/27716"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27716/hovercard">27716</a>)</li>
</ul>
<h2>Other Improvements</h2>
<ul>
<li><code>torch.argmax/argmin</code> Allow half type (<ahref="https://github.com/pytorch/pytorch/pull/28787"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28787/hovercard">28787</a>).</li>
<li><code>torch.cuda.memory_stats / memory_summary</code> instrumentation for the CUDA memory allocator (<ahref="https://github.com/pytorch/pytorch/pull/27361"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27361/hovercard">27361</a>). See the usage sketch after this list.</li>
<li><code>torch.set_num_threads</code> Allow calling multiple times with TBB (<ahref="https://github.com/pytorch/pytorch/pull/27190"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27190/hovercard">27190</a>).</li>
<li><code>torch.set_num_threads</code> Allow calling multiple times in parallel native (<ahref="https://github.com/pytorch/pytorch/pull/27947"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27947/hovercard">27947</a>).</li>
<li><code>torch.batch_norm_elemt</code> Add an out-variant (<ahref="https://github.com/pytorch/pytorch/pull/27621"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27621/hovercard">27621</a>).</li>
<li><code>torch.lerp</code> Implement derivative with respect to weight (<ahref="https://github.com/pytorch/pytorch/pull/28219"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28219/hovercard">28219</a>).</li>
<li><code>torch.complex32</code> Add type promotion support (<ahref="https://github.com/pytorch/pytorch/pull/27929"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27929/hovercard">27929</a>).</li>
<li><code>torch.unique</code> Support bool tensors (<ahref="https://github.com/pytorch/pytorch/pull/28374"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28374/hovercard">28374</a>).</li>
<li><code>torch.reshape</code> Improve backward for viewable geometries (<ahref="https://github.com/pytorch/pytorch/pull/28901"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28901/hovercard">28901</a>).</li>
<li><code>torch.bfloat16</code> Enabled for cuda (<ahref="https://github.com/pytorch/pytorch/pull/27259"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27259/hovercard">27259</a>).</li>
<li><code>torch.multinomial</code> Enable for torch.half (<ahref="https://github.com/pytorch/pytorch/pull/29266"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29266/hovercard">29266</a>).</li>
<li><code>nn.RNN</code> Respect the current stream in cudnn (<ahref="https://github.com/pytorch/pytorch/pull/27026"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27026/hovercard">27026</a>).</li>
<li><code>nn.Linear</code> Support 0-batch size. (<ahref="https://github.com/pytorch/pytorch/pull/27211"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27211/hovercard">27211</a>).</li>
<li><code>nn.AdaptiveAvgPool2d</code> Add support for NHWC memory format (<ahref="https://github.com/pytorch/pytorch/pull/24396"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/24396/hovercard">24396</a>).</li>
<li><code>nn.LayerNorm</code> Handle batch size of zero (<ahref="https://github.com/pytorch/pytorch/pull/28614"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28614/hovercard">28614</a>).</li>
<li><code>nn.BatchNorm</code> Add NHWC support on cudnn (<ahref="https://github.com/pytorch/pytorch/pull/23861"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/23861/hovercard">23861</a>).</li>
<li><code>nn.BatchNorm2d</code> support torch.channels_last (<ahref="https://github.com/pytorch/pytorch/pull/28982"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28982/hovercard">28982</a>).</li>
<li><code>nn.Sequential</code> Make iterable (<ahref="https://github.com/pytorch/pytorch/pull/28987"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28987/hovercard">28987</a>).</li>
<li><code>dtype.is_signed</code> Ability to differentiate signed dtypes (<ahref="https://github.com/pytorch/pytorch/pull/29511"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29511/hovercard">29511</a>).</li>
<li><code>optim.lr_scheduler.MultiplicativeLR</code> Add new multiplicative learning rate scheduler (<ahref="https://github.com/pytorch/pytorch/pull/27254"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27254/hovercard">27254</a>). See the usage sketch after this list.</li>
<li><code>cuda.comm.scatter, gather</code> Add channel-last support (<ahref="https://github.com/pytorch/pytorch/pull/28077"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28077/hovercard">28077</a>).</li>
<li><code>at::parallel_for</code> Choose number of OMP threads based on GRAIN_SIZE (<ahref="https://github.com/pytorch/pytorch/pull/26963"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26963/hovercard">26963</a>).</li>
<li>Return NotImplemented from unsupported tensor arithmetic operators (<ahref="https://github.com/pytorch/pytorch/pull/26507"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26507/hovercard">26507</a>).</li>
<li>Pickle support for sparse tensors (<ahref="https://github.com/pytorch/pytorch/pull/27062"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27062/hovercard">27062</a>).</li>
<li>Vectorized complex unary and binary op support. (<ahref="https://github.com/pytorch/pytorch/pull/26500"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26500/hovercard">26500</a>).</li>
<li>Complex support for reduce and linpack ops on CPU (<ahref="https://github.com/pytorch/pytorch/pull/27653"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27653/hovercard">27653</a>).</li>
<li>Complex support for compare and pointwise ops on CPU (<ahref="https://github.com/pytorch/pytorch/pull/28735"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28735/hovercard">28735</a>).</li>
<li>Buffer python warning to avoid deadlocks (<ahref="https://github.com/pytorch/pytorch/pull/26613"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26613/hovercard">26613</a>).</li>
<li>Use NNPACK for strided convolutions. (<ahref="https://github.com/pytorch/pytorch/pull/29084"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29084/hovercard">29084</a>).</li>
</ul>
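<p>As a companion to the <code>torch.cuda.memory_stats / memory_summary</code> item above, a minimal usage sketch (it assumes a CUDA-capable machine; the specific counter key is shown only as an example of the allocator's pool/metric naming scheme):</p>
<pre><code>import torch

if torch.cuda.is_available():
    x = torch.randn(1024, 1024, device="cuda")     # allocate something on the default GPU
    stats = torch.cuda.memory_stats()               # dict of caching-allocator counters
    print(stats["allocated_bytes.all.current"])     # bytes currently allocated
    print(torch.cuda.memory_summary())              # human-readable report of the same counters
</code></pre>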
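<p>And for the new <code>optim.lr_scheduler.MultiplicativeLR</code> scheduler above, a minimal sketch (the toy model, factor, and loop are illustrative): the scheduler multiplies each parameter group's learning rate by whatever factor the supplied lambda returns for the current epoch.</p>
<pre><code>import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Multiply the learning rate by 0.95 after every epoch.
scheduler = torch.optim.lr_scheduler.MultiplicativeLR(optimizer, lr_lambda=lambda epoch: 0.95)

for epoch in range(5):
    optimizer.step()                           # training work for the epoch would go here
    scheduler.step()
    print(optimizer.param_groups[0]["lr"])     # 0.095, 0.09025, ...
</code></pre>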
<h1>Bug Fixes</h1>
<h2>Distributed</h2>
<ul>
<li>Ensure NCCL error handling code is disabled for NCCL versions < 2.4 (<ahref="https://github.com/pytorch/pytorch/pull/27124"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27124/hovercard">27124</a>).</li>
<li>Fix segmentation fault in <code>FileStore</code> with concurrent accesses. (<ahref="https://github.com/pytorch/pytorch/pull/28812"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28812/hovercard">28812</a>).</li>
<li>Fix DDP incompatibility issue with <code>nn.MultiheadAttention</code> (<ahref="https://github.com/pytorch/pytorch/pull/26826"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26826/hovercard">26826</a>).</li>
<li>Fix pybind11 warnings in Python RPC handler implementation (<ahref="https://github.com/pytorch/pytorch/pull/27284"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27284/hovercard">27284</a>).</li>
<li>Defer creating <code>ProcessGroupAgent</code> listener thread until contexts are initialized (<ahref="https://github.com/pytorch/pytorch/pull/28013"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28013/hovercard">28013</a>).</li>
<li>Always include autograd context id in <code>rpc_*</code> / <code>remote</code> requests (<ahref="https://github.com/pytorch/pytorch/pull/29781"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29781/hovercard">29781</a>).</li>
<li>Make <code>RRefContext</code> singleton leaky, deal with module destruct order race. (<ahref="https://github.com/pytorch/pytorch/pull/30172"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30172/hovercard">30172</a>).</li>
</ul>
<h2>C++ API Bug Fixes</h2>
<ul>
<li>at::Tensor::requires_grad_ now supported (<ahref="https://github.com/pytorch/pytorch/pull/26332"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26332/hovercard">26332</a>).</li>
<li>torch::isfinite now supported (<ahref="https://github.com/pytorch/pytorch/pull/30083"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30083/hovercard">30083</a>).</li>
<li>torch::nn::modules_ordered_dict is deprecated (<ahref="https://github.com/pytorch/pytorch/pull/28774"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28774/hovercard">28774</a>).</li>
<li>Add reset_parameters to torch::nn modules (<ahref="https://github.com/pytorch/pytorch/pull/29832"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29832/hovercard">29832</a>).</li>
<li>Allow passing undefined Tensor to Module::register_parameter (<ahref="https://github.com/pytorch/pytorch/pull/27948"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27948/hovercard">27948</a>).</li>
<li>Exclude undefined tensors in the result of Module::parameters() / named_parameters() / buffers() / named_buffers() (<ahref="https://github.com/pytorch/pytorch/pull/30626"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30626/hovercard">30626</a>).</li>
<li>Include hierarchy information in C++ API loading error messages (<ahref="https://github.com/pytorch/pytorch/pull/28499"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28499/hovercard">28499</a>).</li>
<li>Fix a bug where the C++ L-BFGS optimizer did not work properly if there
are one or more registered tensors with no grad in the model (<ahref="https://github.com/pytorch/pytorch/pull/27606"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27606/hovercard">27606</a>).</li>
<li>Use c10::variant-based enums for Nonlinearity and FanMode (<ahref="https://github.com/pytorch/pytorch/pull/27933"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27933/hovercard">27933</a>). Support for <code>torch::nn::init::Nonlinearity</code> and <code>torch::nn::init::FanMode</code> will be removed in 1.5.</li>
</ul>
<h2>JIT</h2>
<ul>
<li>Make dropout properly condition on training. (<ahref="https://github.com/pytorch/pytorch/pull/29436"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29436/hovercard">29436</a>)</li>
<li>Fix aten::grad to return optional list (<ahref="https://github.com/pytorch/pytorch/pull/29577"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29577/hovercard">29577</a>)</li>
<li>Fix <code>torch.arange</code> dtype</li>
<li>Fix type sharing on loaded ScriptModules (<ahref="https://github.com/pytorch/pytorch/pull/29826"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29826/hovercard">29826</a>)</li>
<li>Fix type sharing between traced modules (<ahref="https://github.com/pytorch/pytorch/pull/29583"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29583/hovercard">29583</a>)</li>
<li>Check for mutable default parameters (<ahref="https://github.com/pytorch/pytorch/pull/29833"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29833/hovercard">29833</a>)</li>
<li>Fix tracing of autograd functions (<ahref="https://github.com/pytorch/pytorch/pull/29791"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29791/hovercard">29791</a>)</li>
<li>Check for unrolled loop in break & continue (<ahref="https://github.com/pytorch/pytorch/pull/29474"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29474/hovercard">29474</a>)</li>
<li>Fix jit outplace tracing and reapply changes to _like operators. (<ahref="https://github.com/pytorch/pytorch/pull/28839"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28839/hovercard">28839</a>)</li>
<li>Properly guard against inheritance on TorchScript classes (<ahref="https://github.com/pytorch/pytorch/pull/28407"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28407/hovercard">28407</a>)</li>
<li>Fix an issue when emitting the jit format warning about unsupported options (<ahref="https://github.com/pytorch/pytorch/pull/28616"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28616/hovercard">28616</a>)</li>
<li>Fix handling of function attributes. (<ahref="https://github.com/pytorch/pytorch/pull/28569"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28569/hovercard">28569</a>)</li>
<li>Fix pushLong() issue in pickler. (<ahref="https://github.com/pytorch/pytorch/pull/28057"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28057/hovercard">28057</a>)</li>
<li>Fix broken name mangling (<ahref="https://github.com/pytorch/pytorch/pull/27511"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27511/hovercard">27511</a>)</li>
<li>Fix segfault while printing value type for an error msg in emitListComprehension (<ahref="https://github.com/pytorch/pytorch/pull/27261"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27261/hovercard">27261</a>)</li>
<li>Fix race condition in Function::optimized_graph(). (<ahref="https://github.com/pytorch/pytorch/pull/27012"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27012/hovercard">27012</a>)</li>
<li>Sanitize module names on legacy import (<ahref="https://github.com/pytorch/pytorch/pull/27764"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27764/hovercard">27764</a>)</li>
<li>Python None should have its type inferred as NoneType (<ahref="https://github.com/pytorch/pytorch/pull/26665"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26665/hovercard">26665</a>)</li>
<li>Properly set existing attributes under recursive script (<ahref="https://github.com/pytorch/pytorch/pull/27514"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27514/hovercard">27514</a>)</li>
</ul>
<h2>Quantization</h2>
<ul>
<li>Skip copy_same_type_transpose_ for quantized tensor (<ahref="https://github.com/pytorch/pytorch/pull/29609"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29609/hovercard">29609</a>).</li>
<li>Add note that cuda quantization is not supported (<ahref="https://github.com/pytorch/pytorch/pull/27829"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27829/hovercard">27829</a>).</li>
<li>Rename _intrinsic to intrinsic (<ahref="https://github.com/pytorch/pytorch/pull/27194"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27194/hovercard">27194</a>).</li>
<li>Better error message for quantized dispatch (<ahref="https://github.com/pytorch/pytorch/pull/28635"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28635/hovercard">28635</a>).</li>
<li>Update the misleading comments for zero_points and scale in dynamic quant linear module [1/2] (<ahref="https://github.com/pytorch/pytorch/pull/28767"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28767/hovercard">28767</a>).</li>
<li>Avoid the misleading zero_point and scale [2/2] (<ahref="https://github.com/pytorch/pytorch/pull/28827"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28827/hovercard">28827</a>).</li>
<li>Add the warning message for API with linear modules (<ahref="https://github.com/pytorch/pytorch/pull/28766"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28766/hovercard">28766</a>).</li>
<li>Do not insert observers for empty sequential modules (<ahref="https://github.com/pytorch/pytorch/pull/28384"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28384/hovercard">28384</a>).</li>
<li>Fix the padding issue of quantized average pool operator (<ahref="https://github.com/pytorch/pytorch/pull/28260"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28260/hovercard">28260</a>).</li>
</ul>
<h2>Mobile</h2>
<ul>
<li>Fix deadlock issues in ThreadPool (<ahref="https://github.com/pytorch/pytorch/pull/29885"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29885/hovercard">29885</a>).</li>
<li>Disable ProfilingGraphExecutorImpl for mobile (<ahref="https://github.com/pytorch/pytorch/pull/30067"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30067/hovercard">30067</a>).</li>
</ul>
<h2>Other Bug fixes</h2>
<ul>
<li>
<p><code>torch.kthvalue</code> Fix CUDA shared memory out of bound access in findPattern (<ahref="https://github.com/pytorch/pytorch/pull/28989"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28989/hovercard">28989</a>).</p>
</li>
<li>
<p><code>torch.save</code> Fix source files not being saved (<ahref="https://github.com/pytorch/pytorch/pull/28965"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28965/hovercard">28965</a>).</p>
</li>
<li>
<p><code>torch.load</code> Fix OSError loading files larger than 2GB. (<ahref="https://github.com/pytorch/pytorch/pull/27125"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27125/hovercard">27125</a>).</p>
</li>
<li>
<p><code>torch.linspace</code> clearer error message for negative step sizes. (<ahref="https://github.com/pytorch/pytorch/pull/28274"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28274/hovercard">28274</a>).</p>
</li>
<li>
<p><code>torch.histc</code> Add range checks to avoid segfaults (<ahref="https://github.com/pytorch/pytorch/pull/27712"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27712/hovercard">27712</a>).</p>
</li>
<li>
<p><code>torch.lu</code> Fix <code>thread_local</code> issue on cpu (<ahref="https://github.com/pytorch/pytorch/pull/28546"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28546/hovercard">28546</a>).</p>
</li>
<li>
<p><code>torch.max_pool2d</code> Limit tensor size to max CUDA grid size (<ahref="https://github.com/pytorch/pytorch/pull/28931"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28931/hovercard">28931</a>).</p>
</li>
<li>
<p><code>torch.renorm</code> Fix a memory leak in CUDA renorm. (<ahref="https://github.com/pytorch/pytorch/pull/29873"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29873/hovercard">29873</a>).</p>
</li>
<li>
<p><code>torch.index_add</code> Fix bug in atomicAdd on CUDA for some dtypes (<ahref="https://github.com/pytorch/pytorch/pull/29231"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29231/hovercard">29231</a>).</p>
</li>
<li>
<p><code>torch.addmm</code> Fix handling of empty tensors (<ahref="https://github.com/pytorch/pytorch/pull/28613"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28613/hovercard">28613</a>).</p>
</li>
<li>
<p><code>nn.CTCLoss</code> Fix incorrect gradient for large target sizes (<ahref="https://github.com/pytorch/pytorch/pull/27460"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27460/hovercard">27460</a>).</p>
</li>
<li>
<p><code>nn.functional.ctc_loss</code> Fix incorrect gradient on cudnn (<ahref="https://github.com/pytorch/pytorch/pull/27039"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27039/hovercard">27039</a>).</p>
</li>
<li>
<p><code>nn.Embedding</code> Fix incorrect gradient at padding_idx in the cuda kernel (<ahref="https://github.com/pytorch/pytorch/pull/27731"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27731/hovercard">27731</a>).</p>
</li>
<li>
<p><code>nn.LayerNorm</code> Fix an illegal memory access error (<ahref="https://github.com/pytorch/pytorch/pull/28196"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28196/hovercard">28196</a>).</p>
</li>
<li>
<p><code>nn.Conv2d</code> handle zero stride (<ahref="https://github.com/pytorch/pytorch/pull/28784"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28784/hovercard">28784</a>).</p>
</li>
<li>
<p><code>nn.PoissonNLLLoss</code> Fix incorrect result with <code>full=True</code> (<ahref="https://github.com/pytorch/pytorch/pull/28637"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28637/hovercard">28637</a>).</p>
</li>
<li>
<p><code>nn.AvgPool2d</code> fix an overflow for 2^31-1 sized inputs (<ahref="https://github.com/pytorch/pytorch/pull/30793"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30793/hovercard">30793</a>).</p>
</li>
<li>
<p><code>nn.RNNBase</code> Fix an issue with use of children of RNN third party device types (<ahref="https://github.com/pytorch/pytorch/pull/28562"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28562/hovercard">28562</a>).</p>
</li>
<li>
<p><code>nn.Upsample</code> Fix a CUDA launch config failure (<ahref="https://github.com/pytorch/pytorch/pull/29016"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29016/hovercard">29016</a>).</p>
</li>
<li>
<p><code>PackedSequence.to</code> Ensure all tensors are moved (<ahref="https://github.com/pytorch/pytorch/pull/27245"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27245/hovercard">27245</a>).</p>
</li>
<li>
<p><code>EventList.total_average</code> Fix a regression caused by missing <code>__iadd__</code> (<ahref="https://github.com/pytorch/pytorch/pull/27498"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27498/hovercard">27498</a>).</p>
</li>
<li>
<p><code>Tensor.record_stream</code> Ensure stream is recorded for shifted view tensors (<ahref="https://github.com/pytorch/pytorch/pull/27371"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27371/hovercard">27371</a>).</p>
</li>
<li>
<p><code>torch.hub</code> Handle branch names containing a slash. (<ahref="https://github.com/pytorch/pytorch/pull/27960"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27960/hovercard">27960</a>).</p>
</li>
<li>
<p>Fix error handling in Magma kernels (<ahref="https://github.com/pytorch/pytorch/pull/29003"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29003/hovercard">29003</a>).</p>
</li>
<li>
<p>Fix avx for c++14 (<ahref="https://github.com/pytorch/pytorch/pull/28207"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28207/hovercard">28207</a>).</p>
</li>
<li>
<p>Fix illegal memory access thread safety issue in sparse CUDA (<ahref="https://github.com/pytorch/pytorch/pull/29426"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29426/hovercard">29426</a>).</p>
</li>
</ul>
<h1>Deprecations</h1>
<h3><strong>Python 2 support is deprecated and will not be supported in the 1.5 release.</strong></h3>
<h3><code>torch.optim</code>: <code>Scheduler.step(epoch)</code> is now deprecated; use <code>Scheduler.step()</code> instead. (<ahref="https://github.com/pytorch/pytorch/pull/26423"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/26423/hovercard">26432</a>)</h3>
<p>For example:</p>
<pre><code>>>> for epoch in range(10):
>>> optimizer.step()
>>> scheduler.step(epoch)
DeprecationWarning: The epoch parameter in `scheduler.step()` was not necessary and is being deprecated where possible. Please use `scheduler.step()` to step the scheduler. During the deprecation, if epoch is different from None, the closed form is used instead of the new chainable form, where available. Please open an issue if you are unable to replicate your use case: https://github.com/pytorch/pytorch/issues/new/choose.
</code></pre>
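<p>For reference, a minimal sketch of the recommended replacement pattern (assuming <code>optimizer</code> and <code>scheduler</code> are already constructed as in the snippet above): call <code>scheduler.step()</code> after <code>optimizer.step()</code>, without passing the epoch index.</p>
<pre><code>>>> for epoch in range(10):
>>>     optimizer.step()       # training work for the epoch would normally go here
>>>     scheduler.step()       # no epoch argument; the scheduler tracks its own counter
</code></pre>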
<h3><strong>[C++]</strong><code>Tensor::is_variable()</code> has been deprecated. As noted in the <strong>Backwards Incompatible Changes</strong>
section, the distinction between variable and non-variable has been
eliminated, so this check is no longer meaningful. Generally, <code>is_variable()</code> will now return true except in some special circumstances (see <ahref="https://github.com/pytorch/pytorch/pull/29653"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29653/hovercard">29653</a> for more details). (<ahref="https://github.com/pytorch/pytorch/pull/29653"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29653/hovercard">29653</a>)</h3>
<h3><strong>[C++]</strong><code>torch::nn::modules_ordered_dict</code> has been deprecated. It is generally no longer necessary and can just be removed. (<ahref="https://github.com/pytorch/pytorch/pull/28774/"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28774/hovercard">28774</a>)</h3>
<h3><code>torch.jit.quantized</code> API has been deprecated in favor of <code>torch.quantization.quantize_dynamic</code> (<ahref="https://github.com/pytorch/pytorch/pull/28766"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28766/hovercard">28766</a>)</h3>
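<p>A minimal sketch of moving to the replacement API (the toy model, module set, and dtype below are illustrative, not taken from the release notes):</p>
<pre><code>import torch

# A small float model; in practice this would be your trained model.
model = torch.nn.Sequential(torch.nn.Linear(16, 8), torch.nn.ReLU())

# Dynamically quantize the Linear layers' weights to int8; activations remain float
# and are quantized on the fly at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

out = quantized_model(torch.randn(4, 16))
</code></pre>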
<h1>Performance</h1>
<ul>
<li><code>torch.nn.functional.threshold, torch.nn.functional.layer_norm, torch.cdist</code> Performance of threshold (CPU), layer norm (CUDA) and cdist operations was improved (<ahref="https://github.com/pytorch/pytorch/pull/27155"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27155/hovercard">27155</a>, <ahref="https://github.com/pytorch/pytorch/pull/27634"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/27634/hovercard">27634</a>, <ahref="https://github.com/pytorch/pytorch/pull/25799"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/25799/hovercard">25799</a>)</li>
<li><code>torch.Tensor.fill_</code> Performance for half and bfloat16 types on CPU was improved (<ahref="https://github.com/pytorch/pytorch/pull/28397"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28397/hovercard">28397</a>).</li>
<li><code>torch.nn.MaxPool2d</code> implementation for channels_last format was added (<ahref="https://github.com/pytorch/pytorch/pull/24872"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/24872/hovercard">24872</a>)</li>
<li>A fast path was added that reduces the overhead of pointwise operations
relying on TensorIterator under certain conditions (contiguous inputs,
no broadcasting) (<ahref="https://github.com/pytorch/pytorch/pull/29180"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29180/hovercard">29180</a>).</li>
<li>Overhead of operations with scalars/number literals was reduced (<ahref="https://github.com/pytorch/pytorch/pull/29915"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/29915/hovercard">29915</a>).</li>
<li>In case of type promotion on the GPU, the values are converted on the fly, without explicit casting of the full tensor (<ahref="https://github.com/pytorch/pytorch/pull/30018"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/30018/hovercard">30018</a>).</li>
<li><code>reorder_dimensions</code> in TensorIterator favors output write locality,
improving overall performance when operating on discontiguous tensors (<ahref="https://github.com/pytorch/pytorch/pull/28615"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28615/hovercard">28615</a>).</li>
<li>Float pickling speed was improved (<ahref="https://github.com/pytorch/pytorch/pull/28553"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28553/hovercard">28553</a>).</li>
<li>GRAIN_SIZE for intra-op parallelization was unified between TH and ATen operations (<ahref="https://github.com/pytorch/pytorch/pull/28770"data-hovercard-type="pull_request"data-hovercard-url="/pytorch/pytorch/pull/28770/hovercard">28770</a>)</li>