
ScalarMad Optimization Proposal

I noticed that in some (already optimized) models, ScalarMad operations still appear right before Conv operations, even though BatchNormalization and Conv operations do get fused.
It should be possible to fold these ScalarMad operations into the following Conv as well: the multiply folds into the weights, and the constant offset folds into the bias.
I tried it in PyTorch and the numerical error is negligible.

import torch

# ScalarMad parameters: y = s * x + b, applied elementwise before the Conv
s = 42.42
b = 1.23

conv_original = torch.nn.Conv2d(3, 16, kernel_size=3, bias=True)

# Fuse s and b into the weights and biases:
#   conv(s * x + b) = (s * W) * x + (bias + b * sum(W))
# The offset b contributes b times the sum of each output channel's
# weights (exact here because there is no zero padding).
conv_fused = torch.nn.Conv2d(3, 16, kernel_size=3, bias=True)
conv_fused.weight.data = conv_original.weight.data * s
conv_fused.bias.data = conv_original.bias.data + b * conv_original.weight.data.sum(dim=(1, 2, 3))

x = torch.rand(1, 3, 112, 112)
a_original = conv_original(x * s + b)
a_fused = conv_fused(x)

print((a_original - a_fused).abs().mean().item())
# Prints: 1.2799195019397303e-06

So maybe a new ScalarMad fusion pass could save another few FLOPs.
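
A minimal sketch of what such a pass could look like, assuming the ScalarMad scale and offset are compile-time constants. The helper name fuse_scalarmad_into_conv and its signature are hypothetical, and it only handles the dense, unpadded case:

import torch

def fuse_scalarmad_into_conv(conv: torch.nn.Conv2d, s: float, b: float) -> torch.nn.Conv2d:
    """Fold y = conv(s * x + b) into a single Conv2d.

    Hypothetical helper; assumes groups == 1 and padding == 0
    (with zero padding, border outputs see fewer copies of b,
    so the additive fold is no longer exact).
    """
    fused = torch.nn.Conv2d(
        conv.in_channels, conv.out_channels,
        kernel_size=conv.kernel_size, stride=conv.stride,
        padding=conv.padding, dilation=conv.dilation, bias=True,
    )
    with torch.no_grad():
        # The multiply folds directly into the weights.
        fused.weight.copy_(conv.weight * s)
        # The constant offset passes through the convolution as
        # b * (sum of each output channel's weights).
        extra = b * conv.weight.sum(dim=(1, 2, 3))
        fused.bias.copy_(conv.bias + extra if conv.bias is not None else extra)
    return fused

Applied to the example above, fuse_scalarmad_into_conv(conv_original, s, b) reproduces conv_fused.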

However, if the number of output channels is larger than the number of input channels and the original Conv has no bias parameters, the optimization would probably make no sense: folding the additive term forces a bias vector onto the Conv, and the bias additions introduced on the output can outnumber the ScalarMad operations removed from the input.
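
For instance, a back-of-the-envelope FLOP count with the shapes from the example above (my own arithmetic, not measured):

# No-bias case: compare the ScalarMad ops removed against the
# bias adds that fusion would introduce on the Conv output.
c_in, h_in, w_in = 3, 112, 112
c_out, h_out, w_out = 16, 110, 110   # 3x3 kernel, no padding

scalarmad_flops = 2 * c_in * h_in * w_in   # one mul + one add per input element
new_bias_flops = c_out * h_out * w_out     # one add per output element

print(scalarmad_flops, new_bias_flops)     # 75264 193600

Here the introduced bias adds outnumber the ScalarMad ops removed, so fusion would be a net loss when the Conv starts out without a bias.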
