Addition of Resize operator #510

Comments
After learning more about the Resize operator, I concluded that this is not a tensor operator, such as PyTorch's resize, but rather an image resize operation as described in the ONNX Resize op. So the solution is more specific to ONNX compatibility than to adding a missing op in Burn. The main question is how we support it. Since this op is officially in the ONNX spec and there exist models exported to ONNX with this op, we need to support it somehow. There are two possible solutions, and I went back and forth, preferring one over the other. I have settled on solution 2.

Solution 1: Allow user implementation by exposing node hooks. Here is how it might work:
use burn_import::onnx::ModelGen;

fn main() {
    // Generate the model code from the ONNX file.
    ModelGen::new()
        .input("src/model/mnist.onnx")
        .out_dir("model/")
        .with_custom("Resize", "mymodule::Resizer")
        .with_custom("STFT", "mymodule::STFTer") // more registrations
        .run_from_script();
}

This will generate the following type of code:

// Generated from ONNX "src/model/mnist.onnx" by burn-import
use burn::nn::conv::Conv2d;
use burn::nn::conv::Conv2dConfig;
use burn::record::Recorder;
use burn::{
    module::Module,
    tensor::{backend::Backend, Tensor},
};
use mymodule::Resizer;

#[derive(Module, Debug)]
pub struct Model<B: Backend> {
    conv2d1: Conv2d<B>,
    resizer: Resizer<B>,
}
impl<B: Backend> Model<B> {
    pub fn new_with(record: ModelRecord<B>) -> Self {
        let conv2d1 = Conv2dConfig::new([1, 8], [3, 3])
            .with_stride([1, 1])
            .with_dilation([1, 1])
            .with_groups(1)
            .with_bias(true)
            .init_with(record.conv2d1);
        let resizer = Resizer::from_conf(FULL_ONNX_NODE_INFO); // full ONNX attributes passed
        Self {
            conv2d1,
            resizer,
        }
    }

    pub fn new() -> Self {
        let conv2d1 = Conv2dConfig::new([1, 8], [3, 3])
            .with_stride([1, 1])
            .with_dilation([1, 1])
            .with_groups(1)
            .with_bias(true)
            .init();
        let resizer = Resizer::from_conf(FULL_ONNX_NODE_INFO); // full ONNX attributes passed
        Self {
            conv2d1,
            resizer,
        }
    }

    #[allow(clippy::let_and_return)]
    pub fn forward(&self, input1: Tensor<B, 4>) -> Tensor<B, 2> {
        let conv2d1_out1 = self.conv2d1.forward(input1);
        let resizer_out1 = self.resizer.forward(conv2d1_out1);
        resizer_out1
    }
}

Disadvantages of this solution are: …
Solution 2: Introduce …
I have found PyTorch's implementation: https://pytorch.org/docs/stable/generated/torch.nn.functional.interpolate.html. But I recommend we use an image library written in pure Rust instead of trying to write the algorithm with tensors. The downside is that it will run on the CPU, which is generally fine. Here is one such library: https://github.com/Cykooz/fast_image_resize
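For reference, here is a minimal sketch of what the op computes, using torch.nn.functional.interpolate (the modes and sizes below are illustrative; ONNX Resize maps its mode and coordinate_transformation_mode attributes onto these arguments):

import torch
import torch.nn.functional as F

x = torch.rand(1, 3, 32, 32)  # (batch, channels, height, width)

# Upscale by 2x with bilinear filtering; align_corners=False corresponds
# to ONNX Resize's "pytorch_half_pixel" coordinate_transformation_mode.
up = F.interpolate(x, scale_factor=2.0, mode='bilinear', align_corners=False)
print(up.shape)  # torch.Size([1, 3, 64, 64])

# Resize to an explicit output size, as ONNX Resize allows via its "sizes" input.
fixed = F.interpolate(x, size=(24, 48), mode='nearest')
print(fixed.shape)  # torch.Size([1, 3, 24, 48])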
Here is a bicubic algorithm using PyTorch's convolution that ChatGPT suggested (need to check for accuracy):

import torch
import torch.nn.functional as F

def precompute_bicubic_kernel(dtype, device):
    # Keys cubic kernel with a = -0.5, sampled at the 4 symmetric tap offsets.
    x = torch.linspace(-1.5, 1.5, 4, dtype=dtype, device=device).abs()
    a = -0.5
    near = ((a + 2) * x**3 - (a + 3) * x**2 + 1) * (x < 1)
    far = (a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a) * ((1 <= x) & (x < 2))
    weights = (near + far).unsqueeze(-1)
    # Outer product of the 1D weights gives the separable 2D kernel;
    # normalize so the taps sum to 1.
    kernel = weights @ weights.t()
    return kernel / kernel.sum()

def downsample_bicubic(image, scale_factor):
    # Get the device and dtype of the image.
    dtype, device = image.dtype, image.device
    channels = image.size(1)
    # Precompute the bicubic kernel and replicate it per channel so the
    # convolution can run depthwise (one group per channel).
    kernel = precompute_bicubic_kernel(dtype, device).view(1, 1, 4, 4)
    kernel = kernel.repeat(channels, 1, 1, 1)
    # Apply padding on both sides of the image. Padding size is derived from the kernel size.
    padded_image = F.pad(image, (1, 2, 1, 2), mode='reflect')
    # Convolve the image with the bicubic kernel.
    convolved = F.conv2d(padded_image, kernel, stride=1, padding=0, groups=channels)
    # Now, take every nth pixel, where n is the (integer) scale factor.
    downscaled = convolved[..., ::scale_factor, ::scale_factor]
    return downscaled

Please note, the weights should remain non-trainable. If we adopt this algorithm, we should use the functional conv2d rather than a module, so the kernel is never registered as a parameter.
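As a quick sanity check (a rough comparison only: the reflect padding differs from F.interpolate's border handling, so results will agree in the interior but not bit-exactly at the edges), the convolution approach can be compared against F.interpolate:

import torch
import torch.nn.functional as F

x = torch.rand(1, 3, 64, 64)

# Downscale 2x with the sketch above and with PyTorch's built-in bicubic.
ours = downsample_bicubic(x, scale_factor=2)
ref = F.interpolate(x, scale_factor=0.5, mode='bicubic', align_corners=False)

print(ours.shape, ref.shape)            # both torch.Size([1, 3, 32, 32])
print((ours - ref).abs().max().item())  # small in the interior, larger at borders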
@Luni-4 I realized we have a feature request for … So, here is my updated plan: let's add … We could start with …
Perfect, I completely agree with your plan!
I have not found any duplicate issue
Feature description
Addition of a Resize operator in burn
Feature motivation
I have tried some pre-trained networks and this operator is present in nearly all of them, so it would be helpful to provide a straightforward method for it, in order to simplify its usage by other developers.
(Optional) Suggest a Solution
Talking with @nathanielsimard, it might be built as a combination of the slice_assign and reshape operators, but it is still not clear whether that is the right way. Another option could be implementing it from scratch. Here is an explanation with examples.
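As a minimal illustration of building a resize from basic indexing ops (a PyTorch stand-in for the equivalent Burn tensor ops; the function name and the asymmetric coordinate mapping are assumptions, not part of the proposal), nearest-neighbor resize reduces to gathering rows and columns:

import torch

def nearest_resize(x, out_h, out_w):
    # x has shape (N, C, H, W). Map each output coordinate back to a source
    # index, as ONNX Resize does with the "asymmetric"
    # coordinate_transformation_mode and floor rounding.
    _, _, h, w = x.shape
    rows = (torch.arange(out_h) * h // out_h).clamp(max=h - 1)
    cols = (torch.arange(out_w) * w // out_w).clamp(max=w - 1)
    # Gathering rows and columns is all nearest-neighbor needs; in Burn this
    # could be expressed with index_select-style tensor ops.
    return x.index_select(2, rows).index_select(3, cols)

y = nearest_resize(torch.rand(1, 3, 4, 4), 8, 8)
print(y.shape)  # torch.Size([1, 3, 8, 8])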