bilinear_interp_op #3925

Closed. Wants to merge 1 commit.
81 changes: 81 additions & 0 deletions paddle/operators/bilinear_interp_op.cc
@@ -0,0 +1,81 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/operators/bilinear_interp_op.h"
Contributor:

Why should we implement such an operator?

If we implement a bilinear_interp operator, we will also need to implement nearest-neighbor interpolation, cubic interpolation, and even the Gaussian blur operators. So many operators would be an endless workload for us.
In my mind, we only need to port the OpenCV base data structure, cv::Mat, to an Eigen matrix, and reuse the OpenCV CPU/GPU functions as the kernel implementations.
And maybe it would be better to put those operators into an image directory.

Contributor:

bilinear_interp is an op that a user once suggested. This op is not used for data augmentation; it's used for image segmentation. I don't think nearest-neighbor interpolation, cubic interpolation, or the Gaussian blur operators are needed now, and there is no need to port the heavier OpenCV dependency for this one operator. In addition, TensorFlow also implements this op: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/resize_bilinear_op.cc

Contributor:

Okay, I see.
I just noticed that this operator's implementation runs into some problems. This operator seems to be for a user-specific need rather than for general-purpose use.
I find that the Caffe2 Catalogue does not include it at all.
I just want to remind us that we should implement the most-used operators, since our deadline is looming.

Contributor:

@dzhwinter Thank you for the reminder! Right now some necessary operators cannot be developed because LoDTensor is not ready. We'll try our best to finish LoDTensor this week. And @luotao1 has several necessary sequence operators to implement.


namespace paddle {
namespace operators {

using framework::Tensor;
Contributor:

Using using in the header file violates the Google style guide; see the details in Namespaces:

Do not use Namespace aliases at namespace scope in header files except in explicitly marked internal-only namespaces, because anything imported into a namespace in a header file becomes part of the public API exported by that file.

By the way, I think

using framework::Tensor;

is enough here.


class BilinearInterpOp : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;

protected:
void InferShape(const framework::InferShapeContext &ctx) const override {
Contributor:

The InferShape needs to be rewritten with the new interface; see the details in InferShape.

auto dim_X = ctx.Input<Tensor>("X")->dims(); // NCHW format
Contributor:

  1. Add not-null check for InputVar("X") and OutputVar("Out")
  2. dim_X -> dim_x

int out_h = ctx.GetAttr<int>("out_h");
int out_w = ctx.GetAttr<int>("out_w");
PADDLE_ENFORCE_EQ(dim_X.size(), 4, "X's dimension must be 4");
Contributor:

Maybe we need to check out_h and out_w, so that they do not go out of bounds.

ctx.Output<Tensor>("Out")->Resize({dim_X[0], dim_X[1], out_h, out_w});
}
};
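Taking the review comments above together (not-null checks on the input and output, bounds checks on out_h/out_w, and the newer InferShape interface), here is a minimal editorial sketch of how the shape inference could look. It assumes the framework::InferShapeContext* interface (HasInput/HasOutput/GetInputDim/SetOutputDim) and the PADDLE_ENFORCE_GT macro are available in this branch; it is an illustration, not the code under review.

// Sketch only: assumes the pointer-based InferShapeContext interface.
void InferShape(framework::InferShapeContext* ctx) const override {
  PADDLE_ENFORCE(ctx->HasInput("X"), "Input(X) of BilinearInterpOp should not be null.");
  PADDLE_ENFORCE(ctx->HasOutput("Out"), "Output(Out) of BilinearInterpOp should not be null.");
  auto dim_x = ctx->GetInputDim("X");  // NCHW format
  PADDLE_ENFORCE_EQ(dim_x.size(), 4, "X's dimension must be 4");
  int out_h = ctx->Attrs().Get<int>("out_h");
  int out_w = ctx->Attrs().Get<int>("out_w");
  PADDLE_ENFORCE_GT(out_h, 0, "out_h must be greater than 0");
  PADDLE_ENFORCE_GT(out_w, 0, "out_w must be greater than 0");
  ctx->SetOutputDim("Out", {dim_x[0], dim_x[1], out_h, out_w});
}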

class BilinearInterpOpMaker : public framework::OpProtoAndCheckerMaker {
public:
BilinearInterpOpMaker(framework::OpProto *proto,
framework::OpAttrChecker *op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("X",
"The input tensor of bilinear interpolation, 4-D with NCHW shape");
AddOutput("Out", "The output tensor with the same shape as X");
AddComment(R"DOC(
Bilinear interpolation is an extension of linear interpolation for
interpolating functions of two variables (e.g. H-direction and W-direction
in this op) on a rectilinear 2D grid.

The key idea is to perform linear interpolation first in one direction,
and then again in the other direction.

For details, please refer to Wikipedia:
https://en.wikipedia.org/wiki/Bilinear_interpolation
)DOC");
AddAttr<int>("out_h", "output height of bilinear interpolation op.");
AddAttr<int>("out_w", "output weight of bilinear interpolation op.");
Contributor:

Put the AddComment(R"DOC ... )DOC") block last.

}
};
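To make the two-pass idea in the DOC string concrete, here is a tiny standalone C++ illustration (editorial, not part of this PR): it interpolates one point inside a 2x2 cell by blending along W first and along H second.

#include <cstdio>

// Interpolate at fractional offsets (dy, dx) inside the 2x2 cell
//   v00 v01
//   v10 v11
// by first interpolating along W, then along H.
float BilinearAt(float v00, float v01, float v10, float v11, float dy, float dx) {
  float top = (1 - dx) * v00 + dx * v01;     // along W, top row
  float bottom = (1 - dx) * v10 + dx * v11;  // along W, bottom row
  return (1 - dy) * top + dy * bottom;       // along H
}

int main() {
  // For the cell {1, 2; 3, 4} at dy = 0.25, dx = 0.5:
  // top = 1.5, bottom = 3.5, result = 0.75 * 1.5 + 0.25 * 3.5 = 2.0
  std::printf("%g\n", BilinearAt(1, 2, 3, 4, 0.25f, 0.5f));
  return 0;
}

Interpolating along H first and W second gives the same result, which is why the DOC can describe the op as two successive linear interpolations.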

class BilinearInterpOpGrad : public framework::OperatorWithKernel {
public:
using framework::OperatorWithKernel::OperatorWithKernel;

protected:
void InferShape(const framework::InferShapeContext &ctx) const override {
PADDLE_ENFORCE_NOT_NULL(ctx.InputVar("X"), "Input(X) should not be null");
PADDLE_ENFORCE_NOT_NULL(ctx.InputVar(framework::GradVarName("Out")),
"Input(Out@GRAD) should not be null");
ctx.Output<Tensor>(framework::GradVarName("X"))
->Resize(ctx.Input<Tensor>("X")->dims());
}
};

} // namespace operators
} // namespace paddle

namespace ops = paddle::operators;
REGISTER_OP(bilinear_interp, ops::BilinearInterpOp, ops::BilinearInterpOpMaker,
bilinear_interp_grad, ops::BilinearInterpOpGrad);
REGISTER_OP_CPU_KERNEL(bilinear_interp, ops::BilinearInterpKernel<float>);
REGISTER_OP_CPU_KERNEL(bilinear_interp_grad, ops::BilinearInterpKernel<float>);
38 changes: 38 additions & 0 deletions paddle/operators/bilinear_interp_op.cu
@@ -0,0 +1,38 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#include "paddle/operators/bilinear_interp_op.h"

namespace paddle {
namespace operators {

template <typename T>
class BilinearInterpCUDAKernel : public framework::OpKernel {
public:
void Compute(const framework::ExecutionContext& context) const override {}
Contributor:

Need to add the CUDA kernel here.

};

template <typename T>
class BilinearInterpGradCUDAKernel : public framework::OpKernel {
public:
void Compute(const framework::ExecutionContext& context) const override {}
Contributor:

If the CUDA kernel will be done in the next PR, please add a TODO(luotao) in these functions.

Contributor:

Please add the TODO comment here and merge this PR ASAP.

};

} // namespace operators
} // namespace paddle

namespace ops = paddle::operators;
REGISTER_OP_GPU_KERNEL(bilinear_interp, ops::BilinearInterpCUDAKernel<float>);
REGISTER_OP_GPU_KERNEL(bilinear_interp_grad,
ops::BilinearInterpGradCUDAKernel<float>);
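The reviewers above ask for an actual CUDA kernel (or at least a TODO) in the empty Compute bodies. As a rough, hedged sketch of what a forward kernel for this NCHW layout could look like: the kernel name KeBilinearInterpFw, the launcher, and the block size are illustrative choices, not Paddle APIs, and the eventually merged implementation may differ.

#include <cuda_runtime.h>

// One thread per output element; same coordinate mapping as the CPU kernel.
template <typename T>
__global__ void KeBilinearInterpFw(const T* in, T* out, int num, int channels,
                                   int in_h, int in_w, int out_h, int out_w,
                                   T ratio_h, T ratio_w) {
  int total = num * channels * out_h * out_w;
  int tid = blockIdx.x * blockDim.x + threadIdx.x;
  if (tid >= total) return;
  // Decompose the linear thread id into (n, c, i, j) in NCHW order.
  int j = tid % out_w;
  int i = (tid / out_w) % out_h;
  int c = (tid / (out_w * out_h)) % channels;
  int n = tid / (out_w * out_h * channels);
  int h = ratio_h * i;
  int w = ratio_w * j;
  int hid = (h < in_h - 1) ? 1 : 0;  // edge guard on the last row
  int wid = (w < in_w - 1) ? 1 : 0;  // edge guard on the last column
  T h1lambda = ratio_h * i - h;
  T h2lambda = 1 - h1lambda;
  T w1lambda = ratio_w * j - w;
  T w2lambda = 1 - w1lambda;
  const T* in_pos = &in[((n * channels + c) * in_h + h) * in_w + w];
  out[tid] = h2lambda * (w2lambda * in_pos[0] + w1lambda * in_pos[wid]) +
             h1lambda * (w2lambda * in_pos[hid * in_w] +
                         w1lambda * in_pos[hid * in_w + wid]);
}

// Illustrative host-side launcher with raw device pointers.
template <typename T>
void LaunchBilinearInterpFw(const T* in, T* out, int num, int channels,
                            int in_h, int in_w, int out_h, int out_w,
                            cudaStream_t stream) {
  T ratio_h = (out_h > 1) ? static_cast<T>(in_h - 1) / (out_h - 1) : T(0);
  T ratio_w = (out_w > 1) ? static_cast<T>(in_w - 1) / (out_w - 1) : T(0);
  int total = num * channels * out_h * out_w;
  int threads = 256;
  int blocks = (total + threads - 1) / threads;
  KeBilinearInterpFw<T><<<blocks, threads, 0, stream>>>(
      in, out, num, channels, in_h, in_w, out_h, out_w, ratio_h, ratio_w);
}

A launcher like this would be called from the empty Compute above with the raw input/output pointers and the CUDA stream taken from the execution context.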
144 changes: 144 additions & 0 deletions paddle/operators/bilinear_interp_op.h
@@ -0,0 +1,144 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */

#pragma once
#include "paddle/framework/eigen.h"
#include "paddle/framework/op_registry.h"

namespace paddle {
namespace operators {

using Tensor = framework::Tensor;
template <typename T, int MajorType = Eigen::RowMajor,
typename IndexType = Eigen::DenseIndex>
using EigenVector = framework::EigenVector<T, MajorType, IndexType>;

template <typename T>
class BilinearInterpKernel : public framework::OpKernel {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
auto input_t = ctx.Input<Tensor>("X"); // float tensor
Contributor:

Prefer the same name as the parameter in OpMaker:
input_t => X. And if the return value is a pointer, auto* is better.

auto output_t = ctx.Output<Tensor>("Out"); // float tensor
auto input = input_t->data<T>();
auto output = output_t->mutable_data<T>(ctx.GetPlace());

int out_h = ctx.GetAttr<int>("out_h");
int out_w = ctx.GetAttr<int>("out_w");
int number = input_t->dims()[0];
int channels = input_t->dims()[1];
int in_h = input_t->dims()[2];
int in_w = input_t->dims()[3];

int in_hw = in_h * in_w;
int out_hw = out_h * out_w;
int in_chw = channels * in_hw;
int out_chw = channels * out_hw;

T ratio_h = (out_h > 1) ? static_cast<T>(in_h - 1) / (out_h - 1) : 0.f;
T ratio_w = (out_w > 1) ? static_cast<T>(in_w - 1) / (out_w - 1) : 0.f;
Contributor:

The unit tests need to cover:

  1. the case where out_h = 1 and out_w = 1
  2. the cases where out_h/out_w are larger and smaller than in_h/in_w


if (in_h == out_h && in_w == out_w) {
memcpy(output, input, product(input_t->dims()) * sizeof(T));
Contributor:

product(input_t->dims()) -> input_t->numel()

} else {
  for (int k = 0; k < number; ++k) {  // loop for batches
    for (int i = 0; i < out_h; ++i) {  // loop over output rows (H)
      int h = ratio_h * i;
      int hid = (h < in_h - 1) ? 1 : 0;
      T h1lambda = ratio_h * i - h;
      T h2lambda = 1 - h1lambda;

      for (int j = 0; j < out_w; ++j) {
        int w = ratio_w * j;
        int wid = (w < in_w - 1) ? 1 : 0;
        T w1lambda = ratio_w * j - w;
        T w2lambda = 1 - w1lambda;
        // calculate the four source positions for bilinear interpolation
        const T* in_pos = &input[k * in_chw + h * in_w + w];
        T* out_pos = &output[k * out_chw + i * out_w + j];

        for (int c = 0; c < channels; ++c) {  // loop for channels
          // bilinear interpolation
          out_pos[0] =
              h2lambda * (w2lambda * in_pos[0] + w1lambda * in_pos[wid]) +
              h1lambda * (w2lambda * in_pos[hid * in_w] +
                          w1lambda * in_pos[hid * in_w + wid]);
          in_pos += in_hw;
          out_pos += out_hw;
        }
      }
    }
  }
}
}
};
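As a standalone illustration (editorial, not part of this PR) of the source-coordinate mapping used above, the snippet below prints, for an assumed upscale from in_h = 3 to out_h = 5, the top source row h, the edge guard hid, and the h2lambda/h1lambda weights for every output row.

#include <cstdio>

int main() {
  int in_h = 3, out_h = 5;
  // Same formula as the kernel: map output index i back to a source coordinate.
  float ratio_h = (out_h > 1) ? static_cast<float>(in_h - 1) / (out_h - 1) : 0.f;  // 0.5 here
  for (int i = 0; i < out_h; ++i) {
    int h = ratio_h * i;               // top source row
    int hid = (h < in_h - 1) ? 1 : 0;  // 0 at the bottom edge, so we never read past row in_h - 1
    float h1lambda = ratio_h * i - h;  // weight of row h + hid
    float h2lambda = 1 - h1lambda;     // weight of row h
    std::printf("i=%d -> h=%d hid=%d w(h)=%.2f w(h+hid)=%.2f\n",
                i, h, hid, h2lambda, h1lambda);
  }
  return 0;
}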

template <typename T>
class BilinearInterpGradKernel : public framework::OpKernel {
public:
void Compute(const framework::ExecutionContext& ctx) const override {
auto d_input_t = ctx.Output<Tensor>(framework::GradVarName("X"));
Contributor:

Use auto* please.

auto d_output_t = ctx.Input<Tensor>(framework::GradVarName("Out"));
auto d_input = d_input_t->mutable_data<T>(ctx.GetPlace());
auto d_output = d_output_t->data<T>();

int out_h = ctx.GetAttr<int>("out_h");
int out_w = ctx.GetAttr<int>("out_w");
int number = d_input_t->dims()[0];
int channels = d_input_t->dims()[1];
int in_h = d_input_t->dims()[2];
int in_w = d_input_t->dims()[3];

int in_hw = in_h * in_w;
int out_hw = out_h * out_w;
int in_chw = channels * in_hw;
int out_chw = channels * out_hw;

T ratio_h = (out_h > 1) ? static_cast<T>(in_h - 1) / (out_h - 1) : 0.f;
T ratio_w = (out_w > 1) ? static_cast<T>(in_w - 1) / (out_w - 1) : 0.f;

if (in_h == out_h && in_w == out_w) {
memcpy(d_input, d_output, product(d_input_t->dims()) * sizeof(T));
} else {
  for (int k = 0; k < number; ++k) {  // loop for batches
    for (int i = 0; i < out_h; ++i) {  // loop over output rows (H)
      int h = ratio_h * i;
      int hid = (h < in_h - 1) ? 1 : 0;
      T h1lambda = ratio_h * i - h;
      T h2lambda = 1 - h1lambda;

      for (int j = 0; j < out_w; ++j) {
        int w = ratio_w * j;
        int wid = (w < in_w - 1) ? 1 : 0;
        T w1lambda = ratio_w * j - w;
        T w2lambda = 1 - w1lambda;
        T* in_pos = &d_input[k * in_chw + h * in_w + w];
        const T* out_pos = &d_output[k * out_chw + i * out_w + j];

        for (int c = 0; c < channels; ++c) {  // loop for channels
          in_pos[0] = h2lambda * w2lambda * out_pos[0];
          in_pos[wid] = h2lambda * w1lambda * out_pos[0];
          in_pos[hid * in_w] = h1lambda * w2lambda * out_pos[0];
          in_pos[hid * in_w + wid] = h1lambda * w1lambda * out_pos[0];
          in_pos += in_hw;
          out_pos += out_hw;
        }
      }
    }
  }
}
}
};
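For the backward pass above, each upstream gradient value is split over the four source pixels with the same lambda weights; because h1lambda + h2lambda = 1 and w1lambda + w2lambda = 1, the four shares sum to the original value. A small standalone check (editorial, not part of this PR) with assumed example weights:

#include <cstdio>

int main() {
  // One upstream gradient value and the fractional offsets of the output pixel
  // inside its 2x2 source cell (same lambdas as the kernel above).
  float d_out = 1.0f, h1lambda = 0.25f, w1lambda = 0.5f;
  float h2lambda = 1 - h1lambda, w2lambda = 1 - w1lambda;
  float g00 = h2lambda * w2lambda * d_out;  // top-left source pixel
  float g01 = h2lambda * w1lambda * d_out;  // top-right
  float g10 = h1lambda * w2lambda * d_out;  // bottom-left
  float g11 = h1lambda * w1lambda * d_out;  // bottom-right
  std::printf("%g %g %g %g (sum = %g)\n", g00, g01, g10, g11, g00 + g01 + g10 + g11);
  return 0;
}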

} // namespace operators
} // namespace paddle
Contributor:

Need to add a Python test.