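No, you do not need to resize your input images to a fixed shape yourself. The TensorFlow Object Detection API has a preprocessing step that resizes all input images. The function below is defined in that preprocessing step, in inputs.py (https://github.com/tensorflow/models/blob/master/research/object_detection/inputs.py). Its image_resizer_fn argument corresponds to the field named image_resizer in the config file.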
    def transform_input_data(tensor_dict,
                             model_preprocess_fn,
                             image_resizer_fn,
                             num_classes,
                             data_augmentation_fn=None,
                             merge_multiple_boxes=False,
                             retain_original_image=False,
                             use_multiclass_scores=False,
                             use_bfloat16=False):
      """A single function that is responsible for all input data transformations.

      Data transformation functions are applied in the following order.
      1. If key fields.InputDataFields.image_additional_channels is present in
         tensor_dict, the additional channels will be merged into
         fields.InputDataFields.image.
      2. data_augmentation_fn (optional): applied on tensor_dict.
      3. model_preprocess_fn: applied only on image tensor in tensor_dict.
      4. image_resizer_fn: applied on original image and instance mask tensor in
         tensor_dict.
      5. one_hot_encoding: applied to classes tensor in tensor_dict.
      6. merge_multiple_boxes (optional): when groundtruth boxes are exactly the
         same they can be merged into a single box with an associated k-hot class
         label.
According to the proto file (https://github.com/tensorflow/models/blob/master/research/object_detection/protos/image_resizer.proto), you can choose among 4 different image resizers, namely
- keep_aspect_ratio_resizer
- fixed_shape_resizer
- identity_resizer
- conditional_shape_resizer
Here (https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/faster_rcnn_resnet101_pets.config) is a sample config file for the faster_rcnn_resnet101_pets model, in which all images are resized with min_dimension=600 and max_dimension=1024:
    model {
      faster_rcnn {
        num_classes: 37
        image_resizer {
          keep_aspect_ratio_resizer {
            min_dimension: 600
            max_dimension: 1024
          }
        }
        feature_extractor {
          type: 'faster_rcnn_resnet101'
          first_stage_features_stride: 16
        }
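To make the keep_aspect_ratio_resizer settings above concrete, here is a minimal pure-Python sketch of the size arithmetic as I understand it: scale the image so its shorter side reaches min_dimension, unless that would push the longer side past max_dimension, in which case scale the longer side to max_dimension instead. The function name keep_aspect_ratio_size is hypothetical and this is not the library's own code; rounding at edge cases may differ from the real implementation.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Return the (height, width) after an aspect-preserving resize.

    Illustrative re-implementation only (assumption), not the API's code.
    """
    # First try to scale so that the shorter side equals min_dimension.
    scale = min_dimension / min(height, width)
    # If the longer side would then exceed max_dimension, scale so that
    # the longer side equals max_dimension instead.
    if max(height, width) * scale > max_dimension:
        scale = max_dimension / max(height, width)
    return round(height * scale), round(width * scale)


if __name__ == "__main__":
    for shape in [(480, 640), (500, 2000), (1200, 1600)]:
        print(shape, "->", keep_aspect_ratio_size(*shape))
```

Note that for a very elongated image such as 500x2000, the max_dimension cap wins and the shorter side ends up well below min_dimension.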
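In fact, the shape of the resized images has a large impact on both detection speed and accuracy. Although there is no specific requirement on the input image size, it is better for the smaller dimension of every image to be larger than a reasonable value so that the convolution operations work properly.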
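As a rough illustration of why the resized shape matters, the sketch below compares two resized shapes by pixel count (a crude proxy for convolutional compute) and by the approximate feature-map grid at a total stride of 16, the first_stage_features_stride value in the sample config. The helper names feature_grid and relative_pixel_cost are hypothetical, and the exact grid size in a real model depends on padding, so treat the numbers as estimates.

```python
import math


def feature_grid(height, width, stride=16):
    """Approximate feature-map size for a given total stride (assumes ceil)."""
    return math.ceil(height / stride), math.ceil(width / stride)


def relative_pixel_cost(shape, reference):
    """Ratio of pixel counts, a rough proxy for convolutional compute."""
    return (shape[0] * shape[1]) / (reference[0] * reference[1])


if __name__ == "__main__":
    small, large = (300, 400), (600, 800)
    print("grid for", small, "is about", feature_grid(*small))
    print("grid for", large, "is about", feature_grid(*large))
    print("pixel-count ratio:", relative_pixel_cost(large, small))
```

Doubling both sides quadruples the number of pixels, which is why larger resizer settings tend to slow detection, while a very small input leaves only a coarse grid of feature locations for the detector to work with.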