Paper Notes on Mask R-CNN

Title: Mask R-CNN

Contribution:

  • extends Faster R-CNN by adding a mask branch, which enables instance segmentation and also improves accuracy
    • the mask branch is a small FCN applied to each ROI
    • a mask encodes an input object’s spatial layout
    • extracting the spatial structure of masks can be addressed naturally by the pixel-to-pixel correspondence provided by convolutions
    • the fully convolutional mask representation needs fewer parameters and is more accurate than predicting masks with fc layers
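  • shows that decoupling mask and class prediction is essential: the mask branch predicts one binary mask per class with a per-pixel sigmoid, so its loss is the average binary cross-entropy, computed only on the mask of the ground-truth class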
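  • proposes RoIAlign for predicting pixel-accurate masks
    • avoids quantizing the RoI boundaries or bins; feature values at the sampling points are computed by bilinear interpolation
    • results are insensitive to whether the sampled values are aggregated by max or average pooling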
  • presents ablation experiments and an analysis of the improvements
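
To make the RoIAlign and mask-loss points above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function names (`bilinear_sample`, `roi_align`, `mask_loss`), the nested-list feature map, the `(y1, x1, y2, x2)` RoI layout, and the default of 2x2 sampling points per bin are all assumptions made for illustration. The RoI is kept in floating-point feature-map coordinates and never rounded, and the loss reads only the mask channel of the ground-truth class.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2-D feature map (list of rows) at a real-valued (y, x)."""
    h, w = len(feature), len(feature[0])
    # Clamp the sampling point to the valid extent of the feature map.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature, roi, out_size=2, samples=2, reduce="avg"):
    """Pool a RoI into out_size x out_size bins without quantizing any coordinate.

    roi is (y1, x1, y2, x2) in feature-map coordinates (floats); this layout
    is an assumption of the sketch.
    """
    y1, x1, y2, x2 = roi
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    output = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            values = []
            for sy in range(samples):
                for sx in range(samples):
                    # Regularly spaced sampling points inside bin (i, j).
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    values.append(bilinear_sample(feature, y, x))
            # Aggregate the sampled values with max or average.
            row.append(max(values) if reduce == "max" else sum(values) / len(values))
        output.append(row)
    return output


def mask_loss(mask_logits, gt_mask, gt_class):
    """Average per-pixel binary cross-entropy on the ground-truth class's mask only.

    mask_logits[k] is the 2-D logit map for class k; other classes are ignored,
    which is what decouples mask prediction from classification.
    """
    logits = mask_logits[gt_class]
    total, count = 0.0, 0
    for logit_row, target_row in zip(logits, gt_mask):
        for z, t in zip(logit_row, target_row):
            # Numerically stable sigmoid cross-entropy from a logit z and target t.
            total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count
```

For example, `roi_align(feature, (0.5, 0.5, 2.5, 2.5))` pools a RoI whose corners fall between feature-map cells, a case where rounding the boundaries would shift the pooled region. Calling `mask_loss` with the same ground-truth class but different logits in the other class channels returns the same value, because those channels never enter the loss.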