
Figure 3. Overall encoder-decoder architecture of the proposed SANet. Ei and Di denote the encoder and the decoder at the i-th stage, respectively. Si and Ri denote the output feature maps of the encoder and the decoder at the i-th stage, respectively. Pi is the saliency map predicted at the i-th stage, and P1 is the final prediction. G is the ground-truth saliency map. PPM: pyramid pooling module; MFA: multi-scale feature aggregation.