Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention

Sitong Wu; Tianyi Wu; Haoru Tan; Guodong Guo

[AAAI-22] Main Track

Abstract: Recently, Transformers have shown promising performance in various vision tasks. To reduce the quadratic computational complexity of global self-attention, various methods constrain attention to a local region to improve efficiency. Consequently, their receptive fields in a single attention layer are not large enough, resulting in insufficient context modeling. To address this issue, we propose Pale-Shaped self-Attention (PS-Attention), which performs self-attention within a pale-shaped region. Compared to global self-attention, PS-Attention significantly reduces computation and memory costs. Meanwhile, it captures richer contextual information at a computational complexity similar to that of previous local self-attention mechanisms. Based on PS-Attention, we develop a general Vision Transformer backbone with a hierarchical architecture, named Pale Transformer, which achieves 83.4%, 84.3%, and 84.9% Top-1 accuracy with model sizes of 22M, 48M, and 85M respectively for 224×224 ImageNet-1K classification, outperforming previous Vision Transformer backbones. For downstream tasks, our Pale Transformer backbone performs better than the recent state-of-the-art CSWin Transformer by a large margin on ADE20K semantic segmentation and COCO object detection & instance segmentation. The code will be released at https://github.com/BR-IDL/PaddleViT.
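
To make the cost argument in the abstract concrete, the Python sketch below counts query-key pairs for global self-attention and for attention restricted to a fixed-size region. It is only an illustration: the 56×56 token grid and the 49-token region are assumed values, the function names are hypothetical, and the sketch does not model the pale shape itself, whose geometry the abstract does not define.

```python
def global_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when every token attends to every other token."""
    tokens = height * width
    return tokens * tokens  # quadratic in the number of tokens


def restricted_attention_pairs(height: int, width: int, region_tokens: int) -> int:
    """Query-key pairs when each token attends to a region of fixed size.

    `region_tokens` is the number of tokens inside the region (for example,
    49 for a 7x7 local window). The region's shape is not modeled here.
    """
    tokens = height * width
    if not 0 < region_tokens <= tokens:
        raise ValueError("region_tokens must be in the range 1..height*width")
    return tokens * region_tokens  # linear in tokens for a fixed region size


if __name__ == "__main__":
    h, w = 56, 56   # assumed token grid, not a value taken from the paper
    region = 49     # assumed region size (a 7x7 window)
    full = global_attention_pairs(h, w)
    local = restricted_attention_pairs(h, w, region)
    print(f"global: {full} pairs, restricted: {local} pairs, ratio: {full // local}x")
```

Under these assumed sizes the restricted variant evaluates 64 times fewer pairs, which is the efficiency gain local methods buy at the price of a smaller receptive field per layer. The abstract's claim is that a pale-shaped region covers more context at a comparable cost.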

Sessions where this paper appears

Poster Session 2

Fri, February 25 12:45 AM - 2:30 AM (+00:00)

Red 3

Poster Session 9

Sun, February 27 8:45 AM - 10:30 AM (+00:00)

Red 3

Oral Session 2

Fri, February 25 2:30 AM - 3:45 AM (+00:00)

Red 3
