Abstract: Recent advancements in convolutional neural network (CNN)-based techniques for remote sensing pansharpening have markedly enhanced image quality. However, conventional convolutional modules ...
Abstract: Currently, video question answering (VideoQA) algorithms relying on video-text pretraining models employ intricate unimodal encoders and multimodal fusion Transformers, which often lead to ...