Intelligence & Robotics

From: CMMF-Net: a generative network based on CLIP-guided multi-modal feature fusion for thermal infrared image colorization

Figure 1. The overall framework of CMMF-Net, which comprises a ViT Image_Encoder module, a CLIP Text_Encoder module, a cross-modality alignment module, and a U-Net module. CMMF-Net takes image-sentence pairs as input and outputs the colorized image. ViT: vision transformer; CLIP: contrastive language-image pretraining; CMMF-Net: a generative network based on CLIP-guided multi-modal feature fusion for thermal infrared image colorization.
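
The data flow named in the caption (image tokens and text tokens meeting in an alignment step before a decoder) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the caption does not say how the cross-modality alignment module works, so scaled dot-product cross-attention with a residual connection is used here as a common stand-in, random vectors replace the real ViT and CLIP encoder outputs, and the names `cross_attention` and `toy_pipeline` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(image_tokens, text_tokens):
    """Assumed alignment step: each image token attends over all text tokens.

    Queries are image tokens; keys and values are text tokens. No learned
    projections are used, which keeps the sketch short.
    """
    dim = len(text_tokens[0])
    fused = []
    for query in image_tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in text_tokens
        ]
        weights = softmax(scores)
        context = [
            sum(w * value[j] for w, value in zip(weights, text_tokens))
            for j in range(dim)
        ]
        # Residual connection: keep the image token and add the text context.
        fused.append([q + c for q, c in zip(query, context)])
    return fused


def toy_pipeline(num_patches=4, num_words=3, dim=8, seed=0):
    """Random stand-ins for encoder outputs, followed by the fusion step."""
    rng = random.Random(seed)
    # Stand-in for ViT Image_Encoder output: one vector per image patch.
    image_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_patches)]
    # Stand-in for CLIP Text_Encoder output: one vector per word token.
    text_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_words)]
    # In the figure, features like these would feed the U-Net module.
    return cross_attention(image_tokens, text_tokens)
```

Calling `toy_pipeline()` returns one fused vector per image patch, with the same dimensionality as the inputs. A real system would pass such features on to a decoder that produces the colorized image.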

ISSN 2770-3541 (Online)

Committee on Publication Ethics

https://members.publicationethics.org/members/intelligence-robotics

Portico

All published articles are preserved here permanently:

https://www.portico.org/publishers/oae/

© 2016-2026 OAE Publishing Inc., except certain content provided by third parties
