CGOO

Abstract

Learned image compression aims to reduce redundancy by accurately modeling the complex signal distribution inherent in images with network parameters. However, existing practices that train models offline on an entire dataset face a limitation: the estimated distribution only approximates the general image signal distribution and fails to capture image-specific characteristics. To address this issue, we propose a cross-granularity online optimization strategy that mitigates information loss from two key aspects: statistical distribution gaps and local structural gaps. This strategy introduces an additional fitted bitstream to push the estimated signal distribution closer to the real one at both coarse-grained and fine-grained levels. For coarse-grained optimization, we relax the common bitrate constraints during gradient descent and reduce bitrate cost via adaptive QP (Quantization Parameter) selection, preventing information collapse and narrowing the statistical distribution gaps. For fine-grained optimization, a Mask-based Selective Compensation Module is designed to sparsely encode structural characteristics at low bitrates, enhancing local distribution alignment. By jointly optimizing global and local distributions, our method achieves closer alignment to real image statistics and significantly improves performance. Extensive experiments validate the superiority of our method as well as the design of our module.

Method

The process starts by encoding an input image with an end-to-end encoder to obtain a quantized latent representation. For coarse-grained optimization, the method relaxes bitrate constraints during gradient descent and employs adaptive Quantization Parameter (QP) selection via a Data-Dependent Transform (DDT) module, which generates adaptive parameters for a decoder-side, sample-dependent transformation. Fine-grained optimization is achieved through a Mask-based Selective Compensation (MSC) module, where sparse mask optimization selectively transmits beneficial compensation information by balancing the cost-benefit trade-off, so that only the most effective structural details are preserved. Both strategies work jointly to bring the estimated signal distribution closer to the real one at both global and local levels, narrowing the distribution gaps. This multi-stage optimization framework ultimately improves rate-distortion performance during inference.
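The cost-benefit idea behind the sparse mask can be illustrated with a toy example. The sketch below is not the MSC module itself: it assumes a simple per-element rule in which compensation is kept only where the squared error it removes exceeds a rate cost weighted by a Lagrange multiplier. The function name `select_compensation_mask`, the parameter `lam`, and the fixed per-element bit cost are all hypothetical and chosen for illustration only; the actual method optimizes the mask rather than thresholding it.

```python
import numpy as np


def select_compensation_mask(residual, bits_per_element, lam):
    """Toy rule (assumption): keep an element if its benefit exceeds its cost."""
    # Benefit: squared error removed if this element's residual is transmitted.
    benefit = residual ** 2
    # Cost: bits spent on the element, weighted by the trade-off factor lam.
    cost = lam * bits_per_element
    return benefit > cost


# Hypothetical residual between the original signal and its reconstruction.
residual = np.array([[0.05, -0.90],
                     [0.40, 0.02]])
# Assumed flat cost of 2 bits per transmitted element.
bits = np.full(residual.shape, 2.0)

mask = select_compensation_mask(residual, bits, lam=0.05)
# Only the masked positions carry compensation; the rest are left at zero.
compensation = np.where(mask, residual, 0.0)
print(mask)
print(compensation)
```

With these made-up numbers, only the two large residuals pass the threshold, so the compensation signal stays sparse while still covering the positions where the reconstruction error is largest.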

Figure 1. Workflow of our Cross-Granularity Optimization with both coarse and fine-grained optimization. The coarse-grained optimization aligns the real distribution with the estimated sample distribution of images on a global scale, while the fine-grained optimization further refines the alignment to minimize the gap by compensating for local details.

Figure 2. The overall structure of our proposed image compression framework with cross-granularity optimization.

Results

Figure 3. R-D performance evaluated on the CLIC Professional Validation dataset. The compared methods include state-of-the-art LIC models and conventional image codecs.

Table 1. BD-rate results and complexity on the CLIC Professional Validation dataset. BPG is used as the anchor in the BD-rate calculation. The best results are shown in bold.
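For readers unfamiliar with the metric, BD-rate summarizes the average bitrate difference between two rate-distortion curves at equal quality, with negative values meaning bitrate savings over the anchor. The sketch below follows the commonly used Bjøntegaard procedure (cubic fit of log-rate against PSNR, integrated over the overlapping PSNR range). It is an assumption that this variant matches the one used for Table 1, and the rate and PSNR values are invented for illustration, not taken from the paper.

```python
import numpy as np


def bd_rate(anchor_bpp, anchor_psnr, test_bpp, test_psnr):
    """Average bitrate change (%) of the test codec relative to the anchor."""
    anchor_psnr = np.asarray(anchor_psnr, dtype=float)
    test_psnr = np.asarray(test_psnr, dtype=float)
    log_anchor = np.log(np.asarray(anchor_bpp, dtype=float))
    log_test = np.log(np.asarray(test_bpp, dtype=float))

    # Fit log-rate as a cubic polynomial of PSNR for each curve.
    fit_anchor = np.polyfit(anchor_psnr, log_anchor, 3)
    fit_test = np.polyfit(test_psnr, log_test, 3)

    # Integrate both fits over the PSNR interval the two curves share.
    lo = max(anchor_psnr.min(), test_psnr.min())
    hi = min(anchor_psnr.max(), test_psnr.max())
    int_anchor = np.polyint(fit_anchor)
    int_test = np.polyint(fit_test)
    area_anchor = np.polyval(int_anchor, hi) - np.polyval(int_anchor, lo)
    area_test = np.polyval(int_test, hi) - np.polyval(int_test, lo)

    # Mean log-rate difference, converted back to a percentage.
    avg_diff = (area_test - area_anchor) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0


# Invented example: the test codec needs 10% fewer bits at every PSNR point.
psnr = [30.0, 33.0, 36.0, 39.0]
anchor_bpp = [0.2, 0.4, 0.8, 1.6]
test_bpp = [0.9 * r for r in anchor_bpp]
print(round(bd_rate(anchor_bpp, psnr, test_bpp, psnr), 2))
```

Because the invented test curve uses exactly 90% of the anchor's bitrate at each quality level, the function should report a BD-rate of about -10%.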

Figure 4. Visual comparisons with state-of-the-art methods on the Kodak dataset. Our method achieves an improvement in subjective quality at lower bitrates.

Citation

@inproceedings{kuang2025cross,
  title={Cross-Granularity Online Optimization with Masked Compensated Information for Learned Image Compression},
  author={Haowei Kuang and Wenhan Yang and Zongming Guo and Jiaying Liu},
  booktitle={IEEE/CVF International Conference on Computer Vision 2025},
  year={2025}
}

STRUCT Project.