---
license: apache-2.0
datasets:
- ILSVRC/imagenet-1k
model-index:
- name: MaskBit-Tokenizer-14bits
  results:
  - task:
      type: image-generation
    dataset:
      name: ILSVRC/imagenet-1k
      type: ILSVRC/imagenet-1k
    metrics:
    - name: rFID
      type: rFID
      value: 1.37
    - name: InceptionScore
      type: InceptionScore
      value: 190.3
    - name: LPIPS
      type: LPIPS
      value: 0.286
    - name: PSNR
      type: PSNR
      value: 21.5
    - name: SSIM
      type: SSIM
      value: 0.56
    - name: CodebookUsage
      type: CodebookUsage
      value: 1.0
---

This model is the MaskBit tokenizer with a vocabulary size of 14 bits. It uses a downsampling factor of 16 and is trained on ImageNet at a resolution of 256×256.

You can find more details on the [project page](https://weber-mark.github.io/projects/maskbit.html) and in the [paper](https://arxiv.org/abs/2409.16211).
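
As a rough illustration of what the settings above imply, the sketch below works out the token grid and the implied vocabulary size. It is plain arithmetic, not the MaskBit API. It assumes square inputs, the same downsampling factor along both axes, and one 14-bit token per spatial location; the constant and function names are hypothetical.

```python
# Back-of-the-envelope arithmetic for the settings quoted in this model card.
# Assumptions (not confirmed by the card): square images, uniform downsampling
# along both axes, and one 14-bit token per spatial location.

BITS_PER_TOKEN = 14        # "vocabulary size of 14 bits"
DOWNSAMPLING_FACTOR = 16   # spatial downsampling factor
IMAGE_RESOLUTION = 256     # training resolution in pixels per side


def token_grid_side(resolution: int, factor: int) -> int:
    """Return the side length of the token grid for a square image."""
    if resolution % factor != 0:
        raise ValueError("resolution must be divisible by the downsampling factor")
    return resolution // factor


side = token_grid_side(IMAGE_RESOLUTION, DOWNSAMPLING_FACTOR)
tokens_per_image = side * side                      # tokens in the grid
vocabulary_size = 2 ** BITS_PER_TOKEN               # distinct 14-bit patterns
bits_per_image = tokens_per_image * BITS_PER_TOKEN  # total bits per image

print(f"token grid: {side}x{side} ({tokens_per_image} tokens)")
print(f"implied vocabulary size: {vocabulary_size}")
print(f"bits per image: {bits_per_image}")
```

Under these assumptions, a 256×256 image maps to a 16×16 grid of 256 tokens, each drawn from 2^14 = 16,384 possible bit patterns.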