Smallest fully transparent PNG image

During the development of my Libravatar implementation I became interested in the PNG file format, which led me down a rabbit hole: the quest for the smallest possible fully transparent image. Here is the story of why and how.

First, some context. Libravatar is a project initiated by François Marier and now maintained by Oliver Falk which mimics Gravatar but provides a decentralized approach based on Free Software. Its API is very similar to Gravatar’s.

What is Gravatar then? It is a centralised service providing various websites the ability to display globally recognised avatars. This works by letting users register an account on Gravatar, list the email addresses they use and upload an avatar. Websites then generate specially crafted URLs from the email addresses provided by their own users and display the same avatar everywhere, saving everyone the burden of uploading their picture again and again on every website. But what happens if a user did not register on Gravatar beforehand? Without configuration a dull, default image is returned (the Gravatar or Libravatar logo actually). But a special feature allows website administrators to tune this behaviour and instead have Gravatar return an HTTP 404 error code, a randomly generated image such as RoboHash or Identicon, etc. One of these options is “blank”, which simply returns a fully transparent image.

This is the very file that piqued my interest.

So, let’s start and download it:

$ curl "https://secure.gravatar.com/avatar/invalid?d=blank" > blank.png

Here it is, in all its transparent glory (with an added border for emphasis):

What’s inside?

$ file blank.png
blank.png: PNG image data, 80 x 80, 8-bit/color RGB, non-interlaced

The command file(1) describes it as an “8-bit/color RGB” file. What does that mean?

Short introduction to the PNG file format

According to the specification a PNG image is made of a specific signature at the beginning of the file and then of a series of “chunks”, blocks of data describing various properties. Each chunk is made of three to four parts: the length of the chunk’s data encoded as a four-byte unsigned integer, the chunk type encoded as four ASCII uppercase and lowercase letters, the data if the length was not zero and then a verification value named CRC calculated from the chunk’s type and data and encoded as a four-byte unsigned integer.

Example: a four part chunk (each ASCII square represents a byte):

 <-   Length  -> <-    Type   -> <-   Data    -> <-    CRC    ->
+---+---+---+---+---+---+---+---+---+--   --+---+---+---+---+---+
| 0 | 0 | 0 | b | I | D | A | T | . | ..... | . | 4 | 1 | 8 | a |
+---+---+---+---+---+---+---+---+---+--   --+---+---+---+---+---+

Example: a three part chunk:

 <-   Length  -> <-    Type   -> <-    CRC    ->
+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | 0 | I | E | N | D | a | 4 | 6 | 8 |
+---+---+---+---+---+---+---+---+---+---+---+---+
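
The chunk layout described above can be sketched in a few lines of Python. This is only an illustration, not the code of any tool mentioned in this article: the helper name make_chunk is made up for the example, and the CRC comes from the standard zlib module, which implements the same CRC-32 as PNG.

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Serialise one PNG chunk: length, type, data (if any), CRC."""
    if len(chunk_type) != 4:
        raise ValueError("a chunk type is exactly four bytes")
    # The length field counts the data bytes only.
    length = struct.pack(">I", len(data))
    # The CRC covers the type and the data, but not the length.
    crc = struct.pack(">I", zlib.crc32(chunk_type + data))
    return length + chunk_type + data + crc


# A three-part chunk: IEND has no data, so it is always the same 12 bytes.
iend = make_chunk(b"IEND")
print(iend.hex(" "))  # 00 00 00 00 49 45 4e 44 ae 42 60 82
```

Since the IEND chunk never carries data, its CRC is a constant, which makes it a handy sanity check for any chunk writer.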

There are some twenty-six or so chunk types defined in the specification and its extensions, but only three are absolutely mandatory:

IHDR: image header ;
IDAT: image data ;
IEND: image end.

The image header chunk describes the most basic properties of a PNG file:

width and height in pixels ;
bit depth: the number of bits used to store each sample (or each palette index), not the whole pixel ;
colour type: various possibilities here, from greyscale to true colour with alpha channel ;
compression and filter methods ;
interlace method.

Its data part has a fixed size of 13 bytes, meaning the whole chunk is 25 bytes long.

Example: artist’s rendering of the data part of an IHDR chunk.

                                       Bit depth
                                      /   Colour type
                                     /   /   Compression method
                                    /   /   /   Filter method
 <-   Width   -> <-   Height  ->   /   /   /   /  ,Interlace method
+---+---+---+---+---+---+---+---+---+---+---+---+---+
| 0 | 0 | 0 | f | 0 | 0 | 0 | f | 8 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+---+---+---+---+---+---+---+---+
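
As a rough sketch, the 13 data bytes map directly onto a struct format string: two big-endian 32-bit integers followed by five single bytes. The field names and helper functions below are invented for the example; the values are the ones file(1) reported for blank.png (80 x 80, 8-bit, RGB, which is colour type 2).

```python
import struct

# Two big-endian unsigned 32-bit integers, then five unsigned bytes.
IHDR_FORMAT = ">IIBBBBB"
FIELDS = ("width", "height", "bit_depth", "colour_type",
          "compression", "filter", "interlace")


def pack_ihdr(width, height, bit_depth, colour_type):
    """Build the data part of an IHDR chunk (no length, type or CRC)."""
    # Compression, filter and interlace methods are all left at 0.
    return struct.pack(IHDR_FORMAT, width, height, bit_depth, colour_type, 0, 0, 0)


def unpack_ihdr(data):
    """Decode the data part of an IHDR chunk into a dictionary."""
    return dict(zip(FIELDS, struct.unpack(IHDR_FORMAT, data)))


data = pack_ihdr(80, 80, 8, 2)
header = unpack_ihdr(data)
print(len(data), data.hex(" "))
print(header)
```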

The image data chunk contains the actual image data, stored as a series of rows named “scanlines”. How each pixel is stored in a scanline depends on various values provided in the IHDR chunk. It is possible to have many IDAT chunks in a file, each of a different size.

Lastly the IEND chunk marks the end of the file: everything after it is considered garbage and is not read by image viewers. It has a zero-length data part, meaning the whole chunk is only 12 bytes long.

The presence of this special end-of-file marker is actually interesting as it can be exploited to append other files after a PNG image, most notably tar archives. I heavily (ab)used this mechanism to upload various documents to free image hosters back in the day.

Example of a file that is both a valid image and a valid tar archive:

$ tar -cf lena.tar lena.png
$ cat blank.png lena.tar > combined.png
$ file combined.png
combined.png: PNG image data, 80 x 80, 8-bit/color RGB, non-interlaced
$ tar -tf combined.png
tar: Cannot identify format. Searching...
lena.png

Here it is: Secret image

But let’s get back to the transparent image generated by Gravatar.

PNG file inspection

I am not going to draw the whole file in little ASCII-art boxes so let’s use xxd(1) for some low level inspection.

$ xxd -g 1 blank.png
00000000: 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52  .PNG........IHDR
00000010: 00 00 00 50 00 00 00 50 08 02 00 00 00 01 73 65  ...P...P......se
00000020: fa 00 00 00 06 74 52 4e 53 00 00 00 00 00 00 6e  .....tRNS......n
00000030: a6 07 91 00 00 00 09 70 48 59 73 00 00 0e c4 00  .......pHYs.....
00000040: 00 0e c4 01 95 2b 0e 1b 00 00 00 2a 49 44 41 54  .....+.....*IDAT
00000050: 78 9c ed c1 01 0d 00 00 00 c2 a0 f7 4f 6d 0e 37  x...........Om.7
00000060: a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000070: 00 00 00 e0 d7 00 4b 50 00 01 8f 26 91 39 00 00  ......KP...&.9..
00000080: 00 00 49 45 4e 44 ae 42 60 82                    ..IEND.B`.

Breaking the dump down chunk by chunk, following the specification, makes it easier to understand:

# This is the PNG signature
89 50 4e 47 0d 0a 1a 0a
# First chunk length: 13 bytes
00 00 00 0d
# First chunk type: IHDR
49 48 44 52
# First chunk data
00 00 00 50 00 00 00 50 08 02 00 00 00
# First chunk checksum
01 73 65 fa
# Second chunk length: 6 bytes
00 00 00 06
# Second chunk type: tRNS
74 52 4e 53
# Second chunk data
00 00 00 00 00 00
# Second chunk checksum
6e a6 07 91
# Third chunk length: 9 bytes
00 00 00 09
# Third chunk type: pHYs
70 48 59 73
# Third chunk data
00 00 0e c4 00 00 0e c4 01
# Third chunk checksum
95 2b 0e 1b
# Fourth chunk length: 42 bytes
00 00 00 2a
# Fourth chunk type: IDAT
49 44 41 54
# Fourth chunk data
78 9c ed c1 01 0d 00 00 00 c2 a0 f7 4f 6d 0e 37
a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 e0 d7 00 4b 50 00 01
# Fourth chunk checksum
8f 26 91 39
# Fifth chunk length: 0 bytes
00 00 00 00
# Fifth chunk type: IEND
49 45 4e 44
# Fifth chunk checksum
ae 42 60 82

To make the analysis less tedious I built a custom tool named pnginfo(1). It takes a file on standard input or with the command line option -f and can list the chunks in a PNG file (-l) or display detailed information about a particular chunk (-c).

Let’s inspect the blank.png file:

$ pnginfo -l -f blank.png
IHDR
tRNS
pHYs
IDAT
IEND
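
For readers without pnginfo at hand, a similar listing can be approximated in Python. The sketch below is not pnginfo's implementation, only an assumption of what such a lister has to do: skip the signature, then hop from chunk to chunk using the length field. The bytes are copied from the xxd dump above; iter_chunks is a name made up for this example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# blank.png, byte for byte as printed by xxd above (138 bytes).
BLANK_PNG = bytes.fromhex("""
    89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52
    00 00 00 50 00 00 00 50 08 02 00 00 00 01 73 65
    fa 00 00 00 06 74 52 4e 53 00 00 00 00 00 00 6e
    a6 07 91 00 00 00 09 70 48 59 73 00 00 0e c4 00
    00 0e c4 01 95 2b 0e 1b 00 00 00 2a 49 44 41 54
    78 9c ed c1 01 0d 00 00 00 c2 a0 f7 4f 6d 0e 37
    a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 e0 d7 00 4b 50 00 01 8f 26 91 39 00 00
    00 00 49 45 4e 44 ae 42 60 82
""")


def iter_chunks(png):
    """Yield (type, data, crc_ok) for every chunk up to and including IEND."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    offset = 8
    while offset < len(png):
        (length,) = struct.unpack_from(">I", png, offset)
        chunk_type = png[offset + 4:offset + 8]
        data = png[offset + 8:offset + 8 + length]
        (crc,) = struct.unpack_from(">I", png, offset + 8 + length)
        yield chunk_type.decode("ascii"), data, zlib.crc32(chunk_type + data) == crc
        offset += 12 + length  # 4 (length) + 4 (type) + data + 4 (CRC)
        if chunk_type == b"IEND":
            break  # anything after IEND is not part of the image


chunks = list(iter_chunks(BLANK_PNG))
for name, data, crc_ok in chunks:
    print(name, len(data), "crc ok" if crc_ok else "crc mismatch")
```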

So this file contains the usual IHDR, IDAT and IEND chunks already described and two slightly less common ones: tRNS and pHYs.

Let’s start digging with the latter. A pHYs chunk is optional and denotes the intended pixel size or aspect ratio of the image, expressed as pixels per unit (generally the metre). In the context of Gravatar, images are perfect squares and probably not intended for printing or for display on strange, non-square pixel configurations.

Now about tRNS. Depending on the colour type in IHDR, this chunk holds either a single colour value or a list of alpha values (aka transparency levels). More specifically, a greyscale or truecolour image can designate one colour as fully transparent, every other colour remaining fully opaque. For an indexed-colour image, also known as palette-based, the tRNS chunk instead contains one alpha value per palette entry, ranging from 0 (fully transparent) to 255 (fully opaque).

So which one is used in the blank.png file?

$ pnginfo -c IHDR -f blank.png
IHDR: width: 80
IHDR: height: 80
IHDR: bitdepth: 8
IHDR: colourtype: truecolour
IHDR: compression: deflate
IHDR: filter: adaptive
IHDR: interlace method: standard

Truecolour. This means each pixel in this file is stored as three whole bytes in the IDAT chunk, one each for the red, green and blue levels, and the colour to be treated as transparent is then encoded on six bytes (three two-byte samples) in the tRNS chunk. All that for a file with only one colour, which is moreover fully transparent. Seems wasteful to me.
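
To see what those three bytes per pixel amount to, the IDAT data from the dump above can be inflated with Python's zlib module. This is a small check rather than a decoder: it assumes the hex bytes were transcribed correctly and relies on the fact that every scanline is prefixed with one filter-type byte.

```python
import zlib

WIDTH = HEIGHT = 80
BYTES_PER_PIXEL = 3  # truecolour at 8 bits per sample: red, green, blue

# Data part of the IDAT chunk, copied from the xxd dump above.
idat = bytes.fromhex("""
    78 9c ed c1 01 0d 00 00 00 c2 a0 f7 4f 6d 0e 37
    a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 e0 d7 00 4b 50 00 01
""")

raw = zlib.decompress(idat)

# Each of the 80 scanlines is one filter byte followed by 80 RGB triplets.
expected_size = HEIGHT * (1 + WIDTH * BYTES_PER_PIXEL)

print(len(idat), "compressed bytes inflate to", len(raw), "bytes")
print("only zeroes:", raw == bytes(expected_size))
```

In other words the 42 bytes of IDAT data stand for more than 19 kB of zeroes, all of them describing the single colour that tRNS then declares transparent.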

What about the total file size?

$ ls -gnh blank.png
-rw-r--r--  1 1000   138B Apr 15 12:19 blank.png

138 bytes. Is it big? Not really. Can it be made smaller? Definitely yes.

In order to set a goal we can start by calculating the minimal, incompressible size using the mandatory elements of a PNG file. First there are the eight bytes in the PNG signature, twenty-five bytes for the IHDR chunk, at least twelve bytes for IDAT and twelve bytes again at the end with the IEND chunk.

So the goal is now clear: reducing the file size to be as close as possible to 57 bytes. There are 81 bytes to shave.
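
The arithmetic behind that goal, spelled out as a tiny Python sketch (the variable names are only illustrative). Note that the 12 bytes counted for IDAT assume an empty data part, so 57 is a floor that a conforming file cannot actually reach.

```python
SIGNATURE = 8
CHUNK_OVERHEAD = 4 + 4 + 4  # length + type + CRC, present in every chunk

ihdr = CHUNK_OVERHEAD + 13  # fixed-size header data
idat = CHUNK_OVERHEAD + 0   # lower bound only: real image data is never empty
iend = CHUNK_OVERHEAD + 0   # IEND never carries data

floor = SIGNATURE + ihdr + idat + iend
original = 138

print(floor, original - floor)  # 57 81
```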

Remove optional chunks

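According to section 11.3.5.3 of the PNG specification, the pHYs chunk contains nine bytes of data: the number of pixels per unit on the X axis (four bytes), the same on the Y axis (four bytes) and a unit specifier (one byte).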

Let’s analyse this chunk with pnginfo(1):

$ pnginfo -c pHYs -f blank.png
pHYs: pixel per unit, X axis: 3780
pHYs: pixel per unit, Y axis: 3780
pHYs: unit specifier: metre

In this case we have a desired density of 3780 pixels per metre horizontally and vertically, which is equivalent to 96 DPI. I seriously doubt anyone will ever print this picture, so let’s strip this chunk entirely.
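
The conversion can be checked by hand or with a short Python sketch. The nine bytes are the pHYs data from the dump above; the variable names are made up for the example.

```python
import struct

# Data part of the pHYs chunk, copied from the dump above.
phys = bytes.fromhex("00 00 0e c4 00 00 0e c4 01")

# Two big-endian 32-bit integers (X and Y) and a one-byte unit specifier.
x_ppu, y_ppu, unit = struct.unpack(">IIB", phys)

METRES_PER_INCH = 0.0254
dpi = x_ppu * METRES_PER_INCH  # only meaningful when unit == 1 (metre)

print(x_ppu, y_ppu, unit, round(dpi, 2))  # 3780 3780 1 96.01
```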

The pHYs chunk has a fixed size of twenty-one bytes: four for the length, four for the type, four for the checksum and nine for the data. Removing it reduces the file size to 117 bytes.

Here it is:

$ file blank-without-pHYs.png 
blank-without-pHYs.png: PNG image data, 80 x 80, 8-bit/color RGB, non-interlaced
$ ls -gnh blank-without-pHYs.png 
-rw-r--r--  1 1000   117B Apr 17 10:46 blank-without-pHYs.png
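
The article does not show how the chunk was removed. One possible approach, sketched here in Python under that caveat, is to walk the chunks and copy everything except the unwanted types. The helper names strip_chunks and make_chunk are invented for the example, and the demonstration runs on a synthetic 1x1 image rather than on blank.png.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data=b""):
    """Serialise one chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def strip_chunks(png, unwanted):
    """Return a copy of png without the chunk types listed in unwanted."""
    out = bytearray(png[:8])  # keep the signature
    offset = 8
    while offset < len(png):
        (length,) = struct.unpack_from(">I", png, offset)
        end = offset + 12 + length
        if png[offset + 4:offset + 8] not in unwanted:
            out += png[offset:end]  # copied verbatim, so the CRC stays valid
        offset = end
    return bytes(out)


# Synthetic 1x1 greyscale image carrying the same pHYs values as blank.png.
sample = (PNG_SIGNATURE
          + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
          + make_chunk(b"pHYs", struct.pack(">IIB", 3780, 3780, 1))
          + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
          + make_chunk(b"IEND"))

stripped = strip_chunks(sample, {b"pHYs"})
print(len(sample) - len(stripped))  # 21
```

Dropping pHYs is harmless because its type starts with a lowercase letter, which marks a chunk as ancillary: decoders can render the image without it.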

Switch to a palette

For now the pixels are stored as RGB triplets in the IDAT chunk, but all three bytes are zeroes. Section 4.3.3 of the PNG specification describes the indexed-colour model, which is probably more efficient in our case.

The PLTE chunk allows us to create a palette of every colour used in the image, so its size is dynamic and depends directly on the number of entries. In the IDAT chunk, each RGB triplet is replaced by a single byte containing a number: an index into the palette.

Example: an imaginary representation of a 5x5 pixel image.

IDAT:                    PLTE:
+---+---+---+---+---+
| 1 | 1 | 1 | 1 | 1 |    [0]: White
+---+---+---+---+---+    [1]: Black
| 1 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+
| 1 | 1 | 1 | 0 | 0 |
+---+---+---+---+---+
| 1 | 0 | 0 | 0 | 0 |
+---+---+---+---+---+
| 1 | 1 | 1 | 1 | 1 |
+---+---+---+---+---+

Here the IDAT chunk only holds simple values like 0 or 1. In order to draw the correct colour a PNG viewer only needs to look in the PLTE table to find out what 0 or 1 stands for. In this case the picture represents a capital letter “E” in black on a white background.
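
The lookup a viewer performs can be mimicked with a toy Python example. It works on plain lists rather than on real chunk data, and the render function is made up for the illustration.

```python
# PLTE: index -> RGB triplet
PALETTE = [(255, 255, 255),  # [0]: white
           (0, 0, 0)]        # [1]: black

# IDAT: one palette index per pixel, same grid as the drawing above.
INDICES = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]


def render(indices, palette):
    """Resolve every index through the palette; draw black as '#', the rest as '.'."""
    rows = []
    for row in indices:
        pixels = [palette[index] for index in row]
        rows.append("".join("#" if rgb == (0, 0, 0) else "." for rgb in pixels))
    return rows


rows = render(INDICES, PALETTE)
print("\n".join(rows))
```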

For the purpose of the blank.png file only one entry is needed in the palette, so the exact size of the PLTE chunk is 15 bytes: 12 as usual for the type, length and CRC plus one RGB triplet for the colour. Additionally I need to throw in the tRNS chunk to indicate the transparency level of this single colour.

Again I wrote a small utility to help me in this task. As it can use various strategies to write a simple blank PNG file I named it pngblank (please refer to its man page for more information).

$ pngblank -p 80 > blank-with-PLTE.png
$ file blank-with-PLTE.png
blank-with-PLTE.png: PNG image data, 80 x 80, 8-bit colormap, non-interlaced
$ pnginfo -l -f blank-with-PLTE.png
IHDR
PLTE
tRNS
IDAT
IEND
$ pnginfo -c PLTE -f blank-with-PLTE.png
PLTE: 1 entries
PLTE: entry   0: 0x000000
$ pnginfo -c tRNS -f blank-with-PLTE.png
tRNS: palette index 0: 0

Here it is:

The reward is mediocre, as I only gained 3 bytes compared to the previous state. The reason is compression: in theory the data size was divided by 3, but a series of zeroes compresses very efficiently whatever its length. Plus I also had to add a new chunk, weighing 15 bytes. After rereading the specification I decided to explore another path: bit depth.

$ ls -gnh blank-with-PLTE.png
-rw-r--r--  1 1000   114B Mar 26  2021 blank-with-PLTE.png
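
The compression argument can be observed with a quick Python experiment. It only compares two buffers of zeroes, one sized like the truecolour scanlines and one like the indexed ones; the exact compressed sizes depend on the zlib build in use, so none are quoted here.

```python
import zlib

SIZE = 80

# Every scanline carries one filter byte before its pixels.
truecolour = bytes(SIZE * (1 + SIZE * 3))  # three bytes per pixel
indexed = bytes(SIZE * (1 + SIZE))         # one palette index per pixel

sizes = {}
for name, raw in (("truecolour", truecolour), ("indexed", indexed)):
    packed = zlib.compress(raw, 9)
    sizes[name] = (len(raw), len(packed))
    print(f"{name:10} raw={len(raw):5} compressed={len(packed)}")
```

The raw data shrinks by roughly a factor of three, while the compressed streams differ by a handful of bytes at most.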

Reduce the bit depth

When the number of colours in a file is low the PNG specification allows the information to be packed more densely. Instead of using one byte per pixel it can be sufficient to use less than this: 4 bits, 2 bits or even 1 bit. These correspond respectively to a maximum of 16, 4 and 2 colours per picture.
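
As an illustration of that packing, here is a Python sketch with two hypothetical helpers: one computes the size of a scanline for a given bit depth, the other packs pixel values into bytes with the leftmost pixel in the high-order bits, as the specification requires.

```python
def scanline_bytes(width, bit_depth):
    """Bytes per scanline for one sample per pixel: filter byte + packed pixels."""
    return 1 + (width * bit_depth + 7) // 8


def pack_row(values, bit_depth):
    """Pack pixel values into bytes, leftmost pixel in the high-order bits."""
    out = bytearray((len(values) * bit_depth + 7) // 8)
    for i, value in enumerate(values):
        bit_offset = i * bit_depth
        shift = 8 - bit_depth - (bit_offset % 8)
        out[bit_offset // 8] |= value << shift
    return bytes(out)


for depth in (8, 4, 2, 1):
    per_line = scanline_bytes(80, depth)
    print(f"depth {depth}: {2 ** depth:3} colours, "
          f"{per_line:2} bytes per scanline, {80 * per_line} bytes in total")

# Five 1-bit pixels fit in a single byte; the three unused low bits stay at 0.
print(pack_row([1, 1, 1, 1, 1], 1).hex())  # f8
```

At a depth of 1 bit the 80 x 80 image needs 880 bytes before compression, down from 6480 bytes at 8 bits.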

For a file using only one colour a bit depth of 1 seems ideal. So let’s drop the PLTE chunk by switching to greyscale, keep the tRNS chunk and reduce the size of IDAT:

$ pngblank -g -b1 80 > blank-g-1b.png
$ file blank-g-1b.png
blank-g-1b.png: PNG image data, 80 x 80, 1-bit grayscale, non-interlaced
$ pnginfo -l -f blank-g-1b.png
IHDR
tRNS
IDAT
IEND
$ pnginfo -c IHDR -f blank-g-1b.png
IHDR: width: 80
IHDR: height: 80
IHDR: bitdepth: 1
IHDR: colourtype: greyscale
IHDR: compression: deflate
IHDR: filter: adaptive
IHDR: interlace method: standard
$ pnginfo -c tRNS -f blank-g-1b.png
tRNS: gray: 0

Here it is:

This is already 26 bytes smaller:

$ ls -gnh blank-g-1b.png
-rw-r--r--  1 0    88B Mar  1 23:31 blank-g-1b.png
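
A file with the same structure can be produced with a few lines of Python. This is not pngblank's code, only a sketch of the layout listed above; the function names are invented, and the final size depends on how the local zlib compresses the 880 zero bytes.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data=b""):
    """Serialise one chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def blank_grey_1bit(size, level=9):
    """Square, fully transparent PNG: 1-bit greyscale plus a tRNS chunk."""
    row = b"\x00" + bytes((size + 7) // 8)  # filter byte + eight pixels per byte
    raw = row * size
    ihdr = struct.pack(">IIBBBBB", size, size, 1, 0, 0, 0, 0)  # colour type 0
    trns = struct.pack(">H", 0)  # grey sample value 0 is the transparent one
    return (PNG_SIGNATURE
            + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"tRNS", trns)
            + make_chunk(b"IDAT", zlib.compress(raw, level))
            + make_chunk(b"IEND"))


png = blank_grey_1bit(80)
fixed = 8 + 25 + 14 + 12 + 12  # signature, IHDR, tRNS, empty IDAT, IEND
print(len(png), "bytes, of which", len(png) - fixed, "are compressed image data")
```

Everything except the compressed data is fixed at 71 bytes, which is why the remaining sections concentrate on the compressor.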

Fine tuning zlib

Can we go even smaller? I think there is no way to pack the data tighter than this, but there is still one option left: fine-tuning the compression algorithm. While the PNG designers tried to be future-proof and flexible by allowing multiple compression methods, none were actually added after the initial write-up. So there is only one choice, deflate (RFC 1951) wrapped in the zlib format (RFC 1950).

The original implementation and probably the most widely deployed today is zlib. Let’s try that.

According to the documentation two functions can be used to modify the behaviour of zlib: deflateParams, which can be used to specify the compression level as well as the compression strategy, and deflateTune, which can modify “internal compression parameters”. For the latter it is further recommended to be intimately familiar with the deflate algorithm and to read zlib’s source code to understand the specifics of the parameters. Spoiler: deflateParams manipulates these internal parameters too, using a pre-defined table.

Starting with deflateParams as the easiest path I implemented the -l and -s switches in pngblank. They respectively set the compression level (from 1 to 9) and the compression strategy. The following strategies are available:

default ;
fixed ;
filtered ;
huffmanonly ;
rle.

The explanation for each is given in man 3 deflateInit2. It actually contains an interesting lead:

Z_RLE is designed to be almost as fast as Z_HUFFMAN_ONLY, but gives better compression for PNG image data.

Using a shell script to try every combination of strategy and compression level:

#!/bin/sh

for strat in default fixed filtered huffmanonly rle; do
    for level in 1 2 3 4 5 6 7 8 9; do
        pngblank -g -b1 -s "$strat" -l "$level" 80 > "blank-g-1b-l$level-$strat.png"
    done
done

$ sh ./blank.sh
$ ls -ngh blank-g-1b-l9-*.png
-rw-r--r--  1 0    88B Mar  2 00:03 blank-g-1b-l9-default.png
-rw-r--r--  1 0    88B Mar  2 00:03 blank-g-1b-l9-filtered.png
-rw-r--r--  1 0    88B Mar  2 00:03 blank-g-1b-l9-fixed.png
-rw-r--r--  1 0   199B Mar  2 00:03 blank-g-1b-l9-huffmanonly.png
-rw-r--r--  1 0    87B Mar  2 00:03 blank-g-1b-l9-rle.png

The rle strategy seems a tiny byte better than default, filtered and fixed, while huffmanonly generates files more than twice as big. To further verify the difference between the strategies I wrote another script trying every image width from 80 to 512 and graphed the result:

plot

The second graph only takes the default and rle data points, showing that they differ by one byte only on very specific image widths. This is interesting.

Anyway by fine tuning zlib another byte bit the dust.
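
The same kind of comparison can be reproduced without pngblank, since Python's zlib module exposes the strategy constants through zlib.compressobj. The sketch below compresses the 880 zero bytes of the 1-bit image with each strategy at level 9. It prints sizes instead of quoting them, because they may vary with the zlib build, and the deflate helper is a name made up for the example.

```python
import zlib

# 80 scanlines of 1 filter byte + 10 bytes of packed 1-bit pixels, all zero.
raw = bytes(80 * (1 + 80 // 8))

STRATEGIES = {
    "default": zlib.Z_DEFAULT_STRATEGY,
    "filtered": zlib.Z_FILTERED,
    "fixed": zlib.Z_FIXED,
    "huffmanonly": zlib.Z_HUFFMAN_ONLY,
    "rle": zlib.Z_RLE,
}


def deflate(data, level, strategy):
    """Compress data into a zlib stream using an explicit strategy."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, zlib.MAX_WBITS,
                                  zlib.DEF_MEM_LEVEL, strategy)
    return compressor.compress(data) + compressor.flush()


sizes = {name: len(deflate(raw, 9, strategy))
         for name, strategy in STRATEGIES.items()}

for name, size in sizes.items():
    print(f"{name:12} {size:4} bytes of IDAT data")
```

Adding the 71 fixed bytes of the other chunks to each figure gives the size of the corresponding file.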

Here it is:

$ ls -gnh blank-g-1b-l9-rle.png
-rw-r--r--  1 1000    87B Mar  2 00:06 blank-g-1b-l9-rle.png

Switching library - libdeflate

There are other implementations of the deflate algorithm than zlib, for example libdeflate. I decided to try it and implemented the -c switch in pngblank in order to allow for an alternative compression backend.

Compared to zlib there are three more compression levels available and no strategies. Also the library offers no run time modifications to its inner parameters. In my opinion the API is far easier to use, as shown by the respective code size for the two libraries in pngblank (there is twice as much code for the same purpose using zlib).

$ pngblank -g -b 1 -l 12 -c libdeflate 80 > blank-g-1b-l12-libdeflate.png

The size is the same as zlib with the rle strategy. We are still stuck at 87 bytes.

Here it is:

$ ls -gnh blank-g-1b-l12-libdeflate.png
-rw-r--r--  1 1000    87B Mar  3 17:58 blank-g-1b-l12-libdeflate.png

Switching library - zopfli

What about zopfli, the compression library implemented by Google? As there is no man page or other visible documentation in the project repository I went the easy path and relied on the command line tool zopflipng:

$ zopflipng blank-g-1b-l9-rle.png blank-g-1b-zopfli.png
Optimizing blank-g-1b-l9-rle.png
Input size: 87 (0K)
Result size: 87 (0K). Percentage of original: 100.000%
Result has exact same size

Nope.

Here it is:

Extending the PNG specification

An 87-byte file is as far as I was able to go while respecting the PNG specification. Now, what about extending the specification?

What I need here is a way to express a PNG image with only one colour and then mark this colour as transparent.

A first method could be to reuse as much as possible from the existing standard with only a small modification: allow a bit depth of 0 and an empty IDAT chunk. This way a PNG decoder can understand that it needs to draw an image using only one colour, perhaps found in the first entry of the PLTE chunk.

Building from pnginfo and pngblank I slowly consolidated a set of functions which evolved into a library, named lgpng (Rust zealots warning: this is raw C code, be careful). It is not going to compete any time soon with libpng or imagemagick because it cannot be used to actually decode a PNG image and display it. Rather, it is a tool that allows its user to dissect a file and expose its structure. It can also be used to generate weird files. For example the following code generates a totally broken image with no data in the IDAT chunk and a bit depth of zero:

struct IHDR      ihdr;
struct PLTE      plte;
struct IDAT      idat;
struct tRNS      trns;

/* IHDR preparation */
ihdr.length = 13;
ihdr.data.width = htobe32(width);
ihdr.data.height = htobe32(width); /* square image: height equals width */
ihdr.data.bitdepth = 0;
ihdr.data.colourtype = COLOUR_TYPE_INDEXED;
ihdr.data.compression = COMPRESSION_TYPE_DEFLATE;
ihdr.data.filter = FILTER_METHOD_ADAPTIVE;
ihdr.data.interlace = INTERLACE_METHOD_STANDARD;
lgpng_chunk_crc(ihdr.length, "IHDR", (uint8_t *)&ihdr.data, &ihdr.crc);

/* PLTE preparation */
plte.length = 3;
plte.data.entries = 1;
(void)memset(plte.data.entry, '\0', sizeof(plte.data.entry));
lgpng_chunk_crc(plte.length, "PLTE", (uint8_t *)&plte.data.entry, &plte.crc);

/* tRNS preparation */
trns.length = 1;
(void)memset(&(trns.data), '\0', sizeof(trns.data));
lgpng_chunk_crc(trns.length, "tRNS", (uint8_t *)&trns.data, &trns.crc);

/* IDAT preparation */
idat.length = 0;
idat.data.data = NULL;
lgpng_chunk_crc(idat.length, "IDAT", idat.data.data, &idat.crc);

...

lgpng_stream_write_chunk(stdout, ihdr.length, "IHDR", (uint8_t *)&ihdr.data, ihdr.crc);
lgpng_stream_write_chunk(stdout, plte.length, "PLTE", (uint8_t *)&plte.data.entry, plte.crc);
lgpng_stream_write_chunk(stdout, trns.length, "tRNS", (uint8_t *)&trns.data, trns.crc);
lgpng_stream_write_chunk(stdout, idat.length, "IDAT", idat.data.data, idat.crc);

...

$ ls -gnh blank-broken-1.png
-rw-r--r--  1 0    85B Mar  7 16:20 blank-broken-1.png

85 bytes: this is a very small gain for a non-backward-compatible change to the specification.
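
For readers who want to reproduce the experiment without lgpng, the same idea can be transliterated to Python. The sketch below does not use lgpng's API; its helper names are invented, and like the C fragment it produces a file that is deliberately not a valid PNG.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data=b""):
    """Serialise one chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def blank_zero_depth(size):
    """Non-conforming file: indexed colour, bit depth 0 and an empty IDAT."""
    ihdr = struct.pack(">IIBBBBB", size, size, 0, 3, 0, 0, 0)  # depth 0 is invalid
    plte = bytes(3)  # a single palette entry: black
    trns = bytes(1)  # alpha 0 for palette index 0
    return (PNG_SIGNATURE
            + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"PLTE", plte)
            + make_chunk(b"tRNS", trns)
            + make_chunk(b"IDAT")  # no image data at all
            + make_chunk(b"IEND"))


png = blank_zero_depth(80)
print(len(png))  # 85
```

The count is easy to verify by hand: 8 bytes of signature, then 25, 15, 13, 12 and 12 bytes for the five chunks.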

Amusingly the command file is already aware of my proposal:

$ file blank-broken-1.png
blank-broken-1.png: PNG image data, 80 x 80, 0-bit colormap, non-interlaced

Here is the resulting file (although your browser won’t display it):

Conclusion

Going from 138 down to 87 bytes was the best I could do, which is still an impressive optimisation: the file is now 63% of its original size. It took me a very long time to write this article as I started way back in 2018 and delayed publication again and again. Damn you, lack of confidence.

Anyway, I hope it was fun and that someone will beat me in further reducing this file size.