... that's right! True GIF filtering! (with a side of dad jokes)
Just like socks or ugly pajamas you get for Christmas, it's perhaps something you never thought you'd need. In fact, isn't Wingman Jr. already filtering GIFs..?
Well, sort of. From the browser's standpoint, GIFs occupy a strange space between video and static images. This has some implications. Warning: tech ahead!
First, it is easy (and has been for a long time) to load GIFs with the <img> tag. The <video> tag is great now, but it only arrived with HTML5 and took a while to gain the support it now enjoys. What's interesting though is that folks made a choice not to add GIF support to the new(er) <video> tag - GIFs are left solely to the realm of images.
From a lot of web developers' perspectives, that puts them in a bit of a bind. Why? Well, the <img> tag and the related APIs for manipulating images don't give you any way to control the animation aspects - for example, you can't choose which point in the animation to show.
How that impacts Wingman is that we only get the default behavior when we're loading and drawing the image to prepare it for filtering - and that behavior is to simply draw the first frame. That means the rest of the frames don't even get a chance to be filtered, because we can't tell the browser to treat the GIF as a normal video type. To make matters worse, researchers have used adversarially designed GIFs to defeat image filters by leveraging animation - for example, by showing a black frame for a tiny amount of time before showing the full NSFW image.
Back to GIF support: the lack of good GIF support has fortunately not dampened the spirit of web developers. The workaround has typically been to employ a JavaScript library that parses and decodes GIFs itself, rather than leaving that to the browser. There are several libraries out there that handle this - see gifuct-js, jsgif, or libgif-js. Unfortunately, decoding video truly is something better left to the browser.
But as great as these libraries are, the use case for Wingman is a bit more constrained. While it's true that ultimately the images are just going to get rendered onto a canvas like these libraries do, the addon does its best to consider performance, and canvas and drawing management has actually been one area of a bit of optimization over the years. Additionally, there's a difference between parsing and decoding - these libraries choose to do both. Parsing deals with understanding the video frames and the general structure of the content; decoding deals with actually decompressing the bulkier image data. And that's a bit problematic: the JavaScript has to implement a simple LZW decompressor to decode each frame into essentially raw pixels. Not so great if you're trying to do that for a page full of images while both your code and the browser maintain a copy of the pixel data. This reason, along with the fact that I do not wish to become dependent on more libraries if possible, has made me hesitant to pull in true GIF support via one of these libraries. However, it's one of those areas lacking support that has bugged me.
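To make the parsing/decoding split a bit more concrete, here is a minimal sketch of a parse-only walk over a GIF. It is not Wingman's code: the names summarizeGif, tableSize and skipDataBlocks are made up for illustration, and it assumes the whole file is already in a Uint8Array. The point is that frame sizes and delays can be read from the block structure alone, while the LZW data is skipped by reading only its size bytes.

```javascript
// Hypothetical helper names; a sketch based on the GIF block layout, not the addon's code.

// Size in bytes of a color table described by a "packed" flags byte (0 if absent).
function tableSize(packed) {
  return (packed & 0x80) ? 3 * (1 << ((packed & 0x07) + 1)) : 0;
}

// Skip a chain of data sub-blocks; returns the offset just past the 0x00 terminator.
function skipDataBlocks(bytes, pos) {
  while (pos < bytes.length && bytes[pos] !== 0) pos += bytes[pos] + 1;
  if (pos >= bytes.length) throw new Error('Truncated GIF data');
  return pos + 1;
}

// Walk the file and describe each frame without decompressing any pixels.
function summarizeGif(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // 6-byte signature + 7-byte Logical Screen Descriptor + optional global color table.
  let pos = 13 + tableSize(bytes[10]);
  const frames = [];
  let delayMs = 0;
  while (pos < bytes.length && bytes[pos] !== 0x3B) {      // 0x3B = trailer
    if (bytes[pos] === 0x21) {                             // extension block
      if (bytes[pos + 1] === 0xF9) {                       // Graphic Control Extension
        delayMs = view.getUint16(pos + 4, true) * 10;      // stored in 1/100 s
      }
      pos = skipDataBlocks(bytes, pos + 2);
    } else if (bytes[pos] === 0x2C) {                      // Image Descriptor
      const width = view.getUint16(pos + 5, true);
      const height = view.getUint16(pos + 7, true);
      // Descriptor (10 bytes) + optional local color table + LZW minimum code size byte.
      const dataStart = pos + 10 + tableSize(bytes[pos + 9]) + 1;
      pos = skipDataBlocks(bytes, dataStart);
      // dataBytes counts the compressed sub-blocks, including their size bytes.
      frames.push({ width, height, delayMs, dataBytes: pos - dataStart });
      delayMs = 0;
    } else {
      throw new Error('Unexpected block type at offset ' + pos);
    }
  }
  return frames;
}
```

A side effect worth noting: a frame with a suspiciously tiny delay, like the black-frame trick mentioned earlier, would show up in this summary without a single pixel being decoded.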
But I woke up from a nap the other day with a burst of creativity, and took a look at the GIF format internals once again with a fresh set of eyes. It came to me that instead of trying to solve both the parsing and decoding problems, I could instead solve just the parsing problem and leave the decoding to the browser.
The way the GIF format works, the structure is basically a header followed by an alternating mix of image frames and blocks that indicate a time gap. It allows for the image frames to stream in and update the existing image being displayed. With the way the details work out, it is possible - and indeed relatively easy - to parse out the GIF's header and image frames, then repackage (remux) each frame into a standalone GIF using the header information.
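As a rough illustration of the remux idea, here is a sketch that splits a GIF into standalone single-frame GIFs. It is an approximation under a few assumptions, not the code from the branch: splitGifFrames, colorTableBytes, skipSubBlocks and concatBytes are hypothetical names, the input is a complete Uint8Array rather than a stream, and only the Graphic Control Extension directly preceding a frame is carried along (other extensions, such as the looping one, are dropped).

```javascript
// Hypothetical helper names; a sketch of the repackaging idea, not the addon's code.

// Size in bytes of a color table described by a "packed" flags byte (0 if absent).
function colorTableBytes(packed) {
  return (packed & 0x80) ? 3 * (1 << ((packed & 0x07) + 1)) : 0;
}

// Skip a chain of data sub-blocks; returns the offset just past the 0x00 terminator.
function skipSubBlocks(bytes, pos) {
  while (pos < bytes.length && bytes[pos] !== 0) pos += bytes[pos] + 1;
  if (pos >= bytes.length) throw new Error('Truncated GIF data');
  return pos + 1;
}

// Join several Uint8Arrays into one.
function concatBytes(parts) {
  const out = new Uint8Array(parts.reduce((n, p) => n + p.length, 0));
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.length;
  }
  return out;
}

// Returns one standalone GIF (as a Uint8Array) per image frame.
function splitGifFrames(bytes) {
  const signature = String.fromCharCode(...bytes.subarray(0, 6));
  if (signature !== 'GIF87a' && signature !== 'GIF89a') throw new Error('Not a GIF');
  // Shared prefix: signature + Logical Screen Descriptor + optional global color table.
  const headerEnd = 13 + colorTableBytes(bytes[10]);
  const header = bytes.subarray(0, headerEnd);
  const trailer = Uint8Array.of(0x3B);
  const frames = [];
  let pos = headerEnd;
  let gce = null; // most recent Graphic Control Extension, if any
  while (pos < bytes.length && bytes[pos] !== 0x3B) {
    const start = pos;
    if (bytes[pos] === 0x21) {                        // extension block
      const label = bytes[pos + 1];
      pos = skipSubBlocks(bytes, pos + 2);
      if (label === 0xF9) gce = bytes.subarray(start, pos);
    } else if (bytes[pos] === 0x2C) {                 // Image Descriptor
      pos += 10 + colorTableBytes(bytes[pos + 9]);    // descriptor + local color table
      pos = skipSubBlocks(bytes, pos + 1);            // +1 skips the LZW min code size
      const image = bytes.subarray(start, pos);       // still compressed, copied as-is
      frames.push(concatBytes(gce ? [header, gce, image, trailer] : [header, image, trailer]));
      gce = null;
    } else {
      throw new Error('Unexpected block type at offset ' + pos);
    }
  }
  return frames;
}
```

In a browser, each returned array could then be wrapped in something like new Blob([frame], { type: 'image/gif' }) and loaded as an ordinary image, so the actual decompression stays with the browser.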
While the GIF format allows for arbitrary patches to be updated over time on an image, the reality is that most GIFs fully replace (or nearly fully replace) the image on every frame. Particularly for images that are more photographic in nature (as many NSFW images are!), it is a bit more difficult to take a patch-based approach to image drawing, so it is generally avoided.
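Whether a given frame is a full replacement or just a patch can be read straight from its Image Descriptor. The following is a small sketch with made-up names (readScreenSize, readFrameRect, coverage); it assumes the caller already knows the offset of the frame's descriptor, and any cutoff for "nearly full" (say, 0.9) would be an arbitrary choice of mine rather than something Wingman defines.

```javascript
// Hypothetical helper names; a sketch, not the addon's code.

// Read a little-endian unsigned 16-bit value.
function u16(bytes, pos) {
  return bytes[pos] | (bytes[pos + 1] << 8);
}

// The logical screen (canvas) size sits right after the 6-byte signature.
function readScreenSize(gifBytes) {
  return { width: u16(gifBytes, 6), height: u16(gifBytes, 8) };
}

// An Image Descriptor is 0x2C followed by left, top, width, height (2 bytes each).
function readFrameRect(gifBytes, descriptorPos) {
  if (gifBytes[descriptorPos] !== 0x2C) throw new Error('Not an Image Descriptor');
  return {
    left: u16(gifBytes, descriptorPos + 1),
    top: u16(gifBytes, descriptorPos + 3),
    width: u16(gifBytes, descriptorPos + 5),
    height: u16(gifBytes, descriptorPos + 7),
  };
}

// Fraction of the canvas (0..1) that the frame overwrites, clipped to the canvas.
function coverage(screen, rect) {
  if (screen.width === 0 || screen.height === 0) return 0;
  const w = Math.max(0, Math.min(screen.width, rect.left + rect.width) - rect.left);
  const h = Math.max(0, Math.min(screen.height, rect.top + rect.height) - rect.top);
  return (w * h) / (screen.width * screen.height);
}
```

A frame with coverage of 1 stands on its own once repackaged; a small patch would only show a sliver of the picture, which is why the full-replacement habit of photographic GIFs matters here.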
This overall situation is ideal for Wingman: since most images for the NSFW-filtering use case are full replacements, standalone GIFs can be created as the data streams in, and then filtered just like normal images. The code to do the actual parsing/repackaging is quite small (~250 lines), and the actual decompression is left to the browser - letting the browser do what it does best.
I still have some tweaking to do before it's quite ready, but it's up and running fairly smoothly now. If you're feeling adventurous, go check out the GitHub issue for more details and a pointer to the branch under development!