Sunday, December 26, 2021

3.3.0 Now Available!

Three main updates this time!

First, GIFs will finally be scanned on a frame-by-frame basis rather than simply scanning the first frame. Note that the replacement image logic still needs a bit of work, so blocked images will often appear as a broken GIF. See the blog post for more details! https://wingman-jr.blogspot.com/2021/12/gift-giving-season-is-here.html

The second feature - default zones - comes from a new contributor, Abdullah! (https://github.com/abdullahezzat1) When the addon starts up, you can now pick which zone is set by default in the settings. This is a great feature and I think it will work especially well with the way some people want to use the plugin. Thanks Abdullah!

Finally, I tweaked the flags on the Tensorflow.js library/model startup (for the WebGL backend). This part of the startup is what causes the several second delay, so it's an important place to try to optimize. With the new settings, I've seen my startup time go from about 10 seconds to 5 seconds, but every computer is going to be a bit different. Let me know in the feedback link how it's working for you!

Sunday, March 7, 2021

3.0.0 Released!

I have finished development and testing of 3.0.0 and am excited to say it is now available over on AMO - check it out! https://addons.mozilla.org/en-US/firefox/addon/wingman-jr-filter

See the post 3.0 Inbound Soon for a quick list of new features and updates!

Sunday, February 28, 2021

3.0 Inbound Soon!

I'm excited to tell you about the upcoming 3.0 release! I've finished most of the key development on it and will be moving to a further round of testing/tweaking relatively soon.

So, why move to 3.0 since the move to 2.0 was not long ago? Let's take a look at the feature list.

Video Peek

I've blogged about this before but a simple form of video scanning is coming. It will "peek" at a good chunk of the first part of a "basic" video and block it if bad content is detected over a sustained period. "Basic" videos include most simple embedded and banner ad videos and some services such as TikTok. More advanced "streaming" video services basically need to be handled on a case-by-case basis because behind the scenes they work differently from each other; for now, the only "streaming" site support is YouTube. Here the video is streamed chunk by chunk and each chunk has the first section scanned.
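
The "sustained period" idea can be illustrated with a small sketch. It is written in Python for brevity (the addon itself is JavaScript), and the threshold, the run length, and the function name are illustrative assumptions rather than the addon's real parameters. A video is flagged only when enough consecutive sampled frames score above the threshold, so a single noisy frame does not trigger a block.

```python
from typing import Iterable


def should_block(frame_scores: Iterable[float],
                 threshold: float = 0.8,
                 sustained_frames: int = 3) -> bool:
    """Return True once `sustained_frames` consecutive scores reach `threshold`.

    Both defaults are made-up values for illustration only.
    """
    run = 0  # length of the current streak of bad frames
    for score in frame_scores:
        run = run + 1 if score >= threshold else 0
        if run >= sustained_frames:
            return True
    return False
```

With these assumed defaults, three bad frames in a row trigger a block, while bad frames that alternate with clean ones do not.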

When a video is blocked, the behavior will depend on when the blocking occurred. If it happened right at the beginning, a placeholder Wingman Jr. icon may appear in the video stream. If it happens midstream, it is not easy to insert an image so the video will simply stop playing; there is also a visual indicator that will appear on the Wingman Jr. icon - to be discussed a bit later.

Note this feature is still quite new and may need some further refinement, but I found the current form useful enough that I wanted to share it with you. There will be an option to turn video scanning on and off apart from image scanning. Don't hesitate to send in feedback using the link in the popup options after you've used it - there are a number of constraints on what I'm able to do but I'd like to hear about your experiences to make it as good as possible.

New Model

The model in 2.0.1, SQRXR 62, has been in use for a long time. I've done many experiments to see if I could create a worthy successor, but improvements had long been marginal enough that I did not wish to change the current experience.

However, that has now changed with the advent of SQRXR 112. It builds on a slightly different base model and achieves better results. I'm still working on the final cutoff parameters to use for the release but the bulk of the work is done. If you're a machine learning nerd, you can use the model in your own projects - check it out here.

Silent Mode

The human psyche is such that we are curious creatures - and it is human nature to seek out the "forbidden fruit". Currently the browsing experience accentuates where the blocked images have been; even though they are not visible, this can promote a dark pattern where one wants to click on the image slot to see what was there.

I've long had an issue out to improve this with a "silent mode" where blocked images are instead transparently replaced. I gave it a whirl and so far I'm quite liking the results. The actual implementation places a small watermark with "W" and the image score in the center of the replaced image, so it is discernible if you look closely. However, in a wall of images it does not stand out heavily and cognitively I've found it to be significantly less jarring.

Scanning Progress Feedback

Have you ever wondered in the past: did the addon get stuck? Or is this image just taking a long time or failing to load?

Now you'll be able to have more clarity on that. A simple progress bar has been embedded in the main browser action icon, so you'll be able to track how many images are queued up to be scanned. Additionally, it provides a video scanning indicator in the form of a tiny "v" in the bottom right; a blocked video will also cause this area to light up with a different color.

Advanced Option - Tensorflow.js Backend Selection

Tensorflow.js is the library I use to perform the AI model predictions. It has more than one "backend" that can be used to perform the calculations. For many users, the WebGL one is the best default choice. However, one of my users surprised me by sharing that the new WASM backend was faster for them. On my computer it is about 10x slower than the WebGL backend, so this was unexpected. This user requested that I implement a new feature to allow the user to choose the backend - that will be available in this upcoming release as well.
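
One way to picture a backend preference is as a reordering of the list of backends to try. The sketch below is in Python for brevity and is only an assumption about how such a setting could behave: `DEFAULT_ORDER` and `backend_order` are hypothetical names, and the post does not say how the addon stores or applies the choice.

```python
from typing import List, Optional

# Default try-order; the names mirror the backends mentioned in the post.
DEFAULT_ORDER = ["webgl", "wasm", "cpu"]


def backend_order(preferred: Optional[str] = None) -> List[str]:
    """Put the user's preferred backend first, keeping the rest as fallbacks."""
    if preferred is None or preferred not in DEFAULT_ORDER:
        return list(DEFAULT_ORDER)  # unknown or missing choice: use the default
    return [preferred] + [b for b in DEFAULT_ORDER if b != preferred]
```

Keeping the other backends in the list means a preference that fails to load on a given machine still leaves something to fall back to.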

When Will It Be Ready?

As noted, development is mostly wrapped up - at this point the timing will mostly depend on what gremlins are found during testing. Stay tuned!

Sunday, December 13, 2020

Firefox 83 Fix - Get 2.0!

(Update: 2.0 is out!)

Fix in Progress

There is a fix in progress for Wingman Jr. It changes quite a bit of code, but it should be out soon. I have a fully working solution and am testing it. If you upgrade, you will likely see a note about hidden tabs; this is expected and is due to the nature of the solution.

What Went Wrong

First, it's important to know that to make the AI work, the computer does a lot of math for each image it scans. A typical computer has more than one type of chip on it that knows how to do math. There's the CPU, which is for general purpose math. Then - on most modern computers - there's the GPU, which is for large numbers of parallel calculations of the same kind - which works great for both graphics/video as well as for AI.

Having a GPU - and indeed a fast GPU - can make a big difference. In some cases, it scans images 10x faster or even more than the CPU. The AI library I use for helping run calculations, Tensorflow.js, is careful to ensure that it goes as fast as possible.

So what happened with Firefox 83?

Prior to Firefox 83, there was a bug in Firefox in certain cases. Basically, there is a special call that you can make that says "give me access to the GPU, but if there's a big performance problem because something about loading the GPU isn't quite right, don't give me access to the GPU at all - just let me know and fail". For the most part, this function was working correctly in Firefox 82 and prior. However, it didn't work in all cases. In some cases, it would give access to the GPU even when it was likely taking a performance hit.

In Firefox 83, the Mozilla team fixed the glitch. So wouldn't this make things better?

Not quite. Basically, the fact that Wingman was trying to load the GPU in the background in the addon wasn't fully supported in some cases. So when Tensorflow.js tried to get access to the GPU in this way, it would now correctly fail and say, "I won't give you access to the GPU because there is a bit of a performance hit". This meant that Tensorflow.js would fall back to doing all the calculations on the CPU, even if the performance-hit GPU would still be much faster.
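
The behaviour change can be modelled in a few lines. This is a toy Python model, not Firefox or Tensorflow.js code: `request_gpu` and its parameters are invented for illustration. The flag it mimics appears to correspond to the WebGL context attribute `failIfMajorPerformanceCaveat`, though the post does not name it.

```python
def request_gpu(has_caveat: bool,
                fail_if_caveat: bool,
                browser_honors_flag: bool) -> str:
    """Return the device the calculations end up on in this toy model."""
    if has_caveat and fail_if_caveat and browser_honors_flag:
        return "cpu"   # request is refused, so the library falls back
    return "gpu"       # request is granted, possibly with reduced performance


# Same addon, same hardware; only the browser behaviour differs.
before_fix = request_gpu(has_caveat=True, fail_if_caveat=True,
                         browser_honors_flag=False)
after_fix = request_gpu(has_caveat=True, fail_if_caveat=True,
                        browser_honors_flag=True)
```

In this model `before_fix` lands on the GPU and `after_fix` lands on the CPU, which is the regression described above: the browser became more correct, and the addon became slower.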

If you're one of the unfortunate users like myself that encountered this, it made the browsing experience basically unusable. The whole browser would seem to lock up on loading pages and things would take forever to load.

Partial Mitigation

This spring, Tensorflow.js also released another method of running AI models, the "WASM backend". It still used the CPU, but it did some advanced tricks that leveraged some things basically all modern CPUs can do, and it made the CPU case much faster. So much faster, in fact, that in some cases it was as good as the GPU or maybe even a tiny bit better. (See here for Google's blog post on the matter.)

I added this as a fallback method for calculation, and it helped some users. But for some users (like myself), the performance with this method is still unbearably slow.

Options

One option I pursued for fixing this was to have Tensorflow.js use the GPU even if there were performance issues noted by Firefox. This loading option is not exposed by Tensorflow.js, but they were kind enough to consider adding it as an option.

While this might work for some, it might end up being the wrong choice for others. If it was the wrong choice, then the system should by all rights fall back to the "WASM backend" but would not if we forced it to use the GPU. Likely then the right thing to do would be to expose an option in Wingman to pick which method to use, but this makes for a potentially poor default experience.

As the excellent team at Mozilla looked into my bug report and the true nature of the bug unfolded, it became clearer that 1) a real bug had been fixed and 2) existing performance may have been suboptimal already! Additionally, there was a critical realization: it wasn't necessarily that the GPU couldn't be loaded quickly - it's that the addon background setting wasn't programmed quite correctly to allow it to do so. This meant that if you could load the GPU in a different setting, it might work as expected - for example, in a "normal" web page setting. So how could this be accomplished?

New Architecture

In the past, there was more or less one place where code would run: the background of the addon. This approach is simple, light, and generally works great. But now we needed to do processing in a "normal" web page. As it happens, an addon can create and load web pages, too.

So the solution was to split the code into two parts: the "background" and a "processor" running on a normal page. The two parts need to talk back and forth in deep conversation in order to work. The "background" says things like "here's a request and the data flowing in for it" and the "processor" says things like "here are the scan results you asked for". The addon ecosystem makes this straightforward to accomplish, but it's a lot of plumbing and a large rip up of the existing code.
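
The conversation between the two halves can be sketched as a tiny message protocol. This is a Python stand-in for illustration only: the message types (`start`, `data`, `end`, `result`), the class, and the function names are all hypothetical, and the real addon uses the browser's own messaging rather than direct method calls.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, object]


class Processor:
    """Stands in for the page that runs the model and answers scan requests."""

    def __init__(self, scan: Callable[[bytes], float]) -> None:
        self._scan = scan                       # scoring function (a stub here)
        self._buffers: Dict[str, List[bytes]] = {}

    def handle(self, msg: Message) -> Optional[Message]:
        request_id = str(msg["id"])
        if msg["type"] == "start":              # "here's a request..."
            self._buffers[request_id] = []
        elif msg["type"] == "data":             # "...and the data flowing in for it"
            self._buffers[request_id].append(bytes(msg["chunk"]))
        elif msg["type"] == "end":              # all data received: scan and reply
            body = b"".join(self._buffers.pop(request_id))
            return {"type": "result", "id": request_id, "score": self._scan(body)}
        return None


def send_image(processor: Processor, request_id: str, chunks: List[bytes]) -> Message:
    """Background side: stream one image to the processor and return its reply."""
    processor.handle({"type": "start", "id": request_id})
    for chunk in chunks:
        processor.handle({"type": "data", "id": request_id, "chunk": chunk})
    reply = processor.handle({"type": "end", "id": request_id})
    assert reply is not None
    return reply
```

Tagging every message with a request id is what lets several images be in flight at once, and it is also what would make it possible to route requests to more than one processor.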

I've finished the rewrite and am ensuring the changes are stable. While there is some overhead in this approach (due to the two sides being in conversation), there are also some advantages. One of those is that it is much easier to load more than one processor if needed. So far I have not yet been able to see a performance gain out of this, but in the future I may be able to use the GPU and WASM backends together to see a bit of a performance boost.

It is probably apparent now why there might be a warning about hidden tabs. Wingman creates the "processor" tabs as web pages so that they work properly, but they're not helpful for the user to see, so it immediately hides them. That's all the tab hiding that Wingman does, but it still drives needing the "tabHide" permission and will prompt a new message after the upgrade.

Final Notes and 2.0

This is a big change in the overall architecture - large enough that I plan to change the version to 2.0 to reflect what has happened.

I may try to squeeze in a couple other features or fixes, but stay tuned for a new release soon.

Friday, November 27, 2020

Release 1.3.0 - Partial Fix for Firefox 83 Slowness!

This release is an emergency release in response to the release of Firefox 83.

TL;DR - Firefox 83 broke things for some users and made browsing unbearably slow. While things get properly fixed, I can make it faster again, but not quite as fast as the plugin would be in Firefox 82.

The long version:
This plugin leverages another excellent library, Tensorflow.js, that runs the AI models created for this plugin. Tensorflow.js gives many different ways to run the AI models, called backends. They all give the same prediction, but some backends are much faster than others. The fast backend (WebGL) started failing in Firefox 83 for some users, which caused the default slow backend (CPU) to be used instead. For at least two users, this made the browsing experience so slow as to be unusable.
Fortunately Tensorflow.js recently added support for another relatively fast backend (WASM) that I have found does not seem to fail to load in Firefox 83. I am adding in support for that new backend as a fallback. It is not quite as fast, but makes browsing usable once again.
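
The fallback logic amounts to trying backends in order until one loads. Here is a minimal Python sketch of that idea, with `first_working_backend` and `try_init` as hypothetical stand-ins for whatever call actually initialises a backend in the addon; the simulated results are invented for the example.

```python
from typing import Callable, Dict, Sequence


def first_working_backend(order: Sequence[str],
                          try_init: Callable[[str], bool]) -> str:
    """Return the first backend whose initialisation succeeds."""
    for name in order:
        if try_init(name):
            return name
    raise RuntimeError("no backend could be initialised")


# Simulate a Firefox 83 machine where WebGL refuses to load.
loads_ok: Dict[str, bool] = {"webgl": False, "wasm": True, "cpu": True}
chosen = first_working_backend(["webgl", "wasm", "cpu"],
                               lambda name: loads_ok[name])
```

Here `chosen` ends up as `"wasm"`: the slow CPU backend stays last in the list, so it is only reached when both faster options fail.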

If you are experiencing issues, please disable the plugin and let me know over at Github - thanks!

For the technically curious, the Tensorflow.js team has a great writeup on the introduction of fast SIMD in the WASM backend over at their blog.

One final note - this version also fixes one issue that caused downloads to sometimes show up as gibberish rather than prompting for download.

Wednesday, November 4, 2020

Release 1.2.1 - The Case of the Distorted Symbols

International users - this release is a bug fix release for you!
One of you kindly reported that they were seeing special characters such as "ä", "ö", "ü", "ß" and "€" showing up incorrectly as ¿½. This release should fix most instances of that happening, but please comment at https://github.com/wingman-jr-addon/wingman_jr/issues/70 if you are still seeing problems. Thanks!

For the technically curious (or perhaps those who are having trouble falling asleep at night and need something boring to read), here's what was happening. In order to scan images that have been encoded as Base64 data URIs, I fully scan all documents of Content-Type text/html and do search and replace as necessary. However, the document arrives as bytes, so I need to handle the decoding from bytes into text myself. All examples out there just use UTF-8 for the TextDecoder, but alas, real life is a bit more complex - the source of this issue was incorrectly decoding non-UTF-8 docs as UTF-8. So now I try to do rudimentary encoding detection based on "charset" in Content-Type. An interesting followup is that when I turn text back into bytes, I use TextEncoder which - at present - only supports UTF-8, so I need to make sure the Content-Type gets set appropriately for that.
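
The decode-then-re-encode round trip can be sketched as follows. This is a Python illustration under stated assumptions, not the addon's code (which uses TextDecoder and TextEncoder in JavaScript): the function names and the regular expression are made up for the example, and the charset labels accepted here are whatever Python's `codecs` module knows, which is not the same label set browsers use.

```python
import codecs
import re
from typing import Tuple

# Matches charset=... with or without quotes around the value.
_CHARSET_RE = re.compile(r'charset\s*=\s*"?([^";\s]+)', re.IGNORECASE)


def charset_from_content_type(content_type: str, default: str = "utf-8") -> str:
    """Pull the charset parameter out of a Content-Type header value."""
    match = _CHARSET_RE.search(content_type)
    if not match:
        return default
    try:
        codecs.lookup(match.group(1))   # reject labels this runtime cannot decode
    except LookupError:
        return default
    return match.group(1).lower()


def reencode_as_utf8(body: bytes, content_type: str) -> Tuple[bytes, str]:
    """Decode with the declared charset, then emit UTF-8 and a matching header."""
    text = body.decode(charset_from_content_type(content_type), errors="replace")
    # ... search and replace on `text` would happen here ...
    return text.encode("utf-8"), "text/html; charset=utf-8"
```

For example, a page sent as `text/html; charset=windows-1252` containing "äöüß€" survives the round trip intact, whereas decoding the same bytes as UTF-8 produces replacement characters.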

Note that using only Content-Type for character encoding detection is considerably simpler than the mechanism that browsers use, but it still hits a vast majority of the use cases even though it is not quite accurate. You can see how it fares against a selection of standardized tests by W3C. Character encoding detection is exceedingly sophisticated - if I still haven't bored you with the details, I recommend checking out the spec for those facing truly persistent insomnia.

Saturday, September 26, 2020

Release 1.2.0

It's been a while since the last release. I've got a couple small things in this latest 1.2.0 release:

A helpful user, Stephen, submitted a feature request to add an on/off button to the main menu. While showing it isn't the default, you can now turn it on in the options. I know I'll find this feature valuable as well! It's useful when most of the time you're browsing safe sites but then need to go to e.g. a photo site of some sort to find some content. See the GitHub issue here. Thanks Stephen!
As you may have noticed, the image score for blocked images almost always shows "99" since the release of the zones feature. I finally got back to adding a bit better approximation of the image score.
The key library for the AI part, Tensorflow.js, has seen an upgrade from version 1.x to 2.x in order to make sure this plugin will continue to be compatible with it.

The AI model was not changed, so no change in how good or bad the filtering performs is expected. However, if you're into machine learning, you may be interested to know that I've released the model into its own repository now, too.

As always, feel free to contact me at the GitHub project site: https://github.com/wingman-jr-addon/wingman_jr

Saturday, April 4, 2020

1.1.1.1 for Families Opt-In Support in Wingman Jr. 1.1.0

I was excited about a new service announced by Cloudflare this week - "1.1.1.1 for Families"! I admit, without an understanding of the company and the technology, that headline might not be the most eye-catching. Let me provide a bit of background.

Cloudflare is a technology company that provides many foundational services for using the internet. One exceptionally important service they provide is DNS, the Domain Name System. While we think of internet addresses as text-based addresses, these are converted under the hood to a numerical form called an IP address that is used to route traffic. Specifically, the hostname - for example "google.com" - is represented numerically, but not the part of the address afterwards that goes to a specific page. Basically, every single webpage you visit "resolves" the hostname into this IP address by using a "DNS Provider".
One trick that has long been used is to block hostnames that contain questionable content by simply using a DNS provider that says "I don't know how to convert yourbadsite.com into an IP address", so all requests for media from that hostname fail. This is a lightweight check, and is a relatively coarse form of a blacklist. Maintaining this blacklist is a gargantuan effort, almost always a commercial one.
So what is this "1.1.1.1 for Families"? Well, two years ago Cloudflare launched their own DNS provider at "1.1.1.1". Now they have extended - free to the public - offerings that can filter out hostnames of known malware and adult content providers.

Wingman Jr. relies on AI to scan images fully client-side, which has the distinct advantage that 1) each image is considered individually rather than being lumped in with a whole site and 2) no communication with an external service provider is needed. However, as I've had at least one user helpfully remind me in an email, video is not blocked. Long term, I would like to support filtering video, but it is a difficult technical challenge to get right - and performant. One thing I can do in the meantime is provide the option to also block images and video by using the lighter weight DNS-based approach. This is now quite feasible thanks to Cloudflare!

So how does it work? Well, roughly speaking you go to the plugin's new settings area and enable DNS-based blocking. That's all you have to do. Under the hood, the plugin will start capturing image and video requests before they even occur and check the hostname with Cloudflare's servers. If Cloudflare says to block it, the image or video request will be aborted - you won't even see the usual Wingman icon or the update to the number of blocked images.
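
The hostname check can be sketched without touching the network. The Python below is an illustration under assumptions, not the plugin's code: the response shape is modelled loosely on JSON-formatted DNS answers, the function names are hypothetical, and it assumes a blocked hostname comes back with the placeholder address `0.0.0.0` (or `::`), which is how the Families resolvers are documented to answer. How the plugin actually queries Cloudflare is not described in this post.

```python
from typing import Any, Dict
from urllib.parse import urlsplit

BLOCKED_ADDRESSES = {"0.0.0.0", "::"}  # assumed "blocked" sentinel answers


def hostname_of(url: str) -> str:
    """Only the hostname is resolved via DNS; path and query are ignored."""
    return urlsplit(url).hostname or ""


def is_blocked(dns_json: Dict[str, Any]) -> bool:
    """Treat a response as blocked if any answer carries a sentinel address."""
    answers = dns_json.get("Answer", [])
    return any(a.get("data") in BLOCKED_ADDRESSES for a in answers)
```

Because only the hostname is looked up, every image and video from a blocked hostname is rejected together, which is why this check is both cheap and coarse.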

Now here's the thing: while there is a definite upside to this - a second layer of blocking, in some cases better efficiency, and basic video blocking - enabling this option does communicate the domains you are fetching media from to Cloudflare. Additionally, some websites with rather mixed content may end up being categorically blocked. These are tradeoffs - which means I am making this an opt-in only feature.

However, I'm excited about this new option! I believe it makes sense for many users. I also want to thank the user that took the time to write me an email and got me thinking about this - it's great to hear how people are using this plugin and what they'd like to see next. Look for an update in Firefox soon - I plan to release this with version 1.1.0!

Friday, January 24, 2020

Release 1.0 is here!

Woohoo! After well over a year of releases, I'm excited to announce version 1.0!

Why now? What makes it special enough to flip to version 1.0?

The big new feature here is "zones". Now you can pick different zones for different levels of sensitivity on blocking. This can either be manual, or it can be automatically managed based on how many images have recently been blocked (the default). The "neutral" zone is not too different from the current blocking level, but the "trusted" zone that engages when not many images have been blocked recently might just provide you a better tradeoff on accidental image blockages. I've found the experience to be better when I'm using it, and I hope you do too. If you're interested in the backstory on zones, I recommend checking out my series of posts on model sensitivity, starting here: Model Sensitivity - Part One.
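
Automatic zone switching can be pictured as a sliding window over recent scan results. The sketch below is in Python for brevity, and nearly everything in it is an assumption made for illustration: the window size, both thresholds, the class name, and the "untrusted" zone name (the post itself only names "trusted" and "neutral").

```python
from collections import deque
from typing import Deque


class ZoneTracker:
    """Pick a zone from how many of the most recent images were blocked."""

    def __init__(self, window: int = 20, trusted_max: int = 1,
                 neutral_max: int = 5) -> None:
        self._recent: Deque[bool] = deque(maxlen=window)  # True = image blocked
        self._trusted_max = trusted_max
        self._neutral_max = neutral_max

    def record(self, blocked: bool) -> str:
        """Add one scan result and return the zone that now applies."""
        self._recent.append(blocked)
        return self.zone()

    def zone(self) -> str:
        blocked_count = sum(self._recent)
        if blocked_count <= self._trusted_max:
            return "trusted"
        if blocked_count <= self._neutral_max:
            return "neutral"
        return "untrusted"
```

Because old results fall out of the window, a run of clean images gradually moves the tracker back to the more permissive zone.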

While this release is a landmark, it is just the foundation. I'm excited to get your feedback! Speaking of feedback, thanks so much to those who have given feedback. It's been helpful!