Update: This article was renamed to “part 1” when publishing part 2.
These last couple of weeks I’ve been putting a lot of time into shotman. Shotman is a small GUI tool I wrote a few months ago. It shows a small preview window when a screenshot is saved, and has controls to copy the screenshot to the clipboard or delete it right away.
The previous version was 0.1. This newly written version is to be 0.2.
About version 0.1
The initial implementation was more of a prototype than anything else. It worked, but was slow: it could take a couple of seconds to even render the window. That aside, it had a lot of quirks regarding window sizing and positioning, and its clipboard support was flaky (this was delegated to Qt, so there was not much control under the hood).
The slowness was mostly due to the design; taking the screenshot was delegated to
grimshot, a bash script which in turn uses
grim to actually take the
screenshot. The screenshot was taken, then encoded as png, saved to disk, then
loaded from disk and decoded by shotman, and finally rendered on screen by
shotman. This is a lot of back and forth to get pixels from the compositor back
into the same compositor.
The most obvious design change I’ve made is how screenshots are taken; the new version will request that the wayland compositor make a copy of the screen into memory. Shotman will then create its window, and just tell the compositor to render that same in-memory screenshot onto the new window. Shotman doesn’t copy the screenshot around at all – it doesn’t even need to read it for rendering!
The result of this is that it’s a lot faster. It takes well under 80ms from startup to the point a window is rendered on screen, on the same machine where shotman 0.1 would take around 1.6 seconds (at 2560x1600px). That’s a 95% reduction in startup time.
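The percentage follows directly from the two timings quoted above. A quick check, taking 80ms as an upper bound for the new version and 1.6 seconds as an approximate figure for the old one (the variable names are only for illustration):

```python
# Timings quoted in the text; the 0.1 figure is approximate.
old_ms = 1600.0  # shotman 0.1: about 1.6 seconds
new_ms = 80.0    # shotman 0.2: upper bound ("well under 80ms")

reduction = (old_ms - new_ms) / old_ms  # fraction of startup time saved
factor = old_ms / new_ms                # how many times faster

print(f"time saved: {reduction:.0%}, speed-up factor: {factor:.0f}x")
```

Since the new version takes less than 80ms, the real numbers are at least this good.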
The thumbnail window
On shotman 0.1, the rendered window was just a regular window, and required
sway to make the window float and render properly. This was a
very hacky approach, and wouldn’t even work on other compositors.
The 0.2 shotman now renders itself using the layer-shell
protocol. This lets it render as a floating widget without any custom
compositor configuration, and allows it to work on any compositor exposing this
protocol (for now, mostly wlroots-based compositors).
Keyboard controls remain mostly the same (see the relevant section of README). Usually, after taking a screenshot one would either dismiss the window right away (Esc), or copy it to clipboard and then dismiss. It’s not meant to linger around.
It is now also possible to edit the image by pressing e. Currently,
gimp is hardcoded, though I expect to make this configurable in future.
A new “nice to have” is that the h, j, k and l keys can now be used to re-anchor the window onto different corners of the screen quickly and easily.
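To illustrate the idea, here is a minimal sketch of how such re-anchoring could be modelled. It assumes vim-style directions (h left, j down, k up, l right); the `Anchor` type, `KEY_MOVES` table and `reanchor` function are hypothetical names made up for this example, not shotman’s actual code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Anchor:
    """Corner of the screen the thumbnail is pinned to (hypothetical model)."""
    vertical: str    # "top" or "bottom"
    horizontal: str  # "left" or "right"


# Assumed vim-style mapping; the text only says h/j/k/l re-anchor the window.
KEY_MOVES = {
    "h": ("horizontal", "left"),
    "l": ("horizontal", "right"),
    "k": ("vertical", "top"),
    "j": ("vertical", "bottom"),
}


def reanchor(anchor: Anchor, key: str) -> Anchor:
    """Return the anchor after a key press; unknown keys change nothing."""
    move = KEY_MOVES.get(key)
    if move is None:
        return anchor
    axis, edge = move
    return replace(anchor, **{axis: edge})


anchor = Anchor("bottom", "left")
anchor = reanchor(anchor, "l")  # pin to the right edge
anchor = reanchor(anchor, "k")  # pin to the top edge
print(anchor)
```

Each key only changes one axis, so any corner is reachable in at most two key presses from any other.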
A pending annoyance is that, once the window is unfocused, it cannot be refocused with just the keyboard. This is because sway does not offer any mechanism for a user to re-focus a layer-shell window (with good reason too; this almost never makes sense for layer-shell surfaces). I’m considering attaching a pop-up to an invisible layer-shell surface instead. Using the invisible layer-shell surface allows me to continue anchoring the window in a specific corner, while the floating pop-up can be re-focused via the usual “focus floating” mapping.
Regarding clipboard support, the new version handles it directly, which has made it far more reliable.
Normally, pressing ctrl+c copies the image into the clipboard (no need to “select” anything here; this window can only copy one thing). I’m also working on copying the filepath into the clipboard, so that applications can paste the file themselves too.
Copying the filepath should also provide an interesting mechanic with ctrl+x, which should allow moving the original screenshot file by just pasting it elsewhere.
Mouse and touch controls
The original version also had buttons that could be clicked. I realised I’d last used these the day I wrote the tool and never again. However, I will be implementing those in the new version as well, since I want shotman to be usable either keyboard-only, pointer-only or touch-only. The touch-only usability aspect is important since I want to be able to use this same tool to take screenshots on Linux-based phones. A fast, reliable tool for this is much needed, so I hope to be able to fill this gap early on.
Currently shotman runs on a Linux phone, but there’s no convenient way to interact with it due to being purely keyboard driven.
My intent is to include buttons for copying, deleting and closing, and also allow swiping the window towards the border of the screen to dismiss it. Fortunately, one of my laptops has a touch display, which should make testing this a lot simpler.
Current state and near-future plans
The newly written version works well, and is fast and reliable. There are, however, several pending items that still need to be addressed. The biggest one is the ability to take a screenshot of a single window or a selected region.
There are also a lot of edge cases and smaller details that need to be addressed all over the codebase, though these are more about cleaning up than about actual usability issues.
The filename format is now a bit less problematic:
This keeps files sorted in natural order (thanks to using ISO-like formats),
makes them very easy for humans to recognise, and avoids issues with tools and
websites that are picky about filename characters. Keeping the sub-second part looks
weird, but it avoids issues when multiple screenshots are taken within a second.
This last part might still change a bit in future.
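As an illustration of the properties described above, here is a small sketch of a filename scheme that sorts naturally, avoids characters such as `:`, and stays unique within a second. The exact pattern and the `screenshot_filename` helper are assumptions made for this example, not necessarily what shotman uses.

```python
from datetime import datetime


def screenshot_filename(moment: datetime) -> str:
    """Build an ISO-like, sortable name (pattern assumed for this sketch)."""
    # Largest unit first, using only digits, '-', '_' and '.', so plain
    # lexicographic sorting matches chronological order.
    stamp = moment.strftime("%Y-%m-%d_%H.%M.%S")
    # Milliseconds keep two screenshots from the same second apart.
    millis = moment.microsecond // 1000
    return f"{stamp}.{millis:03d}.png"


# Two screenshots taken within the same second (illustrative timestamps).
first = screenshot_filename(datetime(2022, 11, 5, 14, 3, 27, 120_000))
second = screenshot_filename(datetime(2022, 11, 5, 14, 3, 27, 480_000))
print(first)
print(second)
```

The two names differ only in their last numeric component, and comparing them as plain strings still puts the earlier screenshot first.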
I do want to look into animating the motion when the window is re-anchored from one corner onto another – but only if this keeps complexity low, which is yet to be seen.
Scaling support is still WIP, but should be ready soon.
There are two low-impact bugs that cannot be easily fixed:
- There’s no wayland protocol to pick a window to screenshot, so this functionality will remain sway-only once implemented. This should not be an issue on mobile, where I would usually expect to take a screenshot of the entire screen.
- The first frame is rendered blurry and only the second frame will be sharp. This is due to a bug in wayland protocols, where clients cannot know the scale until after they’ve rendered a window, forcing the first frame to be imperfect.
Why not any existing tools?
Tools for taking screenshots already exist on Linux, but none of them fit the bill quite right for this use case.
The most obvious contender is
grim. It is a general-purpose tool for taking
screenshots, and also very easy to use in scripts. However, it was not quite
right for this use case; it will save the image to disk, which then needs to be
read again to render it on-screen. It’s a great tool, but not right for this.
Shotman does one thing and one thing only, but does it well. It renders a GUI
when taking a screenshot, allowing copying, deleting or keeping it. A pending
feature is to press
e to open the image in an editor (e.g.: GIMP), but that’s
the end of its scope.
If time permits, I hope to have a release with all these new improvements in the next week or two. I’m not entirely sure about a timeline for the touch aspect, but after implementing that I hope to transition this project into a stable phase.
In the meantime, have a look at shotman in its new home over at sourcehut: