Popular webcomic xkcd posted a comic (#3054) that presented a cipher in which all letters are converted to the letter A with various diacritics (e.g., Á, Ă, Å). This seemed like it would be a nice little programming exercise (and I found that others in a related Reddit thread had the same idea, which further spurred me on), so I spent some time this afternoon implementing my own version.

I briefly debated whether to build it in Ruby (which I use at work) or F#, but I yet again couldn’t resist the allure of F# for this. Along with being more enjoyable to work in, I appreciate how F# (and, presumably, other functional languages) makes it easy to write satisfyingly robust code. (Aside: I’m pretty surprised that C# wasn’t even in this equation, given that it was my go-to language for years until recently.)

My implementation is in my scripts repo: XkcdScreamCipher.fsx.

It was fun to make! I perhaps went a wee bit overboard for what’s essentially a throwaway script, but part of the point of the exercise was to enjoy myself. ☺️

Usage examples

In the examples below, dfsi is my alias for dotnet fsi.

$ dfsi fsharp/XkcdScreamCipher.fsx --encode "hello"
A̰ÁĂĂÅ
$ dfsi fsharp/XkcdScreamCipher.fsx --decode A̰ÁĂĂÅ
HELLO
# `--test` confirms there are no conversion issues with the given string(s).
$ dfsi fsharp/XkcdScreamCipher.fsx --test "hello" "hi" "how neat\!" 123
OK: HELLO --> A̰ÁĂĂÅ --> HELLO
OK: HI --> A̰Ả --> HI
OK: HOW NEAT! --> A̰ÅȀ ÂÁAĀ! --> HOW NEAT!
OK: 123 --> 123 --> 123
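
To give a feel for the encoding direction, here is a minimal sketch. It is not the code from XkcdScreamCipher.fsx: the table is partial and only covers the letters that appear in the sample output above, the code points are my assumption about how those characters are stored, and the names table and encode are made up for illustration.

```fsharp
// Partial, illustrative table: only letters seen in the sample output above.
// The code points are assumptions; the real script may store them differently.
let table : Map<char, string> =
    Map.ofList [
        ('A', "A")
        ('E', "\u00C1")   // A with acute
        ('H', "A\u0330")  // A followed by combining tilde below
        ('I', "\u1EA2")   // A with hook above
        ('L', "\u0102")   // A with breve
        ('N', "\u00C2")   // A with circumflex
        ('O', "\u00C5")   // A with ring above
        ('T', "\u0100")   // A with macron
        ('W', "\u0200")   // A with double grave
    ]

// Uppercase the input, then swap each known letter for its "scream" form.
// Anything not in the table (digits, spaces, punctuation, and also the
// letters this partial table omits) passes through unchanged.
let encode (text: string) : string =
    text.ToUpperInvariant()
    |> Seq.map (fun c ->
        match Map.tryFind c table with
        | Some s -> s
        | None -> string c)
    |> String.concat ""

printfn "%s" (encode "hello")
```

Using a Map keeps the lookup total: Map.tryFind returns an option, so unmapped characters are handled explicitly instead of throwing.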

StringInfo.SubstringByTextElements method

Through this exercise, I learned about the StringInfo.SubstringByTextElements method in the System.Globalization namespace. When decoding, it was necessary for iterating over each encoded letter as a single text element (a base letter plus any combining marks) rather than over its individual component characters.
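
The difference is easiest to see with a small sketch. This is an illustration rather than the script's actual code, and textElements is a hypothetical helper name. The sample string spells out its combining marks with escapes so the code points are unambiguous.

```fsharp
open System.Globalization

// Hypothetical helper: split a string into its text elements
// (a base character together with any combining marks that follow it).
let textElements (s: string) : string list =
    let info = StringInfo(s)
    [ for i in 0 .. info.LengthInTextElements - 1 ->
        info.SubstringByTextElements(i, 1) ]

// "A" + combining tilde below, then "A" + combining acute accent.
let encoded = "A\u0330A\u0301"

// Iterating by char sees 4 UTF-16 code units...
printfn "chars: %d" encoded.Length
// ...while iterating by text element sees the 2 letters a reader sees.
printfn "text elements: %d" (List.length (textElements encoded))
```

Decoding by char would treat each combining mark as its own unit, so a lookup keyed on the whole encoded letter would never match.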

You can read a bit more about Unicode normalization in my recent post “Unicode Normalization Forms and Recursive Path Normalizer (F# Script).”

Argument issue

I experienced one fascinating issue while setting up argument validation.

I had originally intended to allow -d as an argument to indicate that the provided input should be decoded back to standard letters. However, when examining the arguments during validation, -d would always be combined with the following argument. (For example, if I passed -d "A̰ÁĂĂÅ", the first argument would arrive as -d:A̰ÁĂĂÅ at runtime, which led to spurious validation failures.)

I was pretty confused until I came across an issue on the official F# repository entitled “fsi.CommandLineArgs different behavior on -d/-r/-I args, ignores --”, which confirmed it as a bug and explained why it occurs.

I’m glad I came across that issue! I might have to figure out what I want to do as a workaround for that, though it’s hardly a big deal either way.
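
One possible workaround, sketched here purely as an assumption and not something the script does, would be to undo the merge before validating. The helper names splitMergedFlag and normalizeArgs are hypothetical, and the sketch only handles the -d:value shape described above.

```fsharp
open System

// Hypothetical helper: turn a merged "-d:value" argument back into
// the two arguments the user originally typed.
let splitMergedFlag (arg: string) : string list =
    if arg.StartsWith("-d:", StringComparison.Ordinal) && arg.Length > 3 then
        [ "-d"; arg.Substring 3 ]
    else
        [ arg ]

// Apply the split to every argument, leaving all others untouched.
let normalizeArgs (args: string list) : string list =
    args |> List.collect splitMergedFlag

// In a real script the input would come from fsi.CommandLineArgs
// (minus the script name); a literal list keeps this sketch runnable anywhere.
printfn "%A" (normalizeArgs [ "-d:AAA"; "extra" ])
```

The obvious caveat is that an input that genuinely starts with -d: would be split as well, so keeping the long --decode flag may still be the simpler choice.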

