A Better CAPTCHA
I’ve been thinking about improving the design of the CAPTCHA. What I came up with today is nothing really new in terms of principles, just capitalizing on some important observations:
- Humans are clever with words as well as pictures. Example: leetspeak, which in its most gloriously convoluted form increases the complexity of lexical analysis by (I’d estimate) an order of magnitude or so.
- Humans maintain large and dynamic repositories of all manner of content, such as text and images.
- Search engine response times, while quite fast and getting faster, impose very long delays from a computational standpoint.
Image processing is the sole basis of the most common form of CAPTCHA, which asks the respondent to identify distorted and/or broken letters against an uneven background. It’s not something we consciously think about while doing it, and it’s very hard to explain how to do it in terms of instructions a computer can follow. However, the simpler CAPTCHAs, in which lines are bent but not broken, are little obstacle to optical character recognition (OCR) programs. Artificial intelligence and neural network development in this area, fueled by roboticists and grocers and spammers alike, may soon make it impossible to distinguish a smart computer from a regular Joe.
Alternative methods do exist, most still based on images. One eccentric but moderately effective solution is KittenAuth, which asks the respondent to correctly identify some subset of a group of adorable animal photographs (kittens, for instance). Presumably the test draws on some largish database of adorable animal images. Whether it would work against a resourceful foe is questionable, but then, most sites large enough to have such foes would find KittenAuth kinda tacky for their purposes. It’s a great place to start for considering the variables involved in such an auth tool, though.
As best I can tell, the primary limitations are the number of distinct images possible and the difficulty of statistically correlating two images from the same category. If an attacker can easily collect all the images and index them by hand in some reasonable amount of time, there’s a problem. Similarly, if he discovers statistically significant hash functions that lump the database entries into common subpopulations, and can index those in a reasonable amount of time, he can write a program that has an unacceptably high probability of passing the CAPTCHA.
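To put rough numbers on "unacceptably high," here is a back-of-the-envelope sketch. The grid size (9 images shown, 3 to pick) and the model itself are my own assumptions for illustration, not KittenAuth's actual parameters: each shown image is taken to be in the attacker's index independently with some probability, and the attacker only answers with certainty when every shown image is known. That makes the result a lower bound on his real pass rate.

```python
from math import comb

def blind_guess_probability(shown, targets):
    """Chance of picking exactly the right subset by pure luck."""
    return 1 / comb(shown, targets)

def pass_probability(shown, targets, indexed_fraction):
    """Rough lower bound on the attacker's pass rate (assumed model).

    Each shown image is assumed to be in the attacker's index independently
    with probability `indexed_fraction`. He answers correctly when every
    shown image is known and falls back to a blind guess otherwise.
    """
    all_known = indexed_fraction ** shown
    return all_known + (1 - all_known) * blind_guess_probability(shown, targets)

# Hypothetical test: 9 images shown, 3 of them kittens.
for fraction in (0.0, 0.5, 0.9, 1.0):
    print(fraction, round(pass_probability(9, 3, fraction), 4))
```

Even under this pessimistic model the pass rate climbs steeply once most of the database is indexed, which is why the size of the image pool matters so much.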
The first limitation can be transformed into a special case of the second by introducing random mutations to each image so that they cannot be positively identified as the original from the database. After that, it becomes a matter of addressing statistical weaknesses, such as the distinctive color properties of foxes or the frequency domain characteristics in a picture of a porcupine.
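A toy sketch of that transformation, under loud assumptions: the "image" is just a nested list of RGB tuples, the mutation is small per-channel noise, and `fingerprint`, `mutate` and `mean_color` are names I made up for the illustration. It shows that an exact hash lookup stops working after mutation, while a crude statistic like mean color barely moves, which is exactly the kind of statistical weakness that remains.

```python
import hashlib
import random

def fingerprint(image):
    """Exact fingerprint: SHA-256 over the raw channel values."""
    raw = bytes(channel for row in image for pixel in row for channel in pixel)
    return hashlib.sha256(raw).hexdigest()

def mutate(image, rng, strength=3):
    """Shift every channel by a small nonzero random amount, clamped to 0..255."""
    steps = [d for d in range(-strength, strength + 1) if d != 0]
    return [
        [tuple(min(255, max(0, c + rng.choice(steps))) for c in pixel)
         for pixel in row]
        for row in image
    ]

def mean_color(image):
    """Average (r, g, b) over the whole image: a crude statistical signature."""
    pixels = [pixel for row in image for pixel in row]
    return tuple(sum(p[i] for p in pixels) / len(pixels) for i in range(3))

# Hypothetical 8x8 "fox" image: uniformly orange.
fox = [[(200, 110, 40)] * 8 for _ in range(8)]
mutated = mutate(fox, random.Random(0))

print("exact match survives:", fingerprint(fox) == fingerprint(mutated))
print("mean color before:", mean_color(fox))
print("mean color after: ", mean_color(mutated))
```

A real mutation step would need to be much more aggressive (cropping, warping, recoloring) to blur signatures like this one without making the picture unrecognizable to people.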
I’m of two minds on how best to generalize these concepts. A straight image identification test with a vast library of arbitrary clipart was my first approach, though with synonyms there may be unacceptable amounts of ambiguity. Then today I got to thinking about rebuses.
A rebus is a representation of a word or phrase in terms of textual and pictorial elements and the spatial relationships between elements. The I Heart NY logo is an example, although the most common form is that of a symbolic concatenation resembling an arithmetical expression. The thing about rebuses is that they combine our strengths in identifying symbols, recognizing implicit or disguised information, and verifying our conclusions within context. They’d be more effective in languages like Italian, where the endings of words convey very little statistically about the rest of the word, but might also be made to work in English. To script a rebus maker you’d need a few things:
- A very large and diverse image database. Google image search is an example, though the pictures aren’t always what’s expected and technically the images in their cache are often copyrighted material.
- A dictionary mapping nouns (or really just any words that have readily recognizable symbols) to possible textual representations (letters, homonyms, (mis)spellings, etc).
- Routines that filter or transform images from the database to a usable size and alter them in some randomized fashion to be different from the original.
- A sieve that finds words or phrases which particularly lend themselves to rebuses of the symbol-concatenation variety. To meet this criterion it must be difficult to accurately guess an individual word using wildcards and the remaining unaltered letters (which we assume OCR can read), or to guess the entire phrase. One could change the puzzle to contain random strings rather than words, but the context would be lost.
Without some research, it’s hard to say whether this method could work. There are a lot of words in the English language, though not as many are in frequent use. The same goes for pictures on the web.
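The sieve is the piece that's easiest to sketch. The version below is only a rough illustration under my own assumptions: the tiny `vocabulary`, the `min_candidates` threshold, and the function names are all hypothetical. It plays the attacker's side, treating the pictured part of a word as a wildcard of unknown length and counting how many dictionary words fit the letters left visible. Words that leave too few candidates get thrown out.

```python
import re

def wildcard_candidates(word, hidden, vocabulary):
    """Words an attacker could guess when `hidden` is replaced by a picture."""
    start = word.index(hidden)
    prefix, suffix = word[:start], word[start + len(hidden):]
    # The attacker can read the visible letters but not the picture,
    # so the hidden part is a wildcard of one or more characters.
    pattern = re.compile(re.escape(prefix) + r".+" + re.escape(suffix))
    return sorted(w for w in vocabulary if pattern.fullmatch(w))

def sieve(entries, vocabulary, min_candidates=3):
    """Keep (word, hidden) pairs whose visible letters leave enough ambiguity."""
    return [
        (word, hidden)
        for word, hidden in entries
        if len(wildcard_candidates(word, hidden, vocabulary)) >= min_candidates
    ]

# Hypothetical toy data; a real sieve would run over a full dictionary.
vocabulary = {"carpet", "trumpet", "puppet", "limpet", "catalog", "dialog"}
entries = [("carpet", "car"), ("catalog", "cat")]

print(wildcard_candidates("carpet", "car", vocabulary))
print(sieve(entries, vocabulary))
```

With this toy vocabulary, "car + pet" survives because four words end in "pet", while "cat + alog" is rejected because only two words end in "alog". A serious version would weight candidates by word frequency rather than just counting them.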
Great Moments in Instant Messaging, vol. 1 no. 1
This didn’t quite fit roundly in the existing categories, though it’s sort of a Yours Truly issue, sort of just a ha-ha-only-serious aphorism. So, I’ve opted to install a quoteboard section for such silly little things. Anyway, it’s on the subject of coming clean to one’s crush:
Me: like getting up the nerve to hit myself in the mouth with a wrench
Amused confidante: yeah.
Me: you know it’s not going to be pretty, but the ache has gone on for too long and that fucker’s gotta come out
On a loosely related note, I am getting seriously miffed about my upload rate on Time Warner (not to mention the inconstancy of data flow in both directions). Up-rate’s not something the average person gets concerned with very frequently, but when you’re uploading 256Kb JPEG files via SSH and even unencrypted blog posts on a frequent enough basis, at some point you just want to stab your eyes out. To think that if I were working in Denmark or the UK, I could have seeded multiple Ogg or MP3 torrents in the time it just took WordPress’s post editor page to reload. And this file can’t be much over 2 kilobytes. If the broadband industry wants to convince net neutrality watchdogs of their good intent, they could start by catching us up with the EU ratewise. I couldn’t stand to be any more throttled than I already am at present.
Beautiful, Beautiful Things.
Okay so I’m finally ready to back up my previous statements regarding all the sightseeing I did on my vacation:
Hotness.
What I’ve posted is a small, small sample of what Nikkie has in her gallery. It was too much to wade through, so I picked out my personal favorites. They make great wallpaper and whatnot. And, if you happen to know someone looking for an apprentice photographer, please give her a nod. She did kind of just have her camera die on her in the process of taking these.
High Art in Gaming?
Been meaning to take a crack at this one for a couple weeks now, but I got delayed from vacation and all.
A while back I read this feature by Ernest Adams of Gamasutra. It points out that gaming has yet to gain respect as an art form, and explores the question of where and when we can expect a renaissance that will bring universal recognition to the industry. While I largely agree with his analyses of the current climate, I take issue with some of the conclusions regarding what constitutes artful game design. He cites such well-known literati as Will Wright and Sid Meier–well and good–but a careful listen to what he finds important gives me the impression he’s looking in the wrong direction. He’s not looking at the things such game designers actually care about.
The public misconception that a game cannot be high art reflects not just a lack of quality in the current product, but also misunderstanding of where its true art lies. On this, the industry cannot even agree with itself. Adams’ feature list sounds like a drier, artsier version of what companies like Square already strive for. It casually ticks off the programming and interaction design aspects before dwelling at length on the things it has in common with conventional media – setting, CG & cinematography, voice acting, themes & morals, obligatory allusions to bring in the cultural backstory and whatnot.
If these things lay at the heart of gaming, all games of any worth would have to play like movies. And for a while, this actually seemed the case, because with the rise of monolithic gaming studios came the same wrong assumptions by executives, who had plenty of money to throw at gorgeous graphic design, star-studded voiceovers and soundtracks.
What’s wrong with this? Gaming abhors repetition. More than just a simulation of the incalculably complex bifurcating nature of reality, a modern game needs a formula for captivating gameplay–an artificially intelligent engine that functions to engage the audience’s humanity and reasoning. Under this definition, we can look at game design as a computability problem. Assuming that AI-completeness is different from Turing-completeness, a tradeoff is necessary between the perceived quality of a gaming experience, which comes from its AI strength as well as the prefabricated script, and situational flexibility.
A formula for games that is broadly repeatable is inherently going to produce less lasting, less satisfactory games, but it spares the studio from having to employ real artistes, who are more valuable by a figure or two than basic code-monkeys. Most of them need that savings nowadays to stay profitable. Lo and behold, formula art emerges.
For several years, the result has been stunningly pretty games but a general lack of forward progress and innovation. While a few studios pushed forward with new ideas about gameplay, the rest focused on the makeup job, giving us dynasties like Halo, Final Fantasy and Prince of Persia (and a slew of less impressive clones). They’ve set the standards for prettiness, and any game designer would do well to pay attention to them in this aspect, but they are often called overrated by the very gamers most addicted to them.
By contrast, innumerable smaller studios continue to accumulate small but ravenous fan bases and are followed, at least casually, by gamers and the gaming press. They know their niche, and they don’t stray from it into the world of Hollywood-style games, because to them following norms and appeasing critics is less profitable than delivering the ultimate user experience (fun may not be the same thing as art, but in gaming the two are closely intertwined).
These are companies that are adding real value to the industry. Valve, technically adept creators of the Half-Life series of PC games, have set the bar for fast, reliable execution, in addition to being worshipped for their beautiful visuals and for pioneering interactions between player and simulated environment. id Software made some of the first 3D games. Epic Games created a rock-solid, cross-platform game engine (developed for Unreal) that is often licensed to other studios, is heavily modifiable by end users via UnrealScript and includes a suite of CAD software. Cyan Worlds cornered the market on thoughtful, gorgeous, intellectually involved games until the recent completion of their Myst saga. Their success is a product of their attention to what matters: the craft. At the heart of the craft, of course, is the code and data.
Looking much farther into the future, there is certainly the potential for developing highly cinematic holo-novels in a style much like that suggested by Adams. However, it is naive to assume that they will precisely emulate old media when they could instead be making use of innovative graphical interfaces that are currently in use or development. Some things, like the HUD, are too valuable to be dropped, and aren’t nearly as disruptive to the user experience as having an NPC say something from his script that’s somehow inappropriate to the situation at hand. The same goes for overhead indicators, which often take the place of details that can’t be effectively made visible or audible in-game.
Ultimately, the holy grail of game developers should be not to recreate Hamlet or Gone with the Wind, but rather the Illustrated Primer from Neal Stephenson’s The Diamond Age. Such games would put humans in situations just beyond what circumstances allow for, and proceed to tease and test, resulting in emotional and intellectual development. It is not beyond us to dream up effective approximations to such an extreme. On our way, however, we must learn to revere a new and different group of artistes who ply a wholly different kind of art.