russ @ 2003-12-16 22:56:00
An inquiry into several prevalent families of text encoding schemes
I've been pondering ghewgill's recent l33tspeak article, in which he describes receiving a letter from AT&T urging him to write like an illiterate moron. The ostensible goal of AT&T's suggested "text abbreviations" is to shorten messages (perhaps to save transmission costs, or to fit more text on tiny cellphone displays). I submit that this dialect is not actually l33tspeak per se, even though it is common (among people I know, anyway) to call it that. I now explore some related but distinct dialects, all of which have the property of making the author appear to be a stupid idiot.
There are actually several different dialects which are erroneously lumped under the term "l33tspeak"; I will consider 3 besides actual l33tspeak. The oldest one is what I'll call "license-plate-speak", canonical examples being various forms of braggadocio such as "2FAST", "2FAST4U", "GR8 CAR", etc. There is some intersection between this and l33tspeak, the classic example being "u" for "you". But ultimately text samples like "CAN U", "C U L8R", "2NITE" are attempting to save space by the single device of using the name of a digit or letter as a syllable in an existing word, purely as an abbreviation to shorten the text. This is important: aside from the substitution of a single character's sound for a syllable, no other changes occur, because the authors are trying to combine otherwise clear conventional communication with a single pathetically misguided alteration. This was done on license plates years before today's l33tspeakers were even born, because license plates have even more severe restrictions on message length: roughly a half dozen characters (depending on state, time period, type of plate, etc.), and those characters could only be letters and digits (and maybe you'd get a space or a word separation symbol like a star or the shape of your state, in order to form 2 words on the plate).
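For the programmers in the audience, the single device of license-plate-speak can be sketched as a lookup table. This is only a rough illustration: the SOUNDS table and the plate_speak function are names I made up for the purpose, the entries are just the examples above, and it cheats by matching whole words rather than finding syllables inside them.

```python
# Hypothetical lookup table, built only from the plate examples in the text.
# A real plate author works by ear, not from a table.
SOUNDS = {
    "too": "2", "to": "2", "for": "4", "you": "U", "see": "C",
    "great": "GR8", "later": "L8R", "tonight": "2NITE",
}

def plate_speak(phrase: str) -> str:
    """Replace each known word with its sound-alike and drop the spaces."""
    words = phrase.lower().split()
    return "".join(SOUNDS.get(word, word.upper()) for word in words)

for phrase in ["too fast for you", "see you later"]:
    plate = plate_speak(phrase)
    print(f"{phrase!r} ({len(phrase)} chars) -> {plate} ({len(plate)} chars)")
```

Note that nothing else about the text changes: words missing from the table pass through untouched (merely uppercased, as on a plate), which is exactly the point about this dialect.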
Now observe that l33tspeak (which does use that same technique of text shortening) also does various other things which have nothing to do with reducing text length. These additional devices serve to obscure the text (unlike license-plate-speak, which usually wants to be understood), making l33tspeak more cryptic to the uninitiated, via subcultural in-jokes, and for that reason I feel it is a separate (though just as annoying) phenomenon. For instance:
Substitution of one symbol for a visually similar symbol, as a visual pun (I=1=l=!=|, A=4, S=5)
Transpositions of adjacent symbols (the=teh, porn=pr0n)
The astute reader will observe that these leave text length unchanged, and they simply do not occur in license-plate-speak. Note that transpositions often resulted from incompetent typing mistakes which somehow became entrenched in the dialect.
Spurious augmentation (own=pown, suck=suxx0r, rock=r0xx0r)
Our most assiduous reader will notice that these actually increase text length, in diametrical opposition to the professed goals of AT&T, and to those of license-plate-speak.
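The length argument is easy to check mechanically. Here is a minimal sketch of the three devices, assuming nothing beyond the examples listed above; the function names and the tiny LOOKALIKES table are my own inventions for illustration, not a full grammar of l33tspeak.

```python
# Illustrative only: the table covers just the I/A/S examples from the text.
LOOKALIKES = str.maketrans({"i": "1", "a": "4", "s": "5"})

def substitute(word: str) -> str:
    """Swap letters for visually similar digits; length is unchanged."""
    return word.lower().translate(LOOKALIKES)

def transpose(word: str, i: int) -> str:
    """Swap the characters at positions i and i+1; length is unchanged."""
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def augment(word: str) -> str:
    """Replace a trailing 'ck' with 'xx0r' (suck -> suxx0r); length grows."""
    if word.endswith("ck"):
        return word[:-2] + "xx0r"
    return word

for original, mangled in [
    ("class", substitute("class")),
    ("the", transpose("the", 1)),
    ("suck", augment("suck")),
]:
    # The signed difference shows whether the device saved any characters.
    print(f"{original} -> {mangled} ({len(mangled) - len(original):+d} chars)")
```

None of the three ever produces a negative number, so whatever these devices are for, it is not saving space.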
L33tspeak itself has its roots in chatspeak, which I would consider to be an intermediate evolutionary step between license-plate-speak and l33tspeak. Chatspeak attempts to save space, like license-plate-speak, but as it was typed in real time by people with marginal typing skills, it was more aggressive in its text reduction, often using many abbreviations such as rofl, rotflmao, asl, etc. To the uninitiated, these are also cryptic, but they are not so willfully cryptic (nor as inventive) as l33tspeak. (I will avoid the controversial related subject of emoticons.)
The most modern of these illiterate and incompetent text systems is spamspeak, also sometimes erroneously called l33tspeak. The goal of spamspeak, of course, is to elude detection by spam filters, generally by any of the techniques already discussed, as well as by more insidiously moronic methods like inserting random punctuation be-tween l.e.t.t.e.r.s i_n var'ious ill:it:er:ate look!ng w4y5. These often increase text length far more than l33tspeak does. Like l33tspeakers, spammers also produce text that is purely unintentional error, due to their inability to operate their mailing software, e.g. instead of a random name appearing in the subject, one sees [RND_NAME]. My favorite was the subject line which said "Type your message subject here", which is paradoxically perfectly correct English, yet totally incompetent. I did recently observe a novel technique in the message body of a spam, which perhaps represents the pinnacle of cleverness for spamspeak:
T W W V A
H O E E L
E R R R I
  D E T G
  S   I N
      C E
      A D
      L
      L
      Y
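Read down each column and the block says "THE WORDS WERE VERTICALLY ALIGNED". Presumably (this is my guess, not anything the spammer confessed to) the idea is that a filter scanning line by line sees only alphabet soup. A rough sketch of the trick and its undoing, with write_columns and read_columns being names I invented for illustration:

```python
from itertools import zip_longest

def write_columns(message: str) -> str:
    """Lay each word out as a vertical column, one space between columns."""
    words = message.split()
    # Row n holds the n-th letter of every word; short words are padded.
    rows = zip_longest(*words, fillvalue=" ")
    return "\n".join(" ".join(row).rstrip() for row in rows)

def read_columns(block: str) -> str:
    """Recover the message by reading each column from top to bottom."""
    rows = block.splitlines()
    # Transpose the block; ragged rows are padded with spaces on the right.
    columns = zip_longest(*rows, fillvalue=" ")
    words = ("".join(column).strip() for column in columns)
    return " ".join(word for word in words if word)

message = "THE WORDS WERE VERTICALLY ALIGNED"
block = write_columns(message)
print(block)
print(read_columns(block))
```

The decoder relies on the column alignment being intact, so it only works on a fixed-width copy of the message.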
There you have it: at least 4 dialects (license-plate-speak, chatspeak, l33tspeak, and spamspeak), all distinct even though superficially similar in that they mangle the text and leave the reader feeling that the author is probably a 14-year-old. I trust linguists are already at work studying these curious phenomena. An interesting question is how these increasingly bizarre dialects carry over into other languages. Ĉu ekzistas l33t-esperanto? Ĉu "Mi rokas" iĝas "Mi r0XX45"? (That is: does l33t-Esperanto exist? Does "Mi rokas", "I rock", become "Mi r0XX45"?)
In any case, it seems sad but true that this stuff is becoming more prevalent in daily life. I am confident we will indeed soon see b33r commercials with people saying "WH444445555555UP?!"