russ (goulo) wrote,
@ 2003-12-16 22:56:00
An inquiry into several prevalent families of text encoding schemes
I've been pondering ghewgill's recent l33tspeak article, in which he describes receiving a letter from AT&T urging him to write like an illiterate moron. The ostensible goal of AT&T's suggested "text abbreviations" is to shorten messages (perhaps to save transmission costs or to make more of the message viewable on tiny cellphone displays). I submit that this dialect is not actually l33tspeak per se, even though it is common (among people I know, anyway) to call it l33tspeak. I now explore some related but distinct dialects, all of which have the property of making the author appear to be a stupid idiot.

There are actually several different dialects which are erroneously lumped under the term "l33tspeak"; I will consider 3 besides actual l33tspeak. The oldest one is what I'll call "license-plate-speak", canonical examples being various forms of braggadocio such as "2FAST", "2FAST4U", "GR8 CAR", etc. There is some intersection between this and l33tspeak, the classic example being "u" for "you". But ultimately text samples like "CAN U", "C U L8R", "2NITE" are attempting to save space by the single device of using the name of a digit or letter as a syllable in an existing word, purely as an abbreviation to shorten the text. This is important: aside from the substitution of a single character's sound for a syllable, no other changes occur, because the writers are trying to combine otherwise clear conventional communication with a single pathetically misguided alteration. This was done on license plates years before today's l33tspeakers were even born, because license plates have even more severe restrictions on message length, roughly a half dozen characters (depending on state, time period, type of plate, etc.), and the characters could only be letters and digits (and maybe you'd get a space or a word separation symbol like a star or the shape of your state, in order to form 2 words on the plate).
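
The single device described above (a digit or letter name standing in for a syllable) is simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the `SOUND_ALIKES` table and the `plate_speak` function are hypothetical names invented here, the table is deliberately tiny, and the matching is naive substring replacement with no idea of real pronunciation.

```python
# Hypothetical table (an assumption for illustration): a fragment of spelling
# paired with the letter or digit whose *name* sounds roughly the same.
# Order matters: "too" must be tried before "to".
SOUND_ALIKES = [
    ("you", "u"),
    ("see", "c"),
    ("for", "4"),
    ("too", "2"),
    ("to", "2"),
    ("ate", "8"),
    ("eat", "8"),
]


def plate_speak(text: str) -> str:
    """Apply the one license-plate device: sound-alike substitution.

    Nothing else is changed, which is the point of the dialect. The
    substitution is purely textual, so it will happily mangle words where
    the fragment is spelled the same but pronounced differently.
    """
    out = text.lower()
    for fragment, symbol in SOUND_ALIKES:
        out = out.replace(fragment, symbol)
    return out.upper()


if __name__ == "__main__":
    for phrase in ["see you later", "great car", "too fast for you"]:
        print(phrase, "->", plate_speak(phrase))
```

Every replacement in the table is shorter than the fragment it replaces, so the output can only shrink or stay the same length, which matches the stated goal of the dialect.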

Now observe that l33tspeak (which does use that same technique of text shortening) also does various other things which have nothing to do with reducing text length. These additional devices serve to obscure the text (unlike license-plate-speak, which usually wants to be understood), making l33tspeak more cryptic to the uninitiated via subcultural in-jokes, and for that reason I feel it is a separate (though just as annoying) phenomenon. For instance:

Substitution of one symbol for a visually similar symbol, as a visual pun (I=1=l=!=|, A=4, S=5)
Transpositions of adjacent symbols (the=teh, porn=pr0n)
The astute reader will observe that these leave text length unchanged, and they simply do not occur in license-plate-speak. Note that transpositions often resulted from incompetent typing mistakes which somehow became entrenched in the dialect.
Spurious augmentation (own=pown, suck=suxx0r, rock=r0xx0r)
Our most assiduous reader will notice that these actually increase text length, in diametrical opposition to the professed goals of AT&T, and the goals of license plate text.
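
The length argument can be checked mechanically. Below is a rough Python sketch of the three devices, assuming one look-alike per letter and a single made-up suffix rule; `LOOKALIKES`, `substitute`, `transpose`, and `augment` are hypothetical names chosen for this illustration, not anything a l33tspeaker would recognize as canonical.

```python
# One look-alike per letter, taken from the examples in the post
# (i/l -> 1, a -> 4, s -> 5, and o -> 0 as in "pr0n"). Picking exactly one
# target per letter is an assumption; real usage is far less consistent.
LOOKALIKES = str.maketrans({"i": "1", "l": "1", "a": "4", "s": "5", "o": "0"})


def substitute(word: str) -> str:
    """Visual pun: swap letters for similar-looking symbols. Length unchanged."""
    return word.translate(LOOKALIKES)


def transpose(word: str, i: int) -> str:
    """Swap the characters at positions i and i+1. Length unchanged."""
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def augment(word: str) -> str:
    """Spurious augmentation: a trailing "ck" becomes "xx0r". Length grows.

    This single rule is an assumption that happens to cover the "suck" and
    "rock" examples; other words are returned untouched.
    """
    if word.endswith("ck"):
        return word[:-2] + "xx0r"
    return word


if __name__ == "__main__":
    print(transpose("the", 1))                # teh
    print(substitute(transpose("porn", 1)))   # pr0n
    print(augment("suck"))                    # suxx0r
    print(substitute(augment("rock")))        # r0xx0r
```

The first two functions map a word to a word of the same length, and the third can only add characters, so none of them serves the space-saving goal of license-plate-speak.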

L33tspeak itself has its roots in chatspeak, which I would consider to be an intermediate evolutionary step between license-plate-speak and l33tspeak. Chatspeak attempts to save space, like license-plate-speak, but because it was typed in real time by people with marginal typing skills, it was more aggressive in its text reduction, often using many abbreviations such as rofl, rotflmao, asl, etc. To the uninitiated, these are also cryptic, but they are not so willfully cryptic (nor as inventive) as l33tspeak. (I will avoid the controversial related subject of emoticons.)

The most modern of these illiterate and incompetent text systems is spamspeak, also sometimes erroneously called l33tspeak. The goal of spamspeak, of course, is to elude detection by spam filters, generally by any of the techniques already discussed, as well as by more insidiously idiotic methods like inserting random punctuation be-tween l.e.t.t.e.r.s i_n var'ious ill:it:er:ate look!ng w4y5. These often serve to increase text length far more than l33tspeak does. Like l33tspeakers, spammers also exhibit various features which are purely unintentional errors, due to their inability to operate their mailing software, e.g. instead of a random name appearing in the subject, one sees [RND_NAME]. My favorite was the subject line which said "Type your message subject here", which is paradoxically perfectly correct English, yet totally incompetent. I did recently observe a novel technique in the message body of a spam, which perhaps represents the pinnacle of cleverness for spamspeak:
T  W  W  V  A
H  O  E  E  L
E  R  R  R  I
   D  E  T  G
   S     I  N
         C  E
         A  D
         L
         L
         Y
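
Presumably the trick works because a filter that scans the body line by line sees only isolated letters. Reading down the columns recovers the message, as in this minimal Python sketch. It assumes the layout shown above (one letter every third character, columns padded with spaces); `SPAM_BODY` and `read_columns` are hypothetical names used only for this illustration.

```python
# The message body from the spam, row by row. Built from a list so that the
# leading spaces, which carry the column positions, are explicit.
SPAM_BODY = "\n".join([
    "T  W  W  V  A",
    "H  O  E  E  L",
    "E  R  R  R  I",
    "   D  E  T  G",
    "   S     I  N",
    "         C  E",
    "         A  D",
    "         L",
    "         L",
    "         Y",
])


def read_columns(block: str, step: int = 3) -> str:
    """Read a block of text top to bottom, one column at a time.

    `step` is the assumed distance between columns (3 in the sample).
    Rows shorter than the current column, and padding spaces, are skipped.
    """
    rows = block.splitlines()
    width = max(len(row) for row in rows)
    words = []
    for col in range(0, width, step):
        letters = [row[col] for row in rows if col < len(row) and row[col] != " "]
        if letters:
            words.append("".join(letters))
    return " ".join(words)


if __name__ == "__main__":
    print(read_columns(SPAM_BODY))
```

A filter would need some step like this before its usual keyword matching, and it would also have to guess the column spacing, which is fixed here only because the sample is known in advance.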


There you have it: at least 4 dialects (license-plate-speak, chatspeak, l33tspeak, and spamspeak), all distinct even though superficially similar in that they mangle the text and leave the reader feeling that the author is probably a 14-year-old. I trust linguists are already at work studying these curious phenomena. An interesting question is how these increasingly bizarre dialects carry over into other languages. Ĉu ekzistas l33t-esperanto? Ĉu "Mi rokas" iĝas "Mi r0XX45"? (Does l33t-Esperanto exist? Does "Mi rokas", i.e. "I rock", become "Mi r0XX45"?)

In any case, it seems sad but true that this stuff is becoming more prevalent in daily life. I am confident we will indeed soon see b33r commercials with people saying "WH444445555555UP?!"



wtf?
nugget
2003-12-16 09:34 pm UTC
The most novel bit of spamspeak that I've encountered lately is the introduction of the word "curn" in porn spams. Presumably in the appropriate proportional font the "r" and "n" are intended to bleed together in order to appear as a single "m".

I have got to imagine that this particular invention poses a slight dilemma for the spam author, in that it mandates using lower-case text, which is not the norm in this context.



argilo
2003-12-16 09:53 pm UTC
I have been using Esperanto on the net for four years now, but I have never seen such nonsense. I wonder whether it has already appeared in other languages.



ghewgill
2003-12-16 09:56 pm UTC
Wow, nice summary! I had vaguely recognized this morning that "l33tspeak" might not have been the right word for the abbreviated literary jetsam that found its way into my mailbox, but couldn't think of anything more specific at the time. Now I know.

I too was thinking about the English-isms that make this sort of chatspeak possible. The pronunciations of at least the following letters and numbers seem to match existing words unrelated to their standalone purpose: B,C,G,J,P,Q,R,T,U,Y,1,2,4,8. How often does this happen in other languages? (Letter/word homonyms in Esperanto seem relatively uninteresting, because of the regular letter names and consistent phonetic pronunciation.)

How many sensible sentences can you construct with single-letter chatspeak "words"? I8AB. ICUP. RU4T? UR12B8. Okay, I'm taking some liberties with the grammar.

One could hope that all this will eventually become a footnote in the history of language, but I fear that you are right and it will become worse before it gets better.



nugget
2003-12-16 10:12 pm UTC
A common one is "N8" in German, meaning "good night": "N" plus "acht" (eight) sounds like the German word "Nacht".

In French I've seen "B1" used as "bien", as in "b1sur" for "bien sûr" (of course) or "cb1" for "c'est bien" (that's good).


d00d
(Anonymous)
2003-12-18 06:07 am UTC
Don't forget your cow-orkers.


you forgot...
6opou
2003-12-18 06:45 am UTC
I think that a nod must be given to ham radio operators (and for that matter, telegraph operators). They created an impressive shorthand system for speaking to one another in Morse code which, when translated into letters, is still incredibly difficult for the uninitiated to decipher. I looked at a sheet where a buddy of mine was translating Morse into letters and had to ask him what just about every other word was. They often use only two or three unambiguous letters to represent a word.

Of course ham radio operators then became the first computer geeks and started Unix and ARPANET. I don't think they were the first to speak l33t, though. They just used abbreviations like `fsck' for file system check.



venaja
2003-12-18 07:52 pm UTC
I'm so happy to see someone finally distinguishing between text message speak and l337 speak. I've been trying to convince people of that for years, only to be foiled by rampant apathy. ;)
