Y’all be speaking funny on Twitter

Twitter can be coo. Twitter can also be koo. But which it is depends on where you live.

Carnegie Mellon University researchers have been using the microblogging site to learn more about regional slang. They analyzed 380,000 messages from the site during a one-week period from last March, an estimated 15% of the daily US total at the time. Thanks to the site’s use of geotags for users posting from mobile devices, they were able to track where different phrases were most popular.

The results encompassed cliches/truisms old (“y’all” in the south, “yinz” in Pittsburgh) and new (“hella” in Northern California), as well as the revelation that there’s “suttin” notable about New Yorkers.

There were also some notable differences in modifiers: a New Yorker would be more likely to be “dead ass tired” while a Los Angeles resident would more likely be “tired af”, with the “a” standing for “as” and the “f”, well…

It also appears that New Yorkers, for all their reputation for being rushed for time, don’t always practice text-style abbreviations: they are disproportionately likely to write “youu” in place of “you” or to double-type the letter “l”. Of course, the assumption is that this has something to do with the people rather than New York keyboards being less reliable.

The research isn’t necessarily a representative sample of the general population: Twitter users are much more prevalent among 18-34-year-olds, though as of 2009 the site had a higher median user age than both Facebook and MySpace. (That may have changed since, with a lot of older users joining Facebook.) And while I don’t have figures, I’d suspect that as the sample was restricted to those posting from phones, the bias to younger users may be even stronger.

The report was part of work by a team of four headed by post-doctoral fellow Jacob Eisenstein: it wasn’t designed so much to find out what people said on Twitter as to find a model for analyzing the text.

As part of the project, Eisenstein and company tried to automatically predict users’ locations. When it came to pinpointing a location, they were out by a mean of 900km but a median of 494km (suggesting that the larger misses were rarer but more spectacular). In picking a more general location, they were able to get the correct state 24% of the time, and correctly pick one of four regions of the country on 58% of occasions.
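
For anyone wondering how a mean can sit so far above a median, here is a minimal Python sketch. The error values are invented purely for illustration and are not the study’s data: they simply show that one spectacular miss is enough to drag the mean well above the median while the typical miss stays much smaller.

```python
from statistics import mean, median

# Hypothetical prediction errors in km. These numbers are made up for
# illustration only; they are NOT taken from the Carnegie Mellon study.
errors_km = [120, 250, 400, 494, 600, 900, 3500]

mean_error = mean(errors_km)      # pulled upward by the single 3500km miss
median_error = median(errors_km)  # the middle value, unaffected by how big the worst miss is

print(f"mean error:   {mean_error:.0f}km")
print(f"median error: {median_error}km")
```

Swap the 3500 for something like 1000 and the mean drops sharply while the median doesn’t move, which is why a gap between the two points to a small number of very large errors.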

Oh, and if you’re wondering, it’s coo in Southern California and koo in Northern California.

5 Responses to Y’all be speaking funny on Twitter

    • Sadly, I fear for the English language; yes, it's malleable, but current use just seems to make it more cliquish, and ideas less well defined. People could just as easily be muttering drivel in the hope that everyone agrees with something that has no defined meaning.