The user-defined geographic information is noisy data: while "Boston, MA" can be automatically disambiguated relatively easily to a physical location on the earth (corresponding to coordinates 42.35843, -71.05977), others ("Springfield") are more difficult (there are many Springfields); others still are nearly impossible ("home of that boy Biggie," a reference to New York City quoted from Jay-Z's "Empire State of Mind"), and some ("in ur fridge eatin ur foodz") don't map to any space in physical reality. The sheer volume of data, however, gives us the flexibility to focus more on precision than on overall accuracy - we can throw away all tweets where we aren't over 99% sure of the physical location.
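To make the "throw away anything under 99%" idea concrete, here is a minimal sketch in Python. It is not the researchers' code: the `GAZETTEER` table, the `resolve` and `keep_confident` helpers, and the confidence scores are all hypothetical, invented purely for illustration. Only the Boston coordinates come from the passage above; the Springfield entry is an approximate location for one of the many possible candidates.

```python
from typing import List, NamedTuple, Optional, Tuple


class Resolved(NamedTuple):
    lat: float
    lon: float
    confidence: float  # made-up score between 0 and 1


# Hypothetical toy gazetteer: free-text location -> best guess plus confidence.
# A real system would use a proper geocoder; these scores are illustrative only.
GAZETTEER = {
    "boston, ma": Resolved(42.35843, -71.05977, 0.995),
    # Ambiguous: many Springfields exist, so the best guess gets a low score.
    "springfield": Resolved(39.78, -89.65, 0.20),
}


def resolve(text: str) -> Optional[Resolved]:
    """Look up a user-supplied location string; None if it maps to nothing."""
    return GAZETTEER.get(text.strip().lower())


def keep_confident(
    profile_locations: List[str], threshold: float = 0.99
) -> List[Tuple[str, Resolved]]:
    """Keep only locations resolved with confidence above the threshold."""
    kept = []
    for text in profile_locations:
        match = resolve(text)
        if match is not None and match.confidence > threshold:
            kept.append((text, match))
    return kept


if __name__ == "__main__":
    samples = ["Boston, MA", "Springfield", "in ur fridge eatin ur foodz"]
    for text, match in keep_confident(samples):
        print(text, match.lat, match.lon)
```

With these toy numbers only "Boston, MA" survives the filter. That is the trade-off the quote describes: most of the data is discarded, but what remains can be trusted.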
What I like about the article and the research connected to it is that, instead of talking about the technology (Is it bad for our language? Is it destroying our ability to talk to each other?), it just gets on and uses that technology to find out interesting things about the language around us.
It's linked to this site, The Lexicalist, which calls itself "a demographic dictionary of modern American English."