Researchers at Bloomberg are always looking for innovative ways to use data, because users of our products need quick, reliable access to the latest news, sentiment, and alerts about the companies and people that interest them.
Mark Dredze, Miles Osborne, and Prabhanjan ‘Anju’ Kambadur recently published a paper titled “Geolocation for Twitter: Timing Matters” at the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), one of the premier conferences on natural language processing. Their paper explores how time and place relate to each other in the Twitter stream.
Knowing the location from which a tweet was sent can provide important information to a variety of applications, such as judging sentiment about a company within a geographic region. Although Twitter can provide location information in the structured metadata alongside tweets, most tweets do not include it. Machine learning systems have been built to infer the most likely location a Twitter user was in when they sent a tweet, but these systems generally ignore differences in the time of day. For example, a user might post a tweet in the morning while at home, and another later in the day while at work, with each focusing on different topics. Because topics vary in this way, how confidently this user’s location can be determined depends on the time of day.
The paper shows that cyclical temporal effects reduce the accuracy with which Twitter users can be geolocated. Furthermore, the Twitter stream changes rapidly, and a model built one day may perform quite poorly a week later when it is put into production. The researchers show that this temporal drift can be effectively countered by updating the machine learning model regularly, training it on only a small number of accurately geolocated tweets. This paper is the first to consider temporal aspects of geolocation prediction.
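To make the idea of regular updates concrete, here is a minimal Python sketch. It is not the researchers' model: the class name SlidingWindowGeolocator, the word-vote scoring, and the example tweets are all hypothetical, chosen only to illustrate retraining on a small window of recent, accurately geolocated tweets so that stale examples stop influencing predictions.

```python
from collections import Counter, defaultdict, deque


class SlidingWindowGeolocator:
    """Toy word-vote classifier (hypothetical, not the paper's model).

    It is retrained on only the most recent labeled tweets.
    """

    def __init__(self, window_size=1000):
        # deque(maxlen=...) drops the oldest example once the window is full.
        self.window = deque(maxlen=window_size)
        self.word_votes = defaultdict(Counter)

    def add_labeled(self, text, location):
        """Store one tweet whose location is known."""
        self.window.append((text.lower().split(), location))

    def retrain(self):
        """Rebuild word counts from scratch using only tweets still in the window."""
        self.word_votes = defaultdict(Counter)
        for words, location in self.window:
            for word in set(words):
                self.word_votes[word][location] += 1

    def predict(self, text):
        """Return the location with the most word votes, or None if no word is known."""
        votes = Counter()
        for word in set(text.lower().split()):
            votes.update(self.word_votes.get(word, {}))
        return votes.most_common(1)[0][0] if votes else None


# Made-up example data: the word "marathon" drifts from one city to another.
model = SlidingWindowGeolocator(window_size=2)
model.add_labeled("running the marathon today", "Boston")
model.add_labeled("fog over the bay this morning", "San Francisco")
model.retrain()
first_guess = model.predict("marathon weekend")   # "Boston"

model.add_labeled("marathon crowds everywhere", "London")
model.add_labeled("cheering at the marathon finish", "London")
model.retrain()
second_guess = model.predict("marathon weekend")  # "London"
```

In this toy setting the same text is assigned a different location once the window has rolled forward, which is the behavior a regularly updated model relies on to keep up with a changing stream. A production system would use a far richer model and feature set.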
To learn more about this work and the researchers, check out the paper on the ACL website and their personal websites (linked from their names above).