acl acl2013 acl2013-20 acl2013-20-reference knowledge-graph by maker-knowledge-mining
Source: pdf
Author: Bo Han; Paul Cook; Timothy Baldwin
Abstract: We implement a city-level geolocation prediction system for Twitter users. The system infers a user’s location based on both tweet text and user-declared metadata using a stacking approach. We demonstrate that the stacking method substantially outperforms benchmark methods, achieving 49% accuracy on a benchmark dataset. We further evaluate our method on a recent crawl of Twitter data to investigate the impact of temporal factors on model generalisation. Our results suggest that user-declared location metadata is more sensitive to temporal change than the text of Twitter messages. We also describe two ways of accessing/demoing our system.
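The stacking idea in the abstract (Wolpert, 1992) can be illustrated with a minimal sketch: one base classifier per information source (tweet text and the user-declared location field), plus a second-level learner trained on the base classifiers' out-of-fold predictions. Everything in this sketch is an assumption made for illustration and not the authors' implementation: the naive Bayes base learners, the majority-vote lookup table used as the meta-learner, the class and function names (`NaiveBayes`, `StackedGeolocator`, `out_of_fold_predictions`), the city labels, and the toy data.

```python
from collections import Counter, defaultdict
import math


class NaiveBayes:
    """Tiny multinomial naive Bayes over whitespace tokens (add-one smoothing)."""

    def fit(self, docs, labels):
        self.label_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.token_counts[label].update(doc.lower().split())
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, doc):
        def score(label):
            counts = self.token_counts[label]
            total = sum(counts.values()) + len(self.vocab) + 1
            s = math.log(self.label_counts[label])  # unnormalised class prior
            for tok in doc.lower().split():
                s += math.log((counts[tok] + 1) / total)
            return s
        return max(sorted(self.label_counts), key=score)


def out_of_fold_predictions(docs, labels, k):
    """Predict each training example with a model that never saw it."""
    preds = [None] * len(docs)
    for fold in range(k):
        train_idx = [i for i in range(len(docs)) if i % k != fold]
        model = NaiveBayes().fit([docs[i] for i in train_idx],
                                 [labels[i] for i in train_idx])
        for i in range(fold, len(docs), k):
            preds[i] = model.predict(docs[i])
    return preds


class StackedGeolocator:
    """Level 0: one classifier per field. Level 1: majority table over their outputs."""

    FIELDS = ("text", "loc")

    def fit(self, users, cities, k=3):
        # Level-0 predictions on held-out folds become the level-1 training features.
        oof = {f: out_of_fold_predictions([u[f] for u in users], cities, k)
               for f in self.FIELDS}
        table = defaultdict(Counter)
        for p_text, p_loc, city in zip(oof["text"], oof["loc"], cities):
            table[(p_text, p_loc)][city] += 1
        self.meta = {key: c.most_common(1)[0][0] for key, c in table.items()}
        # Refit the base classifiers on all training data for use at test time.
        self.base = {f: NaiveBayes().fit([u[f] for u in users], cities)
                     for f in self.FIELDS}
        return self

    def predict(self, user):
        key = tuple(self.base[f].predict(user[f]) for f in self.FIELDS)
        # Unseen combination of base outputs: fall back to the text classifier.
        return self.meta.get(key, key[0])


# Invented toy data: six users from two hypothetical city labels.
users = [
    {"text": "tram footy laneway coffee", "loc": "melbourne australia"},
    {"text": "subway bagel yankees", "loc": "nyc"},
    {"text": "footy tram mcg", "loc": "melb"},
    {"text": "yankees subway brooklyn", "loc": "new york ny"},
    {"text": "coffee laneway tram", "loc": "melbourne"},
    {"text": "bagel brooklyn subway", "loc": "nyc usa"},
]
cities = ["melbourne-au", "new york-us"] * 3

model = StackedGeolocator().fit(users, cities)
print(model.predict({"text": "tram coffee footy", "loc": "melbourne"}))
print(model.predict({"text": "subway bagel", "loc": "somewhere over the rainbow"}))
```

In the second call the location field contains only unseen tokens, so its base classifier is uninformative; the level-1 table has learned from the held-out folds to side with the text classifier in that situation. A realistic system would use stronger base learners, class-probability features, and a trained classifier at the second level.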
Lars Backstrom, Jon Kleinberg, Ravi Kumar, and Jasmine Novak. 2008. Spatial variation in search engine queries. In Proc. of WWW, pages 357–366, Beijing, China.
Lars Backstrom, Eric Sun, and Cameron Marlow. 2010. Find me if you can: improving geographical prediction with social and spatial proximity. In Proc. of WWW, pages 61–70, Raleigh, USA.
Orkut Buyukkokten, Junghoo Cho, Hector Garcia-Molina, Luis Gravano, and Narayana Shivakumar. 1999. Exploiting geographical location information of web pages. In ACM SIGMOD Workshop on The Web and Databases, pages 91–96, Philadelphia, USA.
Zhiyuan Cheng, James Caverlee, and Kyumin Lee. 2010. You are where you tweet: a content-based approach to geo-locating twitter users. In Proc. of CIKM, pages 759–768, Toronto, Canada.
David J. Crandall, Lars Backstrom, Daniel Huttenlocher, and Jon Kleinberg. 2009. Mapping the world’s photos. In Proc. of WWW, pages 761–770, Madrid, Spain.
Jacob Eisenstein, Brendan O’Connor, Noah A. Smith, and Eric P. Xing. 2010. A latent variable model for geographic lexical variation. In Proc. of EMNLP, pages 1277–1287, Cambridge, MA, USA.
Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871–1874.
Bo Han, Paul Cook, and Timothy Baldwin. 2012. Geolocation prediction in social media data by finding location indicative words. In Proc. of COLING, pages 1045–1062, Mumbai, India.
Brent Hecht, Lichan Hong, Bongwon Suh, and Ed H. Chi. 2011. Tweets from justin bieber’s heart: the dynamics of the location field in user profiles. In Proc. of SIGCHI, pages 237–246, Vancouver, Canada.
Liangjie Hong, Amr Ahmed, Siva Gurumurthy, Alexander J. Smola, and Kostas Tsioutsiouliklis. 2012. Discovering geographical topics in the twitter stream. In Proc. of WWW, pages 769–778, Lyon, France.
Sheila Kinsella, Vanessa Murdock, and Neil O’Hare. 2011. “I’m eating a sandwich in glasgow”: modeling locations with tweets. In Proc. of the 3rd International Workshop on Search and Mining User-generated Contents, pages 61–68, Glasgow, UK.
Jochen L. Leidner and Michael D. Lieberman. 2011. Detecting geographical references in the form of place names and associated spatial natural language. SIGSPATIAL Special, 3(2):5–11.
Michael D. Lieberman and Jimmy Lin. 2009. You are where you edit: Locating wikipedia contributors through edit histories. In ICWSM.
Marco Lui and Timothy Baldwin. 2012. langid.py: An off-the-shelf language identification tool. In Proc. of the ACL, pages 25–30, Jeju Island, Korea.
Alan M. MacEachren, Anuj Jaiswal, Anthony C. Robinson, Scott Pezanowski, Alexander Savelyev, Prasenjit Mitra, Xiao Zhang, and Justine Blanford. 2011. Senseplace2: Geotwitter analytics support for situational awareness. In IEEE Conference on Visual Analytics Science and Technology, pages 181–190, Rhode Island, USA.
Jalal Mahmud, Jeffrey Nichols, and Clemens Drews. 2012. Where is this tweet from? inferring home locations of twitter users. In Proc. of ICWSM, Dublin, Ireland.
Teng Qin, Rong Xiao, Lei Fang, Xing Xie, and Lei Zhang. 2003. An efficient location extraction algorithm by leveraging web contextual information. In Proc. of SIGSPATIAL, pages 55–62, San Jose, USA.
Gianluca Quercini, Hanan Samet, Jagan Sankaranarayanan, and Michael D. Lieberman. 2010. Determining the spatial reader scopes of news sources using local lexicons. In Proc. of the 18th International Conference on Advances in Geographic Information Systems, pages 43–52, San Jose, USA.
John Ross Quinlan. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, USA.
Stephen Roller, Michael Speriosu, Sarat Rallapalli, Benjamin Wing, and Jason Baldridge. 2012. Supervised text-based geolocation using language models on an adaptive grid. In Proc. of EMNLP, pages 1500–1510, Jeju Island, Korea.
Dominic Rout, Kalina Bontcheva, Daniel Preoţiuc-Pietro, and Trevor Cohn. 2013. Where’s @wally?: a classification approach to geolocating users based on their social ties. In Proc. of the 24th ACM Conference on Hypertext and Social Media, pages 11–20, Paris, France.
Adam Sadilek, Henry Kautz, and Jeffrey P. Bigham. 2012. Finding your friends and following them to where you are. In Proc. of WSDM, pages 723–732, Seattle, USA.
Takeshi Sakaki, Makoto Okazaki, and Yutaka Matsuo. 2010. Earthquake shakes twitter users: real-time event detection by social sensors. In Proc. of WWW, pages 851–860, Raleigh, USA.
Benjamin P. Wing and Jason Baldridge. 2011. Simple supervised document geolocation with geodesic grids. In Proc. of ACL, pages 955–964, Portland, USA.
David H. Wolpert. 1992. Stacked generalization. Neural Networks, 5(2):241–259.
Zhijun Yin, Liangliang Cao, Jiawei Han, Chengxiang Zhai, and Thomas Huang. 2011. Geographical topic discovery and comparison. In Proc. of WWW, pages 247–256, Hyderabad, India.
Wenbo Zong, Dan Wu, Aixin Sun, Ee-Peng Lim, and Dion Hoe-Lian Goh. 2005. On assigning place names to geography related web pages. In ACM/IEEE Joint Conference on Digital Libraries, pages 354–362.