Andrew has research interests in all aspects of Corpus Linguistics, including the development of software tools for the identification and visualisation of language change across time. He has a particular interest in the use of the web as a source of natural language data and has expertise in the areas of search engine design, topic detection and indexing, web document formats, and the extraction of authorship date from web documents.
Andrew has recently collaborated with the Academic Planning department at BCU on the analysis of feedback received through the National Student Survey (NSS). The NSS makes a significant contribution to university rankings in national league tables and many institutions are developing increasingly sophisticated methods for analysing its results. However, much of the emphasis has been on the multiple-choice questions, with relatively little attention paid to the free-text answers where students can give detailed comments on positive and negative aspects of their course. To offer an enhanced analysis, Andrew and colleague Matt Gee have developed a user-friendly ‘dashboard’ system called OurSurveySays which provides non-specialists with new linguistic insights into comments made in the NSS and other text-based questionnaires.
Grants Awarded
2020-21 AHRC - TRAC:COVID Trust and Communication: A Coronavirus Online Visual Dashboard (Principal Investigator)
2017-20 ERC Horizon 2020 - Real-time Early Detection and Alert (RED-Alert) System for Online Terrorist Content (Member of cross-disciplinary team working with EU partners)
2012-13 JISC Embedding Benefits Grant - Integration of eMargin with Virtual Learning Environments (Project Lead)
2011-12 JISC Learning and Teaching Innovation Grant - eMargin: an online collaborative textual annotation resource (Project Lead)
Recent Invited Talks
2019 Guest Speaker at University of Jyväskylä, Finland
2019 Guest Speaker at Vienna University of Economics and Business, Austria
2017 Public lecture at Umwelt-Campus Birkenfeld, Hochschule Trier, Germany
2016 'Pushing the Boundaries of Corpus Linguistics: New Approaches and New Audiences'. Keynote lecture at annual Birmingham English Language Postgraduate (BELP) conference, University of Birmingham, April 22.
2015 Reader comments on online news articles: a corpus-based analysis. CRAL Corpus Linguistics Workshop, University of Nottingham, February 20.
2014 "Your blog is (the) shit" - the role of context in the analysis of swearing in blogs (with Ursula Lutzky), English Department Research Seminar, University of Liverpool, December 10.
2014 Reader comments on online news articles: a corpus-based analysis. English Department Research Seminar, University of Liverpool, May 21.
2013 The role of context in the analysis of swearing in blogs (with Ursula Lutzky). Workshop on politeness and impoliteness in digital communication: Corpus-related explorations. ESRC Centre for Corpus Approaches to Social Science, Lancaster University, September 20.
2012 eMargin and Linguistic Analysis. UCREL Corpus Research Seminar, Lancaster University, December 6.
2012 eMargin and Text Annotation, AHRC Hidden Collections Doctoral Training Programme, University of Nottingham, November 23.
2012 eMargin in Literary Study, HEA Workshop, University of Leicester, July 5.
2012 Introduction to eMargin, Digital Conversations Workshop, British Library, March 30.
Past Projects
2009-11 Introducing A-Level English Language students to empirical text study using the WebCorp Linguist's Search Engine (AHRC Knowledge Transfer Fellowship) Research Associate / Co-author
2006-08 WebCorp Linguist's Search Engine (EPSRC / HEFCE-SRIF) Technical Lead
2006-07 Repulsion: The investigation of an organising force in text (EPSRC) Researcher Co-investigator / Software Developer
2001-04 SHARES: System of Hypermatrix Analysis, Retrieval, Evaluation and Summarisation (EPSRC) Research Associate / Software Developer
2000-01 WebCorp: The Web as Corpus (EPSRC) Research Assistant / Software Developer
1999-2000 APRIL: Analysis and prediction of innovation in the lexicon (EPSRC) Research Assistant / Software Developer
Andrew is interested in supervising doctoral research projects in corpus linguistics, particularly those in the following areas:
- linguistic analyses of web data, especially blogs and social media
- language change over time (diachronic corpus linguistics)
- the language and discourse of online news reporting
- corpus pragmatics / (im)politeness
Previous students
2022 Gabriela Csulich
(Im)politeness and Power in the Early Modern English Courtroom (1560 to 1639)
2020 Selina Schmidt
Rapport management in online spoken interaction: A cross-cultural linguistic analysis of communicative strategies
AHRC-funded through the Midlands4Cities Doctoral Training Partnership. Passed with minor corrections.
2017 Zhixia Yang
A corpus-based study of rhetorical questions in monologic genres in the framework of Relevance Theory.
Passed with minor corrections.
Doctoral Examining
2018 Nils Smeuninx (Ghent University, Belgium)
“Dear Stakeholder” Exploring the language of sustainability reporting: a closer look at readability, sentiment and perception.
2018 Tamara Peeters (Birmingham City University)
Power and Language in the Wars of the Roses.
Books
2009 with Renouf, A. (eds.) Corpus Linguistics: Refinements and Reassessments, Amsterdam: Rodopi.
2006 with Renouf, A. (eds.) The Changing Face of Corpus Linguistics, Amsterdam: Rodopi.
Chapters
Forthcoming with M. Gee & A. Renouf. 'A data-driven approach to finding significant changes in language use through time series analysis'. In S. Flach & M. Hilpert (eds.) Broadening the spectrum of corpus linguistics: New approaches to variability and change. Amsterdam: John Benjamins.
2022 with Lutzky, U. 'Using Corpus Linguistics to Study Online Data'. In C. Vásquez (ed.) Research Methods for Digital Discourse Analysis. London: Bloomsbury.
2021 ‘Web Corpora’. In S. Gries & M. Paquot (eds.) A Practical Handbook of Corpus Linguistics. Berlin: Springer.
2019 with Gee, M. ‘“Thanks for the donds”: A corpus linguistic analysis of topic-based communities in the comment section of The Guardian’. In U. Lutzky & M. Nevala (eds.) Reference and Identity in Public Discourses. Amsterdam: John Benjamins, 127-158.
2019 with Lutzky, U. ‘“Friends don’t let friends go Brexiting without a mandate”: Changing discourses of Brexit in The Guardian’. In V. Koller, S. Kopf & M. Miglbauer (eds.) Discourses of Brexit. London: Routledge, 104-120.
2009 with Gee, M. Weaving Web data into a diachronic corpus patchwork in A. Renouf and A. Kehoe (eds.) Corpus Linguistics: Refinements and Reassessments, Amsterdam: Rodopi.
2006 Diachronic Linguistic Analysis on the Web with WebCorp in A. Renouf and A. Kehoe (eds.) The Changing Face of Corpus Linguistics, Amsterdam: Rodopi.
2006 with Renouf, A. and J. Banerjee WebCorp: an integrated system for web text search, in M. Hundt, N. Nesselhauf and C. Biewer (eds.), Corpus Linguistics and the Web, Amsterdam: Rodopi.
2004 with Renouf, A. and D. Mezquiriz "The Accidental Corpus: Some Issues in Extracting Linguistic Information from the Web", in K. Aijmer and B. Altenberg (eds.) Advances in Corpus Linguistics, Amsterdam: Rodopi.
Journal Articles
2017 with Lutzky, U. "I apologise for my poor blogging": Searching for Apologies in the Birmingham Blog Corpus. Corpus Pragmatics. pp. 1-20. ISSN 2509-9507
2016 with Lutzky, U. “Oops, I didn't mean to be so flippant” A corpus pragmatic analysis of apologies in blog data, Elsevier, Special issue of the Journal of Pragmatics on Adaptability in New Media, forthcoming.
2016 with Lutzky, U. “Your blog is (the) shit”: a corpus linguistic approach to the identification of swearing in computer mediated communication. International Journal of Corpus Linguistics 21:2, 165-191.
2013 with Renouf, A. Filling the gaps: Using the WebCorp Linguist's Search Engine to supplement existing text resources. International Journal of Corpus Linguistics 18:2, 167-198.
2013 with Gee, M. eMargin: A Collaborative Textual Annotation Tool. Ariadne, Issue 71.
2012 with Gee, M. Reader comments as an aboutness indicator in online texts: introducing the Birmingham Blog Corpus in S. Oksefjell Ebeling, J. Ebeling and H. Hasselgård (eds.) Studies in Variation, Contacts and Change in English Volume 12: Aspects of Corpus Linguistics: Compilation, Annotation, Analysis, University of Helsinki e-journal.
2011 with Gee, M. Social Tagging: A new perspective on textual 'aboutness' in P. Rayson, S. Hoffmann and G. Leech (eds.) Studies in Variation, Contacts and Change in English Volume 6: Methodological and Historical Dimensions of Corpus Linguistics, University of Helsinki e-journal.
2007 with Gee, M. New corpora from the web: making web text more 'text-like' in P. Pahta, I. Taavitsainen, T. Nevalainen and J. Tyrkkö (eds.) Towards Multimedia in Corpus Studies, electronic publication, University of Helsinki.
Reviews
2010 Review article on 'ConcGram 1.0' software, in ICAME Journal: Computers in English Linguistics, No. 34, April 2010.
Conference Proceedings
2005 with Renouf, A. and J. Banerjee The WebCorp Search Engine: a holistic approach to Web text Search in Proceedings from the Corpus Linguistics Conference Series, Vol. 1, no.1, University of Birmingham.
2004 with Renouf, A. 'Textual Distraction as a Basis for Evaluating Automatic Summarisers', in M.T. Lino et al. (eds.) Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004), Paris: ELRA, Vol. IV, pp. 1347-1350.
2003 with Morley B. and A. Renouf Linguistic Research with the XML / RDF aware WebCorp Tool. World Wide Web 2003 Conference, Budapest.
2002 with Renouf, A. WebCorp: Applying the Web to Linguistics and Linguistics to the Web. World Wide Web 2002 Conference, Honolulu, Hawaii.
Other
2010 The Birmingham Blog Corpus (with Matt Gee and Ursula Lutzky)
2000-ongoing WebCorp software and user guide.
2000 APRIL (Analysis and Prediction of Innovation in the Lexicon) project software, databases and web front-end.
1999 Discourse Tree Manipulation Algorithms: Using Rhetorical Structure Theory to Restructure and Summarise Texts, MSc Dissertation, University of Liverpool (with accompanying C++ software).
Andrew worked as linguistic consultant to the Grey London communications agency on behalf of the fashion brand Puma and their fragrance partner Procter & Gamble. This work resulted in the creation of the critically acclaimed Puma Dance Dictionary website and accompanying Europe-wide TV advertising campaign to launch the Puma Sync fragrance range.