Matt Gee

Research Fellow in Linguistics

School of English

Matt Gee develops research and teaching tools in the Research and Development Unit for English Studies (RDUES). This work includes the creation of a fully fledged search engine, WebCorp LSE, designed to treat the web as a source of linguistic data. WebCorp LSE can download texts from the web, clean them up and present them through a search interface for corpus-linguistic analysis (including wildcard and part-of-speech search, concordancing, collocation and change over time).

He has also developed eMargin, a web-based tool for online collaborative annotation that replicates the practice of close reading by allowing users to annotate and discuss digital texts. eMargin is used by educational institutions at all levels for literature studies, language learning, literary skills development, linguistic annotation and humanities research, among other purposes. It has also been used outside education, for example for the discussion of policy documents by political groups.

Matt continues to develop WebCorp Live. WebCorp Live uses APIs provided by commercial web search engines to retrieve basic search results and then further refines these results to present them in a manner suitable for linguistic study. Searches can be performed in multiple languages. WebCorp Live is used by teachers, researchers and translators around the world.

As a continuation of his research into tools and methods to support linguistic analysis, Matt has developed XTranscript, an online tool that converts transcripts encoded with Conversation Analysis notation into XML. The resulting XML can then be subjected to quantitative analysis using mature XML processing technologies, such as XPath and XQuery.
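
As an illustration of the kind of quantitative analysis that XML encoding makes possible, the short Python sketch below counts pauses and overlaps per speaker using path queries. It is not taken from XTranscript: the element and attribute names (transcript, turn, speaker, pause, length, overlap) and the sample data are assumptions made for the example, and the real output schema may differ. The standard-library module used here supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical transcript markup, invented for this example.
# XTranscript's actual output schema may use different names.
SAMPLE = """
<transcript>
  <turn speaker="A">so I went <pause length="0.5"/> to the shop</turn>
  <turn speaker="B"><overlap>yeah</overlap> right</turn>
  <turn speaker="A">and then <pause length="1.2"/> <overlap>nothing</overlap></turn>
</transcript>
"""


def count_per_speaker(root, tag):
    """Count elements named `tag` inside each speaker's turns."""
    counts = Counter()
    for turn in root.findall("./turn"):
        # ".//tag" selects matching descendants at any depth within the turn.
        counts[turn.get("speaker")] += len(turn.findall(f".//{tag}"))
    return counts


root = ET.fromstring(SAMPLE.strip())

pauses = count_per_speaker(root, "pause")
overlaps = count_per_speaker(root, "overlap")

# Pauses of one second or longer, filtered in Python because the
# standard-library XPath subset has no numeric comparisons.
long_pauses = [p for p in root.findall(".//pause") if float(p.get("length")) >= 1.0]

print("pauses per speaker:", dict(pauses))
print("overlaps per speaker:", dict(overlaps))
print("pauses of 1s or more:", len(long_pauses))
```

A full XPath or XQuery processor would allow the same counts to be expressed as single queries; the standard library is used here only to keep the sketch self-contained.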

In his recent work, Matt has used automated methods to study topic in web forums, investigate differences in language use between professional and consumer reviews of video games, and analyse interactions between users in newspaper comment threads. He is also working on a visualisation tool for the analysis of open-text survey questions, with a focus on the National Student Survey.