CQP Web for Language Corpora
Developed by the Language and Multimodal Analysis Lab (LAMAL),
Department of English, The Hong Kong Polytechnic University
British Academic Spoken English Corpus (BASE)
The British Academic Spoken English Corpus (BASE), developed at the Universities of Warwick and Reading, is a collection of transcripts of lectures and seminars recorded at these two universities in the UK during the period 1998-2005, with about 1.3 million words in total.

British Academic Written English Corpus (BAWE)
The British Academic Written English Corpus (BAWE), as a sister corpus of BASE, contains just under 3,000 good-standard student assignments evenly distributed across four broad disciplinary areas (Arts and Humanities, Social Sciences, Life Sciences and Physical Sciences) with about 7.5 million words in total.

Emergency Department Communication Corpus (EDCC)
This corpus contains 1.5 million words of spoken texts collected from Emergency Departments of five Australian hospitals. It consists of interviews with clinicians (doctors, nurses and medical technicians) as well as interactions with patients throughout their ED journey.

Corpus of Ecolinguistics
This corpus contains 1 million words of English texts, comprising 5 books and 25 research articles, all on ecolinguistics, a research paradigm that has emerged since Michael Halliday’s seminal paper "New Ways of Meaning: The Challenge to Applied Linguistics" (1990).

Manchester United Football English News Corpus (MUFE)
The Manchester United Football News Corpus (MUFE Corpus, or MUFE) covers football news reports about or related to Manchester United from 2009 to 2012. All texts in MUFE are used for academic purposes only.

Fashion Communication Corpus (FCC)
The FCC comprises 1 million words of English texts, grouped into five categories: blogs (208,124 words), comments (205,248 words), research articles (205,248 words), news and trend reports (212,188 words), and styling tips and product launches (204,71 words).

PolyU Corpus of Travel and Tourism Texts (TnT)
The PolyU TnT Corpus contains a collection of 1 million words of English texts on travel and tourism from the World Wide Web. The texts are more or less equally divided into five different registers: A. Legal documents (246,342 words), B. Academic papers (207,941 words), C. Promotional literature (209,261 words), D. Travelogues (218,993 words), and E. On-line discussions (201,392 words), with a total of 1,083,929 words.

PolyU Learner English Corpus (PLEC)
The PLEC contains 1 million words of English compositions by students on undergraduate and sub-degree programmes in various departments at The Hong Kong Polytechnic University. The texts are mostly short expository essays written under exam conditions.

System messages
The corpus data on this website were tagged with the Penn Treebank Project part-of-speech tagset rather than any of the CLAWS tagsets, so please follow the alphabetical list of part-of-speech tags used in the Penn Treebank Project when you run searches. Should you need a copy of the quick reference guide for that purpose, please email Dr. Xu at egxu@polyu.edu.hk.
Hello! A warm welcome to the CQP web for Language Corpora developed by LAMAL!
For access to the corpus data here, please contact Dr. Xu Xunfeng at egxu@polyu.edu.hk

CQPweb v3.0.15 © 2008-2013