Creating and Processing a Corpus

Creating and Processing a Corpus. LINGUAL: Jurnal of Language and Culture, Vol 2 (4). pp. 19-33. ISSN 2085-7373



This paper describes why corpora and text processing matter. A corpus is a projection of how a language is used by its speakers. Technology has made corpora easier to maintain and more compact, and allows their data to be structured electronically. This electronic structure gives corpus users considerable freedom to access and exploit a corpus for language teaching, analysis, or other specific tasks. The paper demonstrates how to use open-access corpora on the internet, such as the Corpus of Contemporary American English (COCA) and the British National Corpus (BNC). Besides using a corpus, the paper also describes how to build one. The writer uses UNITEX, a text-based corpus processing software, to demonstrate the steps of corpus building, from text collection, annotation, and electronic dictionary application to natural-language operations such as pattern matching, concordance, and simple extraction. It also shows how graph technology may outperform regular expressions, a retrieval method used by other corpus processors, in terms of writing output.
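As a rough illustration of the concordance and pattern-matching operations named in the abstract, the sketch below builds a simple keyword-in-context listing with Python regular expressions. It is not UNITEX and does not reproduce the paper's graph-based method; it only shows the regular-expression style of retrieval that the abstract contrasts with graphs. The function name `concordance`, the `width` parameter, and the sample sentence are all assumptions made for this example.

```python
import re


def concordance(text, pattern, width=30):
    """Return keyword-in-context lines for every regex match in text.

    Hypothetical helper: `width` is the number of context characters
    kept on each side of a match.
    """
    lines = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        # Slice the context to the left and right of the match.
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so matches line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right:<{width}}")
    return lines


# Invented sample text, for illustration only.
sample = "A corpus shows how language is used. Building a corpus takes time."

for line in concordance(sample, r"\bcorpus\b", width=20):
    print(line)
```

Each printed line shows one match in brackets with its surrounding context, which is the basic layout a concordancer produces.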

Item Type: Article
Subjects: P Language and Literature > P Philology. Linguistics
Divisions: Faculty of Humanities > Department of English
ID Code: 39600
Deposited On: 23 Jul 2013 10:49
Last Modified: 06 Jul 2015 16:55
