A Corpus Based Study In Cognitive Linguistics English Language Essay

Published: 2021-07-03 02:35:05

Category: English Language

Type of paper: Essay

Natalia Lushchan
Customer is one of the central notions in marketing. Although the terms customer and client are distinct, they are often confused, not only by foreign language learners but also by native speakers. Moreover, consumer, a third term closely related to customer and client, differs significantly from the first two and depends on the context, and may thus cause misunderstandings. The aim of this study is to find out how these near-synonyms are used in different contexts by speakers of the British and American variants of English, and how the conceptual structures are reflected in the language and in the mind. In order to analyse differences between speakers and in language use itself, authentic texts produced by native speakers were used as the source of data. They were extracted from the written media sections (newspapers and magazines) of two corpora: the Corpus of Contemporary American English (COCA) and the British National Corpus (BNC). Drawing the evidence from corpus data made it possible to create a usage-based description of the patterns of usage and to examine differences in the way the British and Americans conceptualize CUSTOMER. To do this, 50 instances of each lexeme were annotated manually for functional features (e.g. grammatical categories) and semantic features (e.g. axiology). The results were subsequently tested statistically with the help of R and its packages; more specifically, the exploratory techniques of Multiple and Binary Correspondence Analysis, as well as Cluster Analysis, were employed in order to identify patterns of usage and to examine which kinds of semantic relations the three near-synonymous words customer, client and consumer share. Using a quantitative corpus-based method, this study of the conceptualization of CUSTOMER has shown the ways in which the given lexemes have been operationalized.
The findings indicate that speakers’ attitudes differ to a certain extent in British and American English, resulting in differences in the use of the three near-synonyms: customer and client seem to be more similar in meaning and are conceptualized as ‘neutral’ or ‘positive’, whereas consumer is quite distinct from both of them and tends to evoke more ‘negative’ associations.
Key words: corpus cognitive linguistics; corpus linguistics; Correspondence Analysis; Cluster analysis; corpus-based method; near-synonyms; semantic relations; customer
1. Introduction
It is often quite difficult for native speakers and foreign language learners to choose the right word from a set of near-synonymous words in English and thus to convey the connotations and implications they really intend. Such words with the same meaning but different lexical features in different contexts will be regarded as near-synonyms. This study will compare the usage of the three near-synonyms customer, client and consumer, in order to reveal the differences in their patterns of usage from the perspective of cognitive linguistics. Taking into account the multi-disciplinary nature of cognitive linguistics and the importance of studying linguistic phenomena as a result of ‘language use’ (Croft & Cruse 2004: 1), the analysis of the given lexemes will primarily draw on evidence from corpus data (the BNC and COCA). Corpus linguistic methods may be effectively applied in cognitive linguistics for the investigation of near-synonymous words, since they provide the contexts in which words with similar meanings occur. This, in turn, enables linguists to trace how the words are used in various situations and whether or not they may be substituted for one another in certain contexts. In other words, as a result of merging corpus-based methodologies with cognitive linguistics, and therefore treating the phenomenon under study within ‘cognitive corpus linguistics’, this research will be grounded "on authentic language use", which will help make "its results [...] replicable, and its claims falsifiable" (Arppe et al. 2010: 2). Moreover, performing statistical analysis on corpus data will make it possible to answer broader questions raised in cognitive linguistics.
Furthermore, this study follows the theoretical framework of ‘cognitive grammar’ and, in particular, treats the notion of ‘conceptualization’ as one of its most significant and basic elements, whereby "semantic structure is conceptualization tailored to the specifications of linguistic convention" (Langacker 1987: 99). Accordingly, all grammatical constructs will be regarded as meaningful, since "meaning is equated with conceptualization […] in terms of cognitive processing" (Langacker 1988: 6).
The most prominent cognitive linguists, such as Ronald Langacker (1987, 1990), George Lakoff (1987) and Leonard Talmy (2000), have highlighted that meaning is of paramount importance for this research field. In order to express this meaning, linguistic structures are used, which are closely related to the semantic structures (Kemmer 2010: 2). Therefore, the "mappings" between the meanings of words and their formal grammatical form will be a focus of the present investigation. Such an emphasis on the meaning of each lexeme, as manifested by its meaningful grammatical structures, will also help discriminate between the senses of the three near-synonyms. In addition, it should be noted that studies of near-synonyms are not new: some researchers have already examined them, especially among verbs (cf. Gries 2006, Divjak & Gries 2008; Frawczak & Kokorniak in press, Fabiszak et al. 2012, etc.) and adjectives (cf. Taylor 2003). Nouns have also been studied; however, these were mostly "purely" corpus studies concerned with a class of abstract nouns, namely ‘shell nouns’ and their linguistic environments (cf. Schmid 2000), general nouns (cf. Mahlberg 2005), collective nouns (cf. Papadopoulou 2008), noun complementation in English (cf. Bowen 2005) or modification of nouns in English and Spanish (cf. Ramón García 2003), etc. There is therefore a certain novelty value in the approach of this study, since it analyses three lexemes which have not yet been scrutinized. It is also worth noting that the given lexemes have not been investigated within any other theory or methodology, which makes this small study quite unique.
Three things are crucial for this study in order to understand the subtle differences in the use of customer, client and consumer: the implementation of the methodologies of cognitive corpus linguistics; the critical evaluation of cognitive grammarians’ claims concerning the grammatical constructions and lexical meanings of words, which contribute to meaning making; and the stress on the "essential nature" of cognitive linguistics as a usage-based linguistics (Geeraerts 2006: 29), which tends "in the direction of corpus-based research" (Newman 2011: 522). This theoretical background was not only essential as a basis for the paper, but also became a source of inspiration and motivation for the research questions. This paper aims to answer the following questions:
Are there any differences in the use of customer, client and consumer between British and American English?
How do these terms differ in the two genres (Magazines and Newspapers)?
Do the analyzed terms show any axiological differences (positive/negative/neutral)?
For the closely related meanings of near-synonyms, Cruse (1986: 271-277) differentiates between the interface between speakers’ intentions and language (propositional mode (e.g. question, simple statement) and expressive mode (e.g. emotion, attitude, register effect)) and the interactions among linguistic items (presupposed meanings (e.g. selectional and collocational restrictions) and evoked meanings (e.g. discourse cohesion and communicative roles)). Following this approach, this paper will concentrate on the level of expressive mode and evoked meaning. The following hypotheses were formulated for this study:
H1: When the speakers of the British variant of English talk about CUSTOMER they conceptualize it in a different way than the speakers of American English do.
H2: Speakers of English use existing meanings of CUSTOMER differently in various registers, according to its role in discourse.
H3: People tend to show their emotions about the notion CUSTOMER, and depending on how they feel about it (or are customers themselves), they express their positive or negative attitude towards CUSTOMER.
H4: English speakers communicate additional information about CUSTOMER more creatively and independently as a result of their emotional or social status.
The purpose of this study is to investigate how the three near-synonyms differ across genre and dialect, to examine their potential axiological aspects, and to reveal formal and semantic correlations between them. Using a corpus-based method, namely a profile-based feature analysis based on the Behavioural Profile (BP) approach (cf. Divjak 2003, Gries 2006, etc.), it will be possible to investigate the patterns of usage and "yield cognitive-linguistically relevant results" (Divjak & Gries 2009: 276). This will make it possible to analyze the meaning of these nouns in English, which may not only help native speakers and language learners distinguish between these basic marketing terms and avoid misunderstandings, but may also help marketing professionals target their advertising to the proper audience and communicate the right message to the right person (e.g. by answering the question: "Is my consumer also my client or customer?").
The first section of the paper is an introduction, which discusses the theoretical framework of the research, gives a short overview of previous studies and presents the research questions. It is followed by Section 2, which is concerned with the methods employed in this study. This section starts with the operationalization of variables (2.1) and their translation into the coding schema (2.2). Subsection 2.3 explains the reasons for the choice of method, and Subsection 2.4 describes the source of data. Subsection 2.5 is devoted to the data selection procedure and gives a detailed overview of the break-up into genres in both corpora. Subsections 2.6 and 2.7 describe the software and the statistical tests used in the study. An additional survey conducted in this research is briefly discussed in Subsection 2.8. All findings are presented in Section 3. The next section is a discussion, which describes the implications of the findings for the hypotheses (4.1) and for the research area (4.3), as well as the limitations of the study (4.2). The final section draws a short conclusion.
2. Methods
All existing approaches in cognitive linguistics, e.g. a multifactorial approach (cf. Divjak 2006) or a collocation-based approach (cf. Gries & Stefanowitsch 2004), support "a usage-based model of language and assume that patterns of usage found in corpora are indices of the grammar in the minds of speakers in a language community" (Gries 2010c: 240). The present research follows this point of view and is based on the Behavioural Profile (BP) approach, which "combines the best of both the cognitive and corpus linguistic traditions, i.e. a precise, quantitative corpus-based approach that yields cognitive-linguistically relevant results" (Divjak & Gries 2009: 276). This approach is based on the "parallelism between the distributional and functional planes" (Divjak & Gries 2009: 277) and was adopted in order to investigate formal and semantic correspondences between the three lexemes customer, client and consumer. The BP is considered to be "a powerful method that provides as objective basis for the semantic analysis of both polysemous and synonymous items as possible" (ibid. 277), and was thus employed in this study.
This Section shows how the data were extracted from the corpora, how they were then annotated for certain semantic and formal factors, and which software and statistical techniques were subsequently used to analyse them. In the next Subsection an attempt will be made to operationalize the variables.
2.1. Operationalization of variables
Operationalization of the hypotheses is necessary for every empirical study. It is therefore important not only to invent a hypothesis, but also to "formulate it in such a way that it can be put to the test" (Geeraerts 2010: 73). This means that the variables implied in the hypotheses (See Introduction) should be "translatable" into the coding schema (See Subsection 2.2), so that the transformation of the hypotheses into "concrete predictions that can be tested against the data" is possible (ibid. 73).
The following variables were mentioned in the hypotheses:
"conceptualize it in a different way" (H1), "differently in various registers" (H2), "emotions", "positive or negative attitude" (H3), "additional information", "creatively and independently", "emotional or social status" (H4). They were consequently operationalized as follows (1):
(a) "conceptualize it in a different way" >> there will be differences between the British and American variants of English
(b) "differently in various registers" >> there will be differences in genre distribution (between magazines and newspapers)
(c) "emotions", "positive or negative attitude" >> the three lexemes will be used in negative, positive or neutral contexts
(d) "additional information", "creatively and independently", and "emotional or social status" >> customer, client and consumer will be used with various parts of speech (e.g. adjectives), determiners, premodifiers, quantifiers, etc., and will play a role in the sentences themselves (acting as object, subject or modifier)
It should be mentioned that the first variable (1a) elicited from the hypotheses concentrates on the differences in the use of the three lexemes in American and British English. This means that it will reveal more habitual patterns of language usage. Moreover, like many corpus-based studies, this research also includes an analysis of linguistic differences between registers (1b) (i.e. magazines and newspapers), with the focus on "the pervasive characteristics of representative text excerpts from the variety", which, in turn, connects linguistic features "functionally to the situational context of the variety" (Biber 2010: 241). In this study I hope to find out which linguistic features are more common in Magazines and whether they differ from those usually used in Newspapers. In contrast to (1a) and (1b), variables (1c) and (1d) are expected to show more of the interaction between language and mind, and will help make at least several assumptions about the way language can be operationalized. They are therefore more closely associated with the theoretical postulates of corpus cognitive linguistics than with the corpus-based description of dialect and register.
These operationalized variables were translated into the coding schema and are listed below (See Subsection 2.2 for a more detailed description of the coding schema):
Variable: Dialect
Variants: BrE, AmE
Variable: Genre/register
Variants: Mag, Newsp
Variable: Axiology
Variants: neg, pos, neu
Variable: Number
Variants: sg, pl
Variable: Parts of sentences
Variants: s, o, c, mod
Variable: Quantifier
Variants: Num, zeroN
Variable: Determiner
Variants: a, the, zero, pPr, other
Variable: Premodification
Variants: zeroP, adj, ed, ing, noun, Pphrase
2.2. Annotation (coding schema)
The following coding schema was elaborated and applied in the process of coding corpus data in order to examine different patterns of linguistic forms and meanings used to express CUSTOMER:
(2) Variable: Dialect
Variants: BrE, AmE
Rationale: As one of the main focuses of this research was to reveal the meaning of CUSTOMER in different dialects of English, a distinction between the British (coded as BrE) and American (coded as AmE) varieties of English was made.
(3) Variable: Genre/register
Variants: Mag, Newsp
Rationale: I want to see whether there are differences in the usage of the given lexemes in various styles/registers. Due to study limitations, only the genres of MAGAZINE (Mag) and NEWSPAPER (Newsp) were taken into consideration. Moreover, the choice of these two genres was motivated by the fact that, according to both corpora, all three lexemes are used most frequently in Magazines and Newspapers.
(4) Variable: Axiology
Variants: neg, pos, neu
Rationale: This variable is aimed at tracing how people use powerful language in both genres. It should establish whether the lexemes occur in more positive (pos), negative (neg) or neutral (neu) contexts. It should be noted that the coding of this variable was based on my intuition and on the context in which the given lexeme appeared. Examples of Axiology are provided below:
client (pos):
…' he replies hesitantly. Teddy gets his promotion after punching the agency's biggest client, a sociopathic film-star, thus proving his honesty in a town of yes-men… (BNC: A25-42)
consumer (neg):
…Appeals to the company go nowhere, and you're left feeling powerless. Some disgruntled consumers, however, have taken their case to small claims court and gotten,… (COCA: MAG-26)
customer (neu):
…Four in the British charts. # And... # By Tim Satchell # A customer reports that yesterday he phoned Harrods to ask for the barber shop.' Certainly…(BNC: CBC-67).
(5) Variable: Number
Variants: sg, pl
Rationale: This grammatical feature should reveal the contrast between the singular and plural forms of the lexemes. My intuition is that it will also show semantic differences between the lexemes.
(6) Variable: Parts of sentences
Variants: s, o, c, mod
Rationale: This variable will help explain the usage of the lexemes and how people conceptualize the notion CUSTOMER while constructing a sentence: as a subject (s), as an object (o), which includes both direct and indirect objects, as a complement (c) or as a modifier (mod) (e.g. customer service...).
(7) Variable: Quantifier
Variants: Num, zeroN
Rationale: This "numeric" category refers to determiners which indicate quantity. However, common quantifiers (e.g. all, many) were not included here; they belong to the broader category of determiners and are marked other (See (8) below). This means that only quantifiers which contained numbers (e.g. fifty consumers) represented this variable in the coding. The reason for this separation lies in the hypothesis that numbers may signify the social position of a person. However, in the course of the analysis it became clear that occurrences of the lexemes with numbers are quite rare; cases without a number were therefore marked as zeroN.
(8) Variable: Determiner
Variants: a, the, zero, pPr, other
Rationale: The Determiner variable constitutes a functional category. Determiners such as the indefinite (a) and definite (the) articles, which usually precede nouns (the head of the noun phrase), were coded and then analysed in combination with other grammatical features. The absence of a determiner was marked as (zero). Moreover, since a determiner may be represented by a possessive pronoun or pronominal adjective (coded as pPr), these were also included in the coding schema. In addition, other types of determiners were gathered into one category, since they are not frequent, and were marked as other. These included the following determiners:
e.g.: all, each, every, either, neither, some, any, no, much, many; more, most, little, less, least, few, fewer, fewest, what, that, those, another; whatever; which, whichever, both, several, etc.
(9) Variable: Premodification
Variants: zeroP, adj, ed, ing, noun, Pphrase
Rationale: From a grammatical point of view, adjectives directly follow the determiner in the noun phrase. "Pure" (attributive) adjectives were marked as adj, and only the first adjective after the determiner was taken into consideration. Of the remaining variants of this variable, -ed and -ing participial modifiers were marked as ed and ing, respectively, and noun premodifiers were coded as noun. The lack of Premodification was labelled zeroP, and if nouns were part of a prepositional phrase (e.g. most of its customers) they were marked as Pphrase.
This coding schema made it possible to analyze different formal and semantic factors in the patterns of usage of the given lexemes. All the variables suggested for the coding were based on general knowledge and personal intuition, as well as on "the standards of good practice for the annotation of corpora" (Leech 1997: 6-8), according to which the raw corpus should be "recoverable", the "extricable" annotation should be accompanied by a rationale and comments on its quality, and the annotation is provided as a "matter of practical usefulness only". The cases of non-informative variables and their variants can be clearly seen in the plots produced with the help of R, and will be discussed in Sections 3 and 4. The following Subsection is devoted to the corpus methodology, namely its advantages for the present study.
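The coding schema in (2)-(9) can also be written down as a small lookup structure, which makes it easy to check each manually annotated hit for typing errors. The following is only a minimal sketch in Python and not the procedure used in this study (the analysis itself was carried out in R); the key names, the function validate_annotation and the example row are hypothetical.

```python
# Coding schema from Subsection 2.2: each variable maps to its permitted variants.
# The dictionary keys are illustrative names, not labels taken from the study's data file.
CODING_SCHEMA = {
    "Dialect": {"BrE", "AmE"},
    "Genre": {"Mag", "Newsp"},
    "Axiology": {"neg", "pos", "neu"},
    "Number": {"sg", "pl"},
    "PartOfSentence": {"s", "o", "c", "mod"},
    "Quantifier": {"Num", "zeroN"},
    "Determiner": {"a", "the", "zero", "pPr", "other"},
    "Premodification": {"zeroP", "adj", "ed", "ing", "noun", "Pphrase"},
}


def validate_annotation(row):
    """Return a list of problems found in one annotated corpus hit."""
    problems = []
    for variable, variants in CODING_SCHEMA.items():
        if variable not in row:
            problems.append(f"missing variable: {variable}")
        elif row[variable] not in variants:
            problems.append(f"invalid variant for {variable}: {row[variable]}")
    return problems


# Hypothetical annotation of a single hit, for illustration only.
example = {
    "Lexeme": "consumer", "Dialect": "AmE", "Genre": "Mag",
    "Axiology": "neg", "Number": "pl", "PartOfSentence": "s",
    "Quantifier": "zeroN", "Determiner": "other", "Premodification": "adj",
}
print(validate_annotation(example))  # prints []
```

A check of this kind only catches labels that fall outside the schema; it cannot replace the intuition-based judgement on which variables such as Axiology were coded.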
2.3. Choice of method
As already mentioned, this study is based on a quantitative corpus linguistic methodology. It combines the most effective approaches and features of the cognitive and corpus linguistic theoretical frameworks. This means that particular attention is paid to the advantages of corpus linguistics, in order to integrate the corpus-based method and elicit corpus data in the most appropriate way to substantiate the claims of cognitive linguistics and cognitive grammar, as well as to base the resulting assumptions on the most reliable evidence. For the sake of brevity, a detailed description of corpus linguistics remains beyond the scope of this paper. Only the benefits of corpus-based research that are relevant for this study will be briefly discussed in the following paragraphs.
First of all, taking into account all existing debates on the definition of a linguistic corpus, the term corpus will be limited in this paper to the definition given by Meyer (2002: xi-xii), according to which a corpus is "a collection of texts or parts of texts upon which some general linguistic analysis can be conducted", and which is "made available in computer-readable form". Given that corpus linguistics is aimed at studying the complexity of structure and focuses on descriptive rather than explanatory adequacy (as in generative grammar) (Meyer 2002: 3), this study will focus on revealing structural (formal) and semantic features of the three lexemes. Arppe et al. (2010: 21) state that corpus linguistics should be "fully accepted as a fundamental method in cognitive linguistics" if the advantages of corpus methods are highlighted. Among those benefits, one may point out the testability of the results which corpora may produce, "the ability to handle complex phenomena, and the ability to connect with other areas of research to produce converging evidence" (Arppe et al. 2010: 22). Moreover, since corpora are made up of texts or parts of them, the use of the BNC and COCA will make it possible to "contextualize their analysis of language"; corpora are also "very well suited to more functionally based discussions of language" (Meyer 2002: 6). "A wealth of usage-based data on naturally occurring language patterns" (Mittelberg et al. 2006: 36) will make it possible to analyze the functions and semantic aspects of the given lexemes in a "more ‘objective’" way (Arppe et al. 2010: 21).
While incorporating the benefits of corpus-based methodologies into this study, it was also necessary to use the most appropriate forms of corpora. Hence, tagged corpora of American and British English were used in order to be able to extract nouns and to differentiate between parts of speech. Moreover, due to study limitations, the emphasis was put on synchronic corpus data, though both corpora (the BNC and COCA) also allow diachronic investigations. These corpora were chosen because they can be easily analyzed using a free online interface, because they are appropriately tagged and well-structured, and finally because, regarding their size, they are considered to be the "most representative official corpus sample of English in current existence" (Mittelberg et al. 2006: 41). More information about the BNC and COCA will be given in the following Subsection.
2.4. Source of data
Corpora are an excellent source for verifying the completeness, simplicity, strength, and objectivity of any linguistic hypothesis (Leech 1992: 113). The source of the lexemes (nouns) analyzed in this study is the British National Corpus (BNC) and the Corpus of Contemporary American English (COCA) (Davies 2008-). Both are regarded as among the most representative and reliable corpora, both qualitatively and quantitatively. Since ‘representativeness’ (cf. McEnery & Wilson 1996) is one of the important points to be considered when choosing a corpus for research, the BNC and COCA seemed to be a suitable data source for the present study. Moreover, as pointed out by Newman (2011: 521), "corpora are a natural source of data for cognitive linguists, since corpora, more than any other source of data, reflect ‘usage’", and the BNC and COCA were the best ones for investigating the patterns of usage of the given lexemes. Each corpus will now be briefly described.
The BNC is "a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of current British English, both spoken and written" (http://www.natcorp.ox.ac.uk/, 29 January 2013). Importantly, the BNC is "heavily tagged for part-of-speech, making it a useful tool for determining how individual words may be used in text and conversation" (Mittelberg et al. 2006: 41).
Davies (2010: 447) describes COCA as follows:
The Corpus of Contemporary American English is the first large, genre-balanced corpus of any language, which has been designed and constructed from the ground up as a ‘monitor corpus’, and which can be used to accurately track and study recent changes in the language. The 400 million words corpus is evenly divided between spoken, fiction, popular magazines, newspapers, and academic journals. Most importantly, the genre balance stays almost exactly the same from year to year, which allows it to accurately model changes in the ‘real world’.
The BNC and COCA were employed in this study due to their size and representativeness, as well as their functions (e.g. additional tools, tagged form). The following Subsections are concerned with a detailed description of the work with these corpora as a source of data.
2.5. Data selection procedure
This Section deals with the procedure for calculating the proportional genre representation of the given lexemes in both corpora, the BNC and COCA. The first Subsection (2.5.1) introduces the concordancer used in this study and explains how false hits were avoided. The following Subsection (2.5.2) then demonstrates the break-up into genres in the corpora used, and shows how and from which sources the sentence examples were extracted.
2.5.1. Concordancer
The concordancer "has become an established fixture for the analysis of corpora" (Meyer 2002: 140). The concordance tools of the BNC and COCA are based on the same principles (same architecture and interface) and constitute a kind of index of the words (word forms/lexemes) with the corresponding reference to the source of the text. It is now possible to re-sort concordances in the main corpora created by Mark Davies (Davies 2008-). This "re-sorting" function was not used in this study, because only one or two sources were used to represent the given lexeme; concordances were instead used to show the most common words (strings) to the left and right of the searched lexeme. This made it possible to see how the lexeme is used and which words surround it on the left and on the right. Moreover, the architecture of both corpora made it possible to focus on two particular genres (magazines or newspapers), and thus to produce the ‘concordance lines’ for each lexeme in MAGAZINE and in NEWSPAPER (http://corpus.byu.edu/concordances.asp, 28 January 2013).
Since the lexemes under study have only two word forms, it was also possible to perform two searches to find the singular and plural form of each lexeme (simple search by word). However, to make the elicitation reliable and avoid false hits, the concordancer and other corpus tools were used. One of the alternative ways to search for individual word forms is the wild-card facility "(regular expression in UNIX environments)" (Tribble 2010: 173). It was mostly used to see the overall frequencies of both word forms, which were needed for the calculations (e.g. the query client*). See Figure 1 below.
Figure 1. COCA: Using wild-card function for client* (in MAGAZINES)
Such corpus tools made it possible not only to retrieve the data needed for this study, but also to get rid of false hits. For example, as can be seen in Figure 1 above, the first two results show how often client occurs in the plural (first result) and in the singular (second result) in MAGAZINE in COCA. For all of the given lexemes it was easy to avoid "mismatches", since the "main" nouns usually appear in the first two lines (results) of the list, while other words/derivatives are listed below them and were disregarded (e.g. 3.clientele, 4.client-server).
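The filtering step described above can be illustrated with a short sketch. It is written in Python purely for illustration and does not reproduce the corpus interface: it translates a wild-card query such as client* into a regular expression and separates the two word forms of the lexeme from the other matches. The function names are hypothetical, and the list of hits contains only the four forms mentioned for Figure 1.

```python
import re


def wildcard_to_regex(query):
    """Translate a corpus-style wild-card query (e.g. 'client*') into a regex."""
    return re.compile("^" + re.escape(query).replace(r"\*", ".*") + "$")


def split_hits(lexeme, hits):
    """Separate the two word forms of the lexeme from other wild-card matches."""
    wanted = {lexeme, lexeme + "s"}  # singular and plural form
    pattern = wildcard_to_regex(lexeme + "*")
    matched = [h for h in hits if pattern.match(h)]
    kept = [h for h in matched if h in wanted]
    discarded = [h for h in matched if h not in wanted]
    return kept, discarded


# The four forms mentioned in connection with Figure 1.
hits = ["clients", "client", "clientele", "client-server"]
kept, discarded = split_hits("client", hits)
print(kept)       # ['clients', 'client']
print(discarded)  # ['clientele', 'client-server']
```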
2.5.2. The calculation of the proportional genre representation
Each of the lexemes customer, client and consumer has only two word forms, the singular and the plural (customers, clients and consumers, respectively). Therefore, only these singular and plural forms were taken into account while working with the BNC and COCA. Since one of the aims of the study is to investigate the dialect variation of the given lexemes and to compare their usage in the British and American varieties of English, a break-up into two genres, Magazine and Newspaper, was performed for each corpus.
The following subsections describe the exact steps of the break-up into genres of the given lexemes, based on the data (frequencies), which were elicited from the BNC and COCA. They provide the calculation procedure of the proportional representation of two genres – MAGAZINE and NEWSPAPER. Moreover, the subsections show from which sources the sentential examples were extracted.
Breaking up into genres in the BNC
In this subsection the three lexemes customer, client and consumer will be examined according to their genre representation in the BNC. The purpose of the break-up into Magazine and Newspaper genres is to show how the word forms of the lexemes are represented proportionally in the genres and to be able to extract an appropriate number of the sentence examples from the BNC for the further analysis. The calculations will be presented as tables to provide a clear and readable visualization of the findings. The calculation procedure of each lexeme will be additionally numbered (e.g. (1) "lexeme" in genres) to make it easy for the reader to follow. First, genre representations of customer (1) will be calculated, and subsequently, those of client (2) and of consumer (3).
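The idea of a proportional genre representation can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the calculation actually carried out in this study: the frequencies are invented, the sample size of 50 is taken from the number of instances annotated per lexeme, and the largest-remainder rounding is an assumption introduced here so that the genre shares always add up to the sample size.

```python
def proportional_allocation(frequencies, sample_size):
    """Split sample_size across genres in proportion to their frequencies.

    Uses the largest-remainder method so the allocations sum to sample_size.
    """
    total = sum(frequencies.values())
    exact = {g: sample_size * f / total for g, f in frequencies.items()}
    allocation = {g: int(share) for g, share in exact.items()}
    leftover = sample_size - sum(allocation.values())
    # Hand the remaining slots to the genres with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda g: exact[g] - allocation[g], reverse=True)
    for genre in by_remainder[:leftover]:
        allocation[genre] += 1
    return allocation


# Hypothetical frequencies of one lexeme in the two genres (not corpus figures).
freqs = {"Mag": 1200, "Newsp": 1800}
print(proportional_allocation(freqs, 50))  # {'Mag': 20, 'Newsp': 30}
```

With these invented figures, 40% of the hits come from magazines, so 20 of the 50 sentence examples would be drawn from that genre and 30 from newspapers.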
2.6. Software (‘R’ and its packages)
The software used for this study is R, free software that is part of the GNU project. It is a language and software environment for statistical computing and graphics (http://www.r-project.org, 27 January 2013). This programme was chosen because "R is an integrated suite of software facilities for data manipulation, calculation and graphical display", with the help of which one can easily produce "well-designed publication-quality plots" (http://www.r-project.org, 27 January 2013). The functions of R can be extended through various packages, each of which provides a set of commands for performing a particular type of analysis, special statistical techniques, graphical tools, etc. A certain set of these packages is usually installed automatically with R; however, the following packages were additionally downloaded for the work in R (operating system: Windows 7): {MASS}, {languageR} and {pvclust}.
Each of the packages has its own function and purpose. The first package (MASS), which was used in this study for Multiple correspondence analysis, was created by Venables & Ripley (2002) and was chosen because "it has simple, but effective, functions for both binary correspondence analysis and multiple correspondence analysis" (Glynn in press: 144). The second package (languageR), developed by Baayen (2011), is known for its variety of statistical options, but can only be used for binary analysis. It was therefore used for all binary correspondence analyses. According to Glynn (in press: 152), this package is easy to use and has "agreeable graphic output". Finally, the last package chosen to extend the capabilities of R is (pvclust), which performs hierarchical cluster analysis on the given data frame using the function hclust and thus makes it possible to plot a cluster dendrogram (Divjak & Fieller in press: 127).
Only two types of analysis were used in this study (Correspondence analysis and Cluster analysis; see the next Subsection 2.7). The above-mentioned packages were used according to their functions, the type (set) of data analysed and the nature of the analysis, as follows:
For all features:
Multiple correspondence analysis: in the package (MASS)
Cluster analysis: in the package (pvclust)
For the subsets of data:
Binary correspondence analysis: in the package (languageR) – for all subsets of data
It should be noted that the choice of package depended on its capability to produce the most meaningful results for the data. Overall, extending R with the three additional packages ((MASS), (languageR) and (pvclust)) made it possible to analyse and manipulate the data and to display the most significant results graphically (see the R plots in Section 3). The next Subsection gives an overview of the statistical tests for which these packages were used.
2.7. Statistical tests (Correspondence and Cluster analyses)
Given that this study is concerned with the patterns of usage of three lexemes, several statistical techniques were needed in order to work with frequencies, convert them into distances and visualize correlations in the data. Owing to the limitations of the study, the emphasis is placed on only two exploratory techniques: Correspondence analysis (multiple and binary) and Cluster analysis. The former is useful for identifying relations between linguistic forms, their meanings, contextual differences and their patterns of usage (Glynn in press: 133). Moreover, with the help of Correspondence analysis one can visualise the relations as "configuration biplots, [...] which depict degrees of correlation and variation through the relative proximity of data points (which represent linguistic usage features and / or the actual examples of use)" (Glynn in press: 133). For Glynn (in press: 134) "Correspondence analysis is a tool for digging in the data for patterns and correlations", which is designed only for exploratory purposes and which does not guarantee that the identified patterns are "anything more than a chance result, specific to the sample under observation". Hence, in this study Correspondence analysis is used to reveal patterns in the data and their relations to each other, and to visualize the results in a plot. The graphically presented findings are then interpreted in Section 3 with "careful consultation", in order to avoid misinterpretation (Glynn 2010b: 101).
Correspondence analysis subsumes a range of different techniques: at least three kinds of binary correspondence analysis and three kinds of multiple correspondence analysis. First of all, Multiple correspondence analysis (in the package (MASS)) is used to see the overall "picture" of the data. Though this "simple and powerful" technique allows one to "add many variables to the analysis and perform complex multivariate analysis", the interpretation of the results may be problematic, particularly as more variables are added, so that "it becomes increasingly difficult to obtain reliable results" (Glynn in press: 148). For this reason, Multiple correspondence analysis is used to show the overall results for the whole data set (all features), to see whether there are clear groupings in the "clouds" of correspondences, and to trace overlaps of features that may indicate non-distinctive data. After the Multiple correspondence analysis, the next step is to create subsets of the data, both to obtain more readable plots and to make the interpretation easier and more reliable. As mentioned in the previous Subsection, Binary correspondence analysis (in the package (languageR)) is used for these subsets. According to Glynn (in press: 153), it is "superior in its representation to the plot produced in (MASS)" and produces plots where "the four quadrants are clearly indicated and the relationship between the different data points much more clearly depicted". Moreover, Binary correspondence analysis produces plots in which the quality of the analysis is indicated in percentages (the amount of explained inertia), which show whether a graph should be interpreted with more caution (in the case of low explained inertia) (Glynn in press: 153-154).
In addition to Multiple and Binary correspondence analyses, another technique is applied in R: Hierarchical cluster analysis (in the package (pvclust)), which produces dendrograms of "discrete visualization" (Glynn in press: 172). Cluster analysis offers a different kind of visualization of the results by displaying groupings. It is therefore used to show "how the features of one factor are grouped", while Correspondence analysis should show "what features are responsible for that grouping" (Glynn in press: 172). Applying both Correspondence and Cluster analyses in this way should help in interpreting the results and produces different types of plots, which will also reveal the advantages and disadvantages of these statistical techniques.
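To make the notion of inertia more concrete, the following Python sketch computes the total inertia of a contingency table, that is, the chi-square statistic divided by the grand total. The percentages printed on a correspondence plot are shares of this total accounted for by each axis. The sketch is an illustration only: the table of counts is hypothetical, the function name `total_inertia` is invented for this example, and this is not the R code used in the study. It also assumes that every row and column of the table has a non-zero total.

```python
def total_inertia(table):
    """Total inertia of a contingency table: chi-square statistic / grand total.

    Assumes every row and every column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    return chi_square / n


# Hypothetical counts (not the study's data):
# rows = customer, client, consumer; columns = positive, neutral, negative.
counts = [
    [10, 30, 10],
    [30, 15, 5],
    [5, 15, 30],
]
print(round(total_inertia(counts), 3))
```

A table whose rows all have the same profile has zero inertia, so there is nothing for a plot to display; the more the row profiles diverge, the larger the inertia that the axes have to account for.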
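The grouping logic behind a dendrogram can be illustrated with a minimal agglomerative clustering sketch in Python. Several assumptions apply: the three profiles are hypothetical proportions chosen for illustration, not the study's data; Euclidean distance and average linkage are choices made for this sketch; the function names are invented; and the bootstrap support values that (pvclust) adds to a dendrogram are not reproduced here.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean Euclidean distance between all members of clusters a and b."""
    return sum(dist(points[x], points[y]) for x in a for y in b) / (len(a) * len(b))


def agglomerate(points):
    """Merge the two closest clusters repeatedly until one cluster remains.

    Returns the merges in order as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in points]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], points),
        )
        merges.append((a, b, average_linkage(a, b, points)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical profiles: proportions of positive, neutral and negative contexts.
profiles = {
    "customer": (0.4, 0.5, 0.1),
    "client": (0.6, 0.3, 0.1),
    "consumer": (0.1, 0.3, 0.6),
}
merges = agglomerate(profiles)
for a, b, d in merges:
    print(a, "+", b, "at distance", round(d, 3))
```

With these placeholder profiles the two most similar items are joined first, and the remaining item joins the tree at a greater distance; in a dendrogram this appears as a lower and a higher branch.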
2.8. Additional survey
One of the most subjective categories taken into consideration in this study was Axiology: its coding was based on purely intuitive judgements, and the sentential examples were coded subjectively according to the context in which the lexemes were used. Moreover, the sentence examples taken from the corpora are only parts of larger texts, the whole "story" of which was not taken into consideration. It was therefore necessary to find out whether my intuition as a non-native speaker provides reliable results and whether the findings would differ if the coding were done by a native speaker. In order to examine whether the variants (negative, positive and neutral) are really associated with the three lexemes in the way this study suggests, a short survey was carried out. For this additional investigation, which was intended to support my findings on the axiological aspects of customer, client and consumer, a free web-based survey and questionnaire tool called ‘SurveyMonkey’ was used (www.surveymonkey.com). SurveyMonkey makes it possible to create a survey and distribute it online (the link to the survey is shared via email), which made this type of survey useful and practical. Further reasons for choosing this online survey were my previous experience of working with the software and its quick procedure for collecting and analyzing the results.
The questionnaire (see Appendix C) contained only two questions and asked native speakers about their intuitions and personal judgements concerning the given lexemes. So that the participants would not concentrate on the three lexemes alone but would think more generally about the differences between words describing the notion of "customer", additional synonyms (e.g. buyer, shopper, etc.) were included in the questionnaire, to be judged as positive, negative or neutral. Ten participants, all native speakers (5 speakers of British English and 5 of American English) aged between 25 and 40, were asked via email to answer the questions online (using the link generated in SurveyMonkey). The first question was intended to identify which variety of English the participants spoke; the second was a "Matrix of choices (only one answer per row)", in which the subjects were asked to tick the appropriate box for each word according to its axiological value (positive, negative, neutral or "I don’t know"). The survey was active (status "open") from 20 to 30 January and was administered with the help of the Web Link collector, which gathers anonymous responses by sending a web link to the respondents by email. The results of the survey are discussed and presented graphically in Subsection 3.3.
3. Results
After 150 instances of customer, client and consumer had been extracted from the BNC (50 instances each) and 150 instances from COCA (50 each), they were manually annotated according to the coding schema adopted for this study. Subsequently, the plots were produced in R using Multiple and Binary correspondence analyses and Cluster analysis. In this Section the results are presented graphically, briefly analyzed and illustrated with corresponding sentential examples from the corpora.
The figure above shows how different words were judged by native speakers as positive, negative or neutral. For the purposes of this study, only customer, client and consumer are interpreted and then compared with the results of the corpus-based method, namely with the axiological distinctions between the lexemes presented in Figure 9: consumer is more negative, client positive and customer neutral.
The survey findings showed that 70% of the participants associated consumer with something negative, whereas client was judged to be a positive description of the concept CUSTOMER by 60% and a neutral one by 30%. Finally, customer was judged as "positive" by 60% of respondents and as "neutral" by 50%. Interestingly, only one participant perceived it as "negative".
Compared with the results of the present study, the survey results are in accord with the axiological distinctions for the three lexemes displayed in Figure 9, the only difference being that the distinction between the "positive" and "neutral" features of customer and client is less clear-cut. This means that consumer is more distinctly negative, whereas the other two lexemes share positive-neutral connotations. Similar differences between the three lexemes were indicated in the plots in R (e.g. Cluster analysis, Figure 15).
4. Discussion
4.1. Implications of the findings for the hypotheses
The exploratory techniques of Correspondence analysis (Multiple and Binary) and Cluster analysis employed in this research made it possible to identify certain usage patterns of the three lexemes under investigation. The graphical representations in the previous Section helped to visualize these findings, compare the differences between British and American English, and interpret the results, while trying to avoid generalizations about language use at large.
First, the outcomes of the Multiple correspondence analysis demonstrated that the analyzed lexemes are clearly distinguished and occupy three distinct positions, but still share a number of features. Secondly, creating subsets of the data and processing them separately in R, namely performing Binary correspondence analysis in the package (languageR), made it possible to analyze the lexemes in more detail. The first two plots (Figure 7 and Figure 8) showed that consumer, client and customer differ to a lesser extent between Magazines and Newspapers than they differ between British and American English. This means that hypothesis (H2), which postulated that there might be differences in genre distributions, has not been confirmed. By contrast, there is more evidence to support (H1), and thus the British and Americans seem to differ in their conceptualizations of CUSTOMER. Importantly, these distinctions between the two dialects appear to stem from the fact that consumer differs quite markedly from customer and client: it is used in the singular form with the definite article (the) in British English and without a determiner in American English. The plots performed on the sub-subsets of the grammatical features (Figures 11-14) also displayed distinct differences in the use of the given lexemes. For example, consumer is clearly associated with modifier (Sentence parts) and the definite article, client is usually used as an object with possessive pronouns, whilst customer tends to be used as a complement or a subject, accompanied by the indefinite article (see Figure 11 and Figure 12). In addition, in order to distinguish between the British and American use of the near-synonyms under study, the plots in Figures 13 and 14 were compared and showed that there are fewer distinct grammatical features in the British than in the American use. It has become clear that, although there are some usage differences between the two dialects (e.g. customer and client are equally associated with object and the indefinite article in BrE, whereas in AmE customer is more associated with the indefinite article and complement, client with object, and both equally with adjective), there are also several shared features (e.g. the association of customer with the plural form and of consumer with the singular form in both BrE and AmE). All grammatical categories helped to test the fourth hypothesis (H4), and showed that both American and British speakers of English use the three near-synonyms creatively and try to add as much additional information as possible; they also seem to convey their emotional or social status in some ways (e.g. using object instead of subject may be interpreted as a wish to portray oneself in a higher position, i.e. as somebody who is allowed to control customers, while using a complement shows a more personal attitude or involvement, i.e. We are the biggest client..).
Thirdly, the distinct axiological features captured by the plot of the subset Axiology (see Figure 9) support the hypothesis concerned with the speakers’ emotions (H3), which are operationalized as the positively or negatively charged contexts in which the three lexemes consumer, client and customer occur. The results of the exploratory technique (Binary correspondence analysis) have shown that each lexeme tends to be used in either positive (client), negative (consumer) or rather neutral (customer) situations. Given that the coding was done by a non-native speaker of English, who used her own intuition to analyze the sentential examples, the findings for this semantic-axiological data were additionally tested with another method (a survey). The results of the additional survey showed no clear distinction between the positive and neutral characteristics of client and customer, but confirmed that consumer is more ‘negative’ than the other two (see Figure 16).
Finally, in order to distinguish between all three lexemes and illustrate the semantic and functional differences between the uses of the given near-synonyms, another exploratory technique, Cluster analysis, was used in this study. The resulting plot (Figure 15) clearly demonstrated that consumer stands out from the lexemes customer and client. This may be explained by the tendency of consumer to be used with modifier and the definite article (the), as well as by dialectal differences: in AmE consumer is more associated with the definite article, whereas in BrE it is more associated with the singular form. Given that no research has been done into the patterns of usage of these lexemes from the corpus cognitive perspective, and that the scope of the study is restricted to several features which might explain the conceptualization of CUSTOMER in British and American culture, it would be premature to make any generalizations about language use at large. Glynn’s (2010a) study, which only partially deals with nouns, might be referred to as the only one that seems to corroborate the findings of this study in some ways; according to it, "count nouns profiling is more typically associated with light emotional inconsequential effect" (Glynn 2010a: 14).
4.2. Limitations of the study
The study represents only an attempt to use corpus-based methodology and explore language phenomena from a corpus cognitive perspective, and is restricted in its scope. Moreover, owing to the lack of previous experience in formulating hypotheses and operationalizing the variables in a coding schema, the research might contain some unclear assumptions. The design and structure of the research should probably be more elaborate, and the size of the sample (300 instances) makes it difficult to draw any general conclusions; it should be extended in future investigations. In addition, it would be interesting to look at other near-synonyms which are also close in meaning to CUSTOMER (e.g. buyer, shopper), and to examine whether these words convey emotions or attitudes. Since only a few exploratory techniques were used in this study and other tests (e.g. significance tests, p-values, effect sizes, etc.) were not employed, any claims about language at large would be inappropriate.
4.3. Implications of the findings for the research area
As already discussed in the previous Sections, combining the most appropriate techniques of corpus methodology with the theoretical background of cognitive linguistics has the potential to reveal associations and differences between the patterns of usage of near-synonyms. As stated by Glynn (2010: 5), identifying recurring combinations or patterns of linguistic forms, which are postulated to reflect "a speaker’s intentional knowledge" (in corpus linguistics), and considering language as being "conceptually motivated" (in cognitive linguistics), allows, first of all, "a direct method for making generalizations across large numbers of speakers (thus a language’s grammar)" and secondly, "an indirect method for producing hypotheses about the conceptual structure of a language (motivation for a language’s grammar)". A full description of all the advantages and disadvantages of the basic principle of operationalizing conceptual structure, which underpins corpus-based cognitive linguistics, remains beyond the scope of this study.
5. Conclusion
This study of the near-synonyms customer, client and consumer has demonstrated differences in their meanings by analyzing their formal and semantic correlations and investigating the ways in which these lexemes are operationalized. The BNC and COCA provided quantitative linguistic data for the analysis, whereas R was used to present the results graphically and helped to identify the patterns of usage of the given lexemes. Using a quantitative corpus-based method, this study, concerned with the conceptualization of CUSTOMER, has shown the ways in which the given lexemes have been operationalized.
The findings of the research suggest that the conceptualization of CUSTOMER differs considerably between speakers of British and American English, and that all three lexemes are clearly distinguished and occupy three distinct positions, but still share some features. Moreover, people are likely to express their attitude towards CUSTOMER as "negative" when they talk about consumer, and as "neutral-positive" when they use customer and client. Finally, the results of the analysis have also revealed that customer and client are more similar in meaning, whereas consumer, standing out from the other lexemes, seems to be quite different in meaning. Such a clear distinction between the three near-synonyms may help marketing professionals and advertisers to plan their campaigns in more detail and to develop marketing strategies more skilfully, and thus to have more success.
