Sunday, July 14, 2019
A Corpus-Based Analysis of Mixed Code in Hong Kong Speech
2012 International Conference on Asian Language Processing (DOI 10.1109/IALP.2012.10)

John Lee
Halliday Centre for Intelligent Applications of Language Studies
Department of Chinese, Translation and Linguistics
City University of Hong Kong

Abstract: We present a corpus-based analysis of the use of mixed code in Hong Kong speech. From transcriptions of Cantonese television programs, we identify English words embedded within Cantonese utterances, and investigate the motivations for such code-switching. Among the many motivations observed in previous research, we found that four alone account for more than 95% of the use of English words in our speech data across genres, genders, and age groups. We performed analyses over more than 60 hours of transcribed speech, resulting in one of the largest empirical studies to date on this linguistic phenomenon.

Keywords: code-mixing; English; corpus linguistics; code-switching; Cantonese

I. INTRODUCTION

While Cantonese is the mother tongue of the vast majority of the population in Hong Kong, English is also spoken by 43% of the population [1], reflecting the city's heritage as a British colony. A well-known feature of speech in Hong Kong is code-switching, i.e., "the juxtaposition of passages of speech belonging to two different grammatical systems or sub-systems, within the same exchange" [2]. Specifically, in the case of Hong Kong, the two grammatical systems are Cantonese and English. The former serves as the "matrix language" and the latter as the "embedded language", resulting in Cantonese sentences with English segments such as (example taken from [3]):

    heoi3 canteen jam2 caa4
    'let's go to the canteen for lunch'

Here, the English segment has only one word ("canteen"), but in general it can be a whole clause. We will use the general term "code-switching" rather than the more specific term "code-mixing", which refers to switching below the clause level, even though most English segments in our corpus indeed contain only one or several words (see Table 3).

There is already a large body of literature devoted to the study of Cantonese-English code-switching from the theoretical linguistic point of view [3,4,5]. This paper studies the motivations behind the use of mixed code, on the basis of a large dataset of speech transcribed from television programs. In Section II, we outline previous research on the motivations of code-switching, and discuss how our investigation complements it. In Section III, we describe our methodology for corpus construction, in particular the design of the taxonomy of code-switching motivations. In Section IV, we present an analysis of these motivations according to genre, gender and age.

II. PREVIOUS RESEARCH

The first major framework for classifying code-switching motivations in Hong Kong consists of two categories: expedient and orientational [6].
Central to this framework is the distinction between words in "high Cantonese" and "low Cantonese". In everyday conversations, a speaker sometimes cannot find any word from low Cantonese to describe an object, institution or idea (e.g., "application form"). Using a word from high Cantonese (e.g., biu2 gaak3), however, would sound too formal and so stylistically inappropriate. In expedient mixing, the speaker resorts to an English word; the mixing is pragmatically motivated. In contrast, orientational mixing is socially motivated. The speaker chooses to use English (e.g., "barbecue") despite the availability of equivalent words from both low Cantonese (e.g., siu2 je5 sik6) and high Cantonese (e.g., siu1 haau1), since he perceives the subject matter to be inherently more "western".

This dichotomy has been criticized as too simplistic, because of the ambiguity in defining lexical and stylistic equivalents among low Cantonese, high Cantonese, and English. Instead, a four-way taxonomy is proposed: euphemism, specificity, bilingual punning, and the principle of economy [7]. This taxonomy is then further expanded, in a study of code-switching in text media [8], to include quotations, doubling, identity marking, and interjection. These categories will be explained in detail in Section III.

While these classification systems are comprehensive and well grounded, they do not in themselves give any sense of the relative importance or distribution of the various motivations. Our goal is, first, to empirically verify the coverage of these classification systems on a large dataset of transcribed speech; and, second, to give quantitative answers to questions such as: Which kinds of motivations are the most prominent? Does the pattern of motivations differ according to the speech genre, or to the speaker's gender or age? We now turn our attention to the methodology for constructing and annotating a speech corpus for these research purposes.

III. DATA

A. Source Material

Our corpus is constructed from television programs broadcast in Hong Kong within the last four years by Television Broadcasts Limited (TVB). The programs belong to a variety of genres, including two drama series, three current-affairs shows, a news program, and a talk show. The news program, TVB News at Six-Thirty, carries the most formal register, consisting for the most part of pre-planned speech read from a script by the anchor. The current-affairs shows, Tuesday Report, Sunday Report and Hong Kong Connection, are serious in tone but contain spontaneous discussions. The talk show, My Sweets, is about food and drink. It also contains spontaneous discussions, but the topics tend to be lighter. Although pre-planned, the speech in both drama series, Moonlight Resonance and Yes Sir, Sorry Sir, is arguably the least formal in register, designed to reflect natural speech in everyday life.
Details of these TV programs are presented in Table 1.

Table 1. Television programs that serve as the source material of our corpus.

    Genre            Program                                              Length
    Current affairs  Tuesday Report, Sunday Report, Hong Kong Connection  135 episodes x 20 minutes
    Talk show        My Sweets                                            24 episodes x 30 minutes
    News             TVB News at Six-Thirty                               5 episodes x 20 minutes
    Drama            Moonlight Resonance, Yes Sir, Sorry Sir              4 episodes x 45 minutes

B. Data Processing

From the television programs listed in Table 1, all code-mixed utterances were transcribed, preserving the original words, either Cantonese or English. Following standard practice, loanwords are not considered to be mixed code in our context; all English words (e.g., "taxi") that have been assimilated into Cantonese phonology (e.g., dik1 si2) were excluded. The TV captions corresponding to each of these utterances are also recorded as part of the corpus. These captions are in standard Chinese, rather than Cantonese. Furthermore, alignments between the Chinese word(s) in the caption and the English word(s) in the utterance are annotated. This information will be used in the classification of motivations. Finally, two kinds of metadata about the speaker are recorded: gender (male or female) and age group (teenager or adult).

C. Taxonomy of Code-Switching Motivations

Our goal is to quantitatively characterize the motivations behind code-switching; to this end, each English segment in the Cantonese sentences in our corpus is to be labeled with a motivation. Due to time constraints, this classification was performed only on the current-affairs and talk shows.

The expedient vs. orientational classification system is too coarse for our purpose. Instead, we adopt the taxonomy in [7,8] as our starting point, then introduce some new categories to accommodate our data. The categories in [7] are as follows. [Footnote 1: A fourth category, bilingual punning, is excluded from our taxonomy. As may be expected, punning is rarer in speech, and is indeed not observed in our corpus.]

Euphemism. When a Cantonese word explicitly mentions something that the speaker finds embarrassing, s/he might opt for an English word that contains no such mention. For example, to avoid the female body part hung1 'breast' in the word hung1 wai4 'bra', the speaker might prefer to use the English "bra" (all examples are taken from [7]):

    tau3 bra gaak3 gaak3
    'A princess whose bra is visible'

Specificity. Sometimes an English expression is preferred because its meaning is "more general or specific compared with its near-synonymous counterparts" [7] in either low or high Cantonese. For example, the verb "to book" means "to make a reservation for which no money or deposit is required", which is more specific than its closest equivalent in Cantonese, deng6 'to make a reservation'. It is often used in sentences such as:

    ngo5 soeng2 book saam1 dim2
    'I want to book 3 o'clock'

Principle of economy. An English expression may also be preferred because it is shorter and therefore "requires less linguistic effort compared with its Chinese/Cantonese equivalent" [7]. While "check in" has two syllables, its Cantonese equivalent baan6 lei5 dang1 gei1 sau2 zuk6 'check in on a plane' has six. The principle of economy is thus probably the reason behind mixed code such as:

    nei5 check in zo2 mei6 aa3
    'Have you checked in already?'

The taxonomy in [8] builds on the one in [7], further enriching it with the categories below. [Footnote 2: Among these categories is "identity marking", for mixed code that "marks social characteristics such as social status, education status, occupation, as well as regional affiliation" [8]. We found it difficult to objectively evaluate this motivation, and excluded it from our taxonomy.]

Quotation. When citing text or someone else's speech, one often prefers to use the original code to avoid having to perform translation. An example is:

    jau5 go3 pang4 jau5 man6 ngo5 what do you think
    'A friend asked me, "What do you think?"'

Doubling. Previously called "emphasis" or "strategy of repetition" [8], it will be referred to as "doubling" [9] here to make it explicit, as this category refers to English words that are inserted alongside Cantonese words that have the same or nearly the same meaning. The purpose is to emphasize the idea or to avoid repetitions. In the following sentence, it serves as an emphasis:

    very good m4 co3 aa1
    'very good, very good'

Interjection. English interjections may be inserted into the Cantonese sentence. For example:

    anyway nei5 hou2 sai1 lei6 ak1
    'Anyway, you are amazing'

A significant amount of mixed code in our corpus, however, still does not fall into any of the above categories. Most of it falls under one of two reasons, personal name and register. We therefore added them to our taxonomy.

Register. This is almost identical to the "expedient" category in [6], but will be referred to as "register" in this paper to make the motivation explicit. Sometimes, the speaker cannot find any equivalent low Cantonese word, but feels awkward using a more formal high Cantonese word (e.g., paai1 deoi3 'party'). As a result, s/he resorts to an English equivalent instead. For example:

    ngo5 dei6 go3 party hoi1 ci2 laa1
    'Our party is starting'

Personal name. It is common practice among Hong Kong people to adopt an English name. Although this phenomenon may be considered orientational code-mixing in terms of the "western" perception [6], it is given its own category, because it is very specific and accounts for a substantial amount of our data. A typical example is:

    Teresa, ngo5 dei6 zing2 dak1 leng3 m4 leng3
    'Teresa, did we make it nicely?'

D. Annotation Procedure

We thus have a total of eight categories in our taxonomy of code-switching motivations. Five of these categories (namely, euphemism, quotation, doubling, interjection, and personal name) can usually be unambiguously identified. The annotator, however, has often found it difficult to distinguish between specificity, register, and principle of economy. To improve consistency, we adopt the following procedure. When an English segment does not fit into any of the five clear categories, the annotator is to determine whether it has the same meaning as the Chinese word in the caption to which it is aligned. If it is deemed not to have the same meaning, then it is labeled "specificity". If it is equivalent in meaning, and the annotator cannot think of any equivalent in low Cantonese, then it is labeled "register". Lastly, if there is a low Cantonese equivalent, but its number of syllables is larger than that of the English segment, then the motivation is "principle of economy".

IV. ANALYSIS

This section presents some preliminary analyses on this corpus. We first consider the frequency and length of English segments in Cantonese speech (Section A), then discuss the distribution of the categories of motivations, both overall and with respect to genres, genders, and age groups (Section B).

A. Density and Length of English Segments

It is well known that English words are sprinkled rather generously in the Cantonese speech in Hong Kong. We measure how the frequency of English segments varies across different genres. As shown in Table 2, the relative frequency correlates with the register of the genre (see Section III.A): the less formal the genre, the more frequent the English words. In the drama series, the most colloquial genre, about one and a half English words are uttered per minute on average. The talk show occupies second place, and the current-affairs shows have slightly less frequent English words. In the news program, where the speech is pre-planned, the anchor did not utter any English word.

Table 2. The total number of Cantonese sentences containing English segments, and the total number of English words transcribed. The last column shows how often an English word is uttered.

    Program genre    Sentences with English  English words  Frequency (words/min)
    Drama            219                     259            1.4
    Talk show        487                     625            0.87
    Current affairs  1495                    1995           0.74
    News             0                       0              0

Second, we measure the length of the English segments. Table 3 shows that the vast majority of English segments contain no more than two words. Across all genres, more than 80% of the English segments consist of only one English word. This figure is similar to the 81.4% for text data reported in [8].

Table 3. Proportion of English segments with only one word (e.g., "canteen") or two words (e.g., "thank you").

    Program genre    One-word  Two-word
    Drama            85%       11%
    Current affairs  85%       11%
    Talk show        81%       17%

B. Motivations for the Use of Mixed Code

A plethora of motivations have been posited for the use of mixed code in Hong Kong (see Section II). Applying our proposed classification system (see Section III.C) to our corpus of transcribed speech, we aim now to discern the relative prevalence of the various kinds of code-switching motivations.
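
The annotation procedure of Section III.D is essentially a short decision sequence, and it can be sketched in code. The following Python is only an illustration under stated assumptions: the `SegmentJudgment` record, its field names, and the `classify` function are hypothetical and not part of the corpus tooling, and the judgments passed in (whether the meaning matches the caption, whether a low Cantonese equivalent exists, syllable counts) are still made by a human annotator. The procedure as described does not say what to do when a low Cantonese equivalent exists but is not longer than the English segment, so the sketch returns `None` in that case.

```python
from dataclasses import dataclass
from typing import Optional

# The five categories described as usually unambiguous in Section III.D.
CLEAR_CATEGORIES = frozenset(
    {"euphemism", "quotation", "doubling", "interjection", "personal name"}
)


@dataclass
class SegmentJudgment:
    """Hypothetical record of an annotator's judgments for one English segment."""
    english: str
    english_syllables: int
    # Set when the segment fits one of the five clear categories.
    clear_category: Optional[str] = None
    # Does the segment mean the same as the aligned Chinese caption word?
    same_meaning_as_caption: bool = True
    # Syllable count of a low Cantonese equivalent, or None if the
    # annotator cannot think of one.
    low_cantonese_syllables: Optional[int] = None


def classify(seg: SegmentJudgment) -> Optional[str]:
    """Apply the Section III.D steps in order; None means 'not covered'."""
    if seg.clear_category is not None:
        if seg.clear_category not in CLEAR_CATEGORIES:
            raise ValueError(f"unknown category: {seg.clear_category!r}")
        return seg.clear_category
    if not seg.same_meaning_as_caption:
        return "specificity"
    if seg.low_cantonese_syllables is None:
        return "register"
    if seg.low_cantonese_syllables > seg.english_syllables:
        return "principle of economy"
    return None  # case left open by the procedure as described


# Illustrative judgments only, loosely based on the examples in Section III.C.
examples = [
    SegmentJudgment("Teresa", 3, clear_category="personal name"),
    SegmentJudgment("book", 1, same_meaning_as_caption=False),
    SegmentJudgment("party", 2),
    SegmentJudgment("check in", 2, low_cantonese_syllables=6),
]
for seg in examples:
    print(seg.english, "->", classify(seg))
```

Checking the clear categories first, and syllable counts last, mirrors the order in which the procedure is stated, so a segment is only tested against the three hard-to-distinguish categories once the unambiguous ones have been ruled out.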
Table 4 shows the distribution of these motivations in the current-affairs and the talk shows. Four dominant motivations (mainly register, but also personal name, principle of economy, and specificity) are attributed to more than 95% of the English segments. This trend is the same across genres (current-affairs and talk shows), genders (see Table 6), and age groups (see Table 5). All other categories, including quotation, euphemism, doubling, and interjection, are relatively infrequent.

Genres. Among the four dominant motivations, register (the use of appropriately informal words) is the most frequent motivation in both the current-affairs and talk shows. Its proportion, however, is significantly more pronounced in the talk show (47.4%) than in current affairs (36.4%), reflecting the more casual nature of the former.

Table 4. Distribution of code-switching motivations, contrasted between genres.

    Motivation            Current affairs  Talk show
    Register              36.4%            47.4%
    Personal name         26.8%            24.5%
    Principle of economy  19.0%            17.6%
    Specificity           13.2%            8.2%
    Quotation             2.1%             1.0%
    Doubling              1.4%             0.4%
    Interjection          0.9%             1.0%
    Euphemism             0.3%             0%

Age groups. Table 5 contrasts the distributions of code-switching motivations between adults and teenagers in the current-affairs shows. [Footnote 3: The speakers in the talk show are predominantly adults.] As mentioned above, the four major motivations remain constant. However, teenagers are much more likely than adults to use English words to achieve a more informal register (52.4% vs. 35.1%). They also tend more to opt for English to save effort (23.8% vs. 18.6%). Somewhat surprisingly at first glance, teenagers address others by English names less often than adults (2.4% vs. 28.8%); it turns out that in the conversations in our corpus, teenagers often prefer to address adults with the more formal Chinese names, likely out of respect.

Table 5. Distribution of code-switching motivations, contrasted between age groups.

    Motivation            Adults  Teenagers
    Register              35.1%   52.4%
    Personal name         28.8%   2.4%
    Principle of economy  18.6%   23.8%
    Specificity           13.1%   14.3%
    Quotation             1.9%    4.0%
    Doubling              1.3%    2.4%
    Interjection          0.9%    0%
    Euphemism             0.3%    0.8%

Genders. Finally, we investigate whether code-switching motivations are biased according to gender. Aggregating statistics from both the current-affairs and talk shows, Table 6 compares the motivations of males and those of females. Females are shown to be more likely to use English names to address others (32.9% vs. 18.9%); men, on the other hand, more frequently use English words to reduce effort (22.9% vs. 14.8%).

Table 6. Distribution of code-switching motivations, contrasted between genders.

    Motivation            Female  Male
    Register              37.5%   40.7%
    Personal name         32.9%   18.9%
    Principle of economy  14.8%   22.9%
    Specificity           10.9%   13.2%
    Quotation             1.9%    1.7%
    Doubling              1.1%    1.3%
    Interjection          0.7%    1.1%
    Euphemism             0.3%    0.2%

V. CONCLUSIONS

We have described the construction of a corpus of Cantonese-English mixed code, based on speech transcribed from television programs in Hong Kong. Drawn from more than 60 hours of speech, this corpus is among the largest of its kind. A novel feature of the corpus is the annotation of the motivation behind each code-mixed utterance. Having proposed a classification system for these motivations, we applied it to our corpus, and reported differences in the use of mixed code between genres, genders and age groups. A key finding is that four main motivations (register, personal name, principle of economy, and specificity) account for more than 95% of the embedded English segments.

ACKNOWLEDGMENT

This project was partially funded by a minor research grant from the Department of Chinese, Translation and Linguistics at City University of Hong Kong. We thank Man Chong Mak and Hiu Yan Wong for compiling the corpus and performing annotation.

REFERENCES

[1] K. H. Y. Chen, "The Social Distinctiveness of Two Code-mixing Styles in Hong Kong," in Proceedings of the 4th International Symposium on Bilingualism, MA: Cascadilla Press, 2005, pp. 527-541.
[2] J. Gumperz, "The sociolinguistic significance of conversational code-switching," in RELC Journal 8(2), 1977, pp. 1-34.
[3] J. Gibbons, "Code-mixing and koineizing in the speech of students at the University of Hong Kong," in Anthropological Linguistics 21(3), 1979, pp. 113-123.
[4] B. H.-S. Chan, "How does Cantonese-English code-mixing work?", in Language in Hong Kong at Century's End, M. C. Pennington (ed.), 1998, pp. 191-216, Hong Kong: Hong Kong University Press.
[5] D. C. S. Li, "Linguistic convergence: impact of English on Hong Kong Cantonese," in Asian Englishes 2(1), 1999, pp. 5-36.
[6] K. K. Luke, "Why two languages might be better than one: motivations of language mixing in Hong Kong," in Language in Hong Kong at Century's End, M. C. Pennington (ed.), 1998, pp. 145-159, Hong Kong: Hong Kong University Press.
[7] D. C. S. Li, "Cantonese-English code-switching research in Hong Kong: a Y2K review," in World Englishes 19(3), 2000, pp. 305-322.
[8] H. Cao, "Development of a Cantonese-English code-mixing speech recognition system," PhD dissertation, Chinese University of Hong Kong, 2011.
[9] R. Appel and P. Muysken, Language Contact and Bilingualism. London: Arnold, 1987.