Charlotte Dow, Rachel Lee, Ellery Yu, Olivia Salazar, Yennie Song
In this study, we compared transfer errors among bilingual speakers both qualitatively and quantitatively. We targeted classifiers and the aspect markers zai and le in Mandarin, examining differences between heritage and L2 speakers. We predicted that heritage speakers would make significantly fewer errors with aspect markers than L2 learners but more errors with classifiers. Data were collected from 11 heritage speakers and 11 second language learners of Mandarin, as well as 2 balanced bilinguals who served as the baseline for comparison. Our results showed that L2 learners significantly outperformed heritage speakers in classifier accuracy, while there was no significant difference between the two groups in aspect marker usage. These findings suggest that specific classifiers are more difficult to acquire or maintain, particularly for heritage speakers who have limited formal input. Moreover, the results show that L2 learners can outperform heritage speakers in certain domains, challenging commonly held assumptions about the heritage speaker advantage.
Keywords: Bilingualism, Mandarin, Transfer Errors, Classifiers, Aspect Markers
Introduction and Background:
We know that second language (L2) learners and heritage speakers (HS) follow vastly different paths to proficiency (Paradis, 2023). Given that L2 speakers learn in a classroom environment and heritage speakers develop language in a less formal, home-based environment, how do these distinct journeys affect how they maintain a language and the specific mistakes they make?
This led to our proposed research question: Do heritage speakers and late L2 learners experience different morphosyntactic transfer errors?
We hypothesized that heritage speakers would make significantly fewer errors with aspect markers than L2 learners, but more errors with classifiers. Conversely, we predicted that L2 learners would demonstrate higher grammatical awareness and make significantly fewer errors with classifiers, but not with aspect markers.
To conduct this research, we sorted our participants into three groups based on their acquisition trajectory. Heritage speakers (HS) are those who naturally acquire their heritage language (Hao et al., 2025). Second language (L2) learners are cognitively mature learners who begin learning a second language after the critical period (Eyring, 2014). Balanced bilinguals (BB) are speakers who have proficiency in two languages such that their skills in each language match those of a native speaker of the same age (APA Dictionary of Psychology, 2023).
A classifier is a morpheme used between a numeral or demonstrative and a noun. The general classifier, ge, is the most common, but certain nouns take specific classifiers. Children use the general classifier ge with many nouns, though adults may use more specific classifiers with certain nouns (Chien et al., 2003).
Aspect markers are bound morphemes used to indicate the event state of an action. The two aspect markers we looked at were le and zai. Le is a perfective aspect marker that describes event completion and zai is a progressive aspect marker that describes an ongoing event.
Our contribution to this research involves the analysis of a speech generation task and usage of these grammatical targets by L2 learners and HSs.
Methods:
We recruited 11 HSs and 11 L2 learners of Mandarin Chinese, along with 2 balanced Mandarin-English bilinguals. We omitted two speakers’ data because the samples either contained too much English or had background noise that prevented accurate transcription.
Participants self-reported their proficiency using a Language Background Questionnaire (LBQ) adapted from Anderson, Mak, Keyvani Chahi, & Bialystok (2018), with additional questions modelled on suggestions from Hao et al. (2025). We used the questionnaire to better understand participants’ individual differences and types of exposure to the target language, and to classify them as HSs, L2 learners, or balanced bilinguals for the purposes of the experiment.
For the experiment, participants were given written and audio-recorded instructions in either English or Mandarin. To counterbalance practice effects, half received the English version first and half received the Mandarin version first. Giving instructions in each language before the corresponding task may prime that language, but not the targeted variables. Participants were presented with a silent clip from the 1988 film The Hill Farm (3:58 to 4:40). For the first language, they watched the clip twice and were then asked to describe it in English or Mandarin, matching the language of the instructions they had received. To elicit naturalistic speech, participants were not informed of a time limit, but a two-minute maximum was applied for data collection purposes. Next, participants watched the clip again and were asked to describe it in the other language. Sessions were conducted via Zoom, and participants were audio recorded while describing the clip in each language.
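The counterbalanced ordering of languages can be sketched in a few lines. This is a minimal illustration and not the procedure actually used in the study: the participant IDs, the fixed random seed, and the function name assign_orders are all assumptions made for the example.

```python
import random

def assign_orders(participant_ids, seed=0):
    """Split participants so half start in English and half in Mandarin."""
    ids = list(participant_ids)
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for i, pid in enumerate(ids):
        # First half: English then Mandarin; second half: the reverse order.
        if i < half:
            orders[pid] = ("English", "Mandarin")
        else:
            orders[pid] = ("Mandarin", "English")
    return orders

# Hypothetical participant IDs (not the study's actual identifiers).
orders = assign_orders([f"P{n:02d}" for n in range(1, 25)])
english_first = sum(1 for order in orders.values() if order[0] == "English")
print("English first:", english_first)
print("Mandarin first:", len(orders) - english_first)
```

Shuffling before splitting keeps the assignment independent of recruitment order; a study could equally balance the orders within each speaker group, which this sketch does not attempt.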
We examined whether the general classifier ge was used in place of specific classifiers and how the aspect markers le and zai were used, in order to identify errors or misuse. We counted the errors and categorized them by type (incorrect aspect, generalized classifier, omission). We also recorded the proportion of correct classifiers and correct aspect markers per speaker and compared these across HSs, L2 learners, and balanced bilinguals. Classifier errors were further categorized by the word with which they most frequently appeared, and aspect errors were split between the progressive marker zai and the perfective marker le. The errors in the HS and L2 groups were then compared to determine how the two groups differed. The percentages of correct classifiers and correct aspect markers per speaker were also compared with speakers’ self-reported reading, writing, speaking, and understanding proficiencies in Mandarin, taking into account whether participants did any self-study or spoke any other classifier languages. Fisher’s exact test was performed on the total numbers of correct and incorrect classifiers and aspect markers used by HSs and L2 learners.
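The tallying described above (error counts by type, plus the proportion of correct uses per speaker) can be sketched as follows. The annotation records, speaker IDs, and function names are hypothetical and are shown only to make the bookkeeping concrete; they are not the study's data or its actual coding scheme.

```python
from collections import defaultdict

# Hypothetical annotations: (speaker, group, target, outcome).
# outcome is "correct" or one of the error types named in the text.
annotations = [
    ("S01", "HS", "classifier", "correct"),
    ("S01", "HS", "classifier", "generalized classifier"),
    ("S01", "HS", "aspect", "correct"),
    ("S02", "L2", "classifier", "correct"),
    ("S02", "L2", "aspect", "incorrect aspect"),
    ("S02", "L2", "aspect", "correct"),
    ("S02", "L2", "aspect", "omission"),
]

def accuracy_by_speaker(rows):
    """Return {(speaker, group, target): proportion of correct uses}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for speaker, group, target, outcome in rows:
        key = (speaker, group, target)
        totals[key] += 1
        if outcome == "correct":
            correct[key] += 1
    return {key: correct[key] / totals[key] for key in totals}

def error_counts(rows):
    """Count errors by (group, target, error type)."""
    counts = defaultdict(int)
    for _, group, target, outcome in rows:
        if outcome != "correct":
            counts[(group, target, outcome)] += 1
    return dict(counts)

acc = accuracy_by_speaker(annotations)
errs = error_counts(annotations)
for key in sorted(acc):
    print(key, round(acc[key], 2))
```

Keeping one record per produced classifier or aspect marker makes it straightforward to total the correct and incorrect uses per group later on.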
Results and Analysis:

Tables 1 and 2: Fisher’s Exact Tests for Aspect and Classifier Correctness, Respectively
By totaling the outcomes for each group and performing Fisher’s exact test, we found no significant difference in aspect marker accuracy between the groups (p = 0.735). We therefore cannot reject the null hypothesis that L2 learners and heritage speakers do not differ in aspect proficiency. The results for classifiers were quite different: a p-value of 0.0056 indicates a statistically significant difference between the two groups.
Additionally, the odds ratio of 0.38 (HSs relative to L2 learners) means that L2 learners had roughly 2.6 times the odds of using a classifier correctly (1/0.38 ≈ 2.63).
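For readers who want to see how such a test works, the following is a minimal sketch of a two-sided Fisher's exact test and a sample odds ratio for a 2x2 table, written with the Python standard library only. The counts below are placeholders, since the group totals are not reproduced in the text. The two-sided form of the test, the orientation of the odds ratio (HS relative to L2), and the function names are assumptions of this sketch.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k, given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-7))

def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); undefined when b or c is zero."""
    return (a * d) / (b * c)

# Placeholder counts only (NOT the study's data).
# Rows: HS, L2. Columns: correct, incorrect.
hs_correct, hs_incorrect = 10, 20
l2_correct, l2_incorrect = 15, 10

p_value = fisher_exact_two_sided(hs_correct, hs_incorrect, l2_correct, l2_incorrect)
or_hs_vs_l2 = odds_ratio(hs_correct, hs_incorrect, l2_correct, l2_incorrect)

print("p-value:", round(p_value, 4))
print("odds ratio (HS relative to L2):", round(or_hs_vs_l2, 2))
# Taking the reciprocal re-expresses the ratio as L2 relative to HS.
print("odds ratio (L2 relative to HS):", round(1 / or_hs_vs_l2, 2))
```

An odds ratio below 1 in this orientation indicates lower odds of a correct use for HSs; its reciprocal gives the corresponding figure for L2 learners, which is how a reported 0.38 corresponds to about 2.6.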
Figure 1: Average Classifier Use in HS, L2, and BB Speakers (plot title: Classifiers)
Figure 2: Average Aspect Use in HS, L2, and BB Speakers (plot title: Aspect Markers)
Figure 1 illustrates the accuracy divide in classifier use reported above. HSs, on average, used the most classifiers but struggled with precision, producing more incorrect uses than correct ones. In contrast, L2 learners used fewer classifiers overall but maintained a higher accuracy rate. Unlike classifiers, aspect markers (Figure 2) were used with higher precision across all three groups. HSs and L2 learners performed very similarly, with average correct uses of 2.36 and 2.45, respectively. While balanced bilinguals used more aspect markers, the error rates remained consistently low (<0.60) across all three participant types.
Balanced bilinguals performed much better than the HS and L2 groups. L2 learners also significantly outperformed HSs in classifier accuracy. In contrast, as the statistical analysis confirmed, there was no significant difference between L2 learners’ and HSs’ aspect marker usage (p = 0.735).
Discussion and Conclusion:
Balanced bilinguals significantly outperformed the other groups in this study, which indicates stronger grammatical competence or more stable representations of these structures. Aspect scores were generally higher than classifier scores, which suggests that classifiers are harder to acquire or maintain than aspect marking. To further analyze these results, we looked at several factors, including speakers’ self-rating tendencies, speakers’ self-studying tendencies, cross-linguistic influence, and target language input.
When comparing speakers’ self-ratings to their performance, we saw that HSs tended to rate their reading and writing proficiency lower than L2 learners did, and that HSs who rated themselves higher generally had higher percentages of correct aspect marker usage. The correlation between HSs’ self-rated reading and writing scores and their aspect marker proficiency adds to previous research identifying literacy as a mediating factor in HS outcomes (Hao et al., 2025). It also supports existing research showing that increased literacy in the target language is correlated with “target-like” production, or greater accuracy in usage (Bayram et al., 2019); in this study, our targets were the balanced bilinguals, as we were unable to access a population of monolingual Mandarin speakers. Overall, these findings add to the larger body of research highlighting the role of literacy, or reading and writing, in language development and maintenance (Goldenberg, 2020; Biber & Hared, 1991; Bigelow & Tarone, 2004; Eisenchlas et al., 2013).
We recorded participants’ reports of self-study in order to further analyze the differences between HS and L2 learner performance. However, there was no apparent difference in classifier or aspect accuracy between participants who self-studied and those who did not, so our analysis was limited to clarifying how future research might approach the effects of informal literacy on heritage language maintenance. Previous research has posited that bilinguals who do not receive formal literacy training have difficulty maintaining their non-dominant language over time, and studies on mobile-assisted language learning have reported improvements primarily in speaking and listening skills, although they have typically focused on the app Duolingo and on learning English (Shortt et al., 2023; Shadiev et al., 2020; Eisenchlas et al., 2013). Because we lacked detailed information about each participant’s self-study, we were unable to draw many conclusions from comparing participants who reported self-study with those who did not, or from comparisons within the self-studying group. Future research might use a more specific categorization of self-study methods in the LBQ, such as separating mobile-assisted language learning tools from informal practice with target language interlocutors, for a clearer understanding of how self-study affects speaker performance. Further categorizing the types of mobile learning tools, such as text-based conversation versus flashcards, would also help in comparing the effects of specific self-study methods.
The next portion of our analysis focused on the role of cross-linguistic influence in aspect and classifier proficiency. L2 learners who spoke another classifier language tended to have classifier accuracy percentages notably higher than the average for L2 learners overall. However, we could not determine the extent to which participants’ other languages affected their Mandarin classifier proficiency. Had we collected additional data on participants’ proficiency in their other languages, such as self-rated reading, writing, speaking, and listening proficiency, we might have been able to draw further conclusions. Our questionnaire asked only whether participants spoke any language besides Mandarin and English and what their self-rated proficiency in that language was on the CEFR scale. Bayram et al. (2019) suggest that increased literacy in HSs protects against cross-linguistic interference; however, as we found no trends in classifier proficiency among speakers of non-classifier languages, we are unable to provide additional support for this claim. Additionally, the speakers of another classifier language had widely differing self-rated reading and writing proficiencies, so it is difficult to gauge the effects of literacy on positive transfer, a type of cross-linguistic influence in which another language leads to faster acquisition or use of the target language (Bardovi-Harlig & Sprouse, 2024), such as that which we observed with classifiers.
Regarding the effects of participants’ exposure, age of acquisition (AoA), and current usage of Mandarin on their performance: it is well established that exposure and frequency of usage are prominent factors in heritage language maintenance, language development, and L2 learning (Thordardottir, 2011; Polinsky & Scontras, 2020; Sopata & Długosz, 2022; Sánchez et al., 2023; Du, 2022), so why did these factors not correlate with any of our measures? One of the major limitations of our study was the lack of specificity in the LBQ. Although we modeled the LBQ on the one used by Anderson, Mak, Keyvani Chahi, & Bialystok (2018), with additional modifications suggested by Hao et al. (2025), we cut down its length significantly to improve participant recruitment. In doing so, we omitted a significant portion of the LBQ that would have clarified many participants’ levels of prior and current exposure to the target language. Our questions could also have been further refined, as some participants were confused about what their answers were expected to contain, which made the data difficult to process.
Additionally, while we did analyze the differences in speaker performance associated with cross-linguistic interference, the original intention of our study was to examine differences in performance mediated by manner of acquisition in Mandarin-English bilinguals (as opposed to multilinguals). We thus acknowledge that our results may have been skewed by too many mediating language background variables, especially with regard to cross-linguistic influence: some participants reported speaking or having learned up to three additional languages aside from Mandarin and English, and each additional language may have exerted its own, different kind of influence on speakers’ Mandarin learning and production. In future studies, stricter recruitment criteria for participants’ language background, such as limiting participants to those who speak or have learned only Mandarin and English, or limiting speakers of additional languages to those speaking only one other language below the B1 level on the CEFR, might improve the clarity of results.
Ethics Declaration:
All participants signed a consent form prior to the experiment and were debriefed afterwards on the main focus of the study.
References:
American Psychological Association. (n.d.). APA Dictionary of Psychology. American Psychological Association. https://dictionary.apa.org/balanced-bilingual
Anderson, J. A. E., Mak, L., Keyvani Chahi, A., & Bialystok, E. (2018). The language and social background questionnaire: Assessing degree of bilingualism in a diverse population. Behavior Research Methods, 50, 250–263. https://doi.org/10.3758/s13428-017-0867-9
Baker, M. (1988). The Hill Farm. YouTube. https://www.youtube.com/watch?v=RYCOw95Rr_k
Bardovi-Harlig, K., & Sprouse, R. A. (2026). Negative versus positive transfer. In J. I. Liontas (Ed.), The TESOL encyclopedia of English language teaching (pp. 1–6). Wiley. https://doi.org/10.1002/9781118784235.eelt0084.pub2
Bayram, F., Rothman, J., Iverson, M., Kupisch, T., Miller, D., Puig-Mayenco, E., & Westergaard, M. (2019). Differences in use without deficiencies in competence: passives in the Turkish and German of Turkish heritage speakers in Germany. International Journal of Bilingual Education and Bilingualism, 22(8), 919–939. https://doi.org/10.1080/13670050.2017.1324403
Biber, D., & Hared, M. (1991). Literacy in Somali: Linguistic Consequences. Annual Review of Applied Linguistics, 12, 260–282. doi:10.1017/S0267190500002269
Bigelow, M., & Tarone, E. (2004). The Role of Literacy Level in Second Language Acquisition: Doesn’t Who We Study Determine What We Know? TESOL Quarterly, 38(4), 689–700. https://doi.org/10.2307/3588285
Chang-Smith, M. (2010). Developmental pathways for first language acquisition of Mandarin nominal expressions: Comparing monolingual with simultaneous Mandarin—English bilingual children. International Journal of Bilingualism, 14(1), 11–35. https://doi.org/10.1177/1367006909356645
Chien, Y.-C., Lust, B., & Chiang, C.-P. (2003). Chinese Children’s Comprehension of Count-Classifiers and Mass-Classifiers. Journal of East Asian Linguistics, 12, 91–120. https://doi.org/10.1023/A:1022401006521
Comrie, B. (1976). Aspect: An introduction to the study of verbal aspect and related problems (Cambridge Textbooks in Linguistics). Cambridge University Press.
Deng, X. & Mai, Z. (2024). Input effects and unidirectional transfer in Mandarin–English heritage bilingual children: The case of spatial prepositional zai-phrases. International Journal of Bilingualism, 29(3), 651-666. https://doi.org/10.1177/13670069241229392
Du, H. (2022). Grammatical and lexical development during study abroad: Research on a corpus of spoken L2 Chinese. Foreign Language Annals, 55(4), 985–1005. https://doi.org/10.1111/flan.12631
Eisenchlas, S. A., Schalley, A. C., & Guillemin, D. (2013). The Importance of Literacy in the Home Language: The View From Australia. Sage Open, 3(4). https://doi.org/10.1177/2158244013507270
Eyring, J. L. (2014). Adult Learners in English as a Second/Foreign Language Settings. In Teaching English as a Foreign or Second Language, Third Edition: A Self-Development and Methodology Guide (4th ed., pp. 571–572). Cengage Learning.
Goldenberg, C. (2020). Reading Wars, Reading Science, and English Learners. Reading Research Quarterly, 55(S1), S131–S144. https://doi.org/10.1002/rrq.340
Gries, S. T., & Ellis, N. C. (2015). Statistical Measures for Usage-Based Linguistics. Language Learning, 65(S1), 228–255. https://doi.org/10.1111/lang.12119
Hao, J., Kubota, M., Bayram, F., González Alonso, J., Grüter, T., Li, M., & Rothman, J. (2025). Schooling and language usage matter in heritage bilingual processing: Sortal classifiers in Mandarin. Second Language Research, 41(4), 649–674. https://doi.org/10.1177/02676583241270900
Jia, R., & Paradis, J. (2015). The use of referring expressions in narratives by Mandarin heritage language children and the role of language environment factors in predicting individual differences. Bilingualism: Language and Cognition, 18(4), 737–752. https://doi.org/10.1017/S1366728914000728
Klein, W., Li, P., & Hendriks, H. (2000). Aspect and Assertion in Mandarin Chinese. Natural Language & Linguistic Theory, 18, 723–770. https://doi.org/10.1023/A:1006411825993
Li, J., & Bayley, R. (2008). The (re)acquisition of perfective aspect marking by Chinese heritage language learners. In A. W. He & Y. Xiao (Eds.), Chinese as a heritage language: Fostering rooted world citizenry (pp. 205-221). National Foreign Language Resource Center, University of Hawaiʻi at Mānoa.
Montrul, S., & Silva-Corvalán, C. (2019). The Social Context Contributes to the Incomplete Acquisition of Aspects of Heritage Languages. Studies in Second Language Acquisition, 41(2), 269–273. https://doi.org/10.1017/S0272263119000354
Paradis, J. (2023). Sources of individual differences in the dual language development of heritage bilinguals. Journal of Child Language, 50(4), 793-817. https://doi.org/10.1017/S0305000922000708
Polinsky, M., & Scontras, G. (2020). Understanding heritage languages. Bilingualism: Language and Cognition, 23(1), 4–20. doi:10.1017/S1366728919000245
Sánchez, L., Goldin, M., Hur, E., Jimenez, A., López Otero, J. C., Thane, P., Austin, J., & Markovits Rojas, J. (2023). Dominance, Language Experience, and Increased Interaction Effects on the Development of Pragmatic Knowledge in Heritage Bilingual Children: Acceptance of Null and Overt Subjects in Spanish and English. Heritage Language Journal, 20(1), 1-39. https://doi.org/10.1163/15507076-bja10012
Shadiev, R., Liu, T., & Hwang, W. (2020). Review of research on mobile‐assisted language learning in familiar, authentic environments. British Journal of Educational Technology, 51(3), 709–720. https://doi.org/10.1111/bjet.12839
Shortt, M., Tilak, S., Kuznetcova, I., Martens, B., & Akinkuolie, B. (2023). Gamification in mobile-assisted language learning: a systematic review of Duolingo literature from public release of 2012 to early 2020. Computer Assisted Language Learning, 36(3), 517–554. https://doi.org/10.1080/09588221.2021.1933540
Sopata, A., & Długosz, K. (2022). The effects of language input on word order in German as a heritage and majority language. Language Acquisition, 29(2), 198–228. https://doi.org/10.1080/10489223.2021.1992409
Thordardottir, E. (2011). The relationship between bilingual exposure and vocabulary development. International Journal of Bilingualism, 15(4), 426-445. https://doi.org/10.1177/1367006911403202
Tong, X. & Shirai, Y. (2016). L2 acquisition of Mandarin zai and -le. Chinese as a Second Language Research, 5(1), 1-25. https://doi.org/10.1515/caslar-2016-0001
Appendix:
Video: The Hill Farm (Baker, 1988)
Presentation shown to participants