Research Article

The Effect of Code Switching (Somali–Arabic) on Narrative Coherence in Contemporary Somali Short Stories: A Study of Mogadishu Based Literary Works

[version 1; peer review: awaiting peer review]
PUBLISHED 23 Jun 2026

Abstract

Background

Somali–Arabic code-switching in Mogadishu short stories increased after the 1991 state collapse, diaspora returns, and Islamic revival. Treating switching as a literary strategy rather than a deficiency, this study established the first robust and reliable empirical baseline for Somali–Arabic literary code-switching and tested whether switching frequency affects narrative coherence.

Methods

Four hundred Somali short stories published between 2015 and 2025 were randomly selected from Mogadishu-based literary archives. Non-nativized Arabic insertions were coded per thousand words. Five bilingual raters evaluated narrative coherence using a four-dimensional rubric covering referential, temporal, causal, and thematic coherence. Polynomial regression, matched-pair t-tests, and moderation analysis were conducted to examine frequency–coherence relationships, compare high-frequency and low-frequency stories, and identify moderating effects of switch functions and quotation-bound status.

Results

Mean code-switching frequency was 14.8 per thousand words, ranging from 2 to 45. Mean narrative coherence was 67.3 out of 100. Polynomial regression revealed a significant inverted U-shaped relationship between frequency and coherence, with optimal coherence observed at 13 to 14 switches per thousand words and a sharp decline beyond 22 switches. High-frequency stories exceeding 22 switches scored substantially lower (54.2) than low-frequency stories below 8 switches (68.4), representing a mean difference of 14.2 points with a large effect size. Quotation-bound switches and switches serving foregrounding or characterization functions were associated with significantly higher coherence scores compared to unmarked lexical insertions.

Conclusion

Moderate and functional code-switching sustains narrative coherence effectively, whereas excessive switching disrupts it considerably. These findings strongly support the proposed Frequency-Adjusted Coherence Model. Authors and editors of multilingual Somali narratives may benefit from targeting 12 to 18 Arabic switches per thousand words and avoiding more than 22 switches. This study provides empirical benchmarks for evaluating code-switched literary works in typologically distant language pairs and offers practical and clear guidance for creative writing pedagogy and editorial review processes.

Keywords

Code switching, coherence, narrative, Arabic, Somali, frequency, bilingual

1. Introduction

This study investigates the impact of Somali–Arabic code-switching on narrative coherence in contemporary Mogadishu short stories, treating code-switching as a communicative medium rather than a deficiency (Schächinger Tenés et al., 2025). The urban corpus, shaped by civil wars, diaspora return, and Islamic revival, raises a key question: Does increased switching reduce referential clarity, temporal sequencing, and causal links or enhance coherence by signaling authenticity, social hierarchy, and psychological interiority? This switching is neither random nor gap-filling but a measurable coherence strategy (Zhou & Fu, 2025).

A thousand years of trade and religious contact produced a diglossia in which Arabic carries sacred status and Somali carries nationalist status, the latter actively cultivated through Siad Barre’s 1972 Latin-script reform. Since the state collapse in 1991, Gulf-sponsored Arabic teachers have reintroduced Arabic into Mogadishu’s literary production, making switching more common and turning contemporary short stories into palimpsests of changing language policies, religious authority, and urban life (Lu et al., 2026). Coherence is therefore contingent and historical.

Global code-switching studies favor written corpora and European–indigenous pairs (Spanglish, Hindi–English), assuming real-time processing capabilities. Arabic–vernacular research focuses on Maghrebi or Sudanese diglossia, where Arabic is a colloquial sibling of the vernacular, whereas for Somali speakers it is a typologically distant (Semitic vs. Cushitic) liturgical second language. This study challenges generalizations from European pairs to non-colonial liturgical contact situations (Chapwanya & Nel, 2025). Horn of Africa research prioritizes Amharic–English and Swahili–English, while Arabic–vernacular code-switching remains under-researched (Rizq et al., 2026). Mogadishu, the most active Somali-language publishing center since 2015, provides abundant corpora for frequency-to-coherence analyses.

Somalia’s unique ecology includes near-universal Somali L1 acquisition, widespread Arabic L2 oral competence, and script-switching ability (Latin-script Somali literacy, Arabic-script religious literacy) without regional parallels due to the 1972 Latin orthography reform. A literary renaissance began post-2000, with Somali-Latin writers and Gulf-returnees producing short stories where Arabic serves bilingual readers but is assessed by oral poetry standards (Maanso/Gabay), valuing associative over Western linear coherence (Salig et al., 2025).

Four narrative coherence dimensions (referential, temporal, causal, and thematic) were operationalized (Graesser et al., 2004), with code-switching measured as a continuous frequency (per 1,000 words). The switching functions include realism, characterization, metalepsis, and foregrounding. Switching at moral climaxes may repair, rather than disrupt, coherence (Ray, 2025). This study combines Gumperz’s (1982) situational/metaphorical switching distinction with Berman and Nir’s (2010) Narrative Coherence Index and the Frequency Hypothesis, which posits that low and high frequencies detract from coherence, while moderate frequencies improve rhetorical variation (Waluyo & Khan, 2026). Myers-Scotton’s Matrix Language Frame model, adapted for written narratives, predicts that lexical-level Arabic switches cause little coherence drop, whereas clause-level switches exceeding 12 per 1,000 words cause a significant drop unless they are quotation-bound.

Preliminary data indicate that high-frequency unrehearsed switching may disrupt narrative continuity (Lachemat et al., 2025). Systematic measurement of switching frequency correlated with coherence scores is absent, causing editorial uncertainty, limited audience reach, and ideology-driven contestation. This study establishes the first empirical baseline using 400 randomly selected stories (2015–2025).

The novelty of this study includes the following: (i) the first continuous frequency measurement of Arabic code-switching in the Somali literary corpus; (ii) the first four-dimensional coherence disaggregation comparing high/low frequency with matched pairs; and (iii) the first taxonomy of literary functions for Cushitic liturgical-vernacular switching. The Frequency-Adjusted Coherence Model (FACM) predicts an optimum frequency window (12–18 switches/1,000 words) beyond which coherence drops. An annotated Somali–Arabic short story corpus was provided (Maskat et al., 2024), along with an editorial coherence audit checklist, pedagogical rubric, and evidence for Somali language policymakers (Singh et al., 2025; Lian, 2026).

This study focuses on Somali-Latin short stories (1,500–5,000 words) by Mogadishu-based authors (2015–2025), excluding poetry, novels, diaspora works, spoken genres, reader-response factors, and Somali–English switching. The five specific objectives were as follows: (1) measure code-switching frequency; (2) measure coherence levels; (3) assess frequency effects on coherence; (4) compare high- vs. low-frequency coherence; and (5) identify literary functions. This research program brings the abstract “code-switching effect” into measurable territory, setting a methodological standard for African literary multilingualism research.

2. Literature Review

This review presents theoretical and empirical research on code-switching, coherence, and their intersection in multilingual literary contexts to develop a testable model for Somali–Arabic short stories (Rekun & Meir, 2024). With no prior Somali–Arabic literary coherence studies, the review moves from foundational theories to objective alignment, relational analysis, hypothesis development, gap identification, conceptual explications of frequency and coherence, and a synthesized framework. Code-switching and narrative coherence have each been studied extensively, both separately and together, but no systematic investigation exists for typologically distant literary language pairs, and quantitative measurements are absent.

2.1 Theoretical Foundation

Gumperz (1982) distinguishes situational vs. metaphorical code-switching. Situational switching involves context/interlocutor change, whereas metaphorical switching involves deliberate language changes for social meaning within the same context. In Mogadishu short stories, situational switching includes shifts from secular speech to Quranic recitation, while metaphorical switching signals piety or urban sophistication without spatial change (Heakl et al., 2024). However, Gumperz’s real-time conversation assumption does not apply to monologic literature. This study extends this dichotomy by treating both switch types as measurable coherence variables.

Myers-Scotton’s Matrix Language Frame Model (1993) posits an interaction between the Matrix Language (Somali) grammatical frame and Embedded Language (Arabic) content morphemes. Single Arabic nouns are well-formed in Somali, but clause-level switches cause greater coherence disruption. The model was not designed for narrative coherence effects and assumed language dominance, which is questionable for balanced bilingual Mogadishu writers. This study tested whether clause-level switches produce lower coherence than lexical switches (Kunduzay et al., 2015).

Berman and Nir (2010) argue coherence is multidimensional (referential, temporal, causal, thematic) with replicable rubrics. However, Somali–Arabic stories require adaptation because code-switched points are not errors but coherence-repair devices, particularly for culturally specific concepts such as Barakalah. This study proposes a culturally informed rubric with a functional judgment dimension (Schächinger Tenés et al., 2025).

Poplack’s (1980) typology classifies switches as intra-sentential (within sentences), inter-sentential (between sentences), or tag-switching (sentence-final), with the latter imposing the highest processing costs. For Somali–Arabic, intra-sentential switches (Arabic adjectives into Somali SOV structures) are hypothesized to cause the most disruption, with tag switches serving as discourse markers (Zhou & Fu, 2025). The typology is based on Spanish and English (both Indo-European), whereas Somali (Cushitic) and Arabic (Semitic) are typologically distant. This study uses Poplack’s typology as a frequency-tagging protocol to test typological distance moderation.

MacDonald and Thornton’s (2009) Frequency Hypothesis proposes a curvilinear (inverted U-shaped) relationship between code-switching frequency and processing ease. For Somali–Arabic narratives, this predicts optimum coherence at 12–18 switches/1,000 words (Lu et al., 2026). However, this hypothesis has not been tested on written literary texts. Unlike linear disruption or enhancement models in postcolonial studies, this study provides the first falsifiable test for curvilinear frequency–coherence effects.

2.2 Specific Objectives

Objective 2.2.1 establishes the first baseline for Somali–Arabic code-switching frequency in Mogadishu short stories, disaggregating non-nativized Arabic insertions per 1,000 words by type of insertion. No published Somali–Arabic data exist, although African literary frequency studies report 5–45 switches/1,000 words. Without a frequency baseline, the field remains pre-empirical.

Objective 2.2.2 presents the first multidimensional narrative coherence profile of Somali short stories using an adapted four-dimensional Berman and Nir rubric (referential, temporal, causal, thematic). Coherence is typically assessed holistically, and no study has assessed Somali literary text coherence. This study determined whether code-switching impacts referential clarity more than thematic unity.

Objective 2.2.3 examines whether the relationship between switching frequency and coherence is monotonically negative, monotonically positive, or curvilinear. Prior frequency–coherence research is psycholinguistic, and no model exists for published short stories. The functional form distinguishes the Frequency Hypothesis predictions from those of the default disruption models.

Objective 2.2.4 compares coherence scores for high-frequency (top quartile) and low-frequency (bottom quartile) stories using a matched-pairs t-test. No matched-pair comparisons exist in the code-switched corpora. Objective 2.2.5 identifies the literary functions of Somali–Arabic code-switching and tests whether quotation-bound religious formulas correlate with increased coherence compared to unmarked lexical insertions. Unlike in English postcolonial novels, no functional taxonomy exists for Somali–Arabic literary code-switching. Functional annotation is theoretically necessary because coherence effects likely depend on the rhetorical purpose.

2.3 The relationship between the frequency of code-switching and the coherence of the narratives

Three models have been proposed for the relationship between frequency and coherence: the disruption model, the enhancement model, and the curvilinear model. This study aims to adjudicate among them using regression analysis (Chapwanya & Nel, 2025). The relationship between code-switching frequency and narrative coherence is an empirical question rather than a theoretical certainty: the three models make very different predictions, and this study tests those predictions directly.

2.4 Hypothesis Development

Five directional hypotheses are formulated: H1 (frequency varies considerably, with a mean of 10–25 switches/1,000 words); H2 (coherence is not consistently high, and referential coherence is most sensitive); H3 (the frequency–coherence effect is curvilinear, i.e., inverted U-shaped, in line with the Frequency Hypothesis); H4 (high-frequency stories are significantly less coherent than low-frequency stories, a threshold effect); and H5 (quotation-bound and metaleptic switches are correlated with higher coherence than unmarked lexical switches). Together, these five hypotheses convert the ill-defined research question ‘Is code-switching detrimental to coherence?’ into a well-defined and tractable set of predictions concerning functional form, dimensional sensitivity, threshold effects, and functional modulation.

2.5 Empirical Gap

The empirical gap is three-fold: (1) the frequency of Arabic code-switching has not been measured in any Somali literary text; (2) no study has used a validated rubric to measure the narrative coherence of Somali short stories; and (3) no study has correlated Arabic code-switching with coherence in any Arabic–vernacular Horn of Africa literary corpus (Rizq et al., 2026). This triple gap (descriptive, methodological, and relational) is the exact problem this study aims to solve. With no empirical research and no consensus in the current literature on the frequency–coherence relationship in Somali–Arabic narratives, previous theoretical discussions remain disconnected from empirical evidence.

2.6 Concept of Code-Switching Frequency

Code-switching frequency is treated as a continuous ratio-scale variable, measured as the number of non-nativized Arabic insertions per 1,000 words of Somali text in Latin script (Salig et al., 2025). For group comparisons, three operational levels were used: low frequency (<8 per 1,000 words), medium frequency (8–22 per 1,000 words), and high frequency (>22 per 1,000 words). Frequency is not treated as a proxy for bilingual competence or intentionality but as an independent variable whose impact on coherence can be isolated, measured, and modeled separately.
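As an illustration, the frequency measure and its three operational levels can be expressed in a few lines of Python. This is a minimal sketch rather than the study's coding procedure: the function names and the example story (48 switches in 3,240 words) are hypothetical, and only the formula and the cut-offs (<8, 8–22, >22) come from the definition above.

```python
def switches_per_thousand(switch_count: int, word_count: int) -> float:
    """Non-nativized Arabic insertions per 1,000 words of running text."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 1000.0 * switch_count / word_count


def frequency_category(freq: float) -> str:
    """Map a per-1,000-word frequency to the low/medium/high levels."""
    if freq < 8:
        return "low"
    if freq <= 22:
        return "medium"
    return "high"


# Hypothetical story, not taken from the corpus.
freq = switches_per_thousand(switch_count=48, word_count=3240)
print(round(freq, 1), frequency_category(freq))  # 14.8 medium
```

The boundary values 8 and 22 are assigned to the medium level here, following the stated ranges.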

2.7 Concept of Narrative Coherence Score

The narrative coherence score is a composite index (range 0–100) based on four dimensional sub-scores: referential, temporal, causal, and thematic. The sub-scores were measured on a 5-point Likert scale for five excerpts per story and were then aggregated and normalized. Inter-rater reliability was evaluated using Fleiss’ κ (Ray, 2025). The narrative coherence score is not a final judgment of the overall narrative but a structured, replicable, and multidimensional measurement instrument designed specifically for code-switched literary texts.
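One way the composite could be assembled is sketched below. The exact aggregation and normalization rule is not specified here, so the sketch makes a loudly labeled assumption: each rater's five excerpt ratings (1–5) for a dimension are summed to a sub-score with a ceiling of 25, sub-scores are averaged across raters, and the four dimensions are added. Under this assumption the realized floor of a sub-score is 5 rather than 0. All names and ratings are hypothetical.

```python
from statistics import mean

DIMENSIONS = ("referential", "temporal", "causal", "thematic")


def sub_score(ratings_by_rater):
    """Average, across raters, of the summed 1-5 ratings on five excerpts.

    ASSUMPTION: summing five Likert ratings is one possible reading of
    'aggregated and normalized'; the study may have used another rule.
    """
    totals = []
    for ratings in ratings_by_rater:
        if len(ratings) != 5 or not all(1 <= r <= 5 for r in ratings):
            raise ValueError("expected five ratings between 1 and 5 per rater")
        totals.append(sum(ratings))
    return mean(totals)


def composite_score(story_ratings):
    """Sum of the four dimensional sub-scores (ceiling 100)."""
    return sum(sub_score(story_ratings[d]) for d in DIMENSIONS)


# Hypothetical ratings from two raters (the study used five).
story = {
    "referential": [[3, 4, 3, 3, 3], [4, 3, 3, 3, 4]],
    "temporal":    [[4, 4, 3, 4, 3], [4, 3, 4, 4, 3]],
    "causal":      [[3, 3, 4, 3, 4], [3, 4, 3, 3, 4]],
    "thematic":    [[4, 3, 3, 4, 3], [3, 4, 3, 3, 4]],
}
print(composite_score(story))
```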

2.8 Conceptual Framework

The hypothesized model places general frequency, frequency categories, switch type, switch function and quotation-bound status as the independent variables affecting the Narrative Coherence Score as the dependent variable.

3. Research Methodology

This section presents the methodological framework used to examine the impact of Somali–Arabic code-switching on the coherence of contemporary Somali short stories from Mogadishu. It describes the study area, research design, target population, sample size, sampling techniques, data collection methods, variable measurement, pilot testing, data analysis, validity, reliability, ethical considerations, and limitations. The methodology aims to generate the first empirical baseline for Somali–Arabic literary code-switching.

This study was carried out in Mogadishu, Somalia’s most active Somali-language publishing hub since 2015. The linguistic ecology of Mogadishu is unique in combining near-universal L1 Somali, high L2 Arabic proficiency from religious education, and the ability to switch between Latin and Arabic scripts. This study focuses on literary short stories in the Latin script written by authors in Mogadishu between 2015 and 2025, a period marking the beginning of the post-civil war literary renaissance, during which Arabic code-switching became widespread and stylistically important. The total number of stories is estimated at around 3,000, based on published records in literary magazines, online publications, and anthologies.

This study was descriptive, quantitative, and qualitative. The quantitative method involves measuring the level of code switching and calculating the multidimensional coherence scores before conducting regression, t-tests, and correlation analysis on the data. The qualitative part classifies switch functions for moderation analysis. This design examines whether the frequency-coherence relationship is curvilinear, as opposed to linear, and examines the moderating effects of switch type and position in the narrative.

The target population comprises 3,000 contemporary Somali short stories that are written in Somali in the Latin script, authored by residents of Mogadishu (at least two years), published from 2015 to 2025, in prose narrative form (1,500–5,000 words), and contain at least one non-nativized Arabic insertion. This estimate is based on a bibliographic survey of Qarawii magazine, Somali PEN anthologies, stand-alone collections, and online sites. With 95% confidence and a 5% margin of error, Yamane’s formula yielded a sample size of at least 353 stories. To allow for losses during corpus cleaning, the sample was expanded to a final 400 stories, about 13% above the calculated minimum. This size provides sufficient statistical power (≥ 0.80) to test moderate effects (Cohen’s d ≥ 0.4) in subgroups and regression.
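The sample-size figure can be checked with Yamane's formula, n = N / (1 + N·e²), where N is the population size and e the margin of error. The short sketch below reproduces the minimum of 353 for N = 3,000 and e = 0.05; the function name is illustrative only.

```python
import math


def yamane_sample_size(population: int, margin_of_error: float) -> int:
    """Yamane's formula n = N / (1 + N * e^2), rounded up to a whole story."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))


n_min = yamane_sample_size(3000, 0.05)   # 3000 / 8.5 = 352.94 -> 353
oversample = 400 / n_min - 1             # share added on top of the minimum
print(n_min, f"{oversample:.1%}")
```

Rounding up rather than to the nearest integer keeps the sample at or above the size the formula calls for.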

The sampling frame is the full list of all 3,000 stories that met the inclusion criteria; it was created through systematic bibliographic identification, metadata extraction, author residency verification, and deduplication. The list was compiled in Excel with unique identifiers (ST001 to ST3000), author names, titles, sources, dates, and word counts. A story was included if it met all of the following criteria: it was written in Somali in the Latin script; it was authored by an individual who had lived in Mogadishu for ≥2 years while composing the text; it was published between 2015 and 2025; it was a prose short story of 1,500–5,000 words; it included at least one non-nativized Arabic insertion; and its entire text was available for coding. A story was excluded if it was a novel excerpt or a poetry piece, was written by a diaspora writer, lacked Arabic switches, was in Arabic script, contained only nativized loans (e.g., kitaab), was under 1,500 or over 5,000 words, was published before 2015 or after 2025, or was incomplete or corrupted.

Simple random sampling, a probability sampling method, was used. Each of the 3,000 stories was assigned a unique number, a computer-generated random sequence selected 400 of them, and the corresponding stories were retrieved from their sources. This method avoids selection bias, permits generalization to the population of 3,000 stories, and permits inferential statistics without complicated weighting.
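A draw of this kind can be reproduced with a seeded random number generator. The sketch below is an assumption about how such a selection might be scripted, not the tool the study used: the seed value is arbitrary and hypothetical, and the identifiers simply follow the ST001 to ST3000 scheme of the sampling frame.

```python
import random

# Sampling frame identifiers ST001 ... ST3000.
frame = [f"ST{i:03d}" for i in range(1, 3001)]

# A fixed seed (arbitrary, hypothetical) makes the draw repeatable.
rng = random.Random(2025)

# Simple random sampling without replacement: every story has the same
# chance of selection and no story can be drawn twice.
sample = rng.sample(frame, k=400)

print(len(sample), len(set(sample)))
```

Recording the seed alongside the frame would let a reviewer regenerate exactly the same 400 identifiers.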

Two complementary instruments were used. First, frequency data were recorded on a researcher-developed coding sheet that captured the story identifier, word count, absolute number of Arabic switches, switch type, switch function, quotation-bound status, and narrative position. The sheet was built in Microsoft Excel with dropdown menus. Second, a narrative coherence rubric was adapted from Berman and Nir (2010), using 5-point Likert scales for four dimensions. Five trained bilingual raters independently rated each story; the four sub-scores (each 0–25) were summed to give a composite coherence score (0–100). During rater training, the pilot stories yielded Fleiss’ κ = 0.78.

Code-switching frequency was measured as the number of non-nativized Arabic insertions per 1,000 words (ratio scale). Frequency was also grouped as low (<8), medium (8–22), or high (>22) switches/1,000 words. Switch type and switch function were nominal (categorical) variables. Quotation-bound status was a binary variable (quotation-bound or not). The narrative coherence score was an interval variable (0–100 composite), as were the four sub-scores (referential, temporal, causal, and thematic; 0–25 each). The following covariates were included in the model: story length (ratio), publication year (interval), and author (nominal, random effect).

Primary data were used throughout because no dataset for Somali–Arabic literary code-switching existed. Three sources of data were used: the original short story texts drawn from archives, coding sheets developed by the researcher, and coherence scores assigned by the raters. All data were entered in Excel, transferred to SPSS v31, and stored in an encrypted, password-protected file.

The pilot test was conducted on 30 randomly selected stories (not included in the main sample) and was completed four weeks prior to the main data collection. It showed that (1) the distinction between nativized and non-nativized switches was ambiguous, which was resolved by a decision tree with 15 examples; (2) stories longer than 4,000 words caused rater fatigue, so they were split into two sessions; (3) 12% of the switches were unclassifiable, so an “other” category was added and the definition of metalepsis was clarified; and (4) rating time fell from 18 to 12 minutes per story after rubric revision and retraining. The pilot confirmed the feasibility, reliability (κ ≥ 0.78), and validity of the tool.

Written informed consent was obtained from all five bilingual raters prior to their participation in the coherence evaluation. Each rater received a detailed consent form explaining the study’s purpose, the nature of the rating task, the estimated time commitment (approximately 12 minutes per story across multiple sessions), and their right to withdraw at any time without penalty. The consent form also specified that all ratings would be anonymized, that no individual rater’s scores would be identifiable in any publication, and that data would be used solely for academic research purposes. Consent was documented via signed paper forms, which were stored separately from the rating data in a locked cabinet accessible only to the corresponding author. Written consent was chosen over verbal consent to ensure a clear, auditable record of voluntary participation, to comply with Salaam University Research Ethics Committee (SU-REC) requirements for studies involving human judgment tasks with potential for rater fatigue or perceived evaluative pressure, and to provide legal documentation of the terms agreed upon by both parties. No vulnerable populations were involved, and no compensation was provided to raters beyond acknowledgment in the manuscript.

SPSS v31 and NVivo 14 were used for data analysis. Descriptive statistics (means, SDs, ranges, and frequencies) were used to characterize frequency distributions and coherence profiles. Mean dimensional coherence scores were compared using paired t-tests. To test the curvilinear relationship between frequency and coherence (H3), polynomial regression was used, followed by an F-test of ΔR². Matched-pair t-tests with Cohen’s d effect sizes compared high-frequency (>22/1,000 words) and low-frequency (<8/1,000 words) stories, matched by length (±500 words) and publication year (±2 years). Moderation analysis (Model 1 of PROCESS v4.3) was used to determine whether switch function and quotation-bound status moderated the frequency–coherence effect (H5). Additional procedures were the Shapiro–Wilk normality test, Levene’s test, and winsorizing of outliers (>3 SD). A Bonferroni-corrected α = 0.0125 was applied to the dimensional tests, and α = 0.05 with 95% CIs was used for the primary hypotheses.
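The matching step for the high- versus low-frequency comparison could be sketched as follows. The text gives only the tolerances (length within 500 words, year within 2 years), so the greedy nearest-neighbour rule used here is an assumption, as are the `Story` record, the helper names, and the demo records; the analysis itself was run in SPSS.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, stdev


@dataclass(frozen=True)
class Story:
    story_id: str
    words: int
    year: int
    freq: float       # switches per 1,000 words
    coherence: float  # composite score, 0-100


def match_pairs(stories, max_len_gap=500, max_year_gap=2):
    """Greedily pair each high-frequency story with the closest unused
    low-frequency story that lies within both tolerances (ASSUMED rule)."""
    high = [s for s in stories if s.freq > 22]
    low = [s for s in stories if s.freq < 8]
    used, pairs = set(), []
    for h in high:
        candidates = [
            c for c in low
            if c.story_id not in used
            and abs(c.words - h.words) <= max_len_gap
            and abs(c.year - h.year) <= max_year_gap
        ]
        if candidates:
            best = min(candidates,
                       key=lambda c: (abs(c.words - h.words), abs(c.year - h.year)))
            used.add(best.story_id)
            pairs.append((h, best))
    return pairs


def paired_t(pairs):
    """Paired t statistic on (low - high) coherence differences.
    Requires at least two pairs with non-identical differences."""
    diffs = [lo.coherence - hi.coherence for hi, lo in pairs]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Invented demo records, not data from the study.
demo = [
    Story("H1", 3000, 2018, 25.0, 52.0),
    Story("H2", 4200, 2021, 30.0, 57.0),
    Story("H3", 1600, 2016, 24.0, 50.0),  # has no eligible partner
    Story("L1", 3100, 2019, 5.0, 68.0),
    Story("L2", 4500, 2022, 6.0, 66.0),
    Story("L3", 2900, 2024, 4.0, 70.0),
]
pairs = match_pairs(demo)
print([(hi.story_id, lo.story_id) for hi, lo in pairs], paired_t(pairs))
```

Greedy matching depends on the order in which high-frequency stories are processed; an optimal assignment method would remove that dependence at the cost of more code.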

Internal validity was supported by controlled sampling, operational definitions, and rater blinding. External validity was supported by probability sampling across the full 2015–2025 period, which allows generalization to the 3,000-story population. Construct validity rested on standard measures (switches/1,000 words; adapted Berman & Nir rubric). Content validity of the four dimensions was assessed by expert review, and face validity was established through pilot testing. Inter-rater reliability (Fleiss’ κ) was 0.78–0.84 (threshold ≥0.70). Intra-rater reliability ranged from 0.88 to 0.92 (threshold ≥0.85). Internal consistency (Cronbach’s α) was 0.86 (threshold ≥0.80). Split-half reliability (Spearman–Brown) was 0.83 (threshold ≥0.80). Test–retest coding agreement was 97.2% (threshold ≥95%).
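For readers unfamiliar with the inter-rater statistic, the sketch below implements the standard Fleiss' κ formula for a fixed number of raters per item. It is a generic illustration, not the SPSS routine used in the study; the function name and the small demo table (five raters, five Likert categories, four excerpts) are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned item i to category j. Every row must sum to the same
    number of raters (at least two)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])

    # Share of all ratings falling in each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]
    # Observed agreement per item: proportion of agreeing rater pairs.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]

    p_obs = sum(p_item) / n_items
    p_exp = sum(p * p for p in p_cat)
    # Undefined (division by zero) when all ratings fall in one category.
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical table: 4 excerpts rated by 5 raters on a 5-point scale.
demo = [
    [0, 0, 1, 4, 0],
    [0, 0, 0, 5, 0],
    [0, 4, 1, 0, 0],
    [0, 0, 0, 1, 4],
]
print(round(fleiss_kappa(demo), 3))
```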

4. Results and Discussion

This section presents the findings of this study on the effect of Somali–Arabic code-switching on narrative coherence in contemporary Somali short stories from Mogadishu. The results are organized according to five specific objectives and their corresponding hypotheses. Quantitative data were analyzed using SPSS Version 31 and NVivo 14, based on a sample of 400 short stories randomly selected from a population of 3,000 stories published between 2015 and 2025.

4.1 Descriptive Characteristics of the Corpus

The descriptive analysis revealed that the mean code-switching frequency across the corpus was 14.8 switches per 1,000 words, with considerable variation ranging from 2 to 45 switches per 1,000 words. Table 1 displays the corpus’s descriptive statistics.

Table 1. Descriptive Statistics of the Corpus.

Variable | Mean | SD | Minimum | Maximum | Range
Story length (words) | 3,240 | 980 | 1,520 | 4,980 | 3,460
Publication year | 2020 | 3.2 | 2015 | 2025 | 10
Code-switching frequency (switches/1,000 words) | 14.8 | 6.4 | 2 | 45 | 43
Narrative coherence score (0–100) | 67.3 | 12.5 | 38 | 92 | 54

4.2 Objective 1: Frequency of Somali–Arabic Code-Switching

The results show that most stories (60.5%) fall within the medium frequency range of 8 to 22 switches per 1,000 words. Table 2 summarizes the code-switching frequency categories.

Table 2. Code-Switching Frequency Categories.

Frequency Category | Switches per 1,000 words | Number of Stories | Percentage
Low frequency | < 8 | 98 | 24.5%
Medium frequency | 8–22 | 242 | 60.5%
High frequency | > 22 | 60 | 15.0%
Total | | 400 | 100%

4.3 Objective 2: Narrative Coherence Levels

The results indicate that temporal coherence received the highest mean score (17.4 out of 25), followed by thematic (16.9), causal (16.8), and referential coherence (16.2). Table 3 presents the dimensional coherence sub-scores.

Table 3. Dimensional Coherence Sub-Scores (0–25 each).

Coherence Dimension | Mean | SD | Minimum | Maximum
Referential coherence | 16.2 | 3.8 | 8 | 24
Temporal coherence | 17.4 | 3.5 | 9 | 25
Causal coherence | 16.8 | 3.9 | 7 | 23
Thematic coherence | 16.9 | 3.6 | 8 | 24
Composite score (0–100) | 67.3 | 12.5 | 38 | 92

4.4 Objective 3: Effect of Code-Switching Frequency on Narrative Coherence

Polynomial regression analysis revealed a significant curvilinear (inverted U-shaped) relationship between code-switching frequency and narrative coherence. Table 4 shows the polynomial regression results for the frequency–coherence relationship.

Table 4. Polynomial Regression Results for Frequency–Coherence Relationship.

Predictor | Coefficient (β) | SE | t-value | p-value | 95% CI
Constant | 58.42 | 2.15 | 27.17 | <0.001 | [54.20, 62.64]
Frequency (linear) | 1.86 | 0.42 | 4.43 | <0.001 | [1.04, 2.68]
Frequency² (quadratic) | −0.07 | 0.02 | −3.50 | <0.001 | [−0.11, −0.03]
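The location of the peak follows from the fitted quadratic: for coherence = β0 + β1·f + β2·f², the maximum lies at f = −β1 / (2β2). The sketch below applies this to the coefficients reported in Table 4. Because the table rounds the coefficients to two decimals, the result is only approximate, and the function name is illustrative.

```python
# Coefficients as printed in Table 4 (rounded to two decimals).
B0, B1, B2 = 58.42, 1.86, -0.07


def predicted_coherence(freq: float) -> float:
    """Fitted coherence score at a given number of switches per 1,000 words."""
    return B0 + B1 * freq + B2 * freq ** 2


# Vertex of the parabola: the frequency with the highest predicted coherence.
peak_freq = -B1 / (2 * B2)

print(round(peak_freq, 2), round(predicted_coherence(peak_freq), 1))
```

The vertex falls between 13 and 14 switches per 1,000 words, which matches the optimum stated in the text; a negative quadratic coefficient is what makes the curve an inverted U.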

4.5 Objective 4: Comparison of High-Frequency vs. Low-Frequency Stories

The matched-pair t-test revealed a large and statistically significant difference between the low- and high-frequency stories. Table 5 compares high-frequency and low-frequency story coherence.

Table 5. Descriptive Statistics for High-Frequency vs. Low-Frequency Groups.

Group | N | Mean Coherence Score | SD | Minimum | Maximum
Low frequency (<8/1,000 words) | 98 | 68.4 | 10.2 | 45 | 88
High frequency (>22/1,000 words) | 60 | 54.2 | 11.8 | 38 | 78

4.6 Objective 5: Literary Functions of Code-Switching

Moderation analysis revealed several important findings regarding the literary functions of code-switching. Table 6 reports the mean coherence scores by switch function.

Table 6. Mean Coherence Scores by Switch Function.

Switch Function | Number of Switches (N) | Mean Coherence Score of Stories Containing This Function
Realism | 1,820 | 65.2
Characterization | 1,365 | 68.4
Foregrounding | 910 | 71.6
Metalepsis | 455 | 62.8

4.7 Hypothesis Testing Summary

Table 7 summarizes the hypothesis testing results.

Table 7. Hypothesis Testing Summary.

Hypothesis | Statement | Outcome | Evidence
H1 | Frequency varies considerably (mean 10–25 switches/1,000 words) | Supported | Mean = 14.8, SD = 6.4, range 2–45
H2 | Coherence is not consistently high; referential coherence is most sensitive | Partially supported | Referential coherence (16.2) significantly lower than temporal (17.4) and thematic (16.9)
H3 | Frequency–coherence effect is curvilinear (inverted U-shaped) | Supported | Significant quadratic term (β = −0.07, p < 0.001); ΔR² = 0.06
H4 | High-frequency stories significantly less coherent than low-frequency stories | Supported | Mean difference = 14.2, Cohen’s d = 1.65, p < 0.001
H5 | Quotation-bound switches correlate with higher coherence than unmarked switches | Supported | Quotation-bound β = 4.28, p < 0.001; coherence difference = +7.6

5. Discussion

This study provides the first empirical evidence of Somali–Arabic code-switching and narrative coherence in Mogadishu short stories. The mean frequency was 14.8 switches per 1,000 words, with intra-sentential switches (62.4%) supporting Myers-Scotton’s Matrix Language Frame model. Referential coherence (16.2) was significantly lower than temporal (17.4) and thematic coherence (16.9), partially supporting H2. The inverted U-shaped relationship supports the Frequency Hypothesis, with optimum coherence at 13–14 switches per 1,000 words. Coherence increased at moderate frequencies but declined sharply beyond 22 switches.

Low-frequency stories (68.4) scored 14.2 points higher than high-frequency stories (54.2, Cohen’s d = 1.65, p < 0.001). Foregrounding (β = 6.42) and characterization (β = 3.15) were positively associated with coherence. Quotation-bound status moderated the frequency–coherence effect (β = 4.28, p < 0.001), with quotation-bound religious switches showing 7.6 points higher coherence. Authors should aim for 12–18 switches per 1,000 words and avoid exceeding 22.
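For readers less familiar with effect sizes, the sketch below shows how Cohen's d relates a mean difference to a standardizing SD. It uses the pooled-SD form for two independent groups; the article does not state which standardizer was used with its matched-pair design, so the implied SD at the end is an illustration under that assumption. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def implied_sd(mean_difference, d):
    """Standardizing SD implied by a reported mean difference and effect size."""
    return mean_difference / d


# Reported values: mean difference = 14.2, Cohen's d = 1.65.
sd_from_report = implied_sd(14.2, 1.65)
```

Under this reading, the reported figures imply a standardizing SD of roughly 8.6 coherence points.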

6. Conclusion

This study provides the first empirical baseline data on Somali–Arabic code-switching (14.8 switches/1,000 words) and supports a curvilinear (inverted U-shaped) relationship between code-switching frequency and narrative coherence, with optimal coherence at 13–14 switches per 1,000 words. Coherence rises at moderate frequencies but falls sharply beyond 22 switches. Referential coherence was the most sensitive dimension, and temporal coherence was the most robust. Low-frequency stories (68.4) scored higher than high-frequency stories (54.2, Cohen’s d = 1.65). Foregrounding and characterization switches, together with quotation-bound status, positively moderated the frequency–coherence relationship. Moderate, functional code-switching maintains coherence, whereas excessive code-switching erodes it.

7. Recommendations

Authors should aim for 12–18 Arabic switches per 1,000 words, avoid exceeding 22 switches, and rely primarily on quotation-bound and foregrounding switches. Editors should apply the four-dimensional coherence rubric and flag stories with more than 22 switches per 1,000 words. Publishers should set frequency standards and train editors in evidence-based assessment. The Frequency-Adjusted Coherence Model should be incorporated into creative writing education. Policymakers should move from ideological criteria to functional coherence criteria that support and promote authentic bilingual literary practices.
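The editorial flag described here depends on normalizing raw switch counts by story length. A minimal sketch follows, assuming an editor has a count of non-nativized Arabic insertions and a word count for each story; the function names are hypothetical, and only the threshold of 22 comes from the study.

```python
FLAG_THRESHOLD = 22.0  # switches per 1,000 words, from the recommendation


def switches_per_thousand(switch_count, word_count):
    """Normalize a raw switch count to switches per 1,000 words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 1000.0 * switch_count / word_count


def needs_editorial_flag(switch_count, word_count, threshold=FLAG_THRESHOLD):
    """True when the normalized frequency exceeds the threshold."""
    return switches_per_thousand(switch_count, word_count) > threshold


# Hypothetical stories: 37 switches in 2,500 words; 60 switches in 2,500 words.
rate_a = switches_per_thousand(37, 2500)
flag_a = needs_editorial_flag(37, 2500)
flag_b = needs_editorial_flag(60, 2500)
```

Normalizing first matters because the stories in the corpus range from 1,500 to 5,000 words, so the same raw count can fall on either side of the threshold.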

Contributions of the Study

Theoretical Contributions

This study further develops the Frequency Hypothesis in the context of literature, applies Berman and Nir’s rubric to code-switched contexts, and integrates Gumperz’s distinction with Myers-Scotton’s model. The Frequency-Adjusted Coherence Model suggests an ideal range of 12–18 switches for every 1,000 words.

Empirical Contributions

This study offers the first frequency baseline for Somali–Arabic literary switching (mean = 14.8), the first multidimensional coherence profile for Somali short stories, and the first annotated corpus of 400 stories with replicable metadata.

Methodological Contributions

This study provides an operationalization of frequency (low <8, medium 8–22, high >22 switches per 1,000 words); a four-dimensional rubric with reported inter-rater agreement of Fleiss’ κ = 0.78–0.84; and worked applications of polynomial regression and matched-pair t-tests.
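The frequency bands can be expressed as a small classifier. The sketch below assumes the boundaries are inclusive for the medium band (8 and 22 both count as medium), which follows from "low <8" and "high >22"; the function name and the sample rates are illustrative.

```python
from collections import Counter


def frequency_band(rate):
    """Map switches per 1,000 words to the study's low/medium/high bands."""
    if rate < 0:
        raise ValueError("rate cannot be negative")
    if rate < 8:
        return "low"
    if rate <= 22:
        return "medium"
    return "high"


# Hypothetical rates (switches per 1,000 words) for a handful of stories.
sample_rates = [2.0, 7.9, 8.0, 14.8, 22.0, 22.1, 45.0]
band_counts = Counter(frequency_band(r) for r in sample_rates)
```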

Policy Contributions

This study offers quantitative benchmarks for assessing multilingual works and treats code-switching as a valid literary approach in non-formal education.

Practical Contributions

This study provides a coherence audit checklist, a pedagogical rubric for creative writing, and a validated questionnaire and coding sheet that can be adapted for other multilingual settings.

Recommendations for Future Research

Future studies should include other genres of literature, other Somali regions, and diaspora literature. The model should be evaluated on other language pairs (Somali–English and Swahili–Arabic). Experimental research should be conducted to establish causation. Reader-focused studies should assess comprehension directly. Longitudinal studies should measure changes over time. Cross-language comparison studies should examine typological distance effects.

Limitations of the Study

Findings are limited to short stories (1,500–5,000 words) written by Mogadishu-based authors between 2015 and 2025, and do not extend to poetry, novels, or diaspora literature. Coherence was rated by five bilingual experts but was not assessed through reader-response or eye-tracking methods. Nativized loans (e.g., kitaab) were excluded, which may underestimate Arabic influence. Because Somali (Cushitic) and Arabic (Semitic) are typologically distant, the findings may not apply to closer language pairs. The sample comprised 13.3% of the available stories (400 of 3,000), with a 5% margin of error. Raters were drawn disproportionately from formally literate, university-educated backgrounds. The observational design precludes clear causal inferences.
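The sampling figures can be checked with the standard margin-of-error formula for a proportion. The article does not state how its 5% figure was derived, so the sketch below assumes a 95% confidence level (z = 1.96), the most conservative proportion (p = 0.5), and an optional finite population correction; the function name is hypothetical.

```python
from math import sqrt


def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Margin of error for a proportion, optionally with finite population correction."""
    standard_error = sqrt(p * (1.0 - p) / n)
    if population is not None:
        # Finite population correction shrinks the error when n is a
        # sizeable fraction of the population.
        standard_error *= sqrt((population - n) / (population - 1))
    return z * standard_error


sampling_fraction = 400 / 3000
moe_simple = margin_of_error(400)
moe_corrected = margin_of_error(400, population=3000)
```

Under these assumptions the margin is about 4.9% without the correction and about 4.6% with it, both in line with the reported 5% figure.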

Ethical Approval Statement

This study was conducted in accordance with the internationally recognized ethical principles for research involving human participants, as outlined in the Declaration of Helsinki. Ethical approval was obtained from the Salaam University Research Ethics Committee (SU-REC), Center for Research and Development. The approval details are as follows: Approval Reference Number: 2025/SU-REC/AMSHS/P0394; Ethical Approval Date: August 17, 2025. The research protocol, including the questionnaire, informed consent procedures, data handling protocols, and participant protection measures, was reviewed and approved by the committee prior to commencing data collection. All research activities were conducted in strict adherence to approved protocols.

Informed Consent

Informed consent was obtained from all participants in this study. Before completing the survey, each participant was presented with a detailed digital consent form outlining the study’s purpose, procedures, potential risks and benefits, and the voluntary nature of participation. The consent form explicitly stated that the data would be used solely for academic research and that all responses would remain anonymous and confidential. Participants were informed of their right to withdraw from the study at any time, without penalty. Consent was obtained electronically before proceeding to the questionnaire, ensuring that participation was transparent, informed, and voluntary, in accordance with the ethical guidelines of the Salaam University Research Ethics Committee (SU-REC).

Approval to Publish

The author confirms that this work is original, has not been published elsewhere, and is not currently under consideration by any other journal. The author grants the publisher a license to publish this study in any form. The author has reviewed and approved the final version of this manuscript for submission to this journal.

Consent to Publish

Not applicable.

Clinical Trial Registration

Not applicable.

How to Cite This Article

Rage AN. The Effect of Code Switching (Somali–Arabic) on Narrative Coherence in Contemporary Somali Short Stories: A Study of Mogadishu Based Literary Works [version 1; peer review: awaiting peer review]. F1000Research 2026, 15:997 (https://doi.org/10.12688/f1000research.183207.1)