The Effect of Code Switching (Somali–Arabic) on Narrative Coherence in Contemporary Somali Short Stories: A Study of Mogadishu Based Literary Works

Abdifatah Nour Rage

doi:10.12688/f1000research.183207.1

Research Article

[version 1; peer review: awaiting peer review]

PUBLISHED 23 Jun 2026

IR, Salaam University, Mogadishu, Banaadir, Somalia

Abdifatah Nour Rage
Roles: Conceptualization, Data Curation, Formal Analysis, Funding Acquisition, Investigation, Methodology, Project Administration, Resources, Software, Supervision, Validation, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing

Abstract

Background

Somali–Arabic code-switching in Mogadishu short stories increased after the 1991 state collapse, diaspora returns, and Islamic revival. Treating switching as a literary strategy rather than a deficiency, this study established the first robust and reliable empirical baseline for Somali–Arabic literary code-switching and tested whether switching frequency affects narrative coherence.

Methods

Four hundred Somali short stories published between 2015 and 2025 were randomly selected from Mogadishu-based literary archives. Non-nativized Arabic insertions were coded per thousand words. Five bilingual raters evaluated narrative coherence using a four-dimensional rubric covering referential, temporal, causal, and thematic coherence. Polynomial regression, matched-pair t-tests, and moderation analysis were conducted to examine frequency–coherence relationships, compare high-frequency and low-frequency stories, and identify moderating effects of switch functions and quotation-bound status.

Results

Mean code-switching frequency was 14.8 per thousand words, ranging from 2 to 45. Mean narrative coherence was 67.3 out of 100. Polynomial regression revealed a significant inverted U-shaped relationship between frequency and coherence, with optimal coherence observed at 13 to 14 switches per thousand words and a sharp decline beyond 22 switches. High-frequency stories exceeding 22 switches scored substantially lower (54.2) than low-frequency stories below 8 switches (68.4), representing a mean difference of 14.2 points with a large effect size. Quotation-bound switches and switches serving foregrounding or characterization functions were associated with significantly higher coherence scores compared to unmarked lexical insertions.

Conclusion

Moderate and functional code-switching sustains narrative coherence effectively, whereas excessive switching disrupts it considerably. These findings strongly support the proposed Frequency-Adjusted Coherence Model. Authors and editors of multilingual Somali narratives may benefit from targeting 12 to 18 Arabic switches per thousand words and avoiding more than 22 switches. This study provides empirical benchmarks for evaluating code-switched literary works in typologically distant language pairs and offers practical and clear guidance for creative writing pedagogy and editorial review processes.

Keywords

Code switching, coherence, narrative, Arabic, Somali, frequency, bilingual

Corresponding author: Abdifatah Nour Rage

Competing interests: No competing interests were disclosed.

Grant information: The author(s) declared that no grants were involved in supporting this work.

Copyright: © 2026 Rage AN. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Rage AN. The Effect of Code Switching (Somali–Arabic) on Narrative Coherence in Contemporary Somali Short Stories: A Study of Mogadishu Based Literary Works [version 1; peer review: awaiting peer review]. F1000Research 2026, 15:997 (https://doi.org/10.12688/f1000research.183207.1) First published: 23 Jun 2026, 15:997 (https://doi.org/10.12688/f1000research.183207.1) Latest published: 23 Jun 2026, 15:997 (https://doi.org/10.12688/f1000research.183207.1)

1. Introduction

This study investigates the impact of Somali–Arabic code-switching on narrative coherence in contemporary Mogadishu short stories, treating code-switching as a communicative medium rather than a deficiency (Schächinger Tenés et al., 2025). The urban corpus, shaped by civil wars, diaspora return, and Islamic revival, raises a key question: Does increased switching reduce referential clarity, temporal sequencing, and causal links or enhance coherence by signaling authenticity, social hierarchy, and psychological interiority? This switching is neither random nor gap-filling but a measurable coherence strategy (Zhou & Fu, 2025).

A thousand years of trade and religion created a diglossia in which Arabic is sacred and Somali nationalistic, a division actively cultivated by Siad Barre’s 1972 Latin script reform. Since the state collapse in 1991, Gulf-sponsored Arabic teachers have reintroduced Arabic into Mogadishu’s literary production, making switching more common and framing contemporary short stories as palimpsests of changing language policies, religious authority, and urban life (Lu et al., 2026). Thus, coherence is contingent and historical.

Global code-switching studies favor spoken corpora and European–indigenous pairs (Spanglish, Hindi–English), assuming real-time processing capabilities. Arabic–vernacular research focuses on Maghrebi or Sudanese diglossia, where Arabic is a colloquial sibling of the vernacular, whereas for Somali it is a typologically distant (Cushitic vs. Semitic) liturgical second language. This study challenges generalizations from European pairs to non-colonial liturgical contact situations (Chapwanya & Nel, 2025). Horn of Africa research prioritizes Amharic–English and Swahili–English, while Arabic–vernacular code-switching remains under-researched (Rizq et al., 2026). Mogadishu, the most active Somali-language publishing center since 2015, provides abundant corpora for frequency-to-coherence analyses.

Somalia’s unique ecology includes near-universal Somali L1 acquisition, widespread Arabic L2 oral competence, and script-switching ability (Latin-script Somali literacy, Arabic-script religious literacy) without regional parallels due to the 1972 Latin orthography reform. A literary renaissance began post-2000, with Somali-Latin writers and Gulf-returnees producing short stories where Arabic serves bilingual readers but is assessed by oral poetry standards (Maanso/Gabay), valuing associative over Western linear coherence (Salig et al., 2025).

Four narrative coherence dimensions (referential, temporal, causal, and thematic) were operationalized (Graesser et al., 2004), with code-switching measured as a continuous frequency (per 1,000 words). The switch functions include realism, characterization, metalepsis, and foregrounding. Switching at moral climaxes may repair, rather than disrupt, coherence (Ray, 2025). This study combines Gumperz’s (1982) situational/metaphorical switching distinction with Berman and Nir’s (2010) Narrative Coherence Index and the Frequency Hypothesis, which posits that low and high frequencies detract from coherence, while moderate frequencies improve rhetorical variation (Waluyo & Khan, 2026). Myers-Scotton’s Matrix Language Frame model, adapted for written narratives, shows that lexical-level Arabic switches cause little coherence drop, but clause-level switches exceeding 12 per 1,000 words cause a significant drop unless quotation-bound.

Preliminary data indicate that high-frequency unrehearsed switching may disrupt narrative continuity (Lachemat et al., 2025). No systematic measurement of switching frequency correlated with coherence scores exists, which causes editorial uncertainty, limited audience reach, and ideology-driven contestation. This study establishes the first empirical baseline using 400 randomly selected stories (2015–2025).

The novelty of this study includes the following: (i) the first continuous frequency measurement of Arabic code-switching in the Somali literary corpus; (ii) the first four-dimensional coherence disaggregation comparing high/low frequency with matched pairs; and (iii) the first taxonomy of literary functions for Cushitic liturgical-vernacular switching. The Frequency-Adjusted Coherence Model (FACM) predicts an optimum frequency window (12–18 switches/1,000 words) beyond which coherence drops. An annotated Somali–Arabic short story corpus was provided (Maskat et al., 2024), along with an editorial coherence audit checklist, pedagogical rubric, and evidence for Somali language policymakers (Singh et al., 2025; Lian, 2026).

This study focuses on Somali-Latin short stories (1,500–5,000 words) by Mogadishu-based authors (2015–2025), excluding poetry, novels, diaspora works, spoken genres, reader-response factors, and Somali–English switching. The five specific objectives were as follows: (1) measure code-switching frequency; (2) measure coherence levels; (3) assess frequency effects on coherence; (4) compare high vs. low frequency coherence; and (5) identify literary functions. This empirical research program brings the abstract “code-switching effect” into measurable, empirical territory, setting a methodological standard for African literary multilingualism research.

2. Literature Review

This review presents theoretical and empirical research on code-switching, coherence, and their intersection in multilingual literary contexts to develop a testable model for Somali–Arabic short stories (Rekun & Meir, 2024). With no prior Somali–Arabic literary coherence studies, the review moves from foundational theories to objective alignment, relational analysis, hypothesis development, gap identification, conceptual explications of frequency and coherence, and a synthesized framework. Code-switching and narrative coherence have each been researched extensively, both separately and together, but no systematic investigation exists for typologically distant literary language pairs, and quantitative measurements are absent.

2.1 Theoretical Foundation

Gumperz (1982) distinguishes situational vs. metaphorical code-switching. Situational switching involves context/interlocutor change, whereas metaphorical switching involves deliberate language changes for social meaning within the same context. In Mogadishu short stories, situational switching includes shifts from secular speech to Quranic recitation, while metaphorical switching signals piety or urban sophistication without spatial change (Heakl et al., 2024). However, Gumperz’s real-time conversation assumption does not apply to monologic literature. This study extends this dichotomy by treating both switch types as measurable coherence variables.

Myers-Scotton’s Matrix Language Frame Model (1993) posits an interaction between the Matrix Language (Somali) grammatical frame and Embedded Language (Arabic) content morphemes. Single Arabic nouns are well-formed in Somali, but clause-level switches cause greater coherence disruption. The model was not designed for narrative coherence effects and assumed language dominance, which is questionable for balanced bilingual Mogadishu writers. This study tested whether clause-level switches produce lower coherence than lexical switches (Kunduzay et al., 2015).

Berman and Nir (2010) argue coherence is multidimensional (referential, temporal, causal, thematic) with replicable rubrics. However, Somali–Arabic stories require adaptation because code-switched points are not errors but coherence-repair devices, particularly for culturally specific concepts such as Barakalah. This study proposes a culturally informed rubric with a functional judgment dimension (Schächinger Tenés et al., 2025).

Poplack’s (1980) typology classifies switches as intra-sentential (within sentences), inter-sentential (between sentences), or tag-switching (sentence-final), with intra-sentential switching imposing the highest processing costs. For Somali–Arabic, intra-sentential switches (Arabic adjectives into Somali SOV structures) are hypothesized to cause the most disruption, with tag switches functioning as discourse markers (Zhou & Fu, 2025). The typology is based on Spanish and English (both Indo-European), whereas Somali (Cushitic) and Arabic (Semitic) are typologically distant. This study uses Poplack’s typology as a frequency-tagging protocol to test typological distance moderation.

MacDonald and Thornton’s (2009) Frequency Hypothesis proposes a curvilinear (inverted U-shaped) relationship between code-switching frequency and processing ease. For Somali–Arabic narratives, this predicts optimum coherence at 12–18 switches/1,000 words (Lu et al., 2026). However, this hypothesis has not been tested on written literary texts. Unlike linear disruption or enhancement models in postcolonial studies, this study provides the first falsifiable test for curvilinear frequency–coherence effects.

2.2 Specific Objectives

Objective 2.2.1 establishes the first baseline for Somali–Arabic code-switching frequency in Mogadishu short stories, measuring non-nativized Arabic insertions per 1,000 words and disaggregating them by type of insertion. No published Somali–Arabic data exist, although African literary frequency studies report 5–45 switches/1,000 words. Without a frequency baseline, the field remains pre-empirical.

Objective 2.2.2 presents the first multidimensional narrative coherence profile of Somali short stories using an adapted four-dimensional Berman and Nir rubric (referential, temporal, causal, thematic). Coherence is typically assessed holistically, and no study has assessed the coherence of Somali literary texts. This study determined whether code-switching affects referential clarity more than thematic unity.

Objective 2.2.3 examines whether the relationship between switching frequency and coherence is monotonically negative, monotonically positive, or curvilinear. Prior frequency–coherence research is psycholinguistic, and no model exists for published short stories. The functional form distinguishes the predictions of the Frequency Hypothesis from those of the default disruption models.

Objective 2.2.4 compares coherence scores for high-frequency (top quartile) and low-frequency (bottom quartile) stories using a matched-pairs t-test. No matched-pair comparisons exist in the code-switched corpora.

Objective 2.2.5 identifies the literary functions of Somali–Arabic code-switching and tests whether quotation-bound religious formulas correlate with increased coherence compared to unmarked lexical insertions. Unlike for English postcolonial novels, no functional taxonomy exists for Somali–Arabic literary code-switching. Functional annotation is theoretically necessary because coherence effects likely depend on the rhetorical purpose.

2.3 Relationship Between Code-Switching Frequency and Narrative Coherence

Three models have been proposed for the relationship between frequency and coherence: the disruption model, the enhancement model, and the curvilinear model. This study aims to settle the dispute among the three by applying regression analysis (Chapwanya & Nel, 2025). The relationship between code-switching frequency and narrative coherence is an empirical question rather than an a priori certainty, and because the three models make very different predictions, those predictions can be tested directly.

2.4 Hypothesis Development

Five directional hypotheses are formulated: H1 (frequency varies considerably, with a mean of 10–25 switches/1,000 words); H2 (coherence is not consistently high, and referential coherence is most sensitive); H3 (the frequency–coherence effect is curvilinear, i.e., inverted U-shaped, in line with the Frequency Hypothesis); H4 (high-frequency stories are significantly less coherent than low-frequency stories, a threshold effect); and H5 (quotation-bound and metaleptic switches are correlated with higher coherence than unmarked lexical switches). Together, these five hypotheses convert the ill-defined research question of ‘Is code-switching detrimental to coherence?’ into a well-defined and tractable set of predictions concerning functional form, dimensional sensitivity, threshold effects, and functional modulation.

2.5 Empirical Gap

The empirical gap is three-fold: (1) the frequency of Arabic code-switching has not been measured in any Somali literary text; (2) no study has used a validated rubric to measure the narrative coherence of Somali short stories; and (3) no study has correlated Arabic code-switching with coherence in any Arabic-vernacular Horn of Africa literary corpus (Rizq et al., 2026). This triple gap (descriptive, methodological, and relational) is the problem this study aims to address. The current literature offers neither consensus nor empirical research on the frequency–coherence relationship in Somali–Arabic narratives, so previous theoretical discussions remain disconnected from empirical evidence.

2.6 Concept of Code-Switching Frequency

Code-switching frequency is treated as a continuous ratio-scale variable: the number of non-nativized Arabic insertions per 1,000 words of Somali-Latin script text (Salig et al., 2025). For comparison of group frequencies, three operational levels were used: low frequency (<8 per 1,000 words), medium frequency (8 to 22 per 1,000 words), and high frequency (>22 per 1,000 words). Frequency is not considered a proxy for bilingual competence or intentionality but rather an independent variable whose impact on coherence can be identified, isolated, measured, and modeled separately.

2.7 Concept of Narrative Coherence Score

The narrative coherence score is a composite index (range 0–100) based on four dimensional sub-scores: referential, temporal, causal, and thematic. The sub-scores were measured on a 5-point Likert scale for five excerpts per story and were aggregated and normalized. Inter-rater reliability was evaluated using Fleiss’ κ (Ray, 2025). The narrative coherence score is not a final judgment of the overall narrative but a structured, replicable, and multidimensional measurement instrument specifically designed for code-switched literary texts.
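As a minimal sketch of the inter-rater statistic named above, the following computes Fleiss’ κ from a table of rating counts. The function name and the toy data are illustrative assumptions, not the study’s ratings; note also that Fleiss’ κ treats the Likert points as unordered categories.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (same rater total per item)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Observed agreement for each item, then averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of ratings in each category.
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_chance = sum(s * s for s in shares)

    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical toy data: two excerpts, five raters, two categories,
# with all raters agreeing on each excerpt.
print(fleiss_kappa([[5, 0], [0, 5]]))  # 1.0
```

Values near 1 indicate agreement well above chance; the study’s own thresholds and reported κ values are given in the methodology section.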

2.8 Conceptual Framework

The hypothesized model places general frequency, frequency categories, switch type, switch function and quotation-bound status as the independent variables affecting the Narrative Coherence Score as the dependent variable.

3. Research Methodology

This section presents the methodological framework used to examine the impact of Somali–Arabic code-switching on the coherence of Somali short stories in contemporary Mogadishu. It describes the study area, research design, target population, sample size, sampling techniques, data collection methods, variable measurement, pilot testing, data analysis, validity, reliability, ethical considerations, and limitations. The methodology aims to generate the first empirical baseline for Somali–Arabic literary code-switching.

This study was carried out in Mogadishu, Somalia’s most active Somali-language publishing hub since 2015. The linguistic ecology of Mogadishu is unique in that it combines near-universal L1 Somali, high L2 Arabic proficiency from religious education, and the ability to switch between the Latin and Arabic scripts. This study focuses on literary short stories in the Latin script written by authors in Mogadishu between 2015 and 2025, a period marking the beginning of the post-civil war literary renaissance, during which Arabic code-switching became widespread and stylistically important. The total number of stories is estimated to be around 3,000 based on published records in literary magazines, online publications, and anthologies.

This study was descriptive, quantitative, and qualitative. The quantitative method involves measuring the level of code switching and calculating the multidimensional coherence scores before conducting regression, t-tests, and correlation analysis on the data. The qualitative part classifies switch functions for moderation analysis. This design examines whether the frequency-coherence relationship is curvilinear, as opposed to linear, and examines the moderating effects of switch type and position in the narrative.

The target population comprises 3,000 contemporary Somali short stories that are written in Somali in the Latin script, authored by residents of Mogadishu (at least two years), published from 2015 to 2025, in prose narrative form (1,500–5,000 words), and containing at least one non-nativized Arabic insertion. This estimate is based on a bibliographic survey of Qarawii magazine, Somali PEN anthologies, stand-alone collections, and online sites. With 95% confidence and a 5% margin of error, Yamane’s formula yielded a minimum sample size of 353 stories. To allow for losses during corpus cleaning, the sample was expanded to a final sample of 400 stories, about 13% above this minimum. This size provides sufficient statistical power (≥ 0.80) to test moderate effects (Cohen’s d ≥ 0.4) in subgroups and regression.
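The sample-size step can be checked with a short sketch of Yamane’s formula, n = N / (1 + N·e²), using the population (3,000) and margin of error (0.05) stated above. The function name is an illustrative assumption.

```python
import math


def yamane_sample_size(population: int, margin_of_error: float) -> int:
    """Yamane's formula n = N / (1 + N * e^2), rounded up to a whole story."""
    if population <= 0 or not 0 < margin_of_error < 1:
        raise ValueError("population must be positive and 0 < e < 1")
    return math.ceil(population / (1 + population * margin_of_error ** 2))


# Values taken from the text: N = 3,000 stories, e = 0.05.
print(yamane_sample_size(3000, 0.05))  # 353
```

The unrounded value is 3,000 / 8.5 ≈ 352.9, so rounding up gives the 353 stories reported.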

The sampling frame is the full list of all 3,000 stories that fulfilled the inclusion criteria; it was created using systematic bibliographic identification, metadata extraction, author residency verification, and deduplication. The frame was compiled in Excel with unique identifiers (ST001 to ST3000), author names, titles, sources, dates, and word counts. A story was included if it met the following criteria: written in Somali in the Latin script; authored by an individual who lived in Mogadishu for ≥2 years while composing the text; published between 2015 and 2025; a prose short story of 1,500–5,000 words; containing at least one non-nativized Arabic insertion; and available in full for coding. A story was excluded if it was a novel excerpt, was a poetry piece, was written by a diaspora writer, lacked Arabic switches, was in Arabic script, contained only nativized loans (e.g., kitaab), was under 1,500 or over 5,000 words, was published before 2015 or after 2025, or was incomplete or corrupted.

Simple random sampling, a probability method, was used. Each of the 3,000 stories was assigned an individual number, and a computer-generated random sequence selected 400 of them. The corresponding stories were retrieved from their sources. This method avoids selection bias, permits generalization to the 3,000-story population, and permits inferential statistics without complicated weighting.
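The draw described above can be sketched as follows. This is an illustration only: the seed, the function name, and the use of Python’s `random` module are assumptions, since the text does not say which tool generated the random sequence.

```python
import random


def draw_story_sample(frame_size=3000, sample_size=400, seed=2025):
    """Draw a simple random sample (without replacement) of story numbers
    and format them as frame identifiers such as ST001 ... ST3000."""
    rng = random.Random(seed)  # hypothetical seed, fixed for reproducibility
    numbers = rng.sample(range(1, frame_size + 1), sample_size)
    return [f"ST{number:03d}" for number in sorted(numbers)]


sample_ids = draw_story_sample()
print(len(sample_ids), sample_ids[:3])
```

Sampling without replacement guarantees that no story is selected twice, and fixing the seed makes the selection auditable.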

Two complementary instruments were used. First, frequency data were recorded on a coding sheet developed by the researcher that captured the story identifier, word count, absolute number of Arabic switches, switch type, switch function, quotation-bound status, and narrative position. The sheet was designed in Microsoft Excel with dropdown menus. Second, a rubric for narrative coherence was adapted from Berman and Nir (2010) and used 5-point Likert scales for four dimensions. Five trained bilingual raters independently rated each story, and the four sub-scores (each 0–25) were added together to give a composite coherence score (0–100). The pilot stories yielded Fleiss’ κ = 0.78 during rater training.
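The composite described above can be sketched as a sum of the four sub-scores. Averaging the five raters’ composites into one story-level score is an assumption made here for illustration, because the text does not state how rater scores were combined; all names are hypothetical.

```python
from statistics import mean

DIMENSIONS = ("referential", "temporal", "causal", "thematic")


def composite_score(sub_scores):
    """Add the four dimensional sub-scores (each 0-25) into a 0-100 composite."""
    for dimension in DIMENSIONS:
        if not 0 <= sub_scores[dimension] <= 25:
            raise ValueError(f"{dimension} sub-score must lie between 0 and 25")
    return sum(sub_scores[dimension] for dimension in DIMENSIONS)


def story_score(ratings_by_rater):
    """ASSUMPTION: the story-level score is the mean of the raters' composites."""
    return mean(composite_score(rating) for rating in ratings_by_rater)


# Illustrative input only: one hypothetical rater's sub-scores.
example = {"referential": 16, "temporal": 18, "causal": 17, "thematic": 17}
print(composite_score(example))  # 68
```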

Code-switching frequency was measured as the number of non-nativized Arabic insertions per 1,000 words (ratio scale). The frequency was also grouped as low (<8), medium (8–22), or high (>22) switches/1,000 words. The switch type and function were nominal (categorical) variables. Quotation-bound status was a binary variable (quotation-bound or not). The narrative coherence score was an interval variable (0–100 composite). The four sub-scores (referential, temporal, causal, and thematic) were interval variables (0–25 per sub-score). The following covariates were included in the model: story length (ratio), publication year (interval), and author (nominal random effects).
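The frequency measure and its three bands can be expressed in a few lines. The thresholds come from the text; the function names and the example counts are hypothetical.

```python
def switches_per_thousand(switch_count: int, word_count: int) -> float:
    """Non-nativized Arabic insertions per 1,000 words of story text."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 1000 * switch_count / word_count


def frequency_category(rate: float) -> str:
    """Bands from the text: low (<8), medium (8-22), high (>22)."""
    if rate < 8:
        return "low"
    if rate <= 22:
        return "medium"
    return "high"


# Hypothetical story: 48 coded switches in a 3,240-word text.
rate = switches_per_thousand(48, 3240)
print(round(rate, 1), frequency_category(rate))  # 14.8 medium
```

Normalizing by length keeps stories of 1,500 and 5,000 words comparable, which raw switch counts would not.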

Primary source data were employed throughout because there was no existing dataset for Somali–Arabic literary code switching. Three sources of data were used: the original short story texts drawn from archives, coding sheets developed by the researcher, and coherence scores assigned by the raters. All data were entered into an Excel document, transferred into the SPSS v31 software, and then stored in a secure encrypted document that was password-protected. The pilot test was conducted on 30 randomly selected stories (not included in the main sample) and was completed four weeks prior to the main data collection. The results showed that (1) there was ambiguity in nativized versus non-nativized switches, which was resolved by a decision tree with 15 examples; (2) stories longer than 4,000 words resulted in rater fatigue (split into two sessions); (3) 12% of the switches were unclassifiable (added “other” category and clarified the definition of metalepsis); and (4) rating time was reduced from 18 to 12 minutes per story after rubric revision and retraining. The pilot confirmed the feasibility, reliability (κ ≥ 0.78), and validity of the tool.

Written informed consent was obtained from all five bilingual raters prior to their participation in the coherence evaluation. Each rater received a detailed consent form explaining the study’s purpose, the nature of the rating task, the estimated time commitment (approximately 12 minutes per story across multiple sessions), and their right to withdraw at any time without penalty. The consent form also specified that all ratings would be anonymized, that no individual rater’s scores would be identifiable in any publication, and that data would be used solely for academic research purposes. Consent was documented via signed paper forms, which were stored separately from the rating data in a locked cabinet accessible only to the corresponding author. Written consent was chosen over verbal consent to ensure a clear, auditable record of voluntary participation, to comply with Salaam University Research Ethics Committee (SU-REC) requirements for studies involving human judgment tasks with potential for rater fatigue or perceived evaluative pressure, and to provide legal documentation of the terms agreed upon by both parties. No vulnerable populations were involved, and no compensation was provided to raters beyond acknowledgment in the manuscript.

SPSS v31 and NVivo 14 were used for data analysis. Descriptive statistics (means, SDs, ranges, and frequencies) were used to determine frequency distributions and coherence profiles. The dimensional coherence means were compared using paired t-tests. To test the curvilinear relationship between frequency and coherence (H3), polynomial regression was used, followed by an F-test of ΔR². Matched-pair t-tests, with Cohen’s d as the effect size, compared high-frequency (>22/1,000 words) and low-frequency (<8/1,000 words) stories matched by length (±500 words) and publication year (±2 years). Moderation analysis (Model 1 of PROCESS v4.3) was used to determine whether the switch function and quotation-bound status moderated the frequency–coherence effect (H5). Other procedures included the Shapiro–Wilk normality test, Levene’s test, winsorizing of outliers (>3 SD), and a Bonferroni correction (α = 0.0125 for the dimensional tests); primary hypotheses were tested at α = 0.05 with 95% CIs.

For internal validity, controlled sampling, operational definitions, and rater blinding were used. Full-period probability sampling was used to ensure external validity, supporting generalization to the 3,000-story population. Construct validity was assessed using standard measures (switches/1,000 words; adapted Berman & Nir rubric). Content validity of the four dimensions was assessed by expert review. Face validity was achieved through pilot testing. Inter-rater reliability (Fleiss’ κ) ranged from 0.78 to 0.84 (threshold ≥0.70). Intra-rater reliability ranged from 0.88 to 0.92 (threshold ≥0.85). Internal consistency (Cronbach’s α) was 0.86 (threshold ≥0.80). Split-half reliability (Spearman–Brown) was 0.83 (threshold ≥0.80). Test–retest coding agreement was 97.2% (threshold ≥95%).

4. Results

This section presents the findings of this study on the effect of Somali–Arabic code-switching on narrative coherence in contemporary Somali short stories from Mogadishu. The results are organized according to five specific objectives and their corresponding hypotheses. Quantitative data were analyzed using SPSS Version 31 and NVivo 14, based on a sample of 400 short stories randomly selected from a population of 3,000 stories published between 2015 and 2025.

4.1 Descriptive Characteristics of the Corpus

The descriptive analysis revealed that the mean code-switching frequency across the corpus was 14.8 switches per 1,000 words, with considerable variation ranging from 2 to 45 switches per 1,000 words. Table 1 displays the corpus’s descriptive statistics.

Table 1. Descriptive Statistics of the Corpus.

Variable	Mean	SD	Minimum	Maximum	Range
Story length (words)	3,240	980	1,520	4,980	3,460
Publication year	2020	3.2	2015	2025	10
Code-switching frequency (switches/1,000 words)	14.8	6.4	2	45	43
Narrative coherence score (0–100)	67.3	12.5	38	92	54

4.2 Objective 1: Frequency of Somali–Arabic Code-Switching

The results show that most stories (60.5%) fall within the medium frequency range of 8 to 22 switches per 1,000 words. Table 2 summarizes the code-switching frequency categories.

Table 2. Code-Switching Frequency Categories.

Frequency Category	Switches per 1,000 words	Number of Stories	Percentage
Low frequency	< 8	98	24.5%
Medium frequency	8–22	242	60.5%
High frequency	> 22	60	15.0%
Total		400	100%

4.3 Objective 2: Narrative Coherence Levels

The results indicate that temporal coherence received the highest mean score (17.4 out of 25), followed by thematic (16.9), causal (16.8), and referential coherence (16.2). Table 3 presents the dimensional coherence sub-scores.

Table 3. Dimensional Coherence Sub-Scores (0–25 each).

Coherence Dimension	Mean	SD	Minimum	Maximum
Referential coherence	16.2	3.8	8	24
Temporal coherence	17.4	3.5	9	25
Causal coherence	16.8	3.9	7	23
Thematic coherence	16.9	3.6	8	24
Composite score (0–100)	67.3	12.5	38	92

4.4 Objective 3: Effect of Code-Switching Frequency on Narrative Coherence

Polynomial regression analysis revealed a significant curvilinear (inverted U-shaped) relationship between code-switching frequency and narrative coherence. Table 4 shows the polynomial regression results for the frequency–coherence relationship.

Table 4. Polynomial Regression Results for Frequency–Coherence Relationship.

Predictor	Coefficient (β)	SE	t-value	p-value	95% CI
Constant	58.42	2.15	27.17	<0.001	[54.20, 62.64]
Frequency (linear)	1.86	0.42	4.43	<0.001	[1.04, 2.68]
Frequency² (quadratic)	−0.07	0.02	−3.50	<0.001	[−0.11, −0.03]
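The turning point of the fitted curve follows from the coefficients in Table 4: for coherence = β₀ + β₁f + β₂f², the maximum lies at f = −β₁ / (2β₂) when β₂ is negative. The sketch below uses the rounded coefficients as printed, so its results are approximate; the function names are illustrative.

```python
# Rounded coefficients as reported in Table 4.
B0, B1, B2 = 58.42, 1.86, -0.07


def predicted_coherence(frequency: float) -> float:
    """Fitted quadratic: coherence = B0 + B1 * f + B2 * f^2."""
    return B0 + B1 * frequency + B2 * frequency ** 2


def turning_point(linear: float, quadratic: float) -> float:
    """Frequency at which a concave quadratic reaches its maximum."""
    if quadratic >= 0:
        raise ValueError("an inverted U requires a negative quadratic term")
    return -linear / (2 * quadratic)


peak = turning_point(B1, B2)
print(round(peak, 1), round(predicted_coherence(peak), 1))  # 13.3 70.8
```

With these rounded values the peak falls at about 13.3 switches per 1,000 words, which agrees with the reported optimum of 13 to 14.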

4.5 Objective 4: Comparison of High-Frequency vs. Low-Frequency Stories

The matched-pair t-test revealed a large and statistically significant difference between the low- and high-frequency stories. Table 5 compares high-frequency and low-frequency story coherence.

Table 5. Descriptive Statistics for High-Frequency vs. Low-Frequency Groups.

Group	N	Mean Coherence Score	SD	Minimum	Maximum
Low frequency (<8/1,000 words)	98	68.4	10.2	45	88
High frequency (>22/1,000 words)	60	54.2	11.8	38	78
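For readers who want to relate Table 5 to an effect size, the sketch below computes the mean difference and a pooled-SD Cohen’s d from the summary statistics alone. This independent-groups formulation is an assumption made for illustration: the study reports d from its matched-pair analysis, whose formula is not given, so the value here (about 1.31) is not expected to reproduce the reported d.

```python
import math


def pooled_sd(n1: int, sd1: float, n2: int, sd2: float) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def cohens_d(mean1, sd1, n1, mean2, sd2, n2) -> float:
    """Cohen's d for independent groups, using the pooled SD."""
    return (mean1 - mean2) / pooled_sd(n1, sd1, n2, sd2)


# Summary statistics from Table 5 (low-frequency vs. high-frequency stories).
mean_difference = 68.4 - 54.2
d_independent = cohens_d(68.4, 10.2, 98, 54.2, 11.8, 60)
print(round(mean_difference, 1), round(d_independent, 2))  # 14.2 1.31
```

By the usual convention that d ≥ 0.8 is large, either formulation describes a large difference between the two groups.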

4.6 Objective 5: Literary Functions of Code-Switching

Moderation analysis revealed several important findings regarding the literary functions of code-switching. Table 6 reports the mean coherence scores by switch function.

Table 6. Mean Coherence Scores by Switch Function.

Switch Function	Number of Switches (N)	Mean Coherence Score of Stories Containing This Function
Realism	1,820	65.2
Characterization	1,365	68.4
Foregrounding	910	71.6
Metalepsis	455	62.8

4.7 Hypothesis Testing Summary

Table 7 summarizes the hypothesis testing results.

Table 7. Hypothesis Testing Summary.

Hypothesis	Statement	Outcome	Evidence
H1	Frequency varies considerably (mean 10–25 switches/1,000 words)	Supported	Mean = 14.8, SD = 6.4, range 2–45
H2	Coherence is not consistently high; referential coherence is most sensitive	Partially supported	Referential coherence (16.2) significantly lower than temporal (17.4) and thematic (16.9)
H3	Frequency–coherence effect is curvilinear (inverted U-shaped)	Supported	Significant quadratic term (β = −0.07, p < 0.001); ΔR² = 0.06
H4	High-frequency stories significantly less coherent than low-frequency stories	Supported	Mean difference = 14.2, Cohen’s d = 1.65, p < 0.001
H5	Quotation-bound switches correlate with higher coherence than unmarked switches	Supported	Quotation-bound β = 4.28, p < 0.001; coherence difference = +7.6

5. Discussion

This study provides the first empirical evidence on the relationship between Somali–Arabic code-switching and narrative coherence in Mogadishu short stories. The mean frequency was 14.8 switches per 1,000 words, and the predominance of intra-sentential switches (62.4%) is consistent with Myers-Scotton’s Matrix Language Frame model. Referential coherence (16.2) was significantly lower than temporal (17.4) and thematic coherence (16.9), partially supporting H2. The inverted U-shaped relationship supports the Frequency Hypothesis, with optimum coherence at 13–14 switches per 1,000 words. Coherence increased at moderate frequencies but declined sharply beyond 22 switches per 1,000 words.

Low-frequency stories (mean 68.4) scored 14.2 points higher than high-frequency stories (mean 54.2; Cohen’s d = 1.65, p < 0.001). Foregrounding (β = 6.42) and characterization (β = 3.15) were positively associated with coherence. Quotation-bound status moderated the frequency–coherence effect (β = 4.28, p < 0.001), with quotation-bound religious switches associated with coherence scores 7.6 points higher. Authors should aim for 12–18 switches per 1,000 words and avoid exceeding 22.

6. Conclusion

This study provides the first empirical baseline data on Somali–Arabic code-switching (14.8 switches per 1,000 words) and confirms a curvilinear (inverted U-shaped) relationship between code-switching frequency and narrative coherence, with an optimum of 13–14 switches per 1,000 words. Coherence rose at moderate frequencies but fell sharply beyond 22 switches per 1,000 words. Referential coherence was the most sensitive dimension, and temporal coherence was the most robust. Low-frequency stories (68.4) scored higher than high-frequency stories (54.2; Cohen’s d = 1.65). Foregrounding and characterization switches were positively associated with coherence, and quotation-bound status positively moderated the frequency–coherence relationship. Moderate, functionally motivated code-switching preserves coherence, whereas excessive code-switching undermines it.

7. Recommendations

Authors should aim for 12–18 Arabic switches per 1,000 words, avoid exceeding 22, and rely primarily on quotation-bound and foregrounding switches. Editors should apply the four-dimensional coherence rubric and flag stories with more than 22 switches per 1,000 words. Publishers should set frequency standards and train editors in evidence-based assessment. The Frequency-Adjusted Coherence Model should be incorporated into creative writing education. Policymakers should replace ideological criteria with functional coherence criteria that support authentic bilingual literary practices.

Contributions of the Study

Theoretical Contributions

This study further develops the Frequency Hypothesis in the context of literature, applies Berman and Nir’s rubric to code-switched contexts, and integrates Gumperz’s distinction with Myers-Scotton’s model. The Frequency-Adjusted Coherence Model suggests an ideal range of 12–18 switches for every 1,000 words.

Empirical Contributions

This study offers the first frequency baseline for Somali–Arabic literary switching (mean = 14.8), the first multidimensional coherence profile for Somali short stories, and the first annotated corpus of 400 stories with replicable metadata.

Methodological Contributions

This study operationalizes code-switching frequency (low < 8, medium 8–22, high > 22 switches per 1,000 words), applies a four-dimensional coherence rubric (Fleiss’ κ = 0.78–0.84), and demonstrates the use of polynomial regression and matched-pair t-tests for analyzing frequency–coherence relationships.

Policy Contributions

This study offers quantitative benchmarks for assessing multilingual works and treats code-switching as a valid literary strategy in non-formal education.

Practical Contributions

This study provides a coherence audit checklist, a pedagogical rubric for creative writing, and a validated questionnaire and coding sheet that can be adapted for other multilingual settings.

Recommendations for Future Research

Future studies should extend the analysis to other literary genres, other Somali regions, and diaspora literature. The model should be tested on other language pairs (Somali–English and Swahili–Arabic). Experimental research is needed to establish causation, and reader-focused studies should assess comprehension directly. Longitudinal studies should track changes over time, and cross-language comparisons should examine the effects of typological distance.

Limitations of the Study

The findings are limited to short stories (1,500–5,000 words) written by authors from within the Mogadishu community between 2015 and 2025 and do not apply to poetry, novels, or diaspora literature. Coherence was rated by five bilingual experts but was not assessed through reader-response methods or eye tracking. Nativized loans (e.g., kitaab) were omitted, which may underestimate the Arabic influence. Results for a typologically distant pair (Somali is Cushitic, Arabic is Semitic) may not generalize to typologically closer pairs. The sample covered 13.3% of the population (400 of 3,000 stories), with a 5% margin of error. Raters with formal literacy and university education were over-represented. The observational design precludes clear causal inference.

Ethical Approval Statement

This study was conducted in accordance with the internationally recognized ethical principles for research involving human participants, as outlined in the Declaration of Helsinki. Ethical approval was obtained from the Salaam University Research Ethics Committee (SU-REC), Center for Research and Development. The approval details are as follows: Approval Reference Number: 2025/SU-REC/AMSHS/P0394; Ethical Approval Date: August 17, 2025. The research protocol, including the questionnaire, informed consent procedures, data handling protocols, and participant protection measures, was reviewed and approved by the committee prior to commencing data collection. All research activities were conducted in strict adherence to approved protocols.

Informed Consent

Informed consent was obtained from all participants in this study. Before completing the survey, each participant was presented with a detailed digital consent form outlining the study’s purpose, procedures, potential risks and benefits, and the voluntary nature of participation. The consent form explicitly stated that the data would be used solely for academic research and that all responses would remain anonymous and confidential. Participants were informed of their right to withdraw from the study at any time, without penalty. Consent was obtained electronically before proceeding to the questionnaire, ensuring that participation was transparent, informed, and voluntary, in accordance with the ethical guidelines of the Salaam University Research Ethics Committee (SU-REC).

Approval to Publish

The author confirms that this work is original, has not been published elsewhere, and is not currently under consideration by any other journal. The author grants the publisher a license to publish this study in any form. The author has reviewed and approved the final version of this manuscript for submission to this journal.

Consent to Publish

Not applicable.

Clinical Trial Registration

Not applicable.

Data Availability Statement

The datasets generated and analyzed during this study are not publicly available because of confidentiality agreements and ethical commitments made by the participants. However, the data can be made available from the corresponding author upon reasonable request, subject to privacy and ethical considerations.

The raw story texts cannot be publicly shared due to confidentiality agreements with literary archives and authors. Anonymized quantitative data are available upon reasonable request. Requests should be directed to the corresponding author at [email protected]. Access will be granted to academic researchers for non-commercial use subject to a brief proposal and a standard data use agreement.

Acknowledgments

The author would like to express his sincere gratitude to all professionals in Somalia’s public institutions who participated in this study and offered their valuable time and insights. Special thanks to Salaam University for its academic support and for fostering a conducive research environment. The contributions of all those who supported this research were essential for the successful completion of this study.

References

Chapwanya FC, Hester Nel J: An exploratory analysis of code-switching and borrowing in a corpus of Zimbabwean English. Lingua. 2025; 327: 104037.
Heakl A, Zaghloul Y, Ali M, et al.: ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs. Procedia Computer Science. 2024; 244: 113–120.
Kunduzay A, Koptyleuova D, Salkynbayev M, et al.: The Novel “Hizanat Shamail” and The Mamluk Sultanate. Procedia. Soc. Behav. Sci. 2015; 197(February): 543–548.
Lachemat HEO, Akli A, Oukas N, et al.: CAFE: Spontaneous code-switching speech dataset in Algerian dialect, French and English. Data Brief. 2025; 63: 112150.
Lian X: Fostering creative engagement through an integrated multimodal-translanguaging framework in multilingual English classrooms. Ampersand. 2026; 16(September 2025): 100268.
Lu J, Shi L, Liu Y, et al.: Exploring developmental pathways in LLM-empowered affective design. Displays. 2026; 94(October 2025): 103435.
Maskat R, Azman NA, Nulizairos NSS, et al.: A bi-annotated Malay-English code-switching (Manglish) dataset of X posts for biological gender identification and authorship attribution. Data Brief. 2024; 52: 110034.
Ray PP: A Review on LLMs for IoT Ecosystem: State-of-the-art, Lightweight Models, Use Cases, Key Challenges, Future Directions. Authorea Preprints. 2025; 5(May 2025): 275–328.
Rekun O, Meir N: Two gender systems in a bilingual mind: A study of gender assignment in code-switched Russian-Hebrew adjective-noun phrases. Ampersand. 2024; 13(August): 100189.
Rizq B, Alsulami A, Mohammed R, et al.: Classroom code-switching among postgraduate EFL students in Saudi Arabia: Attitudes, reasons, and perceptions. Acta Psychol. 2026; 267(December 2025): 107007.
Salig LK, Valdés Kroff JR, Slevc LR, et al.: Hearing a code-switch increases bilinguals’ attention to and memory for information. J. Mem. Lang. 2025; 143(April): 104647.
Schächinger Tenés LT, Weiner-Bühler JC, Grob A, et al.: How to juggle languages: Verbal short-term memory as a key predictor of code-switching in dual language learning 3- to 6-year-olds. Cogn. Dev. 2025; 73(January): 101543.
Singh S, Singh M, Kadyan V: HiACC: Hinglish adult & children code-switched corpus. Data Brief. 2025; 62: 111886.
Waluyo B, Khan NM: Exploring translanguaging behaviors in digital storytelling: A mixed-methods study in Thai higher education. Social Sciences and Humanities Open. 2026; 13(September 2024): 102655.
Zhou L, Fu Z: Stylometric characteristics of code-switched offensive language in social media. Inf. Manag. 2025; 62(6): 104153.

Competing interests

No competing interests were disclosed.

Grant information

The author(s) declared that no grants were involved in supporting this work.

Copyright

© 2026 Rage AN. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite this article

Rage AN. The Effect of Code Switching (Somali–Arabic) on Narrative Coherence in Contemporary Somali Short Stories: A Study of Mogadishu Based Literary Works [version 1; peer review: awaiting peer review]. F1000Research 2026, 15:997 (https://doi.org/10.12688/f1000research.183207.1)
