Systematic Review

The Effects of Artificial Intelligence-Based Interventions on Depression and Anxiety: A Systematic Review

[version 1; peer review: awaiting peer review]
Previously titled: The Effects of Artificial Intelligence-Based Intervention on Depression and Anxiety: A Systematic Review
PUBLISHED 18 Jun 2026

This article is included in the Artificial Intelligence and Machine Learning gateway.

Abstract

Introduction

Depression and anxiety remain major global mental health challenges that continue to increase across populations. Conventional treatments are often limited by cost, accessibility, stigma, and the availability of professionals. Artificial intelligence (AI)-based interventions have emerged as a potential approach to address these gaps. However, the growing body of evidence across diverse contexts calls for further synthesis. This study aims to examine research characteristics, evaluate effects, and analyse the implementation issues of AI-based interventions for depression and anxiety.

Methods

This systematic review was conducted in accordance with the PRISMA guidelines. Fourteen randomised controlled trials (RCTs) published between 6 November 2015 and 6 November 2025 were identified from four databases: Scopus, Web of Science, PubMed, and EBSCO. Study quality was assessed using the Cochrane Risk of Bias 2 tool, and findings were synthesised using a narrative approach.

Results

The findings indicate that AI-based interventions, such as chatbots, large language models, and integrated platforms, generally demonstrate effects in reducing symptoms of depression and anxiety across various populations. However, results remain heterogeneous, with some studies showing outcome-specific or within-group improvements only. Implementation issues were identified, including limited human support, recruitment bias, and short follow-up periods, which may reduce adherence, generalisability, and the assessment of long-term effects.

Conclusions

AI-based interventions may offer accessible and scalable mental health solutions, with outcomes comparable to conventional care in certain contexts. However, their effects are shaped by implementation-related challenges, including variability in engagement, technological limitations, and ethical considerations. Future research should prioritise more standardised methodologies, longer intervention durations with follow-up, and greater attention to implementation design and sustainability.

Systematic Review Registration

Registered in PROSPERO on 16 February 2026 (Registration number CRD420261308648). Available from: https://www.crd.york.ac.uk/PROSPERO/view/CRD420261308648.

Keywords

anxiety, artificial intelligence, depression, interventions, systematic review

1. Introduction

Depression and anxiety have become two of the most pervasive global mental health challenges, with prevalence rates continuing to rise across age groups and regions.1 These conditions not only diminish quality of life but are also associated with increased risks of chronic illness,2 impaired social functioning,3 and a substantial economic burden on healthcare systems.4 Conventional treatment approaches, such as face-to-face therapy and psychiatric services, are often constrained by the limited availability of mental-health professionals, high costs, stigma, and geographical barriers.1,5 These constraints underscore the urgent need for innovative strategies to expand the reach, accessibility, and effects of mental health interventions.6

Depression and anxiety are mental disorders that contribute to a significant portion of the global disease burden.7 The National Health Interview Survey shows that one in five adults experienced symptoms of depression (21.4%) and anxiety (18.2%) during the past two weeks.8 These disorders are caused by multiple factors, including biological, psychological, and social factors.9 These disorders often co-occur with physical problems, such as chronic physical pain, migraines, insomnia, low pain tolerance, extreme fatigue, and worsening physical and mental conditions.10 Conventional therapies such as Cognitive Behavioural Therapy (CBT) and medications such as antidepressants and anxiolytics are often used as treatment strategies.11 However, stigma, high costs, limited availability of mental health services, and long waiting times often lead individuals to seek self-help.12 To address this gap, AI offers 24/7 services, anonymity, and low costs. Through integration with CBT approaches, AI can help track mood, provide psychoeducation, and develop problem-solving skills through conversational interactions that mimic human interaction.13

Artificial Intelligence (AI)-based interventions have emerged as potential solutions to address the limitations of traditional mental-health services.14 These technologies can take various forms, including mobile applications, text-based chatbots, conversational agents, and even passive digital-behavior monitoring systems that help detect early signs of psychological stress.15 Studies have shown that AI-driven tools can deliver emotional support, psychoeducation, and cognitive-behavioral exercises in a consistent, scalable, and personalized manner.16 In cases of depression and anxiety, technology-based interventions are used to deliver more interactive and empathetic digital Cognitive Behavioural Therapy (CBT), such as AI chatbots (Therabot, ChatGPT, Psy-Bot, Woebot), Facebook Messenger, and mobile health applications (TEO).17–21 Early evidence indicates that certain AI-based approaches can produce clinically meaningful improvements and, in some contexts, perform comparably to or even better than conventional interventions,6,14,22 providing strong justification for further scientific investigation.

However, although previous studies have shown that AI-based conversational agents have a significant impact on reducing symptoms of depression and emotional distress,6,16,18,23 further research is needed to synthesise the effects of AI in reducing depression and anxiety. A systematic review conducted by Joshi et al.1 highlights AI-based interventions for anxiety and depression involving individuals with psychological problems as the population. However, the article search was conducted only through 2024 and included articles not indexed in Scopus. A systematic review of AI Chatbots was also conducted by Nyakhar & Wang,13 which focused on improving students’ psychological well-being, including anxiety and depression. However, there has been no comprehensive synthesis evaluating the effects of AI-based interventions in simultaneously reducing depression and anxiety in various populations. This study included research articles published in reputable Scopus-indexed journals, indicating high scientific quality.

The rapid and widespread integration of AI into digital health systems worldwide has accelerated its development. As AI-based mental health tools become increasingly integrated into telehealth platforms, they assist with patient monitoring, technology-enabled healthcare, diagnostic support, and data analysis,24 thereby enhancing their clinical impact. Although previous reviews have highlighted the effects of AI for depression and anxiety,13,25 a new systematic review across diverse populations and contexts is needed to assess the effects of AI-based interventions.26 This systematic review aims to determine the effects of AI-based interventions in reducing depression and anxiety, as well as to examine the implementation issues associated with these interventions.

2. Methods

This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.27 Page et al.27 state that PRISMA reflects advances in methods for identifying, selecting, appraising, and synthesising studies. This study applied the PRISMA guidelines through four stages: (1) identifying research questions, (2) identifying literature sources, (3) conducting a literature search that answered the research questions, and (4) analysing the findings. The protocol was registered in PROSPERO (International Prospective Register of Systematic Reviews; CRD420261308648).

2.1 Identifying research questions (RQ)

This systematic review addresses three research questions on the effects of AI-based interventions on anxiety and depression:

RQ 1. What are the effects of AI-based interventions in reducing depression and anxiety?

RQ 2. What are the characteristics and patterns of research on AI-based interventions for anxiety and depression in the last 10 years?

RQ 3. What are the implementation issues of AI-based interventions for depression and anxiety?

2.2 Identifying literature sources

Eligibility criteria were developed based on the PICOS (Population, Intervention, Comparator, Outcome, and Study design) framework.28 The population was the general population, including students, parents, patients, adults, and workers. Depression and anxiety occur across demographic groups and are not limited to specific clinical groups, so this broad definition allows the effects of AI interventions to be evaluated in a range of real-world contexts and for a range of users. The intervention criteria covered AI-based interventions, including AI chatbots, ChatGPT, and AI-based platforms, to distinguish the effects of AI from those of conventional interventions. The comparator was a non-AI intervention or an alternative AI-based design used as a control condition. Outcome measures were anxiety and depression, as these two conditions are the most common psychological disorders and are most often targeted by AI-based interventions in the literature. Only randomised controlled trials (RCTs) were included, to maximise internal validity. Studies were excluded if they addressed only one of depression or anxiety; evaluated interventions that were not AI-based, such as other digital or conventional interventions; used cross-sectional, quasi-experimental, or other non-RCT experimental designs; were case reports or non-empirical articles; or were written in languages other than English.

2.3 Conducting a literature search that answers the research questions

The literature search was conducted across four main databases: Scopus, CINAHL (EBSCO), Web of Science (WoS), and MEDLINE (PubMed), all of which are indexed in Scopus. The search covered the last 10 years (6 November 2015–6 November 2025) to ensure that the included studies remain relevant. It targeted articles matching the PICOS framework: population (general), intervention (AI-based), outcomes (depression and anxiety), and study design (experimental, RCT). The Boolean search string was (“artificial intelligence” OR AI) AND (depression) AND (anxiety) AND (“randomised controlled trial”). The search was limited to English-language articles. The inclusion and exclusion criteria are described in Table 1.

Table 1. Inclusion and exclusion criteria.

Inclusion | Exclusion
Articles published between 6 November 2015 and 6 November 2025 | Articles published outside the range of 6 November 2015–6 November 2025
Articles written in English | Articles not written in English
Articles discussing anxiety and depression simultaneously | Articles that do not discuss anxiety and depression simultaneously
Study design using Randomised Controlled Trial (RCT) | Study design does not use a Randomised Controlled Trial (RCT)
Keywords using Boolean operators (“artificial intelligence” OR AI) AND (depression) AND (anxiety) AND (“randomised controlled trial”) | Articles other than those appearing in the Boolean search
Full-paper articles are available | Full-text articles are not available
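
The criteria in Table 1 can be read as a simple screening rule. The following is a minimal sketch in Python, not the procedure used by the reviewers (who screened in Rayyan); the `Record` class and its field names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Search window as stated in Table 1.
WINDOW_START = date(2015, 11, 6)
WINDOW_END = date(2025, 11, 6)


@dataclass
class Record:
    """Hypothetical screening record; field names are illustrative only."""
    published: date
    language: str
    outcomes: frozenset       # e.g. frozenset({"depression", "anxiety"})
    design: str               # e.g. "RCT", "quasi-experimental"
    full_text_available: bool


def exclusion_reasons(rec: Record) -> list:
    """Return the Table 1 criteria a record fails (empty list = eligible)."""
    reasons = []
    if not (WINDOW_START <= rec.published <= WINDOW_END):
        reasons.append("outside date range")
    if rec.language.lower() != "english":
        reasons.append("not in English")
    # Both outcomes must be addressed in the same article.
    if not {"depression", "anxiety"} <= rec.outcomes:
        reasons.append("does not address both outcomes")
    if rec.design != "RCT":
        reasons.append("not an RCT")
    if not rec.full_text_available:
        reasons.append("full text unavailable")
    return reasons
```

Returning the list of failed criteria, rather than a single yes/no flag, makes it straightforward to tally exclusion reasons for a PRISMA flow diagram.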

Study selection was conducted by five reviewers (NMAPP, IGAAIRPD, RSA, KPNT, and MWNTN) using Rayyan AI. Articles retrieved from the four databases were imported into Rayyan and automatically deduplicated. Four reviewers (NMAPP, KPNT, IGAAIRPD, and MWNTN) conducted the initial screening, assessing abstracts against the PICOS framework. This was followed by full-text screening, yielding 14 final articles. When disagreements arose, the senior reviewer (RSA) made the final decision on which articles were included.

2.4 Analysing the findings

Data were extracted using NotebookLM alongside manual extraction by the author. For each of the 14 selected articles, the following were extracted: publication information (author name, year of publication, country, study design, sample size, Scopus quartile), population characteristics, sampling techniques, interventions (duration and type), and findings. Data synthesis in this study was conducted using a narrative approach to interpret and integrate the findings, due to the conceptual and methodological diversity of the included studies.29 The synthesis involved organizing study characteristics and results into comparative tables, identifying similar and recurring patterns, and examining relationships among interventions across the literature.29

Findings were grouped according to the three research questions: the characteristics and patterns of research on AI interventions for anxiety and depression over the last 10 years; the effects of AI-based interventions; and the implementation of AI-based interventions for anxiety and depression. The extracted data are presented in tables and descriptive narratives. Reliability was supported by checking each article against the inclusion and exclusion criteria, conducting systematic selection using the PRISMA guidelines, extracting data using a standard format, and providing direct citations for all articles. The 14 selected papers are described in Table 2.

Table 2. 14 Selected papers.

Citation | Author name and year | Publisher | Scopus quartile
32 | (Akdogan et al., 2025) | Elsevier | Q1
38 | (Chen et al., 2025) | JMIR Publications | Q1
17 | (Wang et al., 2025) | JMIR Publications | Q1
36 | (Xu & Ma, 2025) | Elsevier | Q1
18 | (Heinz et al., 2025) | NEJM AI | Q1
33 | (Sharp et al., 2025) | JMIR Publications | Q1
37 | (Gan et al., 2025) | Wolters Kluwer | Q1
35 | (Zhao et al., 2024) | Wiley | Q1
19 | (Karkosz et al., 2024) | JMIR Publications | Q2
6 | (Sadeh-Sharvit et al., 2023) | JMIR Publications | Q1
20 | (Suharwardy et al., 2023) | Elsevier | Q2
21 | (Danieli et al., 2022) | JMIR Publications | Q1
40 | (Klos et al., 2021) | JMIR Publications | Q2
34 | (Fulmer et al., 2018) | JMIR Publications | Q1

2.5 Risk of bias

The risk of bias in the selected studies was assessed using the Cochrane Risk of Bias 2 (RoB 2), which is designed to assess randomised controlled trials (RCTs). The assessment was conducted on five main domains, namely: (D1) bias from the randomisation process, (D2) bias due to deviations from the planned intervention, (D3) bias due to missing outcome data, (D4) bias in outcome measurement, and (D5) bias in the selection of reported results. For each domain, each study was then categorised as low risk, some concerns, or high risk.30

The classification was determined based on the signalling questions in RoB 2, including the information available in the research report. Then, the overall risk-of-bias assessment was conducted in accordance with the official RoB 2 guidelines, ensuring that the final decision remained consistent and accountable. To maintain consistency in the assessment, the assessment process was carried out systematically by recording the reasons behind each decision in each domain (e.g., whether the randomisation procedure was described, whether there was potential for intervention deviation, and so on). If unclear information was found in an article, it was noted as a consideration in determining the relevant domain category, in accordance with the principle of caution in RoB 2.31
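
The step from domain-level judgements to an overall judgement can be sketched as follows. This is a simplified illustration based on the general RoB 2 rule (low only if every domain is low; high if any domain is high), not the reviewers' actual worksheet. The `concerns_substantial` flag is an assumption introduced here to stand in for the reviewer's judgement that some concerns across several domains substantially lower confidence in the result.

```python
from enum import Enum


class Judgement(Enum):
    LOW = "low risk"
    SOME_CONCERNS = "some concerns"
    HIGH = "high risk"


# The five RoB 2 domains named in this section.
DOMAINS = ("D1", "D2", "D3", "D4", "D5")


def overall_risk(domains: dict, concerns_substantial: bool = False) -> Judgement:
    """Derive an overall judgement from the five domain-level judgements.

    `concerns_substantial` is a hypothetical flag for the reviewer's call that
    concerns in multiple domains substantially lower confidence in the result.
    """
    missing = [d for d in DOMAINS if d not in domains]
    if missing:
        raise ValueError(f"missing domain judgements: {missing}")

    ratings = [domains[d] for d in DOMAINS]
    if Judgement.HIGH in ratings:
        return Judgement.HIGH

    n_concerns = ratings.count(Judgement.SOME_CONCERNS)
    if n_concerns > 1 and concerns_substantial:
        return Judgement.HIGH
    if n_concerns >= 1:
        return Judgement.SOME_CONCERNS
    return Judgement.LOW
```

Keeping the reviewer's judgement as an explicit argument, rather than a fixed threshold on the number of domains, reflects that this part of the assessment is not mechanical.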

3. Results

RQ 1. What are the effects of AI-based interventions in reducing depression and anxiety?

Artificial intelligence (AI) holds potential for reducing symptoms of depression and anxiety due to its accessibility and complementary role in conventional care. Various AI tools, such as Large Language Model (LLM)-based agents, chatbots, and mobile applications, have shown effects in reducing symptoms of depression and anxiety.26 Although AI shows significant effects in addressing mental health issues, it is not a substitute for professionals or therapists; rather, it is a complement, and its safety and long-term effects still need to be considered.26 The effects of AI reported in the 14 studies included in this systematic review are outlined in Table 4.

The effects of artificial intelligence-based interventions for depression and anxiety

Most studies suggest that AI-based interventions may help reduce symptoms of depression and anxiety, although the evidence remains heterogeneous.6,18,32–36 Several controlled trials reported improvements in both outcomes; for example, ChatGPT-4.0 in digital counselling for cancer patients was associated with significant reductions in depression and anxiety compared with a control group.32 However, findings were not consistent across all studies. Some interventions were effective for only one outcome: Psy-Bot reduced depression but not anxiety,17 while ChatGPT in preoperative education reduced anxiety but not depression.37 In addition, several studies reported improvements only within groups, with no significant differences compared with control conditions.21,38 Overall, these findings indicate that AI-based interventions show promise, but their effects appear to vary depending on population, intervention type, and study context.

Individual outcomes

Primary outcomes

The primary focus of this study was to measure the effects of AI interventions in reducing symptoms of mental disorders. The main outcomes were therefore: (1) reduction in depression symptoms, measured using clinical scales such as the Patient Health Questionnaire-9 (PHQ-9), Centre for Epidemiologic Studies Depression Scale (CES-D), Hospital Anxiety and Depression Scale (HADS-depression), and Edinburgh Postnatal Depression Scale (EPDS); (2) reduction in anxiety symptoms, measured using instruments such as the Generalised Anxiety Disorder-7 (GAD-7), HADS-Anxiety, State-Trait Anxiety Inventory (STAI), and measures of perioperative anxiety; and (3) clinically meaningful changes, assessed by whether AI-based tools can provide improvements in symptoms that are equal to or greater than those achieved with conventional interventions.

Secondary outcomes

Secondary outcomes extended beyond the reduction of depression and anxiety symptoms, providing additional insights into improvements in life satisfaction and general well-being, as reported by 14.3% (2/14) of studies. A study conducted by Karkosz et al.19 revealed that the “Fido” application was not only effective in relieving anxiety symptoms but also significantly helped participants feel more satisfied with their daily lives. Other reported outcomes included reduced loneliness, improved mood regulation, and enhanced social functioning, primarily among students following a short-term chatbot intervention.17,19,34,35

RQ 2. What are the characteristics and patterns of research on AI-based interventions for anxiety and depression over the past 10 years?

A systematic literature search was conducted27 using four sources, namely Web of Science, Scopus, PubMed, and EBSCO, yielding 355 articles. Deduplication was performed using Rayyan AI, after which 287 articles underwent title and abstract screening. Of these, 270 articles were excluded for not meeting the inclusion criteria: inappropriate study design (n = 59), irrelevant interventions (n = 124), not focusing on depression and anxiety (n = 39), retracted articles (n = 1), and review articles (n = 47). A total of 17 articles were read in full, of which 3 were excluded because of irrelevant study designs (n = 2) or high risk of bias (n = 1). Thus, 14 studies were included in the final analysis, as illustrated in Figure 1.
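
The counts in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the number of duplicates removed is not stated in the text and is derived here on the assumption that the whole difference between records identified and records screened is due to deduplication.

```python
# Counts as reported in the study selection paragraph.
identified = 355
screened = 287

# Assumption: the gap between identified and screened records is entirely
# due to deduplication (this figure is derived, not reported in the text).
duplicates_removed = identified - screened

screening_exclusions = {
    "inappropriate study design": 59,
    "irrelevant interventions": 124,
    "not focusing on depression and anxiety": 39,
    "retracted articles": 1,
    "review articles": 47,
}
full_text_exclusions = {
    "irrelevant study design": 2,
    "high risk of bias": 1,
}

excluded_at_screening = sum(screening_exclusions.values())
full_text_assessed = screened - excluded_at_screening
included = full_text_assessed - sum(full_text_exclusions.values())

print(f"Duplicates removed (derived): {duplicates_removed}")
print(f"Excluded at title/abstract screening: {excluded_at_screening}")
print(f"Full-text articles assessed: {full_text_assessed}")
print(f"Studies included: {included}")
```

The stage totals reproduce the figures in the text: 270 excluded at screening, 17 assessed in full, and 14 included.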

Figure 1. PRISMA Flow diagram of study selection.

This figure presents the study selection process conducted in accordance with the PRISMA guidelines. It shows the number of records identified, the number screened at the title and abstract stage, the full-text articles assessed, and the final papers included in this review, covering the identification, screening, and inclusion stages.

Study characteristics

This section presents the main characteristics of the studies included in the systematic review, providing an overview of the research context analysed. The presentation of these characteristics is an important component of systematic reviews, helping to understand variations in study design, population, and interventions that underlie the interpretation of results,27 as well as to explain the results of individual studies, as highlighted in previous systematic reviews.39 Table 3 summarises the research design, sample size, population characteristics, and sampling techniques, types of AI-based interventions, duration, and main outcomes reported.

Table 3. Study characteristics.

Citation | Author, year of publication | Study design | Sample size | Population characteristics | Sampling techniques | Intervention type | Duration | Primary outcomes
32 | (Akdogan et al., 2025) | Two-Center RCT | n = 150 (75 control, 75 intervention) | Chemotherapy-naïve cancer patients. Median age: 64 years; 53.3% female | Randomized 1:1 (ChatGPT vs control) | ChatGPT 4.0 | 3 months | Reduction in anxiety (HADS-anxiety) and depression (HADS-depression) scores
38 | (Chen et al., 2025) | Pilot RCT | n = 103 | Parents (general population) | Block randomization | AI Chatbot | 5 months | Reduction of anxiety (GAD-7) and depression (PHQ-9) levels
17 | (Wang et al., 2025) | RCT | n = 100 (50 control, 50 intervention) | University students. Mean age = 20.8; 62% female | Randomized 1:1 (Intervention vs Waitlist) | AI Chatbot named “Psy-Bot” | 7 days | Depression (CES-D), loneliness (UCLA Loneliness scale), and anxiety (GAD-7)
36 | (Xu & Ma, 2025) | Open-label RCT | n = 84 (HSC vs LSC chatbot) | College students; aged 18–28 years; 51.2% male | SPSS random number generator | Neil, an Artificial Intelligence (AI)-driven chatbot | 16 weeks | Reduction in depression (PHQ-9) and anxiety (GAD-7) scores, including WAI-SR and CSQ-8
18 | (Heinz et al., 2025) | RCT | n = 210 (intervention 106, waitlist control group 104) | Mean age 33.86 years; 59.52% female; positive CHR-FED | Computer-generated sequence | Therabot, a text-based multithreaded chat | 4 weeks, with follow-up at 8 weeks | Changes in symptoms of MDD (PHQ-9), GAD (GAD-7), and CHR-FED (WCS)
33 | (Sharp et al., 2025) | Two-arm RCT | n = 60 (intervention 30, control 30) | People on waitlists for eating disorder treatment. Age: ≥ 16 years | Multicenter 2-armed RCT | The ED ESSI chatbot | 4 months and three days | Eating disorder pathology
35 | (Zhao et al., 2025) | RCT | n = 865 (intervention 269, control 388) | Mean age 20.59 years; 61.8% female | Simple randomization | Douyin companion bot | 28 days | Depression, anxiety, positive and negative moods
37 | (Gan et al., 2025) | Single-blind, pilot RCT | n = 55 (intervention 27, control 28) | Patients with knee osteoarthritis. Age: 45–80 years | Single-blind, randomized controlled pilot study | ChatGPT 4.0 | 3 months | Perioperative anxiety and patient satisfaction
19 | (Karkosz et al., 2024) | Two-arm, open-label RCT | n = 81 (intervention 40, control 41) | Participants with subclinical depression or anxiety | Two-arm, open-label RCT | Fido chatbot | 2 weeks intervention and 1 month follow-up | Depression (CESD-R, PHQ-9), anxiety (STAI), worry tendencies (PSWQ), satisfaction with life (SWLS), and loneliness (R-UCLA)
6 | (Sadeh-Sharvit et al., 2023) | RCT | n = 47 consenting adults in total (AI group n = 23; TAU group n = 24) | Adults with depression or anxiety. Mean age = 30.64 years; 72% female | Therapist-level randomization | AI Platform (Eleos Health) | 2 months | Feasibility and acceptability of AI platform, changes in depression (PHQ-9) and anxiety (GAD-7) symptoms
20 | (Suharwardy et al., 2023) | Single center RCT | n = 192 (intervention 96, control 96) | Postpartum women aged ≥18 years; mean age 34 years | Block randomization | Woebot (mental health chatbot) | 6 weeks | Depression measured by PHQ-9 and EPDS
21 | (Danieli et al., 2022) | RCT | n = 60 (SMT-CBT 16, SMT-CBT PHA 16, PHA 14, test only 14) | Active workers with stress and anxiety. Age ≥ 55 years; 78% female | RCT random number generator | Traditional psychotherapy CBT, AI agent, and TEO | 8 weeks | Symptoms related to stress, anxiety, and depression
40 | (Klos et al., 2021) | Pilot RCT | n = 181 (82 control, 99 intervention); completers: 34 control, 39 intervention | College students. Age 18–33 years; 87.2% female | Simple randomization | Tess, an Artificial Intelligence (AI)-based chatbot | 8 weeks | Preliminary data comparison of depression (PHQ-9) and anxiety (GAD-7) symptoms, focusing on viability and acceptability
34 | (Fulmer et al., 2018) | RCT | n = 74 (2 test n = 50, 1 control n = 24) | College students. Mean age 22.9 years; 70% female | Computer-based randomization | Tess, an Artificial Intelligence (AI)-based chatbot | Group 1: 2 weeks; group 2: 4 weeks | Reduction of symptoms of depression (PHQ-9) and anxiety (GAD-7); PANAS also measured

Table 4. The effects of AI-based interventions in reducing depression and anxiety.

Citation | Author, year | Intervention | Comparator | Primary outcomes | Outcome interpretation
32 | (Akdogan et al., 2025) | ChatGPT 4.0 | Standard clinician-led education group | Anxiety (HADS-anxiety) and depression (HADS-depression) | Effective for both outcomes
38 | (Chen et al., 2025) | AI Chatbot | Nurse hotline | Anxiety (GAD-7) and depression (PHQ-9) | Significant within-group
17 | (Wang et al., 2025) | AI Chatbot “Psy-Bot” | Waitlist control | Depression (CES-D), loneliness (UCLA Loneliness scale), and anxiety (GAD-7) | Effective for depression only
36 | (Xu & Ma, 2025) | Neil, AI chatbot (text + voice + animations) | LSC group (text only) | Depression (PHQ-9) and anxiety (GAD-7) | Effective for both outcomes
18 | (Heinz et al., 2025) | Therabot, a text-based multithreaded chat | Waitlist | MDD (PHQ-9), GAD (GAD-7), and CHR-FED (WCS) | Effective for both outcomes
33 | (Sharp et al., 2025) | The ED ESSI chatbot | Web-based information | Eating disorder pathology (EDE-Q), psychosocial impairment (CIA), depression, anxiety, stress (DASS-21) | Effective for both outcomes
35 | (Zhao et al., 2025) | Douyin companion bot | Waiting list group | Depression (PHQ-9), anxiety (GAD-7), positive and negative moods (PANAS) | Effective for both outcomes
37 | (Gan et al., 2025) | ChatGPT 4.0 | Traditional physician explanation | Anxiety/Depression (HADS), Perioperative Apprehension Scale-7 (PAS-7), and Visual Analogue Scales for Anxiety (VAS-A, VAS-P) | Effective for anxiety only
19 | (Karkosz et al., 2024) | Fido chatbot | Self-help book | Depression (CESD-R, PHQ-9), anxiety (STAI), worry tendencies (PSWQ), satisfaction with life (SWLS), and loneliness (R-UCLA) | Both groups improved; null between-group effect
6 | (Sadeh-Sharvit et al., 2023) | AI Platform (Eleos Health) | Treatment as usual | Depression (PHQ-9) and anxiety (GAD-7) symptoms | Effective for both outcomes
20 | (Suharwardy et al., 2023) | Woebot (mental health chatbot) | Usual postpartum care | Depression measured by PHQ-9 and EPDS | Effective for depression only
21 | (Danieli et al., 2022) | AI agent and TEO | Traditional therapy | Stress, anxiety, and depression | Null between-group; some within-group improvements
40 | (Klos et al., 2021) | Tess, (AI)-based chatbot | Psychoeducation book | Depression (PHQ-9) and anxiety (GAD-7) | Null between-group; anxiety decreased within group
34 | (Fulmer et al., 2018) | Tess, (AI)-based chatbot | Information-only | Depression (PHQ-9), anxiety (GAD-7), and PANAS | Effective for both outcomes

Research design

All 14 included studies were randomised controlled trials (RCTs), with variations including two-centre RCTs,32 single-centre RCTs,20 and two-arm RCTs.19,33 Some studies were designed as pilot RCTs,37,38,40 while other studies were described simply as RCTs.6,17,18,21,34,35

Geographic distribution of studies

Fourteen articles published between 2015 and 2025 examined the effects of AI-based interventions in reducing depression and anxiety. The distribution of publications across years was as follows: 2018 (7%), 2021 (7%), 2022 (7%), 2023 (14%), 2024 (7%), and the majority in 2025 (57%). Four studies were conducted in the United States6,18,20,34 and one study was conducted in Argentina.40 Studies in Europe were conducted in Poland19 and Italy.21 Studies in Asia were conducted in Turkey,32 Hong Kong,38 and China.17,35–37 Oceania was represented by one study in Australia.33 Figure 2(A) presents the distribution of publication years for the 14 selected articles from 2015 to 2025, and Figure 2(B) illustrates the geographical distribution of studies on AI-based interventions for depression and anxiety over the same period.

Figure 2. (A) Distribution of studies by publication year.

This figure illustrates the distribution of the 14 studies included in the review by year of publication between 2015 and 2025. (B) Geographical distribution of studies. This figure shows the countries in which the included studies on AI-based interventions for depression and anxiety were conducted.

Sample size and demographics

The sample sizes across the 14 studies ranged from small to moderate. One study had a moderate-sized sample of more than 500 participants (n = 865),35 while the other studies involved fewer than 500 participants. Gender distribution varied across studies, with most studies reporting a majority of female participants. The age range spanned from adolescents and young adults (university students) to adults and the elderly. The populations were heterogeneous and included students, patients in specific medical contexts (e.g., cancer, knee osteoarthritis, postpartum care), individuals with specific psychological problems (eating disorders, depression and anxiety, and work-related stress), and general populations such as parents, adults, and workers.

AI intervention

These digital interventions take various forms and are designed to address the limitations of traditional mental health services. The identified digital interventions include: (1) chatbots and conversational agents, which are the most common form, including text-based applications such as AI Chatbot, Tess, Woebot, Psy-Bot, Fido, Therabot, and ED ESSI17–20,33,34,36,38,40; (2) large language models (LLMs), such as ChatGPT (version 4.0), used as digital counselling agents or companions to provide medical information and emotional support32,35,37; (3) integrated AI platforms, such as the Eleos Health system, which supports conventional therapy by monitoring patient progress and improving therapist efficiency6; and (4) passive behaviour monitoring systems, which detect early signs of psychological stress through passive digital behaviour tracking.21 The visualisation of interventions from the selected articles is presented in Figure 3.

Figure 3. Type of AI-based interventions.

This figure illustrates the types of AI-based interventions identified in the 14 studies included in the review.

Duration

The duration of AI-based tool use in the review was categorised into three time frames. (1) Short term (7 days to 4 weeks): interventions designed to provide rapid emotional support or triage; for example, Psy-Bot was used for 7 days, Tess for 3–4 weeks, and LLM-based chatbots for 28 days. Interventions using Socratic questioning and Therabot were also conducted over 2–4 weeks. (2) Medium term (6 weeks to 3 months): typically used to assess more stable clinical effects; for example, Woebot was used for 6 weeks, Tess and the TEO platform for 8 weeks, and the Eleos Health platform for 2 months. ChatGPT 4.0 in medical contexts (e.g., cancer and orthopaedic patients) was generally used for 3 months. (3) Long term (more than 4 months): more complex or monitoring-based interventions with longer durations; for example, the Neil chatbot (16 weeks) and the ED ESSI chatbot (over 4 months). The duration of interventions from the 14 selected articles is visualised in Figure 4.

Figure 4. AI Intervention duration.

This figure shows the duration of AI-based interventions reported in the 14 studies included in the review.

Risk of bias

Figure 5(B) summarises the risk-of-bias assessment for the 14 trials included in this review. Overall, 9 studies (64.3%) were classified in the “some concerns” category, 3 studies (21.4%) were assessed as low risk, and 2 studies (14.3%) were judged to have a high risk of bias. These findings indicate that although the available evidence is generally promising, several studies still present methodological limitations, so their results should be interpreted with caution.
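The percentages above follow directly from the reported counts. As a minimal sketch for readers who wish to reproduce the arithmetic, the Python snippet below recomputes the overall distribution. The function name `rob_distribution` and the list `judgements` are illustrative assumptions and are not part of the review's methods; only the counts (9, 3, and 2 of 14) are taken from the text.

```python
from collections import Counter

# Overall RoB 2 judgements as reported in the text:
# 9 "some concerns", 3 "low", 2 "high" (14 trials in total).
# The list itself is an illustrative reconstruction, not study-level data.
judgements = ["some concerns"] * 9 + ["low"] * 3 + ["high"] * 2


def rob_distribution(labels):
    """Return {judgement: (count, percentage rounded to one decimal place)}."""
    total = len(labels)
    counts = Counter(labels)
    return {label: (n, round(100 * n / total, 1)) for label, n in counts.items()}


distribution = rob_distribution(judgements)
for label, (n, pct) in distribution.items():
    print(f"{label}: {n}/{len(judgements)} = {pct}%")
```

Because each percentage is rounded separately, the rounded values need not always sum to exactly 100%; for these counts they do.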

Figure 5. (A) Summary of risk of bias assessments for individual studies. (B) Distribution of risk of bias across studies.

Panel A summarises the risk of bias assessments for each included study across the evaluated domains. Panel B illustrates the proportion of studies rated as low risk, some concerns, or high risk across each risk of bias domain and overall risk of bias.

Across domains, the most notable limitation was bias arising from deviations from intended interventions (D2), which was the only domain contributing to the high-risk ratings in this review. In contrast, bias due to missing outcome data (D3) was less problematic, with most studies classified as low risk in this domain. For the remaining domains—bias arising from the randomisation process (D1), bias in measurement of the outcome (D4), and bias in selection of the reported result (D5)—the most common judgement was “some concerns”, generally reflecting incomplete or unclear reporting of methodological procedures rather than clear evidence of serious bias. The detailed distribution of risk-of-bias judgements is presented in Figures 5(A) and 5(B).

RQ 3. What are the implementation issues of AI-based interventions for depression and anxiety?

Most studies employed passive control conditions (e.g., usual care or waitlist), which may limit the ability to control for placebo effects. Because only a small number of studies used active control groups (e.g., psychoeducation, books, or nurse hotlines),19,38,40 the inferential strength of the evidence may be reduced. Moreover, only one study involved a therapist in a face-to-face setting when delivering AI-based CBT,6 while another study involved direct responses from a physician.37 Human support, however, appears to play an important role in influencing adherence and intervention effects.

Digital and social media–based recruitment methods tend to attract self-selected, technologically literate populations. As a result, there is a risk of selection bias, whereby the findings may not fully represent populations with lower digital literacy or those with limited access to devices due to economic constraints. In addition, study samples are often drawn from a single population segment. For example, postpartum studies may include only women without severe depression, while eating disorder studies may recruit only adolescents on waiting lists, thereby limiting generalisability. Furthermore, follow-up periods are relatively short, typically ranging from 2 to 8 weeks, making it difficult to assess long-term effects.

Heinz et al.18 highlighted another issue related to engagement, characterised by a decline in user participation over time (i.e., low retention). This pattern is often attributed to a “novelty effect”, which diminishes after the initial sessions. In addition, Sharp et al.33 indicated challenges in integrating chatbots into standard clinical care systems, particularly for patients on waiting lists. Other findings suggest that the transition from rule-based chatbots to those based on large language models (LLMs), such as ChatGPT, introduces new challenges related to personalisation and safety. Although generative AI enables more natural interactions, concerns regarding data privacy and the potential for medical “hallucinations” remain prominent in implementation within formal healthcare settings.32

4. Discussion

This review suggests that AI-based interventions have the potential to reduce symptoms of depression and anxiety in both general and clinical populations. Several studies reported short-term symptom improvement, indicating that AI may be considered a supportive tool in mental health services.6,18,34,38 However, these findings should be interpreted with caution due to the limited number of studies and the substantial heterogeneity observed. While some interventions demonstrated greater symptom reduction compared to standard care, others reported non-significant results. For example, text-based interventions did not consistently provide additional benefits compared to established self-help approaches.19

Design factors play an important role in determining intervention effects. Approaches that incorporate richer social cues, such as voice or visual elements, tend to produce better outcomes than purely text-based approaches.35,38 The sustainability of intervention effects also warrants attention. Several studies have reported a decline in effects during follow-up periods, particularly in the absence of human support.17 In addition, technical limitations, such as repetitive responses and failures in user intent recognition, may disrupt the therapeutic alliance and reduce user adherence.19

This review has several limitations that should be considered. The relatively small number of included studies, combined with substantial heterogeneity in intervention types, study designs, and outcome measures, limits the generalisability of the findings. In particular, the reviewed studies encompassed a wide range of AI approaches, including rule-based chatbots such as Tess, large language models (LLMs) such as ChatGPT-4.0 and Therabot, AI platforms that support clinical practice such as Eleos Health, and passive behavioural monitoring systems that detect or estimate psychological conditions through digital behavioural data. These differences suggest that each technology operates through distinct mechanisms, varies in its level of autonomy, and is applied to different clinical purposes. Accordingly, the core components contributing to intervention effects may not be consistent across studies. As a result, differences in effect outcomes between studies may not solely reflect whether an intervention is effective, but also the diversity of technologies being evaluated. Therefore, this review is more appropriately understood as an examination of diverse forms of AI-based mental health interventions, rather than an evaluation of a single uniform AI model.

Despite the potential of AI-based interventions to reduce symptoms of depression and anxiety, their real-world implementation remains constrained by several practical challenges. User engagement is a recurring concern, as many interventions demonstrate strong short-term outcomes but declining adherence over time, suggesting a novelty effect and limited sustained interaction. In addition, technological limitations, particularly in simpler chatbot systems, such as repetitive responses, limited contextual understanding, and failures in intent recognition, may weaken user trust and reduce the quality of the therapeutic experience. These issues highlight that effects observed under controlled conditions do not always translate directly into consistent real-world use. Another limitation of the statistical findings in most of the studies was the absence of reported confidence intervals, which limits how precisely effect sizes can be interpreted. In addition, recruitment procedures lacked standardisation, allowing for the possibility of confounding variables that may have influenced the findings. Although all 14 studies were published in Scopus Q1–Q2 indexed journals, their findings should still be interpreted with caution due to methodological limitations, as the majority were rated as having “some concerns” regarding risk of bias. Additionally, two of the fourteen studies were classified as having a high risk of bias. Therefore, future research is recommended to employ more rigorous selection and randomisation procedures in order to strengthen the findings.

Beyond technical and behavioural factors, implementation is further shaped by clinical, ethical, and contextual constraints. The integration of AI into existing healthcare workflows remains limited, with unclear role definitions between AI systems and human practitioners, often positioning AI as a supplementary rather than a fully embedded tool. At the same time, concerns related to data privacy, clinical safety, and accountability persist, particularly in high-risk situations where AI may not adequately respond to severe psychological distress. Furthermore, the predominance of studies conducted in digitally literate populations raises questions about generalisability across broader and more diverse contexts. These findings suggest that successful implementation depends not only on technological capability but also on sustained engagement design, ethical safeguards, and alignment with clinical practice.

5. Strengths, limitations, and recommendations for future studies on AI-based interventions for anxiety and depression

5.1 Strengths of this review

This systematic review has several clear strengths. First, it follows PRISMA guidelines, making the review more transparent, organised, and easier to trace. Second, it deliberately focuses only on RCTs, yielding higher-quality evidence than a mix of study designs would. Third, it does not address mental health in general but specifically examines AI interventions that target two outcomes simultaneously: depression and anxiety. Fourth, the scope of AI interventions is broad, ranging from chatbots and large language models to integrated AI platforms, machine learning-based prediction systems, and passive behaviour monitoring. This prevents the interpretation of results from being confined to a single technology type. Finally, this review does not stop at summarising the results; it also includes a formal assessment of study quality using the Cochrane Risk of Bias 2 (RoB 2) tool. Assessing potential bias across five domains allows the reported results to be read more fairly, making it easier to identify which findings are strong and which require more careful interpretation. These strengths are summarised in Figure 6.

Figure 6. Research strengths.

This figure presents the key strengths identified across the 14 studies included in the review.

5.2 Study limitations

Although this review has several strengths, it also has several limitations that need to be acknowledged. First, the included studies show considerable heterogeneity in the types of AI interventions, exposure durations, outcome measurement instruments, and participant characteristics. Second, although all studies used an RCT design, implementation quality was not always consistent. The risk of bias assessment shows that most trials remain in the “some concerns” category, and a small number are at high risk. Third, the scope for generalising the findings also appears to be limited. The majority of trials were conducted in middle- and high-income countries, with participants who tended to be younger, mostly female, and digitally literate. Fourth, most studies had relatively short to medium follow-up periods, so the sustainability of the effects has not been fully addressed. Finally, this review included only English-language publications, so there may be a language bias, and some relevant studies in other languages may not have been accessible. These limitations are illustrated in Figure 7.

Figure 7. Limitations of the research.

This figure summarises the key limitations identified across the 14 studies reviewed.

5.3 Recommendations for future research

Several recommendations were made in the studies included in this review. First, future research should involve larger samples and more diverse populations. Second, the effects of conventional therapy should be compared with those of technology-based therapies that complement face-to-face care, such as virtual reality, teletherapy, website-based therapy, and other AI interventions. Third, longer intervention durations should be accompanied by follow-up sessions to assess the intervention’s long-term effects. Fourth, attention should be paid to participant safety and to testing intervention effects in accordance with strict protocols. Fifth, interventions should be provided for higher-risk and more severe clinical disorders. Sixth, the sustained impact of the interventions should be explored. Finally, human involvement alongside AI should be increased to enhance treatment impact, user satisfaction, and intervention usefulness. Recommendations from the 14 included articles are summarised in Figure 8.

Figure 8. Future Research.

This figure explains the recommendations for future research based on the evidence identified in the 14 included studies.

6. Conclusion

AI-based interventions show potential for reducing symptoms of depression and anxiety; however, the current evidence remains preliminary and heterogeneous. The reviewed studies varied substantially in intervention type, study design, population, and implementation context, and several raised concerns regarding risk of bias. Accordingly, the findings should not be interpreted as evidence that AI is broadly superior to standard care. Rather, AI appears to be a potentially useful supportive approach, with effects dependent on context, therapeutic design, and implementation quality. Future studies should employ more rigorous and standardised methods, include more diverse populations, and report long-term, safety, and implementation outcomes more clearly.

how to cite this article
Puspitarani NMAP, Devi IGAAIRP, Ngey MWNT et al. The Effects of Artificial Intelligence-Based Interventions on Depression and Anxiety: A Systematic Review [version 1; peer review: awaiting peer review]. F1000Research 2026, 15:964 (https://doi.org/10.12688/f1000research.181969.1)
NOTE: If applicable, it is important to ensure the information in square brackets after the title is included in all citations of this article.