Popularity and diversity: The negative relationship in baby names in the United Kingdom

Yuji Ogihara

doi:10.12688/f1000research.162476.2


Brief Report

Revised

[version 2; peer review: 3 approved with reservations, 1 not approved]

PUBLISHED 14 Jul 2025

Department of Psychology, College of Education, Psychology and Human Studies, Aoyama Gakuin University, Shibuya, Tokyo, 150-8366, Japan

Yuji Ogihara
Roles: Conceptualization, Data Curation, Formal Analysis, Funding Acquisition, Investigation, Methodology, Project Administration, Validation, Visualization, Writing – Original Draft Preparation, Writing – Review & Editing

Abstract

Background

Previous research has shown that popular names have become less popular over time. Simultaneously, accumulated evidence has indicated that names have become more diverse. However, the association between these two phenomena was unclear. This association should be revealed for a better understanding of names and naming practices. Therefore, this study investigated the relationship between the popularity and diversity of names.

Methods

I analyzed the data provided in a previous study in the U.K., which included complete records of all live births between 1996 and 2016 (N = 12,985,140).

Results

I found that the correlations between diversity and popularity indicators were highly negative, showing that they are conceptually strongly related. This means that when diversity is high, popularity is low.

Conclusions

Based on this study, we can predict one indicator from the other. Because raw data on names are generally difficult to collect, this prediction is useful for understanding names and naming practices.

Keywords

popularity, diversity, name, commonality, distribution

Corresponding author: Yuji Ogihara

Competing interests: No competing interests were disclosed.

Grant information: This work was supported by Japan Society for the Promotion of Science (JSPS KAKENHI; Grant Number: 19K14368).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2025 Ogihara Y. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Ogihara Y. Popularity and diversity: The negative relationship in baby names in the United Kingdom [version 2; peer review: 3 approved with reservations, 1 not approved]. F1000Research 2025, 14:424 (https://doi.org/10.12688/f1000research.162476.2) First published: 11 Apr 2025, 14:424 (https://doi.org/10.12688/f1000research.162476.1) Latest published: 24 Dec 2025, 14:424 (https://doi.org/10.12688/f1000research.162476.3)

Revised Amendments from Version 1

Following the reviewers’ suggestion, I have added an explanation of a limitation. Moreover, I have included an additional explanation of the two indicators.

See the author's detailed response to the review by Han-Wu-Shuang Bao
See the author's detailed response to the review by Stephen J. Bush

Introduction

Popularity of names has decreased

Previous research has indicated that popular names¹ have become less popular over time, suggesting that popularity of names has decreased (for a review, see Ogihara, 2025). For example, the rates of popular names decreased in the United States between 1880 and 2007 (Twenge et al., 2010). In that study, the rates of the top 1, 10, 25, and 50 most popular names were calculated each year, and their historical changes were analyzed (also see Twenge et al., 2016). Furthermore, Bush et al. (2018) demonstrated that the rates of popular names (the top 1 and 10 most popular names) decreased in the United Kingdom (England and Wales) between 1996 and 2016 (also see Bush, 2020). Similar trends were reported in Germany (Gerhards & Hackenbroch, 2000) and France (Mignot, 2022).

Not only in the West (Europe and North America) but also in the East (Asia), this trend has been observed. For instance, Ogihara et al. (2015) showed that the rates of popular names (the top 1, 10, 20, and 50 most popular names) decreased in Japan between 2004 and 2013 (also see Ogihara, 2022). This shift has been consistently reported (Ogihara, 2021; Ogihara & Ito, 2022; for a review, see Ogihara, 2025). In China (Bao et al., 2021) and Indonesia (Kuipers & Askuri, 2017), the popularity of names decreased as well.

Diversity of names has increased

At the same time, emerging evidence has shown that the diversity of names has increased over time, showing that names have become more diverse. For example, Bush et al. (2018) indicated that the ratio of unique (distinctive) names (the relative value of name variety) increased in the U.K. between 1838 and 2016. Moreover, this trend was observed in the U.S. between 1880 and 2017 (He, 2020; also see Mignot (2022) for a similar report in France).

Taken together, accumulated research has indicated that names have become less popular and more diverse. Bush et al. (2018) have shown that these two phenomena were simultaneously observed in the U.K. between 1996 and 2016, analyzing the same dataset.

The relationship between popularity and diversity is unclear

However, the relationship between these two phenomena is unclear. Even though these two phenomena were reported within the same study (Bush et al., 2018; Mignot, 2022), their relation was not directly investigated. Based on the meaning of the concepts (popularity and diversity), they are predicted to be negatively correlated. In other words, when the ratio of popular names is high, the diversity is expected to be relatively low. Similarly, when the ratio of popular names is low, the diversity is expected to be relatively high. Nevertheless, this prediction was not empirically tested. It is possible that even when the ratio of popular names is high, the diversity can also be high. For example, a population can be polarized, where some people give popular names, while others can give many varieties of unpopular names, leading to high popularity and diversity simultaneously.

This relationship between popularity and diversity should be uncovered. If the relationship is revealed, we can predict one indicator from the other. For instance, when only the data and results for the top 10 most common names are available, we can infer the diversity of names from their popularity. In fact, this situation is frequently observed. Names are among the most private types of information. Thus, raw data on names are often restricted from being openly shared, making it common for only the ranking of popular names (e.g., the top 10 most common names) to be disclosed (for a review, see Ogihara, 2025). Therefore, even when only one of the two indicators is available, we can estimate the distribution more precisely, which increases the understanding of the nature and phenomena of names and naming practices.

The current study

Therefore, in this study, I examined the relationship between the popularity and diversity of names. Specifically, I analyzed the data in the U.K. presented in previous research (Bush et al., 2018).

Based on the prior discussion above, it was predicted that there would be a negative correlation between the popularity and diversity indicators.

Method

Data

I analyzed the open data provided by Bush et al. (2018; Table S15 “Number of unique forenames, and forename diversity, in the Office for National Statistics dataset”). The data included variables on popular names and name diversity.

The original data are from the U.K. Office for National Statistics (2018) and include complete records of all live births in England and Wales for the 21 years between 1996 and 2016. A total of 12,985,140 names were recorded, with an average of 618,340 names per year. It should be noted that names with a count of 1 or 2 were redacted to protect the confidentiality of individuals (Office for National Statistics, 2018).

Indicator

Popularity. As a popularity indicator, I used the variables “% of birth records registered with the most popular name” and “% of birth records accounted for by the top 10 names” (Bush et al., 2018, Table S15).² These indicators have been used in many prior studies (e.g., Bush, 2020; Kuipers & Askuri, 2017; Mignot, 2022; Ogihara et al., 2015; Ogihara, 2022; Twenge et al., 2010, 2016).

Diversity. As a diversity indicator, I used the variable “Forename diversity (i.e., ratio of the no. of unique forenames to the total no. of birth records per year)” (Bush et al., 2018, Table S15). Thus, this indicator represents the relative value of name variety. For example, when this value is 0.01, it means that there are 10 name types among 1,000 people. When this value is high, diversity is also high (there are more name types, meaning that the group is more diverse).³ This indicator has been used in previous research (e.g., He, 2020).
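The two kinds of indicators defined above can be illustrated with a minimal sketch. It assumes a plain list holding one name per birth record for a single year; the function name `name_indicators` and the toy records are hypothetical and are not taken from Bush et al. (2018). The toy data reproduce the example in the text (10 name types among 1,000 people).

```python
from collections import Counter


def name_indicators(names, top_n=10):
    """Return (top-1 %, top-N %, diversity ratio) for one year's list of names."""
    counts = Counter(names)
    total = len(names)
    # Per-name counts sorted from the most to the least common name.
    ranked = sorted(counts.values(), reverse=True)
    top1_pct = 100 * ranked[0] / total
    top_n_pct = 100 * sum(ranked[:top_n]) / total
    diversity = len(counts) / total  # unique names / birth records
    return top1_pct, top_n_pct, diversity


# Made-up records: 1,000 births spread evenly over 10 name types.
toy_year = [f"name_{i % 10}" for i in range(1000)]
top1, top10, diversity = name_indicators(toy_year)
print(top1, top10, diversity)
```

Both popularity indicators and the diversity indicator share the same denominator (the number of birth records), which is why the redaction described in the footnotes affects all of them.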

Results

Simple Pearson’s correlation coefficients among the year, popularity indicators, and diversity indicator are summarized in Table 1.

Table 1. Simple Pearson’s correlation coefficients.

	Year	Popularity (top 1)	Popularity (top 10)	Diversity
Year	-	−.955	−.973	.979
Popularity (top 1)	−.955	-	.969	−.960
Popularity (top 10)	−.973	.969	-	−.994
Diversity	.979	−.960	−.994	-
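A correlation matrix of the kind shown in Table 1 could be computed along the following lines. This is only a sketch: the five yearly values below are invented for illustration and are NOT the figures in Bush et al. (2018, Table S15), and the helper `pearson_r` is a hypothetical name for the textbook Pearson formula.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only, not the real yearly indicators.
series = {
    "Year": [1996, 1997, 1998, 1999, 2000],
    "Popularity (top 10)": [14.0, 13.5, 13.1, 12.4, 12.0],
    "Diversity": [0.0080, 0.0083, 0.0087, 0.0092, 0.0095],
}

labels = list(series)
matrix = {a: {b: pearson_r(series[a], series[b]) for b in labels} for a in labels}
for a in labels:
    print(a, [round(matrix[a][b], 3) for b in labels])
```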

Relationship within popularity indicators

The ratio of the most popular name and the ratio of the top 10 most popular names were strongly correlated, r = .969. This result means that these two indicators consistently measure the same concept, increasing the validity of these two indicators as name popularity indices.

Relationship between popularity and diversity

The ratio of unique names and the ratio of the most popular name between 1996 and 2016 are indicated in Figure 1. As predicted, they were highly negatively correlated, r = −.960.

Figure 1. Diversity (ratio of unique names) and popularity (% of top 1 name) indicators of baby names in the U.K., 1996-2016.

Similarly, the ratio of unique names and the ratio of the top 10 most popular names are indicated in Figure 2. They were also highly negatively correlated, r = −.994.

Figure 2. Diversity (ratio of unique names) and popularity (% of top 10 names) indicators of baby names in the U.K., 1996-2016.

Discussion

Previous research has shown that names have become less popular over time (for a review, see Ogihara, 2025). At the same time, accumulated evidence has indicated that names have become more diverse over time. However, the association between these two phenomena was unclear. This association should be revealed for a better understanding of names and naming practices. Therefore, this study investigated the relationship between the popularity and diversity of names.

I analyzed the name data provided by the previous study in the U.K. (Bush et al., 2018). I found that the correlations between the diversity and popularity indicators were highly negative, showing that they are conceptually strongly related. Specifically, in years when the ratio of unique names was high, the ratios of popular names were low. This means that when diversity is high, popularity is low. This association was very strong in the current dataset.

Based on this study, we can predict one indicator from the other: diversity can be inferred from popularity, or popularity from diversity. Because raw data on names are generally difficult to collect, this prediction is useful for understanding names and naming practices.
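One way such a prediction might be set up is an ordinary least-squares line fitted to the yearly pairs of indicators. The sketch below is an assumption about how this could be done, not a procedure reported in this study; the helper `fit_line` and all numbers are hypothetical, and a line fitted to one population and period may not transfer to another.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative yearly values only, not the real indicators.
top10_pct = [14.0, 13.5, 13.1, 12.4, 12.0]
diversity = [0.0080, 0.0083, 0.0087, 0.0092, 0.0095]

slope, intercept = fit_line(top10_pct, diversity)

# Estimate diversity for a year where only the top-10 share (12.8%) is known.
estimated_diversity = slope * 12.8 + intercept
print(slope, intercept, estimated_diversity)
```

Swapping the two arguments of `fit_line` gives the reverse prediction (popularity from diversity).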

Limitation and future direction

This study analyzed the dataset provided by a previous study (Bush et al., 2018), which did not distinguish between boys’ and girls’ names. Although a different pattern is not predicted based on gender, it is desirable to investigate the relationship between the diversity and popularity of names for boys and girls separately in the future.

This study examined the relationship between diversity and popularity of names in the U.K. (England and Wales), which showed highly negative associations between them. Nevertheless, it is unclear whether this relationship is observed in other nations. Names are cultural products and are affected by many factors (e.g., Morling, 2016; Morling & Lamoreaux, 2008). Thus, it is necessary to investigate this relationship in other nations.

Author contributions

The author confirms being the sole contributor of this work and approved it for publication.

Ethics and consent

Ethical approval and consent were not required.

Data availability

I analyzed the open data provided by Bush et al. (2018; Table S15 “Number of unique forenames, and forename diversity, in the Office for National Statistics dataset”).

References

Bao HWS, Cai H, Jing Y, et al.: Novel evidence for the increasing prevalence of unique names in China: A reply to Ogihara. Front. Psychol. 2021; 12: 731244.
Bush SJ: Ambivalence, Avoidance, and Appeal: Alliterative Aspects of Anglo Anthroponyms. Names. 2020; 68: 141–155.
Bush SJ, Powell-Smith A, Freeman TC: Network analysis of the social and demographic influences on name choice within the UK (1838-2016). PLoS One. 2018; 13: e0205759.
Gerhards J, Hackenbroch R: Trends and causes of cultural modernization: An empirical study of first names. Int. Sociol. 2000; 15: 501–531.
He K: Long-term sociolinguistics trends and phonological patterns of American names. Proc. Ling. Soc. Amer. 2020; 5(1): 616–622.
Kuipers JC, Askuri: Islamization and identity in Indonesia: The case of Arabic names in Java. Indonesia. 2017; 103: 25–49.
Mignot JF: First names given in France, 1800–2019: a window into the process of individualization. Popul. Econ. 2022; 6: 108–119.
Morling B: Cultural difference, inside and out. Soc. Personal. Psychol. Compass. 2016; 10(12): 693–706.
Morling B, Lamoreaux M: Measuring culture outside the head: A meta-analysis of individualism–collectivism in cultural products. Personal. Soc. Psychol. Rev. 2008; 12: 199–221.
Office for National Statistics: Births, deaths and marriages. 2018.
Ogihara Y: Direct evidence of the increase in unique names in Japan: The rise of individualism. Curr. Res. Behav. Sci. 2021; 2: 100056.
Ogihara Y: Common names decreased in Japan: Further evidence of an increase in individualism. Exp. Res. 2022; 3: e5.
Ogihara Y: Uncommon names are increasing globally: A review of empirical evidence on naming trends. 2025. Manuscript submitted for publication.
Ogihara Y, Fujita H, Tominaga H, et al.: Are common names becoming less common? The rise in uniqueness and individualism in Japan. Front. Psychol. 2015; 6: 1490.
Ogihara Y, Ito A: Unique names increased in Japan over 40 years: Baby names published in municipality newsletters show a rise in individualism, 1979-2018. Curr. Res. Ecol. Soc. Psychol. 2022; 3: 100046.
Twenge JM, Abebe EM, Campbell WK: Fitting in or standing out: Trends in American parents’ choices for children’s names, 1880–2007. Soc. Psychol. Personal. Sci. 2010; 1: 19–25.
Twenge JM, Dawson L, Campbell WK: Still standing out: Children’s names in the United States during the Great Recession and correlations with economic indicators. J. Appl. Soc. Psychol. 2016; 46: 663–670.

Footnotes

1 In this article, I use the term “names” to refer to “first names” (given names, personal names, forenames), not “last names” (family names, surnames).

2 Due to the redaction of names with a count of 2 or 1, the total number of birth records (denominator) is lower than the total number of births. In contrast, the top 1 and 10 most popular names (numerator) are not influenced by the redaction. As a result, this popularity indicator is higher than the actual popularity.

3 Due to the redaction of names with a count of 2 or 1, the total number of birth records (denominator) is lower than the total number of births. At the same time, the number of unique forenames (numerator) is also lower than the total number of unique forenames. Thus, this diversity indicator differs from the actual diversity, but the direction of the difference (lower or higher) is difficult to predict.




Open Peer Review


Version 2


Reviewer Report 31 Dec 2025

Süleyman Kasap, Van Yüzüncü Yıl University, Bardakçı, Turkey

Not Approved

https://doi.org/10.5256/f1000research.184627.r438916

The manuscript investigates the correlation between two well-established demographic trends: the decreasing popularity of common names and the increasing diversity of names. Using a large, pre-existing dataset from the U.K., the author reports near-perfect negative correlations between measures of name popularity (top 1 and top 10 name shares) and a measure of name diversity (ratio of unique names). The central claim is that these indicators are so strongly related that one can be predicted from the other, offering a practical tool for researchers with limited data access.
While the topic is relevant to onomastics and cultural sociology, the manuscript in its current form makes a negligible contribution to the literature. The core finding—that measures of concentration and diversity in a finite distribution are inversely related—is a statistical tautology, not a novel empirical discovery. The study lacks methodological rigor, a clear theoretical framework, and a meaningful interpretation of results. The manuscript reads more as a replication or secondary data analysis note than as a standalone research article.
1. Lack of Novelty and Conceptual Contribution:
The introduction establishes that prior research has documented both decreasing popularity and increasing diversity over time. The stated research gap is that the "relationship between these two phenomena is unclear." However, this relationship is logically and mathematically inherent. If the share of births accounted for by the top names decreases, the remaining share must be distributed among a larger set of less common names, inherently increasing diversity metrics (like the unique name ratio used here). The analysis merely quantifies this inherent relationship within one dataset. The manuscript fails to articulate what new sociological, psychological, or cultural insight is gained from this quantification.
2. Methodological and Statistical Concerns:
The chosen indicators are not independent. The "ratio of unique names" and the "percentage accounted for by the top N names" are two sides of the same coin when analyzing a single, bounded distribution (all births in a year). A high concentration in the top names necessarily limits the potential for a high unique name ratio, and vice-versa. The near-perfect correlations (r = -0.96, -0.99) are therefore expected and do not reveal a "conceptual" relationship but a mathematical one.
Lack of Statistical Control: The entire analysis is based on bivariate correlations across 21 time points. The strong negative correlations are almost certainly inflated by a strong shared time trend (both measures show clear linear trends over time, as evidenced by the high negative correlations with "Year" in Table 1). A partial correlation controlling for year, or a time-series analysis that accounts for autocorrelation, is essential to determine if a relationship exists beyond the simultaneous secular trends. The current analysis risks being spurious.

Insufficient Detail: The manuscript lacks a clear operational definition of the key variables in the results section. The exact calculation of the "ratio of unique names" is critical. Is it (number of distinct names) / (total births)? If so, this is extremely sensitive to the recording of very rare names and the data redaction policy (names with count ≤2 removed), which is a significant limitation not adequately discussed.

Weak Theoretical Framework and Interpretation: The introduction reviews literature on trends but does not build a theoretical argument for why testing this correlation is meaningful. The discussion does not interpret the results beyond restating them. The proposed "utility"—predicting one indicator from another—is overstated. Given the mathematical linkage, if a researcher has the share of the top 10 names, they can already make a strong bounded estimate of diversity without needing this specific correlation equation from U.K. data. The claim that this helps when "raw data...is restricted" is weak, as the prediction requires precisely the kind of summarized data (top N shares) that is already publicly available.

Is the work clearly and accurately presented and does it cite the current literature?

No
Is the study design appropriate and is the work technically sound?

No
Are sufficient details of methods and analysis provided to allow replication by others?

No
If applicable, is the statistical analysis and its interpretation appropriate?

No
Are all the source data underlying the results available to ensure full reproducibility?

Partly
Are the conclusions drawn adequately supported by the results?

Partly

References

1. Kasap S: Impact of bilingualism and the difficulties of having minority-specific names in another dominant society: Turkish context for minority Kurdish society. Onoma. 2021; 56: 167-186 Publisher Full Text

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Linguistics, onomastics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to state that I do not consider it to be of an acceptable scientific standard, for reasons outlined above.


Reviewer Report 26 Nov 2025

Basil Osayin Daudu, Kogi State University, Anyigba, Kogi, Nigeria

Approved with Reservations

https://doi.org/10.5256/f1000research.184627.r431893

The paper is an interesting one. The author uses infographics (figures, tables, and illustrations) of some existing literature to argue for the correlations between the popularity and diversity of names in the UK, as one indicator predicts the other and vice versa.

However, some areas of concern need clarification:

1. Starting with the abstract, the problem, the objective results, and conclusions of the study are clearly stated, but the methodology lacks detailed information on the analysis of the available material used.

2. The uncertainty of the popularity and diversity of baby names, which the author also acknowledged. Adding my voice, “name” and “naming practice” may be interrelated, but they are distinct concepts of investigation. This differs from one culture to another. Taking cognizance of the study’s case study “United Kingdom” (England and Wales) as cited in the sixth line of the first paragraph of the introduction, Scotland and Northern Ireland are also considered as parts of the United Kingdom. So, these countries may have some shared similarities in name and naming practices, but they are not basically one and the same due to certain factors. This clearly shows how broad the scope of the study is, which largely contributes to the many uncertainties and predictions (not certainty) in the study. Therefore, narrowing down the scope to one of the countries in the United Kingdom would have produced better and more convincing results.

3. The literature is very rich and up-to-date. My concern here is that most of the literature is largely by Ogihara and Bush et al. If we were to go by 4% single citation authorship, this paper would have been outrightly rejected. The author, too, perceives this lapse and therefore acknowledges the lack of sufficient material for the study.

4. A more disturbing aspect of the study is the author’s adoption of Bush et al.’s study of records of all live births between 1996 and 2016. Another study by Bush et al., between 1838 and 2016. Now, if accepted the way it is, the problem of the study will be that of defining the scope. And to address this issue, the paper should preferably be titled “A Review… (Or similar title)” to fully communicate the author’s claims and evidence.

Decision:

I really commend the author for his use of statistical techniques and infographics in his work. The work is good in the sense that it is a representation of existing literature in which the author lends his voice to the discourse. But reframing the title to correctly capture the content of the work would resolve a whole lot of uncertainties. In this regard, the author is in a better position to do it.

Is the work clearly and accurately presented and does it cite the current literature?

Yes
Is the study design appropriate and is the work technically sound?

Partly
Are sufficient details of methods and analysis provided to allow replication by others?

Partly
If applicable, is the statistical analysis and its interpretation appropriate?

Yes
Are all the source data underlying the results available to ensure full reproducibility?

Yes
Are the conclusions drawn adequately supported by the results?

Yes

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: AI, African Studies, Health

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.


Reviewer Report 02 Sep 2025

Stephen J. Bush, Xi'an Jiaotong University, Xi'an, Shaanxi, China

Approved with Reservations

https://doi.org/10.5256/f1000research.184627.r398444


Reviewer Report 22 Aug 2025

Han-Wu-Shuang Bao, School of Psychology and Cognitive Science, East China Normal University, Shanghai, Shanghai, China

Approved with Reservations

https://doi.org/10.5256/f1000research.184627.r398443

Thanks for the author's responses to my comments. I feel satisfied with his thoughts on the 2nd and 3rd points, but I still suggest that the author reconsider the 1st point with some further analyses, perhaps presented in footnotes. I agree that it was not necessary to determine which variable precedes the other, but even if the conclusion is purely correlational, it might be spurious due to a third confounder. A simple correlation between A and B might be high and significant merely due to a third variable C if both A and B are correlated with C. In this situation, it would not be appropriate to conclude that A is associated with B, because their association (correlation or "relationship") might disappear when C is considered. This is why social sciences usually consider partial correlations and multiple regression analyses to partial out confounders, and why time series analyses usually account for a common time trend or each variable's autocorrelation. Here, in this article, some conclusions about the "relationship between popularity and diversity" (e.g., "they are conceptually strongly related") require such a more rigorous test. Otherwise, if one found a strongly positive correlation between life expectancy and name diversity over time (which is highly possible if they change linearly), it might also be concluded that "they are conceptually strongly related" and even "to predict one indicator from the other indicator", which are not proper because time is the confounder underlying the two variables. Therefore, I still recommend that the author consider at least one more rigorous test beyond a simple correlation to empirically address this point. A practical one is to simply control for year and report their partial correlation.
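The partial correlation suggested here can be sketched with the standard first-order formula, which needs only the three pairwise correlations. The function name `partial_corr` is hypothetical, and the inputs below are the rounded coefficients from Table 1 of the article. Because those coefficients are rounded to three decimals and lie close to ±1, the outputs are sensitive to rounding and should be read as a rough check only, not as an analysis reported by the author or the reviewer.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Inputs: rounded coefficients from Table 1, with z = Year.
# x = Popularity (top 10), y = Diversity
r_top10_div = partial_corr(r_xy=-0.994, r_xz=-0.973, r_yz=0.979)
# x = Popularity (top 1), y = Diversity
r_top1_div = partial_corr(r_xy=-0.960, r_xz=-0.955, r_yz=0.979)

print(round(r_top10_div, 2), round(r_top1_div, 2))
```

Computing the same quantity from the raw yearly values (for example, by correlating the residuals after regressing each indicator on year) would avoid the rounding issue.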

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: cultural change, names, naming practices, name uniqueness


Version 1

Reviewer Report 29 May 2025

Stephen J. Bush, Xi'an Jiaotong University, Xi'an, Shaanxi, China

Approved with Reservations

https://doi.org/10.5256/f1000research.178684.r378193

This is a short report which reanalyses previously published UK name data to demonstrate an inverse relationship between name popularity and name diversity, such that one variable could be used to predict the other. This could provide a useful platform for future name research as many name datasets are restricted to either the top X names per year or otherwise redact rare names (about which, more below); that is, it is only practically possible to derive the first variable from them (‘popularity’). In this respect, these results may facilitate the wider interpretation of smaller, and potentially overlooked, name datasets (e.g. ‘top 10’ lists).

Nevertheless, the result remains a single, simple, correlation analysis. Beyond the suggestions made by the first reviewer (with which I concur), the conclusion may be strengthened by some corroboratory analyses, for instance using an independent dataset sampling the same population, and/or using name datasets that do not redact rarer names (which would otherwise affect the calculations).

It is important to note that the raw data used for the analysis (Table S15 of Bush et al. 2018) ultimately derives from birth registration counts compiled by the UK Office for National Statistics (ONS); specifically, https://www.ons.gov.uk/file?uri=/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsboys/2016/adhocallbabynames1996to2016.xls, and strictly speaking covering only England and Wales rather than the entire UK. As with many name datasets, the long tail of the name distribution is redacted: in this case, names registered to < 3 people per year are excluded. As such, for any given year, the “diversity indicator” (the ratio of the no. of unique forenames to the total number of birth records per year) will be under-estimated. In addition, because the total no. of birth records will be lower than the total number of births, the “popularity indicator” (% of birth records registered with a top 1/top 10 name) will be over-estimated.

For instance, we can see from Table S15 that in 2016, the total number of birth records was 639,126 (this number being used to calculate both the popularity and diversity indicators). However, the total number of births in 2016 was 696,271, an approx. 9% difference (this value is also made available by the ONS but in a separate file; see the link to ‘download figure 1 data’ in https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/bulletins/birthsummarytablesenglandandwales/2023#live-births).
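The size of this gap can be checked with a few lines of arithmetic. The sketch below uses only the two figures quoted above; the variable names are illustrative, and the rescaling factor rests on the simplifying assumption (mine, not the ONS's) that every missing birth is absent solely because of redaction.

```python
# Figures quoted above for England and Wales, 2016.
records_in_table_s15 = 639_126   # birth records remaining after redaction
total_live_births = 696_271      # ONS live-birth total

missing = total_live_births - records_in_table_s15

# The gap expressed against each possible baseline.
gap_vs_records = 100 * missing / records_in_table_s15
gap_vs_births = 100 * missing / total_live_births

# Assumption: every missing birth is absent only because of redaction.
# A share computed over the records then shrinks by this factor when
# it is re-expressed over all births.
rescale = records_in_table_s15 / total_live_births

print(missing)
print(round(gap_vs_records, 1), round(gap_vs_births, 1))
print(round(rescale, 3))
```

The gap is 57,145 births: about 8.9% of the birth records, or about 8.2% of all births. Under the stated assumption, a top-1 or top-10 share computed from Table S15 would be multiplied by roughly 0.918 to express it over all births.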

It would bolster the overall conclusion – that popularity and diversity are negatively correlated – if it could be demonstrated that refinements to these values make negligible difference. A simple way of doing so would be to calculate them using birth registration datasets that do not redact rare names in the first place. Examples present themselves in the form of data from the Government of Alberta, Canada, specifically for the years 1980-2020 (see https://open.alberta.ca/opendata/frequency-and-ranking-of-baby-names-by-year-and-gender; specific file: https://open.alberta.ca/dataset/11245675-b047-49fc-8bd1-cc2ce8314a6d/resource/e8aac308-c754-484c-b446-0c57ed0e8d37/download/baby-names-frequency_1980_2020.xlsx) and, to draw the focus back to the UK, data from the National Records of Scotland, covering the years 1974-2023 (https://www.nrscotland.gov.uk/media/mbmjzs2q/full-list-1974-2023.csv).
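As a minimal sketch of such a check, the two indicators can be computed from a table of name counts before and after applying the redaction rule. The function names and the name counts below are hypothetical and invented purely for illustration; they are not taken from the ONS, Alberta or Scottish files.

```python
def indicators(counts, top_n=1):
    """Return the popularity and diversity indicators for one year.

    counts maps each forename to the number of records bearing it.
    """
    total = sum(counts.values())
    top = sum(sorted(counts.values(), reverse=True)[:top_n])
    return {
        "popularity": top / total,        # share of records with a top-N name
        "diversity": len(counts) / total  # unique names per record
    }


def redact(counts, min_count=3):
    """Drop names registered fewer than min_count times (the ONS-style rule)."""
    return {name: c for name, c in counts.items() if c >= min_count}


# Invented toy counts, for illustration only.
toy_counts = {
    "Oliver": 50, "Amelia": 40, "Noah": 5,
    "Zed": 2, "Ione": 2, "Quill": 1, "Wren": 1,
}

full = indicators(toy_counts)
redacted = indicators(redact(toy_counts))
print(full)
print(redacted)
```

In this toy case redaction raises the top-1 share and lowers the diversity ratio. More generally, because each redacted name covers only one or two records, the diversity ratio falls under redaction whenever the unredacted ratio is below one half.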

In addition, two points foundational to the overall argument – that “names have become less popular over time” and that “raw data on names is restricted from being openly shared, making it common for only the ranking of popular names” – are supported by reference to the author’s not-yet-published review (Ogihara 2025 ‘Uncommon names are increasing globally’). To provide further context, it would help if a citation were provided for at least a pre-print of this manuscript. I am particularly drawn to the point that names have become less popular over time as, in the UK, this most strongly manifests as the 20th century progresses, i.e. during the time period covered by the ONS data. It would therefore be useful to know to what extent the relationship of popularity and diversity holds over a longer time period, and in that respect whether the conclusion can be generalised. With that in mind, Bush et al. 2018 contains two different datasets sampling birth records from the English and Welsh population, of which the present work uses only one (from Table S15; the ONS dataset, which is much greater in depth but narrower in temporal scope). The second dataset – which has the same popularity and diversity indicators – is available in Table S6, and may be viewed as a representative population sample from 1838-2014, albeit of varying coverage depths per year. Nevertheless, a cursory examination of this data suggests that, as with the ONS data, there is a strongly negative correlation of popularity and diversity for the years 1996-2010 (this time period chosen because within it there are > 10,000 birth records per year). If we accept this as independent support of the conclusion drawn using the ONS data (and thereby the utility of this dataset), then it may be worth looking at how popularity and diversity correlate throughout history, too.
This is of particular relevance as the conclusion drawn in the abstract (“we can predict one indicator from the other indicator”) implies a general relationship between the concepts that may not be universally true; instead, the two indicators may only “consistently measure the same concept” for a limited period of time.
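One way to examine this would be to compute the correlation separately within different time windows. The following is a minimal sketch, assuming a per-year table of (year, popularity, diversity); the function names are hypothetical and the values in `toy_rows` are invented for illustration only, not drawn from Table S6 or Table S15.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_in_window(rows, start, end):
    """Correlate popularity with diversity for years in [start, end]."""
    window = [(pop, div) for year, pop, div in rows if start <= year <= end]
    pops, divs = zip(*window)
    return pearson_r(pops, divs)


# Invented (year, top-1 popularity %, diversity) values, illustration only.
toy_rows = [
    (1900, 9.0, 0.010), (1910, 8.0, 0.011),
    (1920, 7.0, 0.011), (1930, 6.0, 0.010),
    (1996, 1.6, 0.040), (2003, 1.3, 0.050),
    (2010, 1.1, 0.060), (2016, 1.0, 0.065),
]

early = correlation_in_window(toy_rows, 1900, 1930)
late = correlation_in_window(toy_rows, 1996, 2016)
print(round(early, 2), round(late, 2))
```

The toy values are constructed so that popularity falls in the early window while diversity stays flat, giving a correlation near zero there and a strongly negative one in the later window. Whether real data behave this way is exactly what such a windowed analysis would test.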

I have expanded the Table S6 dataset substantially since the original publication; see Bush 2024 (https://doi.org/10.5195/names.2024.2543) and its raw data, specifically https://github.com/sjbush/uk_bmd/blob/main/dataset_B/summary_of_records_per_year.txt, which is essentially equivalent to Table S15 and contains the columns “most popular forename (% of records)” and “forename diversity (i.e., ratio of the no. of different forenames to the total no. of records per year)”. It should be clear from this data that before (approx.) the second half of the 20th century, and compared to contemporary records, forename diversity was not only persistently lower but popular names were also given to a higher proportion of people and remained popular for longer, e.g. John and Mary were consistently the most popular male and female names, respectively, for a century (see https://demos.flourish.studio/namehistory/?names=John,Mary, which visualises data from Bush et al. 2018). So, although the present work (using late-20th century data) finds that one indicator predicts the other, it also appears to be the case that (earlier in the 20th century) popularity could vary without discernible change to diversity. Consequently, the two variables do not necessarily change simultaneously. This may be the subject of a more in-depth analysis but could be usefully commented upon here.

Finally, could some comment also be made on the gender of names, and the effect this may have on the conclusions? In contemporary UK name data, there is typically a larger pool of female than male names (discussed in Bush 2020, https://doi.org/10.1080/00277738.2020.1775471): that is, one would expect female names to have a higher “diversity indicator” and lower “popularity indicator” than male names. Accordingly, a direct correlation of popularity with diversity may not quite be comparing like with like: diversity being affected more strongly by female names, but popularity by male (such that in Table S15, the top 1 names, across the whole population, are all male).
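To illustrate the concern, the indicators can be computed separately for each sex and for the pooled population. This is only a sketch: the record layout (name, sex, count), the function names and the counts are all hypothetical, chosen so that the female name pool is larger than the male one.

```python
from collections import defaultdict


def indicators(counts):
    """Top-1 popularity and diversity for a mapping of name -> count."""
    total = sum(counts.values())
    top_name = max(counts, key=counts.get)
    return {
        "top_name": top_name,
        "popularity": counts[top_name] / total,
        "diversity": len(counts) / total,
    }


def indicators_by_sex(records):
    """Compute indicators per sex and for the pooled population ('all')."""
    groups = defaultdict(lambda: defaultdict(int))
    for name, sex, count in records:
        groups[sex][name] += count
        groups["all"][name] += count
    return {key: indicators(dict(counts)) for key, counts in groups.items()}


# Invented toy records, illustration only.
toy_records = [
    ("Oliver", "M", 60), ("Jack", "M", 40),
    ("Amelia", "F", 40), ("Olivia", "F", 30),
    ("Isla", "F", 20), ("Ava", "F", 10),
]

result = indicators_by_sex(toy_records)
for key in ("M", "F", "all"):
    print(key, result[key])
```

In this toy case the female names show the higher diversity and the lower top-1 share, while the pooled top-1 name is male, which mirrors the like-with-like concern raised above.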

Is the work clearly and accurately presented and does it cite the current literature?
Yes

Is the study design appropriate and is the work technically sound?
Partly

Are sufficient details of methods and analysis provided to allow replication by others?
Yes

If applicable, is the statistical analysis and its interpretation appropriate?
Yes

Are all the source data underlying the results available to ensure full reproducibility?
Yes

Are the conclusions drawn adequately supported by the results?
Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: UK naming trends and practices

Author Response 10 Sep 2025

Yuji Ogihara, Department of Psychology, College of Education, Psychology and Human Studies, Aoyama Gakuin University, Shibuya, 150-8366, Japan

July 6th, 2025

Dear Dr. Stephen J. Bush,

Thank you very much for reviewing my manuscript and providing valuable comments.

I have modified the manuscript extensively according to the reviewers’ comments.

I offer my responses to each comment below.

I have copied and pasted all of your comments without making changes.

This is a short report which reanalyses previously published UK name data to demonstrate an inverse relationship between name popularity and name diversity, such that one variable could be used predict the other. This could provide a useful platform for future name research as many name datasets are restricted to either the top X names per year or otherwise redact rare names (about which, more below); that is, it is only practically possible to derive the first variable from them (‘popularity’). In this respect, these results may facilitate the wider interpretation of smaller, and potentially overlooked, name datasets (e.g. ‘top 10’ lists).

Thank you for your summary of this article and your words of praise.

Nevertheless, the result remains a single, simple, correlation analysis. Beyond the suggestions made by the first reviewer (with which I concur), the conclusion may be strengthened by some corroboratory analyses, for instance using an independent dataset sampling the same population, and/or using name datasets that do not redact rarer names (which would otherwise affect the calculations).

I appreciate your constructive comment. As I replied to the first reviewer, I basically agree with your opinion. Yet, at least in this article, I would like to focus on the dataset used in a previous study (details are explained below).

It is important to note that the raw data used for the analysis (Table S15 of Bush et al. 2018) ultimately derives from birth registration counts compiled by the UK Office for National Statistics (ONS); specifically, https://www.ons.gov.uk/file?uri=/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsboys/2016/adhocallbabynames1996to2016.xls, and strictly speaking covering only England and Wales rather than the entire UK. As with many name datasets, the long tail of the name distribution is redacted: in this case, names registered to < 3 people per year are excluded. As such, for any given year, the “diversity indicator” (the ratio of the no. of unique forenames to the total number of birth records per year) will be under-estimated. In addition, because the total no. of birth records will be lower than the total number of births, the “popularity indicator” (% of birth records registered with a top 1/top 10 name) will be over-estimated.

For instance, we can see from Table S15 that in 2016, the total number of birth records was 639,126 (this number being used to calculate both the popularity and diversity indicators). However, the total number of births in 2016 was 696,271, an approx. 9% difference (this value is also made available by the ONS but in a separate file; see the link to ‘download figure 1 data’ in https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/bulletins/birthsummarytablesenglandandwales/2023#live-births).

Thank you for your valuable note. I agree with you that this point is important.

The previous version of the article already stated that the data covered only England and Wales rather than the entire UK and that the long tail of the name distribution was redacted, as follows.

“The original data is from the U.K. Office for National Statistics (2018), which included complete records of all live births in England and Wales for 21 years between 1996 and 2016. A total of 12,985,140 names were recorded, with an average of 618,340 names per year. It should be noted that names with a count of 2 or 1 were redacted to protect the confidentiality of individuals (Office for National Statistics, 2018).” (Data section)

I have therefore added notes to the revised article on how the redaction may bias the two indicators, as follows.

“Due to the redaction of names with a count of 2 or 1, the total number of birth records (denominator) is lower than the total number of births. In contrast, the top 1 and 10 most popular names (numerator) are not influenced by the redaction. As a result, this popularity indicator is higher than the actual popularity.” (Footnote 2)

“Due to the redaction of names with a count of 2 or 1, the total number of birth records (denominator) is lower than the total number of births. In contrast, the number of unique forenames (numerator) is also lower than the total number of unique forenames. Thus, this diversity indicator is different from the actual diversity, but its direction (lower or higher) is difficult to predict.” (Footnote 3)

It would bolster the overall conclusion – that popularity and diversity are negatively correlated – if it could be demonstrated that refinements to these values make negligible difference. A simple way of doing so would be to calculate them using birth registration datasets that do not redact rare names in the first place. Examples present themselves in the form of data from the Government of Alberta, Canada, specifically for the years 1980-2020 (see https://open.alberta.ca/opendata/frequency-and-ranking-of-baby-names-by-year-and-gender; specific file: https://open.alberta.ca/dataset/11245675-b047-49fc-8bd1-cc2ce8314a6d/resource/e8aac308-c754-484c-b446-0c57ed0e8d37/download/baby-names-frequency_1980_2020.xlsx) and, to draw the focus back to the UK, data from the National Records of Scotland, covering the years 1974-2023 (https://www.nrscotland.gov.uk/media/mbmjzs2q/full-list-1974-2023.csv).

I appreciate your valuable comment. Datasets that do not redact rare names are especially precious because such names are frequently excluded.

Yet, this article focuses on names from England and Wales. Thus, I would like to examine these datasets in future research.

In addition, two points foundational to the overall argument – that “names have become less popular over time” and that “raw data on names is restricted from being openly shared, making it common for only the ranking of popular names” are supported by reference to the author’s not-yet-published review (Ogihara 2025 ‘Uncommon names are increasing globally’). To provide further context, it would help if a citation was provided for at least a pre-print of this manuscript. I am particularly drawn to the point that names have become less popular over time as, in the UK, this most strongly manifests as the 20th century progresses, i.e. during the time period covered by the ONS data. It would therefore be useful to know to what extent the relationship of popularity and diversity holds over a longer time period, and in that respect whether the conclusion can be generalised. That in mind, Bush et al. 2018 contains two different datasets sampling birth records from the English and Welsh population, of which the present work uses only one of them (from Table S15; the ONS dataset, which is much greater in depth but narrower in temporal scope). The second dataset – which has the same popularity and diversity indicators – is available in Table S6, and may be viewed as a representative population sample from 1838-2014, albeit of varying coverage depths per year. Nevertheless, a cursory examination of this data suggests that, as with the ONS data, there is a strongly negative correlation of popularity and diversity for the years 1996-2010 (this time period chosen because within it there are > 10,000 birth records per year). If we accept this as independent support of the conclusion drawn using the ONS data (and thereby the utility of this dataset), then it may be worth looking at how popularity and diversity correlate throughout history, too. 
This is of particular relevance as the conclusion drawn in the abstract (“we can predict one indicator from the other indicator”) implies a general relationship between the concepts that may not be universally true and that instead they may only “consistently measure the same concept” for a limited period of time.

I have expanded the Table S6 dataset substantially since the original publication; see Bush 2024 (https://doi.org/10.5195/names.2024.2543) and its raw data, specifically https://github.com/sjbush/uk_bmd/blob/main/dataset_B/summary_of_records_per_year.txt, which is essentially equivalent to Table S15 and contains the columns “most popular forename (% of records)” and “forename diversity (i.e., ratio of the no. of different forenames to the total no. of records per year)”. It should be clear from this data that before (approx.) the second half of the 20th century, and compared to contemporary records, forename diversity was not only persistently lower but that popular names were given to a higher proportion of people and remained popular for longer, e.g. John and Mary were consistently the most popular male and female names, respectively, for a century (see https://demos.flourish.studio/namehistory/?names=John,Mary, which visualises data from Bush et al. 2018). So, although the present work (using late-20th century data) finds that one indicator predicts the other, it also appears the case that (earlier in the 20th century) popularity could vary without discernible change to diversity. Consequently, the two variables do not necessarily change simultaneously. This may be the subject of a more in-depth analysis but could be usefully commented upon here.

Thank you for your valuable comment. Regarding the review paper, the two points (“names have become less popular over time” and that “raw data on names is restricted from being openly shared, making it common for only the ranking of popular names”) are supported not only by the review paper but also by other references I include (Bao et al., 2021; Bush, 2020; Bush et al., 2018; Gerhards & Hackenbroch, 2000; Kuipers & Askuri, 2017; Mignot, 2022; Ogihara, 2021, 2022; Ogihara & Ito, 2022; Ogihara et al., 2015; Twenge et al., 2010, 2016). Moreover, whether the review paper is made openly available would not directly affect the decision on this paper.

I also appreciate your second point. As you already acknowledged, it is unclear how representative the second dataset is (“The second dataset – which has the same popularity and diversity indicators – is available in Table S6, and may be viewed as a representative population sample from 1838-2014, albeit of varying coverage depths per year.”). In the original paper (Bush et al., 2018), the sample size for each year in the dataset is not sufficiently large and its representativeness for each year is not explained (“This approximates 130,000 to 230,000 records per year from 1838–1950, 25,000 to 100,000 records per year from 1951–2000, and 5000 to 15,000 records per year from 2001 to 2014”; p. 3). In contrast, the newer dataset that I analyzed has sufficiently large sample sizes, and its representativeness is apparently high. Thus, I analyzed the newer dataset and did not analyze the older dataset.

Finally, could some comment also be made on the gender of names, and the effect this may have on the conclusions? In contemporary UK name data, there is typically a larger pool of female than male names (discussed in Bush 2020, https://doi.org/10.1080/00277738.2020.1775471): that is, one would expect female names to have a higher “diversity indicator” and lower “popularity indicator” than male names. Accordingly, a direct correlation of popularity with diversity may not quite be comparing like with like: diversity being affected more strongly by female names, but popularity by male (such that in Table S15, the top 1 names, across the whole population, are all male).

I appreciate your comment. The original study (Bush et al., 2018) did not analyze name trends by gender, so this study did not either. However, I agree with your point. Thus, I have added this as a limitation, as follows.

“This study analyzed the dataset yielded by the past study (Bush et al., 2018), which did not distinguish between boys’ and girls’ names. Although a different pattern is not predicted based on gender, it is desirable to investigate the relationship between the diversity and popularity of names for boys and girls separately in the future.” (Limitation and future direction section)

Thank you for your further consideration of this manuscript.

I look forward to hearing from you at your earliest convenience.

Sincerely,

Yuji Ogihara, Ph.D.
Aoyama Gakuin University
Department of Psychology, College of Education, Psychology and Human Studies
Address: 4-4-25 Shibuya, Shibuya-ku, Tokyo, 150-8366, Japan
E-mail: yogihara@ephs.aoyama.ac.jp
Web: https://sites.google.com/site/yujiogiharaweb/english
Competing Interests: The author declares no competing interest.

COMMENTS ON THIS REPORT

Author Response 10 Sep 2025

Yuji Ogihara, Department of Psychology, College of Education, Psychology and Human Studies, Aoyama Gakuin University, Shibuya, 150-8366, Japan

July 6th, 2025

Dear Dr. Stephen J. Bush,

Thank you very much for reviewing my manuscript and providing valuable comments.

I have modified the manuscript extensively according to the reviewers’ comments.

I offer my responses to each comment below.

I have copied and pasted all of your comments without making changes.

This is a short report which reanalyses previously published UK name data to demonstrate an inverse relationship between name popularity and name diversity, such that one variable could be used to predict the other. This could provide a useful platform for future name research as many name datasets are restricted to either the top X names per year or otherwise redact rare names (about which, more below); that is, it is only practically possible to derive the first variable from them (‘popularity’). In this respect, these results may facilitate the wider interpretation of smaller, and potentially overlooked, name datasets (e.g. ‘top 10’ lists).

Thank you for your summary of this article and your words of praise.

Nevertheless, the result remains a single, simple, correlation analysis. Beyond the suggestions made by the first reviewer (with which I concur), the conclusion may be strengthened by some corroboratory analyses, for instance using an independent dataset sampling the same population, and/or using name datasets that do not redact rarer names (which would otherwise affect the calculations).

I appreciate your constructive comment. As I replied to the first reviewer, I basically agree with your opinion. Yet, at least in this article, I would like to focus on the dataset used in a previous study (details are explained below).

It is important to note that the raw data used for the analysis (Table S15 of Bush et al. 2018) ultimately derives from birth registration counts compiled by the UK Office for National Statistics (ONS); specifically, https://www.ons.gov.uk/file?uri=/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsboys/2016/adhocallbabynames1996to2016.xls, and strictly speaking covering only England and Wales rather than the entire UK. As with many name datasets, the long tail of the name distribution is redacted: in this case, names registered to < 3 people per year are excluded. As such, for any given year, the “diversity indicator” (the ratio of the no. of unique forenames to the total number of birth records per year) will be under-estimated. In addition, because the total no. of birth records will be lower than the total number of births, the “popularity indicator” (% of birth records registered with a top 1/top 10 name) will be over-estimated.

For instance, we can see from Table S15 that in 2016, the total number of birth records was 639,126 (this number being used to calculate both the popularity and diversity indicators). However, the total number of births in 2016 was 696,271, an approx. 9% difference (this value is also made available by the ONS but in a separate file; see the link to ‘download figure 1 data’ in https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/bulletins/birthsummarytablesenglandandwales/2023#live-births).

Thank you for your valuable note. I agree with you that this point is important.

The previous version of the article already stated that the data covered only England and Wales rather than the entire UK and that the long tail of the name distribution was redacted, as follows.

“The original data is from the U.K. Office for National Statistics (2018), which included complete records of all live births in England and Wales for 21 years between 1996 and 2016. A total of 12,985,140 names were recorded, with an average of 618,340 names per year. It should be noted that names with a count of 2 or 1 were redacted to protect the confidentiality of individuals (Office for National Statistics, 2018).” (Data section)

Thus, in the revised article, I have added notes on how the two indicators may differ from their actual values because of this redaction, as follows.

“Due to the redaction of names with a count of 2 or 1, the total number of birth records (denominator) is lower than the total number of births. In contrast, the top 1 and 10 most popular names (numerator) are not influenced by the redaction. As a result, this popularity indicator is higher than the actual popularity.” (Footnote 2)

“Due to the redaction of names with a count of 2 or 1, the total number of birth records (denominator) is lower than the total number of births. In contrast, the number of unique forenames (numerator) is also lower than the total number of unique forenames. Thus, this diversity indicator is different from the actual diversity, but its direction (lower or higher) is difficult to predict.” (Footnote 3)
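The effect of redaction described in these two footnotes can be illustrated with a small worked example. The sketch below is not part of the original analysis: the name counts in `toy_counts` are hypothetical, and only the two 2016 totals (639,126 birth records and 696,271 births) are taken from the reviewer's comment above. It computes both indicators before and after removing names with a count of 2 or 1, and then the size of the 2016 gap.

```python
# Hypothetical name counts for one "year" (illustrative only, not real data).
toy_counts = {"A": 10, "B": 6, "C": 3, "D": 2, "E": 1, "F": 1}


def indicators(counts):
    """Return (top-1 popularity share, diversity ratio) for a dict of name counts."""
    total = sum(counts.values())
    popularity = max(counts.values()) / total  # share of records with the top name
    diversity = len(counts) / total            # unique names per record
    return popularity, diversity


# Redaction rule described in the text: names with a count of 2 or 1 are removed.
redacted = {name: n for name, n in toy_counts.items() if n >= 3}

pop_full, div_full = indicators(toy_counts)
pop_red, div_red = indicators(redacted)

# 2016 totals quoted in the reviewer's comment.
records_2016 = 639_126
births_2016 = 696_271
gap = births_2016 - records_2016
gap_vs_records = gap / records_2016  # gap relative to the redacted record count
gap_vs_births = gap / births_2016    # gap relative to all births

print(f"popularity: full {pop_full:.3f}, redacted {pop_red:.3f}")
print(f"diversity:  full {div_full:.3f}, redacted {div_red:.3f}")
print(f"2016 gap: {gap} ({gap_vs_records:.1%} of records, {gap_vs_births:.1%} of births)")
```

In this toy case the popularity indicator rises and the diversity indicator falls after redaction. Under this particular rule, every retained name has at least 3 records (at most one unique name per 3 records), whereas every redacted name has at most 2 (at least one unique name per 2 records), so restoring the redacted names would raise the ratio of unique names to records whenever anything was redacted. How large the change is in real data depends on how many names fall below the threshold.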

It would bolster the overall conclusion – that popularity and diversity are negatively correlated – if it could be demonstrated that refinements to these values make negligible difference. A simple way of doing so would be to calculate them using birth registration datasets that do not redact rare names in the first place. Examples present themselves in the form of data from the Government of Alberta, Canada, specifically for the years 1980-2020 (see https://open.alberta.ca/opendata/frequency-and-ranking-of-baby-names-by-year-and-gender; specific file: https://open.alberta.ca/dataset/11245675-b047-49fc-8bd1-cc2ce8314a6d/resource/e8aac308-c754-484c-b446-0c57ed0e8d37/download/baby-names-frequency_1980_2020.xlsx) and, to draw the focus back to the UK, data from the National Records of Scotland, covering the years 1974-2023 (https://www.nrscotland.gov.uk/media/mbmjzs2q/full-list-1974-2023.csv).

I appreciate your valuable comment. Datasets that do not redact rare names are especially valuable because such names are frequently excluded.

Yet, this article focuses on names from England and Wales. Thus, I would like to examine these datasets in future research.

In addition, two points foundational to the overall argument – that “names have become less popular over time” and that “raw data on names is restricted from being openly shared, making it common for only the ranking of popular names” are supported by reference to the author’s not-yet-published review (Ogihara 2025 ‘Uncommon names are increasing globally’). To provide further context, it would help if a citation was provided for at least a pre-print of this manuscript. I am particularly drawn to the point that names have become less popular over time as, in the UK, this most strongly manifests as the 20th century progresses, i.e. during the time period covered by the ONS data. It would therefore be useful to know to what extent the relationship of popularity and diversity holds over a longer time period, and in that respect whether the conclusion can be generalised. That in mind, Bush et al. 2018 contains two different datasets sampling birth records from the English and Welsh population, of which the present work uses only one of them (from Table S15; the ONS dataset, which is much greater in depth but narrower in temporal scope). The second dataset – which has the same popularity and diversity indicators – is available in Table S6, and may be viewed as a representative population sample from 1838-2014, albeit of varying coverage depths per year. Nevertheless, a cursory examination of this data suggests that, as with the ONS data, there is a strongly negative correlation of popularity and diversity for the years 1996-2010 (this time period chosen because within it there are > 10,000 birth records per year). If we accept this as independent support of the conclusion drawn using the ONS data (and thereby the utility of this dataset), then it may be worth looking at how popularity and diversity correlate throughout history, too. 
This is of particular relevance as the conclusion drawn in the abstract (“we can predict one indicator from the other indicator”) implies a general relationship between the concepts that may not be universally true and that instead they may only “consistently measure the same concept” for a limited period of time.

I have expanded the Table S6 dataset substantially since the original publication; see Bush 2024 (https://doi.org/10.5195/names.2024.2543) and its raw data, specifically https://github.com/sjbush/uk_bmd/blob/main/dataset_B/summary_of_records_per_year.txt, which is essentially equivalent to Table S15 and contains the columns “most popular forename (% of records)” and “forename diversity (i.e., ratio of the no. of different forenames to the total no. of records per year)”. It should be clear from this data that before (approx.) the second half of the 20th century, and compared to contemporary records, forename diversity was not only persistently lower but that popular names were given to a higher proportion of people and remained popular for longer, e.g. John and Mary were consistently the most popular male and female names, respectively, for a century (see https://demos.flourish.studio/namehistory/?names=John,Mary, which visualises data from Bush et al. 2018). So, although the present work (using late-20th century data) finds that one indicator predicts the other, it also appears the case that (earlier in the 20th century) popularity could vary without discernible change to diversity. Consequently, the two variables do not necessarily change simultaneously. This may be the subject of a more in-depth analysis but could be usefully commented upon here.

Thank you for your valuable comment. Regarding the review paper, the two points (“names have become less popular over time” and that “raw data on names is restricted from being openly shared, making it common for only the ranking of popular names”) are supported not only by the review paper but also by other references I include (Bao et al., 2021; Bush, 2020; Bush et al., 2018; Gerhards & Hackenbroch, 2000; Kuipers & Askuri, 2017; Mignot, 2022; Ogihara, 2021, 2022; Ogihara & Ito, 2022; Ogihara et al., 2015; Twenge et al., 2010, 2016). Moreover, whether the review paper is made openly available would not directly affect the decision on this paper.

I also appreciate your second point. As you already acknowledged, it is unclear how representative the second dataset is (“The second dataset – which has the same popularity and diversity indicators – is available in Table S6, and may be viewed as a representative population sample from 1838-2014, albeit of varying coverage depths per year.”). In the original paper (Bush et al., 2018), the sample size for each year in this dataset is not sufficiently large, and its representativeness for each year is not explained (“This approximates 130,000 to 230,000 records per year from 1838–1950, 25,000 to 100,000 records per year from 1951–2000, and 5000 to 15,000 records per year from 2001 to 2014”; p. 3). In contrast, the newer dataset that I analyzed has sufficiently large sample sizes, and its representativeness is apparently high. Thus, I analyzed the newer dataset and did not analyze the older one.

Finally, could some comment also be made on the gender of names, and the effect this may have on the conclusions? In contemporary UK name data, there is typically a larger pool of female than male names (discussed in Bush 2020, https://doi.org/10.1080/00277738.2020.1775471): that is, one would expect female names to have a higher “diversity indicator” and lower “popularity indicator” than male names. Accordingly, a direct correlation of popularity with diversity may not quite be comparing like with like: diversity being affected more strongly by female names, but popularity by male (such that in Table S15, the top 1 names, across the whole population, are all male).

I appreciate your comment. The original study (Bush et al., 2018) did not analyze name trends by gender, so this study did not either. However, I agree with your point. Thus, I have added this as a limitation, as follows.

“This study analyzed the dataset yielded by the past study (Bush et al., 2018), which did not distinguish between boys’ and girls’ names. Although a different pattern is not predicted based on gender, it is desirable to investigate the relationship between the diversity and popularity of names for boys and girls separately in the future.” (Limitation and future direction section)

Thank you for your further consideration of this manuscript.

I look forward to hearing from you at your earliest convenience.

Sincerely,

Yuji Ogihara, Ph.D.
Aoyama Gakuin University
Department of Psychology, College of Education, Psychology and Human Studies
Address: 4-4-25 Shibuya, Shibuya-ku, Tokyo, 150-8366, Japan
E-mail: yogihara@ephs.aoyama.ac.jp
Web: https://sites.google.com/site/yujiogiharaweb/english
Competing Interests: The author declares no competing interest.

Reviewer Report 06 May 2025

Han-Wu-Shuang Bao, School of Psychology and Cognitive Science, East China Normal University, Shanghai, Shanghai, China

Approved with Reservations

https://doi.org/10.5256/f1000research.178684.r378190

In this Brief Report, the author performed a simple correlation analysis with Bush et al.’s (2018) published dataset on UK first names, showing that when the most popular names were more often used, there were also more diverse types of unique names. I think the author has addressed one of the important issues in name research and cultural change research, which lacked an empirical test before. While I believe the main argument that name popularity and name diversity can predict each other is worth publishing, I have several suggestions that may improve the strength of this argument.

First, the analysis presented in this report was a simple correlation over time, which might be spurious due to covariance among year, popularity, and diversity. To more rigorously test the key relationship, I suggest either controlling for year or conducting a cross-correlation analysis and a Granger causality test. The latter two methods are usually preferred for more rigorous time series analysis in cultural change research. In doing so, the author could examine whether popularity precedes diversity, diversity precedes popularity, or they simply shift concurrently. This is all the more important here because the high correlations found in this report might simply be due to the fact that any two linear trends over time would be highly correlated (positively or negatively) with each other.
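The concern about shared trends can be made concrete with synthetic data. The following sketch is an illustration under stated assumptions, not a reanalysis of the ONS data: both series are simulated with independent noise around opposite linear trends (all slopes and noise levels are arbitrary), and "controlling for year" is implemented as a partial correlation, obtained by correlating the residuals of each series after a linear fit on year.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1996, 2017)  # 21 yearly observations, as in the dataset
t = years - years[0]

# Synthetic indicators (hypothetical values): opposite linear trends plus
# independent noise, so the two series share nothing except the time trend.
popularity = 2.0 - 0.05 * t + rng.normal(0.0, 0.03, size=t.size)
diversity = 0.05 + 0.002 * t + rng.normal(0.0, 0.001, size=t.size)


def detrend(y, x):
    """Residuals of y after fitting an ordinary least-squares line in x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


# Raw correlation between the two trending series.
raw_r = np.corrcoef(popularity, diversity)[0, 1]

# Partial correlation controlling for a linear effect of year.
partial_r = np.corrcoef(detrend(popularity, t), detrend(diversity, t))[0, 1]

print(f"raw correlation:                  {raw_r:.3f}")
print(f"correlation controlling for year: {partial_r:.3f}")
```

Because the two simulated series are unrelated apart from their trends, the raw correlation is strongly negative by construction, while the detrended correlation reflects only the independent noise. Applying the same two calculations to the real indicators would show how much of the reported correlation survives once the linear trend is removed.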

Second, to partial out the confound by a common linear trend, I also suggest performing a simulation analysis. Specifically, you can generate or resample (e.g., bootstrap) large random samples of “names”, with each sample consisting of a sufficient number of “names” simulating a sample of names in a “year”. Then, for each sample, you can calculate the indices of popularity and diversity. Finally, across these simulated samples, you can test the correlation between popularity and diversity. In this way, you can test this relationship without the time confound. This may also answer if there is a purely statistical relationship between popularity and diversity, providing a better understanding of the main research question here.
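The resampling procedure outlined here could be sketched as follows. This is a minimal illustration, assuming a Zipf-like pool of hypothetical "names"; the pool size, weighting, sample size, and number of simulated "years" are all arbitrary choices, not values from the manuscript or from this review.

```python
import numpy as np

rng = np.random.default_rng(42)

n_names = 5_000    # size of the hypothetical name pool
n_births = 20_000  # simulated births per "year"
n_samples = 200    # number of simulated "years"

# Zipf-like probabilities: the k-th name has weight 1/k (an assumption).
weights = 1.0 / np.arange(1, n_names + 1)
probs = weights / weights.sum()

popularity = np.empty(n_samples)
diversity = np.empty(n_samples)
for i in range(n_samples):
    draws = rng.choice(n_names, size=n_births, p=probs)  # one simulated "year"
    counts = np.bincount(draws, minlength=n_names)
    popularity[i] = counts.max() / n_births              # share of the top-1 name
    diversity[i] = np.count_nonzero(counts) / n_births   # unique names per birth

# Correlation between the two indicators across simulated "years".
r = np.corrcoef(popularity, diversity)[0, 1]
print(f"correlation across simulated years: {r:.3f}")
```

Every simulated "year" is drawn from the same pool, so there is no time trend by construction and any correlation reflects sampling structure alone. The result depends on the assumed name distribution, so it would need to be checked under several plausible distributions (or by resampling the observed names) before drawing conclusions.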

Besides these suggestions, I have one minor comment on including more past evidence for name diversity. Bao et al. (2021) indeed tested changes in six name indices in China, where “standard deviation of name length” and “proportion of three-character given names” can both be regarded as indices of name diversity in Chinese naming practices. So I suggest adding them as another piece of evidence into the section “Diversity of names has increased”.

Is the work clearly and accurately presented and does it cite the current literature?
Yes

Is the study design appropriate and is the work technically sound?
No

Are sufficient details of methods and analysis provided to allow replication by others?
Yes

If applicable, is the statistical analysis and its interpretation appropriate?
No

Are all the source data underlying the results available to ensure full reproducibility?
Yes

Are the conclusions drawn adequately supported by the results?
Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: cultural change, names, naming practices, name uniqueness

Author Response 21 Aug 2025

Yuji Ogihara, Department of Psychology, College of Education, Psychology and Human Studies, Aoyama Gakuin University, Shibuya, 150-8366, Japan

July 6th, 2025

Dear Dr. Han-Wu-Shuang Bao,

Thank you very much for reviewing my manuscript and providing valuable comments.

I have modified the manuscript extensively according to the reviewers’ comments.

I offer my responses to each comment below.

I have copied and pasted all of your comments without making changes.

In this Brief Report, the author performed a simple correlation analysis with Bush et al.’s (2018) published dataset on UK first names, showing that when the most popular names were more often used, there were also more diverse types of unique names. I think the author has addressed one of the important issues in name research and cultural change research, which lacked an empirical test before. While I believe the main argument that name popularity and name diversity can predict each other is worth publishing, I have several suggestions that may improve the strength of this argument.

Thank you for your words of praise. Following your comments below, I have modified the manuscript.

First, the analysis presented in this report was a simple correlation over time, which might be spurious due to covariance among year, popularity, and diversity. To more rigorously test the key relationship, I suggest either controlling for year or conducting a cross-correlation analysis and a Granger causality test. The latter two methods are usually preferred for more rigorous time series analysis in cultural change research. In doing so, the author could examine whether popularity precedes diversity, diversity precedes popularity, or they simply shift concurrently. This is all the more important here because the high correlations found in this report might simply be due to the fact that any two linear trends over time would be highly correlated (positively or negatively) with each other.

Thank you for your valuable comment. I agree with you that it is important to consider this point carefully.

I do not claim a causal relationship between the two indicators. As I write in the article, the aim in name research is to predict one indicator from the other. For that purpose, it is not necessary to determine whether popularity precedes diversity, diversity precedes popularity, or they shift concurrently. Thus, I think simple correlations are sufficient, at least in this context. Nevertheless, I agree that the analyses you suggest work well in cultural change research.

If you do not find my response sufficient, I will reconsider this point, and I would appreciate your further guidance.
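
To illustrate what "controlling for year" means in the comment above, here is a minimal sketch of a first-order partial correlation. It is not the analysis from the article: the two series are hypothetical, generated so that they share nothing but a linear trend, and the function names `pearson` and `partial_corr` are illustrative.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical data (an assumption, not the ONS records): two series over
# 21 "years" that are linked only through opposite linear trends.
random.seed(0)
years = list(range(1996, 2017))
popularity = [10.0 - 0.2 * (y - 1996) + random.gauss(0, 0.3) for y in years]
diversity = [0.05 + 0.001 * (y - 1996) + random.gauss(0, 0.0015) for y in years]

simple_r = pearson(popularity, diversity)
partial_r = partial_corr(popularity, diversity, years)
print(f"simple r = {simple_r:.3f}, partial r (year removed) = {partial_r:.3f}")
```

In this constructed case the simple correlation is strongly negative only because of the shared trend, which is the scenario the reviewer describes. Whether the same holds for the real indicators can only be checked on the actual data.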

Second, to partial out the confound by a common linear trend, I also suggest performing a simulation analysis. Specifically, you can generate or resample (e.g., bootstrap) large random samples of “names”, with each sample consisting of a sufficient number of “names” simulating a sample of names in a “year”. Then, for each sample, you can calculate the indices of popularity and diversity. Finally, across these simulated samples, you can test the correlation between popularity and diversity. In this way, you can test this relationship without the time confound. This may also answer if there is a purely statistical relationship between popularity and diversity, providing a better understanding of the main research question here.

I appreciate your constructive comment. That is a fascinating idea. I agree that such a simulation analysis can contribute to a better understanding of names and naming practices.

However, I think that this approach is beyond the scope of this article, which analyzes actual data from the United Kingdom.

Moreover, such a simulation may lack ecological validity, because its results would depend strongly on the parameters of a hypothetical dataset.

Furthermore, such a simulation lacks a time perspective, whereas the focus of this article is on how names (and name indicators) change over time.

Therefore, I would like to consider this new idea in a different study. Thank you very much for your important input.
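
For readers unfamiliar with the kind of simulation discussed in this exchange, the following is a rough sketch under strong assumptions. It is not part of the article's analysis. The name pool, the Zipf-like weighting, the exponent range, and the sample sizes are all hypothetical choices, and, as noted above, the outcome depends on them.

```python
import math
import random
from collections import Counter

def indicators(names):
    """Return (share of the most common name, unique names / total names)."""
    counts = Counter(names)
    total = len(names)
    top1_share = counts.most_common(1)[0][1] / total
    return top1_share, len(counts) / total

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
vocabulary = [f"name{i}" for i in range(5000)]  # hypothetical name pool
popularity, diversity = [], []
for _ in range(30):  # 30 simulated "years", with no time ordering
    s = random.uniform(0.8, 1.4)  # assumed Zipf-like concentration parameter
    weights = [1 / rank ** s for rank in range(1, len(vocabulary) + 1)]
    sample = random.choices(vocabulary, weights=weights, k=5000)
    top1, div = indicators(sample)
    popularity.append(top1)
    diversity.append(div)

r = pearson(popularity, diversity)
print(f"correlation across simulated samples: r = {r:.3f}")
```

Because each sample is drawn independently, any correlation across samples cannot come from a shared time trend. It comes instead from how concentrated the assumed distribution is, which is why the choice of parameters matters so much.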

Besides these suggestions, I have one minor comment on including more past evidence for name diversity. Bao et al. (2021) indeed tested changes in six name indices in China, where “standard deviation of name length” and “proportion of three-character given names” can both be regarded as indices of name diversity in Chinese naming practices. So I suggest adding them as another piece of evidence into the section “Diversity of names has increased”.

Thank you for your comment. I agree that the indicators you mention are related to name diversity. However, because they focus on “name length,” they do not measure name diversity directly. Even when the standard deviation of name length increases, name diversity can decrease: for example, names consisting of one or three characters may become more common while the same few names are given repeatedly. Therefore, I have left this part unchanged.

Thank you for your further consideration of this manuscript.

I look forward to hearing from you at your earliest convenience.

Sincerely,

Yuji Ogihara, Ph.D.
Aoyama Gakuin University
Competing Interests: The author declares no competing interest.

Version 3

VERSION 3 PUBLISHED 11 Apr 2025

Open Peer Review

Reviewer Status

Reviewer Reports

	Invited Reviewers
	1	2	3	4
Version 3 (revision) 24 Dec 25	read		read
Version 2 (revision) 14 Jul 25	read	read	read	read
Version 1 11 Apr 25	read	read

Han-Wu-Shuang Bao, East China Normal University, Shanghai, China
Stephen J. Bush, Xi'an Jiaotong University, Xi'an, China
Basil Osayin Daudu, Kogi State University, Anyigba, Nigeria
Süleyman Kasap, Van Yüzüncü Yıl University, Bardakçı, Turkey

Reviewer Report

07 Jan 2026 | for Version 3

Han-Wu-Shuang Bao, School of Psychology and Cognitive Science, East China Normal University, Shanghai, Shanghai, China

Approved With Reservations

I would like to thank the author for addressing my concern by reporting partial correlations in the new section, "Controlling for a possible confounding factor (year)". Surprisingly, however, none of the results reported in this article include statistical significance information, such as p values. While the simple correlations are close to 1 and are undoubtedly significant, the significance of the partial correlations (given the sample size of 21 years) is unclear. My calculations based on r and n suggest that r = −.887 and r = .577 are significant at p < .05, but r = −.419 has a p value of .06. I would therefore recommend that the author reports the exact p values in order to reach a more scientifically sound conclusion.
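
As a rough check on the figures quoted above, the sketch below computes a two-sided p value from a correlation coefficient and the number of observations, using only the Python standard library. The degrees of freedom are an assumption: the report does not state them, so the code uses n − 2 for a simple correlation and n − 3 for a partial correlation with one control variable. The function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(r, n, n_covariates=0, steps=20000):
    """Two-sided p value for a (partial) correlation r from n observations."""
    df = n - 2 - n_covariates  # assumed degrees of freedom
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the integral of the density from 0 to |t|.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)

# Coefficients quoted in the report, with n = 21 years.
for r in (-0.887, 0.577, -0.419):
    p_simple = two_sided_p(r, 21)                   # df = 19
    p_partial = two_sided_p(r, 21, n_covariates=1)  # df = 18
    print(f"r = {r:+.3f}: p = {p_simple:.4f} (df 19), p = {p_partial:.4f} (df 18)")
```

Under either assumption about the degrees of freedom, the first two coefficients fall below .05 and the third does not, which agrees with the reviewer's calculation.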

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

cultural change, names, naming practices, name uniqueness

Reviewer Report

02 Jan 2026 | for Version 3

Basil Osayin Daudu, Kogi State University, Anyigba, Kogi, Nigeria

Approved

I have no further comments to make.

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Philosophy, AI, African Studies, Health

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Reviewer Report

31 Dec 2025 | for Version 2

Süleyman Kasap, Van Yüzüncü Yıl University, Bardakçı, Turkey

Not Approved

Is the work clearly and accurately presented and does it cite the current literature? No
Is the study design appropriate and is the work technically sound? No
Are sufficient details of methods and analysis provided to allow replication by others? No
If applicable, is the statistical analysis and its interpretation appropriate? No
Are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions drawn adequately supported by the results? Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Linguistics, onomastics

Reviewer Report

26 Nov 2025 | for Version 2

Basil Osayin Daudu, Kogi State University, Anyigba, Kogi, Nigeria

Approved With Reservations

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Partly
Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Yes

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

AI, African Studies, Health

Reviewer Report

02 Sep 2025 | for Version 2

Stephen J. Bush, Xi'an Jiaotong University, Xi'an, Shaanxi, China

Approved With Reservations

While I remain broadly in favour of this work and its utility in facilitating further research, as far as I can tell, the revised version of this article differs only minimally from the original: specifically, in the inclusion of two clarifying footnotes and a ‘limitations’ statement regarding the original data not distinguishing between boys’ and girls’ names.
While I appreciate these additions, many of the points raised in the original reviews with regard to alternative or supporting analyses have not been addressed in the text, although doing so could conceivably have strengthened the work further.

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

UK naming trends and practices

Reviewer Report

22 Aug 2025 | for Version 2

Han-Wu-Shuang Bao, School of Psychology and Cognitive Science, East China Normal University, Shanghai, Shanghai, China

Approved With Reservations

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

cultural change, names, naming practices, name uniqueness

Reviewer Report

29 May 2025 | for Version 1

Stephen J. Bush, Xi'an Jiaotong University, Xi'an, Shaanxi, China

Approved With Reservations

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Partly
Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

UK naming trends and practices

Responses (1)

Author Response

10 Sep 2025

Yuji Ogihara, Department of Psychology, College of Education, Psychology and Human Studies, Aoyama Gakuin University, Shibuya, 150-8366, Japan

July 6th, 2025

Dear Dr. Stephen J. Bush,

Thank you very much for reviewing my manuscript and providing valuable comments.

I have modified the manuscript extensively according to the reviewers’ comments.

I offer my responses to each comment below.

I have copied and pasted all of your comments without making changes.

This is a short report which reanalyses previously published UK name data to demonstrate an inverse relationship between name popularity and name diversity, such that one variable could be used to predict the other. This could provide a useful platform for future name research as many name datasets are restricted to either the top X names per year or otherwise redact rare names (about which, more below); that is, it is only practically possible to derive the first variable from them (‘popularity’). In this respect, these results may facilitate the wider interpretation of smaller, and potentially overlooked, name datasets (e.g. ‘top 10’ lists).

Thank you for your summary of this article and your words of praise.

Nevertheless, the result remains a single, simple, correlation analysis. Beyond the suggestions made by the first reviewer (with which I concur), the conclusion may be strengthened by some corroboratory analyses, for instance using an independent dataset sampling the same population, and/or using name datasets that do not redact rarer names (which would otherwise affect the calculations).

I appreciate your constructive comment. As I replied to the first reviewer, I basically agree with your opinion. Yet, at least in this article, I would like to focus on the dataset used in a previous study (details are explained below).

It is important to note that the raw data used for the analysis (Table S15 of Bush et al. 2018) ultimately derives from birth registration counts compiled by the UK Office for National Statistics (ONS); specifically, https://www.ons.gov.uk/file?uri=/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsboys/2016/adhocallbabynames1996to2016.xls, and strictly speaking covering only England and Wales rather than the entire UK. As with many name datasets, the long tail of the name distribution is redacted: in this case, names registered to < 3 people per year are excluded. As such, for any given year, the “diversity indicator” (the ratio of the no. of unique forenames to the total number of birth records per year) will be under-estimated. In addition, because the total no. of birth records will be lower than the total number of births, the “popularity indicator” (% of birth records registered with a top 1/top 10 name) will be over-estimated.

For instance, we can see from Table S15 that in 2016, the total number of birth records was 639,126 (this number being used to calculate both the popularity and diversity indicators). However, the total number of births in 2016 was 696,271, an approx. 9% difference (this value is also made available by the ONS but in a separate file; see the link to ‘download figure 1 data’ in https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/bulletins/birthsummarytablesenglandandwales/2023#live-births).

Thank you for your valuable note. I agree with you that this point is important.

The previous version of the article already stated that the data covered only England and Wales rather than the entire UK and that the long tail of the name distribution was redacted, as follows.

“The original data is from the U.K. Office for National Statistics (2018), which included complete records of all live births in England and Wales for 21 years between 1996 and 2016. A total of 12,985,140 names were recorded, with an average of 618,340 names per year. It should be noted that names with a count of 2 or 1 were redacted to protect the confidentiality of individuals (Office for National Statistics, 2018).” (Data section)

Accordingly, in the revised article I have added notes on how this redaction may cause the two indicators to differ from their actual values, as follows.

“Due to the redaction of names with a count of 2 or 1, the total number of birth records (denominator) is lower than the total number of births. In contrast, the top 1 and 10 most popular names (numerator) are not influenced by the redaction. As a result, this popularity indicator is higher than the actual popularity.” (Footnote 2)

“Due to the redaction of names with a count of 2 or 1, the total number of birth records (denominator) is lower than the total number of births. In contrast, the number of unique forenames (numerator) is also lower than the total number of unique forenames. Thus, this diversity indicator is different from the actual diversity, but its direction (lower or higher) is difficult to predict.” (Footnote 3)
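
The effect of redaction on the two indicators can be made concrete with a small worked example. The sketch below uses a hypothetical set of name counts, not the ONS records, and an illustrative function name. It computes the top-1 popularity indicator and the diversity indicator (unique forenames divided by total records) with and without names registered fewer than three times. The last lines reproduce the 2016 comparison from the reviewer's comment.

```python
def indicators(counts, min_count=1):
    """Top-1 popularity and diversity from a {name: count} mapping.

    Names registered fewer than min_count times are dropped first,
    mimicking redaction of the rare tail of the distribution.
    """
    kept = {name: c for name, c in counts.items() if c >= min_count}
    total = sum(kept.values())
    popularity = max(kept.values()) / total  # share of the most common name
    diversity = len(kept) / total            # unique forenames / total records
    return popularity, diversity

# Hypothetical counts for one "year" (an assumption for illustration only).
toy = {"Oliver": 50, "Amelia": 30, "Noah": 10, "Isla": 5,
       "Zed": 2, "Quill": 1, "Wren": 1, "Yara": 1}

full = indicators(toy)                   # all 100 records, 8 unique names
redacted = indicators(toy, min_count=3)  # 95 records, 4 unique names
print(f"full:     popularity = {full[0]:.4f}, diversity = {full[1]:.4f}")
print(f"redacted: popularity = {redacted[0]:.4f}, diversity = {redacted[1]:.4f}")

# 2016 figures quoted in the reviewer's comment.
records_2016 = 639_126
births_2016 = 696_271
gap = (births_2016 - records_2016) / records_2016
print(f"births exceed name records by {gap:.1%}")
```

In this toy case redaction raises the popularity indicator, as Footnote 2 describes, and lowers the diversity indicator. How the diversity indicator moves in real data depends on how many rare names and records are removed, so the toy result should not be read as general.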

It would bolster the overall conclusion – that popularity and diversity are negatively correlated – if it could be demonstrated that refinements to these values make negligible difference. A simple way of doing so would be to calculate them using birth registration datasets that do not redact rare names in the first place. Examples present themselves in the form of data from the Government of Alberta, Canada, specifically for the years 1980-2020 (see https://open.alberta.ca/opendata/frequency-and-ranking-of-baby-names-by-year-and-gender; specific file: https://open.alberta.ca/dataset/11245675-b047-49fc-8bd1-cc2ce8314a6d/resource/e8aac308-c754-484c-b446-0c57ed0e8d37/download/baby-names-frequency_1980_2020.xlsx) and, to draw the focus back to the UK, data from the National Records of Scotland, covering the years 1974-2023 (https://www.nrscotland.gov.uk/media/mbmjzs2q/full-list-1974-2023.csv).

I appreciate your valuable comment. Datasets that do not redact rare names are especially precious because such names are frequently excluded.

Yet, this article focuses on names from England and Wales. Thus, I would like to examine these datasets in future research.

In addition, two points foundational to the overall argument – that “names have become less popular over time” and that “raw data on names is restricted from being openly shared, making it common for only the ranking of popular names” are supported by reference to the author’s not-yet-published review (Ogihara 2025 ‘Uncommon names are increasing globally’). To provide further context, it would help if a citation was provided for at least a pre-print of this manuscript. I am particularly drawn to the point that names have become less popular over time as, in the UK, this most strongly manifests as the 20th century progresses, i.e. during the time period covered by the ONS data. It would therefore be useful to know to what extent the relationship of popularity and diversity holds over a longer time period, and in that respect whether the conclusion can be generalised. That in mind, Bush et al. 2018 contains two different datasets sampling birth records from the English and Welsh population, of which the present work uses only one of them (from Table S15; the ONS dataset, which is much greater in depth but narrower in temporal scope). The second dataset – which has the same popularity and diversity indicators – is available in Table S6, and may be viewed as a representative population sample from 1838-2014, albeit of varying coverage depths per year. Nevertheless, a cursory examination of this data suggests that, as with the ONS data, there is a strongly negative correlation of popularity and diversity for the years 1996-2010 (this time period chosen because within it there are > 10,000 birth records per year). If we accept this as independent support of the conclusion drawn using the ONS data (and thereby the utility of this dataset), then it may be worth looking at how popularity and diversity correlate throughout history, too. 
This is of particular relevance as the conclusion drawn in the abstract (“we can predict one indicator from the other indicator”) implies a general relationship between the concepts that may not be universally true and that instead they may only “consistently measure the same concept” for a limited period of time.

I have expanded the Table S6 dataset substantially since the original publication; see Bush 2024 (https://doi.org/10.5195/names.2024.2543) and its raw data, specifically https://github.com/sjbush/uk_bmd/blob/main/dataset_B/summary_of_records_per_year.txt, which is essentially equivalent to Table S15 and contains the columns “most popular forename (% of records)” and “forename diversity (i.e., ratio of the no. of different forenames to the total no. of records per year)”. It should be clear from this data that before (approx.) the second half of the 20th century, and compared to contemporary records, forename diversity was not only persistently lower but that popular names were given to a higher proportion of people and remained popular for longer, e.g. John and Mary were consistently the most popular male and female names, respectively, for a century (see https://demos.flourish.studio/namehistory/?names=John,Mary, which visualises data from Bush et al. 2018). So, although the present work (using late-20th century data) finds that one indicator predicts the other, it also appears the case that (earlier in the 20th century) popularity could vary without discernible change to diversity. Consequently, the two variables do not necessarily change simultaneously. This may be the subject of a more in-depth analysis but could be usefully commented upon here.
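As an illustration of the two columns quoted in the comment above, the following is a minimal sketch of how "most popular forename (% of records)" and "forename diversity" (the ratio of different forenames to total records) could be computed for one year of records. The function name `name_indicators` and the toy list of forenames are assumptions made for illustration; they are not taken from Bush et al. (2018) or from the linked repository.

```python
from collections import Counter

def name_indicators(forenames):
    """Return (popularity_pct, diversity_ratio) for one year's records.

    popularity_pct: share of records carrying the single most common forename.
    diversity_ratio: number of different forenames divided by number of records.
    """
    counts = Counter(forenames)
    total = len(forenames)
    top_count = counts.most_common(1)[0][1]
    popularity_pct = 100.0 * top_count / total
    diversity_ratio = len(counts) / total
    return popularity_pct, diversity_ratio

# Hypothetical toy records for a single year (not real data).
records = ["John", "Mary", "John", "Ann", "John", "Mary", "Tom", "John"]
popularity, diversity = name_indicators(records)
print(popularity, diversity)  # 50.0 0.5
```

Applying such a function year by year would give the two series whose correlation is under discussion.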

Thank you for your valuable comment. Regarding the review paper, the two points (“names have become less popular over time” and that “raw data on names is restricted from being openly shared, making it common for only the ranking of popular names”) are supported not only by the review paper but also by other references I include (Bao et al., 2021; Bush, 2020; Bush et al., 2018; Gerhards & Hackenbroch, 2000; Kuipers & Askuri, 2017; Mignot, 2022; Ogihara, 2021, 2022; Ogihara & Ito, 2022; Ogihara et al., 2015; Twenge et al., 2010, 2016). Moreover, whether the review paper is made openly available would not directly affect the decision on this paper.

I also appreciate your second point. As you acknowledge, it is unclear how representative the second dataset is (“The second dataset – which has the same popularity and diversity indicators – is available in Table S6, and may be viewed as a representative population sample from 1838-2014, albeit of varying coverage depths per year.”). In the original paper (Bush et al., 2018), the sample size for each year of this dataset is not sufficiently large, and its representativeness for each year is not explained (“This approximates 130,000 to 230,000 records per year from 1838–1950, 25,000 to 100,000 records per year from 1951–2000, and 5000 to 15,000 records per year from 2001 to 2014”; p. 3). In contrast, the newer dataset that I analyzed contains complete records of all live births between 1996 and 2016, so its sample sizes are sufficiently large and its representativeness is high. Thus, I analyzed the newer dataset and did not analyze the older one.

Finally, could some comment also be made on the gender of names, and the effect this may have on the conclusions? In contemporary UK name data, there is typically a larger pool of female than male names (discussed in Bush 2020, https://doi.org/10.1080/00277738.2020.1775471): that is, one would expect female names to have a higher “diversity indicator” and lower “popularity indicator” than male names. Accordingly, a direct correlation of popularity with diversity may not quite be comparing like with like: diversity being affected more strongly by female names, but popularity by male (such that in Table S15, the top 1 names, across the whole population, are all male).

I appreciate your comment. The original study (Bush et al., 2018) did not analyze name trends by gender, so this study did not either. However, I agree with your point. Thus, I have added this as a limitation, as follows.

“This study analyzed the dataset yielded by the past study (Bush et al., 2018), which did not distinguish between boys’ and girls’ names. Although a different pattern is not predicted based on gender, it is desirable to investigate the relationship between the diversity and popularity of names for boys and girls separately in the future.” (Limitation and future direction section)

Thank you for your further consideration of this manuscript.

I look forward to hearing from you at your earliest convenience.

Sincerely,

Yuji Ogihara, Ph.D.
Aoyama Gakuin University
Department of Psychology, College of Education, Psychology and Human Studies
Address: 4-4-25 Shibuya, Shibuya-ku, Tokyo, 150-8366, Japan
E-mail: yogihara@ephs.aoyama.ac.jp
Web: https://sites.google.com/site/yujiogiharaweb/english


Competing Interests

The author declares no competing interest.


Reviewer Report

35 Views

06 May 2025 | for Version 1

Han-Wu-Shuang Bao, School of Psychology and Cognitive Science, East China Normal University, Shanghai, Shanghai, China


Approved With Reservations

Is the work clearly and accurately presented and does it cite the current literature?

Yes

Is the study design appropriate and is the work technically sound?

No

Are sufficient details of methods and analysis provided to allow replication by others?

Yes

If applicable, is the statistical analysis and its interpretation appropriate?

No

Are all the source data underlying the results available to ensure full reproducibility?

Yes

Are the conclusions drawn adequately supported by the results?

Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

cultural change, names, naming practices, name uniqueness


Responses (1)

Author Response

21 Aug 2025

Yuji Ogihara, Department of Psychology, College of Education, Psychology and Human Studies, Aoyama Gakuin University, Shibuya, 150-8366, Japan

July 6th, 2025

Dear Dr. Han-Wu-Shuang Bao,

Thank you very much for reviewing my manuscript and providing valuable comments.

I have modified the manuscript extensively according to the reviewers’ comments.

I offer my responses to each comment below.

I have copied and pasted all of your comments without making changes.

In this Brief Report, the author performed a simple correlation analysis with Bush et al.’s (2018) published dataset on UK first names, showing that when the most popular names were more often used, there were also more diverse types of unique names. I think the author has addressed one of the important issues in name research and cultural change research, which lacked an empirical test before. While I believe the main argument that name popularity and name diversity can predict each other is worth publishing, I have several suggestions that may improve the strength of this argument.

Thank you for your words of praise. Following your comments below, I have modified the manuscript.

First, the analysis presented in this report was simple correlation over time, which might be spurious due to covariance among year, popularity, and diversity. To more rigorously test the key relationship, I suggest either controlling for year or conducting a cross-correlation analysis and a Granger causality test. The latter two methods are usually preferred for more rigorous time series analysis in cultural change research. In doing so, the author could examine whether popularity precedes diversity, diversity precedes popularity, or they shift just concurrently. It becomes more important here because such high correlations found in this report might simply due to the fact that any linear trends over time would be highly correlated (positively or negatively) with each other.
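The concern about a shared time trend can be illustrated with a minimal sketch. It generates two synthetic series that share nothing except a linear trend over 1996-2016, then compares their raw Pearson correlation with the correlation of their residuals after a linear trend in year is removed from each (which is the partial correlation controlling for year). All numbers below are invented for illustration, and `detrended_correlation` is a hypothetical helper; this is not the analysis performed in the article, and it does not cover the cross-correlation or Granger approaches.

```python
import numpy as np

def detrended_correlation(year, x, y):
    """Correlate x and y after removing a linear trend in year from each."""
    t = np.asarray(year, dtype=float)
    t = t - t.mean()  # centre the years to keep the fit well conditioned
    res_x = x - np.polyval(np.polyfit(t, x, 1), t)
    res_y = y - np.polyval(np.polyfit(t, y, 1), t)
    return np.corrcoef(res_x, res_y)[0, 1]

rng = np.random.default_rng(0)
year = np.arange(1996, 2017)

# Synthetic series (assumed values): opposite linear trends, independent noise.
popularity = 2.0 - 0.05 * (year - 1996) + rng.normal(0, 0.02, year.size)
diversity = 0.05 + 0.002 * (year - 1996) + rng.normal(0, 0.0008, year.size)

raw_r = np.corrcoef(popularity, diversity)[0, 1]
partial_r = detrended_correlation(year, popularity, diversity)
print(f"raw r = {raw_r:.3f}, r controlling for year = {partial_r:.3f}")
```

In this synthetic setup the raw correlation is strongly negative purely because of the opposing trends, whereas the detrended correlation reflects only the independent noise.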

Thank you for your valuable comment. I agree with you that it is important to consider this point carefully.

I do not claim a causal relationship between the two indicators. As I write in the article, what matters in name research is being able to predict one indicator from the other. For that purpose, it is not necessary to determine whether popularity precedes diversity, diversity precedes popularity, or the two shift concurrently. Thus, I think simple correlations are sufficient, at least in this context. Nevertheless, I agree that the analyses you suggest work well in cultural change research.

However, if you do not find my response sufficient, I would appreciate it very much if you could let me know, and I will reconsider this point.

Second, to partial out the confound by a common linear trend, I also suggest performing a simulation analysis. Specifically, you can generate or resample (e.g., bootstrap) large random samples of “names”, with each sample consisting of a sufficient number of “names” simulating a sample of names in a “year”. Then, for each sample, you can calculate the indices of popularity and diversity. Finally, across these simulated samples, you can test the correlation between popularity and diversity. In this way, you can test this relationship without the time confound. This may also answer if there is a purely statistical relationship between popularity and diversity, providing a better understanding of the main research question here.
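One possible reading of the simulation described above can be sketched as follows. Each simulated "year" is a random sample of names drawn from a fixed pool whose probabilities follow a Zipf-like law; the popularity and diversity indices are computed per sample and then correlated across samples. The pool size, sample size, exponent, and the function name `simulate_indicators` are all assumptions chosen for illustration, and the resulting correlation depends on these choices.

```python
import numpy as np

def simulate_indicators(n_samples=200, n_births=2000, pool_size=5000,
                        exponent=1.0, seed=0):
    """Simulate popularity (% top name) and diversity (distinct / total)."""
    rng = np.random.default_rng(seed)
    ranks = np.arange(1, pool_size + 1, dtype=float)
    probs = ranks ** -exponent          # Zipf-like name probabilities (assumed)
    probs /= probs.sum()
    popularity = np.empty(n_samples)
    diversity = np.empty(n_samples)
    for i in range(n_samples):
        draws = rng.choice(pool_size, size=n_births, p=probs)
        counts = np.bincount(draws, minlength=pool_size)
        popularity[i] = 100.0 * counts.max() / n_births
        diversity[i] = np.count_nonzero(counts) / n_births
    return popularity, diversity

pop, div = simulate_indicators()
r = np.corrcoef(pop, div)[0, 1]
print(f"correlation across simulated samples: {r:.3f}")
```

Because every sample here comes from the same distribution, the variation between samples is sampling noise only; varying `exponent` across samples would be a different design with different results.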

I appreciate your constructive comment. That is a fascinating idea. I agree that such a simulation analysis can contribute to a better understanding of names and naming practices.

However, I think that this approach is beyond the scope of this article, which analyzes actual data from the United Kingdom.

Moreover, such a simulation may lack ecological validity, because its results would depend strongly on the parameters of a hypothetical dataset.

Furthermore, the simulation lacks a time perspective. My focus is on how names (and name indicators) change over time.

Therefore, I would like to consider this new idea in a different study. In any case, thank you very much for your important input.

Besides these suggestions, I have one minor comment on including more past evidence for name diversity. Bao et al. (2021) indeed tested changes in six name indices in China, where “standard deviation of name length” and “proportion of three-character given names” can both be regarded as indices of name diversity in Chinese naming practices. So I suggest adding them as another piece of evidence into the section “Diversity of names has increased”.

Thank you for your comment. I agree that the indicators you mention are related to name diversity. Yet these indicators do not measure name diversity directly, because they focus on name length. Even when the standard deviation of name length increases, name diversity can decrease. For example, if one-character and three-character names become more frequent but the same few names are given repeatedly, the standard deviation of name length rises while diversity falls. Therefore, I have left this part unchanged.
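A small worked example, using made-up placeholder strings rather than real names, shows how the standard deviation of name length can rise while diversity (distinct names divided by total names) falls. The helper `length_sd_and_diversity` and both samples are hypothetical.

```python
from statistics import pstdev

def length_sd_and_diversity(names):
    """Return (SD of name length, ratio of distinct names to all names)."""
    lengths = [len(name) for name in names]
    return pstdev(lengths), len(set(names)) / len(names)

# Sample A: four different two-character names.
sample_a = ["AB", "CD", "EF", "GH"]
# Sample B: one- and three-character names, each given twice.
sample_b = ["A", "A", "XYZ", "XYZ"]

sd_a, div_a = length_sd_and_diversity(sample_a)
sd_b, div_b = length_sd_and_diversity(sample_b)
print(sd_a, div_a)  # 0.0 1.0
print(sd_b, div_b)  # 1.0 0.5
```

From sample A to sample B the standard deviation of length goes from 0.0 to 1.0, while diversity drops from 1.0 to 0.5.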

Thank you for your further consideration of this manuscript.

I look forward to hearing from you at your earliest convenience.

Sincerely,

Yuji Ogihara, Ph.D.
Aoyama Gakuin University


Competing Interests

The author declares no competing interest.

Alongside their report, reviewers assign a status to the article:

Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested

Approved with reservations - a number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.

Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions

[1] Bao HWS, Cai H, Jing Y, et al.: Novel evidence for the increasing prevalence of unique names in China: A reply to Ogihara. Front. Psychol. 2021; 12: 731244.

[2] Bush SJ: Ambivalence, Avoidance, and Appeal: Alliterative Aspects of Anglo Anthroponyms. Names. 2020; 68: 141–155.

[3] Bush SJ, Powell-Smith A, Freeman TC: Network analysis of the social and demographic influences on name choice within the UK (1838-2016). PLoS One. 2018; 13: e0205759.

[4] Gerhards J, Hackenbroch R: Trends and causes of cultural modernization: An empirical study of first names. Int. Sociol. 2000; 15: 501–531.

[5] He K: Long-term sociolinguistics trends and phonological patterns of American names. Proc. Ling. Soc. Amer. 2020; 5(1): 616–622.

[6] Kuipers JC, Askuri: Islamization and identity in Indonesia: The case of Arabic names in Java. Indonesia. 2017; 103: 25–49.

[7] Mignot JF: First names given in France, 1800–2019: a window into the process of individualization. Popul. Econ. 2022; 6: 108–119.

[8] Morling B: Cultural difference, inside and out. Soc. Personal. Psychol. Compass. 2016; 10(12): 693–706.

[9] Morling B, Lamoreaux M: Measuring culture outside the head: A meta-analysis of individualism–collectivism in cultural products. Personal. Soc. Psychol. Rev. 2008; 12: 199–221.

[10] Office for National Statistics: Births, deaths and marriages. 2018.

[11] Ogihara Y: Direct evidence of the increase in unique names in Japan: The rise of individualism. Curr. Res. Behav. Sci. 2021; 2: 100056.

[12] Ogihara Y: Common names decreased in Japan: Further evidence of an increase in individualism. Exp. Res. 2022; 3: e5.

[13] Ogihara Y: Uncommon names are increasing globally: A review of empirical evidence on naming trends. 2025. Manuscript submitted for publication.

[14] Ogihara Y, Fujita H, Tominaga H, et al.: Are common names becoming less common? The rise in uniqueness and individualism in Japan. Front. Psychol. 2015; 6: 1490.

[15] Ogihara Y, Ito A: Unique names increased in Japan over 40 years: Baby names published in municipality newsletters show a rise in individualism, 1979-2018. Curr. Res. Ecol. Soc. Psychol. 2022; 3: 100046.

[16] Twenge JM, Abebe EM, Campbell WK: Fitting in or standing out: Trends in American parents’ choices for children’s names, 1880–2007. Soc. Psychol. Personal. Sci. 2010; 1: 19–25.

[17] Twenge JM, Dawson L, Campbell WK: Still standing out: Children’s names in the United States during the Great Recession and correlations with economic indicators. J. Appl. Soc. Psychol. 2016; 46: 663–670.

