F1000Research

2046-1402

F1000 Research Limited

London, UK

10.12688/f1000research.160046.1

Research Article

Articles

Research data volume and quality derived from a specialist disease registry versus routine electronic health records

[version 1; peer review: awaiting peer review]

Roseanna Hamilton (1): Data Curation, Formal Analysis, Methodology, Project Administration, Writing – Original Draft Preparation, Writing – Review & Editing

Socrates Varakliotis (1, 2; https://orcid.org/0000-0002-5265-8205): Data Curation, Formal Analysis, Methodology, Writing – Original Draft Preparation, Writing – Review & Editing

Dario Cancemi (1, 2): Data Curation, Investigation, Project Administration, Writing – Original Draft Preparation, Writing – Review & Editing

Anastasia Spiridou (1; https://orcid.org/0000-0001-6576-0244): Data Curation, Formal Analysis, Methodology, Project Administration, Writing – Original Draft Preparation, Writing – Review & Editing

Mohsin Shah (1): Data Curation, Methodology, Project Administration, Writing – Original Draft Preparation, Writing – Review & Editing

Daniel Key (1): Data Curation, Investigation, Methodology, Project Administration, Writing – Original Draft Preparation, Writing – Review & Editing

Lucy R Wedderburn (1, 2): Conceptualization, Data Curation, Investigation, Project Administration, Supervision, Writing – Original Draft Preparation, Writing – Review & Editing

Neil James Sebire (a; 1, 2; https://orcid.org/0000-0001-5348-9063): Conceptualization, Formal Analysis, Methodology, Resources, Supervision, Writing – Original Draft Preparation, Writing – Review & Editing

1 NIHR Great Ormond Street Hospital Biomedical Research Centre, London, England, UK

2 University College London Institute of Child Health, London, England, UK

a Corresponding author: n.sebire@ucl.ac.uk

No competing interests were disclosed.

27 1 2025

2025

132

8 1 2025

2025

This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Objective

This study aims to compare data availability and analytic results for patients using matched data items from a dedicated disease registry versus data extracted directly from an electronic patient record (EPR) system and a trusted research environment (TRE).

Methods

Data from patients enrolled in the National JDM Cohort and Biomarker Study (JDCBS) were compared with routine EPR data from the same patients attending a specialist children’s hospital between 2019 and 2021. Data from both sources were extracted, de-identified, and analysed within a trusted research environment adhering to NHS security standards. Descriptive statistics, visualizations, and statistical comparisons were performed.

Results

Of the 688 registry patients in total, 270 attended one specialist hospital with EPR data available. The EPR system yielded 328,527 data points on these patients compared to 40,673 from the registry, including 2-10 fold more data items across data categories. Diagnoses were more numerous in the EPR data, while registry data captured more comprehensive medication records. Laboratory test results were 10 times more frequent in EPR data, including a broader range of test types. Despite higher data volume in EPR, the clinical significance of the additional data points remains uncertain.

Conclusion

Routine EPR data can effectively replicate much disease registry data with a larger volume of data points, potentially offering additional analytical possibilities. However, specific targeted registry data collection remains valuable for certain data elements. A hybrid approach, utilizing both routine EPR data and focused registry collection, could optimise healthcare research by reducing costs and avoiding duplication.

Keywords: EPR data, registry data, data quality, healthcare research, trusted research environment

Funding for the UK JDM Cohort and Biomarker Study (JDCBS) has been provided by grants from the Cathal Hayes Research Foundation; the Wellcome Trust UK [085860]; Action Medical Research UK [SP4252]; the Myositis UK Charity; Arthritis Research UK, now Versus Arthritis [14518, 20164, 21593]; the Henry Smith Charity and Great Ormond Street Children's Charity [V1268]; Tiny Hearts Society; The Myositis Association; Remission Charity; Cure JM [GOSH042019]; the Medical Research Council [MR/N003322/1]; and infrastructure through the National Institute for Health Research (NIHR) via the NIHR-Biomedical Research Centre at GOSH and GOSHCC. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care.

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Introduction

Traditionally, research studies have used specifically collected data, since it has been reported that there may be quality issues with routine EPR data and that manual validation may be required at organisational level to make such data meaningful. ¹ Several dimensions of study data quality are generally described, such as completeness, accuracy, concordance and plausibility, all of which have been variably applied to routine EPR data for research, ² and corresponding EPR data quality assessment frameworks have been proposed. ³,⁴

However, previous studies have reported the feasibility of using routine EPR data to determine quality of care in various settings, such as congenital heart disease, with reasonable data availability, although reporting reduced reliability of billing codes for identification of certain specific conditions, especially those that may be rare. ⁵ In addition, the use of routine EPR data to generate and populate disease-specific registries has been described (in this context, registries are regarded as lists of patients and associated data items for individuals who share a common diagnosis, procedure or treatment). ⁶

Despite assumptions regarding EPR systems, published evidence suggests that data quality in studies using routine EPR data is acceptable. In one study directly comparing quality of cancer registry data versus the same data derived directly (but manually extracted) from EPR systems, there was 95% concordance for most features including important elements such as primary site, laterality and histologic type. ⁷ Another study directly compared manual and electronic data collection from a critical care EPR system and reported that the EPR-derived data from over 241,000 patients undergoing more than 400,000 surgical procedures was of good quality; for example, only around 1% had missing race/ethnicity data, all cases had an associated procedure code and 84% had outpatient medication recorded. ⁸ In another study, data were extracted from specific fields from a sample of around 200 patients admitted to adult intensive care units, either via manual study-specific collection or extracted directly from the EPR system; concordance was high, with full agreement for 11/30 variables (35%) and a median Kappa score for categorical variables of 0.99 (IQR 0.92-1.00). Interestingly, in this study, where discordance was present, manual transcription errors were the most common source of discrepancies. ⁹ Whilst routinely extracted data therefore shows good scores for dimensions such as consistency, completeness, and uniqueness, there may be apparent ‘missing’ routine data, which is mainly related to the different levels of granularity required for secondary purposes compared to clinical coding. ¹⁰ In addition, EPR data quality is often variable-specific. For example, in a study of hypertension surveillance, blood pressure measurements and medications were well recorded but other elements such as smoking or alcohol status were often missing or incomplete, ¹¹ hence the need for assessment of EPR data quality for specific purposes. ¹²

Registry data is generally regarded as good quality. A review of paediatric cardiac surgery registries using >50,000 data elements in around 500 subjects reported that only 3% of data elements were missing, with 98% accuracy of recording. ¹³ However, it should be noted that even registries may have data quality issues. One retrospective chart review study of around 400 medical records from 14 hospitals compared to matched registry data reported only 80-90% accuracy for surgery type, chemotherapy and radiotherapy for a range of disorders, with accuracy related to the experience of those extracting the data. ¹⁴

There is therefore recent interest in more widespread use of routine EPR data for clinical trials and surveillance, partly since this approach would be cheaper and quicker than conducting dedicated trials and studies, but also since real-world effects can be estimated from such data, which may be important. A meta-analysis of 84 studies using routine data and 463 traditional trials reported that routine data studies demonstrate around 20% less favourable treatment effects compared to formal trials for the same conditions across a range of outcomes. ¹⁵ The aim of this study is therefore to directly compare data availability and analytic results for the same patients using matched data items extracted from a dedicated existing registry versus data extracted directly from an EPR system and trusted research environment.

Methods

The National JDM Cohort and Biomarker Study (JDCBS) is a voluntary cohort study; at the time of analysis, the JDCBS included data from 688 patients over a 20-year period, for which patients and families consent to the storage and use of their data and biosamples for secondary medical research purposes. ¹⁶ For simplicity, in the current analysis the JDCBS is referred to as the ‘registry’ to distinguish this dataset from the EPR-derived dataset from the same patients. Patients in the JDCBS attending Great Ormond Street Hospital (GOSH) were identified and their data extracted into a secure data environment (GOSH DRE). Routine EPR data from the same patients were also extracted directly from the EPR system (Epic) and linked, and all data were deidentified and stored in the secure data environment for analysis. The GOSH DRE is a trusted research environment (TRE) meeting NHS security and ICT standards, including ISO27001 and ISO27010, and the architecture and routinely extracted, deidentified EPR data constitute an HRA REC approved research database.
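The linkage and deidentification step can be illustrated with a minimal sketch. This is not the GOSH DRE pipeline: the keyed-hash pseudonymisation, the field names (`patient_id`, `pseudo_id`, `source`), the secret key and the toy records are all assumptions made purely for illustration.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data steward team (assumption).
SECRET_KEY = b"replace-with-a-securely-stored-key"


def pseudonymise(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed hash so the same patient links across sources."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def link_and_deidentify(registry_rows, epr_rows):
    """Keep patients present in both sources and replace IDs with pseudonyms."""
    registry_ids = {row["patient_id"] for row in registry_rows}
    epr_ids = {row["patient_id"] for row in epr_rows}
    shared = registry_ids & epr_ids
    linked = []
    for source, rows in (("registry", registry_rows), ("epr", epr_rows)):
        for row in rows:
            if row["patient_id"] in shared:
                # Drop the direct identifier before the row leaves this step.
                clean = {k: v for k, v in row.items() if k != "patient_id"}
                clean["pseudo_id"] = pseudonymise(row["patient_id"])
                clean["source"] = source
                linked.append(clean)
    return linked


# Toy records, invented for illustration only.
registry = [{"patient_id": "A1", "test": "albumin", "value": 41}]
epr = [
    {"patient_id": "A1", "test": "albumin", "value": 39},
    {"patient_id": "B2", "test": "albumin", "value": 44},
]
linked = link_and_deidentify(registry, epr)
```

In this sketch only patient `A1` appears in both sources, so `linked` holds one registry row and one EPR row sharing the same pseudonym, and the unmatched EPR-only patient is excluded.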

Data was extracted and provisioned by the GOSH data steward team and only the non-identifiable linked data made available through a secure workspace for subsequent analysis by the research team using R. Descriptive statistics and visualisations were carried out on both datasets and statistical comparison performed by comparison of proportions and Mann-Whitney U tests as appropriate for discrete and continuous variables.
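As an illustration of the two statistical comparisons named above, the following is a minimal Python sketch. It is not the authors' R code: it uses a normal approximation for the Mann-Whitney U test without tie or continuity correction, a pooled two-proportion z-test as one common way of comparing proportions, and invented toy values.

```python
import math
from statistics import NormalDist


def midranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    No tie or continuity correction is applied, so this is only a rough
    guide for small samples or heavily tied data.
    """
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, p


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (success_a / n_a - success_b / n_b) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Toy per-patient counts, invented for illustration only.
epr_counts = [12, 15, 18, 21, 30]
registry_counts = [1, 2, 3, 4, 5]
u_stat, p_value = mann_whitney_u(epr_counts, registry_counts)
```

For real analyses, an established statistics package with exact p-values and tie handling would be preferable to this hand-rolled version.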

Results

Of the 688 patients registered in the JDCBS overall, 286 had been managed at GOSH, and EPR data from the study period 2019-2021 inclusive was available for 270 of these. Total data points available for these patients for the categories of laboratory test results, medications, diagnoses and visits were 40,673 in the registry and 328,527 from EPR, 8-fold more data items over the same period. The data volumes varied by data type, but 2-10 fold more data items were available for the same categories using routinely extracted EPR data ( Table 1).

Table 1. Number of data items per category from GOSH patients included in the registry based on registry data and routinely extracted EPR data for the same patient group over the same time period.

Characteristic	JDCBS	DRE
Laboratory tests	27464	284150
Medications	9532	34868
Diagnoses	268	2772
Visits	3409	6777
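The fold differences quoted in the text can be rechecked directly from Table 1. The short sketch below does only that arithmetic; the dictionary layout and variable names are illustrative choices, and the counts are transcribed from the table.

```python
# Counts transcribed from Table 1 as (registry JDCBS, EPR DRE).
table1 = {
    "Laboratory tests": (27464, 284150),
    "Medications": (9532, 34868),
    "Diagnoses": (268, 2772),
    "Visits": (3409, 6777),
}

# Fold difference per category: EPR count divided by registry count.
fold = {name: epr / registry for name, (registry, epr) in table1.items()}

registry_total = sum(registry for registry, _ in table1.values())
epr_total = sum(epr for _, epr in table1.values())

for name, ratio in fold.items():
    print(f"{name}: {ratio:.1f}-fold")
print(f"Overall: {epr_total / registry_total:.1f}-fold")
```

The per-category ratios range from about 2-fold (visits) to about 10-fold (laboratory tests and diagnoses), consistent with the 2-10 fold range stated above. The registry column sums to 40,673 as quoted; the EPR column as printed sums to 328,567, marginally different from the 328,527 quoted in the text, and either figure gives an overall difference of approximately 8-fold.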

For some categories, such as diagnoses, there were significantly more entries recorded in the routine EPR data than in the registry data, but this is likely a result of the targeted nature of registry data collection combined with the recording of many non-specific ‘diagnoses’ within EPR coded data ( Table 2, Figure 1).

Table 2. Median and mean number of ‘diagnoses’ recorded per patient in EPR DRE versus JDCBS registry for the same patient group over the same time period.

Patients	DRE (N=270)	JDCBS (N=286)
Total diagnoses (N)
Minimum	0	0
Median (IQR)	6 (3-14)	1 (1-1)
Mean (SD)	10.27±12.01	0.94±0.24
Maximum	68	1

Figure 1. Chart of number of distinct diagnoses recorded from routine EPR and registry data.

The number of distinct diagnoses recorded from routine EPR data is more than 50x greater than the number of diagnoses recorded in the registry. However, examination of the most common diagnoses in each dataset demonstrates that registry diagnoses only include those high-level diagnoses directly related to the primary medical condition, whereas EPR data additionally includes comorbidities and other conditions.

In contrast, the median number of medications recorded per patient is lower in the routine DRE data than in the registry data (median 16 versus median 26 respectively), likely because GOSH EPR data only includes medications prescribed by the hospital, whereas registry data may include all medications used, whether prescribed at GOSH, in other hospitals or in primary care ( Table 3).

Table 3. Median and mean number of medications per patient in EPR DRE data and JDCBS registry data for the same patient group over the same time period.

	DRE (N=270)	JDCBS (N=286)
Total medications
Minimum	0	0
Median (IQR)	16.00 (0.00, 99.50)	26.00 (8.00, 49.00)
Mean (SD)	129.14 (±477.70)	33.33 (±30.96)
Maximum	7200	178

The category with the greatest fold difference in data items was, however, laboratory testing, with 10-fold more laboratory test results available per patient in the EPR-derived dataset compared to the registry data, likely a consequence of recording only selected laboratory tests within the registry ( Table 4).

Table 4. Number of laboratory test results available per patient from the EPR DRE dataset and the JDCBS registry data for the same patient group over the same time period, demonstrating many-fold more laboratory results in the routine EPR data.

	DRE (N=270)	JDCBS (N=286)
Total laboratory tests
Minimum	0	0
Median (IQR)	830 (432-1364)	74 (21-152)
Mean (SD)	1052.4±875.0	96.0±89.4
Maximum	5911	523

Further examination of the JDCBS and EPR DRE data laboratory test result types demonstrates a broadly similar pattern of testing with a marked predominance of repeated standard tests, specifically tests such as full blood count ( Figure 2). However, the overall number of distinct laboratory tests recorded in the registry was 34 compared to >1300 laboratory test types overall in routine data, likely a consequence of registry data collection of only specific predefined tests ( Figure 3).

Figure 2. Bar chart of most common laboratory tests.

The 20 most common laboratory test types available in the JDCBS registry (top) and the EPR DRE data (bottom) are provided, demonstrating broadly similar patterns of relative test frequencies despite around 10-fold more test results available through the EPR data.

Figure 3. Chart of number and types of unique laboratory test types in the JDCBS and EPR DRE datasets.

For test types present in both datasets, significantly more values were available from the routine EPR DRE data, resulting in small differences in overall result distributions of uncertain clinical significance ( Figure 4).

Figure 4. Examples of small differences in distributions of laboratory test values between JDCBS registry data and routine EPR DRE data.

Box whisker plots illustrating median, IQR and ranges for serum albumin (top) and serum LDH (bottom) from both datasets showing small differences in distribution of values.

However, the roughly order-of-magnitude greater number of data items in the EPR DRE data allows potential additional types of analysis to be carried out. For example, there is a relationship between the total number of laboratory tests performed and total number of EPR diagnoses recorded per patient ( Figure 5).

Figure 5. Relationship between total number of laboratory tests performed and total number of EPR diagnoses.

Using EPR DRE data per patient.
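One way such a per-patient relationship might be quantified is with a rank correlation. The sketch below computes Spearman's coefficient as the Pearson correlation of mid-ranks; it is offered only as an illustration, since the source does not state which measure, if any, was used, and the per-patient values are invented.

```python
from statistics import mean


def midranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the mid-ranks."""
    rx, ry = midranks(x), midranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy per-patient totals, invented for illustration only.
lab_tests_per_patient = [100, 400, 900, 1500, 2500]
diagnoses_per_patient = [1, 3, 6, 10, 20]
rho = spearman(lab_tests_per_patient, diagnoses_per_patient)
```

A rank-based measure is a reasonable default here because per-patient counts are heavily skewed, as the large gap between medians and maxima in the tables suggests.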

Finally, since registry data only includes selected attendances, the average number of outpatient visits recorded is more than twice as many per patient from routine EPR data as from the registry ( Table 5).

Table 5. Average number (median, mean) of hospital attendances per patient during the same time period from EPR DRE and JDCBS registry.

	DRE (N=270)	JDCBS (N=286)
Total outpatient visits
Minimum	0	1
Median (IQR)	22.00 (13.00, 30.75)	10.00 (4.00, 18.00)
Mean (SD)	25.10±22.7	11.92±9.17
Maximum	254	48

Discussion

The findings of this study demonstrate, first, that it is possible to use extracted routine electronic health record data to generate a dataset that recapitulates many aspects of the data found in a dedicated registry. Second, there are up to an order of magnitude more data points available from routine EPR data (approximately 8-fold overall), including data elements which may be of interest or use but were not initially considered or appreciated when setting up the registry, especially for elements such as laboratory test results. Third, use of all data points, such as from all laboratory tests performed, may demonstrate small but significant differences in test result distributions, indicating that registry data may not represent unselected routine clinical data, although, in general, distributions were similar and any differences were of uncertain significance. Fourth, additional analyses may be possible using more extensive routine EPR data due to ease of linkage regarding time points and data point interrelationships.

However, despite the additional volume of data available from routine EPR extractions, it remains uncertain whether this provides significant additional clinical or research insight, since the most common data items are repeat testing of common standard tests and it is likely that only a minority of test results are contributory to diagnosis and management. Finally, it should be recognised that only specific pre-defined data elements are collected in registries, often with well-described data dictionaries, whereas routine EPR derived data includes all items but is dependent on clinical data entry and coding; this is most apparent in the ‘diagnoses’ section, which in registry data is confined to the main underlying JDM related diagnosis but in the routine EPR dataset additionally includes a wide range of associated or incidental diagnoses and non-specific symptoms.

The findings do, however, indicate that significant effort and cost could potentially be avoided by more widespread use of routinely extracted EPR data to support, augment or replace dedicated disease-specific registries, since comparative analysis suggests that findings from both dataset types are broadly similar. However, there are differences in several aspects, such as hospital visits, medications and laboratory tests, indicating that each approach may be optimal for particular circumstances. Healthcare research should therefore begin to question the routine setting up of registries that duplicate data held in EPR systems; significant resource savings could be achieved by using routine EPR data wherever possible, enhanced by highly targeted registry collection for specific data elements, giving a customised hybrid approach that achieves maximum benefit.

It should be emphasised that the findings presented in the present study are based on routinely extracted EPR data from a single centre, which already has an established digital research environment and underlying processes and architecture for large-scale extraction, deidentification and harmonisation of electronic patient record data elements. The disease registry, in contrast, collects data from many different centres, each of which may have different electronic patient record systems and markedly different levels of digital and data maturity. Therefore, scaling the approach of extracting and collating or mapping similar data from multiple different organisations’ clinical systems adds significant complexity in aspects such as data harmonisation, ontology mapping and unification of formatting, all of which are essentially avoided by manual entry into a research data capture tool associated with a registry. The disadvantage of the registry approach is that such registries require both initial setup and ongoing management resources, with additional potential transcription and data entry errors, as well as intrinsic limitations to the extent of data collection, since there is a human resource burden directly proportional to the number of participants and number of data elements. It is hoped that future developments towards unifying healthcare data specifications for interoperability, such as HL7 FHIR R4, may significantly reduce the complexity of multicentre data harmonisation for such use cases, but at present few clinical systems support such tools or APIs beyond basic functionality.
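To make the harmonisation idea concrete, the following minimal sketch maps one locally coded laboratory result to a FHIR R4 `Observation` resource represented as a Python dictionary. The local code `ALB`, the input field names, the lookup table and the example row are hypothetical; the LOINC code shown is given for illustration and would need to be verified against a curated mapping at each site.

```python
import json

# Hypothetical local-code-to-LOINC lookup (assumption); a real mapping would
# be curated and validated by each contributing centre.
LOCAL_TO_LOINC = {
    "ALB": ("1751-7", "Albumin [Mass/volume] in Serum or Plasma"),
}


def to_fhir_observation(row: dict) -> dict:
    """Map one local laboratory result row to a FHIR R4 Observation dict."""
    loinc_code, display = LOCAL_TO_LOINC[row["local_code"]]
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {"system": "http://loinc.org", "code": loinc_code, "display": display}
            ]
        },
        # Reference uses a pseudonymous identifier, not a real patient ID.
        "subject": {"reference": f"Patient/{row['pseudo_id']}"},
        "effectiveDateTime": row["collected"],
        "valueQuantity": {
            "value": row["value"],
            "unit": row["unit"],
            "system": "http://unitsofmeasure.org",
            "code": row["unit"],
        },
    }


# Example row, invented for illustration only.
example_row = {
    "local_code": "ALB",
    "pseudo_id": "abc123",
    "collected": "2020-05-01T09:30:00Z",
    "value": 41,
    "unit": "g/L",
}
observation = to_fhir_observation(example_row)
print(json.dumps(observation, indent=2))
```

Once each centre emits results in a shared structure of this kind, downstream analysis code no longer needs to know about local test codes or formats, which is where most of the multicentre harmonisation effort described above would otherwise be spent.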

In this rare disease example (juvenile-onset dermatomyositis, annual incidence 2-3 per million children), ¹⁷ collection from many centres has a clear research benefit, allowing studies to be adequately powered and enabling cross-centre comparison of outcomes and practice. In addition, the development of an internationally agreed dataset for research and clinical use in this condition has facilitated comparisons of registries across countries. ¹⁸ In the future it would therefore be feasible to standardise the elements recorded in the EPR specific to this condition and then use routinely extracted, large datasets for research. This might provide significant savings of time and duplicated effort to the research community and enable a wider range of data elements to be incorporated into high-dimensional modelling of disease outcome, and so lead to significant benefit for patients.

Ethics and consent

The study was approved by the appropriate REC. For JDM data, approval was under REC 01/3/022 (20/03/2023), and the specific analysis was approved by the JDCBS Study Steering Committee, with all patients having provided written consent for the use of their data in research. Use of EPR data for research through the GOSH SDE is approved under REC reference 21/LO/0646 (13/10/2021).

Data availability statement

Underlying data

Individual patient level data is not available since REC approvals do not support data sharing beyond the platform without additional approval. Summary data is available on request through the corresponding author.

Acknowledgements

The Juvenile Dermatomyositis Cohort Biomarker Study & Repository (JDCBS) would like to thank all of the patients and their families who contributed to the JDCBS research study. We thank all local research coordinators and principal investigators who have made this research possible. Clinical, research and administrative contributors to JDCBS members were as follows:

Dr Kate Armon, and Ms Louise Coke, Ms Julie Cook and Ms Amy Nichols (Norfolk and Norwich University Hospitals);Dr Liza McCann, Mr Ian Roberts, Dr Eileen Baildam, Ms Louise Hanna, Ms Olivia Lloyd, Susan Wadeson, Ms Michelle Andrews, Ms Olivia Lloyd and Mrs Jane Roach (The Royal Liverpool Children’s Hospital, Alder Hey, Liverpool); Dr Phil Riley, Ms Ann McGovern, and Ms Verna Cuthbert (Royal Manchester Children’s Hospital, Manchester); Dr Clive Ryder, Ms Janis Scott, Ms Beverley Thomas, Professor Taunton Southwood, Dr Eslam Al-Abadi and Ms Ruth Howman (Birmingham Children’s Hospital, Birmingham); Dr Sue Wyatt, Mrs Gillian Jackson, Dr Mark Wood, Dr Tania Amin, Dr Vanessa VanRooyen, Ms Deborah Burton, Ms Louise Turner, Ms Heather Rostron, and Ms Sarah Hanson (Leeds General Infirmary, Leeds); Dr Joyce Davidson, Dr Janet Gardner-Medwin, Dr Neil Martin, Ms Sue Ferguson, Ms Liz Waxman and Mr Michael Browne, Ms Roisin Boyle, Ms Emily Blyth and Ms Susanne Cathcart (The Royal Hospital for Sick Children, Yorkhill, Glasgow); Dr Mark Friswell, Professor Helen Foster, Ms Alison Swift, Dr Sharmila Jandial, Ms Vicky Stevenson, Ms Debbie Wade, Dr Ethan Sen, Dr Eve Smith, Ms Lisa Qiao, Mr Stuart Watson and Ms Claire Duong, Dr Stephen Crulley, Mr Andrew Davies, Miss Caroline Miller, Ms Lynne Bell, Dr Flora McErlane, Dr Sunil Sampath, Dr Josh Bennet, Mrs Sharon King and Mr Christopher Long (Great North Children’s Hospital, Newcastle); Dr Helen Venning, Dr Rangaraj Satyapal, Mrs Elizabeth Stretton, Ms Mary Jordan, Dr Ellen Mosley, Ms Anna Frost, Ms Lindsay Crate, Dr Kishore Warrier, Ms Stefanie Stafford, Mrs Brogan Wrest, Ms Chia-Ping Chou, and Mr Paul Pryce (Queens Medical Centre, Nottingham); Professor Lucy Wedderburn, Dr Clarissa Pilkington, Dr Nathan Hasson, Dr Muthana Al-Obadi, Dr Giulia Varnier, Dr Sandrine Lacassagne, Ms Sue Maillard, Mrs Lauren Stone, Ms Elizabeth Halkon, Ms Virginia Brown, Ms Audrey Juggins, Dr Sally Smith, Ms Sian Lunt, Ms Elli Enayat, Ms Hemlata Varsani, Ms 
Laura Kassoumeri, Miss Laura Beard, Ms Katie Arnold, Mrs Yvonne Glackin, Ms Stephanie Simou, Dr Beverley Almeida, Dr Kiran Nistala, Dr Raquel Marques, Dr Claire Deakin, Dr Parichat Khaosut, Ms Stefanie Dowle, Dr Charalampia Papadopoulou, Dr Shireena Yasin, Dr Christina Boros, Dr Meredyth Wilkinson, Dr Chris Piper, Ms Cerise Johnson-Moore, Ms Lucy Marshall, Ms Kathryn O’Brien, Ms Emily Robinson, Mr Dominic Igbelina, Dr Polly Livermore, Dr Socrates Varakliotis, Ms Rosie Hamilton, Ms Lucy Nguyen, Mr Dario Cancemi, Dr Ovgu Kul Cinar, Dr Elena Moraitis and Dr Hannah Peckham (Great Ormond Street Hospital, London); Dr Kevin Murray (Princess Margaret Hospital, Perth, Western Australia); Dr Coziana Ciurtin, Dr John Ioannou, Mrs Caitlin Clifford, Ms Linda Suffield and Ms Laura Hennelly (University College London Hospital, London); Ms Helen Lee, Ms Sam Leach, Ms Helen Smith, Dr Anne-Marie McMahon, Ms Heather Chisem, Ms Jeanette Hall and Ms Amy Huffenberger (Sheffield’s Children’s Hospital, Sheffield); Dr Nick Wilkinson, Ms Emma Inness, Ms Eunice Kendall, Mr David Mayers, Ms Ruth Etherton, Ms Danielle Miller and Dr Kathryn Bailey (Oxford University Hospitals, Oxford); Dr Jacqui Clinch, Ms Natalie Fineman, Ms Helen Pluess-Hall, Ms Suzanne Sketchley, Ms Melanie Marsh, Ms Anna Fry, Ms Maisy Dawkins-Lloyd and Ms Mashal Asif (Bristol Royal Hospital for Children, Bristol); Dr Joyce Davidson, Margaret Connon and Ms Lindsay Vallance (Royal Aberdeen Children’s Hospital); Dr Kirsty Haslam, Ms Charlene Bass-Woodcock, Ms Trudy Booth, and Ms Louise Akeroyd (Bradford Teaching Hospitals); Dr Alice Leahy, Amy Collier, Rebecca Cutts, Emma Macleod, Dr Hans De Graaf, Dr Brian Davidson, Sarah Hartfree, Ms Elizabeth Fofana and Ms Lorena Caruana (University Hospital Southampton) and all the Children, Young people and their families who have contributed to this research.

References

1. Benin, Fenick, Herrin: How good are the data? Feasible approach to validation of metrics of quality derived from an outpatient electronic health record. Am. J. Med. Qual. 2011;26:441–451. PMID: 21926280. doi: 10.1177/1062860611403136

2. Weiskopf, Weng: Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research. J. Am. Med. Inform. Assoc. 2013;20:144–151. PMID: 22733976. doi: 10.1136/amiajnl-2011-000681. PMC3555312

3. Reimer, Milinovich, Madigan: Data quality assessment framework to assess electronic medical record data for use in research. Int. J. Med. Inform. 2016;90:40–47. PMID: 27103196. doi: 10.1016/j.ijmedinf.2016.03.006. PMC4845906

4. Kahn, Callahan, Barnard: A Harmonized Data Quality Assessment Terminology and Framework for the Secondary Use of Electronic Health Record Data. EGEMS (Wash DC). 2016;4:1244. PMID: 27713905. doi: 10.13063/2327-9214.1244

5. Broberg, Sklenar, Burchill: Feasibility of Using Electronic Medical Record Data for Tracking Quality Indicators in Adults with Congenital Heart Disease. Congenit. Heart Dis. 2015;10:E268–E277. PMID: 26239748. doi: 10.1111/chd.12289

6. Kannan, Fish, Mutz: Rapid Development of Specialty Population Registries and Quality Measures from Electronic Health Record Data*. An Agile Framework. Methods Inf. Med. 2017;56:e74–e83. PMID: 28930362. doi: 10.3414/ME16-02-0031

7. Schouten, Jager, van den Brandt: Quality of cancer registry data: a comparison of data provided by clinicians with those of registration personnel. Br. J. Cancer. 1993;68:974–977. PMID: 8217612. doi: 10.1038/bjc.1993.464. PMC1968711

8. Corey, Helmkamp, Simons: Assessing Quality of Surgical Real-World Data from an Automated Electronic Health Record Pipeline. J. Am. Coll. Surg. 2020;230:295–305e12. PMID: 31945461. doi: 10.1016/j.jamcollsurg.2019.12.005

9. Brundin-Mather, Soo, Zuege: Secondary EMR data for quality improvement and research: A comparison of manual and electronic data collection from an integrated critical care electronic medical record system. J. Crit. Care. 2018;47:295–301. PMID: 30099330. doi: 10.1016/j.jcrc.2018.07.021

10. Aerts, Kalra, Sáez: Quality of Hospital Electronic Health Record (EHR) Data Based on the International Consortium for Health Outcomes Measurement (ICHOM) in Heart Failure: Pilot Data Quality Assessment Study. JMIR Med. Inform. 2021;9:e27842. PMID: 34346902. doi: 10.2196/27842. PMC8374665

11. Garies, McBrien, Quan: A data quality assessment to inform hypertension surveillance using primary care electronic medical record data from Alberta, Canada. BMC Public Health. 2021;21:264. PMID: 33530975. doi: 10.1186/s12889-021-10295-w. PMC7852125

12. Ozonze, Scott, Hopgood: Automating Electronic Health Record Data Quality Assessment. J. Med. Syst. 2023;47:23. PMID: 36781551. doi: 10.1007/s10916-022-01892-2. PMC9925537

13. Nathan, Jacobs, Gaynor: Completeness and Accuracy of Local Clinical Registry Data for Children Undergoing Heart Surgery. Ann. Thorac. Surg. 2017;103:629–636. PMID: 27726857. doi: 10.1016/j.athoracsur.2016.06.111. PMC5253303

14. Cheng C-Y, Chiang C-J, Hsieh C-H: Is quality of registry treatment data related to registrar experience and workload? A study of Taiwan cancer registry data. J. Formos. Med. Assoc. 2018;117:1093–1100. PMID: 29329964. doi: 10.1016/j.jfma.2017.12.012

15. Mc Cord, Ewald, Agarwal: Treatment effects in randomised trials using routinely collected data for outcome assessment versus traditional trials: meta-research study. BMJ. 2021;372:n450. doi: 10.1136/bmj.n450

16. Martin, Krol, Smith: A national registry for juvenile dermatomyositis and other paediatric idiopathic inflammatory myopathies: 10 years’ experience; the Juvenile Dermatomyositis National (UK and Ireland) Cohort Biomarker Study and Repository for Idiopathic Inflammatory Myopathies. Rheumatology (Oxford). 2011;50:137–145. PMID: 20823094. doi: 10.1093/rheumatology/keq261

17. Papadopoulou, Chew, Wilkinson MGL: Juvenile idiopathic inflammatory myositis: an update on pathophysiology and clinical care. Nat. Rev. Rheumatol. 2023;19:343–362. PMID: 37188756. doi: 10.1038/s41584-023-00967-9. PMC10184643

18. McCann, Pilkington, Huber: Development of a consensus core dataset in juvenile dermatomyositis for clinical use to inform research. Ann. Rheum. Dis. 2018;77:241–250. PMID: 29084729. doi: 10.1136/annrheumdis-2017-212141. PMC5816738