Keywords
Informational masking, Energetic masking, Elderly, Train, Speech perception in noise
Understanding speech in noisy environments is a major challenge for the auditory system, and one that grows with aging. Poorer speech recognition in the elderly population can arise from various factors, including peripheral hearing impairment, decline in cognitive capabilities, and processing deficits at suprathreshold levels. It is difficult to determine the exact role of each of these factors in the speech problems of the elderly1. Most elderly individuals complain about difficulty in understanding speech in noisy environments, despite having normal hearing thresholds2. In addition, the type of background noise heavily influences how much speech intelligibility is degraded1. Generally, hearing problems worsen in noisy environments when the target speech is masked by a competing signal. In this situation, in addition to energetic masking, there is another type of masking known as informational masking3.
The spectro-temporal overlap between target and competing speech, which leads to poor target identification, is called energetic masking4. Energetic masking is caused by physical interactions between target and competing speech5,6 at the low levels of the peripheral auditory system7. Recent research has suggested that when competing signals occur randomly or when there is a high similarity between target and competing signals (for example, when both signals are speech), another type of masking also occurs. This type of masking, which occurs in response to uncertainty of the competing signal or similarity between target and competing signals, is called non-energetic or informational masking8,9. It leads to failure in selecting auditory objects and therefore impairs auditory scene analysis. In contrast with energetic masking, which occurs due to the limitations of frequency selectivity at peripheral levels, informational masking reflects processing capacity limitations at central auditory levels10.
Various studies have indicated that with an increase in age, side-effects of competing noises will increase1,3,11,12. Different studies have revealed that elderly populations, who do not have peripheral auditory impairment, suffer from diminished ability of using acoustic and phonetic signs to separate speech from background noise, compared to young people; therefore, more informational masking occurs in this population3,13. Since problems with understanding speech will reduce social interactions of the elderly population, it is very important to develop effective auditory rehabilitation programs to prevent their isolation and to improve their quality of life1.
Auditory spatial processing plays an important role in speech recognition in complex noisy environments14, since it enables the listener to differentiate the target signal from competing signals via auditory scene analysis and forming auditory streams15. Based on the results of different studies, it has become clear that the most important sign of informational masking release is spatial separation of target and competing signals16. In addition, it has been shown that auditory spatial processing ability is lower in the elderly with normal hearing than in young people. The reduction of localization accuracy and taking advantage of auditory spatial processing, consequently decreasing binaural processing, are not totally related to impaired hearing thresholds17,18. Hence, it leads to poorer speech recognition in elderly people with normal hearing in noisy environments14. On the other hand, it has been shown that the elderly population need a higher signal to noise ratio for speech recognition in the presence of noise, compared to young people. These changes are possibly due to the reduction in the ability of using acoustic and phonetic signs to separate target signals from background noise11. Therefore, in elderly people, without considering the hearing impairment, the ability to use spatial and non-spatial signs for informational masking release diminishes due to the reduction of cognitive processing abilities11,12, temporal processing defects3, defects in the connection between hemispheres, and diminished ability to separate simultaneous sounds11.
Current neuroscientific studies have suggested that the central auditory system has a strong neuroplasticity capability for auditory spatial processing19,20 and since the effect of short-term and long-term auditory rehabilitation programs has been demonstrated in elderly people2, it seems that by providing auditory spatial training, we can aid the elderly population to perform informational masking release, preventing them from missing conversations in noisy environments.
The present study has two parts. The first part is the development of a test for measuring and evaluating informational masking. The second part is a clinical trial of an auditory spatial training program in elderly people with normal hearing, which could diminish informational masking. The main hypothesis for the second part of the study is that auditory spatial training for elderly people will improve speech recognition in noisy environments by stimulating the centers related to binaural processing.
This is version 1 of this protocol.
This research will be conducted in the Audiology Clinic of Rehabilitation Faculty of Iran University of Medical Sciences.
This study consists of two main parts. Part 1: development of an informational masking measurement test and determination of its validity characteristics in a cross-sectional test development study. The study population will be a group of elderly (60 to 75 years old) and a group of young (20 to 40 years old) people. The young people will be recruited from rehabilitation students of Iran University of Medical Sciences, while the elderly people will be those referred to the audiology clinics of Iran University of Medical Sciences.
Part 2 (simple randomized clinical trial): the effect of training on informational masking release. This part of the study uses a simple randomized clinical trial design in which patients will be randomly assigned to two groups: control (not receiving auditory spatial training) and intervention (receiving auditory spatial training). The random allocation will be based on balanced randomization [1:1] and will be applied using a random number table (those assigned an odd number, control group; those assigned an even number, intervention group). The allocation sequence will be generated by a member of the audiology clinic staff of IUMS who will not have any role in the study. An elderly population, 60–75 years old, referred to the audiology clinics of Iran University of Medical Sciences will be selected. The two groups will be matched for age and gender. Those in the control group will not receive any rehabilitation programs during the study.
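The odd/even allocation rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `allocate`, the seed, and the use of a pseudo-random generator in place of a printed random number table are all assumptions, and the 32 participant IDs simply reflect the planned 16 per group.

```python
import random


def allocate(participant_ids, seed=None):
    """Assign each participant by the parity of a random number.

    Odd number -> control, even number -> intervention, as in the protocol.
    random.Random stands in for the printed random number table (assumption).
    """
    rng = random.Random(seed)
    allocation = {}
    for pid in participant_ids:
        number = rng.randint(0, 99)  # hypothetical two-digit table entry
        allocation[pid] = "control" if number % 2 == 1 else "intervention"
    return allocation


# Hypothetical usage: 32 participant IDs, fixed seed for reproducibility.
groups = allocate(range(1, 33), seed=2019)
```

Parity alone does not guarantee exactly equal group sizes, so a balanced 1:1 split would need an additional rule (for example, closing a group once it is full); the protocol does not specify one, so none is shown here.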
Inclusion criteria (for all participants in the study): auditory thresholds ≤25dB within the 250–2000Hz frequency range and <40dB at 4000Hz frequency, ensuring lack of salient cognitive problems using Mini Mental State Examination (MMSE)21; having diploma or higher degree; right-handedness (using Edinburgh handedness inventory); speaking Farsi and being monolingual; complaint about speech in noise perception difficulties (just for those in part 2 of the study); and normal condition of middle ear function.
Exclusion criteria (for all participants in the study): unwillingness for participation in each step of research and not meeting inclusion criteria.
Part 1: Developing an informational masking measurement test and determining its validity. In studies of informational masking, the coordinate response measure (CRM) has frequently been introduced as one of the most popular speech materials for evaluating informational masking22. All CRM sentences share the same rigid structure: "Ready [call sign] go to [color] [number] now". In these sentences, eight call signs, four colors, and eight numbers from 1 to 9 can be used. The sentences will be spoken by speakers of different genders22–24. In the present study, 256 sentences will be created for each speaker (8 × 4 × 8). Sentences will be spoken by four speakers (two women and two men), providing a total of 1024 sentences. Although CRM stimuli were initially designed to measure speech perception in the presence of competing signals, these speech materials provide no contextual information; i.e. predicting the given color or number in the phrases is not possible. This is an important factor in measuring informational masking23.
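The sentence counts above follow directly from the combinatorics of the CRM frame. The sketch below enumerates the combinations to confirm them; the call signs, colors, numbers, and speaker labels are placeholders (assumptions), since the Persian items are not listed in the text, and the English frame is used only for illustration.

```python
from itertools import product

# Placeholder item lists (hypothetical labels, not the actual Persian items).
CALL_SIGNS = [f"call_sign_{i}" for i in range(1, 9)]  # 8 call signs
COLORS = [f"color_{i}" for i in range(1, 5)]          # 4 colors
NUMBERS = [f"number_{i}" for i in range(1, 9)]        # 8 numbers
SPEAKERS = ["female_1", "female_2", "male_1", "male_2"]


def build_sentences():
    """Return every call sign / color / number combination in the CRM frame."""
    return [
        f"Ready {call} go to {color} {number} now"
        for call, color, number in product(CALL_SIGNS, COLORS, NUMBERS)
    ]


sentences_per_speaker = build_sentences()  # 8 * 4 * 8 combinations
corpus = [(speaker, s) for speaker in SPEAKERS for s in sentences_per_speaker]
```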
Since there is no Persian version of these sentences, this research will prepare the sentences and determine their content and face validity and reliability. Then, after selecting the nouns, colors, and numbers used in the sentence, conforming to the main English version, the prepared sentences will be given to experts in this field (audiologists, speech therapists, and linguistics) to determine the content validity. These experts are the academic members of rehabilitation faculties of IUMS, Tehran University of Medical Sciences (TUMS) and Shahid Beheshti University of Medical Sciences (SBUMS). These experts will be emailed a questionnaire to score the validity items (see Table S1, Extended data).
After selecting the pattern that best matches the Persian language, based on the model presented for recording the sentences, all sentences will be recorded in a studio with the four speakers. The recording will follow all criteria of the English version, including a sampling rate of 44.1 kHz and giving speakers 3 s to produce each sentence. Then all the sentences will be scaled, and the words in the CRM sentences will be time-aligned so that they occur simultaneously across sentences (called coordinate sentences)23. Each sentence will be band-pass filtered from 80 to 8000 Hz. To determine the face validity, the recorded sentences will again be given to the experts mentioned above to determine their suitability. They will complete the questionnaire, which will be emailed to them.
To determine the reliability of the CRM speech materials, the mean scores of CRM recognition in quiet will be evaluated in a group of young and old people with normal hearing who do not have speech perception difficulties in noisy environments. There will be one preliminary test and then a re-test. This evaluation will be carried out at the participant's comfortable hearing level. The score will be calculated as the percentage of sentences recognized correctly. A sentence will be scored as correct when the color-number combination is recognized correctly23,24. In this study, the mean correct score for color, number, and noun will be studied separately in order to calculate the error percentage for each of them22. Once prepared, these sentences can be used in part 2 of the study.
The best way to measure informational masking value is determining the score for speech recognition in the presence of meaningful and meaningless competing noise. To this end, the recognition score for Persian version of CRM speech corpus will be measured under two conditions:
A. In the presence of meaningful competing noise: The competing signal is selected from the Persian version of CRM corpus, where the call sign, color and number used in the competing sentence is different from those of the target sentence and it will be expressed by a different speaker. The individual will be trained to pay attention to certain target call signs and ignore other signals22–24. As one of the important effective factors for informational masking is the great similarity between target and competing signals (like when both of them are speech)8,9, using CRM sentences as both the target signal and competing signal, the high semantic and syntactic similarity would develop between target and competing signals22,23.
B. In the presence of meaningless competing signal: the previous signal is manipulated such that its spectral content remains fixed but it becomes meaningless; that is, energetic masking remains but informational masking is reduced. "Time-reversed speech" will be used for this purpose. This is one of the most effective methods in behavioral and neurophysiological research on the effects of speech signals on each other. In this method, one of two signals with the same long-term acoustic spectrum is divided into non-overlapping time segments, each segment is reversed in time, and the reversed segments are concatenated. The result is a signal that equals the first signal in terms of spectrum but is not understandable25. With 20–40 millisecond time windows, this method does not make the speech signal sufficiently unintelligible; therefore, longer time windows should be used25. MATLAB R2018 software will be used to construct this signal.
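The segment-wise time reversal described above can be illustrated with a short Python sketch (the protocol itself will use MATLAB; Python is used here only for illustration). The function names are hypothetical, and the window duration is left as a parameter because the text states only that it should be longer than 20–40 ms. The 44.1 kHz default matches the recording sampling rate given earlier.

```python
def window_length_in_samples(window_ms, sample_rate_hz=44100):
    """Convert a window duration in milliseconds to a whole number of samples."""
    return int(round(window_ms * sample_rate_hz / 1000))


def locally_time_reverse(samples, segment_length):
    """Split a signal into non-overlapping segments, reverse each, and rejoin.

    The last segment may be shorter than segment_length; it is reversed too.
    """
    if segment_length < 1:
        raise ValueError("segment_length must be at least 1 sample")
    reversed_signal = []
    for start in range(0, len(samples), segment_length):
        segment = samples[start:start + segment_length]
        reversed_signal.extend(reversed(segment))
    return reversed_signal


# Toy example with 7 "samples" and 3-sample segments.
toy = locally_time_reverse([1, 2, 3, 4, 5, 6, 7], 3)
```

Because every sample is kept and only its position within a segment changes, the output has the same length and the same set of sample values as the input.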
The signal to noise ratios in the sentence recognition test in steps A and B will be ±10, ±5, and 0 dB. The target signal is always presented from a loudspeaker at 0 degrees azimuth, and the two competing signals are presented from loudspeakers at either ±90 degrees or 0 degrees azimuth (once with spatial separation and once in the direction of the target signal). The competing signal will once have the same gender as the target signal and once a different gender. As a result, 20 conditions will be evaluated at each step (5 signal to noise ratios and two spatial angles with two different genders).
Finally, the informational masking score in all 20 conditions will be calculated as follows:
Informational masking score (percentage) = speech recognition score in the meaningful competing noise condition − speech recognition score in the non-understandable noise condition.
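As a rough sketch of this calculation, the 20 conditions can be enumerated and the difference taken per condition. The condition labels and the function name are assumptions chosen for illustration, and the scores in the usage example are made-up placeholders, not data.

```python
from itertools import product

SNRS_DB = [-10, -5, 0, 5, 10]          # 5 signal to noise ratios
MASKER_AZIMUTHS = ["0", "±90"]         # co-located vs spatially separated
MASKER_GENDERS = ["same", "different"]  # relative to the target talker

CONDITIONS = list(product(SNRS_DB, MASKER_AZIMUTHS, MASKER_GENDERS))


def informational_masking_scores(meaningful, meaningless):
    """Per-condition difference: meaningful-masker score minus meaningless-masker score.

    Both arguments map a condition tuple to a percent-correct score.
    """
    return {cond: meaningful[cond] - meaningless[cond] for cond in CONDITIONS}


# Placeholder scores (hypothetical): 60% with speech masker, 75% with reversed masker.
example = informational_masking_scores(
    {cond: 60.0 for cond in CONDITIONS},
    {cond: 75.0 for cond in CONDITIONS},
)
```

With the sign convention of the formula above, a listener who performs worse with the meaningful masker gets a negative score, so larger magnitudes indicate more informational masking.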
In this step of the research, construct validity will be used to determine the validity of the test. For this purpose, Speech, Spatial, and Qualities of Hearing Scale (SSQ) questionnaire score of each individual will be compared against the informational masking score26. Figure S1 (Extended data) represents the participant’s timeline of the first part of the study.
Part 2: the effect of auditory spatial training on the informational masking release. This part of research will be conducted in three steps: before auditory spatial training, during training, and after training.
1. Assessments prior to auditory spatial training (preliminary interview)
- Obtain patient history to confirm the inclusion and exclusion criteria of the participants
- Initial clinical examination, including otoscopy and tympanometry
- Perform pure tone audiometry test
- Perform MMSE questionnaire to ensure lack of salient cognition problem in the participants21
- Determine speech perception difficulties in the presence of noise: this will be evaluated with a question: Do you have difficulty in understanding speech in noisy situations? There will be three response options: yes, no or sometimes. Those who respond yes will be entered into the study.
- Measure informational masking score using the test constructed in Part 1 (primary outcome)
- Determine the SSQ score26. The SSQ self-assessment questionnaire will be filled out by the researcher during the preliminary interview. As improving informational masking can improve the quality of life of people, this questionnaire will be used to measure the quality of life of the participants quantitatively (secondary outcome).
2. Providing auditory spatial training (intervention group only)
Auditory spatial training is designed based on five signs that are important in informational masking release: angular differentiation between target and competing signals16,27–29; signal to noise ratio27; similarity and difference between the target and competing signals12; similar or different gender for target and competing signals12,30; and meaningfulness of the competing signal12. Training sessions will be divided into three general steps by considering the competing signals. In the first step, meaningless competing signals will be used like white noise. In the second step, in order to make the training process somewhat difficult, meaning-carrying signals like speech babble consisting of four speakers will be used. Finally, sentence materials with male and female genders will be used. The reason for using the gender factor is that consideration of gender similarity or difference between target and competing signals is one of the signs that adults use for informational masking release12,30.
In all steps, the target signal will be presented from the loudspeaker at 0 degrees azimuth and competing signals from different azimuth angles such as ±90 and 0 degrees31,32. Therefore, the difficulty of training will grow in each step by reducing the azimuth angle of the competing signal33.
Sentence signals are selected from Persian version of QuickSIN34. Every step of training is implemented as follows:
The intensity of the competing signal is fixed at 60 dBSPL, and at the beginning the intensity of target signals is 70 dBSPL. Three first sentences are used for familiarization. If an individual needs more practice, more familiarization sentences are provided.
The individual is asked to identify the keywords heard in the target sentences, and feedback is given on both correct and incorrect identifications. If the individual identifies more than 50% of the keywords, the sentence is considered correct. At each signal to noise ratio, 5 sentences are presented; if the individual correctly identifies more than 50% of them, the signal to noise ratio is decreased by 5 dB and another 5 sentences are presented. Once the individual can no longer correctly identify more than 50% of the presented sentences at a given signal to noise ratio, training begins with that signal to noise ratio as the initial level. The training will continue for 20 minutes and the intensity will change adaptively: after a correctly identified sentence the intensity will be decreased by 1.5 dB, while it will be increased by 2.5 dB if the individual scores less than 50% of the words correctly. At each intensity where the individual correctly identifies the sentence, the next sentence will be presented and the above process will repeat.
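The two phases described above (a descending search in 5 dB steps, then an adaptive track with 1.5 dB down and 2.5 dB up steps) can be sketched as follows. All function names are hypothetical. The sketch also makes two assumptions that the text leaves open: a score of exactly 50% is treated as incorrect, and the starting signal to noise ratio is +10 dB (70 dB SPL target against a 60 dB SPL competing signal).

```python
def sentence_is_correct(keywords_identified, keywords_total):
    """A sentence counts as correct when more than 50% of its keywords are identified."""
    return keywords_identified / keywords_total > 0.5


def find_starting_snr(blocks, start_snr_db=10.0, step_db=5.0):
    """Descend in fixed steps while more than 50% of a block's sentences are correct.

    blocks: list of blocks in presentation order; each block is a list of
    booleans (one per sentence, True = sentence correct).
    Returns the SNR at which the listener first fails a block.
    """
    snr = start_snr_db
    for block in blocks:
        if sum(block) / len(block) > 0.5:
            snr -= step_db  # block passed: make the task harder
        else:
            break           # block failed: training starts at this SNR
    return snr


def adaptive_track(start_level_db, outcomes, down_db=1.5, up_db=2.5):
    """Return the sequence of levels visited: down after correct, up after incorrect."""
    levels = [start_level_db]
    for correct in outcomes:
        step = -down_db if correct else up_db
        levels.append(levels[-1] + step)
    return levels


# Hypothetical listener: passes two blocks, fails the third.
start = find_starting_snr([[True] * 5,
                           [True, True, True, False, False],
                           [True, False, False, False, False]])
track = adaptive_track(start, [True, True, False])
```

The unequal step sizes mean the track moves up faster after an error than it moves down after a success, which keeps the task from becoming too hard too quickly.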
The optimal condition for perceptual auditory learning includes active listening to high repetition of signals during the consecutive educational sessions, which is conducted within a short time interval. Since long-term training is not a very suitable option in the clinic2, trainings are repeated twice a week over 8 sessions35.
3. Assessments after auditory spatial training (interview immediately after and one month after training)
The informational masking test (as per Part 1) will be done immediately after training and one month after using the Persian list of the coordinate response measure (CRM) corpus, which will be compared with the pre-training results (preliminary interview). This score will be the primary outcome. The reason for repeating experiments one month after the intervention is determining the reliability of the results obtained by intervention for informational masking release.
The informational masking release value will be calculated based on the difference between sentence recognition score (in all 20 conditions of signal to noise, different spatial angles, and two genders) in both noise situations (meaningful and non-understandable). The changes in informational masking in the assessments will be calculated before and after the intervention across all 20 conditions (see Table S2, Extended data).
As the ultimate purpose of this research is improving the quality of life of elderly people, the score of SSQ immediately and one month after intervention will be obtained and the results of both intervention and control groups will be compared separately. This score will be the secondary outcome of the intervention. Figure S2 (Extended data) represents participants timeline of the second part of the study.
Part 1. The study of Terwee et al. was used as the basis for determining the sample size of the first part of our study. They suggested that at least 50 patients in each group must be included to evaluate construct validity36. In total, 50 young people aged between 20 and 40 years and 50 elderly people aged between 60 and 75 years, with normal hearing and no difficulty understanding speech in noisy environments, will be recruited.
Part 2. The following formula is used to determine the sample size:
S1: standard deviation of the studied variable in the first group (case, exposed, or intervened)
S2: standard deviation of the studied variable in the second group (control, unexposed, or compared)
µ1: mean of the studied variable in the first group
µ2: mean of the studied variable in the second group
α=0.05
Power (1−β)=80%
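Assuming the standard formula for comparing two means is intended (the symbols defined above match it), the sample size per group, n, would be as follows. Here Z denotes a standard normal quantile, and the stated 80% is taken as the power, 1 − β; both readings are assumptions.

```latex
n = \frac{\left(Z_{1-\alpha/2} + Z_{1-\beta}\right)^{2}\,\left(S_1^{2} + S_2^{2}\right)}{\left(\mu_1 - \mu_2\right)^{2}}
```

For a two-sided α = 0.05 and 80% power, Z_{1−α/2} ≈ 1.96 and Z_{1−β} ≈ 0.84.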
In this formula, the studied variable is the extent of informational masking changes before and after the intervention. No previous study has used the same training as this study proposes; therefore, we considered the study of Delphi et al., which was conducted on a group of elderly individuals37. The sample size of that study was 16 patients in each group. We will use the same sample size.
Central tendency and dispersion indices (mean and standard deviation) will be used in descriptive analysis of data. In data analysis and for determining the reliability of the Persian version of the coordinate response measure (CRM) corpus, paired t-test and Pearson correlation will be used in the case of normality of data; otherwise, Spearman test will be employed, and one-way ANOVA will be utilized for determining intra-class correlation coefficient.
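For readers unfamiliar with the test-retest step, a Pearson correlation between test and re-test scores can be computed as in the sketch below. The analysis itself will be run in SPSS; this plain Python version, its function name, and the scores in the usage example (made-up placeholders, not data) are for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / (spread_x * spread_y)


# Hypothetical percent-correct scores for four listeners, test vs re-test.
r = pearson_r([80.0, 85.0, 90.0, 95.0], [82.0, 84.0, 91.0, 96.0])
```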
In part 2 of the research, repeated measures ANOVA values will be used for inter-group comparisons and two-way ANOVA will be employed for comparison among groups. SPSS software (V20.0, IBM Corporation, New York, USA) will be used for statistical data analysis and the significance level for all tests will be 0.05.
The Medical Ethics Committee of Iran University of Medical Sciences approved the study protocol (IR.IUMS.REC.1397.303) and the ethical principles of the ethics committee will be observed in this research. Researchers will send any future amendments to the protocol to the ethics committee.
One of the researchers of this study will obtain written informed consent from patients willing to participate in the trial (see Extended data). The purpose of the research and its steps will be explained to all participants before the study starts. Participants will be assured of the confidentiality of data and test results. Participants will be made aware that they can withdraw from the study whenever they want. The tests have no side-effects for the studied individuals and all tests and training sessions are without cost to the participants.
To promote participant retention and complete follow-up, in every training session the examiner will provide feedback to all participants and will inform them about the training progress. The researcher will ask them about the impact of training on the participant’s daily communication conversations.
All data will be entered into forms which are prepared for data collection (see Table S2; Extended data) and the participant files will be stored at study site and will be maintained in a secure place and manner. Participant files will be maintained in storage for a period of 2 years after completion of the study. Only Principal Investigators will be given access to the study data.
The study outcomes will be published through peer-reviewed journals. The data resulting from this study will be released to the audiologists and participants and the general medical community. The results of this trial will be communicated to the external funding body through a formal report. There is no limit in the publication of the trial results.
Since there is a progressive increase in elderly populations around the world, the independence of this age group has gained much attention, and Iran is no exception38. One of the most important points in independent life during aging is the capability for effective verbal communication. Unfortunately, this capability declines in elderly people, especially tracking speech in environments where several speakers talk with each other. Most elderly individuals complain that despite good hearing, they cannot understand speech in noisy environments1. Indeed, elderly people cannot use auditory spatial signs for informational masking release due to the reduction of their auditory processing and cognitive abilities11,12. Since informational masking has an important role in competing signal environments and rehabilitation programs have not considered this an important aspect of masking, designing training that can help elderly people in releasing this masking is novel. Therefore, if the main research hypothesis, i.e. auditory spatial training can improve informational masking release in the elderly people, is confirmed, by providing the therapeutic solution in this age group, a series of auditory spatial trainings based on informational masking release will be provided for audiologists. In addition, the Persian version of the coordinate response measure (CRM) corpus and its reliability and validity will be calculated to be used in research on speech recognition in noisy environments.
Open Science Framework: The effects of auditory spatial training on informational masking release in elderly listeners: a study protocol for a randomized clinical trial. https://doi.org/10.17605/OSF.IO/SDEJP (reference 39).
This project contains the following extended data:
Table S1: The questionnaire of content and face validity of Persian version of coordinate response measure (CRM) corpus
Table S2: The data collection sheet for the second part of the study
Figure S1: Participant timeline for part 1 of the study (Developing an informational masking measurement test and determining its validity)
Figure S2: Participant timeline for part 2 of the study (the effect of auditory spatial training on the informational masking release)
Informed consent form for the participants of the first part of the study
Informed consent form for the participants of the second part of the study
Data are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
Open Science Framework: SPIRIT checklist for ‘The effects of auditory spatial training on informational masking release in elderly listeners: a study protocol for a randomized clinical trial’: https://doi.org/10.17605/OSF.IO/SDEJP (reference 38).
This study was part of a Ph.D. Dissertation approved by Iran University of Medical Sciences (IUMS), Tehran, Iran and is financially supported by IUMS (Contract No: 97-4-6-13657).
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Is the rationale for, and objectives of, the study clearly described?
Yes
Is the study design appropriate for the research question?
Partly
Are sufficient details of the methods provided to allow replication by others?
Partly
Are the datasets clearly presented in a useable and accessible format?
Not applicable
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Assessment and management of auditory processing deficit in patients with neurological disorders. Diagnostic audiology
Is the rationale for, and objectives of, the study clearly described?
Yes
Is the study design appropriate for the research question?
Partly
Are sufficient details of the methods provided to allow replication by others?
Yes
Are the datasets clearly presented in a useable and accessible format?
Not applicable
References
1. Shawn Green C, Bavelier D, Kramer A, Vinogradov S, et al.: Improving Methodological Standards in Behavioral Interventions for Cognitive Enhancement. Journal of Cognitive Enhancement. 2019; 3 (1): 2-29
Competing Interests: Drs. Gallun and Seitz are funded by the National Institutes of Health to develop evaluation and training systems to better diagnose and rehabilitate auditory processing dysfunction
Reviewer Expertise: Auditory processing, informational masking, perceptual learning and training
| Version | Invited Reviewer 1 | Invited Reviewer 2 |
|---|---|---|
| Version 2 (revision), 04 Jul 19 | read | read |
| Version 1, 09 Apr 19 | read | read |