Module 3

Recording Technique

How the stimulus is built, how the recording is made, and how a statistical test — not a clinician’s eye — decides whether a response is present.

Building the stimulus

An ASSR stimulus is defined by two independent frequencies. The carrier frequency is the audiometric frequency being tested; it determines which region of the cochlea is activated. The modulation frequency is the rate at which that carrier is varied; it relates to the neural generators of the recorded response, as Module 2 described. The two can be controlled independently [9].

Carriers of 500, 1000, 2000, and 4000 Hz are the standard set, mirroring pure-tone audiometry [9]. For threshold work in sleeping patients these are modulated at high rates; when several carriers are presented together, modulation rates are typically chosen in the region of 82–106 Hz, with each carrier given its own distinct rate so the responses can be separated in the frequency domain [16].

Amplitude, frequency, and mixed modulation

Three forms of modulation are used. Amplitude modulation (AM) varies the carrier’s level over time and is the most common choice because of its good frequency specificity; the extent of modulation is given as a percentage, and high modulation depths of roughly 90–100% are typical [9]. Frequency modulation (FM) varies the carrier’s frequency instead: an FM depth of 20% means the frequency swings by ±10% around the carrier, so a 1000 Hz carrier sweeps between 900 and 1100 Hz. Mixed modulation (MM) combines AM and FM. Carriers modulated with FM alone produce a response roughly half the amplitude of the response to AM alone, while combining the two raises amplitude beyond that of AM alone, which is why MM is often preferred [13].
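The three modulation types can be sketched as a single waveform generator. This is a minimal illustration using only the Python standard library, not the stimulus code of any particular ASSR system: the function name, the sampling rate, the duration, and the relative phase between the AM and FM components are all assumptions made for the example. The FM depth follows the convention above, so a depth of 0.2 means a swing of ±10% around the carrier.

```python
import math

def modulated_tone(fc, fm, am_depth=1.0, fm_depth=0.0, fs=32000, duration=0.1):
    """Return samples of a sinusoidally modulated tone (hypothetical helper).

    fc       carrier frequency in Hz
    fm       modulation frequency in Hz
    am_depth AM depth as a fraction (1.0 = 100%)
    fm_depth total FM swing as a fraction of fc (0.2 = +/-10%)
    """
    delta_f = fm_depth * fc / 2.0   # peak frequency deviation in Hz
    beta = delta_f / fm             # FM modulation index
    n = int(round(fs * duration))
    samples = []
    for i in range(n):
        t = i / fs
        envelope = 1.0 + am_depth * math.sin(2 * math.pi * fm * t)
        # Instantaneous frequency is fc + delta_f * sin(2*pi*fm*t)
        phase = 2 * math.pi * fc * t - beta * math.cos(2 * math.pi * fm * t)
        # Divide by the peak envelope so the output stays within +/-1
        samples.append(envelope * math.sin(phase) / (1.0 + am_depth))
    return samples

am = modulated_tone(1000, 90)                              # AM only
fm_only = modulated_tone(1000, 90, am_depth=0.0, fm_depth=0.2)
mm = modulated_tone(1000, 90, am_depth=1.0, fm_depth=0.2)  # mixed modulation
```

Setting `fm_depth=0.0` gives pure AM and `am_depth=0.0` gives pure FM, so the same function covers all three cases.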

There is no free lunch. Modulating a pure tone spreads energy on either side of the carrier (spectral splatter), so a larger, more easily detected response is bought at some cost to frequency specificity, the trade-off introduced in Module 2. Exponential amplitude modulation, which uses steeper envelope slopes and longer silent intervals between peaks, has been proposed as one way to raise response amplitude through better neural synchrony [13].
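For plain sinusoidal AM the spread can be written out exactly. The symbols here are introduced only for illustration: f_c is the carrier frequency, f_m the modulation frequency, and m the modulation depth as a fraction. The product-to-sum identity then gives:

```latex
\bigl[1 + m \sin(2\pi f_m t)\bigr]\,\sin(2\pi f_c t)
  = \sin(2\pi f_c t)
  + \frac{m}{2}\cos\bigl(2\pi (f_c - f_m)\,t\bigr)
  - \frac{m}{2}\cos\bigl(2\pi (f_c + f_m)\,t\bigr)
```

So a 1000 Hz carrier modulated at 90 Hz carries energy at 910, 1000, and 1090 Hz only. Envelopes that are not sinusoidal, and FM, add further sidebands at multiples of f_m from the carrier, which is where the extra cost in frequency specificity comes from.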

Single- and multiple-frequency paradigms

Carriers can be tested one at a time, or several can be modulated at slightly different rates and presented together, even to both ears at once. The multiple-frequency approach is what makes ASSR efficient: a single recording can yield up to eight thresholds (four carriers × two ears). Because the carriers overlap in the cochlea, simultaneously presented tones interact slightly, and the combined four-carrier stimulus is about 5 dB more intense overall than any one of its carriers presented alone [16]. Newer systems use band-limited chirp stimuli, designed to compensate for cochlear travelling-wave delay, which can roughly halve recording time [16].
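The bookkeeping behind "four carriers × two ears" can be sketched as a small lookup table. This is only an illustration: the even spacing across 82–106 Hz, the function name, and the 16-second analysis window are assumptions made for the example, and real systems choose their own rates. The point is simply that every ear and carrier pair needs its own rate, and that neighbouring rates must sit further apart than the frequency resolution of the analysis.

```python
def assign_modulation_rates(carriers_hz, ears=("left", "right"), lo=82.0, hi=106.0):
    """Give every (ear, carrier) pair its own modulation rate in [lo, hi] Hz."""
    pairs = [(ear, fc) for ear in ears for fc in carriers_hz]
    step = (hi - lo) / (len(pairs) - 1)   # even spacing: an assumption
    return {pair: lo + i * step for i, pair in enumerate(pairs)}

rates = assign_modulation_rates([500, 1000, 2000, 4000])

# Frequency resolution of a hypothetical 16 s analysis window
resolution_hz = 1.0 / 16.0

ordered = sorted(rates.values())
min_spacing_hz = min(b - a for a, b in zip(ordered, ordered[1:]))

for (ear, fc), rate in rates.items():
    print(f"{ear:5s} {fc:4d} Hz carrier -> {rate:6.2f} Hz modulation")
```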

Electrodes and recording conditions

The electrode montage closely follows that used for the ABR. Recording electrodes sit at or near the vertex (non-inverting) and on the ipsilateral earlobe or mastoid (inverting), with a ground on the low forehead; a system recording from both ears at once uses a two-channel preamplifier, and a single-channel system recording a binaural presentation may use a common reference at the nape of the neck [16]. As in any evoked-potential recording, interelectrode impedances are checked before testing begins to confirm that they are low [9].

Insert earphones are the usual transducer; they allow high presentation levels — an advantage for severe-to-profound losses — but there are two cautions. Very loud stimulation can itself damage hearing, and at high levels a vestibular response can be evoked that, because ASSR has no time-domain waveform to inspect, may be hard to distinguish from a true auditory response [16]. The recording also depends on a quiet patient: ongoing EEG and muscle artefact are the noise against which the response must be detected, so epochs whose amplitude exceeds a rejection criterion are discarded and re-recorded [16].
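The rejection step described above can be sketched in a few lines. The 40 µV limit and the function name are hypothetical, chosen only to make the example concrete; an actual system sets its own criterion.

```python
def reject_artifacts(epochs, limit_uv=40.0):
    """Keep epochs whose peak absolute amplitude stays within limit_uv.

    epochs   list of epochs, each a list of sample values in microvolts
    limit_uv rejection criterion in microvolts (hypothetical default)
    Returns the kept epochs and the number rejected.
    """
    kept = [epoch for epoch in epochs if max(abs(x) for x in epoch) <= limit_uv]
    return kept, len(epochs) - len(kept)

epochs = [
    [5.0, -12.0, 8.0],    # quiet EEG: kept
    [3.0, 95.0, -4.0],    # muscle artefact: rejected
    [-20.0, 10.0, 0.5],   # kept
]
kept, n_rejected = reject_artifacts(epochs)
```

The count of rejected epochs matters because each one has to be recorded again, which is why a restless patient lengthens the test.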

Statistical detection

This is the feature that sets ASSR apart from the ABR. Because the response is periodic, its presence can be tested objectively in the frequency domain rather than judged by eye. After a Fourier transform of the averaged EEG, a statistic asks whether the energy at the modulation frequency stands out from the surrounding background noise [9].

The most widely used statistic is the F-test. The frequency bin at the modulation frequency is compared with the noise in a set of neighbouring bins (for example, 60 bins above and 60 below), producing a p-value tested against a chosen significance level, commonly p < 0.05 [14]. An alternative family of methods, the phase-coherence and magnitude-squared coherence statistics, instead asks whether the phase of the response is consistent across recording segments. Head-to-head comparisons have found these approaches detect responses with broadly similar efficiency [14]. Newer algorithms extend the test across the first several harmonics of the modulation frequency rather than the first harmonic alone [9].
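The F-test can be sketched with the standard library alone. This is an illustration under stated assumptions, not the algorithm of any particular system: the function names are hypothetical, the synthetic "averaged EEG" is a sine in Gaussian noise, and the background is assumed Gaussian so that the ratio follows an F distribution with 2 and 2N degrees of freedom, where N is the number of noise bins. With 2 numerator degrees of freedom the p-value has a simple closed form, which avoids the need for a statistics library.

```python
import cmath
import math
import random

def bin_power(signal, k):
    """Power in DFT bin k, computed directly from the definition."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
    return abs(acc) ** 2

def p_from_f(f_ratio, dfd):
    """Upper-tail probability of an F(2, dfd) variable (closed form)."""
    return (1.0 + 2.0 * f_ratio / dfd) ** (-dfd / 2.0)

def f_test(signal, signal_bin, n_side=60):
    """Compare power at signal_bin with the mean power of n_side bins each side."""
    noise_bins = [signal_bin + d for d in range(-n_side, n_side + 1) if d != 0]
    noise = sum(bin_power(signal, k) for k in noise_bins) / len(noise_bins)
    f_ratio = bin_power(signal, signal_bin) / noise
    return f_ratio, p_from_f(f_ratio, 2 * len(noise_bins))

# Synthetic averaged sweep: a response in bin 90 buried in unit-variance noise
random.seed(0)
n, k = 1024, 90
averaged = [math.sin(2 * math.pi * k * i / n) + random.gauss(0.0, 1.0) for i in range(n)]

f_ratio, p_value = f_test(averaged, k)
print(f"F = {f_ratio:.1f}, p = {p_value:.3g}")
```

The signal bin must sit far enough from both zero and the Nyquist bin for all 120 neighbours to exist, which the high modulation rates used for threshold work make easy to arrange.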

Reading the criterion correctly

A statistical criterion is not a guarantee. With an F-test at p < 0.05, noise alone will be misread as a response in about 5% of tests in which no response is actually present, a false-positive rate that follows directly from the chosen significance level [14]. Each additional look at the accumulating data is a further opportunity for such an error. This is why a single significant result is treated with caution and why stopping rules require significance to be sustained.
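A short calculation shows how quickly chance findings accumulate when a test is applied more than once. It assumes the tests are independent, which is a simplification made for the example: successive looks at a growing average are correlated, so the inflation in practice is typically smaller. The function name is hypothetical.

```python
def chance_of_false_positive(alpha, n_tests):
    """Probability of at least one false positive in n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

single = chance_of_false_positive(0.05, 1)
eight = chance_of_false_positive(0.05, 8)   # e.g. four carriers x two ears
print(f"1 test: {single:.3f}   8 tests: {eight:.3f}")
```

Under this assumption, eight tests at p < 0.05 carry roughly a one-in-three chance of at least one spurious "response", which is the arithmetic behind treating an isolated significant result with caution.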

Stopping criteria

Detection statistics become more powerful as more data are averaged, so a recording needs a principled rule for when to stop. A response can be declared present once it reaches significance and that significance holds across a number of consecutive sweeps; it can be declared absent only once the noise floor has fallen low enough that a real response would have been seen had one been there; and a maximum recording time caps the test regardless [15]. Well-designed a priori stopping rules let a full multiple-frequency threshold search in an adult be completed in about an hour [15].
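The three-way rule above can be sketched as a small decision function. Every number here is a hypothetical placeholder (the count of consecutive sweeps, the noise criterion in nanovolts, the time cap); the source does not specify values, and a clinic would fix its own before testing begins.

```python
def stopping_decision(sweeps, alpha=0.05, n_consecutive=8,
                      noise_limit_nv=10.0, max_time_s=600.0):
    """Apply an a priori stopping rule to a sequence of sweeps.

    sweeps: list of (p_value, noise_nv, elapsed_s) tuples, one per sweep.
    Returns (decision, sweep_number). All default thresholds are hypothetical.
    """
    run = 0  # length of the current run of significant sweeps
    for index, (p_value, noise_nv, elapsed_s) in enumerate(sweeps, start=1):
        run = run + 1 if p_value < alpha else 0
        if run >= n_consecutive:
            return "present", index        # significance was sustained
        if run == 0 and noise_nv <= noise_limit_nv:
            return "absent", index         # quiet enough to have seen a response
        if elapsed_s >= max_time_s:
            return "inconclusive", index   # time cap reached
    return "continue", len(sweeps)

sustained = [(0.20, 30, 10), (0.04, 25, 20), (0.03, 22, 30), (0.01, 20, 40)]
quiet = [(0.50, 30, 10), (0.60, 9, 20)]
timed_out = [(0.50, 30, 700)]

print(stopping_decision(sustained, n_consecutive=3))
print(stopping_decision(quiet))
print(stopping_decision(timed_out))
```

Note that "absent" is only returned while no run of significant sweeps is in progress, so a response that is just emerging is not dismissed the moment the noise floor drops.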

The Normal Response module takes the next step: reading the lowest intensity with a present response as the ASSR threshold, and converting that to an estimated behavioural audiogram.