g., a click) has frequency components that align at their peaks (phase 0). For sounds dominated by one of these feature types, adjacent modulation
bands thus have consistent relative phase in places where their amplitudes are high. We captured this relationship with a complex-valued correlation measure (Portilla and Simoncelli, 2000). We first define analytic extensions of the modulation bands: $a_{k,n}(t) \equiv \tilde{b}_{k,n}(t) + i\,H(\tilde{b}_{k,n}(t))$, where $H$ denotes the Hilbert transform and $i = \sqrt{-1}$. The analytic signal comprises the responses of the filter and its quadrature twin, and is thus readily instantiated biologically. The correlation has the standard form, except it is computed between analytic modulation bands tuned to modulation frequencies an octave apart, with the frequency of the lower band doubled. Frequency doubling is achieved by squaring the complex-valued analytic signal: $d_{k,n}(t) = \frac{a_{k,n}^2(t)}{\|a_{k,n}(t)\|}$, yielding $C2_{k,mn} = \frac{\sum_t w(t)\, d_{k,m}^*(t)\, a_{k,n}(t)}{\sigma_{k,m}\,\sigma_{k,n}}$, $k \in [1 \ldots 32]$, $m \in [1 \ldots 6]$, and $(n - m) = 1$, where $*$ and $\|\cdot\|$ denote the complex conjugate and modulus, respectively. Because the bands result from octave-spaced filters, the frequency doubling of the lower-frequency band causes them to oscillate at the same rate, producing a fixed phase difference between adjacent bands in regions of large amplitude. We use a factor of 2 rather than something smaller because the operation of exponentiating a complex number is uniquely defined only for integer powers. See Figure S6 for further explanation.
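The C2 measure can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `analytic_signal` and `c2_correlation` are hypothetical, the window $w(t)$ is assumed uniform and summing to 1, $\sigma$ is assumed to be the weighted RMS of each zero-mean real band, and the small `eps` guarding the division is an addition for numerical safety.

```python
import numpy as np

def analytic_signal(x):
    """Return x + i*H(x), computed by zeroing negative frequencies in the FFT."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep DC as is
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep Nyquist as is
        gain[1:n // 2] = 2.0           # double positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def c2_correlation(band_lo, band_hi, w=None, eps=1e-12):
    """Complex correlation between two real bands tuned an octave apart.

    band_lo, band_hi: real modulation bands (lower and higher frequency).
    w: window weights summing to 1 (assumed uniform if omitted).
    """
    if w is None:
        w = np.full(len(band_lo), 1.0 / len(band_lo))
    a_lo = analytic_signal(band_lo)
    a_hi = analytic_signal(band_hi)
    # Frequency doubling: squaring doubles the phase, and dividing by the
    # modulus restores the original amplitude.
    d_lo = a_lo ** 2 / (np.abs(a_lo) + eps)
    # Assumption: sigma is the weighted RMS of each (zero-mean) real band.
    sigma_lo = np.sqrt(np.sum(w * band_lo ** 2))
    sigma_hi = np.sqrt(np.sum(w * band_hi ** 2))
    return np.sum(w * np.conj(d_lo) * a_hi) / (sigma_lo * sigma_hi)

# Usage: two sinusoids an octave apart, with a quarter-cycle phase offset.
t = np.arange(1024) / 1024.0
lo = np.cos(2 * np.pi * 8 * t)
hi = np.cos(2 * np.pi * 16 * t + np.pi / 2)
c2 = c2_correlation(lo, hi)
print(c2.real, c2.imag, np.angle(c2))
```

For this synthetic pair, the phase of the result should sit near the imposed quarter-cycle offset, which shows how the real and imaginary parts of C2 together encode the relative phase of adjacent bands.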
$C2_{k,mn}$ is complex valued, and the real and imaginary parts must be independently measured and imposed. Example sounds with onsets, offsets, and impulses are shown in Figure 3D along with their C2 correlations. In total, there are 128 cochlear marginal statistics, 189 cochlear cross-correlations, 640 modulation band variances, 366 C1 correlations, and 192 C2 correlations, for a total of 1515 statistics.

Synthesis was driven by a set of statistics measured for a sound signal of interest using the auditory model described above. The synthetic signal was initialized with a sample of Gaussian white noise, and was modified with an iterative process until it shared the measured statistics. Each cycle of the iterative process, as illustrated in Figure 4A, consisted of the following steps: (1) The synthetic sound signal is decomposed into cochlear subbands.

We performed conjugate gradient descent using Carl Rasmussen’s “minimize” MATLAB function (available online). The objective function was the total squared error between the synthetic signal’s statistics and those of the original signal. The subband envelopes were modified one by one, beginning with the subband with the largest power and working outwards from that. Correlations between pairs of subband envelopes were imposed when the second subband envelope contributing to the correlation was being adjusted.
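The idea of starting from Gaussian white noise and descending on the squared error between statistics can be sketched on a toy problem. The sketch below is only illustrative and makes several assumptions: it uses plain gradient descent rather than the conjugate gradient routine named above, it operates on the raw signal rather than on subband envelopes, and it matches only two stand-in statistics (mean and mean square) instead of the model's 1515. All function names and parameter values are hypothetical.

```python
import numpy as np

def toy_statistics(x):
    """Stand-in statistics: the mean and the mean square of the signal."""
    return np.array([np.mean(x), np.mean(x ** 2)])

def toy_jacobian(x):
    """Derivative of each toy statistic with respect to each sample."""
    n = len(x)
    return np.vstack([np.full(n, 1.0 / n), 2.0 * x / n])

def synthesize(target_stats, n_samples=4096, n_iter=500, step=0.05, seed=0):
    """Adjust white noise until its toy statistics match target_stats."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)       # Gaussian white noise start
    for _ in range(n_iter):
        err = toy_statistics(x) - target_stats
        # Gradient of the total squared error sum(err**2) w.r.t. the samples.
        grad = 2.0 * err @ toy_jacobian(x)
        # The gradient scales as 1/n_samples, so rescale the step to match.
        x = x - step * n_samples * grad
    return x

# Usage: impose a mean of 0.5 and a mean square of 2.0.
target = np.array([0.5, 2.0])
x = synthesize(target)
final_error = float(np.sum((toy_statistics(x) - target) ** 2))
print(toy_statistics(x), final_error)
```

The step size here was chosen by hand for this two-statistic problem. A line-search method such as conjugate gradient removes the need for that tuning, which is one reason it suits a larger statistic set.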