fbpx
Wikipedia

Modified discrete cosine transform

The modified discrete cosine transform (MDCT) is a transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped: it is designed to be performed on consecutive blocks of a larger dataset, where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block. This overlapping, in addition to the energy-compaction qualities of the DCT, makes the MDCT especially attractive for signal compression applications, since it helps to avoid artifacts stemming from the block boundaries. As a result of these advantages, the MDCT is the most widely used lossy compression technique in audio data compression. It is employed in most modern audio coding standards, including MP3, Dolby Digital (AC-3), Vorbis (Ogg), Windows Media Audio (WMA), ATRAC, Cook, Advanced Audio Coding (AAC),[1] High-Definition Coding (HDC),[2] LDAC, Dolby AC-4,[3] and MPEG-H 3D Audio,[4] as well as speech coding standards such as AAC-LD (LD-MDCT),[5] G.722.1,[6] G.729.1,[7] CELT,[8] and Opus.[9][10]

The discrete cosine transform (DCT) was first proposed by Nasir Ahmed in 1972,[11] and demonstrated by Ahmed with T. Natarajan and K. R. Rao in 1974.[12] The MDCT was later proposed by John P. Princen, A.W. Johnson and Alan B. Bradley at the University of Surrey in 1987,[13] following earlier work by Princen and Bradley (1986)[14] to develop the MDCT's underlying principle of time-domain aliasing cancellation (TDAC), described below. (There also exists an analogous transform, the MDST, based on the discrete sine transform, as well as other, rarely used, forms of the MDCT based on different types of DCT or DCT/DST combinations.)

In MP3, the MDCT is not applied to the audio signal directly, but rather to the output of a 32-band polyphase quadrature filter (PQF) bank. The output of this MDCT is postprocessed by an alias reduction formula to reduce the typical aliasing of the PQF filter bank. Such a combination of a filter bank with an MDCT is called a hybrid filter bank or a subband MDCT. AAC, on the other hand, normally uses a pure MDCT; only the (rarely used) MPEG-4 AAC-SSR variant (by Sony) uses a four-band PQF bank followed by an MDCT. Similar to MP3, ATRAC uses stacked quadrature mirror filters (QMF) followed by an MDCT.

Definition

As a lapped transform, the MDCT is a bit unusual compared to other Fourier-related transforms in that it has half as many outputs as inputs (instead of the same number). In particular, it is a linear function   (where R denotes the set of real numbers). The 2N real numbers x0, ..., x2N-1 are transformed into the N real numbers X0, ..., XN-1 according to the formula:

 

(The normalization coefficient in front of this transform, here unity, is an arbitrary convention and differs between treatments. Only the product of the normalizations of the MDCT and the IMDCT, below, is constrained.)

Inverse transform

The inverse MDCT is known as the IMDCT. Because there are different numbers of inputs and outputs, at first glance it might seem that the MDCT should not be invertible. However, perfect invertibility is achieved by adding the overlapped IMDCTs of subsequent overlapping blocks, causing the errors to cancel and the original data to be retrieved; this technique is known as time-domain aliasing cancellation (TDAC).

The IMDCT transforms N real numbers X0, ..., XN-1 into 2N real numbers y0, ..., y2N-1 according to the formula:

 

(Like for the DCT-IV, an orthogonal transform, the inverse has the same form as the forward transform.)

In the case of a windowed MDCT with the usual window normalization (see below), the normalization coefficient in front of the IMDCT should be multiplied by 2 (i.e., becoming 2/N).

Computation

Although the direct application of the MDCT formula would require O(N2) operations, it is possible to compute the same thing with only O(N log N) complexity by recursively factorizing the computation, as in the fast Fourier transform (FFT). One can also compute MDCTs via other transforms, typically a DFT (FFT) or a DCT, combined with O(N) pre- and post-processing steps. Also, as described below, any algorithm for the DCT-IV immediately provides a method to compute the MDCT and IMDCT of even size.

Window functions

 
MDCT window functions:
blue: Cosine, red: Sine-Cosine, green: modified Kaiser-Bessel

In typical signal-compression applications, the transform properties are further improved by using a window function wn (n = 0, ..., 2N−1) that is multiplied with xn and yn in the MDCT and IMDCT formulas, above, in order to avoid discontinuities at the n = 0 and 2N boundaries by making the function go smoothly to zero at those points. (That is, we window the data before the MDCT and after the IMDCT.) In principle, x and y could have different window functions, and the window function could also change from one block to the next (especially for the case where data blocks of different sizes are combined), but for simplicity we consider the common case of identical window functions for equal-sized blocks.

The transform remains invertible (that is, TDAC works), for a symmetric window wn = w2N−1−n, as long as w satisfies the Princen-Bradley condition:

 .

Various window functions are used. A window that produces a form known as a modulated lapped transform (MLT)[15][16] is given by

 

and is used for MP3 and MPEG-2 AAC, and

 

for Vorbis. AC-3 uses a Kaiser-Bessel derived (KBD) window, and MPEG-4 AAC can also use a KBD window.

Note that windows applied to the MDCT are different from windows used for some other types of signal analysis, since they must fulfill the Princen-Bradley condition. One of the reasons for this difference is that MDCT windows are applied twice, for both the MDCT (analysis) and the IMDCT (synthesis).

Relationship to DCT-IV and Origin of TDAC

As can be seen by inspection of the definitions, for even N the MDCT is essentially equivalent to a DCT-IV, where the input is shifted by N/2 and two N-blocks of data are transformed at once. By examining this equivalence more carefully, important properties like TDAC can be easily derived.

In order to define the precise relationship to the DCT-IV, one must realize that the DCT-IV corresponds to alternating even/odd boundary conditions: even at its left boundary (around n = −1/2), odd at its right boundary (around n = N − 1/2), and so on (instead of periodic boundaries as for a DFT). This follows from the identities   and  . Thus, if its inputs are an array x of length N, we can imagine extending this array to (x, −xR, −x, xR, ...) and so on, where xR denotes x in reverse order.

Consider an MDCT with 2N inputs and N outputs, where we divide the inputs into four blocks (a, b, c, d) each of size N/2. If we shift these to the right by N/2 (from the +N/2 term in the MDCT definition), then (b, c, d) extend past the end of the N DCT-IV inputs, so we must "fold" them back according to the boundary conditions described above.

Thus, the MDCT of 2N inputs (a, b, c, d) is exactly equivalent to a DCT-IV of the N inputs: (−cRd, abR), where R denotes reversal as above.

(In this way, any algorithm to compute the DCT-IV can be trivially applied to the MDCT.)

Similarly, the IMDCT formula above is precisely 1/2 of the DCT-IV (which is its own inverse), where the output is extended (via the boundary conditions) to a length 2N and shifted back to the left by N/2. The inverse DCT-IV would simply give back the inputs (−cRd, abR) from above. When this is extended via the boundary conditions and shifted, one obtains:

IMDCT (MDCT (a, b, c, d)) = (abR, baR, c+dR, d+cR) / 2.

Half of the IMDCT outputs are thus redundant, as baR = −(abR)R, and likewise for the last two terms. If we group the input into bigger blocks A,B of size N, where A = (a, b) and B = (c, d), we can write this result in a simpler way:

IMDCT (MDCT (A, B)) = (AAR, B+BR) / 2

One can now understand how TDAC works. Suppose that one computes the MDCT of the subsequent, 50% overlapped, 2N block (B, C). The IMDCT will then yield, analogous to the above: (BBR, C+CR) / 2. When this is added with the previous IMDCT result in the overlapping half, the reversed terms cancel and one obtains simply B, recovering the original data.

Origin of TDAC

The origin of the term "time-domain aliasing cancellation" is now clear. The use of input data that extend beyond the boundaries of the logical DCT-IV causes the data to be aliased in the same way that frequencies beyond the Nyquist frequency are aliased to lower frequencies, except that this aliasing occurs in the time domain instead of the frequency domain: we cannot distinguish the contributions of a and of bR to the MDCT of (a, b, c, d), or equivalently, to the result of

IMDCT (MDCT (a, b, c, d)) = (abR, baR, c+dR, d+cR) / 2.

The combinations cdR and so on, have precisely the right signs for the combinations to cancel when they are added.

For odd N (which are rarely used in practice), N/2 is not an integer so the MDCT is not simply a shift permutation of a DCT-IV. In this case, the additional shift by half a sample means that the MDCT/IMDCT becomes equivalent to the DCT-III/II, and the analysis is analogous to the above.

Smoothness and discontinuities

We have seen above that the MDCT of 2N inputs (a, b, c, d) is equivalent to a DCT-IV of the N inputs (−cRd, abR). The DCT-IV is designed for the case where the function at the right boundary is odd, and therefore the values near the right boundary are close to 0. If the input signal is smooth, this is the case: the rightmost components of a and bR are consecutive in the input sequence (a, b, c, d), and therefore their difference is small. Let us look at the middle of the interval: if we rewrite the above expression as (−cRd, abR) = (−d, a)−(b,c)R, the second term, (b,c)R, gives a smooth transition in the middle. However, in the first term, (−d, a), there is a potential discontinuity where the right end of −d meets the left end of a. This is the reason for using a window function that reduces the components near the boundaries of the input sequence (a, b, c, d) towards 0.

TDAC for the windowed MDCT

Above, the TDAC property was proved for the ordinary MDCT, showing that adding IMDCTs of subsequent blocks in their overlapping half recovers the original data. The derivation of this inverse property for the windowed MDCT is only slightly more complicated.

Consider to overlapping consecutive sets of 2N inputs (A,B) and (B,C), for blocks A,B,C of size N. Recall from above that when   and   are MDCTed, IMDCTed, and added in their overlapping half, we obtain  , the original data.

Now we suppose that we multiply both the MDCT inputs and the IMDCT outputs by a window function of length 2N. As above, we assume a symmetric window function, which is therefore of the form   where W is a length-N vector and R denotes reversal as before. Then the Princen-Bradley condition can be written as  , with the squares and additions performed elementwise.

Therefore, instead of MDCTing  , we now MDCT   (with all multiplications performed elementwise). When this is IMDCTed and multiplied again (elementwise) by the window function, the last-N half becomes:

 .

(Note that we no longer have the multiplication by 1/2, because the IMDCT normalization differs by a factor of 2 in the windowed case.)

Similarly, the windowed MDCT and IMDCT of   yields, in its first-N half:

 .

When we add these two halves together, we obtain:

 

recovering the original data.

See also

References

  1. ^ Luo, Fa-Long (2008). Mobile Multimedia Broadcasting Standards: Technology and Practice. Springer Science & Business Media. p. 590. ISBN 9780387782638.
  2. ^ Jones, Graham A.; Layer, David H.; Osenkowsky, Thomas G. (2013). National Association of Broadcasters Engineering Handbook: NAB Engineering Handbook. Taylor & Francis. pp. 558–9. ISBN 978-1-136-03410-7.
  3. ^ "Dolby AC-4: Audio Delivery for Next-Generation Entertainment Services" (PDF). Dolby Laboratories. June 2015. Retrieved 11 November 2019.
  4. ^ Bleidt, R. L.; Sen, D.; Niedermeier, A.; Czelhan, B.; Füg, S.; et al. (2017). "Development of the MPEG-H TV Audio System for ATSC 3.0" (PDF). IEEE Transactions on Broadcasting. 63 (1): 202–236. doi:10.1109/TBC.2017.2661258. S2CID 30821673.
  5. ^ Schnell, Markus; Schmidt, Markus; Jander, Manuel; Albert, Tobias; Geiger, Ralf; Ruoppila, Vesa; Ekstrand, Per; Bernhard, Grill (October 2008). MPEG-4 Enhanced Low Delay AAC - A New Standard for High Quality Communication (PDF). 125th AES Convention. Fraunhofer IIS. Audio Engineering Society. Retrieved 20 October 2019.
  6. ^ Lutzky, Manfred; Schuller, Gerald; Gayer, Marc; Krämer, Ulrich; Wabnik, Stefan (May 2004). A guideline to audio codec delay (PDF). 116th AES Convention. Fraunhofer IIS. Audio Engineering Society. Retrieved 24 October 2019.
  7. ^ Nagireddi, Sivannarayana (2008). VoIP Voice and Fax Signal Processing. John Wiley & Sons. p. 69. ISBN 9780470377864.
  8. ^ Presentation of the CELT codec by Timothy B. Terriberry (65 minutes of video, see also presentation slides in PDF)
  9. ^ "Opus Codec". Opus (Home page). Xiph.org Foundation. Retrieved July 31, 2012.
  10. ^ Bright, Peter (2012-09-12). "Newly standardized Opus audio codec fills every role from online chat to music". Ars Technica. Retrieved 2014-05-28.
  11. ^ Ahmed, Nasir (January 1991). "How I Came Up With the Discrete Cosine Transform". Digital Signal Processing. 1 (1): 4–5. doi:10.1016/1051-2004(91)90086-Z.
  12. ^ Ahmed, Nasir; Natarajan, T.; Rao, K. R. (January 1974), "Discrete Cosine Transform", IEEE Transactions on Computers, C-23 (1): 90–93, doi:10.1109/T-C.1974.223784, S2CID 149806273
  13. ^ Princen, John P.; Johnson, A.W.; Bradley, Alan B. (1987). "Subband/Transform coding using filter bank designs based on time domain aliasing cancellation". ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing. 12: 2161–2164. doi:10.1109/ICASSP.1987.1169405. S2CID 58446992.
  14. ^ John P. Princen, Alan B. Bradley: Analysis/synthesis filter bank design based on time domain aliasing cancellation, IEEE Trans. Acoust. Speech Signal Processing, ASSP-34 (5), 1153–1161, 1986. Described a precursor to the MDCT using a combination of discrete cosine and sine transforms.
  15. ^ H. S. Malvar, "Lapped Transforms for Efficient Transform/Subband Coding", IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 38, no. 6, pp. 969–978 (Equation 22), June 1990.
  16. ^ H. S. Malvar, "Modulated QMF Filter Banks with Perfect Reconstruction", Electronics Letters, vol. 26, no. 13, pp. 906–907 (Equation 13), June 1990.

Bibliography

  • Henrique S. Malvar, Signal Processing with Lapped Transforms (Artech House: Norwood MA, 1992).
  • A. W. Johnson and A. B. Bradley, "Adaptive transform coding incorporating time domain aliasing cancellation," Speech Comm. 6, 299-308 (1987).
  • For algorithms, see examples:
    • Chi-Min Liu and Wen-Chieh Lee, "A unified fast algorithm for cosine modulated filterbanks in current audio standards[permanent dead link]", J. Audio Engineering 47 (12), 1061-1075 (1999).
    • V. Britanak and K. R. Rao, "A new fast algorithm for the unified forward and inverse MDCT/MDST computation," Signal Processing 82, 433-459 (2002)
    • Vladimir Nikolajevic and Gerhard Fettweis, "Computation of forward and inverse MDCT using Clenshaw's recurrence formula," IEEE Trans. Sig. Proc. 51 (5), 1439-1444 (2003)
    • Che-Hong Chen, Bin-Da Liu, and Jar-Ferr Yang, "Recursive architectures for realizing modified discrete cosine transform and its inverse," IEEE Trans. Circuits Syst. II: Analog Dig. Sig. Proc. 50 (1), 38-45 (2003)
    • J.S. Wu, H.Z. Shu, L. Senhadji, and L.M. Luo, "Mixed-radix algorithm for the computation of forward and inverse MDCTs," IEEE Trans. Circuits Syst. I: Reg. Papers 56 (4), 784-794 (2009)
    • V. Britanak, "A survey of efficient MDCT implementations in MP3 audio coding standard: retrospective and state-of-the-art," Signal. Process. 91 (4), 624-672(2011)

modified, discrete, cosine, transform, mdct, redirects, here, form, medical, imaging, multidetector, computed, tomography, modified, discrete, cosine, transform, mdct, transform, based, type, discrete, cosine, transform, with, additional, property, being, lapp. MDCT redirects here For the form of medical imaging see Multidetector computed tomography The modified discrete cosine transform MDCT is a transform based on the type IV discrete cosine transform DCT IV with the additional property of being lapped it is designed to be performed on consecutive blocks of a larger dataset where subsequent blocks are overlapped so that the last half of one block coincides with the first half of the next block This overlapping in addition to the energy compaction qualities of the DCT makes the MDCT especially attractive for signal compression applications since it helps to avoid artifacts stemming from the block boundaries As a result of these advantages the MDCT is the most widely used lossy compression technique in audio data compression It is employed in most modern audio coding standards including MP3 Dolby Digital AC 3 Vorbis Ogg Windows Media Audio WMA ATRAC Cook Advanced Audio Coding AAC 1 High Definition Coding HDC 2 LDAC Dolby AC 4 3 and MPEG H 3D Audio 4 as well as speech coding standards such as AAC LD LD MDCT 5 G 722 1 6 G 729 1 7 CELT 8 and Opus 9 10 The discrete cosine transform DCT was first proposed by Nasir Ahmed in 1972 11 and demonstrated by Ahmed with T Natarajan and K R Rao in 1974 12 The MDCT was later proposed by John P Princen A W Johnson and Alan B Bradley at the University of Surrey in 1987 13 following earlier work by Princen and Bradley 1986 14 to develop the MDCT s underlying principle of time domain aliasing cancellation TDAC described below There also exists an analogous transform the MDST based on the discrete sine transform as well as other rarely used forms of the MDCT based on different types of DCT or DCT DST combinations In MP3 the MDCT is not applied to the audio signal directly but rather to the output of a 32 band polyphase quadrature filter PQF bank The output of this MDCT is postprocessed by an alias reduction formula to reduce the typical aliasing of the PQF filter bank Such a combination of a filter bank with an MDCT is called a hybrid filter bank or a subband MDCT AAC on the other hand normally uses a pure MDCT only the rarely used MPEG 4 AAC SSR variant by Sony uses a four band PQF bank followed by an MDCT Similar to MP3 ATRAC uses stacked quadrature mirror filters QMF followed by an MDCT Contents 1 Definition 1 1 Inverse transform 1 2 Computation 2 Window functions 3 Relationship to DCT IV and Origin of TDAC 3 1 Origin of TDAC 3 2 Smoothness and discontinuities 3 3 TDAC for the windowed MDCT 4 See also 5 References 6 BibliographyDefinition EditAs a lapped transform the MDCT is a bit unusual compared to other Fourier related transforms in that it has half as many outputs as inputs instead of the same number In particular it is a linear function F R 2 N R N displaystyle F colon mathbf R 2N to mathbf R N where R denotes the set of real numbers The 2N real numbers x0 x2N 1 are transformed into the N real numbers X0 XN 1 according to the formula X k n 0 2 N 1 x n cos p N n 1 2 N 2 k 1 2 displaystyle X k sum n 0 2N 1 x n cos left frac pi N left n frac 1 2 frac N 2 right left k frac 1 2 right right The normalization coefficient in front of this transform here unity is an arbitrary convention and differs between treatments Only the product of the normalizations of the MDCT and the IMDCT below is constrained Inverse transform Edit The inverse MDCT is known as the IMDCT Because there are different numbers of inputs and outputs at first glance it might seem that the MDCT should not be invertible However perfect invertibility is achieved by adding the overlapped IMDCTs of subsequent overlapping blocks causing the errors to cancel and the original data to be retrieved this technique is known as time domain aliasing cancellation TDAC The IMDCT transforms N real numbers X0 XN 1 into 2N real numbers y0 y2N 1 according to the formula y n 1 N k 0 N 1 X k cos p N n 1 2 N 2 k 1 2 displaystyle y n frac 1 N sum k 0 N 1 X k cos left frac pi N left n frac 1 2 frac N 2 right left k frac 1 2 right right Like for the DCT IV an orthogonal transform the inverse has the same form as the forward transform In the case of a windowed MDCT with the usual window normalization see below the normalization coefficient in front of the IMDCT should be multiplied by 2 i e becoming 2 N Computation Edit Although the direct application of the MDCT formula would require O N2 operations it is possible to compute the same thing with only O N log N complexity by recursively factorizing the computation as in the fast Fourier transform FFT One can also compute MDCTs via other transforms typically a DFT FFT or a DCT combined with O N pre and post processing steps Also as described below any algorithm for the DCT IV immediately provides a method to compute the MDCT and IMDCT of even size Window functions Edit MDCT window functions blue Cosine red Sine Cosine green modified Kaiser Bessel In typical signal compression applications the transform properties are further improved by using a window function wn n 0 2N 1 that is multiplied with xn and yn in the MDCT and IMDCT formulas above in order to avoid discontinuities at the n 0 and 2N boundaries by making the function go smoothly to zero at those points That is we window the data before the MDCT and after the IMDCT In principle x and y could have different window functions and the window function could also change from one block to the next especially for the case where data blocks of different sizes are combined but for simplicity we consider the common case of identical window functions for equal sized blocks The transform remains invertible that is TDAC works for a symmetric window wn w2N 1 n as long as w satisfies the Princen Bradley condition w n 2 w n N 2 1 displaystyle w n 2 w n N 2 1 Various window functions are used A window that produces a form known as a modulated lapped transform MLT 15 16 is given by w n sin p 2 N n 1 2 displaystyle w n sin left frac pi 2N left n frac 1 2 right right and is used for MP3 and MPEG 2 AAC and w n sin p 2 sin 2 p 2 N n 1 2 displaystyle w n sin left frac pi 2 sin 2 left frac pi 2N left n frac 1 2 right right right for Vorbis AC 3 uses a Kaiser Bessel derived KBD window and MPEG 4 AAC can also use a KBD window Note that windows applied to the MDCT are different from windows used for some other types of signal analysis since they must fulfill the Princen Bradley condition One of the reasons for this difference is that MDCT windows are applied twice for both the MDCT analysis and the IMDCT synthesis Relationship to DCT IV and Origin of TDAC EditAs can be seen by inspection of the definitions for even N the MDCT is essentially equivalent to a DCT IV where the input is shifted by N 2 and two N blocks of data are transformed at once By examining this equivalence more carefully important properties like TDAC can be easily derived In order to define the precise relationship to the DCT IV one must realize that the DCT IV corresponds to alternating even odd boundary conditions even at its left boundary around n 1 2 odd at its right boundary around n N 1 2 and so on instead of periodic boundaries as for a DFT This follows from the identities cos p N n 1 1 2 k 1 2 cos p N n 1 2 k 1 2 displaystyle cos left frac pi N left n 1 frac 1 2 right left k frac 1 2 right right cos left frac pi N left n frac 1 2 right left k frac 1 2 right right and cos p N 2 N n 1 1 2 k 1 2 cos p N n 1 2 k 1 2 displaystyle cos left frac pi N left 2N n 1 frac 1 2 right left k frac 1 2 right right cos left frac pi N left n frac 1 2 right left k frac 1 2 right right Thus if its inputs are an array x of length N we can imagine extending this array to x xR x xR and so on where xR denotes x in reverse order Consider an MDCT with 2N inputs and N outputs where we divide the inputs into four blocks a b c d each of size N 2 If we shift these to the right by N 2 from the N 2 term in the MDCT definition then b c d extend past the end of the N DCT IV inputs so we must fold them back according to the boundary conditions described above Thus the MDCT of 2N inputs a b c d is exactly equivalent to a DCT IV of the N inputs cR d a bR where R denotes reversal as above In this way any algorithm to compute the DCT IV can be trivially applied to the MDCT Similarly the IMDCT formula above is precisely 1 2 of the DCT IV which is its own inverse where the output is extended via the boundary conditions to a length 2N and shifted back to the left by N 2 The inverse DCT IV would simply give back the inputs cR d a bR from above When this is extended via the boundary conditions and shifted one obtains IMDCT MDCT a b c d a bR b aR c dR d cR 2 Half of the IMDCT outputs are thus redundant as b aR a bR R and likewise for the last two terms If we group the input into bigger blocks A B of size N where A a b and B c d we can write this result in a simpler way IMDCT MDCT A B A AR B BR 2One can now understand how TDAC works Suppose that one computes the MDCT of the subsequent 50 overlapped 2N block B C The IMDCT will then yield analogous to the above B BR C CR 2 When this is added with the previous IMDCT result in the overlapping half the reversed terms cancel and one obtains simply B recovering the original data Origin of TDAC Edit The origin of the term time domain aliasing cancellation is now clear The use of input data that extend beyond the boundaries of the logical DCT IV causes the data to be aliased in the same way that frequencies beyond the Nyquist frequency are aliased to lower frequencies except that this aliasing occurs in the time domain instead of the frequency domain we cannot distinguish the contributions of a and of bR to the MDCT of a b c d or equivalently to the result of IMDCT MDCT a b c d a bR b aR c dR d cR 2 The combinations c dR and so on have precisely the right signs for the combinations to cancel when they are added For odd N which are rarely used in practice N 2 is not an integer so the MDCT is not simply a shift permutation of a DCT IV In this case the additional shift by half a sample means that the MDCT IMDCT becomes equivalent to the DCT III II and the analysis is analogous to the above Smoothness and discontinuities Edit We have seen above that the MDCT of 2N inputs a b c d is equivalent to a DCT IV of the N inputs cR d a bR The DCT IV is designed for the case where the function at the right boundary is odd and therefore the values near the right boundary are close to 0 If the input signal is smooth this is the case the rightmost components of a and bR are consecutive in the input sequence a b c d and therefore their difference is small Let us look at the middle of the interval if we rewrite the above expression as cR d a bR d a b c R the second term b c R gives a smooth transition in the middle However in the first term d a there is a potential discontinuity where the right end of d meets the left end of a This is the reason for using a window function that reduces the components near the boundaries of the input sequence a b c d towards 0 TDAC for the windowed MDCT Edit Above the TDAC property was proved for the ordinary MDCT showing that adding IMDCTs of subsequent blocks in their overlapping half recovers the original data The derivation of this inverse property for the windowed MDCT is only slightly more complicated Consider to overlapping consecutive sets of 2N inputs A B and B C for blocks A B C of size N Recall from above that when A B displaystyle A B and B C displaystyle B C are MDCTed IMDCTed and added in their overlapping half we obtain B B R 2 B B R 2 B displaystyle B B R 2 B B R 2 B the original data Now we suppose that we multiply both the MDCT inputs and the IMDCT outputs by a window function of length 2N As above we assume a symmetric window function which is therefore of the form W W R displaystyle W W R where W is a length N vector and R denotes reversal as before Then the Princen Bradley condition can be written as W 2 W R 2 1 1 displaystyle W 2 W R 2 1 1 ldots with the squares and additions performed elementwise Therefore instead of MDCTing A B displaystyle A B we now MDCT W A W R B displaystyle WA W R B with all multiplications performed elementwise When this is IMDCTed and multiplied again elementwise by the window function the last N half becomes W R W R B W R B R W R W R B W B R W R 2 B W W R B R displaystyle W R cdot W R B W R B R W R cdot W R B WB R W R 2 B WW R B R Note that we no longer have the multiplication by 1 2 because the IMDCT normalization differs by a factor of 2 in the windowed case Similarly the windowed MDCT and IMDCT of B C displaystyle B C yields in its first N half W W B W R B R W 2 B W W R B R displaystyle W cdot WB W R B R W 2 B WW R B R When we add these two halves together we obtain W R 2 B W W R B R W 2 B W W R B R W R 2 W 2 B B displaystyle W R 2 B WW R B R W 2 B WW R B R left W R 2 W 2 right B B recovering the original data See also EditDiscrete cosine transform Other overlapping windowed Fourier transforms include Modulated complex lapped transform Short time Fourier transform Welch s method Audio coding format Audio compression data References Edit Luo Fa Long 2008 Mobile Multimedia Broadcasting Standards Technology and Practice Springer Science amp Business Media p 590 ISBN 9780387782638 Jones Graham A Layer David H Osenkowsky Thomas G 2013 National Association of Broadcasters Engineering Handbook NAB Engineering Handbook Taylor amp Francis pp 558 9 ISBN 978 1 136 03410 7 Dolby AC 4 Audio Delivery for Next Generation Entertainment Services PDF Dolby Laboratories June 2015 Retrieved 11 November 2019 Bleidt R L Sen D Niedermeier A Czelhan B Fug S et al 2017 Development of the MPEG H TV Audio System for ATSC 3 0 PDF IEEE Transactions on Broadcasting 63 1 202 236 doi 10 1109 TBC 2017 2661258 S2CID 30821673 Schnell Markus Schmidt Markus Jander Manuel Albert Tobias Geiger Ralf Ruoppila Vesa Ekstrand Per Bernhard Grill October 2008 MPEG 4 Enhanced Low Delay AAC A New Standard for High Quality Communication PDF 125th AES Convention Fraunhofer IIS Audio Engineering Society Retrieved 20 October 2019 Lutzky Manfred Schuller Gerald Gayer Marc Kramer Ulrich Wabnik Stefan May 2004 A guideline to audio codec delay PDF 116th AES Convention Fraunhofer IIS Audio Engineering Society Retrieved 24 October 2019 Nagireddi Sivannarayana 2008 VoIP Voice and Fax Signal Processing John Wiley amp Sons p 69 ISBN 9780470377864 Presentation of the CELT codec by Timothy B Terriberry 65 minutes of video see also presentation slides in PDF Opus Codec Opus Home page Xiph org Foundation Retrieved July 31 2012 Bright Peter 2012 09 12 Newly standardized Opus audio codec fills every role from online chat to music Ars Technica Retrieved 2014 05 28 Ahmed Nasir January 1991 How I Came Up With the Discrete Cosine Transform Digital Signal Processing 1 1 4 5 doi 10 1016 1051 2004 91 90086 Z Ahmed Nasir Natarajan T Rao K R January 1974 Discrete Cosine Transform IEEE Transactions on Computers C 23 1 90 93 doi 10 1109 T C 1974 223784 S2CID 149806273 Princen John P Johnson A W Bradley Alan B 1987 Subband Transform coding using filter bank designs based on time domain aliasing cancellation ICASSP 87 IEEE International Conference on Acoustics Speech and Signal Processing 12 2161 2164 doi 10 1109 ICASSP 1987 1169405 S2CID 58446992 John P Princen Alan B Bradley Analysis synthesis filter bank design based on time domain aliasing cancellation IEEE Trans Acoust Speech Signal Processing ASSP 34 5 1153 1161 1986 Described a precursor to the MDCT using a combination of discrete cosine and sine transforms H S Malvar Lapped Transforms for Efficient Transform Subband Coding IEEE Trans on Acoustics Speech and Signal Processing vol 38 no 6 pp 969 978 Equation 22 June 1990 H S Malvar Modulated QMF Filter Banks with Perfect Reconstruction Electronics Letters vol 26 no 13 pp 906 907 Equation 13 June 1990 Bibliography EditHenrique S Malvar Signal Processing with Lapped Transforms Artech House Norwood MA 1992 A W Johnson and A B Bradley Adaptive transform coding incorporating time domain aliasing cancellation Speech Comm 6 299 308 1987 For algorithms see examples Chi Min Liu and Wen Chieh Lee A unified fast algorithm for cosine modulated filterbanks in current audio standards permanent dead link J Audio Engineering 47 12 1061 1075 1999 V Britanak and K R Rao A new fast algorithm for the unified forward and inverse MDCT MDST computation Signal Processing 82 433 459 2002 Vladimir Nikolajevic and Gerhard Fettweis Computation of forward and inverse MDCT using Clenshaw s recurrence formula IEEE Trans Sig Proc 51 5 1439 1444 2003 Che Hong Chen Bin Da Liu and Jar Ferr Yang Recursive architectures for realizing modified discrete cosine transform and its inverse IEEE Trans Circuits Syst II Analog Dig Sig Proc 50 1 38 45 2003 J S Wu H Z Shu L Senhadji and L M Luo Mixed radix algorithm for the computation of forward and inverse MDCTs IEEE Trans Circuits Syst I Reg Papers 56 4 784 794 2009 V Britanak A survey of efficient MDCT implementations in MP3 audio coding standard retrospective and state of the art Signal Process 91 4 624 672 2011 Retrieved from https en wikipedia org w index php title Modified discrete cosine transform amp oldid 1143109481, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.