
Reed–Solomon error correction

Reed–Solomon codes are a group of error-correcting codes that were introduced by Irving S. Reed and Gustave Solomon in 1960.[1] They have many applications, including consumer technologies such as MiniDiscs, CDs, DVDs, Blu-ray discs, QR codes, Data Matrix, data transmission technologies such as DSL and WiMAX, broadcast systems such as satellite communications, DVB and ATSC, and storage systems such as RAID 6.

Reed–Solomon codes
Named after: Irving S. Reed and Gustave Solomon
Classification
Hierarchy: Linear block code > Polynomial code > Reed–Solomon code
Block length: n
Message length: k
Distance: n − k + 1
Alphabet size: q = p^m ≥ n (p prime); often n = q − 1
Notation: [n, k, n − k + 1]_q-code
Algorithms: Berlekamp–Massey, Euclidean, et al.
Properties: Maximum-distance separable code

Reed–Solomon codes operate on a block of data treated as a set of finite-field elements called symbols. Reed–Solomon codes are able to detect and correct multiple symbol errors. By adding t = n − k check symbols to the data, a Reed–Solomon code can detect (but not correct) any combination of up to t erroneous symbols, or locate and correct up to ⌊t/2⌋ erroneous symbols at unknown locations. As an erasure code, it can correct up to t erasures at locations that are known and provided to the algorithm, or it can detect and correct combinations of errors and erasures. Reed–Solomon codes are also suitable as multiple-burst bit-error correcting codes, since a sequence of b + 1 consecutive bit errors can affect at most two symbols of size b. The choice of t is up to the designer of the code and may be selected within wide limits.

There are two basic types of Reed–Solomon codes – original view and BCH view – with BCH view being the most common, as BCH view decoders are faster and require less working storage than original view decoders.

History

Reed–Solomon codes were developed in 1960 by Irving S. Reed and Gustave Solomon, who were then staff members of MIT Lincoln Laboratory. Their seminal article was titled "Polynomial Codes over Certain Finite Fields" (Reed & Solomon 1960). The original encoding scheme described in the Reed & Solomon article used a variable polynomial based on the message to be encoded, where only a fixed set of values (evaluation points) is known to both encoder and decoder. The original theoretical decoder generated potential polynomials based on subsets of k (unencoded message length) out of n (encoded message length) values of a received message, choosing the most popular polynomial as the correct one, which was impractical for all but the simplest of cases. This was initially resolved by changing the original scheme to a BCH-code-like scheme based on a fixed polynomial known to both encoder and decoder, but later, practical decoders based on the original scheme were developed, although slower than the BCH schemes. The result is that there are two main types of Reed–Solomon codes: ones that use the original encoding scheme, and ones that use the BCH encoding scheme.

Also in 1960, a practical fixed polynomial decoder for BCH codes developed by Daniel Gorenstein and Neal Zierler was described in an MIT Lincoln Laboratory report by Zierler in January 1960 and later in a paper in June 1961.[2] The Gorenstein–Zierler decoder and the related work on BCH codes are described in the book Error Correcting Codes by W. Wesley Peterson (1961).[3] By 1963 (or possibly earlier), J. J. Stone (and others) recognized that Reed–Solomon codes could use the BCH scheme of a fixed generator polynomial, making such codes a special class of BCH codes.[4] Reed–Solomon codes based on the original encoding scheme, however, are not a class of BCH codes, and depending on the set of evaluation points, they are not even cyclic codes.

In 1969, an improved BCH scheme decoder was developed by Elwyn Berlekamp and James Massey, and has since been known as the Berlekamp–Massey decoding algorithm.

In 1975, another improved BCH scheme decoder was developed by Yasuo Sugiyama, based on the extended Euclidean algorithm.[5]

 

In 1977, Reed–Solomon codes were implemented in the Voyager program in the form of concatenated error correction codes. The first commercial application in mass-produced consumer products appeared in 1982 with the compact disc, where two interleaved Reed–Solomon codes are used. Today, Reed–Solomon codes are widely implemented in digital storage devices and digital communication standards, though they are being slowly replaced by Bose–Chaudhuri–Hocquenghem (BCH) codes. For example, Reed–Solomon codes are used in the Digital Video Broadcasting (DVB) standard DVB-S, in conjunction with a convolutional inner code, but BCH codes are used with LDPC in its successor, DVB-S2.

In 1986, an original scheme decoder known as the Berlekamp–Welch algorithm was developed.

In 1996, variations of original scheme decoders called list decoders or soft decoders were developed by Madhu Sudan and others, and work continues on these types of decoders – see Guruswami–Sudan list decoding algorithm.

In 2002, another original scheme decoder was developed by Shuhong Gao, based on the extended Euclidean algorithm.[6]

Applications

Data storage

Reed–Solomon coding is very widely used in mass storage systems to correct the burst errors associated with media defects.

Reed–Solomon coding is a key component of the compact disc. It was the first use of strong error correction coding in a mass-produced consumer product, and DAT and DVD use similar schemes. In the CD, two layers of Reed–Solomon coding separated by a 28-way convolutional interleaver yield a scheme called Cross-Interleaved Reed–Solomon Coding (CIRC). The first element of a CIRC decoder is a relatively weak inner (32,28) Reed–Solomon code, shortened from a (255,251) code with 8-bit symbols. This code can correct up to 2 byte errors per 32-byte block. More importantly, it flags as erasures any uncorrectable blocks, i.e., blocks with more than 2 byte errors. The decoded 28-byte blocks, with erasure indications, are then spread by the deinterleaver to different blocks of the (28,24) outer code. Thanks to the deinterleaving, an erased 28-byte block from the inner code becomes a single erased byte in each of 28 outer code blocks. The outer code easily corrects this, since it can handle up to 4 such erasures per block.

The result is a CIRC that can completely correct error bursts up to 4000 bits, or about 2.5 mm on the disc surface. This code is so strong that most CD playback errors are almost certainly caused by tracking errors that cause the laser to jump track, not by uncorrectable error bursts.[7]

DVDs use a similar scheme, but with much larger blocks, a (208,192) inner code, and a (182,172) outer code.

Reed–Solomon error correction is also used in parchive files which are commonly posted accompanying multimedia files on USENET. The distributed online storage service Wuala (discontinued in 2015) also used Reed–Solomon when breaking up files.

Bar code

Almost all two-dimensional bar codes such as PDF-417, MaxiCode, Datamatrix, QR Code, and Aztec Code use Reed–Solomon error correction to allow correct reading even if a portion of the bar code is damaged. When the bar code scanner cannot recognize a bar code symbol, it will treat it as an erasure.

Reed–Solomon coding is less common in one-dimensional bar codes, but is used by the PostBar symbology.

Data transmission

Specialized forms of Reed–Solomon codes, specifically Cauchy-RS and Vandermonde-RS, can be used to overcome the unreliable nature of data transmission over erasure channels. The encoding process assumes a code of RS(N, K), which results in N codewords of length N symbols, each storing K symbols of data, being generated and then sent over an erasure channel.

Any combination of K codewords received at the other end is enough to reconstruct all of the N codewords. The code rate is generally set to 1/2 unless the channel's erasure likelihood can be adequately modelled and is seen to be less. Consequently, N is usually 2K, meaning that at least half of all the codewords sent must be received in order to reconstruct all of the codewords sent.

Reed–Solomon codes are also used in xDSL systems and CCSDS's Space Communications Protocol Specifications as a form of forward error correction.

Space transmission

 
Deep-space concatenated coding system.[8] Notation: RS(255, 223) + CC ("constraint length" = 7, code rate = 1/2).

One significant application of Reed–Solomon coding was to encode the digital pictures sent back by the Voyager program.

Voyager introduced Reed–Solomon coding concatenated with convolutional codes, a practice that has since become very widespread in deep space and satellite (e.g., direct digital broadcasting) communications.

Viterbi decoders tend to produce errors in short bursts. Correcting these burst errors is a job best done by short or simplified Reed–Solomon codes.

Modern versions of concatenated Reed–Solomon/Viterbi-decoded convolutional coding were and are used on the Mars Pathfinder, Galileo, Mars Exploration Rover and Cassini missions, where they perform within about 1–1.5 dB of the ultimate limit, the Shannon capacity.

These concatenated codes are now being replaced by more powerful turbo codes:

Channel coding schemes used by NASA missions[9]

Years | Code | Mission(s)
1958–present | Uncoded | Explorer, Mariner, many others
1968–1978 | Convolutional codes (CC) (25, 1/2) | Pioneer, Venus
1969–1975 | Reed–Muller code (32, 6) | Mariner, Viking
1977–present | Binary Golay code | Voyager
1977–present | RS(255, 223) + CC(7, 1/2) | Voyager, Galileo, many others
1989–2003 | RS(255, 223) + CC(7, 1/3) | Voyager
1989–2003 | RS(255, 223) + CC(14, 1/4) | Galileo
1996–present | RS + CC(15, 1/6) | Cassini, Mars Pathfinder, others
2004–present | Turbo codes[nb 1] | Messenger, Stereo, MRO, others
est. 2009 | LDPC codes | Constellation, MSL

Constructions (encoding)

The Reed–Solomon code is actually a family of codes, where every code is characterised by three parameters: an alphabet size $q$, a block length $n$, and a message length $k$, with $k < n \le q$. The set of alphabet symbols is interpreted as the finite field $F$ of order $q$, and thus, $q$ must be a prime power. In the most useful parameterizations of the Reed–Solomon code, the block length is usually some constant multiple of the message length, that is, the rate $R = k/n$ is some constant, and furthermore, the block length is equal to or one less than the alphabet size, that is, $n = q$ or $n = q - 1$.[citation needed]

Reed & Solomon's original view: The codeword as a sequence of values

There are different encoding procedures for the Reed–Solomon code, and thus, there are different ways to describe the set of all codewords. In the original view of Reed & Solomon (1960), every codeword of the Reed–Solomon code is a sequence of function values of a polynomial of degree less than $k$. In order to obtain a codeword of the Reed–Solomon code, the message symbols (each within the q-sized alphabet) are treated as the coefficients of a polynomial $p$ of degree less than $k$, over the finite field $F$ with $q$ elements. In turn, the polynomial $p$ is evaluated at $n \le q$ distinct points $a_0, \dots, a_{n-1}$ of the field $F$, and the sequence of values is the corresponding codeword. Common choices for a set of evaluation points include $\{0, 1, 2, \dots, n-1\}$, $\{0, 1, \alpha, \alpha^2, \dots, \alpha^{n-2}\}$, or for $n < q$, $\{1, \alpha, \alpha^2, \dots, \alpha^{n-1}\}$, where $\alpha$ is a primitive element of $F$.

Formally, the set $\mathbf{C}$ of codewords of the Reed–Solomon code is defined as follows:

$$\mathbf{C} = \Bigl\{ \bigl(p(a_0), p(a_1), \dots, p(a_{n-1})\bigr) \;\Big|\; p \text{ is a polynomial over } F \text{ of degree} < k \Bigr\}.$$

Since any two distinct polynomials of degree less than $k$ agree in at most $k-1$ points, this means that any two codewords of the Reed–Solomon code disagree in at least $n-(k-1) = n-k+1$ positions. Furthermore, there are two polynomials that do agree in $k-1$ points but are not equal, and thus, the distance of the Reed–Solomon code is exactly $d = n-k+1$. Then the relative distance is $\delta = d/n = 1 - k/n + 1/n = 1 - R + 1/n \sim 1 - R$, where $R = k/n$ is the rate. This trade-off between the relative distance and the rate is asymptotically optimal since, by the Singleton bound, every code satisfies $\delta + R \le 1 + 1/n$. Being a code that achieves this optimal trade-off, the Reed–Solomon code belongs to the class of maximum distance separable codes.

While the number of different polynomials of degree less than $k$ and the number of different messages are both equal to $q^k$, and thus every message can be uniquely mapped to such a polynomial, there are different ways of doing this encoding. The original construction of Reed & Solomon (1960) interprets the message as the coefficients of the polynomial $p$, whereas subsequent constructions interpret the message as the values of the polynomial at the first $k$ points $a_0, \dots, a_{k-1}$ and obtain the polynomial $p$ by interpolating these values with a polynomial of degree less than $k$. The latter encoding procedure, while being slightly less efficient, has the advantage that it gives rise to a systematic code, that is, the original message is always contained as a subsequence of the codeword.

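Simple encoding procedure: The message as a sequence of coefficients

In the original construction of Reed & Solomon (1960), the message $m = (m_0, \dots, m_{k-1}) \in F^k$ is mapped to the polynomial $p_m$ with

$$p_m(a) = \sum_{i=0}^{k-1} m_i a^i.$$

The codeword of $m$ is obtained by evaluating $p_m$ at $n$ different points $a_0, \dots, a_{n-1}$ of the field $F$. Thus the classical encoding function $C : F^k \to F^n$ for the Reed–Solomon code is defined as follows:

$$C(m) = \begin{bmatrix} p_m(a_0) \\ p_m(a_1) \\ \vdots \\ p_m(a_{n-1}) \end{bmatrix}$$

This function $C$ is a linear mapping, that is, it satisfies $C(m) = Am$ for the following $n \times k$-matrix $A$ with elements from $F$:

$$A = \begin{bmatrix} 1 & a_0 & a_0^2 & \dots & a_0^{k-1} \\ 1 & a_1 & a_1^2 & \dots & a_1^{k-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & a_{n-1} & a_{n-1}^2 & \dots & a_{n-1}^{k-1} \end{bmatrix}$$

This matrix is a Vandermonde matrix over $F$. In other words, the Reed–Solomon code is a linear code, and in the classical encoding procedure, its generator matrix is $A$.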
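The matrix form of the classical encoder can be tried out directly. The following is a minimal sketch, assuming the prime field GF(929) (so field arithmetic is plain integer arithmetic mod 929) and the evaluation points 0 to 6; the names `vandermonde` and `encode` are illustrative and do not come from any library.

```python
P = 929  # prime modulus; GF(929) is an illustrative choice of field

def vandermonde(points, k):
    """Build the n x k matrix A with A[i][j] = points[i]**j mod P."""
    return [[pow(a, j, P) for j in range(k)] for a in points]

def encode(message, points):
    """Codeword C(m) = A m: the message polynomial evaluated at each point."""
    A = vandermonde(points, len(message))
    return [sum(a * m for a, m in zip(row, message)) % P for row in A]

points = [0, 1, 2, 3, 4, 5, 6]      # n = 7 distinct evaluation points (assumed)
m1, m2 = [1, 2, 3], [5, 0, 7]       # k = 3 coefficients, constant term first
c1, c2 = encode(m1, points), encode(m2, points)

# Linearity: encoding the sum of two messages gives the sum of the codewords.
msum = [(a + b) % P for a, b in zip(m1, m2)]
assert encode(msum, points) == [(a + b) % P for a, b in zip(c1, c2)]

# Distinct codewords differ in at least n - k + 1 = 5 positions.
assert sum(a != b for a, b in zip(c1, c2)) >= 5
```

Here `m1` stands for the polynomial 1 + 2a + 3a², so `c1` is simply that polynomial evaluated at 0 through 6.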

Systematic encoding procedure: The message as an initial sequence of values

There is an alternative encoding procedure that produces a systematic Reed–Solomon code. Here, we use a different polynomial $p_m$. In this variant, the polynomial $p_m$ is defined as the unique polynomial of degree less than $k$ such that

$$p_m(a_i) = m_i \quad \text{for all } i \in \{0, \dots, k-1\}.$$

To compute this polynomial $p_m$ from $m$, one can use Lagrange interpolation. Once it has been found, it is evaluated at the other points $a_k, \dots, a_{n-1}$.

$$C(m) = \begin{bmatrix} p_m(a_0) \\ p_m(a_1) \\ \vdots \\ p_m(a_{n-1}) \end{bmatrix}$$

This variant is systematic since the first $k$ entries, $p_m(a_0), \dots, p_m(a_{k-1})$, are exactly $m_0, \dots, m_{k-1}$ by the definition of $p_m$.
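A short sketch of this systematic variant, again assuming the prime field GF(929) and the evaluation points 0, 1, ..., n − 1 (both are illustrative assumptions, as are the function names). The same Lagrange evaluation also shows the erasure property: any k surviving symbols determine the polynomial and hence the message. It needs Python 3.8 or later for `pow(x, -1, P)`.

```python
P = 929  # prime modulus, so integers mod P form a field (assumed choice)

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through the given (xi, yi) pairs, working mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Systematic encoding: data[i] is the value at point i; extend to n points."""
    pts = list(enumerate(data))
    return [lagrange_eval(pts, x) for x in range(n)]

def recover(received, k):
    """received maps position -> symbol for any k (or more) surviving positions."""
    pts = list(received.items())[:k]
    return [lagrange_eval(pts, x) for x in range(k)]

data = [72, 101, 108, 108]
code = encode(data, 8)                           # first 4 symbols equal the data
survivors = {i: code[i] for i in (1, 4, 6, 7)}   # the other 4 symbols were erased
assert code[:4] == data
assert recover(survivors, 4) == data
```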

Discrete Fourier transform and its inverse

A discrete Fourier transform is essentially the same as the encoding procedure; it uses the generator polynomial $p_m$ to map a set of evaluation points into the message values as shown above:

$$C(m) = \begin{bmatrix} p_m(a_0) \\ p_m(a_1) \\ \vdots \\ p_m(a_{n-1}) \end{bmatrix}$$

The inverse Fourier transform could be used to convert an error-free set of $n < q$ message values back into the encoding polynomial of $k$ coefficients, with the constraint that in order for this to work, the set of evaluation points used to encode the message must be a set of increasing powers of $\alpha$:

$$a_i = \alpha^i$$
$$a_0, \dots, a_{n-1} = \{1, \alpha, \alpha^2, \dots, \alpha^{n-1}\}$$

However, Lagrange interpolation performs the same conversion without the constraint on the set of evaluation points or the requirement of an error-free set of message values. It is used for systematic encoding and in one of the steps of the Gao decoder.

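The BCH view: The codeword as a sequence of coefficients

In this view, the message is interpreted as the coefficients of a polynomial $p(x)$. The sender computes a related polynomial $s(x)$ of degree $n-1$, where $n \le q-1$, and sends the polynomial $s(x)$. The polynomial $s(x)$ is constructed by multiplying the message polynomial $p(x)$, which has degree $k-1$, with a generator polynomial $g(x)$ of degree $n-k$ that is known to both the sender and the receiver. The generator polynomial $g(x)$ is defined as the polynomial whose roots are sequential powers of the Galois field primitive $\alpha$:

$$g(x) = (x - \alpha^{i})(x - \alpha^{i+1}) \cdots (x - \alpha^{i+n-k-1}) = g_0 + g_1 x + \cdots + g_{n-k-1} x^{n-k-1} + x^{n-k}$$

For a "narrow sense code", $i = 1$.

$$\mathbf{C} = \Bigl\{ (s_1, s_2, \dots, s_n) \;\Big|\; s(a) = \sum_{i=1}^{n} s_i a^i \text{ is a polynomial that has at least the roots } \alpha^1, \alpha^2, \dots, \alpha^{n-k} \Bigr\}.$$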

Systematic encoding procedure

The encoding procedure for the BCH view of Reed–Solomon codes can be modified to yield a systematic encoding procedure, in which each codeword contains the message as a prefix, and simply appends error correcting symbols as a suffix. Here, instead of sending $s(x) = p(x) g(x)$, the encoder constructs the transmitted polynomial $s(x)$ such that the coefficients of the $k$ largest monomials are equal to the corresponding coefficients of $p(x)$, and the lower-order coefficients of $s(x)$ are chosen exactly in such a way that $s(x)$ becomes divisible by $g(x)$. Then the coefficients of $p(x)$ are a subsequence of the coefficients of $s(x)$. To get a code that is overall systematic, we construct the message polynomial $p(x)$ by interpreting the message as the sequence of its coefficients.

Formally, the construction is done by multiplying $p(x)$ by $x^t$ to make room for the $t = n-k$ check symbols, dividing that product by $g(x)$ to find the remainder, and then compensating for that remainder by subtracting it. The $t$ check symbols are created by computing the remainder $s_r(x)$:

$$s_r(x) = p(x) \cdot x^t \bmod g(x).$$

The remainder has degree at most $t-1$, whereas the coefficients of $x^{t-1}, x^{t-2}, \dots, x^1, x^0$ in the polynomial $p(x) \cdot x^t$ are zero. Therefore, the following definition of the codeword $s(x)$ has the property that the first $k$ coefficients are identical to the coefficients of $p(x)$:

$$s(x) = p(x) \cdot x^t - s_r(x).$$

As a result, the codewords $s(x)$ are indeed elements of $\mathbf{C}$, that is, they are divisible by the generator polynomial $g(x)$:[10]

$$s(x) \equiv p(x) \cdot x^t - s_r(x) \equiv s_r(x) - s_r(x) \equiv 0 \mod g(x).$$
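The multiply, divide and subtract steps can be sketched in a few lines. This is an illustrative sketch, assuming the prime field GF(929) with primitive element 3 (the parameters of the PDF417 example used later in the article), a narrow-sense generator, and t ≥ 1; the helper names are hypothetical. Polynomials are stored with the highest-degree coefficient first.

```python
P = 929      # field GF(929), as in the PDF417 example (assumed here)
ALPHA = 3    # primitive element used in that example

def poly_mul(a, b):
    """Multiply two polynomials (highest-degree coefficient first) mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def generator_poly(t):
    """Narrow-sense g(x) = (x - ALPHA)(x - ALPHA^2)...(x - ALPHA^t)."""
    g = [1]
    for i in range(1, t + 1):
        g = poly_mul(g, [1, (-pow(ALPHA, i, P)) % P])
    return g

def poly_mod(a, g):
    """Remainder of a(x) divided by the monic polynomial g(x)."""
    a = list(a)
    for i in range(len(a) - len(g) + 1):
        q = a[i]                      # next quotient coefficient
        for j, gj in enumerate(g):
            a[i + j] = (a[i + j] - q * gj) % P
    return a[-(len(g) - 1):]          # the last deg(g) coefficients

def encode_systematic(msg, t):
    """Return s(x) = p(x) x^t - (p(x) x^t mod g(x)); requires t >= 1."""
    rem = poly_mod(list(msg) + [0] * t, generator_poly(t))
    return list(msg) + [(-r) % P for r in rem]

codeword = encode_systematic([3, 2, 1], 4)   # p(x) = 3x^2 + 2x + 1, t = 4
```

The message symbols appear unchanged at the front of `codeword`, followed by the four check symbols.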

Properties

The Reed–Solomon code is a [n, k, n − k + 1] code; in other words, it is a linear block code of length n (over F) with dimension k and minimum Hamming distance $d = n - k + 1$. The Reed–Solomon code is optimal in the sense that the minimum distance has the maximum value possible for a linear code of size (n, k); this is known as the Singleton bound. Such a code is also called a maximum distance separable (MDS) code.

The error-correcting ability of a Reed–Solomon code is determined by its minimum distance, or equivalently, by $n - k$, the measure of redundancy in the block. If the locations of the error symbols are not known in advance, then a Reed–Solomon code can correct up to $\lfloor (n-k)/2 \rfloor$ erroneous symbols, i.e., it can correct half as many errors as there are redundant symbols added to the block. Sometimes error locations are known in advance (e.g., "side information" in demodulator signal-to-noise ratios); these are called erasures. A Reed–Solomon code (like any MDS code) is able to correct twice as many erasures as errors, and any combination of errors and erasures can be corrected as long as the relation $2E + S \le n - k$ is satisfied, where $E$ is the number of errors and $S$ is the number of erasures in the block.
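The bookkeeping in this paragraph is easy to get wrong by a factor of two, so here is a small sketch of it. The function names are illustrative, and the (255, 223) parameters are simply the commonly used code mentioned elsewhere in this article.

```python
def rs_capabilities(n: int, k: int) -> dict:
    """Summarise what an (n, k) Reed-Solomon code can handle per block."""
    if not 0 < k < n:
        raise ValueError("need 0 < k < n")
    t = n - k                            # number of check symbols
    return {
        "min_distance": t + 1,           # d = n - k + 1 (MDS)
        "correctable_errors": t // 2,    # errors at unknown positions
        "correctable_erasures": t,       # errors at known positions
    }

def can_decode(n: int, k: int, errors: int, erasures: int) -> bool:
    """True when the mix of errors and erasures satisfies 2E + S <= n - k."""
    return 2 * errors + erasures <= n - k

caps = rs_capabilities(255, 223)
# 10 errors plus 12 erasures fit in the budget of 32; 16 errors plus 1 erasure do not.
ok, too_many = can_decode(255, 223, 10, 12), can_decode(255, 223, 16, 1)
```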

 
Theoretical BER performance of the Reed-Solomon code (N=255, K=233, QPSK, AWGN). Step-like characteristic.

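The theoretical error bound can be described via the following formula for the AWGN channel for FSK:[11]

$$P_b \approx \frac{2^{m-1}}{2^m - 1} \frac{1}{n} \sum_{\ell=t+1}^{n} \ell \binom{n}{\ell} P_s^{\ell} (1 - P_s)^{n-\ell}$$

and for other modulation schemes:

$$P_b \approx \frac{1}{m} \frac{1}{n} \sum_{\ell=t+1}^{n} \ell \binom{n}{\ell} P_s^{\ell} (1 - P_s)^{n-\ell}$$

where $t = \frac{1}{2}(d_{\min} - 1)$, $P_s = 1 - (1 - s)^h$, $h = \frac{m}{\log_2 M}$, $s$ is the symbol error rate in the uncoded AWGN case and $M$ is the modulation order.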

For practical uses of Reed–Solomon codes, it is common to use a finite field $F$ with $2^m$ elements. In this case, each symbol can be represented as an $m$-bit value. The sender sends the data points as encoded blocks, and the number of symbols in the encoded block is $n = 2^m - 1$. Thus a Reed–Solomon code operating on 8-bit symbols has $n = 2^8 - 1 = 255$ symbols per block. (This is a very popular value because of the prevalence of byte-oriented computer systems.) The number $k$, with $k < n$, of data symbols in the block is a design parameter. A commonly used code encodes $k = 223$ eight-bit data symbols plus 32 eight-bit parity symbols in an $n = 255$-symbol block; this is denoted as a $(n, k) = (255, 223)$ code, and is capable of correcting up to 16 symbol errors per block.

The Reed–Solomon code properties discussed above make them especially well-suited to applications where errors occur in bursts. This is because it does not matter to the code how many bits in a symbol are in error — if multiple bits in a symbol are corrupted it only counts as a single error. Conversely, if a data stream is not characterized by error bursts or drop-outs but by random single bit errors, a Reed–Solomon code is usually a poor choice compared to a binary code.

The Reed–Solomon code, like the convolutional code, is a transparent code. This means that if the channel symbols have been inverted somewhere along the line, the decoders will still operate. The result will be the inversion of the original data. However, the Reed–Solomon code loses its transparency when the code is shortened. The "missing" bits in a shortened code need to be filled by either zeros or ones, depending on whether the data is complemented or not. (To put it another way, if the symbols are inverted, then the zero-fill needs to be inverted to a one-fill.) For this reason it is mandatory that the sense of the data (i.e., true or complemented) be resolved before Reed–Solomon decoding.

Whether the Reed–Solomon code is cyclic or not depends on subtle details of the construction. In the original view of Reed and Solomon, where the codewords are the values of a polynomial, one can choose the sequence of evaluation points in such a way as to make the code cyclic. In particular, if $\alpha$ is a primitive root of the field $F$, then by definition all non-zero elements of $F$ take the form $\alpha^i$ for $i \in \{1, \dots, q-1\}$, where $q = |F|$. Each polynomial $p$ over $F$ gives rise to a codeword $(p(\alpha^1), \dots, p(\alpha^{q-1}))$. Since the function $a \mapsto p(\alpha a)$ is also a polynomial of the same degree, this function gives rise to a codeword $(p(\alpha^2), \dots, p(\alpha^{q}))$; since $\alpha^{q} = \alpha^{1}$ holds, this codeword is the cyclic left-shift of the original codeword derived from $p$. So choosing a sequence of primitive root powers as the evaluation points makes the original view Reed–Solomon code cyclic. Reed–Solomon codes in the BCH view are always cyclic because BCH codes are cyclic.
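The cyclic-shift argument can be checked numerically on a tiny field. The sketch below assumes GF(7) with primitive root 3 and an arbitrary example polynomial; these choices and the function name are illustrative only.

```python
Q, ALPHA = 7, 3   # GF(7); 3 is a primitive root mod 7 (assumed toy field)

def encode(coeffs):
    """Evaluate p at ALPHA^1, ..., ALPHA^(Q-1); coefficients constant term first."""
    pts = [pow(ALPHA, i, Q) for i in range(1, Q)]
    return [sum(c * pow(a, j, Q) for j, c in enumerate(coeffs)) % Q for a in pts]

p = [2, 5, 1]   # p(a) = 2 + 5a + a^2, so k = 3

# Coefficients of the polynomial a -> p(ALPHA * a): scale the j-th one by ALPHA^j.
shifted = [c * pow(ALPHA, j, Q) % Q for j, c in enumerate(p)]

cw = encode(p)
# The codeword of the substituted polynomial is the cyclic left-shift of cw.
assert encode(shifted) == cw[1:] + cw[:1]
```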

Remarks

Designers are not required to use the "natural" sizes of Reed–Solomon code blocks. A technique known as "shortening" can produce a smaller code of any desired size from a larger code. For example, the widely used (255,223) code can be converted to a (160,128) code by padding the unused portion of the source block with 95 binary zeroes and not transmitting them. At the decoder, the same portion of the block is loaded locally with binary zeroes. The Delsarte–Goethals–Seidel[12] theorem illustrates an example of an application of shortened Reed–Solomon codes. In parallel to shortening, a technique known as puncturing allows omitting some of the encoded parity symbols.

BCH view decoders

The decoders described in this section use the BCH view of a codeword as a sequence of coefficients. They use a fixed generator polynomial known to both encoder and decoder.

Peterson–Gorenstein–Zierler decoder

Daniel Gorenstein and Neal Zierler developed a decoder that was described in an MIT Lincoln Laboratory report by Zierler in January 1960 and later in a paper in June 1961.[13] The Gorenstein–Zierler decoder and the related work on BCH codes are described in the book Error Correcting Codes by W. Wesley Peterson (1961).[14]

Formulation

The transmitted message, $(c_0, \dots, c_i, \dots, c_{n-1})$, is viewed as the coefficients of a polynomial $s(x)$:

$$s(x) = \sum_{i=0}^{n-1} c_i x^i.$$

As a result of the Reed–Solomon encoding procedure, $s(x)$ is divisible by the generator polynomial $g(x)$:

$$g(x) = \prod_{j=1}^{n-k} (x - \alpha^j),$$

where $\alpha$ is a primitive element.

Since $s(x)$ is a multiple of the generator $g(x)$, it follows that it "inherits" all its roots:

$$s(x) \bmod (x - \alpha^j) = g(x) \bmod (x - \alpha^j) = 0.$$

Therefore,

$$s(\alpha^j) = 0, \quad j = 1, 2, \dots, n-k.$$

The transmitted polynomial is corrupted in transit by an error polynomial $e(x)$ to produce the received polynomial $r(x)$.

$$r(x) = s(x) + e(x)$$
$$e(x) = \sum_{i=0}^{n-1} e_i x^i$$

Coefficient $e_i$ will be zero if there is no error at that power of $x$ and nonzero if there is an error. If there are $\nu$ errors at distinct powers $i_k$ of $x$, then

$$e(x) = \sum_{k=1}^{\nu} e_{i_k} x^{i_k}.$$

The goal of the decoder is to find the number of errors ($\nu$), the positions of the errors ($i_k$), and the error values at those positions ($e_{i_k}$). From those, $e(x)$ can be calculated and subtracted from $r(x)$ to get the originally sent message $s(x)$.

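Syndrome decoding

The decoder starts by evaluating the polynomial as received at points $\alpha^1, \dots, \alpha^{n-k}$. We call the results of that evaluation the "syndromes", $S_j$. They are defined as:

$$S_j = r(\alpha^j) = s(\alpha^j) + e(\alpha^j) = 0 + e(\alpha^j) = e(\alpha^j) = \sum_{k=1}^{\nu} e_{i_k} (\alpha^j)^{i_k}, \quad j = 1, 2, \dots, n-k.$$

Note that $s(\alpha^j) = 0$ because $s(x)$ has roots at $\alpha^j$, as shown in the previous section.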

The advantage of looking at the syndromes is that the message polynomial drops out. In other words, the syndromes only relate to the error, and are unaffected by the actual contents of the message being transmitted. If the syndromes are all zero, the algorithm stops here and reports that the message was not corrupted in transit.
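As a hedged illustration, the sketch below computes syndromes over the prime field GF(929) with α = 3 and four check symbols. The two seven-symbol blocks are assumptions taken from the PDF417-style worked example (a valid codeword and a copy with two corrupted symbols); the function names are illustrative.

```python
P, ALPHA = 929, 3   # GF(929) with primitive element 3 (assumed parameters)

def poly_eval(coeffs, x):
    """Horner evaluation mod P; coefficients highest degree first."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % P
    return acc

def syndromes(received, t):
    """Return [S_1, ..., S_t] with S_j = r(ALPHA^j)."""
    return [poly_eval(received, pow(ALPHA, j, P)) for j in range(1, t + 1)]

sent     = [3, 2, 1, 382, 191, 487, 474]     # a valid codeword (assumed example)
received = [3, 2, 123, 456, 191, 487, 474]   # same block with two symbols corrupted

clean = syndromes(sent, 4)        # all zero: nothing to correct
dirty = syndromes(received, 4)    # nonzero: depends only on the error pattern
```

Because the codeword contributes nothing, `dirty` is also what one gets by evaluating the error polynomial 122x⁴ + 74x³ alone.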

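Error locators and error values

For convenience, define the error locators $X_k$ and error values $Y_k$ as:

$$X_k = \alpha^{i_k}, \quad Y_k = e_{i_k}.$$

Then the syndromes can be written in terms of these error locators and error values as

$$S_j = \sum_{k=1}^{\nu} Y_k X_k^j.$$

This definition of the syndrome values is equivalent to the previous since $(\alpha^j)^{i_k} = \alpha^{j \cdot i_k} = (\alpha^{i_k})^j = X_k^j$.

The syndromes give a system of $n - k \ge 2\nu$ equations in $2\nu$ unknowns, but that system of equations is nonlinear in the $X_k$ and does not have an obvious solution. However, if the $X_k$ were known (see below), then the syndrome equations provide a linear system of equations that can easily be solved for the $Y_k$ error values.

$$\begin{bmatrix} X_1^1 & X_2^1 & \cdots & X_\nu^1 \\ X_1^2 & X_2^2 & \cdots & X_\nu^2 \\ \vdots & \vdots & \ddots & \vdots \\ X_1^{n-k} & X_2^{n-k} & \cdots & X_\nu^{n-k} \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_\nu \end{bmatrix} = \begin{bmatrix} S_1 \\ S_2 \\ \vdots \\ S_{n-k} \end{bmatrix}$$

Consequently, the problem is finding the $X_k$, because then the leftmost matrix would be known, and both sides of the equation could be multiplied by its inverse, yielding $Y_k$.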

In the variant of this algorithm where the locations of the errors are already known (when it is being used as an erasure code), this is the end. The error locations ($X_k$) are already known by some other method (for example, in an FM transmission, the sections where the bitstream was unclear or overcome with interference are probabilistically determinable from frequency analysis). In this scenario, up to $n - k$ errors can be corrected.

The rest of the algorithm serves to locate the errors, and will require syndrome values up to $S_{2\nu}$, instead of just the $\nu$ used thus far. This is why twice as many error correcting symbols need to be added as can be corrected without knowing their locations.

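Error locator polynomial

There is a linear recurrence relation that gives rise to a system of linear equations. Solving those equations identifies the error locations $X_k$.

Define the error locator polynomial $\Lambda(x)$ as

$$\Lambda(x) = \prod_{k=1}^{\nu} (1 - x X_k) = 1 + \Lambda_1 x^1 + \Lambda_2 x^2 + \cdots + \Lambda_\nu x^\nu.$$

The zeros of $\Lambda(x)$ are the reciprocals $X_k^{-1}$. This follows from the above product notation construction, since if $x = X_k^{-1}$ then one of the multiplied terms will be zero, $(1 - X_k^{-1} \cdot X_k) = 1 - 1 = 0$, making the whole polynomial evaluate to zero.

$$\Lambda(X_k^{-1}) = 0$$

Let $j$ be any integer such that $1 \le j \le \nu$. Multiply both sides by $Y_k X_k^{j+\nu}$ and it will still be zero.

$$\begin{aligned} Y_k X_k^{j+\nu} \Lambda(X_k^{-1}) &= 0 \\ Y_k X_k^{j+\nu} \left(1 + \Lambda_1 X_k^{-1} + \Lambda_2 X_k^{-2} + \cdots + \Lambda_\nu X_k^{-\nu}\right) &= 0 \\ Y_k X_k^{j+\nu} + \Lambda_1 Y_k X_k^{j+\nu-1} + \Lambda_2 Y_k X_k^{j+\nu-2} + \cdots + \Lambda_\nu Y_k X_k^{j} &= 0 \end{aligned}$$

Sum for $k = 1$ to $\nu$ and it will still be zero.

$$\sum_{k=1}^{\nu} \left( Y_k X_k^{j+\nu} + \Lambda_1 Y_k X_k^{j+\nu-1} + \Lambda_2 Y_k X_k^{j+\nu-2} + \cdots + \Lambda_\nu Y_k X_k^{j} \right) = 0$$

Collect each term into its own sum.

$$\left( \sum_{k=1}^{\nu} Y_k X_k^{j+\nu} \right) + \left( \sum_{k=1}^{\nu} \Lambda_1 Y_k X_k^{j+\nu-1} \right) + \cdots + \left( \sum_{k=1}^{\nu} \Lambda_\nu Y_k X_k^{j} \right) = 0$$

Extract the constant values of $\Lambda$ that are unaffected by the summation.

$$\left( \sum_{k=1}^{\nu} Y_k X_k^{j+\nu} \right) + \Lambda_1 \left( \sum_{k=1}^{\nu} Y_k X_k^{j+\nu-1} \right) + \cdots + \Lambda_\nu \left( \sum_{k=1}^{\nu} Y_k X_k^{j} \right) = 0$$

These summations are now equivalent to the syndrome values, which are known and can be substituted in. This therefore reduces to

$$S_{j+\nu} + \Lambda_1 S_{j+\nu-1} + \cdots + \Lambda_{\nu-1} S_{j+1} + \Lambda_\nu S_j = 0.$$

Subtracting $S_{j+\nu}$ from both sides yields

$$S_j \Lambda_\nu + S_{j+1} \Lambda_{\nu-1} + \cdots + S_{j+\nu-1} \Lambda_1 = -S_{j+\nu}.$$

Recall that $j$ was chosen to be any integer between 1 and $\nu$ inclusive, and this equivalence is true for any and all such values. Therefore, we have $\nu$ linear equations, not just one. This system of linear equations can therefore be solved for the coefficients $\Lambda_i$ of the error location polynomial:

$$\begin{bmatrix} S_1 & S_2 & \cdots & S_\nu \\ S_2 & S_3 & \cdots & S_{\nu+1} \\ \vdots & \vdots & \ddots & \vdots \\ S_\nu & S_{\nu+1} & \cdots & S_{2\nu-1} \end{bmatrix} \begin{bmatrix} \Lambda_\nu \\ \Lambda_{\nu-1} \\ \vdots \\ \Lambda_1 \end{bmatrix} = \begin{bmatrix} -S_{\nu+1} \\ -S_{\nu+2} \\ \vdots \\ -S_{\nu+\nu} \end{bmatrix}$$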
The above assumes the decoder knows the number of errors ν, but that number has not been determined yet. The PGZ decoder does not determine ν directly but rather searches for it by trying successive values. The decoder first assumes the largest value for a trial ν and sets up the linear system for that value. If the equations can be solved (i.e., the matrix determinant is nonzero), then that trial value is the number of errors. If the linear system cannot be solved, then the trial ν is reduced by one and the next smaller system is examined. (Gill n.d., p. 35)
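The trial-and-reduce search described above can be sketched as follows. This is a minimal illustration over the prime field GF(929), not a reference implementation: `solve_mod` is a plain Gauss-Jordan solver written for this example, and the syndrome values in the usage line are assumptions taken from the article's GF(929) example. It needs Python 3.8 or later for modular inverses via `pow`.

```python
P = 929  # prime field assumed for this sketch

def solve_mod(A, b):
    """Solve A x = b over GF(P) by Gauss-Jordan elimination.
    Returns None when A is singular."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] % P), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, P)
        M[col] = [v * inv % P for v in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(v - f * w) % P for v, w in zip(M[r], M[col])]
    return [row[n] for row in M]

def pgz_locator(S):
    """S = [S_1, ..., S_t]. Try the largest error count first and shrink
    until the system is solvable. Returns [Lambda_1, ..., Lambda_v]."""
    for v in range(len(S) // 2, 0, -1):
        A = [[S[j + i] for i in range(v)] for j in range(v)]
        b = [(-S[j + v]) % P for j in range(v)]
        sol = solve_mod(A, b)          # unknowns ordered Lambda_v ... Lambda_1
        if sol is not None:
            return sol[::-1]
    return []                          # no errors found

locator = pgz_locator([732, 637, 762, 925])
```

With a single error the 2 x 2 system is singular, so the search falls back to the 1 x 1 system, as the text describes.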

Find the roots of the error locator polynomial

Use the coefficients $\Lambda_i$ found in the last step to build the error location polynomial. The roots of the error location polynomial can be found by exhaustive search. The error locators $X_k$ are the reciprocals of those roots. The order of coefficients of the error location polynomial can be reversed, in which case the roots of that reversed polynomial are the error locators $X_k$ (not their reciprocals $X_k^{-1}$). Chien search is an efficient implementation of this step.

Calculate the error values

Once the error locators Xk are known, the error values can be determined. This can be done by direct solution for Yk in the error equations matrix given above, or using the Forney algorithm.

Calculate the error locations

Calculate $i_k$ by taking the log base $\alpha$ of $X_k$. This is generally done using a precomputed lookup table.
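The root search and the logarithm lookup fit in a few lines. The sketch assumes GF(929) with α = 3 and uses brute force over all nonzero field elements rather than a true Chien search; the locator coefficients in the usage line are assumptions taken from the article's example, and the names are illustrative.

```python
P, ALPHA = 929, 3   # GF(929), primitive element 3 (assumed parameters)

def locator_roots(lam):
    """lam = [1, Lambda_1, ..., Lambda_v], constant term first.
    Exhaustively test every nonzero field element."""
    def ev(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(lam)) % P
    return [x for x in range(1, P) if ev(x) == 0]

# Precomputed discrete-log table: LOG[ALPHA^i] = i.
LOG = {pow(ALPHA, i, P): i for i in range(P - 1)}

def error_positions(lam):
    """Error locators are the reciprocals of the roots; positions are their logs."""
    return sorted(LOG[pow(r, -1, P)] for r in locator_roots(lam))

roots = locator_roots([1, 821, 329])
positions = error_positions([1, 821, 329])   # exponents of x that are in error
```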

Fix the errors

Finally, $e(x)$ is generated from $i_k$ and $e_{i_k}$ and then is subtracted from $r(x)$ to get the originally sent message $s(x)$, with errors corrected.

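Example

Consider the Reed–Solomon code defined in GF(929) with α = 3 and t = 4 (this is used in PDF417 barcodes) for an RS(7,3) code. The generator polynomial is

$$g(x) = (x - 3)(x - 3^2)(x - 3^3)(x - 3^4) = x^4 + 809 x^3 + 723 x^2 + 568 x + 522.$$

If the message polynomial is $p(x) = 3x^2 + 2x + 1$, then a systematic codeword is encoded as follows.

$$s_r(x) = p(x)\, x^t \bmod g(x) = 547 x^3 + 738 x^2 + 442 x + 455$$
$$s(x) = p(x)\, x^t - s_r(x) = 3 x^6 + 2 x^5 + 1 x^4 + 382 x^3 + 191 x^2 + 487 x + 474$$

Errors in transmission might cause this to be received instead.

$$r(x) = s(x) + e(x) = 3 x^6 + 2 x^5 + 123 x^4 + 456 x^3 + 191 x^2 + 487 x + 474$$

The syndromes are calculated by evaluating $r$ at powers of α.

$$S_1 = r(3^1) = 3 \cdot 3^6 + 2 \cdot 3^5 + 123 \cdot 3^4 + 456 \cdot 3^3 + 191 \cdot 3^2 + 487 \cdot 3 + 474 = 732$$
$$S_2 = r(3^2) = 637, \quad S_3 = r(3^3) = 762, \quad S_4 = r(3^4) = 925$$
$$\begin{bmatrix} 732 & 637 \\ 637 & 762 \end{bmatrix} \begin{bmatrix} \Lambda_2 \\ \Lambda_1 \end{bmatrix} = \begin{bmatrix} -762 \\ -925 \end{bmatrix} = \begin{bmatrix} 167 \\ 004 \end{bmatrix}$$

Using Gaussian elimination:

$$\begin{bmatrix} 001 & 000 \\ 000 & 001 \end{bmatrix} \begin{bmatrix} \Lambda_2 \\ \Lambda_1 \end{bmatrix} = \begin{bmatrix} 329 \\ 821 \end{bmatrix}$$

$\Lambda(x) = 329 x^2 + 821 x + 001$, with roots $x_1 = 757 = 3^{-3}$ and $x_2 = 562 = 3^{-4}$.

The coefficients can be reversed to produce roots with positive exponents, but typically this isn't used:

$R(x) = 001 x^2 + 821 x + 329$, with roots $27 = 3^3$ and $81 = 3^4$,

with the log of the roots corresponding to the error locations (right to left, location 0 is the last term in the codeword).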

To calculate the error values, apply the Forney algorithm.

$$\Omega(x) = S(x)\,\Lambda(x) \bmod x^4 = 546 x + 732$$
$$\Lambda'(x) = 658 x + 821$$
$$e_1 = -\Omega(x_1)/\Lambda'(x_1) = 074$$
$$e_2 = -\Omega(x_2)/\Lambda'(x_2) = 122$$
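The Forney step above can be reproduced with a short sketch. It assumes the prime field GF(929), syndromes ordered S_1 to S_t, and polynomials stored constant term first; the function name `forney` and its signature are illustrative rather than taken from any library.

```python
P = 929  # prime field assumed for this sketch

def poly_eval(poly, x):
    """Evaluate a polynomial stored constant term first, mod P."""
    return sum(c * pow(x, i, P) for i, c in enumerate(poly)) % P

def forney(S, lam, roots):
    """S = [S_1, ..., S_t]; lam = [1, Lambda_1, ..., Lambda_v];
    roots = zeros of Lambda. Returns one error value per root."""
    t, v = len(S), len(lam) - 1
    # Omega(x) = S(x) Lambda(x) mod x^t, with S(x) = S_1 + S_2 x + ...
    omega = [sum(S[i - j] * lam[j] for j in range(min(i, v) + 1)) % P
             for i in range(t)]
    # Formal derivative of Lambda
    dlam = [(i * c) % P for i, c in enumerate(lam)][1:]
    values = []
    for x in roots:
        inv = pow(poly_eval(dlam, x), -1, P)
        values.append((-poly_eval(omega, x) * inv) % P)
    return values

errors = forney([732, 637, 762, 925], [1, 821, 329], [757, 562])
```

The returned list lines up with the order of `roots`, so the first value belongs to the root 757 and the second to 562.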

Subtracting $e_1 x^3 + e_2 x^4 = 74 x^3 + 122 x^4$ from the received polynomial $r(x)$ reproduces the original codeword $s$.

Berlekamp–Massey decoder

The Berlekamp–Massey algorithm is an alternate iterative procedure for finding the error locator polynomial. During each iteration, it calculates a discrepancy based on a current instance of $\Lambda(x)$ with an assumed number of errors $e$:

$$\Delta = S_i + \Lambda_1 S_{i-1} + \cdots + \Lambda_e S_{i-e}$$

and then adjusts $\Lambda(x)$ and $e$ so that a recalculated $\Delta$ would be zero. The article Berlekamp–Massey algorithm has a detailed description of the procedure. In the following example, $C(x)$ is used to represent $\Lambda(x)$.

Example

Using the same data as the Peterson–Gorenstein–Zierler example above:

n | S_{n+1} | d | C | B | b | m
0 | 732 | 732 | 197 x + 1 | 1 | 732 | 1
1 | 637 | 846 | 173 x + 1 | 1 | 732 | 2
2 | 762 | 412 | 634 x^2 + 173 x + 1 | 173 x + 1 | 412 | 1
3 | 925 | 576 | 329 x^2 + 821 x + 1 | 173 x + 1 | 412 | 2

The final value of C is the error locator polynomial, Λ(x).
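The iterations in the table can be replayed with a compact sketch of the algorithm. It assumes the prime field GF(929) and uses the same variable names as the table (d, C, B, b, m); L is the assumed number of errors. This is an illustrative sketch, not a reference implementation.

```python
P = 929  # prime field assumed for this sketch

def berlekamp_massey(S):
    """S = [S_1, ..., S_t]. Returns C = [1, C_1, ..., C_L], constant term first."""
    C, B = [1], [1]      # current and previous locator candidates
    L, m, b = 0, 1, 1    # assumed error count, shift since last update, last discrepancy
    for n in range(len(S)):
        # discrepancy d = S_{n+1} + C_1 S_n + ... + C_L S_{n+1-L}
        d = sum(C[i] * S[n - i] for i in range(L + 1) if i < len(C)) % P
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, -1, P) % P
        T = C[:]                                  # keep a copy in case L grows
        C = C + [0] * (len(B) + m - len(C))       # make room for x^m * B(x)
        for i, Bi in enumerate(B):
            C[i + m] = (C[i + m] - coef * Bi) % P
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return C

locator = berlekamp_massey([732, 637, 762, 925])
```

After the first two syndromes the candidate is 173x + 1, matching row 1 of the table, and the final result matches row 3.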

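Euclidean decoder

Another iterative method for calculating both the error locator polynomial and the error value polynomial is based on Sugiyama's adaptation of the extended Euclidean algorithm.

Define $S(x)$, $\Lambda(x)$, and $\Omega(x)$ for $t$ syndromes and $e$ errors:

$$\begin{aligned} S(x) &= S_t x^{t-1} + S_{t-1} x^{t-2} + \cdots + S_2 x + S_1 \\ \Lambda(x) &= \Lambda_e x^e + \Lambda_{e-1} x^{e-1} + \cdots + \Lambda_1 x + 1 \\ \Omega(x) &= \Omega_{e-1} x^{e-1} + \cdots + \Omega_1 x + \Omega_0 \end{aligned}$$

The key equation is:

$$\Lambda(x) S(x) = Q(x) x^t + \Omega(x)$$

For t = 6 and e = 3:

$$\begin{bmatrix} \Lambda_3 S_6 & x^8 \\ \Lambda_2 S_6 + \Lambda_3 S_5 & x^7 \\ \Lambda_1 S_6 + \Lambda_2 S_5 + \Lambda_3 S_4 & x^6 \\ S_6 + \Lambda_1 S_5 + \Lambda_2 S_4 + \Lambda_3 S_3 & x^5 \\ S_5 + \Lambda_1 S_4 + \Lambda_2 S_3 + \Lambda_3 S_2 & x^4 \\ S_4 + \Lambda_1 S_3 + \Lambda_2 S_2 + \Lambda_3 S_1 & x^3 \\ S_3 + \Lambda_1 S_2 + \Lambda_2 S_1 & x^2 \\ S_2 + \Lambda_1 S_1 & x \\ S_1 & \end{bmatrix} = \begin{bmatrix} Q_2 x^8 \\ Q_1 x^7 \\ Q_0 x^6 \\ 0 \\ 0 \\ 0 \\ \Omega_2 x^2 \\ \Omega_1 x \\ \Omega_0 \end{bmatrix}$$

The middle terms are zero due to the relationship between Λ and syndromes.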

The extended Euclidean algorithm can find a series of polynomials of the form

$$A_i(x)\, S(x) + B_i(x)\, x^t = R_i(x)$$

where the degree of $R$ decreases as $i$ increases. Once the degree of $R_i(x) < t/2$, then

$$A_i(x) = \Lambda(x)$$
$$B_i(x) = -Q(x)$$
$$R_i(x) = \Omega(x).$$

$B(x)$ and $Q(x)$ don't need to be saved, so the algorithm becomes:

    R_{−1} := x^t
    R_0 := S(x)
    A_{−1} := 0
    A_0 := 1
    i := 0
    while degree of R_i ≥ t/2
        i := i + 1
        Q := R_{i−2} / R_{i−1}
        R_i := R_{i−2} − Q R_{i−1}
        A_i := A_{i−2} − Q A_{i−1}

To set the low-order term of $\Lambda(x)$ to 1, divide $\Lambda(x)$ and $\Omega(x)$ by $A_i(0)$:

$$\Lambda(x) = A_i / A_i(0)$$
$$\Omega(x) = R_i / A_i(0)$$

$A_i(0)$ is the constant (low-order) term of $A_i$.

Example

Using the same data as the Peterson–Gorenstein–Zierler example above:

i | R_i | A_i
−1 | 001 x^4 + 000 x^3 + 000 x^2 + 000 x + 000 | 000
0 | 925 x^3 + 762 x^2 + 637 x + 732 | 001
1 | 683 x^2 + 676 x + 024 | 697 x + 396
2 | 673 x + 596 | 608 x^2 + 704 x + 544

Λ(x) = A_2 / 544 = 329 x^2 + 821 x + 001
Ω(x) = R_2 / 544 = 546 x + 732
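The Euclidean iteration can be sketched in runnable form as follows, assuming the prime field GF(929) and polynomials stored constant term first. The helper names are illustrative, and the sketch does not guard against degenerate inputs (for example a result whose constant term is zero).

```python
P = 929  # prime field assumed for this sketch

def deg(a):
    """Degree of a polynomial stored constant term first (-1 for zero)."""
    d = len(a) - 1
    while d >= 0 and a[d] % P == 0:
        d -= 1
    return d

def divmod_poly(a, b):
    """Quotient and remainder of a / b over GF(P); b must be nonzero."""
    a = a[:]
    q = [0] * max(len(a), 1)
    db = deg(b)
    inv = pow(b[db], -1, P)
    while deg(a) >= db:
        shift = deg(a) - db
        c = a[deg(a)] * inv % P
        q[shift] = c
        for i in range(db + 1):
            a[i + shift] = (a[i + shift] - c * b[i]) % P
    return q, a

def sub_mul(x, q, y):
    """Return x - q*y over GF(P)."""
    out = x + [0] * (len(q) + len(y))
    for i, qi in enumerate(q):
        for j, yj in enumerate(y):
            out[i + j] = (out[i + j] - qi * yj) % P
    return out

def euclid_decode(S):
    """S = [S_1, ..., S_t]. Returns (Lambda, Omega), both constant term first."""
    t = len(S)
    R_prev, R = [0] * t + [1], list(S)      # R_{-1} = x^t, R_0 = S(x)
    A_prev, A = [0], [1]                    # A_{-1} = 0,  A_0 = 1
    while deg(R) >= t / 2:
        Q, rem = divmod_poly(R_prev, R)
        R_prev, R = R, rem
        A_prev, A = A, sub_mul(A_prev, Q, A)
    inv = pow(A[0], -1, P)                  # normalise so Lambda(0) = 1
    def trim(p):
        return [c * inv % P for c in p[:deg(p) + 1]]
    return trim(A), trim(R)

lam, omega = euclid_decode([732, 637, 762, 925])
```

For the example syndromes the loop runs twice, corresponding to rows 1 and 2 of the table.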

Decoder using discrete Fourier transform edit

A discrete Fourier transform can be used for decoding.[15] To avoid conflict with syndrome names, let c(x) = s(x) the encoded codeword. r(x) and e(x) are the same as above. Define C(x), E(x), and R(x) as the discrete Fourier transforms of c(x), e(x), and r(x). Since r(x) = c(x) + e(x), and since a discrete Fourier transform is a linear operator, R(x) = C(x) + E(x).

Transform r(x) to R(x) using the discrete Fourier transform. Since the calculation for a discrete Fourier transform is the same as the calculation for syndromes, t coefficients of R(x) and E(x) are the same as the syndromes:

R_j = E_j = S_j = r(α^j) for 1 ≤ j ≤ t

Use R_1 through R_t as syndromes (they are the same) and generate the error locator polynomial using the methods from any of the above decoders.

Let v = number of errors. Generate E(x) using the known coefficients E_1 to E_t, the error locator polynomial, and these formulas:

E_0 = −(1/Λ_v) (E_v + Λ_1 E_{v−1} + ⋯ + Λ_{v−1} E_1)
E_j = −(Λ_1 E_{j−1} + Λ_2 E_{j−2} + ⋯ + Λ_v E_{j−v}) for t < j < n

Then calculate C(x) = R(x) − E(x) and take the inverse transform (polynomial interpolation) of C(x) to produce c(x).
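
The transform-domain procedure can be sketched on a toy code. The Python example below is an illustration under stated assumptions, none of which come from the article: the field is GF(7), α = 3 (which has order n = 6), t = 2 check symbols, and exactly one symbol error, so the locator is obtained from the single-error relation Λ_1 = −S_2 / S_1. The function names are hypothetical.

```python
P, ALPHA, N, T = 7, 3, 6, 2  # GF(7); alpha = 3 has order n = 6; t = 2 check symbols

def evaluate(poly, x):
    """Horner evaluation of a low-order-first coefficient list over GF(P)."""
    acc = 0
    for c in reversed(poly):
        acc = (acc * x + c) % P
    return acc

def dft(poly):
    """Forward transform: X_j = poly(alpha^j) for j = 0..n-1."""
    return [evaluate(poly, pow(ALPHA, j, P)) for j in range(N)]

def idft(spec):
    """Inverse transform: x_i = n^-1 * spec(alpha^-i)."""
    n_inv, a_inv = pow(N, -1, P), pow(ALPHA, -1, P)
    return [n_inv * evaluate(spec, pow(a_inv, i, P)) % P for i in range(N)]

def extend_error_spectrum(known, lam):
    """known = [E_1..E_t], lam = [1, L_1..L_v]; returns [E_0..E_{n-1}]."""
    v = len(lam) - 1
    E = [0] + list(known) + [0] * (N - 1 - T)
    for j in range(T + 1, N):        # recurrence for t < j < n
        E[j] = -sum(lam[i] * E[j - i] for i in range(1, v + 1)) % P
    head = E[v] + sum(lam[i] * E[v - i] for i in range(1, v))
    E[0] = -pow(lam[v], -1, P) * head % P
    return E

# Codeword (1 + 2x + 3x^2 + 4x^3) * (x - alpha)(x - alpha^2) mod 7, low-order first
codeword = [6, 0, 2, 4, 4, 4]
received = codeword[:]
received[4] = (received[4] + 5) % P      # one symbol error of value 5 at x^4

R = dft(received)
S1, S2 = R[1], R[2]                      # R_1..R_t double as the syndromes
lam = [1, -S2 * pow(S1, -1, P) % P]      # single-error locator (assumption)
E = extend_error_spectrum(R[1:T + 1], lam)
C = [(r - e) % P for r, e in zip(R, E)]  # C(x) = R(x) - E(x)
recovered = idft(C)
print(recovered)  # [6, 0, 2, 4, 4, 4]
```

For more than one error, the locator would instead come from one of the decoders described earlier; only `extend_error_spectrum` is written for general v.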

Decoding beyond the error-correction bound

The Singleton bound states that the minimum distance d of a linear block code of size (n,k) is upper-bounded by n − k + 1. The distance d was usually understood to limit the error-correction capability to ⌊(d−1) / 2⌋. The Reed–Solomon code achieves this bound with equality, and can thus correct up to ⌊(n − k) / 2⌋ errors. However, this error-correction bound is not exact.

In 1999, Madhu Sudan and Venkatesan Guruswami at MIT published "Improved Decoding of Reed–Solomon and Algebraic-Geometry Codes", introducing an algorithm that allowed for the correction of errors beyond half the minimum distance of the code.[16] It applies to Reed–Solomon codes and more generally to algebraic geometric codes. This algorithm produces a list of codewords (it is a list-decoding algorithm) and is based on interpolation and factorization of polynomials over GF(2^m) and its extensions.

In 2023, building on three earlier works,[17][18][19] coding theorists showed that Reed–Solomon codes defined over random evaluation points can achieve list-decoding capacity (up to n − k errors) over linear-size alphabets with high probability. However, this result is combinatorial rather than algorithmic.

Soft-decoding

The algebraic decoding methods described above are hard-decision methods, which means that for every symbol a hard decision is made about its value. A soft-decision decoder, by contrast, could associate with each symbol an additional value corresponding to the channel demodulator's confidence in the correctness of the symbol. The advent of LDPC and turbo codes, which employ iterated soft-decision belief propagation decoding methods to achieve error-correction performance close to the theoretical limit, has spurred interest in applying soft-decision decoding to conventional algebraic codes. In 2003, Ralf Koetter and Alexander Vardy presented a polynomial-time soft-decision algebraic list-decoding algorithm for Reed–Solomon codes, which was based upon the work by Sudan and Guruswami.[20] In 2016, Steven J. Franke and Joseph H. Taylor published a novel soft-decision decoder.[21]

MATLAB example

Encoder

Here we present a simple MATLAB implementation for an encoder.

    function encoded = rsEncoder(msg, m, prim_poly, n, k)
        % RSENCODER Encode message with the Reed-Solomon algorithm
        % m is the number of bits per symbol
        % prim_poly: Primitive polynomial p(x), e.g. 301 for Data Matrix
        % k is the size of the message
        % n is the total size (k+redundant)
        % Example: msg = uint8('Test')
        % enc_msg = rsEncoder(msg, 8, 301, 12, numel(msg));

        % Get the alpha
        alpha = gf(2, m, prim_poly);

        % Get the Reed-Solomon generating polynomial g(x)
        g_x = genpoly(k, n, alpha);

        % Multiply the information by X^(n-k), or just pad with zeros at the end to
        % get space to add the redundant information
        msg_padded = gf([msg zeros(1, n - k)], m, prim_poly);

        % Get the remainder of the division of the extended message by the
        % Reed-Solomon generating polynomial g(x)
        [~, remainder] = deconv(msg_padded, g_x);

        % Now return the message with the redundant information
        encoded = msg_padded - remainder;
    end

    % Find the Reed-Solomon generating polynomial g(x); this is the same as
    % the rsgenpoly function in MATLAB
    function g = genpoly(k, n, alpha)
        g = 1;
        % A multiplication on the Galois field is just a convolution.
        % The loop variable is named i so that it does not shadow the parameter k.
        for i = mod(1 : n - k, n)
            g = conv(g, [1 alpha .^ (i)]);
        end
    end

Decoder

Now the decoding part:

    function [decoded, error_pos, error_mag, g, S] = rsDecoder(encoded, m, prim_poly, n, k)
        % RSDECODER Decode a Reed-Solomon encoded message
        % Example:
        % [dec, ~, ~, ~, ~] = rsDecoder(enc_msg, 8, 301, 12, numel(msg))
        max_errors = floor((n - k) / 2);
        orig_vals = encoded.x;
        % Initialize the error vector
        errors = zeros(1, n);
        g = [];
        S = [];

        % Get the alpha
        alpha = gf(2, m, prim_poly);

        % Find the syndromes (check whether dividing the message by the
        % generator polynomial leaves a zero remainder)
        Synd = polyval(encoded, alpha .^ (1:n - k));
        Syndromes = trim(Synd);

        % If all syndromes are zeros (perfectly divisible) there are no errors
        if isempty(Syndromes.x)
            decoded = orig_vals(1:k);
            error_pos = [];
            error_mag = [];
            g = [];
            S = Synd;
            return;
        end

        % Prepare for the Euclidean algorithm (used to find the error locating
        % polynomials)
        r0 = [1, zeros(1, 2 * max_errors)]; r0 = gf(r0, m, prim_poly); r0 = trim(r0);
        size_r0 = length(r0);
        r1 = Syndromes;
        f0 = gf([zeros(1, size_r0 - 1) 1], m, prim_poly);
        f1 = gf(zeros(1, size_r0), m, prim_poly);
        g0 = f1; g1 = f0;

        % Do the Euclidean algorithm on the polynomials r0(x) and Syndromes(x) in
        % order to find the error locating polynomial
        while true
            % Do a long division
            [quotient, remainder] = deconv(r0, r1);
            % Add some zeros
            quotient = pad(quotient, length(g1));

            % Find quotient*g1 and pad
            c = conv(quotient, g1);
            c = trim(c);
            c = pad(c, length(g0));

            % Update g as g0-quotient*g1
            g = g0 - c;

            % Check if the degree of remainder(x) is less than max_errors
            if all(remainder(1:end - max_errors) == 0)
                break;
            end

            % Update r0, r1, g0, g1 and remove leading zeros
            r0 = trim(r1); r1 = trim(remainder);
            g0 = g1; g1 = g;
        end

        % Remove leading zeros
        g = trim(g);

        % Find the zeros of the error polynomial on this Galois field
        evalPoly = polyval(g, alpha .^ (n - 1 : - 1 : 0));
        error_pos = gf(find(evalPoly == 0), m);

        % If no error position is found we return the received message, because
        % there is basically nothing more that we could do
        if isempty(error_pos)
            decoded = orig_vals(1:k);
            error_mag = [];
            return;
        end

        % Prepare a linear system to solve the error polynomial and find the error
        % magnitudes
        size_error = length(error_pos);
        Syndrome_Vals = Syndromes.x;
        b(:, 1) = Syndrome_Vals(1:size_error);
        for idx = 1 : size_error
            e = alpha .^ (idx * (n - error_pos.x));
            err = e.x;
            er(idx, :) = err;
        end

        % Solve the linear system
        error_mag = (gf(er, m, prim_poly) \ gf(b, m, prim_poly))';
        % Put the error magnitude on the error vector
        errors(error_pos.x) = error_mag.x;
        % Bring this vector to the Galois field
        errors_gf = gf(errors, m, prim_poly);

        % Now to fix the errors just add with the encoded code
        decoded_gf = encoded(1:k) + errors_gf(1:k);
        decoded = decoded_gf.x;
    end

    % Remove leading zeros from Galois array
    function gt = trim(g)
        gx = g.x;
        gt = gf(gx(find(gx, 1) : end), g.m, g.prim_poly);
    end

    % Add leading zeros
    function xpad = pad(x, k)
        % Return x unchanged when it is already at least k long, so that the
        % output is always assigned
        xpad = x;
        len = length(x);
        if len < k
            xpad = [zeros(1, k - len) x];
        end
    end

Reed–Solomon original view decoders

The decoders described in this section use the Reed–Solomon original view of a codeword as a sequence of polynomial values where the polynomial is based on the message to be encoded. The same set of fixed values is used by the encoder and decoder, and the decoder recovers the encoding polynomial (and optionally an error locating polynomial) from the received message.

Theoretical decoder

Reed & Solomon (1960) described a theoretical decoder that corrected errors by finding the most popular message polynomial. The decoder only knows the set of values a_1 to a_n and which encoding method was used to generate the codeword's sequence of values. The original message, the polynomial, and any errors are unknown. A decoding procedure could use a method like Lagrange interpolation on various subsets of n codeword values taken k at a time to repeatedly produce potential polynomials, until a sufficient number of matching polynomials are produced to reasonably eliminate any errors in the received codeword. Once a polynomial is determined, any errors in the codeword can be corrected by recalculating the corresponding codeword values. Unfortunately, in all but the simplest of cases there are too many subsets, so the algorithm is impractical. The number of subsets is the binomial coefficient C(n, k) = n! / ((n − k)! k!), which is infeasible for even modest codes. For a (255, 249) code that can correct 3 errors, the naïve theoretical decoder would examine 359 billion subsets.
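
The subset count, and the decoder itself on a code small enough to be practical, can be sketched in Python. This is an illustration under assumptions: it uses an RS(7,3) code over GF(929) with evaluation points 0 to 6 and a received word containing two errors, interpolates every 3-point subset, and keeps the polynomial that occurs most often. The names `interpolate` and `popular_vote_decode` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb

P = 929  # prime modulus of GF(929)

def interpolate(points):
    """Lagrange interpolation over GF(P); returns low-order-first coefficients."""
    result = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j == i:
                continue
            # Multiply the basis polynomial by (x - xj).
            basis = [(shifted - xj * cur) % P
                     for shifted, cur in zip([0] + basis, basis + [0])]
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P
        for d, c in enumerate(basis):
            result[d] = (result[d] + scale * c) % P
    return tuple(result)

def popular_vote_decode(xs, received, k):
    """Interpolate every k-subset and return (most common polynomial, votes)."""
    votes = Counter(interpolate(subset)
                    for subset in combinations(zip(xs, received), k))
    return votes.most_common(1)[0]

xs = list(range(7))
received = [1, 6, 123, 456, 57, 86, 121]   # positions 2 and 3 are in error
poly, count = popular_vote_decode(xs, received, 3)
print(poly, count, comb(7, 3))  # (1, 2, 3) 10 35
print(comb(255, 249))           # 359895314625
```

Of the 35 subsets, the 10 that avoid both erroneous positions agree on 3 x^2 + 2 x + 1. Any other polynomial can agree with the received word in at most 4 positions and so collects at most 4 votes.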

Berlekamp–Welch decoder

In 1986, a decoder known as the Berlekamp–Welch algorithm was developed. It recovers the original message polynomial as well as an error "locator" polynomial that produces zeroes for the input values that correspond to errors, with time complexity O(n^3), where n is the number of values in a message. The recovered polynomial is then used to recover (recalculate as needed) the original message.

Example

Using RS(7,3), GF(929), and the set of evaluation points a_i = i − 1

a = {0, 1, 2, 3, 4, 5, 6}

If the message polynomial is

p(x) = 003 x^2 + 002 x + 001

The codeword is

c = {001, 006, 017, 034, 057, 086, 121}

Errors in transmission might cause this to be received instead:

b = c + e = {001, 006, 123, 456, 057, 086, 121}

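The key equations are:

b_i E(a_i) − Q(a_i) = 0

Assume maximum number of errors: e = 2. With E(x) monic, the key equations become:

b_i (e_0 + e_1 a_i) − (q_0 + q_1 a_i + q_2 a_i^2 + q_3 a_i^3 + q_4 a_i^4) = −b_i a_i^2

Written out for the seven pairs (a_i, b_i), with all arithmetic in GF(929), this is the linear system:

| a_i | e_0 | e_1 | q_0 | q_1 | q_2 | q_3 | q_4 | right-hand side |
|-----|-----|-----|-----|-----|-----|-----|-----|-----------------|
| 0 | 001 | 000 | 928 | 000 | 000 | 000 | 000 | 000 |
| 1 | 006 | 006 | 928 | 928 | 928 | 928 | 928 | 923 |
| 2 | 123 | 246 | 928 | 927 | 925 | 921 | 913 | 437 |
| 3 | 456 | 439 | 928 | 926 | 920 | 902 | 848 | 541 |
| 4 | 057 | 228 | 928 | 925 | 913 | 865 | 673 | 017 |
| 5 | 086 | 430 | 928 | 924 | 904 | 804 | 304 | 637 |
| 6 | 121 | 726 | 928 | 923 | 893 | 713 | 562 | 289 |

Using Gaussian elimination:

e_0 = 006, e_1 = 924, q_0 = 006, q_1 = 007, q_2 = 009, q_3 = 916, q_4 = 003

Q(x) = 003 x^4 + 916 x^3 + 009 x^2 + 007 x + 006
E(x) = 001 x^2 + 924 x + 006
Q(x) / E(x) = P(x) = 003 x^2 + 002 x + 001

E(x) = 0 at x = 2 and x = 3, so recalculate P(x) at those two points to correct b, giving the corrected codeword: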

c = {001, 006, 017, 034, 057, 086, 121}
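
The example can be checked with a short Python sketch. It is an illustration, not a reference implementation: it assumes n − k is even and that the linear system is nonsingular (true for this example, which has exactly e = 2 errors), and the names `solve_mod`, `divide`, and `berlekamp_welch` are hypothetical. Polynomials are coefficient lists with the low-order term first.

```python
P = 929  # prime modulus of GF(929)

def solve_mod(rows, rhs):
    """Gauss-Jordan elimination over GF(P); assumes a nonsingular system."""
    n = len(rows)
    m = [[c % P for c in row] + [v % P] for row, v in zip(rows, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col])
        m[col], m[pivot] = m[pivot], m[col]
        inv = pow(m[col][col], -1, P)        # modular inverse, Python 3.8+
        m[col] = [c * inv % P for c in m[col]]
        for r in range(n):
            if r != col and m[r][col]:
                f = m[r][col]
                m[r] = [(x - f * y) % P for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]

def divide(num, den):
    """Long division over GF(P); returns (quotient, remainder)."""
    rem = [c % P for c in num]
    inv = pow(den[-1], -1, P)
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        c = rem[i + len(den) - 1] * inv % P
        quot[i] = c
        for j, d in enumerate(den):
            rem[i + j] = (rem[i + j] - c * d) % P
    return quot, rem[:len(den) - 1]

def berlekamp_welch(xs, received, k):
    """Solve b_i E(a_i) - Q(a_i) = 0 with E monic of degree e = (n - k) / 2."""
    e = (len(xs) - k) // 2
    rows, rhs = [], []
    for a, b in zip(xs, received):
        # Unknowns e_0..e_{e-1}, then q_0..q_{k+e-1}.
        rows.append([b * a**j for j in range(e)] +
                    [-(a**j) for j in range(k + e)])
        rhs.append(-b * a**e)                # monic term moved to the right
    sol = solve_mod(rows, rhs)
    locator, q_poly = sol[:e] + [1], sol[e:]
    message, remainder = divide(q_poly, locator)
    return locator, q_poly, message, remainder

xs = list(range(7))
received = [1, 6, 123, 456, 57, 86, 121]
E, Q, msg, rem = berlekamp_welch(xs, received, 3)
corrected = [sum(c * a**d for d, c in enumerate(msg)) % P for a in xs]
print(E, Q, msg)   # [6, 924, 1] [6, 7, 9, 916, 3] [1, 2, 3]
print(corrected)   # [1, 6, 17, 34, 57, 86, 121]
```

A nonzero remainder from `divide` would indicate more errors than the code can correct.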

Gao decoder

In 2002, an improved decoder was developed by Shuhong Gao, based on the extended Euclidean algorithm.[6]

Example

Using the same data as the Berlekamp–Welch example above:

  •  R_{−1} = ∏_{i=1}^{n} (x − a_i)
  •  R_0 = Lagrange interpolation of {a_i, b_i} for i = 1 to n
  •  A_{−1} = 0
  •  A_0 = 1

| i | R_i | A_i |
|---|-----|-----|
| −1 | 001 x^7 + 908 x^6 + 175 x^5 + 194 x^4 + 695 x^3 + 094 x^2 + 720 x + 000 | 000 |
| 0 | 055 x^6 + 440 x^5 + 497 x^4 + 904 x^3 + 424 x^2 + 472 x + 001 | 001 |
| 1 | 702 x^5 + 845 x^4 + 691 x^3 + 461 x^2 + 327 x + 237 | 152 x + 237 |
| 2 | 266 x^4 + 086 x^3 + 798 x^2 + 311 x + 532 | 708 x^2 + 176 x + 532 |

Q(x) = R_2 = 266 x^4 + 086 x^3 + 798 x^2 + 311 x + 532
E(x) = A_2 = 708 x^2 + 176 x + 532

Optionally, divide Q(x) and E(x) by the most significant coefficient of E(x), which is 708:

Q(x) = 003 x^4 + 916 x^3 + 009 x^2 + 007 x + 006
E(x) = 001 x^2 + 924 x + 006
Q(x) / E(x) = P(x) = 003 x^2 + 002 x + 001

E(x) = 0 at x = 2 and x = 3, so recalculate P(x) at those two points to correct b, giving the corrected codeword:

c = {001, 006, 017, 034, 057, 086, 121}
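
The Gao iteration can be sketched in Python as follows. This is an illustration under assumptions: the field is GF(929), polynomials are coefficient lists with the low-order term first, the loop stops once the degree of R_i drops below (n + k) / 2 (which matches the table, where iteration stops at i = 2), and all function names are hypothetical.

```python
P = 929  # prime modulus of GF(929)

def trim(a):
    """Reduce coefficients mod P and strip high-order zeros (low-order first)."""
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def axpy(a, b, s):
    """Return a(x) + s * b(x) over GF(P)."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return trim([x + s * y for x, y in zip(a, b)])

def mul(a, b):
    """Return a(x) * b(x) over GF(P)."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return trim(out)

def divmod_poly(a, b):
    """Return (quotient, remainder) of a(x) / b(x) over GF(P)."""
    r, b = trim(a), trim(b)
    inv_lead = pow(b[-1], -1, P)  # modular inverse, Python 3.8+
    q = [0] * max(len(r) - len(b) + 1, 1)
    while len(r) >= len(b):
        shift = len(r) - len(b)
        c = r[-1] * inv_lead % P
        q[shift] = c
        r = axpy(r, [0] * shift + list(b), -c)
    return trim(q), r

def interpolate(xs, ys):
    """Lagrange interpolation through the points (xs[i], ys[i]) over GF(P)."""
    result = []
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis, denom = [1], 1
        for j, xj in enumerate(xs):
            if j != i:
                basis = mul(basis, [-xj, 1])
                denom = denom * (xi - xj) % P
        result = axpy(result, basis, yi * pow(denom, -1, P))
    return result

def gao_decode(xs, received, k):
    """Return (Q, E, message, remainder) with Q = R_i and E = A_i at the stop."""
    n = len(xs)
    r_prev = [1]
    for a in xs:
        r_prev = mul(r_prev, [-a, 1])            # R_{-1} = prod (x - a_i)
    r_cur = interpolate(xs, received)            # R_0 interpolates (a_i, b_i)
    a_prev, a_cur = [], [1]                      # A_{-1} = 0, A_0 = 1
    while len(r_cur) - 1 >= (n + k) / 2:
        q, rem = divmod_poly(r_prev, r_cur)
        r_prev, r_cur = r_cur, rem
        a_prev, a_cur = a_cur, axpy(a_prev, mul(q, a_cur), -1)
    message, remainder = divmod_poly(r_cur, a_cur)   # P(x) = Q(x) / E(x)
    return r_cur, a_cur, message, remainder

Q, E, msg, rem = gao_decode(list(range(7)), [1, 6, 123, 456, 57, 86, 121], 3)
lead_inv = pow(E[-1], -1, P)
E_monic = [c * lead_inv % P for c in E]
print(Q, E)
print(E_monic, msg)  # [6, 924, 1] [1, 2, 3]
```

The first printed line should correspond to the R_2 and A_2 row of the table, read low-order first. An empty remainder indicates that decoding succeeded.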

See also

reed, solomon, error, correction, reed, solomon, codes, group, error, correcting, codes, that, were, introduced, irving, reed, gustave, solomon, 1960, they, have, many, applications, including, consumer, technologies, such, minidiscs, dvds, discs, codes, data,. Reed Solomon codes are a group of error correcting codes that were introduced by Irving S Reed and Gustave Solomon in 1960 1 They have many applications including consumer technologies such as MiniDiscs CDs DVDs Blu ray discs QR codes Data Matrix data transmission technologies such as DSL and WiMAX broadcast systems such as satellite communications DVB and ATSC and storage systems such as RAID 6 Reed Solomon codesNamed afterIrving S Reed and Gustave SolomonClassificationHierarchyLinear block codePolynomial codeReed Solomon codeBlock lengthnMessage lengthkDistancen k 1Alphabet sizeq pm n p prime Often n q 1 Notation n k n k 1 q codeAlgorithmsBerlekamp MasseyEuclideanet al PropertiesMaximum distance separable codevte Reed Solomon codes operate on a block of data treated as a set of finite field elements called symbols Reed Solomon codes are able to detect and correct multiple symbol errors By adding t n k check symbols to the data a Reed Solomon code can detect but not correct any combination of up to t erroneous symbols or locate and correct up to t 2 erroneous symbols at unknown locations As an erasure code it can correct up to t erasures at locations that are known and provided to the algorithm or it can detect and correct combinations of errors and erasures Reed Solomon codes are also suitable as multiple burst bit error correcting codes since a sequence of b 1 consecutive bit errors can affect at most two symbols of size b The choice of t is up to the designer of the code and may be selected within wide limits There are two basic types of Reed Solomon codes original view and BCH view with BCH view being the most common as BCH view decoders are faster and require less working storage than original view 
decoders Contents 1 History 2 Applications 2 1 Data storage 2 2 Bar code 2 3 Data transmission 2 4 Space transmission 3 Constructions encoding 3 1 Reed amp Solomon s original view The codeword as a sequence of values 3 1 1 Simple encoding procedure The message as a sequence of coefficients 3 1 2 Systematic encoding procedure The message as an initial sequence of values 3 1 3 Discrete Fourier transform and its inverse 3 2 The BCH view The codeword as a sequence of coefficients 3 2 1 Systematic encoding procedure 4 Properties 4 1 Remarks 5 BCH view decoders 5 1 Peterson Gorenstein Zierler decoder 5 1 1 Formulation 5 1 2 Syndrome decoding 5 1 3 Error locators and error values 5 1 4 Error locator polynomial 5 1 5 Find the roots of the error locator polynomial 5 1 6 Calculate the error values 5 1 7 Calculate the error locations 5 1 8 Fix the errors 5 1 9 Example 5 2 Berlekamp Massey decoder 5 2 1 Example 5 3 Euclidean decoder 5 3 1 Example 5 4 Decoder using discrete Fourier transform 5 5 Decoding beyond the error correction bound 5 6 Soft decoding 5 7 MATLAB example 5 7 1 Encoder 5 7 2 Decoder 6 Reed Solomon original view decoders 6 1 Theoretical decoder 6 2 Berlekamp Welch decoder 6 2 1 Example 6 3 Gao decoder 6 3 1 Example 7 See also 8 Notes 9 References 10 Further reading 11 External links 11 1 Information and tutorials 11 2 ImplementationsHistory editReed Solomon codes were developed in 1960 by Irving S Reed and Gustave Solomon who were then staff members of MIT Lincoln Laboratory Their seminal article was titled Polynomial Codes over Certain Finite Fields Reed amp Solomon 1960 The original encoding scheme described in the Reed amp Solomon article used a variable polynomial based on the message to be encoded where only a fixed set of values evaluation points to be encoded are known to encoder and decoder The original theoretical decoder generated potential polynomials based on subsets of k unencoded message length out of n encoded message length values of a received 
message choosing the most popular polynomial as the correct one which was impractical for all but the simplest of cases This was initially resolved by changing the original scheme to a BCH code like scheme based on a fixed polynomial known to both encoder and decoder but later practical decoders based on the original scheme were developed although slower than the BCH schemes The result of this is that there are two main types of Reed Solomon codes ones that use the original encoding scheme and ones that use the BCH encoding scheme Also in 1960 a practical fixed polynomial decoder for BCH codes developed by Daniel Gorenstein and Neal Zierler was described in an MIT Lincoln Laboratory report by Zierler in January 1960 and later in a paper in June 1961 2 The Gorenstein Zierler decoder and the related work on BCH codes are described in a book Error Correcting Codes by W Wesley Peterson 1961 3 By 1963 or possibly earlier J J Stone and others recognized that Reed Solomon codes could use the BCH scheme of using a fixed generator polynomial making such codes a special class of BCH codes 4 but Reed Solomon codes based on the original encoding scheme are not a class of BCH codes and depending on the set of evaluation points they are not even cyclic codes In 1969 an improved BCH scheme decoder was developed by Elwyn Berlekamp and James Massey and has since been known as the Berlekamp Massey decoding algorithm In 1975 another improved BCH scheme decoder was developed by Yasuo Sugiyama based on the extended Euclidean algorithm 5 nbsp In 1977 Reed Solomon codes were implemented in the Voyager program in the form of concatenated error correction codes The first commercial application in mass produced consumer products appeared in 1982 with the compact disc where two interleaved Reed Solomon codes are used Today Reed Solomon codes are widely implemented in digital storage devices and digital communication standards though they are being slowly replaced by Bose Chaudhuri 
Hocquenghem BCH codes For example Reed Solomon codes are used in the Digital Video Broadcasting DVB standard DVB S in conjunction with a convolutional inner code but BCH codes are used with LDPC in its successor DVB S2 In 1986 an original scheme decoder known as the Berlekamp Welch algorithm was developed In 1996 variations of original scheme decoders called list decoders or soft decoders were developed by Madhu Sudan and others and work continues on these types of decoders see Guruswami Sudan list decoding algorithm In 2002 another original scheme decoder was developed by Shuhong Gao based on the extended Euclidean algorithm 6 Applications editData storage edit Reed Solomon coding is very widely used in mass storage systems to correct the burst errors associated with media defects Reed Solomon coding is a key component of the compact disc It was the first use of strong error correction coding in a mass produced consumer product and DAT and DVD use similar schemes In the CD two layers of Reed Solomon coding separated by a 28 way convolutional interleaver yields a scheme called Cross Interleaved Reed Solomon Coding CIRC The first element of a CIRC decoder is a relatively weak inner 32 28 Reed Solomon code shortened from a 255 251 code with 8 bit symbols This code can correct up to 2 byte errors per 32 byte block More importantly it flags as erasures any uncorrectable blocks i e blocks with more than 2 byte errors The decoded 28 byte blocks with erasure indications are then spread by the deinterleaver to different blocks of the 28 24 outer code Thanks to the deinterleaving an erased 28 byte block from the inner code becomes a single erased byte in each of 28 outer code blocks The outer code easily corrects this since it can handle up to 4 such erasures per block The result is a CIRC that can completely correct error bursts up to 4000 bits or about 2 5 mm on the disc surface This code is so strong that most CD playback errors are almost certainly caused by tracking 
errors that cause the laser to jump track not by uncorrectable error bursts 7 DVDs use a similar scheme but with much larger blocks a 208 192 inner code and a 182 172 outer code Reed Solomon error correction is also used in parchive files which are commonly posted accompanying multimedia files on USENET The distributed online storage service Wuala discontinued in 2015 also used Reed Solomon when breaking up files Bar code edit Almost all two dimensional bar codes such as PDF 417 MaxiCode Datamatrix QR Code and Aztec Code use Reed Solomon error correction to allow correct reading even if a portion of the bar code is damaged When the bar code scanner cannot recognize a bar code symbol it will treat it as an erasure Reed Solomon coding is less common in one dimensional bar codes but is used by the PostBar symbology Data transmission edit Specialized forms of Reed Solomon codes specifically Cauchy RS and Vandermonde RS can be used to overcome the unreliable nature of data transmission over erasure channels The encoding process assumes a code of RS N K which results in N codewords of length N symbols each storing K symbols of data being generated that are then sent over an erasure channel Any combination of K codewords received at the other end is enough to reconstruct all of the N codewords The code rate is generally set to 1 2 unless the channel s erasure likelihood can be adequately modelled and is seen to be less In conclusion N is usually 2K meaning that at least half of all the codewords sent must be received in order to reconstruct all of the codewords sent Reed Solomon codes are also used in xDSL systems and CCSDS s Space Communications Protocol Specifications as a form of forward error correction Space transmission edit nbsp Deep space concatenated coding system 8 Notation RS 255 223 CC constraint length 7 code rate 1 2 One significant application of Reed Solomon coding was to encode the digital pictures sent back by the Voyager program Voyager introduced Reed 
Solomon coding concatenated with convolutional codes a practice that has since become very widespread in deep space and satellite e g direct digital broadcasting communications Viterbi decoders tend to produce errors in short bursts Correcting these burst errors is a job best done by short or simplified Reed Solomon codes Modern versions of concatenated Reed Solomon Viterbi decoded convolutional coding were and are used on the Mars Pathfinder Galileo Mars Exploration Rover and Cassini missions where they perform within about 1 1 5 dB of the ultimate limit the Shannon capacity These concatenated codes are now being replaced by more powerful turbo codes Channel coding schemes used by NASA missions 9 Years Code Mission s 1958 present Uncoded Explorer Mariner many others 1968 1978 convolutional codes CC 25 1 2 Pioneer Venus 1969 1975 Reed Muller code 32 6 Mariner Viking 1977 present Binary Golay code Voyager 1977 present RS 255 223 CC 7 1 2 Voyager Galileo many others 1989 2003 RS 255 223 CC 7 1 3 Voyager 1989 2003 RS 255 223 CC 14 1 4 Galileo 1996 present RS CC 15 1 6 Cassini Mars Pathfinder others 2004 present Turbo codes nb 1 Messenger Stereo MRO others est 2009 LDPC codes Constellation MSLConstructions encoding editThe Reed Solomon code is actually a family of codes where every code is characterised by three parameters an alphabet size q displaystyle q nbsp a block length n displaystyle n nbsp and a message length k displaystyle k nbsp with k lt n q displaystyle k lt n leq q nbsp The set of alphabet symbols is interpreted as the finite field F displaystyle F nbsp of order q displaystyle q nbsp and thus q displaystyle q nbsp must be a prime power In the most useful parameterizations of the Reed Solomon code the block length is usually some constant multiple of the message length that is the rate R k n displaystyle R frac k n nbsp is some constant and furthermore the block length is equal to or one less than the alphabet size that is n q displaystyle n q nbsp or n q 
1 displaystyle n q 1 nbsp citation needed Reed amp Solomon s original view The codeword as a sequence of values edit There are different encoding procedures for the Reed Solomon code and thus there are different ways to describe the set of all codewords In the original view of Reed amp Solomon 1960 every codeword of the Reed Solomon code is a sequence of function values of a polynomial of degree less than k displaystyle k nbsp In order to obtain a codeword of the Reed Solomon code the message symbols each within the q sized alphabet are treated as the coefficients of a polynomial p displaystyle p nbsp of degree less than k over the finite field F displaystyle F nbsp with q displaystyle q nbsp elements In turn the polynomial p is evaluated at n q distinct points a 1 a n displaystyle a 1 dots a n nbsp of the field F and the sequence of values is the corresponding codeword Common choices for a set of evaluation points include 0 1 2 n 1 0 1 a a2 an 2 or for n lt q 1 a a2 an 1 where a is a primitive element of F Formally the set C displaystyle mathbf C nbsp of codewords of the Reed Solomon code is defined as follows C p a 1 p a 2 p a n p is a polynomial over F of degree lt k displaystyle mathbf C Bigl bigl p a 1 p a 2 dots p a n bigr Big p text is a polynomial over F text of degree lt k Bigr nbsp Since any two distinct polynomials of degree less than k displaystyle k nbsp agree in at most k 1 displaystyle k 1 nbsp points this means that any two codewords of the Reed Solomon code disagree in at least n k 1 n k 1 displaystyle n k 1 n k 1 nbsp positions Furthermore there are two polynomials that do agree in k 1 displaystyle k 1 nbsp points but are not equal and thus the distance of the Reed Solomon code is exactly d n k 1 displaystyle d n k 1 nbsp Then the relative distance is d d n 1 k n 1 n 1 R 1 n 1 R displaystyle delta d n 1 k n 1 n 1 R 1 n sim 1 R nbsp where R k n displaystyle R k n nbsp is the rate This trade off between the relative distance and the rate is 
asymptotically optimal since by the Singleton bound every code satisfies d R 1 1 n displaystyle delta R leq 1 1 n nbsp Being a code that achieves this optimal trade off the Reed Solomon code belongs to the class of maximum distance separable codes While the number of different polynomials of degree less than k and the number of different messages are both equal to q k displaystyle q k nbsp and thus every message can be uniquely mapped to such a polynomial there are different ways of doing this encoding The original construction of Reed amp Solomon 1960 interprets the message x as the coefficients of the polynomial p whereas subsequent constructions interpret the message as the values of the polynomial at the first k points a 1 a k displaystyle a 1 dots a k nbsp and obtain the polynomial p by interpolating these values with a polynomial of degree less than k The latter encoding procedure while being slightly less efficient has the advantage that it gives rise to a systematic code that is the original message is always contained as a subsequence of the codeword Simple encoding procedure The message as a sequence of coefficients edit In the original construction of Reed amp Solomon 1960 the message m m 0 m k 1 F k displaystyle m m 0 dots m k 1 in F k nbsp is mapped to the polynomial p m displaystyle p m nbsp withp m a i 0 k 1 m i a i displaystyle p m a sum i 0 k 1 m i a i nbsp The codeword of m displaystyle m nbsp is obtained by evaluating p m displaystyle p m nbsp at n displaystyle n nbsp different points a 0 a n 1 displaystyle a 0 dots a n 1 nbsp of the field F displaystyle F nbsp Thus the classical encoding function C F k F n displaystyle C F k to F n nbsp for the Reed Solomon code is defined as follows C m p m a 0 p m a 1 p m a n 1 displaystyle C m begin bmatrix p m a 0 p m a 1 cdots p m a n 1 end bmatrix nbsp This function C displaystyle C nbsp is a linear mapping that is it satisfies C m A m displaystyle C m Am nbsp for the following n k displaystyle n times k 
nbsp matrix A displaystyle A nbsp with elements from F displaystyle F nbsp C m A m 1 a 0 a 0 2 a 0 k 1 1 a 1 a 1 2 a 1 k 1 1 a n 1 a n 1 2 a n 1 k 1 m 0 m 1 m k 1 displaystyle C m Am begin bmatrix 1 amp a 0 amp a 0 2 amp dots amp a 0 k 1 1 amp a 1 amp a 1 2 amp dots amp a 1 k 1 vdots amp vdots amp vdots amp ddots amp vdots 1 amp a n 1 amp a n 1 2 amp dots amp a n 1 k 1 end bmatrix begin bmatrix m 0 m 1 vdots m k 1 end bmatrix nbsp This matrix is a Vandermonde matrix over F displaystyle F nbsp In other words the Reed Solomon code is a linear code and in the classical encoding procedure its generator matrix is A displaystyle A nbsp Systematic encoding procedure The message as an initial sequence of values edit There is an alternative encoding procedure that produces a systematic Reed Solomon code Here we use a different polynomial p m displaystyle p m nbsp In this variant the polynomial p m displaystyle p m nbsp is defined as the unique polynomial of degree less than k displaystyle k nbsp such thatp m a i m i for all i 0 k 1 displaystyle p m a i m i text for all i in 0 dots k 1 nbsp To compute this polynomial p m displaystyle p m nbsp from m displaystyle m nbsp one can use Lagrange interpolation Once it has been found it is evaluated at the other points a k a n 1 displaystyle a k dots a n 1 nbsp C m p m a 0 p m a 1 p m a n 1 displaystyle C m begin bmatrix p m a 0 p m a 1 cdots p m a n 1 end bmatrix nbsp This variant is systematic since the first k displaystyle k nbsp entries p m a 0 p m a k 1 displaystyle p m a 0 dots p m a k 1 nbsp are exactly m 0 m k 1 displaystyle m 0 dots m k 1 nbsp by the definition of p m displaystyle p m nbsp Discrete Fourier transform and its inverse edit A discrete Fourier transform is essentially the same as the encoding procedure it uses the generator polynomial p m displaystyle p m nbsp to map a set of evaluation points into the message values as shown above C m p m a 0 p m a 1 p m a n 1 displaystyle C m begin bmatrix p m a 0 p m a 1 
cdots p m a n 1 end bmatrix nbsp The inverse Fourier transform could be used to convert an error free set of n lt q message values back into the encoding polynomial of k coefficients with the constraint that in order for this to work the set of evaluation points used to encode the message must be a set of increasing powers of a a i a i displaystyle a i alpha i nbsp a 0 a n 1 1 a a 2 a n 1 displaystyle a 0 dots a n 1 1 alpha alpha 2 dots alpha n 1 nbsp However Lagrange interpolation performs the same conversion without the constraint on the set of evaluation points or the requirement of an error free set of message values and is used for systematic encoding and in one of the steps of the Gao decoder The BCH view The codeword as a sequence of coefficients edit In this view the message is interpreted as the coefficients of a polynomial p x displaystyle p x nbsp The sender computes a related polynomial s x displaystyle s x nbsp of degree n 1 displaystyle n 1 nbsp where n q 1 displaystyle n leq q 1 nbsp and sends the polynomial s x displaystyle s x nbsp The polynomial s x displaystyle s x nbsp is constructed by multiplying the message polynomial p x displaystyle p x nbsp which has degree k 1 displaystyle k 1 nbsp with a generator polynomial g x displaystyle g x nbsp of degree n k displaystyle n k nbsp that is known to both the sender and the receiver The generator polynomial g x displaystyle g x nbsp is defined as the polynomial whose roots are sequential powers of the Galois field primitive a displaystyle alpha nbsp g x x a i x a i 1 x a i n k 1 g 0 g 1 x g n k 1 x n k 1 x n k displaystyle g x left x alpha i right left x alpha i 1 right cdots left x alpha i n k 1 right g 0 g 1 x cdots g n k 1 x n k 1 x n k nbsp For a narrow sense code i 1 displaystyle i 1 nbsp C s 1 s 2 s n s a i 1 n s i a i is a polynomial that has at least the roots a 1 a 2 a n k displaystyle mathbf C left left s 1 s 2 dots s n right Big s a sum i 1 n s i a i text is a polynomial that has at least 
the roots alpha 1 alpha 2 dots alpha n k right nbsp Systematic encoding procedure edit The encoding procedure for the BCH view of Reed Solomon codes can be modified to yield a systematic encoding procedure in which each codeword contains the message as a prefix and simply appends error correcting symbols as a suffix Here instead of sending s x p x g x displaystyle s x p x g x nbsp the encoder constructs the transmitted polynomial s x displaystyle s x nbsp such that the coefficients of the k displaystyle k nbsp largest monomials are equal to the corresponding coefficients of p x displaystyle p x nbsp and the lower order coefficients of s x displaystyle s x nbsp are chosen exactly in such a way that s x displaystyle s x nbsp becomes divisible by g x displaystyle g x nbsp Then the coefficients of p x displaystyle p x nbsp are a subsequence of the coefficients of s x displaystyle s x nbsp To get a code that is overall systematic we construct the message polynomial p x displaystyle p x nbsp by interpreting the message as the sequence of its coefficients Formally the construction is done by multiplying p x displaystyle p x nbsp by x t displaystyle x t nbsp to make room for the t n k displaystyle t n k nbsp check symbols dividing that product by g x displaystyle g x nbsp to find the remainder and then compensating for that remainder by subtracting it The t displaystyle t nbsp check symbols are created by computing the remainder s r x displaystyle s r x nbsp s r x p x x t mod g x displaystyle s r x p x cdot x t bmod g x nbsp The remainder has degree at most t 1 displaystyle t 1 nbsp whereas the coefficients of x t 1 x t 2 x 1 x 0 displaystyle x t 1 x t 2 dots x 1 x 0 nbsp in the polynomial p x x t displaystyle p x cdot x t nbsp are zero Therefore the following definition of the codeword s x displaystyle s x nbsp has the property that the first k displaystyle k nbsp coefficients are identical to the coefficients of p x displaystyle p x nbsp s x p x x t s r x displaystyle s 
As a result, the codewords $s(x)$ are indeed elements of $\mathbf{C}$, that is, they are divisible by the generator polynomial $g(x)$:[10]

$$s(x) \equiv p(x) \cdot x^t - s_r(x) \equiv s_r(x) - s_r(x) \equiv 0 \mod g(x).$$

Properties

The Reed–Solomon code is a $[n, k, n - k + 1]$ code; in other words, it is a linear block code of length $n$ (over $F$) with dimension $k$ and minimum Hamming distance $d_{\min} = n - k + 1$. The Reed–Solomon code is optimal in the sense that the minimum distance has the maximum value possible for a linear code of size $(n, k)$; this is known as the Singleton bound. Such a code is also called a maximum distance separable (MDS) code.

The error-correcting ability of a Reed–Solomon code is determined by its minimum distance, or equivalently, by $n - k$, the measure of redundancy in the block. If the locations of the error symbols are not known in advance, then a Reed–Solomon code can correct up to $(n - k)/2$ erroneous symbols, i.e., it can correct half as many errors as there are redundant symbols added to the block. Sometimes error locations are known in advance (e.g., side information in demodulator signal-to-noise ratios); these are called erasures. A Reed–Solomon code (like any MDS code) is able to correct twice as many erasures as errors, and any combination of errors and erasures can be corrected as long as the relation $2E + S \le n - k$ is satisfied, where $E$ is the number of errors and $S$ is the number of erasures in the block.

[Figure: Theoretical BER performance of the Reed–Solomon code (N=255, K=233, QPSK, AWGN). Step-like characteristic.]

The theoretical error bound can be described via the following formula for the AWGN channel for FSK:[11]

$$P_b \approx \frac{2^{m-1}}{2^m - 1} \frac{1}{n} \sum_{\ell = t+1}^{n} \ell \binom{n}{\ell} P_s^{\ell} (1 - P_s)^{n - \ell}$$
and for other modulation schemes:

$$P_b \approx \frac{1}{m} \frac{1}{n} \sum_{\ell = t+1}^{n} \ell \binom{n}{\ell} P_s^{\ell} (1 - P_s)^{n - \ell}$$

where $t = \frac{1}{2}(d_{\min} - 1)$, $P_s = 1 - (1 - s)^h$, $h = \frac{m}{\log_2 M}$, $s$ is the symbol error rate in the uncoded AWGN case and $M$ is the modulation order.

For practical uses of Reed–Solomon codes, it is common to use a finite field $F$ with $2^m$ elements. In this case, each symbol can be represented as an $m$-bit value. The sender sends the data points as encoded blocks, and the number of symbols in the encoded block is $n = 2^m - 1$. Thus a Reed–Solomon code operating on 8-bit symbols has $n = 2^8 - 1 = 255$ symbols per block. (This is a very popular value because of the prevalence of byte-oriented computer systems.) The number $k$, with $k < n$, of data symbols in the block is a design parameter. A commonly used code encodes $k = 223$ eight-bit data symbols plus 32 eight-bit parity symbols in an $n = 255$-symbol block; this is denoted as a $(n, k) = (255, 223)$ code, and is capable of correcting up to 16 symbol errors per block.

The Reed–Solomon code properties discussed above make them especially well-suited to applications where errors occur in bursts. This is because it does not matter to the code how many bits in a symbol are in error: if multiple bits in a symbol are corrupted, it only counts as a single error. Conversely, if a data stream is not characterized by error bursts or drop-outs but by random single-bit errors, a Reed–Solomon code is usually a poor choice compared to a binary code.

The Reed–Solomon code, like the convolutional code, is a transparent code. This means that if the channel symbols have been
inverted somewhere along the line, the decoders will still operate. The result will be the inversion of the original data. However, the Reed–Solomon code loses its transparency when the code is shortened. The "missing" bits in a shortened code need to be filled by either zeros or ones, depending on whether the data is complemented or not. (To put it another way, if the symbols are inverted, then the zero-fill needs to be inverted to a one-fill.) For this reason it is mandatory that the sense of the data (i.e., true or complemented) be resolved before Reed–Solomon decoding.

Whether the Reed–Solomon code is cyclic or not depends on subtle details of the construction. In the original view of Reed and Solomon, where the codewords are the values of a polynomial, one can choose the sequence of evaluation points in such a way as to make the code cyclic. In particular, if $\alpha$ is a primitive root of the field $F$, then by definition all non-zero elements of $F$ take the form $\alpha^i$ for $i \in \{1, \dots, q - 1\}$, where $q = |F|$. Each polynomial $p$ over $F$ gives rise to a codeword $(p(\alpha^1), \dots, p(\alpha^{q-1}))$. Since the function $a \mapsto p(\alpha a)$ is also a polynomial of the same degree, this function gives rise to a codeword $(p(\alpha^2), \dots, p(\alpha^q))$; since $\alpha^q = \alpha^1$ holds, this codeword is the cyclic left-shift of the original codeword derived from $p$. So choosing a sequence of primitive root powers as the evaluation points makes the original-view Reed–Solomon code cyclic. Reed–Solomon codes in the BCH view are always cyclic because BCH codes are cyclic.

Remarks

Designers are not required to use the "natural" sizes of Reed–Solomon code blocks. A technique known as "shortening" can produce a smaller code of any desired size
from a larger code. For example, the widely used (255,223) code can be converted to a (160,128) code by padding the unused portion of the source block with 95 binary zeroes and not transmitting them. At the decoder, the same portion of the block is loaded locally with binary zeroes.

The Delsarte–Goethals–Seidel[12] theorem illustrates an example of an application of shortened Reed–Solomon codes. In parallel to shortening, a technique known as puncturing allows omitting some of the encoded parity symbols.

BCH view decoders

The decoders described in this section use the BCH view of a codeword as a sequence of coefficients. They use a fixed generator polynomial known to both encoder and decoder.

Peterson–Gorenstein–Zierler decoder

Main article: Peterson–Gorenstein–Zierler algorithm

Daniel Gorenstein and Neal Zierler developed a decoder that was described in a MIT Lincoln Laboratory report by Zierler in January 1960 and later in a paper in June 1961.[13] The Gorenstein–Zierler decoder and the related work on BCH codes are described in the book Error-Correcting Codes by W. Wesley Peterson (1961).[14]

Formulation

The transmitted message, $(c_0, \ldots, c_i, \ldots, c_{n-1})$, is viewed as the coefficients of a polynomial $s(x)$:

$$s(x) = \sum_{i=0}^{n-1} c_i x^i.$$

As a result of the Reed–Solomon encoding procedure, $s(x)$ is divisible by the generator polynomial $g(x)$:

$$g(x) = \prod_{j=1}^{n-k} (x - \alpha^j),$$

where $\alpha$ is a primitive element.

Since $s(x)$ is a multiple of the generator $g(x)$, it follows that it "inherits" all its roots:

$$s(x) \bmod (x - \alpha^j) = g(x) \bmod (x - \alpha^j) = 0.$$

Therefore,

$$s(\alpha^j) = 0, \quad j = 1, 2, \ldots, n - k.$$

The transmitted polynomial is corrupted in transit by an error polynomial $e(x)$ to produce the received polynomial $r(x)$:

$$r(x) = s(x) + e(x), \qquad e(x) = \sum_{i=0}^{n-1} e_i x^i.$$

Coefficient $e_i$ will be zero
if there is no error at that power of $x$, and nonzero if there is an error. If there are $\nu$ errors at distinct powers $i_k$ of $x$, then

$$e(x) = \sum_{k=1}^{\nu} e_{i_k} x^{i_k}.$$

The goal of the decoder is to find the number of errors ($\nu$), the positions of the errors ($i_k$), and the error values at those positions ($e_{i_k}$). From those, $e(x)$ can be calculated and subtracted from $r(x)$ to get the originally sent message $s(x)$.

Syndrome decoding

The decoder starts by evaluating the polynomial as received at the points $\alpha^1, \dots, \alpha^{n-k}$. We call the results of that evaluation the "syndromes" $S_j$. They are defined as

$$\begin{aligned}
S_j &= r(\alpha^j) = s(\alpha^j) + e(\alpha^j) = 0 + e(\alpha^j) \\
&= e(\alpha^j) \\
&= \sum_{k=1}^{\nu} e_{i_k} \left(\alpha^j\right)^{i_k}, \quad j = 1, 2, \ldots, n - k.
\end{aligned}$$

Note that $s(\alpha^j) = 0$ because $s(x)$ has roots at $\alpha^j$, as shown in the previous section.

The advantage of looking at the syndromes is that the message polynomial drops out. In other words, the syndromes only relate to the error and are unaffected by the actual contents of the message being transmitted. If the syndromes are all zero, the algorithm stops here and reports that the message was not corrupted in transit.

Error locators and error values

For convenience, define the error locators $X_k$ and error values $Y_k$ as

$$X_k = \alpha^{i_k}, \quad Y_k = e_{i_k}.$$

Then the syndromes can be written in terms of these error locators and error values as

$$S_j = \sum_{k=1}^{\nu} Y_k X_k^j.$$

This definition of the syndrome values is equivalent to the previous one, since $(\alpha^j)^{i_k} = \alpha^{j \cdot i_k} = (\alpha^{i_k})^j = X_k^j$.

The syndromes give a system of $n - k \ge 2\nu$ equations in $2\nu$ unknowns, but that system of equations is nonlinear in the $X_k$ and does not have an obvious solution.
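The syndrome computation can be illustrated with a minimal Python sketch. It assumes the prime field GF(929) with $\alpha = 3$ and $n - k = 4$, and borrows the codeword and received word from the worked example later in this article; the helper names `poly_eval` and `syndromes` are illustrative only and do not come from any library.

```python
# Syndromes S_j = r(alpha^j) over the prime field GF(929).
# Assumption: the field is prime, so field arithmetic is plain integer
# arithmetic modulo P (this would not hold for GF(2^m)).
P = 929       # field size
ALPHA = 3     # primitive element used in the article's example
N_CHECK = 4   # number of check symbols, n - k


def poly_eval(coeffs, x, p=P):
    """Evaluate a polynomial (highest-degree coefficient first) at x by Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % p
    return acc


def syndromes(word, n_check=N_CHECK, alpha=ALPHA, p=P):
    """Return [S_1, ..., S_{n-k}] for a word given as polynomial coefficients."""
    return [poly_eval(word, pow(alpha, j, p), p) for j in range(1, n_check + 1)]


sent = [3, 2, 1, 382, 191, 487, 474]        # s(x), a valid codeword
received = [3, 2, 123, 456, 191, 487, 474]  # r(x), two symbols corrupted

print(syndromes(sent))      # [0, 0, 0, 0]
print(syndromes(received))  # [732, 637, 762, 925]
```

An all-zero syndrome list for `sent` reflects that a valid codeword is divisible by the generator polynomial; the nonzero values for `received` depend only on the error pattern.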
However, if the $X_k$ were known (see below), then the syndrome equations provide a linear system of equations that can easily be solved for the $Y_k$ error values:

$$\begin{bmatrix}
X_1^1 & X_2^1 & \cdots & X_\nu^1 \\
X_1^2 & X_2^2 & \cdots & X_\nu^2 \\
\vdots & \vdots & \ddots & \vdots \\
X_1^{n-k} & X_2^{n-k} & \cdots & X_\nu^{n-k}
\end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_\nu \end{bmatrix}
=
\begin{bmatrix} S_1 \\ S_2 \\ \vdots \\ S_{n-k} \end{bmatrix}.$$

Consequently, the problem is finding the $X_k$, because then the leftmost matrix would be known, and both sides of the equation could be multiplied by its inverse, yielding $Y_k$.

In the variant of this algorithm where the locations of the errors are already known (when it is being used as an erasure code), this is the end. The error locations ($X_k$) are already known by some other method (for example, in an FM transmission, the sections where the bitstream was unclear or overcome with interference are probabilistically determinable from frequency analysis). In this scenario, up to $n - k$ errors can be corrected.

The rest of the algorithm serves to locate the errors and will require syndrome values up to $2\nu$, instead of just the $\nu$ used thus far. This is why twice as many error-correcting symbols need to be added as can be corrected without knowing their locations.

Error locator polynomial

There is a linear recurrence relation that gives rise to a system of linear equations. Solving those equations identifies the error locations $X_k$.

Define the error locator polynomial $\Lambda(x)$ as

$$\Lambda(x) = \prod_{k=1}^{\nu} (1 - x X_k) = 1 + \Lambda_1 x^1 + \Lambda_2 x^2 + \cdots + \Lambda_\nu x^\nu.$$

The zeros of $\Lambda(x)$ are the reciprocals $X_k^{-1}$. This follows from the above product notation construction, since if $x = X_k^{-1}$, then one of the multiplied terms will
be zero, $(1 - X_k^{-1} \cdot X_k) = 1 - 1 = 0$, making the whole polynomial evaluate to zero:

$$\Lambda(X_k^{-1}) = 0.$$

Let $j$ be any integer such that $1 \le j \le \nu$. Multiply both sides by $Y_k X_k^{j+\nu}$, and it will still be zero:

$$\begin{aligned}
& Y_k X_k^{j+\nu} \Lambda(X_k^{-1}) = 0, \\
& Y_k X_k^{j+\nu} \left(1 + \Lambda_1 X_k^{-1} + \Lambda_2 X_k^{-2} + \cdots + \Lambda_\nu X_k^{-\nu}\right) = 0, \\
& Y_k X_k^{j+\nu} + \Lambda_1 Y_k X_k^{j+\nu} X_k^{-1} + \Lambda_2 Y_k X_k^{j+\nu} X_k^{-2} + \cdots + \Lambda_\nu Y_k X_k^{j+\nu} X_k^{-\nu} = 0, \\
& Y_k X_k^{j+\nu} + \Lambda_1 Y_k X_k^{j+\nu-1} + \Lambda_2 Y_k X_k^{j+\nu-2} + \cdots + \Lambda_\nu Y_k X_k^{j} = 0.
\end{aligned}$$

Sum for $k = 1$ to $\nu$, and it will still be zero:

$$\sum_{k=1}^{\nu} \left( Y_k X_k^{j+\nu} + \Lambda_1 Y_k X_k^{j+\nu-1} + \Lambda_2 Y_k X_k^{j+\nu-2} + \cdots + \Lambda_\nu Y_k X_k^{j} \right) = 0.$$

Collect each term into its own sum:

$$\left(\sum_{k=1}^{\nu} Y_k X_k^{j+\nu}\right) + \left(\sum_{k=1}^{\nu} \Lambda_1 Y_k X_k^{j+\nu-1}\right) + \left(\sum_{k=1}^{\nu} \Lambda_2 Y_k X_k^{j+\nu-2}\right) + \cdots + \left(\sum_{k=1}^{\nu} \Lambda_\nu Y_k X_k^{j}\right) = 0.$$

Extract the constant values of $\Lambda$ that are unaffected by the summation:

$$\left(\sum_{k=1}^{\nu} Y_k X_k^{j+\nu}\right) + \Lambda_1 \left(\sum_{k=1}^{\nu} Y_k X_k^{j+\nu-1}\right) + \Lambda_2 \left(\sum_{k=1}^{\nu} Y_k X_k^{j+\nu-2}\right) + \cdots + \Lambda_\nu \left(\sum_{k=1}^{\nu} Y_k X_k^{j}\right) = 0.$$

These summations are now equivalent to the syndrome values, which we know and can substitute in. This therefore reduces to

$$S_{j+\nu} + \Lambda_1 S_{j+\nu-1} + \cdots + \Lambda_{\nu-1} S_{j+1} + \Lambda_\nu S_j = 0.$$
Subtracting $S_{j+\nu}$ from both sides yields

$$S_j \Lambda_\nu + S_{j+1} \Lambda_{\nu-1} + \cdots + S_{j+\nu-1} \Lambda_1 = -S_{j+\nu}.$$

Recall that $j$ was chosen to be any integer between 1 and $\nu$ inclusive, and this equivalence is true for all such values. Therefore, we have $\nu$ linear equations, not just one. This system of linear equations can therefore be solved for the coefficients $\Lambda_i$ of the error locator polynomial:

$$\begin{bmatrix}
S_1 & S_2 & \cdots & S_\nu \\
S_2 & S_3 & \cdots & S_{\nu+1} \\
\vdots & \vdots & \ddots & \vdots \\
S_\nu & S_{\nu+1} & \cdots & S_{2\nu-1}
\end{bmatrix}
\begin{bmatrix} \Lambda_\nu \\ \Lambda_{\nu-1} \\ \vdots \\ \Lambda_1 \end{bmatrix}
=
\begin{bmatrix} -S_{\nu+1} \\ -S_{\nu+2} \\ \vdots \\ -S_{\nu+\nu} \end{bmatrix}.$$

The above assumes that the decoder knows the number of errors $\nu$, but that number has not been determined yet. The PGZ decoder does not determine $\nu$ directly but rather searches for it by trying successive values. The decoder first assumes the largest value for a trial $\nu$ and sets up the linear system for that value. If the equations can be solved (i.e., the matrix determinant is nonzero), then that trial value is the number of errors. If the linear system cannot be solved, then the trial $\nu$ is reduced by one and the next smaller system is examined (Gill n.d., p. 35).

Find the roots of the error locator polynomial

Use the coefficients $\Lambda_i$ found in the last step to build the error locator polynomial. The roots of the error locator polynomial can be found by exhaustive search. The error locators $X_k$ are the reciprocals of those roots. The order of coefficients of the error locator polynomial can be reversed, in which case the roots of that reversed polynomial are the error locators $X_k$ (not their reciprocals $X_k^{-1}$).
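The exhaustive root search can be sketched in a few lines of Python. The sketch assumes the prime field GF(929) with $\alpha = 3$ and uses the locator polynomial $\Lambda(x) = 329x^2 + 821x + 1$ from the worked example later in this article; the function names are illustrative, and `pow(x, -1, p)` for the modular inverse assumes Python 3.8 or newer.

```python
# Brute-force root search for an error locator polynomial over GF(929).
# Assumption: prime field, so arithmetic is integer arithmetic modulo P.
P = 929
ALPHA = 3


def poly_eval(coeffs, x, p=P):
    """Evaluate a polynomial (highest-degree coefficient first) at x by Horner's rule."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % p
    return acc


def locate_errors(locator, alpha=ALPHA, p=P):
    """Return (roots, error locators X_k, error positions i_k) for Lambda(x)."""
    # Try every nonzero field element as a candidate root.
    roots = [x for x in range(1, p) if poly_eval(locator, x, p) == 0]
    # The error locators are the reciprocals of the roots.
    locators = [pow(x, -1, p) for x in roots]
    # Discrete logarithm base alpha via a precomputed lookup table.
    log_table = {}
    value = 1
    for i in range(p - 1):
        log_table.setdefault(value, i)
        value = value * alpha % p
    positions = sorted(log_table[x] for x in locators)
    return roots, locators, positions


roots, locators, positions = locate_errors([329, 821, 1])
print(roots)      # [562, 757]
print(locators)   # [81, 27]
print(positions)  # [3, 4]
```

The positions 3 and 4 are the powers of $x$ whose coefficients were corrupted. A search over all field elements is affordable for a field this small; larger fields benefit from a structured evaluation order.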
Chien search is an efficient implementation of this step.

Calculate the error values

Once the error locators $X_k$ are known, the error values can be determined. This can be done by direct solution for $Y_k$ in the error equations matrix given above, or using the Forney algorithm.

Calculate the error locations

Calculate $i_k$ by taking the log base $\alpha$ of $X_k$. This is generally done using a precomputed lookup table.

Fix the errors

Finally, $e(x)$ is generated from $i_k$ and $e_{i_k}$ and then is subtracted from $r(x)$ to get the originally sent message $s(x)$, with errors corrected.

Example

Consider the Reed–Solomon code defined in GF(929) with $\alpha = 3$ and $t = 4$ (this is used in PDF417 barcodes) for a RS(7,3) code. The generator polynomial is

$$g(x) = (x - 3)(x - 3^2)(x - 3^3)(x - 3^4) = x^4 + 809x^3 + 723x^2 + 568x + 522.$$

If the message polynomial is $p(x) = 3x^2 + 2x + 1$, then a systematic codeword is encoded as follows:

$$s_r(x) = p(x) \, x^t \bmod g(x) = 547x^3 + 738x^2 + 442x + 455,$$

$$s(x) = p(x) \, x^t - s_r(x) = 3x^6 + 2x^5 + 1x^4 + 382x^3 + 191x^2 + 487x + 474.$$

Errors in transmission might cause this to be received instead:

$$r(x) = s(x) + e(x) = 3x^6 + 2x^5 + 123x^4 + 456x^3 + 191x^2 + 487x + 474.$$

The syndromes are calculated by evaluating $r$ at powers of $\alpha$:

$$S_1 = r(3^1) = 3 \cdot 3^6 + 2 \cdot 3^5 + 123 \cdot 3^4 + 456 \cdot 3^3 + 191 \cdot 3^2 + 487 \cdot 3 + 474 = 732,$$

$$S_2 = r(3^2) = 637, \quad S_3 = r(3^3) = 762, \quad S_4 = r(3^4) = 925,$$

$$\begin{bmatrix} 732 & 637 \\ 637 & 762 \end{bmatrix}
\begin{bmatrix} \Lambda_2 \\ \Lambda_1 \end{bmatrix}
=
\begin{bmatrix} -762 \\ -925 \end{bmatrix}
=
\begin{bmatrix} 167 \\ 004 \end{bmatrix}.$$

Using Gaussian elimination:
$$\begin{bmatrix} 001 & 000 \\ 000 & 001 \end{bmatrix}
\begin{bmatrix} \Lambda_2 \\ \Lambda_1 \end{bmatrix}
=
\begin{bmatrix} 329 \\ 821 \end{bmatrix}.$$

$\Lambda(x) = 329x^2 + 821x + 001$, with roots $x_1 = 757 = 3^{-3}$ and $x_2 = 562 = 3^{-4}$.

The coefficients can be reversed to produce roots with positive exponents, but typically this isn't used:

$R(x) = 001x^2 + 821x + 329$, with roots $27 = 3^3$ and $81 = 3^4$,

with the log of the roots corresponding to the error locations (right to left, location 0 is the last term in the codeword).

To calculate the error values, apply the Forney algorithm:

$$\Omega(x) = S(x) \Lambda(x) \bmod x^4 = 546x + 732,$$

$$\Lambda'(x) = 658x + 821,$$

$$e_1 = -\Omega(x_1) / \Lambda'(x_1) = 074,$$

$$e_2 = -\Omega(x_2) / \Lambda'(x_2) = 122.$$

Subtracting $e_1 x^3 + e_2 x^4 = 74x^3 + 122x^4$ from the received polynomial $r(x)$ reproduces the original codeword $s$.

Berlekamp–Massey decoder

The Berlekamp–Massey algorithm is an alternate iterative procedure for finding the error locator polynomial. During each iteration, it calculates a discrepancy based on a current instance of $\Lambda(x)$ with an assumed number of errors $e$:

$$\Delta = S_i + \Lambda_1 S_{i-1} + \cdots + \Lambda_e S_{i-e}$$

and then adjusts $\Lambda(x)$ and $e$ so that a recalculated $\Delta$ would be zero. The article Berlekamp–Massey algorithm has a detailed description of the procedure. In the following example, $C(x)$ is used to represent $\Lambda(x)$.

Example

Using the same data as the Peterson–Gorenstein–Zierler example above:

| n | S_{n+1} | d | C | B | b | m |
| 0 | 732 | 732 | 197x + 1 | 1 | 732 | 1 |
| 1 | 637 | 846 | 173x + 1 | 1 | 732 | 2 |
| 2 | 762 | 412 | 634x^2 + 173x + 1 | 173x + 1 | 412 | 1 |
| 3 | 925 | 576 | 329x^2 + 821x + 1 | 173x + 1 | 412 | 2 |

The final value of $C$ is the error locator polynomial, $\Lambda(x)$.

Euclidean decoder

Another iterative method for calculating both the error locator polynomial and the error value polynomial is based on Sugiyama's adaptation of the extended Euclidean algorithm.

Define $S(x)$, $\Lambda(x)$, and $\Omega(x)$ for $t$ syndromes and $e$ errors:

$$\begin{aligned}
S(x) &= S_t x^{t-1} + S_{t-1} x^{t-2} + \cdots + S_2 x + S_1, \\
\Lambda(x) &= \Lambda_e x^e + \Lambda_{e-1} x^{e-1} + \cdots + \Lambda_1 x + 1, \\
\Omega(x) &= \Omega_e x^e + \Omega_{e-1} x^{e-1} + \cdots + \Omega_1 x + \Omega_0.
\end{aligned}$$
The key equation is:

$$\Lambda(x) S(x) = Q(x) x^t + \Omega(x).$$

For $t = 6$ and $e = 3$:

$$\begin{bmatrix}
\Lambda_3 S_6 \, x^8 \\
(\Lambda_2 S_6 + \Lambda_3 S_5) \, x^7 \\
(\Lambda_1 S_6 + \Lambda_2 S_5 + \Lambda_3 S_4) \, x^6 \\
(S_6 + \Lambda_1 S_5 + \Lambda_2 S_4 + \Lambda_3 S_3) \, x^5 \\
(S_5 + \Lambda_1 S_4 + \Lambda_2 S_3 + \Lambda_3 S_2) \, x^4 \\
(S_4 + \Lambda_1 S_3 + \Lambda_2 S_2 + \Lambda_3 S_1) \, x^3 \\
(S_3 + \Lambda_1 S_2 + \Lambda_2 S_1) \, x^2 \\
(S_2 + \Lambda_1 S_1) \, x \\
S_1
\end{bmatrix}
=
\begin{bmatrix}
Q_2 x^8 \\ Q_1 x^7 \\ Q_0 x^6 \\ 0 \\ 0 \\ 0 \\ \Omega_2 x^2 \\ \Omega_1 x \\ \Omega_0
\end{bmatrix}.$$

The middle terms are zero due to the relationship between $\Lambda$ and the syndromes.

The extended Euclidean algorithm can find a series of polynomials of the form

$$A_i(x) S(x) + B_i(x) x^t = R_i(x),$$

where the degree of $R$ decreases as $i$ increases. Once the degree of $R_i(x) < t/2$, then

$$A_i(x) = \Lambda(x), \quad B_i(x) = -Q(x), \quad R_i(x) = \Omega(x).$$

$B(x)$ and $Q(x)$ don't need to be saved, so the algorithm becomes:

    R_{-1} := x^t
    R_0    := S(x)
    A_{-1} := 0
    A_0    := 1
    i := 0
    while degree of R_i >= t/2
        i := i + 1
        Q := R_{i-2} / R_{i-1}
        R_i := R_{i-2} - Q R_{i-1}
        A_i := A_{i-2} - Q A_{i-1}

To set the low-order term of $\Lambda(x)$ to 1, divide $\Lambda(x)$ and $\Omega(x)$ by $A_i(0)$:

$$\Lambda(x) = A_i / A_i(0), \quad \Omega(x) = R_i / A_i(0).$$

$A_i(0)$ is the constant (low-order) term of $A_i$.

Example

Using the same data as the Peterson–Gorenstein–Zierler example above:

| i | R_i | A_i |
| −1 | 001x^4 + 000x^3 + 000x^2 + 000x + 000 | 000 |
| 0 | 925x^3 + 762x^2 + 637x + 732 | 001 |
| 1 | 683x^2 + 676x + 024 | 697x + 396 |
| 2 | 673x + 596 | 608x^2 + 704x + 544 |

$$\Lambda(x) = A_2 / 544 = 329x^2 + 821x + 001,$$

$$\Omega(x) = R_2 / 544 = 546x + 732.$$

Decoder using discrete Fourier transform

A discrete Fourier transform can be used for decoding.[15] To avoid conflict with syndrome names, let $c(x) = s(x)$ the
encoded codeword. $r(x)$ and $e(x)$ are the same as above. Define $C(x)$, $E(x)$, and $R(x)$ as the discrete Fourier transforms of $c(x)$, $e(x)$, and $r(x)$. Since $r(x) = c(x) + e(x)$, and since a discrete Fourier transform is a linear operator, $R(x) = C(x) + E(x)$.

Transform $r(x)$ to $R(x)$ using the discrete Fourier transform. Since the calculation for a discrete Fourier transform is the same as the calculation for syndromes, $t$ coefficients of $R(x)$ and $E(x)$ are the same as the syndromes:

$$R_j = E_j = S_j = r(\alpha^j) \qquad \text{for } 1 \le j \le t.$$

Use $R_1$ through $R_t$ as syndromes (they're the same) and generate the error locator polynomial using the methods from any of the above decoders.

Let $v$ = number of errors. Generate $E(x)$ using the known coefficients $E_1$ to $E_t$, the error locator polynomial, and these formulas:

$$\begin{aligned}
E_0 &= -\frac{1}{\Lambda_v} \left( E_v + \Lambda_1 E_{v-1} + \cdots + \Lambda_{v-1} E_1 \right), \\
E_j &= -\left( \Lambda_1 E_{j-1} + \Lambda_2 E_{j-2} + \cdots + \Lambda_v E_{j-v} \right) & \text{for } t < j < n.
\end{aligned}$$

Then calculate $C(x) = R(x) - E(x)$ and take the inverse transform (polynomial interpolation) of $C(x)$ to produce $c(x)$.

Decoding beyond the error-correction bound

The Singleton bound states that the minimum distance $d$ of a linear block code of size $(n, k)$ is upper-bounded by $n - k + 1$. The distance $d$ was usually understood to limit the error-correction capability to $\lfloor (d - 1)/2 \rfloor$. The Reed–Solomon code achieves this bound with equality, and can thus correct up to $\lfloor (n - k)/2 \rfloor$ errors. However, this error-correction bound is not exact.

In 1999, Madhu Sudan and Venkatesan Guruswami at MIT published "Improved Decoding of Reed–Solomon and Algebraic-Geometry Codes", introducing an algorithm that allowed for the correction of errors beyond half the minimum distance of the code.[16] It applies to Reed–Solomon codes and more generally to algebraic geometric codes. This algorithm
produces a list of codewords (it is a list-decoding algorithm) and is based on interpolation and factorization of polynomials over $GF(2^m)$ and its extensions.

In 2023, building on three exciting works,[17][18][19] coding theorists showed that Reed–Solomon codes defined over random evaluation points can actually achieve list-decoding capacity (up to $n - k$ errors) over linear-size alphabets with high probability. However, this result is combinatorial rather than algorithmic.

Soft-decoding

The algebraic decoding methods described above are hard-decision methods, which means that for every symbol a hard decision is made about its value. A soft-decision decoder, by contrast, could associate with each symbol an additional value corresponding to the channel demodulator's confidence in the correctness of the symbol. The advent of LDPC and turbo codes, which employ iterated soft-decision belief propagation decoding methods to achieve error-correction performance close to the theoretical limit, has spurred interest in applying soft-decision decoding to conventional algebraic codes. In 2003, Ralf Koetter and Alexander Vardy presented a polynomial-time soft-decision algebraic list-decoding algorithm for Reed–Solomon codes, which was based upon the work by Sudan and Guruswami.[20] In 2016, Steven J. Franke and Joseph H. Taylor published a novel soft-decision decoder.[21]

MATLAB example

Encoder

Here we present a simple MATLAB implementation for an encoder.

    function encoded = rsEncoder(msg, m, prim_poly, n, k)
        % RSENCODER Encode message with the Reed-Solomon algorithm
        % m is the number of bits per symbol
        % prim_poly: Primitive polynomial p(x). Ie for DM is 301
        % k is the size of the message
        % n is the total size (k+redundant)
        % Example: msg = uint8('Test')
        % enc_msg = rsEncoder(msg, 8, 301, 12, numel(msg));

        % Get the alpha
        alpha = gf(2, m, prim_poly);

        % Get the Reed-Solomon generating polynomial g(x)
        g_x = genpoly(k, n, alpha);

        % Multiply the information by X^(n-k), or just pad with zeros at the end to
        % get space to add the redundant information
        msg_padded = gf([msg zeros(1, n - k)], m, prim_poly);

        % Get the remainder of the division of the extended message by the
        % Reed-Solomon generating polynomial g(x)
        [~, remainder] = deconv(msg_padded, g_x);

        % Now return the message with the redundant information
        encoded = msg_padded - remainder;
    end

    % Find the Reed-Solomon generating polynomial g(x), by the way this is the
    % same as the rsgenpoly function on matlab
    function g = genpoly(k, n, alpha)
        g = 1;
        % A multiplication on the galois field is just a convolution
        for k = mod(1 : n - k, n)
            g = conv(g, [1 alpha .^ (k)]);
        end
    end

Decoder

Now the decoding part:

    function [decoded, error_pos, error_mag, g, S] = rsDecoder(encoded, m, prim_poly, n, k)
        % RSDECODER Decode a Reed-Solomon encoded message
        %   Example:
        % [dec, ~, ~, ~, ~] = rsDecoder(enc_msg, 8, 301, 12, numel(msg))
        max_errors = floor((n - k) / 2);
        orig_vals = encoded.x;
        % Initialize the error vector
        errors = zeros(1, n);
        g = [];
        S = [];

        % Get the alpha
        alpha = gf(2, m, prim_poly);

        % Find the syndromes (Check if dividing the message by the generator
        % polynomial the result is zero)
        Synd = polyval(encoded, alpha .^ (1:n - k));
        Syndromes = trim(Synd);

        % If all syndromes are zeros (perfectly divisible) there are no errors
        if isempty(Syndromes.x)
            decoded = orig_vals(1:k);
            error_pos = [];
            error_mag = [];
            g = [];
            S = Synd;
            return;
        end

        % Prepare for the euclidean algorithm (Used to find the error locating
        % polynomials)
        r0 = [1, zeros(1, 2 * max_errors)]; r0 = gf(r0, m, prim_poly); r0 = trim(r0);
        size_r0 = length(r0);
        r1 = Syndromes;
        f0 = gf([zeros(1, size_r0 - 1) 1], m, prim_poly);
        f1 = gf(zeros(1, size_r0), m, prim_poly);
        g0 = f1; g1 = f0;

        % Do the euclidean algorithm on the polynomials r0(x) and Syndromes(x) in
        % order to find the error locating polynomial
        while true
            % Do a long division
            [quotient, remainder] = deconv(r0, r1);
            % Add some zeros
            quotient = pad(quotient, length(g1));

            % Find quotient*g1 and pad
            c = conv(quotient, g1);
            c = trim(c);
            c = pad(c, length(g0));

            % Update g as g0-quotient*g1
            g = g0 - c;

            % Check if the degree of remainder(x) is less than max_errors
            if all(remainder(1:end - max_errors) == 0)
                break;
            end

            % Update r0, r1, g0, g1 and remove leading zeros
            r0 = trim(r1); r1 = trim(remainder);
            g0 = g1; g1 = g;
        end

        % Remove leading zeros
        g = trim(g);

        % Find the zeros of the error polynomial on this galois field
        evalPoly = polyval(g, alpha .^ (n - 1 : -1 : 0));
        error_pos = gf(find(evalPoly == 0), m);

        % If no error position is found, nothing can be corrected, so return
        % the received message unchanged
        if isempty(error_pos)
            decoded = orig_vals(1:k);
            error_mag = [];
            return;
        end

        % Prepare a linear system to solve the error polynomial and find the error
        % magnitudes
        size_error = length(error_pos);
        Syndrome_Vals = Syndromes.x;
        b(:, 1) = Syndrome_Vals(1:size_error);
        for idx = 1 : size_error
            e = alpha .^ (idx * (n - error_pos.x));
            err = e.x;
            er(idx, :) = err;
        end

        % Solve the linear system
        error_mag = (gf(er, m, prim_poly) \ gf(b, m, prim_poly))';
        % Put the error magnitude on the error vector
        errors(error_pos.x) = error_mag.x;
        % Bring this vector to the galois field
        errors_gf = gf(errors, m, prim_poly);

        % Now to fix the errors just add with the encoded code
        decoded_gf = encoded(1:k) + errors_gf(1:k);
        decoded = decoded_gf.x;
    end

    % Remove leading zeros from Galois array
    function gt = trim(g)
        gx = g.x;
        gt = gf(gx(find(gx, 1) : end), g.m, g.prim_poly);
    end

    % Add leading zeros
    function xpad = pad(x, k)
        % Default to the input so that xpad is defined when no padding is needed
        xpad = x;
        len = length(x);
        if len < k
            xpad = [zeros(1, k - len) x];
        end
    end

Reed–Solomon original view decoders

The decoders described in this section use the Reed–Solomon original view of a codeword as a sequence of polynomial values, where the polynomial is based on the message to be encoded. The same set of fixed values is used by the encoder and decoder, and the decoder recovers the encoding polynomial (and optionally an error-locating polynomial) from the received message.

Theoretical decoder

Reed & Solomon (1960) described a theoretical decoder that corrected errors by finding the most popular message polynomial. The decoder only knows the set of values $a_1$ to $a_n$ and which encoding method was used to generate the codeword's sequence of values. The original message, the polynomial, and any errors are unknown. A decoding procedure could use a method like Lagrange interpolation on various subsets of n codeword values taken k at a time to repeatedly produce
potential polynomials, until a sufficient number of matching polynomials are produced to reasonably eliminate any errors in the received codeword. Once a polynomial is determined, then any errors in the codeword can be corrected by recalculating the corresponding codeword values. Unfortunately, in all but the simplest of cases, there are too many subsets, so the algorithm is impractical. The number of subsets is the binomial coefficient, $\binom{n}{k} = \frac{n!}{(n-k)! \, k!}$, and the number of subsets is infeasible for even modest codes. For a $(255, 249)$ code that can correct 3 errors, the naive theoretical decoder would examine 359 billion subsets.

Berlekamp–Welch decoder

In 1986, a decoder known as the Berlekamp–Welch algorithm was developed as a decoder that is able to recover the original message polynomial as well as an error "locator" polynomial that produces zeroes for the input values that correspond to errors, with time complexity $O(n^3)$, where $n$ is the number of values in a message. The recovered polynomial is then used to recover (recalculate as needed) the original message.

Example

Using RS(7,3), GF(929), and the set of evaluation points $a_i = i - 1$:

a = {0, 1, 2, 3, 4, 5, 6}

If the message polynomial is

p(x) = 003x^2 + 002x + 001

the codeword is

c = {001, 006, 017, 034, 057, 086, 121}

Errors in transmission might cause this to be received instead:

b = c + e = {001, 006, 123, 456, 057, 086, 121}

The key equations are:

$$b_i E(a_i) - Q(a_i) = 0.$$

Assume the maximum number of errors: $e = 2$. The key equations become:

$$b_i (e_0 + e_1 a_i) - (q_0 + q_1 a_i + q_2 a_i^2 + q_3 a_i^3 + q_4 a_i^4) = -b_i a_i^2,$$

$$\begin{bmatrix}
001 & 000 & 928 & 000 & 000 & 000 & 000 \\
006 & 006 & 928 & 928 & 928 & 928 & 928 \\
123 & 246 & 928 & 927 & 925 & 921 & 913 \\
456 & 439 & 928 & 926 & 920 & 902 & 848 \\
057 & 228 & 928 & 925 & 913 & 865 & 673 \\
086 & 430 & 928 & 924 & 904 & 804 & 304 \\
121 & 726 & 928 & 923 & 893 & 713 & 562
\end{bmatrix}
\begin{bmatrix} e_0 \\ e_1 \\ q_0 \\ q_1 \\ q_2 \\ q_3 \\ q_4 \end{bmatrix}
=
\begin{bmatrix} 000 \\ 923 \\ 437 \\ 541 \\ 017 \\ 637 \\ 289 \end{bmatrix}.$$
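As a rough illustration, the linear system above can be built and solved with a short Python sketch. It assumes the prime field GF(929), so that arithmetic is ordinary integer arithmetic modulo 929, and a monic error locator $E(x)$ of degree exactly $e$; the function names `berlekamp_welch_system` and `solve_mod_p` are hypothetical helpers written for this example, and `pow(x, -1, p)` assumes Python 3.8 or newer.

```python
# Build and solve the Berlekamp-Welch key equations over GF(929).
# Assumptions: prime field, monic E(x) of degree e, invertible system matrix.
P = 929


def berlekamp_welch_system(points, received, k, e, p=P):
    """Rows encode b_i*(e_0 + ... ) - (q_0 + q_1 a_i + ...) = -b_i * a_i^e."""
    rows, rhs = [], []
    for a, b in zip(points, received):
        row = [b * pow(a, j, p) % p for j in range(e)]        # columns for e_0..e_{e-1}
        row += [-pow(a, j, p) % p for j in range(k + e)]      # columns for q_0..q_{k+e-1}
        rows.append(row)
        rhs.append(-b * pow(a, e, p) % p)
    return rows, rhs


def solve_mod_p(matrix, rhs, p=P):
    """Gauss-Jordan elimination modulo a prime p for a square, invertible matrix."""
    n = len(matrix)
    aug = [[v % p for v in row] + [r % p] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Raises StopIteration if the matrix is singular (not handled in this sketch).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)
        aug[col] = [v * inv % p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % p for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


points = [0, 1, 2, 3, 4, 5, 6]
received = [1, 6, 123, 456, 57, 86, 121]
rows, rhs = berlekamp_welch_system(points, received, k=3, e=2)
solution = solve_mod_p(rows, rhs)
print(rhs)       # [0, 923, 437, 541, 17, 637, 289]
print(solution)  # [6, 924, 6, 7, 9, 916, 3] -> e_0, e_1, q_0, q_1, q_2, q_3, q_4
```

The solution vector lists the unknowns in the same order as the column vector in the matrix equation.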
Using Gaussian elimination:

$$\begin{bmatrix}
001 & 000 & 000 & 000 & 000 & 000 & 000 \\
000 & 001 & 000 & 000 & 000 & 000 & 000 \\
000 & 000 & 001 & 000 & 000 & 000 & 000 \\
000 & 000 & 000 & 001 & 000 & 000 & 000 \\
000 & 000 & 000 & 000 & 001 & 000 & 000 \\
000 & 000 & 000 & 000 & 000 & 001 & 000 \\
000 & 000 & 000 & 000 & 000 & 000 & 001
\end{bmatrix}
\begin{bmatrix} e_0 \\ e_1 \\ q_0 \\ q_1 \\ q_2 \\ q_3 \\ q_4 \end{bmatrix}
=
\begin{bmatrix} 006 \\ 924 \\ 006 \\ 007 \\ 009 \\ 916 \\ 003 \end{bmatrix}.$$

Q(x) = 003x^4 + 916x^3 + 009x^2 + 007x + 006

E(x) = 001x^2 + 924x + 006

Q(x) / E(x) = P(x) = 003x^2 + 002x + 001

Recalculate P(x) where E(x) = 0: {2, 3} to correct b, resulting in the corrected codeword:

c = {001, 006, 017, 034, 057, 086, 121}

Gao decoder

In 2002, an improved decoder was developed by Shuhong Gao, based on the extended Euclid algorithm.[6]

Example

Using the same data as the Berlekamp–Welch example above:

$R_{-1} = \prod_{i=1}^{n} (x - a_i)$

$R_0$ = Lagrange interpolation of $\{a_i, b(a_i)\}$ for $i = 1$ to $n$

$A_{-1} = 0$

$A_0 = 1$

| i | R_i | A_i |
| −1 | 001x^7 + 908x^6 + 175x^5 + 194x^4 + 695x^3 + 094x^2 + 720x + 000 | 000 |
| 0 | 055x^6 + 440x^5 + 497x^4 + 904x^3 + 424x^2 + 472x + 001 | 001 |
| 1 | 702x^5 + 845x^4 + 691x^3 + 461x^2 + 327x + 237 | 152x + 237 |
| 2 | 266x^4 + 086x^3 + 798x^2 + 311x + 532 | 708x^2 + 176x + 532 |

Q(x) = R_2 = 266x^4 + 086x^3 + 798x^2 + 311x + 532

E(x) = A_2 = 708x^2 + 176x + 532

Divide Q(x) and E(x) by the most significant coefficient of E(x), which is 708. (Optional)

Q(x) = 003x^4 + 916x^3 + 009x^2 + 007x + 006

E(x) = 001x^2 + 924x + 006

Q(x) / E(x) = P(x) = 003x^2 + 002x + 001

Recalculate P(x) where E(x) = 0: {2, 3} to correct b, resulting in the corrected codeword:

c = {001, 006, 017, 034, 057, 086, 121}

See also

BCH code
Berlekamp–Massey algorithm
Berlekamp–Welch algorithm
Chien search
