fbpx
Wikipedia

Maximum subarray problem

In computer science, the maximum sum subarray problem, also known as the maximum segment sum problem, is the task of finding a contiguous subarray with the largest sum, within a given one-dimensional array A[1...n] of numbers. It can be solved in time and space.

Visualization of how sub-arrays change based on start and end positions of a sample. Each possible contiguous sub-array is represented by a point on a colored line. That point's y-coordinate represents the sum of the sample. Its x-coordinate represents the end of the sample, and the leftmost point on that colored line represents the start of the sample. In this case, the array from which samples are taken is [2, 3, -1, -20, 5, 10].

Formally, the task is to find indices and with , such that the sum

is as large as possible. (Some formulations of the problem also allow the empty subarray to be considered; by convention, the sum of all values of the empty subarray is zero.) Each number in the input array A could be positive, negative, or zero.[1]

For example, for the array of values [−2, 1, −3, 4, −1, 2, 1, −5, 4], the contiguous subarray with the largest sum is [4, −1, 2, 1], with sum 6.

Some properties of this problem are:

  1. If the array contains all non-negative numbers, then the problem is trivial; a maximum subarray is the entire array.
  2. If the array contains all non-positive numbers, then a solution is any subarray of size 1 containing the maximal value of the array (or the empty subarray, if it is permitted).
  3. Several different sub-arrays may have the same maximum sum.

Although this problem can be solved using several different algorithmic techniques, including brute force,[2] divide and conquer,[3] dynamic programming,[4] and reduction to shortest paths, a simple single-pass algorithm known as Kadane's algorithm solves it efficiently.

History edit

The maximum subarray problem was proposed by Ulf Grenander in 1977 as a simplified model for maximum likelihood estimation of patterns in digitized images.[5]

Grenander was looking to find a rectangular subarray with maximum sum, in a two-dimensional array of real numbers. A brute-force algorithm for the two-dimensional problem runs in O(n6) time; because this was prohibitively slow, Grenander proposed the one-dimensional problem to gain insight into its structure. Grenander derived an algorithm that solves the one-dimensional problem in O(n2) time,[note 1] improving the brute force running time of O(n3). When Michael Shamos heard about the problem, he overnight devised an O(n log n) divide-and-conquer algorithm for it. Soon after, Shamos described the one-dimensional problem and its history at a Carnegie Mellon University seminar attended by Jay Kadane, who designed within a minute an O(n)-time algorithm,[5][6][7] which is as fast as possible.[note 2] In 1982, David Gries obtained the same O(n)-time algorithm by applying Dijkstra's "standard strategy";[8] in 1989, Richard Bird derived it by purely algebraic manipulation of the brute-force algorithm using the Bird–Meertens formalism.[9]

Grenander's two-dimensional generalization can be solved in O(n3) time either by using Kadane's algorithm as a subroutine, or through a divide-and-conquer approach. Slightly faster algorithms based on distance matrix multiplication have been proposed by Tamaki & Tokuyama (1998) and by Takaoka (2002). There is some evidence that no significantly faster algorithm exists; an algorithm that solves the two-dimensional maximum subarray problem in O(n3−ε) time, for any ε>0, would imply a similarly fast algorithm for the all-pairs shortest paths problem.[10]

Applications edit

Maximum subarray problems arise in many fields, such as genomic sequence analysis and computer vision.

Genomic sequence analysis employs maximum subarray algorithms to identify important biological segments of protein sequences.[citation needed] These problems include conserved segments, GC-rich regions, tandem repeats, low-complexity filter, DNA binding domains, and regions of high charge.[citation needed]

In computer vision, maximum-subarray algorithms are used on bitmap images to detect the brightest area in an image.

Kadane's algorithm edit

No empty subarrays admitted edit

Kadane's algorithm scans the given array   from left to right. In the  th step, it computes the subarray with the largest sum ending at  ; this sum is maintained in variable current_sum.[note 3] Moreover, it computes the subarray with the largest sum anywhere in  , maintained in variable best_sum,[note 4] and easily obtained as the maximum of all values of current_sum seen so far, cf. line 7 of the algorithm.

As a loop invariant, in the  th step, the old value of current_sum holds the maximum over all   of the sum  . Therefore, current_sum [note 5] is the maximum over all   of the sum  . To extend the latter maximum to cover also the case  , it is sufficient to consider also the singleton subarray  . This is done in line 6 by assigning  current_sum  as the new value of current_sum, which after that holds the maximum over all   of the sum  .

Thus, the problem can be solved with the following code,[11] expressed in Python.

def max_subarray(numbers):  """Find the largest sum of any contiguous subarray."""  best_sum = - infinity  current_sum = 0  for x in numbers:  current_sum = max(x, current_sum + x)  best_sum = max(best_sum, current_sum)  return best_sum 

If the input contains no positive element, the returned value is that of the largest element (i.e., the value closest to 0), or negative infinity if the input was empty. For correctness, an exception should be raised when the input array is empty, since an empty array has no maximum nonempty subarray. If the array is nonempty, its first element could be used in place of negative infinity, if needed to avoid mixing numeric and non-numeric values.

The algorithm can be adapted to the case which allows empty subarrays or to keep track of the starting and ending indices of the maximum subarray.

This algorithm calculates the maximum subarray ending at each position from the maximum subarray ending at the previous position, so it can be viewed as a trivial case of dynamic programming.

Empty subarrays admitted edit

Kadane's original algorithm solves the problem variant when empty subarrays are admitted.[4][7] This variant will return 0 if the input contains no positive elements (including when the input is empty). It is obtained by two changes in code: in line 3, best_sum should be initialized to 0 to account for the empty subarray  

 best_sum = 0; 

and line 6 in the for loop current_sum should be updated as max(0, current_sum + x).[note 6]

 current_sum = max(0, current_sum + x) 

As a loop invariant, in the  th step, the old value of current_sum holds the maximum over all   of the sum  .[note 7] Therefore, current_sum  is the maximum over all   of the sum  . To extend the latter maximum to cover also the case  , it is sufficient to consider also the empty subarray  . This is done in line 6 by assigning  current_sum  as the new value of current_sum, which after that holds the maximum over all   of the sum  . Machine-verified C / Frama-C code of both variants can be found here.

Computing the best subarray's position edit

The algorithm can be modified to keep track of the starting and ending indices of the maximum subarray as well.

Because of the way this algorithm uses optimal substructures (the maximum subarray ending at each position is calculated in a simple way from a related but smaller and overlapping subproblem: the maximum subarray ending at the previous position) this algorithm can be viewed as a simple/trivial example of dynamic programming.

Complexity edit

The runtime complexity of Kadane's algorithm is   and its space complexity is  .[4][7]

Generalizations edit

Similar problems may be posed for higher-dimensional arrays, but their solutions are more complicated; see, e.g., Takaoka (2002). Brodal & Jørgensen (2007) showed how to find the k largest subarray sums in a one-dimensional array, in the optimal time bound  .

The Maximum sum k-disjoint subarrays can also be computed in the optimal time bound   .[12]

See also edit

Notes edit

  1. ^ By using a precomputed table of cumulative sums   to compute the subarray sum   in constant time
  2. ^ since every algorithm must at least scan the array once which already takes O(n) time
  3. ^ named MaxEndingHere in Bentley (1989), and c in Gries (1982)
  4. ^ named MaxSoFar in Bentley (1989), and s in Gries (1982)
  5. ^ In the Python code below,   is expressed as x, with the index   left implicit.
  6. ^ While Bentley (1989) does not mention this difference, using x instead of 0 in the above version without empty subarrays achieves maintaining its loop invariant current_sum  at the beginning of the  th step.
  7. ^ This sum is   when  , corresponding to the empty subarray  .

References edit

  1. ^ Bentley 1989, p. 69.
  2. ^ Bentley 1989, p. 70.
  3. ^ Bentley 1989, p. 73.
  4. ^ a b c Bentley 1989, p. 74.
  5. ^ a b Bentley 1984, p. 868-869.
  6. ^ Bentley 1989, p. 76-77.
  7. ^ a b c Gries 1982, p. 211.
  8. ^ Gries 1982, p. 209-211.
  9. ^ Bird 1989, Sect.8, p.126.
  10. ^ Backurs, Dikkala & Tzamos 2016.
  11. ^ Bentley 1989, p. 78,171. Bentley, like Gries, first introduces the variant admitting empty subarrays, see below, and describes only the changes.
  12. ^ Bengtsson & Chen 2007.
  • Backurs, Arturs; Dikkala, Nishanth; Tzamos, Christos (2016), "Tight Hardness Results for Maximum Weight Rectangles", Proc. 43rd International Colloquium on Automata, Languages, and Programming: 81:1–81:13, doi:10.4230/LIPIcs.ICALP.2016.81, S2CID 12720136
  • Bae, Sung Eun (2007), (PDF) (Ph.D. thesis), University of Canterbury, S2CID 2681670, archived from the original (PDF) on 2017-10-26.
  • Bengtsson, Fredrik; Chen, Jingsen (2007), Computing maximum-scoring segments optimally (PDF) (Research report), Luleå University of Technology
  • Bentley, Jon (1984), "Programming Pearls: Algorithm Design Techniques", Communications of the ACM, 27 (9): 865–873, doi:10.1145/358234.381162, S2CID 207565329
  • Bentley, Jon (May 1989), Programming Pearls (2nd? ed.), Reading, MA: Addison Wesley, ISBN 0-201-10331-1
  • Bird, Richard S. (1989), "Algebraic Identities for Program Calculation" (PDF), The Computer Journal, 32 (2): 122–126, doi:10.1093/comjnl/32.2.122[dead link]
  • Brodal, Gerth Stølting; Jørgensen, Allan Grønlund (2007), "A linear time algorithm for the k maximal sums problem", Mathematical Foundations of Computer Science 2007, Lecture Notes in Computer Science, vol. 4708, Springer-Verlag, pp. 442–453, doi:10.1007/978-3-540-74456-6_40, ISBN 978-3-540-74455-9.
  • Gries, David (1982), "A Note on the Standard Strategy for Developing Loop Invariants and Loops" (PDF), Science of Computer Programming, 2 (3): 207–241, doi:10.1016/0167-6423(83)90015-1, hdl:1813/6370
  • Takaoka, Tadao (2002), "Efficient algorithms for the maximum subarray problem by distance matrix multiplication", Electronic Notes in Theoretical Computer Science, 61: 191–200, doi:10.1016/S1571-0661(04)00313-5.
  • Tamaki, Hisao; Tokuyama, Takeshi (1998), "Algorithms for the Maximum Subarray Problem Based on Matrix Multiplication", Proceedings of the 9th Symposium on Discrete Algorithms (SODA): 446–452, retrieved November 17, 2018

External links edit

  • TAN, Lirong. (PDF). Archived from the original (PDF) on 2015-10-10. Retrieved 2017-10-26.
  • Mu, Shin-Cheng (2010). "The Maximum Segment Sum Problem: Its Origin, and a Derivation".
  • "Notes on Maximum Subarray Problem". 2012.
  • www.algorithmist.com
  • alexeigor.wikidot.com
  • greatest subsequential sum problem on Rosetta Code
  • geeksforgeeks page on Kadane's Algorithm

maximum, subarray, problem, computer, science, maximum, subarray, problem, also, known, maximum, segment, problem, task, finding, contiguous, subarray, with, largest, within, given, dimensional, array, numbers, solved, displaystyle, time, displaystyle, space, . In computer science the maximum sum subarray problem also known as the maximum segment sum problem is the task of finding a contiguous subarray with the largest sum within a given one dimensional array A 1 n of numbers It can be solved in O n displaystyle O n time and O 1 displaystyle O 1 space Visualization of how sub arrays change based on start and end positions of a sample Each possible contiguous sub array is represented by a point on a colored line That point s y coordinate represents the sum of the sample Its x coordinate represents the end of the sample and the leftmost point on that colored line represents the start of the sample In this case the array from which samples are taken is 2 3 1 20 5 10 Formally the task is to find indices i displaystyle i and j displaystyle j with 1 i j n displaystyle 1 leq i leq j leq n such that the sum x i j A x displaystyle sum x i j A x is as large as possible Some formulations of the problem also allow the empty subarray to be considered by convention the sum of all values of the empty subarray is zero Each number in the input array A could be positive negative or zero 1 For example for the array of values 2 1 3 4 1 2 1 5 4 the contiguous subarray with the largest sum is 4 1 2 1 with sum 6 Some properties of this problem are If the array contains all non negative numbers then the problem is trivial a maximum subarray is the entire array If the array contains all non positive numbers then a solution is any subarray of size 1 containing the maximal value of the array or the empty subarray if it is permitted Several different sub arrays may have the same maximum sum Although this problem can be solved using several different algorithmic techniques including brute force 2 divide and conquer 3 dynamic programming 4 and reduction to shortest paths a simple single pass algorithm known as Kadane s algorithm solves it efficiently Contents 1 History 2 Applications 3 Kadane s algorithm 3 1 No empty subarrays admitted 3 2 Empty subarrays admitted 3 3 Computing the best subarray s position 3 4 Complexity 4 Generalizations 5 See also 6 Notes 7 References 8 External linksHistory editThe maximum subarray problem was proposed by Ulf Grenander in 1977 as a simplified model for maximum likelihood estimation of patterns in digitized images 5 Grenander was looking to find a rectangular subarray with maximum sum in a two dimensional array of real numbers A brute force algorithm for the two dimensional problem runs in O n6 time because this was prohibitively slow Grenander proposed the one dimensional problem to gain insight into its structure Grenander derived an algorithm that solves the one dimensional problem in O n2 time note 1 improving the brute force running time of O n3 When Michael Shamos heard about the problem he overnight devised an O n log n divide and conquer algorithm for it Soon after Shamos described the one dimensional problem and its history at a Carnegie Mellon University seminar attended by Jay Kadane who designed within a minute an O n time algorithm 5 6 7 which is as fast as possible note 2 In 1982 David Gries obtained the same O n time algorithm by applying Dijkstra s standard strategy 8 in 1989 Richard Bird derived it by purely algebraic manipulation of the brute force algorithm using the Bird Meertens formalism 9 Grenander s two dimensional generalization can be solved in O n3 time either by using Kadane s algorithm as a subroutine or through a divide and conquer approach Slightly faster algorithms based on distance matrix multiplication have been proposed by Tamaki amp Tokuyama 1998 and by Takaoka 2002 There is some evidence that no significantly faster algorithm exists an algorithm that solves the two dimensional maximum subarray problem in O n3 e time for any e gt 0 would imply a similarly fast algorithm for the all pairs shortest paths problem 10 Applications editThis section needs attention from an expert in Computational biology The specific problem is fix inline tags WikiProject Computational biology may be able to help recruit an expert September 2019 Maximum subarray problems arise in many fields such as genomic sequence analysis and computer vision Genomic sequence analysis employs maximum subarray algorithms to identify important biological segments of protein sequences citation needed These problems include conserved segments GC rich regions tandem repeats low complexity filter DNA binding domains and regions of high charge citation needed In computer vision maximum subarray algorithms are used on bitmap images to detect the brightest area in an image Kadane s algorithm editNo empty subarrays admitted edit Kadane s algorithm scans the given array A 1 n displaystyle A 1 ldots n nbsp from left to right In the j displaystyle j nbsp th step it computes the subarray with the largest sum ending at j displaystyle j nbsp this sum is maintained in variable current sum note 3 Moreover it computes the subarray with the largest sum anywhere in A 1 j displaystyle A 1 ldots j nbsp maintained in variable best sum note 4 and easily obtained as the maximum of all values of current sum seen so far cf line 7 of the algorithm As a loop invariant in the j displaystyle j nbsp th step the old value of current sum holds the maximum over all i 1 j 1 displaystyle i in 1 ldots j 1 nbsp of the sum A i A j 1 displaystyle A i cdots A j 1 nbsp Therefore current sum A j displaystyle A j nbsp note 5 is the maximum over all i 1 j 1 displaystyle i in 1 ldots j 1 nbsp of the sum A i A j displaystyle A i cdots A j nbsp To extend the latter maximum to cover also the case i j displaystyle i j nbsp it is sufficient to consider also the singleton subarray A j j displaystyle A j ldots j nbsp This is done in line 6 by assigning max A j displaystyle max A j nbsp current sum A j displaystyle A j nbsp as the new value of current sum which after that holds the maximum over all i 1 j displaystyle i in 1 ldots j nbsp of the sum A i A j displaystyle A i cdots A j nbsp Thus the problem can be solved with the following code 11 expressed in Python def max subarray numbers Find the largest sum of any contiguous subarray best sum infinity current sum 0 for x in numbers current sum max x current sum x best sum max best sum current sum return best sum If the input contains no positive element the returned value is that of the largest element i e the value closest to 0 or negative infinity if the input was empty For correctness an exception should be raised when the input array is empty since an empty array has no maximum nonempty subarray If the array is nonempty its first element could be used in place of negative infinity if needed to avoid mixing numeric and non numeric values The algorithm can be adapted to the case which allows empty subarrays or to keep track of the starting and ending indices of the maximum subarray This algorithm calculates the maximum subarray ending at each position from the maximum subarray ending at the previous position so it can be viewed as a trivial case of dynamic programming Empty subarrays admitted edit Example run nbsp Execution of Kadane s algorithm on the above example array Blue subarray with largest sum ending at i green subarray with largest sum encountered so far a lower case letter indicates an empty array variable i is left implicit in Python code Kadane s original algorithm solves the problem variant when empty subarrays are admitted 4 7 This variant will return 0 if the input contains no positive elements including when the input is empty It is obtained by two changes in code in line 3 best sum should be initialized to 0 to account for the empty subarray A 0 1 displaystyle A 0 ldots 1 nbsp best sum 0 and line 6 in the for loop current sum should be updated as max 0 current sum x note 6 current sum max 0 current sum x As a loop invariant in the j displaystyle j nbsp th step the old value of current sum holds the maximum over all i 1 j displaystyle i in 1 ldots j nbsp of the sum A i A j 1 displaystyle A i cdots A j 1 nbsp note 7 Therefore current sum A j displaystyle A j nbsp is the maximum over all i 1 j displaystyle i in 1 ldots j nbsp of the sum A i A j displaystyle A i cdots A j nbsp To extend the latter maximum to cover also the case i j 1 displaystyle i j 1 nbsp it is sufficient to consider also the empty subarray A j 1 j displaystyle A j 1 ldots j nbsp This is done in line 6 by assigning max 0 displaystyle max 0 nbsp current sum A j displaystyle A j nbsp as the new value of current sum which after that holds the maximum over all i 1 j 1 displaystyle i in 1 ldots j 1 nbsp of the sum A i A j displaystyle A i cdots A j nbsp Machine verified C Frama C code of both variants can be found here Computing the best subarray s position edit The algorithm can be modified to keep track of the starting and ending indices of the maximum subarray as well Because of the way this algorithm uses optimal substructures the maximum subarray ending at each position is calculated in a simple way from a related but smaller and overlapping subproblem the maximum subarray ending at the previous position this algorithm can be viewed as a simple trivial example of dynamic programming Complexity edit The runtime complexity of Kadane s algorithm is O n displaystyle O n nbsp and its space complexity is O 1 displaystyle O 1 nbsp 4 7 Generalizations editSimilar problems may be posed for higher dimensional arrays but their solutions are more complicated see e g Takaoka 2002 Brodal amp Jorgensen 2007 showed how to find the k largest subarray sums in a one dimensional array in the optimal time bound O n k displaystyle O n k nbsp The Maximum sum k disjoint subarrays can also be computed in the optimal time bound O n k displaystyle O n k nbsp 12 See also editSubset sum problemNotes edit By using a precomputed table of cumulative sums S k x 1 k A x displaystyle S k sum x 1 k A x nbsp to compute the subarray sum x i j A x S j S i 1 displaystyle sum x i j A x S j S i 1 nbsp in constant time since every algorithm must at least scan the array once which already takes O n time named MaxEndingHere in Bentley 1989 and c in Gries 1982 named MaxSoFar in Bentley 1989 and s in Gries 1982 In the Python code below A j displaystyle A j nbsp is expressed as x with the index j displaystyle j nbsp left implicit While Bentley 1989 does not mention this difference using x instead of 0 in the above version without empty subarrays achieves maintaining its loop invariant current sum max i 1 j 1 A i A j 1 displaystyle max i in 1 j 1 A i A j 1 nbsp at the beginning of the j displaystyle j nbsp th step This sum is 0 displaystyle 0 nbsp when i j displaystyle i j nbsp corresponding to the empty subarray A j j 1 displaystyle A j ldots j 1 nbsp References edit Bentley 1989 p 69 Bentley 1989 p 70 Bentley 1989 p 73 a b c Bentley 1989 p 74 a b Bentley 1984 p 868 869 Bentley 1989 p 76 77 a b c Gries 1982 p 211 Gries 1982 p 209 211 Bird 1989 Sect 8 p 126 Backurs Dikkala amp Tzamos 2016 Bentley 1989 p 78 171 Bentley like Gries first introduces the variant admitting empty subarrays see below and describes only the changes Bengtsson amp Chen 2007 Backurs Arturs Dikkala Nishanth Tzamos Christos 2016 Tight Hardness Results for Maximum Weight Rectangles Proc 43rd International Colloquium on Automata Languages and Programming 81 1 81 13 doi 10 4230 LIPIcs ICALP 2016 81 S2CID 12720136 Bae Sung Eun 2007 Sequential and Parallel Algorithms for the Generalized Maximum Subarray Problem PDF Ph D thesis University of Canterbury S2CID 2681670 archived from the original PDF on 2017 10 26 Bengtsson Fredrik Chen Jingsen 2007 Computing maximum scoring segments optimally PDF Research report Lulea University of Technology Bentley Jon 1984 Programming Pearls Algorithm Design Techniques Communications of the ACM 27 9 865 873 doi 10 1145 358234 381162 S2CID 207565329 Bentley Jon May 1989 Programming Pearls 2nd ed Reading MA Addison Wesley ISBN 0 201 10331 1 Bird Richard S 1989 Algebraic Identities for Program Calculation PDF The Computer Journal 32 2 122 126 doi 10 1093 comjnl 32 2 122 dead link Brodal Gerth Stolting Jorgensen Allan Gronlund 2007 A linear time algorithm for the k maximal sums problem Mathematical Foundations of Computer Science 2007 Lecture Notes in Computer Science vol 4708 Springer Verlag pp 442 453 doi 10 1007 978 3 540 74456 6 40 ISBN 978 3 540 74455 9 Gries David 1982 A Note on the Standard Strategy for Developing Loop Invariants and Loops PDF Science of Computer Programming 2 3 207 241 doi 10 1016 0167 6423 83 90015 1 hdl 1813 6370 Takaoka Tadao 2002 Efficient algorithms for the maximum subarray problem by distance matrix multiplication Electronic Notes in Theoretical Computer Science 61 191 200 doi 10 1016 S1571 0661 04 00313 5 Tamaki Hisao Tokuyama Takeshi 1998 Algorithms for the Maximum Subarray Problem Based on Matrix Multiplication Proceedings of the 9th Symposium on Discrete Algorithms SODA 446 452 retrieved November 17 2018External links editTAN Lirong Maximum Contiguous Subarray Sum Problems PDF Archived from the original PDF on 2015 10 10 Retrieved 2017 10 26 Mu Shin Cheng 2010 The Maximum Segment Sum Problem Its Origin and a Derivation Notes on Maximum Subarray Problem 2012 www algorithmist com alexeigor wikidot com greatest subsequential sum problem on Rosetta Code geeksforgeeks page on Kadane s Algorithm Retrieved from https en wikipedia org w index php title Maximum subarray problem amp oldid 1187835725 Kadane s algorithm, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.