fbpx
Wikipedia

Semipredicate problem

In computer programming, a semipredicate problem occurs when a subroutine intended to return a useful value can fail, but the signalling of failure uses an otherwise valid return value.[1] The problem is that the caller of the subroutine cannot tell what the result means in this case.

Example edit

The division operation yields a real number, but fails when the divisor is zero. If we were to write a function that performs division, we might choose to return 0 on this invalid input. However, if the dividend is 0, the result is 0 too. This means there is no number we can return to uniquely signal attempted division by zero, since all real numbers are in the range of division.

Practical implications edit

Early programmers handled potentially exceptional cases such as division using a convention requiring the calling routine to verify the inputs before calling the division function. This had two problems: first, it greatly encumbered all code that performed division (a very common operation); second, it violated the Don't repeat yourself and encapsulation principles, the former of which suggesting eliminating duplicated code, and the latter suggesting that data-associated code be contained in one place (in this division example, the verification of input was done separately). For a computation more complicated than division, it could be difficult for the caller to recognize invalid input; in some cases, determining input validity may be as costly as performing the entire computation. The target function could also be modified and would then expect different preconditions than would the caller; such a modification would require changes in every place where the function was called.

Solutions edit

The semipredicate problem is not universal among functions that can fail.

Using a custom convention to interpret return values edit

If the range of a function does not cover the entire space corresponding to the data type of the function's return value, a value known to be impossible under normal computation can be used. For example, consider the function index, which takes a string and a substring, and returns the integer index of the substring in the main string. If the search fails, the function may be programmed to return -1 (or any other negative value), since this can never signify a successful result.

This solution has its problems, though, as it overloads the natural meaning of a function with an arbitrary convention:

  • The programmer must remember specific failure values for many functions, which of course cannot be identical if the functions have different ranges.
  • A different implementation of the same function may choose to use a different failure value, resulting in possible bugs when programmers move from environment to environment.
  • If the failing function wishes to communicate useful information about why it had failed, one failure value is insufficient.
  • A signed integer halves the possible index range to be able to store the sign bit.
  • While the chosen value is an invalid result for this operation, it might be a valid input to followup operations. For example in Python str.find returns -1 if the substring is not found,[2] but -1 is a valid index (negative indices generally start from the end[3]).

Multivalued return edit

Many languages allow, through one mechanism or another, a function to return multiple values. If this is available, the function can be redesigned to return a boolean value signalling success or failure, along with its primary return value. If multiple error modes are possible, the function may instead return an enumerated return code (error code) along with its primary return value.

Various techniques for returning multiple values include:

  • Returning a tuple of values. This is conventional in languages such as Python, that have a built-in tuple data type, and special syntax for handling these: in Python, x, y = f() calls the function f which returns a pair of values, and assigns the elements of the pair to two variables.
  • Secondary return values as in Common Lisp. All expressions have a primary value, but secondary values might be returned to interested callers. For example, the GETHASH function returns the value of the given key in an associative map, or a default value otherwise. However, it also returns a secondary boolean indicating whether the value was found, making it possible to distinguish between the "no value was found" and "the value found was equal to default value" cases. This is different from returning a tuple, in that secondary return values are optional – if a caller does not care about them, it may ignore them completely, whereas tuple-valued returns are merely syntactic sugar for returning and unpacking a list, and every caller must always know about and consume all items returned.
  • Languages with call by reference – or equivalents, such as call by address using pointers – can allow for multivalued return by designating some parameters as output parameters. In this case, the function could just return the error value, with a variable intended to store the actual result being passed to the function. This is analogous to the use of an exit status to store an error code, and streams for returning content.
  • A variant of output parameters is used in object-oriented languages that use call by sharing, where a mutable object is passed to a function, and the object is mutated to return values.
  • Logic programming languages such as Prolog have no return values. Instead, unbound logical variables are used as output parameters, to be unified with values constructed in a predicate call.

Global variable for return status edit

Similar to an "out" argument, a global variable can store what error occurred (or simply whether an error occurred).

For instance, if an error occurs, and is signalled (generally as above, by an illegal value like −1) the Unix errno variable is set to indicate which value occurred. Using a global has its usual drawbacks: thread safety becomes a concern (modern operating systems use a thread-safe version of errno), and if only one error global is used, its type must be wide enough to contain all interesting information about all possible errors in the system.

Exceptions edit

Exceptions are one widely used scheme for solving this problem. An error condition is not considered a return value of the function at all; normal control flow is disrupted and explicit handling of the error takes place automatically. They are an example of out-of-band signalling.

Expanding the return value type edit

Manually created hybrid types edit

In C, a common approach, when possible, is to use a data type deliberately wider than strictly needed by the function. For example, the standard function getchar() is defined with return type int and returns a value in the range [0,255] (the range of unsigned char) on success or the value EOF (implementation-defined, but outside the range of unsigned char) on the end of the input or a read error.

Nullable reference types edit

In languages with pointers or references, one solution is to return a pointer to a value, rather than the value itself. This return pointer can then be set to null to indicate an error. It is typically suited to functions that return a pointer anyway. This has a performance advantage over the OOP style of exception handling,[4] with the drawback that negligent programmers may not check the return value, resulting in a crash when the invalid pointer is used. Whether a pointer is null or not is another example of the predicate problem; null may be a flag indicating failure or the value of a pointer returned successfully. A common pattern in the UNIX environment is setting a separate variable to indicate the cause of an error. An example of this is the C standard library fopen() function.

Implicitly hybrid types edit

In dynamically-typed languages, such as PHP and Lisp, the usual approach is to return false, none, or null when the function call fails. This works by returning a different type to the normal return type (thus expanding the type). It is a dynamically-typed equivalent to returning a null pointer.

For example, a numeric function normally returns a number (int or float), and while zero might be a valid response; false is not. Similarly, a function that normally returns a string might sometimes return the empty string as a valid response, but return false on failure. This process of type-juggling necessitates care in testing the return value: e.g., in PHP, use === (i.e., equal and of same type) rather than just == (i.e., equal, after automatic type-conversion). It works only when the original function is not meant to return a boolean value, and still requires that information about the error be conveyed via other means.

Explicitly hybrid types edit

In Haskell and other functional programming languages, it is common to use a data type that is just as big as it needs to be to express any possible result. For example, one can write a division function that returned the type Maybe Real, and a getchar function returning Either String Char. The first is an option type, which has only one failure value, Nothing. The second case is a tagged union: a result is either some string with a descriptive error message, or a successfully read character. Haskell's type inference system helps ensure that callers deal with possible errors. Since the error conditions become explicit in the function type, looking at its signature immediately tells the programmer how to treat errors. Further, tagged unions and option types form monads when endowed with appropriate functions: this may be used to keep the code tidy by automatically propagating unhandled error conditions.

Example edit

Rust has algebraic data types and comes with the built-in Result<T, E> and Option<T> types.

fn find(key: String) -> Option<String> {  if key == "hello" {  Some(key)  } else {  None  } } 

The C++ programming language introduced std::optional<T> in the C++17 update.

std::optional<int> find_int_in_str(std::string_view str) {  constexpr auto digits = "0123456789";  auto n = str.find_first_of(digits);  if (n == std::string::npos) {  // The string simply contains no numbers, not necessarily an error  return std::nullopt;  }  int result;  // More search logic that sets 'result'  return result; } 

and std::expected<T, E> in the C++23 update

enum class parse_error {  kEmptyString,  kOutOfRange,  kNotANumber }; std::expected<int, parse_error> parse_number(std::string_view str) {  if (str.empty()) {  // Flag one unexpected situation out of several  return std::unexpected(parse_error::kEmptyString);  }  int result;  // More conversion logic that sets 'result'  return result; } 

See also edit

References edit

  1. ^ Norvig, Peter (1992). "The General Problem Solver". Paradigms of artificial intelligence programming: case studies in common LISP. Morgan Kaufmann. p. 127. ISBN 1-55860-191-0.
  2. ^ "Built-in Types — Python 3.10.4 documentation".
  3. ^ "If i or j is negative, the index is relative to the end of sequence s: len(s) + i or len(s) + j is substituted."common sequence operations note (3)
  4. ^ Why Exceptions should be Exceptional - an example of performance comparison

semipredicate, problem, this, article, relies, largely, entirely, single, source, relevant, discussion, found, talk, page, please, help, improve, this, article, introducing, citations, additional, sources, find, sources, news, newspapers, books, scholar, jstor. This article relies largely or entirely on a single source Relevant discussion may be found on the talk page Please help improve this article by introducing citations to additional sources Find sources Semipredicate problem news newspapers books scholar JSTOR February 2012 In computer programming a semipredicate problem occurs when a subroutine intended to return a useful value can fail but the signalling of failure uses an otherwise valid return value 1 The problem is that the caller of the subroutine cannot tell what the result means in this case Contents 1 Example 2 Practical implications 3 Solutions 3 1 Using a custom convention to interpret return values 3 2 Multivalued return 3 3 Global variable for return status 3 4 Exceptions 3 5 Expanding the return value type 3 5 1 Manually created hybrid types 3 5 2 Nullable reference types 3 5 3 Implicitly hybrid types 3 5 4 Explicitly hybrid types 4 Example 5 See also 6 ReferencesExample editThe division operation yields a real number but fails when the divisor is zero If we were to write a function that performs division we might choose to return 0 on this invalid input However if the dividend is 0 the result is 0 too This means there is no number we can return to uniquely signal attempted division by zero since all real numbers are in the range of division Practical implications editEarly programmers handled potentially exceptional cases such as division using a convention requiring the calling routine to verify the inputs before calling the division function This had two problems first it greatly encumbered all code that performed division a very common operation second it violated the Don t repeat yourself and encapsulation principles the former of which suggesting eliminating duplicated code and the latter suggesting that data associated code be contained in one place in this division example the verification of input was done separately For a computation more complicated than division it could be difficult for the caller to recognize invalid input in some cases determining input validity may be as costly as performing the entire computation The target function could also be modified and would then expect different preconditions than would the caller such a modification would require changes in every place where the function was called Solutions editThis section does not cite any sources Please help improve this section by adding citations to reliable sources Unsourced material may be challenged and removed June 2013 Learn how and when to remove this template message The semipredicate problem is not universal among functions that can fail Using a custom convention to interpret return values edit If the range of a function does not cover the entire space corresponding to the data type of the function s return value a value known to be impossible under normal computation can be used For example consider the function index which takes a string and a substring and returns the integer index of the substring in the main string If the search fails the function may be programmed to return 1 or any other negative value since this can never signify a successful result This solution has its problems though as it overloads the natural meaning of a function with an arbitrary convention The programmer must remember specific failure values for many functions which of course cannot be identical if the functions have different ranges A different implementation of the same function may choose to use a different failure value resulting in possible bugs when programmers move from environment to environment If the failing function wishes to communicate useful information about why it had failed one failure value is insufficient A signed integer halves the possible index range to be able to store the sign bit While the chosen value is an invalid result for this operation it might be a valid input to followup operations For example in Python str find returns 1 if the substring is not found 2 but 1 is a valid index negative indices generally start from the end 3 Multivalued return edit Many languages allow through one mechanism or another a function to return multiple values If this is available the function can be redesigned to return a boolean value signalling success or failure along with its primary return value If multiple error modes are possible the function may instead return an enumerated return code error code along with its primary return value Various techniques for returning multiple values include Returning a tuple of values This is conventional in languages such as Python that have a built in tuple data type and special syntax for handling these in Python x y f calls the function f which returns a pair of values and assigns the elements of the pair to two variables Secondary return values as in Common Lisp All expressions have a primary value but secondary values might be returned to interested callers For example the GETHASH function returns the value of the given key in an associative map or a default value otherwise However it also returns a secondary boolean indicating whether the value was found making it possible to distinguish between the no value was found and the value found was equal to default value cases This is different from returning a tuple in that secondary return values are optional if a caller does not care about them it may ignore them completely whereas tuple valued returns are merely syntactic sugar for returning and unpacking a list and every caller must always know about and consume all items returned Languages with call by reference or equivalents such as call by address using pointers can allow for multivalued return by designating some parameters as output parameters In this case the function could just return the error value with a variable intended to store the actual result being passed to the function This is analogous to the use of an exit status to store an error code and streams for returning content A variant of output parameters is used in object oriented languages that use call by sharing where a mutable object is passed to a function and the object is mutated to return values Logic programming languages such as Prolog have no return values Instead unbound logical variables are used as output parameters to be unified with values constructed in a predicate call Global variable for return status edit Similar to an out argument a global variable can store what error occurred or simply whether an error occurred For instance if an error occurs and is signalled generally as above by an illegal value like 1 the Unix a href Errno html class mw redirect title Errno errno a variable is set to indicate which value occurred Using a global has its usual drawbacks thread safety becomes a concern modern operating systems use a thread safe version of errno and if only one error global is used its type must be wide enough to contain all interesting information about all possible errors in the system Exceptions edit Exceptions are one widely used scheme for solving this problem An error condition is not considered a return value of the function at all normal control flow is disrupted and explicit handling of the error takes place automatically They are an example of out of band signalling Expanding the return value type edit Manually created hybrid types edit In C a common approach when possible is to use a data type deliberately wider than strictly needed by the function For example the standard function getchar is defined with return type int and returns a value in the range 0 255 the range of unsigned char on success or the value EOF implementation defined but outside the range of unsigned char on the end of the input or a read error Nullable reference types edit In languages with pointers or references one solution is to return a pointer to a value rather than the value itself This return pointer can then be set to null to indicate an error It is typically suited to functions that return a pointer anyway This has a performance advantage over the OOP style of exception handling 4 with the drawback that negligent programmers may not check the return value resulting in a crash when the invalid pointer is used Whether a pointer is null or not is another example of the predicate problem null may be a flag indicating failure or the value of a pointer returned successfully A common pattern in the UNIX environment is setting a separate variable to indicate the cause of an error An example of this is the C standard library a href Fopen html class mw redirect title Fopen fopen a function Implicitly hybrid types edit In dynamically typed languages such as PHP and Lisp the usual approach is to return false none or null when the function call fails This works by returning a different type to the normal return type thus expanding the type It is a dynamically typed equivalent to returning a null pointer For example a numeric function normally returns a number int or float and while zero might be a valid response false is not Similarly a function that normally returns a string might sometimes return the empty string as a valid response but return false on failure This process of type juggling necessitates care in testing the return value e g in PHP use i e equal and of same type rather than just i e equal after automatic type conversion It works only when the original function is not meant to return a boolean value and still requires that information about the error be conveyed via other means Explicitly hybrid types edit In Haskell and other functional programming languages it is common to use a data type that is just as big as it needs to be to express any possible result For example one can write a division function that returned the type Maybe Real and a getchar function returning Either String Char The first is an option type which has only one failure value Nothing The second case is a tagged union a result is either some string with a descriptive error message or a successfully read character Haskell s type inference system helps ensure that callers deal with possible errors Since the error conditions become explicit in the function type looking at its signature immediately tells the programmer how to treat errors Further tagged unions and option types form monads when endowed with appropriate functions this may be used to keep the code tidy by automatically propagating unhandled error conditions Example editRust has algebraic data types and comes with the built in Result lt T E gt and Option lt T gt types fn find key String gt Option lt String gt if key hello Some key else None The C programming language introduced std optional lt T gt in the C 17 update std optional lt int gt find int in str std string view str constexpr auto digits 0123456789 auto n str find first of digits if n std string npos The string simply contains no numbers not necessarily an error return std nullopt int result More search logic that sets result return result and std expected lt T E gt in the C 23 update enum class parse error kEmptyString kOutOfRange kNotANumber std expected lt int parse error gt parse number std string view str if str empty Flag one unexpected situation out of several return std unexpected parse error kEmptyString int result More conversion logic that sets result return result See also editNull terminated string Nullable type Option type Sentinel value Tagged unionReferences edit Norvig Peter 1992 The General Problem Solver Paradigms of artificial intelligence programming case studies in common LISP Morgan Kaufmann p 127 ISBN 1 55860 191 0 Built in Types Python 3 10 4 documentation If i or j is negative the index is relative to the end of sequence s len s i or len s j is substituted common sequence operations note 3 Why Exceptions should be Exceptional an example of performance comparison Retrieved from https en wikipedia org w index php title Semipredicate problem amp oldid 1187068348, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.