I was annoyed quite a bit with bash (no, not because I was new; I had used it professionally every day for 12 years before starting NGS). One of the irritations was the constant need to reimplement small, stupid things like a retry() function (or to copy it across projects and adjust it each time).
It is obvious to me that retry() should be in the standard library, among other similarly frequently needed functions like log(), debug(), error(), exit(message), etc.
retry() background
In NGS, retry() has several named parameters. The most important of them is body, which is the code to run repeatedly until it succeeds. The code in body reports success or failure by returning a value. A truthy value means success and that retry() should finish. A falsy value means failure and that retry() should try again (unless it has reached the maximum number of attempts).
On success, retry() returns the value from body. In some cases this value is not needed, while in others it’s very handy and eliminates additional code.
On failure, retry() throws an exception.
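To make the contract above concrete, here is a minimal Python sketch of the same semantics. It is not the NGS implementation: the exception name RetryFailed and the parameter defaults are assumptions made for illustration, and only the truthy/falsy protocol and the returned value come from the description above.

```python
import time


class RetryFailed(Exception):
    """Hypothetical exception raised when every attempt returned a falsy value."""


def retry(body, times=3, sleep=0.0):
    """Call body() up to `times` times and return its first truthy result."""
    for attempt in range(times):
        result = body()
        if result:
            return result  # success: hand the value back to the caller
        if attempt < times - 1:
            time.sleep(sleep)  # pause only between attempts
    raise RetryFailed(f"all {times} attempts failed")


# Usage: the body returns falsy values twice, then a truthy one.
responses = iter([None, "", {"status": "RUNNING"}])
session = retry(lambda: next(responses), times=5)
```

Because retry() returns the value from body, the caller gets the session object directly instead of having to fetch it again after the wait.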
assert() background
In NGS, the straightforward way to assert a condition is the assert() function, either as assert(data, "message") or assert(data, pattern, "message"). It either returns the original data or throws an AssertFail exception: in the first form if data was falsy, in the second form if data did not match the pattern.
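The two call forms can be approximated in Python as follows. This is a sketch under stated assumptions: the function is named assert_ because assert is a Python keyword, and matches() is a deliberately simplified stand-in for NGS pattern matching that only compares dictionary keys.

```python
class AssertFail(Exception):
    """Python stand-in for the NGS AssertFail exception."""


_NO_PATTERN = object()  # sentinel: marks the two-argument form


def matches(data, pattern):
    """Simplified matching: every key in pattern must have an equal value in data."""
    return all(key in data and data[key] == value for key, value in pattern.items())


def assert_(data, pattern_or_message, message=_NO_PATTERN):
    """Either assert_(data, message) or assert_(data, pattern, message)."""
    if message is _NO_PATTERN:
        ok, message = bool(data), pattern_or_message
    else:
        ok = matches(data, pattern_or_message)
    if not ok:
        raise AssertFail(message)
    return data  # returning the data lets the call sit inside a larger expression


# Usage: the checked value flows straight through on success.
session = assert_({"id": 7, "status": "RUNNING"}, {"status": "RUNNING"}, "Session did not resume")
```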
Take 1: retry() handles exceptions
Originally, retry() was catching exceptions thrown in body. The rationale was simple: you are trying something that might fail. It’s the reason you are using retry() to begin with.
Then reality showed, as on many other occasions, that catching exceptions indiscriminately is a bad idea. Some exceptions are unrelated to the retry logic and should not be retried, just as some HTTP status codes should not be retried.
retry(
    ...
    body = {
        1 / 0 # good luck retrying
    }
)
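The same problem can be reproduced in a few lines of Python. The function name retry_catching_everything and the use of RuntimeError are assumptions for this sketch; the point is only that a catch-all handler spends every attempt on a bug that no amount of retrying will fix.

```python
def retry_catching_everything(body, times=3):
    """Take 1 style: any exception from body() counts as a failed attempt."""
    last_error = None
    for _ in range(times):
        try:
            result = body()
            if result:
                return result
        except Exception as error:  # indiscriminate: hides programming errors
            last_error = error
    raise RuntimeError(f"all {times} attempts failed") from last_error


attempts = []


def broken_body():
    attempts.append(1)
    return 1 / 0  # a bug, not a temporary condition


try:
    retry_catching_everything(broken_body)
except RuntimeError as error:
    # The real cause only surfaces after all attempts have been wasted.
    outcome = f"{len(attempts)} attempts, cause: {type(error.__cause__).__name__}"
```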
Take 2: retry() doesn’t handle any exceptions
The swing in the other direction was also “reasonable”: orthogonality is not a new idea either. retry() handles the retry logic. Exceptions and errors are handled by body itself or by something outside retry(). But now we are breaking this code:
retry(
    times = 60
    sleep = 1
    title = 'Waiting for session to be resumed'
    ...
    body = {
        get_session(_session.id).assert({'status': 'RUNNING'}, 'Session did not resume')
    }
)
assert() throws an exception and nobody catches it, so the program terminates on the first failed attempt. To fix the above, the code becomes:
retry(
    times = 60
    sleep = 1
    title = 'Waiting for session to be resumed'
    ...
    body = {
        get_session(_session.id) =~ {'status': 'RUNNING'}
    }
    fail_cb = {
        # all attempts failed
        throw Error('Session did not resume')
    }
)
… and that’s just too verbose for what it does. Also, the error message is no longer adjacent to the pattern.
Take 3: retry() uses Result type
The idea was that body should return a subtype of Result: either a Success or a Failure instance. There were two issues though.
- This change breaks existing code. Supporting both arbitrary data returned by existing code and Result won’t lead to anything elegant.
- It makes body inelegant and verbose: Result({ assert(...) })
Given the two arguments above, I’ve cut this branch of thought and didn’t implement this approach.
Take 4: retry() with retry_assert()
The problem with using assert() in body (or in any code that it calls) is semantic ambiguity. The assertion can be either of two things:
- A “normal” assertion, where a failure signifies an unexpected, likely permanent condition that will not change after retrying.
- An assertion about a temporary condition that we are waiting to change on one of the next retries.
The solution was to introduce retry_assert(), which is almost identical to assert(). The difference is that retry_assert() throws RetryAssertFail while assert() throws AssertFail.
The code becomes:
retry(
    times = 60
    sleep = 1
    title = 'Waiting for session to be resumed'
    ...
    body = {
        get_session(_session.id).retry_assert({'status': 'RUNNING'}, 'Session did not resume')
    }
)
… which is concise again, and the error message is next to the pattern against which the data is tested. retry_assert() throws an exception that retry() catches.
Let’s take a brief look at an excerpt from the retry() implementation in the standard library.
try {
    result = body()
} catch(ra:RetryAssertFail) {
    guard i < times - 1
    # Ignoring the assert as long as it's not last iteration
} catch(e:Exception) {
    logger("$title Exception $e")
    throw e
}
If “guard” succeeds, we do nothing. The loop around this code will continue to the next iteration.
If “guard” fails, meaning we are now at the last attempt, the exception is handled by the second “catch” clause.
I’m fond of the fact that the implementation is simple and straightforward.
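For readers who don’t know NGS, the behavior described above can be approximated in Python. This is a sketch, not a translation of the standard library code: Python has no guard, so the last-attempt check is an explicit if, the pattern matching is reduced to comparing dictionary keys, and get_session() here is a hypothetical stub.

```python
import time


class RetryAssertFail(Exception):
    """Stand-in for NGS RetryAssertFail: a condition that is expected to change."""


def retry_assert(data, pattern, message):
    """Return data if every key in pattern matches, else raise RetryAssertFail."""
    if all(key in data and data[key] == value for key, value in pattern.items()):
        return data
    raise RetryAssertFail(message)


def retry(body, times=3, sleep=0.0):
    for i in range(times):
        try:
            result = body()
        except RetryAssertFail:
            if i == times - 1:
                raise  # last attempt: let the assertion failure propagate
            result = None  # otherwise ignore it and go on to the next attempt
        # Any other exception is not caught here and propagates immediately.
        if result:
            return result
        if i < times - 1:
            time.sleep(sleep)
    raise RuntimeError(f"all {times} attempts failed")


# Usage with a hypothetical session lookup that resumes on the third call.
statuses = iter(["PENDING", "PENDING", "RUNNING"])


def get_session():
    return {"id": 1, "status": next(statuses)}


session = retry(
    lambda: retry_assert(get_session(), {"status": "RUNNING"}, "Session did not resume"),
    times=60,
)
```

A failed retry_assert() on the final attempt surfaces with its own message, while an unrelated error such as a division by zero ends the loop on the first attempt.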
Future
retry_assert(), while it appears to be very useful and ergonomic, is still new in NGS. Practice may show that some adjustments are needed.
The plan is to give more thought to the retry()able vs not retry()able exceptions in general. In particular:
- Can any code called from body throw RetryAssertFail? Should it?
- Marking other exceptions as retry()able
Please share your thoughts. It’s easy to miss other interesting perspectives. Have a better name for retry_assert()? Have a suggestion for improving its behavior? Know about similar functionality in other languages? I’ll be glad to hear it.
For the curious: why did I say I “was” annoyed with bash and not “am”? Well, I still am, but it hurts much less now, since most of the code I previously wrote in bash is now written in NGS.