single point of failure? stuck stack

Efried · Mar 17, 2014

rossw said:
The problem appears not to be the wait time before accessing the second WC, but AFTER accessing it and BEFORE accessing the server.
Sounds to me like you need to work on your code. There are any number of ways to achieve stability under the conditions you are struggling with.

Rather than expecting CAI to implement a "purge stack" command that will be useful to perhaps exactly ONE customer, you'd be better trying to convince them of the value of user-defined timeout periods. There are times 6 seconds may be too short, just as you claim it's too long. Sounds to me like the far better solution would be some arbitrary but user-specified timeout.

Great idea, that would solve the problem too.
BTW: before is equal to after since it is a cycle...

CAI_Support · Mar 17, 2014

We picked 6 seconds on the basis that a server anywhere on earth should be able to respond by then. We could set aside a byte for the user to define how many seconds TCP gets to finish, which would give a range of 1 to 255 seconds. But we are not sure how that helps if his BRE board is powered off: it will never reply, even after waiting 255 seconds.

Please note that in most cases, when the server side is alive, the response arrives much faster, probably in the millisecond range, so the TCP stack never needs to wait the full 6 seconds before moving on to the next thing.

rossw · Mar 17, 2014

CAI_Support said:
We picked 6 seconds on the basis that a server anywhere on earth should be able to respond by then. We could set aside a byte for the user to define how many seconds TCP gets to finish, which would give a range of 1 to 255 seconds. But we are not sure how that helps if his BRE board is powered off: it will never reply, even after waiting 255 seconds.

Please note that in most cases, when the server side is alive, the response arrives much faster, probably in the millisecond range, so the TCP stack never needs to wait the full 6 seconds before moving on to the next thing.

255 seconds is probably far longer than necessary, and 1 second granularity is adequate - but perhaps the timeout could be in half seconds or tenths?
From 0.5 seconds to 127.5 seconds or from 0.1 to 25.5 seconds.
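
To make the trade-off concrete, here is a quick Python sketch of what a single timeout byte could cover at each granularity. It assumes the value 0 is not used (so the count runs from 1 to 255) and works in tenths of a second; the function name is made up for illustration and nothing here reflects how the firmware would actually store the setting.

```python
def timeout_range(step_tenths, min_count=1, max_count=255):
    """Return (shortest, longest) timeout in seconds for a one-byte count.

    step_tenths is the size of one count in tenths of a second:
    10 = whole seconds, 5 = half seconds, 1 = tenths.
    """
    return (min_count * step_tenths / 10, max_count * step_tenths / 10)

for step_tenths in (10, 5, 1):
    shortest, longest = timeout_range(step_tenths)
    print(f"{step_tenths / 10} s per count: {shortest} s to {longest} s")
```

Half-second counts would still reach just over two minutes, which is probably plenty for a slow dial-on-demand link.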

I agree that we should be able to get a response from a server anywhere in the world in a few seconds, but the reality is that sometimes it just doesn't happen like that. The WC board could be connected via a 3G router set to dial-on-demand. It could be on a highly congested service. The server may be extremely busy. Or, it may simply take the server quite a long time to process, calculate and reply to a request.

6 seconds certainly isn't long enough in some cases, yet in others it is too long! User-set timeouts sound like a real benefit.

rossw · Mar 17, 2014

Efried said:
BTW: before is equal to after since it is a cycle...

Not from the example you cited.

start:
     webset WC2
     webset WC2
     webset SERVER
     delay 6000
end

is in a loop/cycle, but there isn't enough delay after the webset WC2 to allow the stack to time out before sending to SERVER, notwithstanding the 6 second delay AFTER, whereas

start:
     webset WC2
     webset WC2
     delay 6000
     webset SERVER
end

won't attempt to webset to SERVER until AFTER the calls to WC2 have a chance to time out.

(Of course, you wouldn't write the code in this way, you'd use different timing logic to avoid a "blocking delay")
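
As a rough idea of what non-blocking timing logic can look like, here is a Python sketch in which the main loop never sleeps: each pass just checks which actions are due. The `due_actions` name, the schedule layout and the 10 s and 600 s periods are placeholders for illustration, not WebControl features.

```python
def due_actions(now_s, schedule):
    """Return the names that are due at now_s and reschedule them.

    schedule maps name -> [next_due_s, period_s] and is updated in place.
    """
    fired = []
    for name, slot in schedule.items():
        if now_s >= slot[0]:
            fired.append(name)
            slot[0] = now_s + slot[1]  # next run is one period from now
    return fired

# Placeholder periods: poll WC2 every 10 s, the server every 600 s.
schedule = {"WC2": [0, 10], "SERVER": [0, 600]}
for now_s in (0, 5, 10, 20, 600):
    print(now_s, due_actions(now_s, schedule))
```

The point is only that the loop keeps running between sends, so other logic is never held up by a long delay.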

CAI_Support · Mar 17, 2014

Those two WEBSET WC2 calls: if the WC2 board is powered off and not responding, each will take six seconds to time out, so it will take 12 seconds to get to the server.
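
The arithmetic can be sketched in Python as below. This is only a model under the assumptions stated in this thread: requests are handled strictly one after another, and an unanswered request waits the full six seconds. The 0.05 s reply time for a live target is an assumed figure, and `start_times` is a hypothetical helper.

```python
TIMEOUT_S = 6.0  # fixed WEBSET timeout described in this thread
REPLY_S = 0.05   # assumed reply time for a live target, not a measured value

def start_times(requests, alive):
    """Return (target, seconds before it is attempted) for each request."""
    clock = 0.0
    schedule = []
    for target in requests:
        schedule.append((target, clock))
        # A dead target costs the full timeout; a live one only the reply time.
        clock += REPLY_S if alive.get(target, False) else TIMEOUT_S
    return schedule

print(start_times(["WC2", "WC2", "SERVER"], {"WC2": False, "SERVER": True}))
```

With WC2 powered off, the SERVER request is not attempted until 12 seconds in; with WC2 alive it goes out almost immediately.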

Efried's original problem was that his code keeps pumping more WEBSET requests into the queue. He probably did not realize there is a sending queue, documented in the user guide, with very limited size. Once more WEBSET requests are pushed in than the network can actually send out, the later requests push the earlier ones off the queue.

I think in this case his server could alert him that something is going wrong, because it did not receive the expected WEBSET call. Then he can look into the cause of the problem.

rossw · Mar 17, 2014

CAI_Support said:
I think in this case his server could alert him that something is going wrong, because it did not receive the expected WEBSET call. Then he can look into the cause of the problem.

I think the greater problem is conceptual. The program should (as far as possible) continue to operate when the outside world has gone to hell in a handbasket.

As an example, when exchanging information with the "server", I always clear the result first, and check that I get an acknowledgement back. That means, not a zero. An actual response I can test for that can't happen by accident.

The WC, upon NOT getting back a confirmation response, can choose to re-try, generate an alarm, go to a failsafe/shutdown mode or ignore it, depending on the application and on how severe a problem not talking to the server is.

For example: my tracker controller sends status info to the server (last actuator currents etc) and requests the current sun position. If it doesn't get a reply, it uses the last position plus a little to allow for the gradual progression of the sun. Not exact, but close. But if it can't get a response after an hour it calls for help. Far better to be "close" than simply give up!
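
In rough Python terms the fallback logic looks something like the sketch below. It is not the actual controller code: the class, the field names and the drift rate passed in are all illustrative assumptions, and `reply_deg=None` stands for "the cleared result was never overwritten by a real acknowledgement".

```python
ALARM_AFTER_S = 3600  # call for help after an hour without a reply

class SunTracker:
    def __init__(self, position_deg, now_s):
        self.position_deg = position_deg  # last known or estimated position
        self.last_update_s = now_s        # when position_deg last changed
        self.last_reply_s = now_s         # when the server last answered

    def step(self, reply_deg, now_s, drift_deg_per_s):
        """Use the server's answer if there is one, otherwise estimate."""
        if reply_deg is not None:
            self.position_deg = reply_deg
            self.last_update_s = self.last_reply_s = now_s
            return "ok"
        # No reply: nudge the last position along by the expected drift.
        self.position_deg += drift_deg_per_s * (now_s - self.last_update_s)
        self.last_update_s = now_s
        if now_s - self.last_reply_s >= ALARM_AFTER_S:
            return "alarm"
        return "estimated"

tracker = SunTracker(position_deg=100.0, now_s=0)
print(tracker.step(None, 600, 0.01), tracker.position_deg)
```

Even in the "alarm" state the estimate keeps advancing, so the tracker stays close rather than giving up.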

Efried · Mar 18, 2014

CAI_Support said:
Those two WEBSET WC2 calls: if the WC2 board is powered off and not responding, each will take six seconds to time out, so it will take 12 seconds to get to the server.

Efried's original problem was that his code keeps pumping more WEBSET requests into the queue. He probably did not realize there is a sending queue, documented in the user guide, with very limited size. Once more WEBSET requests are pushed in than the network can actually send out, the later requests push the earlier ones off the queue.

I think in this case his server could alert him that something is going wrong, because it did not receive the expected WEBSET call. Then he can look into the cause of the problem.

rossw said:
Not from the example you cited.

start:
     webset WC2
     webset WC2
     webset SERVER
     delay 6000
end

is in a loop/cycle, but there isn't enough delay after the webset WC2 to allow the stack to time out before sending to SERVER, notwithstanding the 6 second delay AFTER, whereas

start:
     webset WC2
     webset WC2
     delay 6000
     webset SERVER
end

won't attempt to webset to SERVER until AFTER the calls to WC2 have a chance to time out.

(Of course, you wouldn't write the code in this way, you'd use different timing logic to avoid a "blocking delay")

Thanks. Yes, my implementation is more complicated than that: it only websets the server every 10 minutes, while the other WC is webset every 10-30 seconds. Maybe that is the problem: I do have long blocking delays, and perhaps the six-second timeout does not include that delay time?
Another possibility would be to shift some of the cycling logic to the second WC and only sync times.

CAI_Support · Mar 18, 2014

Because WEBSET uses TCP, a reliable communication protocol, your server should not miss a heartbeat. If it does miss one, it should raise an alert.

In PLC coding, please pay attention to the fact that the WEBSET queue has a limited size. Continuing to send new WEBSET requests will push out old requests. Because RAM is limited, we had to decide whether to let newer requests push out older ones or to refuse new requests when the queue is full. We decided to let new requests push out older ones because, in an emergency, the later request may be more critical than the earlier ones.
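
For anyone who wants to see the effect, the "newest pushes out oldest" policy can be modelled in Python with a bounded deque. This is a sketch of the policy only, not the firmware: the queue size of 3 is a made-up number (check the user guide for the real limit) and `enqueue_all` is a hypothetical helper.

```python
from collections import deque

def enqueue_all(requests, size):
    """Push requests into a bounded queue; return (kept, dropped)."""
    queue = deque(maxlen=size)
    dropped = []
    for request in requests:
        if len(queue) == queue.maxlen:
            dropped.append(queue[0])  # the oldest entry is about to be evicted
        queue.append(request)         # deque discards from the left when full
    return list(queue), dropped

# Assumed size of 3: the early SERVER request is lost before it is ever sent.
kept, dropped = enqueue_all(["SERVER", "WC2 a", "WC2 b", "WC2 c"], size=3)
print("kept:", kept)
print("dropped:", dropped)
```

If nothing drains the queue while WC2 is timing out, the request that gets evicted is the one that was waiting longest, which here is the SERVER call.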

Efried · Mar 18, 2014

CAI_Support said:
Because WEBSET uses TCP, a reliable communication protocol, your server should not miss a heartbeat. If it does miss one, it should raise an alert.

In PLC coding, please pay attention to the fact that the WEBSET queue has a limited size. Continuing to send new WEBSET requests will push out old requests. Because RAM is limited, we had to decide whether to let newer requests push out older ones or to refuse new requests when the queue is full. We decided to let new requests push out older ones because, in an emergency, the later request may be more critical than the earlier ones.

Thanks. My question was whether I have to use less DELAY because it interrupts/delays the TIMEOUT.

CAI_Support · Mar 18, 2014

An interrupt or delay does not change the WEBSET timeout value, at least in the current firmware. If the server side does not respond, the WC will retry for six seconds before considering the server dead and moving on to the next request.
