WC8 Watchdog?

rossw

Can anyone tell me the exact operation of the watchdog in the WC8?
 
I've a board that has twice (in the last 4 days) become "non-responsive". One time I was 500km away and unable to get to it to fix. The other times, a simple powercycle has restored operation.
 
The green LED was blinking, but the board wasn't responding to ICMP pings, wasn't sending data and wasn't responding to webget commands over the network. I cannot tell if the logic was running or not. (Perhaps I need to add code to blink an LED or something?)
 
I'd hoped that a watchdog timeout would in effect force the board to reboot (get a network address, start everything up from scratch), but perhaps it does not?
 
Hi Ross,
 
The watchdog timer checks whether the CPU has stopped running code for a certain length of time. If the green LED is still blinking, the CPU watchdog timer must be seeing the CPU working properly, since the green LED is driven regularly by the scheduler.
 
ICMP ping and HTTP are handled by the same network stack. In the WC8, the buffer for network handling is small. When that buffer is full, no network traffic is sent or received until space frees up. The TCP stack never triggers the watchdog timer when the network buffer is full or overflows; the TCP logic keeps retrying until the network timer times out. On the General tab, disabling web page polling reduces network traffic and so frees more buffer space for the traffic you actually want. If you have a busy network, reducing broadcast packets on the network also reduces the amount of network buffer in use.
 
If it were purely network traffic, I'd expect it to "come back" when the traffic eased, but it doesn't.
 
Assuming the board I have in there is recent enough to have the "reboot" command, will a reboot cause the WC8 to completely reinitialize the network stack, acquire a new lease, etc?
 
Is a work-around for this, for my code to ensure it is getting a regular EXTERNAL stimulus, and if not, to do a software reboot?
I don't want to have to go to the trouble of building an external hardware watchdog that just powercycles the entire device...
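
A minimal sketch of the logic behind that work-around, written in Python for illustration rather than in WC8 PLC code. It assumes the board's program can note each external stimulus and can call a reboot action; `HeartbeatWatchdog`, `stimulus`, `check` and the `reboot` callback are hypothetical names, not part of the WC8 firmware.

```python
import time


class HeartbeatWatchdog:
    """Triggers a reboot action if no external stimulus arrives within timeout_s."""

    def __init__(self, timeout_s, reboot, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.reboot = reboot  # hypothetical stand-in for the board's reboot command
        self.clock = clock    # injectable so the logic can be tested without waiting
        self.last_seen = clock()

    def stimulus(self):
        """Call whenever the external host has been heard from."""
        self.last_seen = self.clock()

    def check(self):
        """Call periodically. Returns True if a reboot was triggered."""
        if self.clock() - self.last_seen > self.timeout_s:
            self.reboot()
            # Restart the window so the reboot is not retriggered immediately.
            self.last_seen = self.clock()
            return True
        return False
```

With a host that polls every 30 seconds, a timeout of a few missed polls (say 90 to 120 seconds) would avoid rebooting on a single dropped request. Note that this only helps if the board's logic is still running while the network side is stuck, which the thread has not established.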
 
Ross,
 
A software reboot restarts the whole firmware, including the network stack.
 
Although the board itself does not send much, external traffic probably does not stop, since the TCP stacks on computers and routers always retry by design. If there is not enough RAM to respond to the first ping, there will not be enough RAM to respond to the next network request either, until all of the senders give up and time out.
 
If possible, design the system so that the WC8 sends the first network request to another host, rather than having external hosts overwhelm the WC8's small stack. Once the WC8 has started talking to other hosts on the network, it will be able to handle external network requests. On the General tab, disable Web Polling if you do not need to watch the web GUI all the time; that can save a lot of network traffic.
 
Wayne, there are 4 other webcontrol boards on this same switch.
None of them have faltered.
 
The webcontrol boards are firewalled (at my router) and won't see external traffic directed at them, and the boards only get periodic requests from a trusted host (which is also on the same local network) - collecting data via a call to getall.cgi once every 30 seconds.
 
I find it highly unlikely its little stack is being overwhelmed by traffic, given I've got other boards handling significantly more frequent requests (typically every 2-3 seconds) and not experiencing issues. The board that was in place for the last 3 years has been running 24/7, using the same code, and very rarely stopped responding, until it just went nuts a few months ago and was replaced. The new board has been working ok (much later firmware, and more recent hardware revision) - just with this intermittent locking up.
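
For catching the lock-ups as they happen, the trusted host's 30-second poll could also count consecutive failures. This is only a sketch: the board address passed to `fetch_getall` is a placeholder, `BoardMonitor` and `max_failures` are hypothetical names, and it assumes `getall.cgi` answers a plain HTTP GET without authentication.

```python
import urllib.request


def fetch_getall(host, timeout_s=5.0):
    """Fetch getall.cgi from a board; raises OSError on timeout or refusal."""
    url = f"http://{host}/getall.cgi"
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.read().decode("ascii", errors="replace")


class BoardMonitor:
    """Counts consecutive failed polls and flags the board as unresponsive."""

    def __init__(self, fetch, max_failures=3):
        self.fetch = fetch              # zero-argument callable that polls the board
        self.max_failures = max_failures
        self.failures = 0

    def poll(self):
        """Poll once. Returns True once max_failures polls in a row have failed."""
        try:
            self.fetch()
        except OSError:                 # urllib's URLError and socket timeouts are OSErrors
            self.failures += 1
        else:
            self.failures = 0
        return self.failures >= self.max_failures
```

Usage would be along the lines of `BoardMonitor(lambda: fetch_getall("192.0.2.10"))` with `poll()` called every 30 seconds; the timestamp of the first failure would at least show when the board stopped answering.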
 
Which firmware version runs better for you? Which firmware version is on the board that has been having problems lately? It is strange that one board's behavior could deteriorate over time. Microchip guarantees its processors to work for 99 years without measurable differences, except that the EEPROM can be worn out by repeated writes exceeding its life cycle.
 
Any added feature reduces the amount of RAM available to all processes, including the network stack. The latest official release, which removed support for 1-wire devices other than the DS18B20, is supposedly the one using the least RAM, and so leaves more RAM for the network. If you can update to that firmware version, please let us know if that helps.
 
The new board is version v03.02.07.
The old board was one of the ancient old original ones, h'ware rev 2.0.2 - can't easily tell what firmware it had.
 
(On that, I've had several of those old original boards that have just gone nuts with temperatures... start a little intermittent, and just get progressively worse, impossible high and low temperatures across all sensors. Replace the board (but not the sensors or power supply) and all comes good.)
 
I see. v3.02.07 is pretty old firmware for hw rev 2.2.2 boards. I would recommend updating to the 3.02.29 firmware to see if that works better.
 
A fine idea... except that 3.02.07 isn't field-upgradable.
 
I've got 12 or 15 boards ready to send back to you, I've removed as many as I can do without for the moment, send these in for upgrade, then when they come back, replace some of the ones still in service with them, and send in a second batch....
 
It's never easy!
 
Ross, so sorry that the earlier firmware did not have a bootloader. The only way to get a bootloader onto those boards is to send them in for reprogramming. We do not charge for reprogramming older boards without a bootloader; for most customers, we only charge the shipping and handling cost of getting the board back to the user.
 