Ethernet trouble/restore after recent firmware updates.

CORT

With a recent round of firmware updates, my house's M1 system has gone a bit haywire.  My system now logs extremely frequent ethernet trouble/restore messages.  As soon as I arm the system, the cycle of ethernet trouble/restore goes away.  When I disarm the system the cycle of ethernet trouble/restore returns within 35 minutes.  It seems like there is something wrong with the communication between the M1 and the XEP when the system is disarmed.  I replaced the RS-232 cable and even tried a different M1 panel and a different XEP using my same config with no improvement.  I noticed the ethernet troubles after updating the firmware on my M1 and M1XEP (the M1 was updated to 5.3.10 and the XEP to 2.0.46).  To the best of my knowledge, this problem did not exist prior to the firmware updates.  I would like to keep the updated firmware if possible.
 
Has anyone else encountered this problem?
 
Regards.
 
 
 
There are a couple of old threads on Cocoontech about M1XEP trouble/restore problems.  But not quite like yours where armed state makes it go away.
 
http://cocoontech.com/forums/topic/22734-m1xep-ethernet-troubleethernet-restore/
http://cocoontech.com/forums/topic/19740-m1xep-monitoring-errors/
 
Are you using the XEP for central station monitoring?
 
It's a bit strange that the problem persists with a different M1 and XEP.  That makes me think it could be a network problem that prevents the XEP from receiving a response from the CS.   But why that would be different depending on armed/disarmed state leaves me puzzled. 
 
Thank you for the reply.  At the moment, I do not have CS monitoring, and there is no phone number in the monitoring field either.  The XEP is enrolled (which is why the system is logging and alerting ethernet trouble/restore).  I have looked at this board and Elk's support forum extensively trying to figure out my system's problem.  I have tried powering the XEP from a different 12 volt wall wart PS.  I think I have even stumped BW at Elk.  My system is a bit unusual in that the house's basement is defined as a separate "area" from the main house--the basement is left armed when vacant.  I am beginning to wonder if this is part of the problem--perhaps the updated firmware handles reporting to the XEP differently from the old firmware, causing a miscommunication across the RS-232 serial bus.  Tracing the RS-232 bus with Elk's M1XEP utility shows a tremendous amount of "chatter" (zone reporting and handshaking, I guess) that mostly goes away when the system is armed.  I don't know if it is the arming that makes the reports go away or if it is the lack of activity in the house that comes with arming.  I will leave the system unarmed one night and see what happens.
 
My understanding is that the trouble/restore events happen when the XEP fails to get a response to a ping to the CS.  If you aren't set up for CS monitoring, I'm not sure what the XEP would be trying to ping.
 
Is the M1 configured to communicate with any home automation system via ethernet?  Not sure why there would be RS232 traffic showing zone reporting otherwise.
 
No CS monitoring here for years and no previous XEP comm trouble, except...
 
.. I recently added some new rules that caused my Elk to restart. I would lose comm with the XEP whenever a particular rule activated. Disabling the rule resolved the issue. The subject rule is just a simple counter update.
 
I too have two "independent" areas defined. I've already reached out to Elk about the problem, but I wonder if our issues are related.

That said, my issue started after the rule update. I was running the previous XEP firmware. I updated the XEP firmware as a result, but it did not resolve the issue.
 
Does the M1 reboot every time the rule is activated?   If so, that makes me think that somehow you've created an endless loop in the rules where your last-added rule interacts with some other rule.  Processing of the rules never ends, and eventually the watchdog timer times out and causes a reboot.
 
It appears to reboot every time, and it's immediate by human perception. Thank you for the suggestion. I'll have to study my logic again or maybe refactor it, but I don't think it's an infinite loop. I'm not back on site for a while, so my troubleshooting abilities are limited, and since the panel is working now I'm kinda afraid to mess with it too much until I can get back on site anyway.
 
I don't want to hijack this thread, but I have another issue that started at the same time: whenever I connect ElkRP to the panel it complains of rule and text conflicts even though nothing changed. I think these issues on my panel are related to each other; I just don't know whether they're related to the OP's issue. Maybe I need to start another thread.
 
I am the original poster of this thread and want to leave a bit of an update in case someone else is having similar troubles. I haven't figured out my M1 system's problem, but I have learned how to work around some of the issues. At the moment, my observations are not certain, so take what I say with a grain of salt. I misidentified the cause when I blamed the firmware updates. Instead, I think that adding a zone or two around the same time as the firmware updates was the tipping point. Years ago, I configured the system with two areas--one area for the main house and another area for the unfinished walk-out basement. The unfinished walk-out basement is left armed most of the time and only disarmed when needed.

Here is what is going on: When both areas are disarmed, my M1 reports ethernet trouble and ethernet restore with extreme frequency. When the basement area (common to the main house) is armed, the cycle of ethernet trouble/restore goes away. Also, if the basement is disarmed, I have trouble connecting to the M1 through the XEP.  This problem also goes away when the basement is armed. What I have discovered is that my M1 constantly reports the status of all zones across the RS232 serial bus when the basement is disarmed. When the basement area is armed, the chatter across the RS232 bus drops to only real zone changes. In other words, common area disarmed results in a constant cycle of full zone reporting, and common area armed results in only zone status changes. I don't know why it does this. Sniffing the RS232 bus, I can see the handshake between the M1 and XEP when the common area is armed, but I cannot see the handshake when the common area is disarmed. I believe there is too much data going across the RS232 serial bus with the constant cycle of full zone reporting, and it disrupts the Q2 min requirement of handshaking between the M1 and XEP. My suspicion is that a large number of zones being constantly and fully reported across the RS232 serial bus takes too much time and interferes with the handshake. As of this writing, I have just turned off the error beeping and learned to live with the issues. My guess is that the troubles would go away if I set the globals to not report this much data across the RS232 serial bus. I have more troubleshooting to do with the system, so there will be more to the story.
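
For anyone wanting to sanity-check the "too much data on the serial bus" theory, a rough utilization estimate may help. The sketch below is only back-of-the-envelope arithmetic: the report rate, the bytes per report, and the baud rates are all made-up placeholder values, not figures measured from this system or taken from Elk documentation. It also assumes 8N1 framing (10 bits on the wire per byte).

```python
def line_utilization(reports_per_second, bytes_per_report, baud, bits_per_byte=10):
    """Fraction of a serial link's capacity used by a steady stream of reports.

    bits_per_byte=10 assumes 8N1 framing (1 start + 8 data + 1 stop bit).
    A result above 1.0 means the stream cannot fit on the link at all.
    """
    if baud <= 0:
        raise ValueError("baud must be positive")
    return reports_per_second * bytes_per_report * bits_per_byte / baud


# Hypothetical numbers for illustration only (not measured from an M1):
# 50 zones, each re-reported 4 times per second, 12 bytes per report.
for baud in (9600, 115200):
    share = line_utilization(50 * 4, 12, baud)
    print(f"{baud} baud: {share:.0%} of link capacity")
```

Plugging in a real report rate and message size from a captured trace would show whether the link is actually close to saturation, or whether the handshake is more likely being lost for some other reason (for example, the panel being too busy to send it).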
 
It sounds like you've done some good detective work on figuring out what's going on.  Your theory on why that causes timeouts is reasonable. It's strange how having an area armed or disarmed changes the behavior.
 
One thing I'm puzzled about is that the Elk description for the global settings to send RS232 data makes it seem like the M1 should only be sending status updates when there is a change in zone status.  Yet, from what you see, it appears to be sending data all the time.  Are you seeing requests from the XEP for data?  What data packet type is the M1 sending out to the XEP?
 
Have you tried contacting Elk to discuss this latest info with them?
 
I am reviving an old thread. I never figured out a solution to my system's problem, until today. My system is configured with the house's main floor as area 1 and the unfinished basement as area 2. I keep the basement (area 2) armed at all times unless someone needs to use the basement workshop/mechanicals room. Normally I have the main house (area 1) common to the basement (area 2), which requires the basement to be armed in order to arm the main house. This commonality keeps me and my family from forgetting to arm the basement if we arm the main house.

Here is the problem: when the basement (area 2) is disarmed, I get errors from the M1 stating Ethernet connection is lost and zone temperature sensors are inaccurate (like minus 32 degrees). When area 2 is armed, those errors go away. It turns out the M1 spews status reports of the first 50 or so zones repeatedly across the serial bus, regardless of zone change, when area 2 is disarmed. This constant reporting of zones interferes with the XEP handshake and the display of zone temperature sensors.
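
One way to confirm that reports are being sent "regardless of zone change" is to compare each report in a captured trace against the last status seen for that zone. The following is a minimal sketch, assuming the trace has already been parsed into `(zone, status)` pairs; that layout and the status strings are hypothetical and are not the M1's actual serial message format.

```python
def split_reports(reports):
    """Separate genuine zone changes from repeats of an unchanged status.

    `reports` is an iterable of (zone, status) pairs in capture order.
    The first report seen for a zone is counted as a change, since there
    is nothing earlier to compare it with.
    """
    last_status = {}
    changes, repeats = [], []
    for zone, status in reports:
        if zone in last_status and last_status[zone] == status:
            repeats.append((zone, status))
        else:
            changes.append((zone, status))
        last_status[zone] = status
    return changes, repeats


# Made-up trace for illustration: zone 1 is re-reported without changing.
trace = [(1, "normal"), (2, "normal"), (1, "normal"), (2, "violated"), (1, "normal")]
changes, repeats = split_reports(trace)
print(len(changes), "changes,", len(repeats), "repeats")
```

A trace dominated by repeats while area 2 is disarmed, and by changes while it is armed, would match the behavior described above.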

As a workaround, I disabled zone reporting from globals in RP, but this was not a good solution, since it meant eKeypad was no longer able to report zones. With the recent availability of new firmware 5.3.14, my interest was rekindled. I installed the new firmware and enabled zone reporting in RP globals. The new firmware did not affect zone reporting and did not fix my system's problem.

Here is what helped. I disabled area 1 commonality to area 2, and this finally fixed the problem. When area 1 is not common to area 2, the system behaves as it should: zone reporting across the serial bus settles down and the constant cycling of zone reports ceases. When area 1 is common to area 2 and area 2 is disarmed, the M1 constantly cycles zone reports across the serial bus whether or not any zone has changed. I believe this is a bug in firmware 5.3.10 and 5.3.14. It only took me a year and a half to figure it out.
 
Good work on figuring that out! It sure is a strange interaction of different parts of your configuration. Please report what you've found to Elk and hopefully, they'll come up with a fix one of these days.
 