Email notifier board

StarTrekDoors · Jan 6, 2015

It would be nice to have 2-way communication like so many of the older systems from more than a decade ago.

I think your biggest concern in the long run will be the reliability of the email notifier. Once again this notifier board has gone offline and is no longer sending messages (via the internal network) under firmware 1.3. What I have found that will duplicate this failure: if you disconnect the network momentarily and then reconnect, the board will fail at random, at which point it won't reconnect without power cycling the entire OPII. Since I'm using a lab test unit, I can duplicate a real-world environment where the end user may not have power backup for the network during a long outage, so the system will require a restart even when all appears to be good ("if" the Omni stayed powered up on battery/UPS).

StarTrekDoors · May 28, 2015

So after a few months I find the email board and the latest updates don't seem to be helping with reliability. Email notification works for only about every 5th or 6th email event, so I started digging into why....

I found a significant number of errors on the network port that the email board is connected to. Cable is Cat 6 and tested fine, T-568B, short run (5 ft), no EMI, shielded case. I tried different ports and it still generates errors, so it's not a bad switch port.

Checking a new OPII board, I found it also has a few hundred network errors. As a matter of fact, the OPII and its email notifier board are the only two devices generating errors on the customer's network.

The complaint is that emails rarely go out, and it seems fairly common that emails never reach the email server to be sent. The email server is on the local network, so there are no routing issues.

OPII 3.25 firmware with Notifier 1.3

Anyone seeing similar?

CJW-PIC · May 29, 2015

May I ask what software created the chart?
Thanks

Frunple · May 29, 2015

I see you have GigE ports. Try hard coding to 10M full since that's all the OPII and email board are.
Seeing collisions like that makes me think it auto neg'd to 10 half.

StarTrekDoors · May 29, 2015

CJW-PIC said:
May I ask what software created the chart? Thanks

Cisco (took a couple of screenshots and pasted into Microsoft Paint so I could draw circles)


Frunple said:
I see you have GigE ports. Try hard coding to 10M full since that's all the OPII and email board are.
Seeing collisions like that makes me think it auto neg'd to 10 half.

You are correct that the Omni ports are 10M-Half, which just so happens to be the manual port setting here. Auto-negotiate also detected them as 10M-Half, but yielded the same bad behavior as the manual port settings.

Frunple · May 29, 2015

StarTrekDoors said:
You are correct that the Omni ports are 10M-Half, which just so happens to be the manual port setting here. Auto-negotiate also detected them as 10M-Half, but yielded the same bad behavior as the manual port settings.

Make them 10M FULL. Half duplex will never work right.

StarTrekDoors · May 29, 2015

Frunple said:
Make them 10M FULL. Half duplex will never work right.

Unfortunately neither the OPII nor the Notifier will go into full duplex, only half so forcing the port to full duplex does nothing. I would guess that is because of the antiquated chipsets used in Omni boards.

For grins I did force the port configuration to full duplex at 10M, but got the same result, and I would guess others are seeing the same errors.

I'm just saying some of the reliability issues could be partially due to the port errors, but even when there are errors, the notifier should still continue its attempts to notify until successful.....

I have found restarting the OPII or making "some" OPII updates will trigger the Notifier to start working again if it stops for a day or so but so far haven't found the combination of updates that triggers it back into semi-proper operation. It just gives up and may try every so often on a new event but old events are lost forever. Fortunately the logs show the events so at least there is a second method of finding "un-notified" events, but would be nice to be able to get notification working reliably close to 100%.
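To illustrate the behavior wished for above (keep retrying an event until it is delivered instead of dropping it), here is a minimal Python sketch. It is not how the Notifier firmware works; `send` is a hypothetical callable and the backoff numbers are arbitrary assumptions.

```python
import time
from collections import deque


def drain_queue(events, send, max_delay=300.0, sleep=time.sleep):
    """Deliver every queued event in order, retrying failures with backoff.

    `send` is a hypothetical callable that returns True on success.
    Nothing is dropped: a failed event stays at the head of the queue.
    Returns the total number of send attempts.
    """
    queue = deque(events)
    delay = 1.0
    attempts = 0
    while queue:
        attempts += 1
        if send(queue[0]):
            queue.popleft()      # delivered, move on to the next event
            delay = 1.0          # reset the backoff after a success
        else:
            sleep(delay)         # wait before retrying the same event
            delay = min(delay * 2, max_delay)
    return attempts
```

The point of the design is that old events are never lost: the queue only shrinks when a send succeeds, so a network outage just delays delivery.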

Frunple · May 30, 2015

They absolutely will go to full duplex, and it will fix the problem.
If you want me to prove it, I'll set mine to half duplex, clear the counters, and I can show you the errors come pouring in.

StarTrekDoors · Jun 3, 2015

Frunple said:
They absolutely will go to full duplex, and it will fix the problem.
If you want me to prove it, I'll set mine to half duplex, clear the counters, and I can show you the errors come pouring in.

My understanding is that the OPII is "supposed to be 100M Full Duplex", however, it negotiates 10 Half because of chipset coding - it's not working as it should. I didn't mean that the OPII/EN was only half duplex, but even at 10M Full - that is wrong from what it is "supposed to be". Given others have also had network issues with the OPII, I believed the same errors with the Email Notifier are likely related to using the same chipset.

When forced to 10M Full, the Email Notifier can generate FCS errors due to the switch port and the OPII and Email Notifier not working properly with the switches.
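FCS errors on a port forced to full duplex are a classic sign that the far end is still running half duplex. A rough Python sketch of that rule of thumb follows; the function name, argument names, and the "any error counts" threshold are assumptions for illustration, and the counters would have to be read from the switch by hand.

```python
def suspect_duplex_mismatch(switch_duplex, fcs_errors, late_collisions):
    """Rough heuristic applied to one switch port's counters.

    A port running full duplex never backs off, so when the far end is
    really half duplex its aborted frames show up as FCS errors.
    A port running half duplex tends to log late collisions instead.
    """
    if switch_duplex == "full":
        return fcs_errors > 0
    if switch_duplex == "half":
        return late_collisions > 0
    raise ValueError("switch_duplex must be 'full' or 'half'")
```

For example, `suspect_duplex_mismatch("full", fcs_errors=120, late_collisions=0)` flags the port, while a half-duplex port with ordinary (not late) collisions and no late collisions does not.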

In the case of your Linksys, perhaps the errors are not there, perhaps that works better with the OPII chipset, but I would wonder why you wouldn't run it at full speed if there are no errors. If you did, would that cause the EMail Notifier to duplicate the reliability issues some are reporting?

Hard coding ports instead of leaving them on auto-negotiate means not being able to switch port connections around without everyone remembering that a limitation has been placed on a couple of ports. It also means that when moving from one customer to the next, there is no consistency of operation on the switch of the day. It's not too difficult to make a note that this customer and that customer required static settings on switch ports 7, 10, 31, and 42, but it would be nice if they didn't have to remember it too, so that when "they" change the network, they don't have to remember that their "should be very advanced system" can't negotiate port speed and duplex correctly.

Seems today there are a lot of complaints about the EN, with some likes, and seems like most everyone likes the NTP function. I just hoped for FULL functionality "reliability" (pardon the pun) and hoping Leviton will resolve these complaints someday. :mellow:

Frunple · Jun 4, 2015

First off, I don't use Linksys.

Second, the OPII is 10M only. Nowhere has it been said or implied that it is a 100M port. It's not. They figured there was no need for more than 10M so that's what it is. This is the hardware I'm talking about, it's only a 10M port.
You should not be "switching ports" as you said. Plug it in and leave it there. Can you imagine a business-size network with people wanting to change the port they're on?? Complete nightmare. That's why you use a managed switch, so you can adjust each port as needed.
Auto negotiation is not a standard between vendors. You can not believe the problems it causes. In a case like this, just hard code it and be done with the problems. It's too easy of a fix. Why argue it?
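One well-known way hard coding can go wrong is when only one end is forced. A side that is still auto-negotiating hears no negotiation from a forced partner and falls back to half duplex. The Python sketch below models that under simplifying assumptions: a forced port sends no negotiation signals, and any side set to "auto" is capable of full duplex. Whether the OPII actually negotiates is not established in this thread.

```python
def resolve_duplex(side_a, side_b):
    """Return the duplex each end of a link settles on.

    Each argument is "auto", "half", or "full". Simplifying assumption:
    any side set to "auto" is capable of full duplex.
    """
    valid = {"auto", "half", "full"}
    if side_a not in valid or side_b not in valid:
        raise ValueError("each side must be 'auto', 'half', or 'full'")

    def settle(me, peer):
        if me != "auto":
            return me        # a forced setting is used as-is
        if peer == "auto":
            return "full"    # both negotiate, best common mode wins
        return "half"        # peer is silent, so fall back to half duplex

    return settle(side_a, side_b), settle(side_b, side_a)


def is_mismatch(side_a, side_b):
    """True when the two ends would end up on different duplex settings."""
    a, b = resolve_duplex(side_a, side_b)
    return a != b
```

In this model `resolve_duplex("full", "auto")` gives `("full", "half")`, a mismatch, while forcing both ends the same way or leaving both on auto does not.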

pete_c · Jun 4, 2015

Oddest thing...this is what I see with the network interface on the microrouter with the following connection to the OPII interface. Doesn't make sense to me.

Microrouter ethernet link to OPII

root@ICS-TPLink-MR:~# dmesg | grep -e eth1 -e bcm
[    2.060000] eth1: Atheros AG71xx at 0xba000000, irq 5, mode:GMII
[    8.380000] eth1: link up (1000Mbps/Full duplex)
[    9.750000] eth1: link down
[   21.510000] device eth1 entered promiscuous mode
[   23.610000] eth1: link up (1000Mbps/Full duplex)

Goofing around connected the OPII to one port on the PFSense firewall using an Intel Gb Nic.

Intel NIC to OPII

em2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
   options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
   ether xx:xx:xx:xx:xx:xx
   inet 192.168.245.249 netmask 0xfffffff8 broadcast 192.168.245.255
   nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
   media: Ethernet autoselect (10baseT/UTP <half-duplex>)
   status: active

Media 10baseT/UTP <half-duplex>
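For anyone collecting this across several boxes, the `media:` line in the ifconfig output above can be pulled apart with a small Python helper. This is only a sketch written against the one line shown here; `parse_media` and the dictionary keys are names made up for illustration.

```python
import re

# Matches e.g. "media: Ethernet autoselect (10baseT/UTP <half-duplex>)"
MEDIA_RE = re.compile(
    r"media:\s+Ethernet\s+(\S+)(?:\s+\((\S+)(?:\s+<([^>]+)>)?\))?"
)


def parse_media(line):
    """Split an ifconfig media line into configured and active settings.

    Returns None when the line is not a media line.
    """
    match = MEDIA_RE.search(line)
    if match is None:
        return None
    configured, active, options = match.groups()
    return {"configured": configured, "active": active, "options": options}
```

Here "configured" is what the interface was set to (autoselect) and "active" plus "options" are what the link actually came up as.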

StarTrekDoors · Jun 6, 2015

Frunple said:
First off, I don't use Linksys ............... just hard code it and be done with the problems.

Hard coding the port in the managed switch generates errors.

My apologies, the screenshot you supplied looks like Linksys' switch interface which caused my confusion.

StarTrekDoors · Jun 6, 2015

pete_c said:
Oddest thing...this is what I see with the network interface on the microrouter with the following connection to the OPII interface. Doesn't make sense to me.

Microrouter ethernet link to OPII

root@ICS-TPLink-MR:~# dmesg | grep -e eth1 -e bcm
[    2.060000] eth1: Atheros AG71xx at 0xba000000, irq 5, mode:GMII
[    8.380000] eth1: link up (1000Mbps/Full duplex)
[    9.750000] eth1: link down
[   21.510000] device eth1 entered promiscuous mode
[   23.610000] eth1: link up (1000Mbps/Full duplex)

Goofing around connected the OPII to one port on the PFSense firewall using an Intel Gb Nic.

Intel NIC to OPII

em2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
   options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
   ether xx:xx:xx:xx:xx:xx
   inet 192.168.245.249 netmask 0xfffffff8 broadcast 192.168.245.255
   nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
   media: Ethernet autoselect (10baseT/UTP <half-duplex>)
   status: active

Media 10baseT/UTP <half-duplex>

That's similar to what I am seeing on several systems as well (different switches). OPII/Notifier is advertising each operates at only half duplex, apparently by design. While there's nothing wrong with that, the reliability issues may be related to the network weaknesses, but I haven't been able to confirm that yet.

Frunple · Jun 6, 2015

StarTrekDoors said:
Hard coding the port in the managed switch generates errors.

My apologies, the screenshot you supplied looks like Linksys' switch interface which caused my confusion.

Having a rough day so I apologize for redundancy but have you tried another cable? another port?
If hard coding still gives errors, there's a physical problem with something. Change the cable first, even if you know it works with another device, try another.
I wasn't kidding, I can set mine to auto or hard code to 10M Half and I get tons of errors too, so I figured 10 Full would be the fix.
Do you have another switch you can try? Maybe a bad port on Omni?

pete_c · Jun 6, 2015

Yeah here played with the TP-Link managed switch initially.

I did find an issue with the network tools on Openwrt getting a good read of the connection.
