Daqf Crash When Field Device Lost?

CraigTerris · December 16, 2014

1. Even after rebooting the DAQF PC and checking with Task Manager that there are no apps running, the message 'It appears you are trying to run a second instance of DAQF.......' comes up. Is there any known reason for this, or is it known to cause any problems?

2. The loss of comms from one field device (a yet to be determined problem with the device) causes DAQF to cease responding completely and it has to be restarted. Should this be the norm?

AzeoTech · December 16, 2014

1. It happens on some computers. The method used to determine if a second instance is running sometimes gets confused. Just turn off the second instance check by going to File - Preferences.

2. That should not happen unless you are doing your comms from the primary thread of the app (i.e. inside a screen component). What release of DAQFactory are you running? And maybe you can email us the .ctl.

CraigTerris · December 16, 2014

1. Thanks

2. Emailed. It's Pro release 587b, Build 2048.

AzeoTech · December 17, 2014

OK, got the file. Which device, and exactly how does it fail? Do you have any screen components that directly access the device (e.g. with device.myDevice.readHolding() in the expression)?

CraigTerris · December 17, 2014

Lighthouse_Generator on 192.168.1.104 stops responding every few days and DAQF has to be restarted. Between DAQF and this device there is a MOXA EDS-510 switch in the field at the end of SM fibre, then a radio link comprising a pair of Banner DXER9 radios out to an ADAM I/O module at the generator. The radios are on a short, full-strength, line-of-sight path. It's hard to tell exactly where the failure is, as once DAQF is restarted everything runs fine again. We do know that at the time of failure a Tx from DAQF does not get an Rx back at the near-end Banner radio, then there's a DAQF crash (presumably after the timeout period on x.104). This could be a hardware problem, though I'm more concerned about the DAQF crash, given the other stuff that's running on it. This is the only node we have on a radio link.

Tx (22:08:16.054): \000\000\000\000\000\006\000\002\000\004\000\001

Rx (22:08:16.063): \000\000\000\000\000\004\000\002\001\014

Tx (22:08:21.023): \000\000\000\000\000\006\000\002\000\003\000\001

Rx (22:08:21.029): \000\000\000\000\000\004\000\002\001\028

Tx (22:08:26.023): \000\000\000\000\000\006\000\002\000\003\000\001

Tx (00:55:06.043): \000\000\000\000\000\006\000\002\000\003\000\001

Tx (03:41:46.078): \000\000\000\000\000\006\000\002\000\003\000\001

Tx (06:28:26.113): \000\000\000\000\000\006\000\002\000\000\000\003

(then I arrive at work and restart it!)
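The timestamps on the unanswered Tx lines above allow a quick check of how long DAQF waited between attempts. The following is a minimal Python sketch (not DAQFactory script) that computes those gaps; the list `unanswered` is copied by hand from the trace, and the helper name `gaps_in_seconds` is made up for illustration.

```python
from datetime import datetime, timedelta

# Timestamps of the four Tx lines that got no Rx, copied from the trace above.
unanswered = ["22:08:26.023", "00:55:06.043", "03:41:46.078", "06:28:26.113"]


def gaps_in_seconds(stamps):
    """Return the gap between consecutive clock times, allowing for midnight rollover."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    gaps = []
    for earlier, later in zip(times, times[1:]):
        if later < earlier:  # the clock wrapped past midnight
            later += timedelta(days=1)
        gaps.append((later - earlier).total_seconds())
    return gaps


print(gaps_in_seconds(unanswered))
```

Each gap comes out at roughly 10,000 seconds, far longer than the 5-second spacing of the Tx lines before the failure. What that implies about the timeout or retry settings is not established in this thread.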

We don't have .readHolding functions, just a .readuntil in the wind data parsing sequence.

We do get occasional port locked messages. We did wonder how to pinpoint these!

12/18/14 07:36:50.003

P-ModbusTCP 0013: Port locked
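For anyone reading the Tx/Rx lines in this post, the frames can be split into their Modbus TCP fields with a few lines of Python. This is only a sketch under two assumptions: that the `\ddd` escapes in the monitor output are decimal byte values (the `\028` in the trace cannot be octal) and that any other character is a literal byte. The helper names `monitor_to_bytes` and `decode_mbap` are hypothetical.

```python
import re
import struct


def monitor_to_bytes(text):
    r"""Convert monitor text to bytes; \ddd is taken as a decimal byte (assumption)."""
    out = bytearray()
    for escaped, literal in re.findall(r"\\(\d{3})|(.)", text):
        out.append(int(escaped) if escaped else ord(literal))
    return bytes(out)


def decode_mbap(frame):
    """Split a Modbus TCP frame into its 7-byte MBAP header, function code and data."""
    transaction, protocol, length, unit = struct.unpack(">HHHB", frame[:7])
    return {
        "transaction": transaction,
        "protocol": protocol,
        "length": length,      # counts the unit id plus the PDU bytes
        "unit": unit,
        "function": frame[7],
        "data": frame[8:],
    }


# First request and reply from the trace above.
request = decode_mbap(monitor_to_bytes(r"\000\000\000\000\000\006\000\002\000\004\000\001"))
start, quantity = struct.unpack(">HH", request["data"])
print(request["function"], start, quantity)

reply = decode_mbap(monitor_to_bytes(r"\000\000\000\000\000\004\000\002\001\014"))
print(reply["function"], list(reply["data"]))
```

Read this way, the first Tx is function 2 (Read Discrete Inputs in the Modbus specification), start address 4, quantity 1, and the reply carries a byte count of 1 followed by one data byte.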

AzeoTech · December 17, 2014

OK, but how exactly does it crash? "Crash" has a lot of different meanings.

CraigTerris · December 17, 2014

The screen goes dim and there is no response to any mouse command. Windows Task Manager says DAQF is "Not responding".

SteveMyres · December 22, 2014

I do have calls in screen components to functions in a custom protocol, and DAQFactory does occasionally become unresponsive (defined as whited out screen, rotating cursor, and I believe the DF title bar says "Not responding" on a 3GB Win7 box).

But if this is the problem, I would have assumed that even if calls were made from within the UI thread and the controller failed to respond, control would revert to the UI as soon as the relevant port times out. It doesn't seem to do so.

AzeoTech · December 22, 2014

That depends on how it's called. If it's a click action, then yes, the UI will pause and then recover after the timeout. But if it's in the paint event or any of the expressions of the component, then the UI will keep calling for the device and thus remain continually sluggish or hung.
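The difference described above can be illustrated with a generic Python sketch. It is not DAQFactory code: `read_device` is a hypothetical stand-in for a device read that never gets a reply, and `TIMEOUT` is shortened so the demo runs quickly. The first function reads the device on every repaint, so each frame waits out the full timeout; the second only reads a value cached by a background thread.

```python
import threading
import time

TIMEOUT = 0.05  # stand-in for the port timeout, shortened for the demo


def read_device():
    """Hypothetical device read that gets no reply and blocks for the full timeout."""
    time.sleep(TIMEOUT)
    return None


def paint_with_direct_read(frames):
    """Every repaint waits on the device, so lost time grows with the frame count."""
    start = time.monotonic()
    for _ in range(frames):
        read_device()  # blocks the UI loop on each repaint
    return time.monotonic() - start


class CachedReader:
    """Polls on a worker thread and keeps the last good value for the UI to display."""

    def __init__(self):
        self.latest = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._poll, daemon=True)
        self._thread.start()

    def _poll(self):
        while not self._stop.is_set():
            value = read_device()
            if value is not None:
                self.latest = value

    def close(self):
        self._stop.set()
        self._thread.join()


def paint_with_cache(frames, reader):
    """Every repaint only reads memory, so a dead device cannot stall it."""
    start = time.monotonic()
    for _ in range(frames):
        _ = reader.latest
    return time.monotonic() - start


blocked = paint_with_direct_read(10)
reader = CachedReader()
cached = paint_with_cache(10, reader)
reader.close()
print(f"direct reads: {blocked:.2f}s, cached reads: {cached:.4f}s")
```

In DAQFactory terms, the cached approach roughly corresponds to displaying a periodically read channel in the component rather than calling the device from the component's expression.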

SteveMyres · December 22, 2014

Not a click event per se, but nothing relied upon to draw the component.

I believe the only components where I've seen this are numerical inputs. There is a panel whose action is a quick sequence. I get a value with System.EntryDialog() and if it's not empty, scale it and write to an address in the device. There is also a variable value display on the front of the panel that displays the current value of the register from a periodically read channel. This confirms that the user writes made it to the controller and in some cases the controller may also be writing to the same register (things that have an auto/manual mode, for instance). So the value displayed does depend on the write, but the only connection is via the controller, DF per se doesn't know they're linked.

I believe the timeout on the port is 2 seconds, but DF remains hung far longer than that, I believe typically indefinitely, till you kill it with Task Manager.

SteveMyres · December 22, 2014

....and bear in mind that I don't even know for a fact that it's a failure of the device to respond that triggers this (although I've only seen it or had it reported to me after a numerical input has been made), because DF is hung, so I can't do much to troubleshoot.

I do know that the device sometimes fails to reply or fails to return enough bytes, but my protocol driver is supposed to be able to handle that gracefully, and I know that it does often do so. Even so, my assumption is that this may be linked to the same behavior on the part of the device, but I don't actually know that for sure.

AzeoTech · December 22, 2014

Which DAQFactory release?

SteveMyres · December 22, 2014

v5.91 build 2203? I think

AzeoTech · December 22, 2014

Hmm. Most of the 5.9 stability issues seem to be on comm failure. Maybe the way we handle timeouts didn't carry through to the new compiler correctly.

CraigTerris · January 14, 2015

Plus, when it is working OK, the comms traffic is doubled up. This module is timed for once per second.

Tx (08:18:40.016): \000\000\000\000\000\006\000\004\000\004\000\002

Rx (08:18:40.019): \000\000\000\000\000\007\000\004\004\000E\000\000

Tx (08:18:40.032): \000\000\000\000\000\006\000\004\000\004\000\002

Rx (08:18:40.033): \000\000\000\000\000\007\000\004\004\000E\000\000

Tx (08:18:41.016): \000\000\000\000\000\006\000\004\000\004\000\002

Rx (08:18:41.017): \000\000\000\000\000\007\000\004\004\000I\000\000

Tx (08:18:41.032): \000\000\000\000\000\006\000\004\000\004\000\002

Rx (08:18:41.033): \000\000\000\000\000\007\000\004\004\000I\000\000

And when sending individual commands to a module, I regularly get the 'Unable to send: Port Locked' message
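To confirm the doubling over a longer capture, the monitor output can be scanned for identical Tx frames sent close together. The sketch below is plain Python, assumes the line layout shown in the trace above, and uses a hypothetical helper name `find_duplicate_tx` with an arbitrary 100 ms window.

```python
import re
from datetime import datetime

LINE = re.compile(r"^(Tx|Rx) \((\d\d:\d\d:\d\d\.\d{3})\): (.*)$")


def find_duplicate_tx(lines, window=0.1):
    """Return (first_time, repeat_time) for identical Tx frames sent within `window` seconds."""
    last_seen = {}  # payload text -> (parsed time, original time text)
    duplicates = []
    for line in lines:
        match = LINE.match(line.strip())
        if not match or match.group(1) != "Tx":
            continue
        stamp_text, payload = match.group(2), match.group(3)
        stamp = datetime.strptime(stamp_text, "%H:%M:%S.%f")
        previous = last_seen.get(payload)
        if previous is not None:
            gap = (stamp - previous[0]).total_seconds()
            if 0 <= gap <= window:
                duplicates.append((previous[1], stamp_text))
        last_seen[payload] = (stamp, stamp_text)
    return duplicates


# The capture from the post above, pasted as raw text.
capture = r"""
Tx (08:18:40.016): \000\000\000\000\000\006\000\004\000\004\000\002
Rx (08:18:40.019): \000\000\000\000\000\007\000\004\004\000E\000\000
Tx (08:18:40.032): \000\000\000\000\000\006\000\004\000\004\000\002
Rx (08:18:40.033): \000\000\000\000\000\007\000\004\004\000E\000\000
Tx (08:18:41.016): \000\000\000\000\000\006\000\004\000\004\000\002
Rx (08:18:41.017): \000\000\000\000\000\007\000\004\004\000I\000\000
Tx (08:18:41.032): \000\000\000\000\000\006\000\004\000\004\000\002
Rx (08:18:41.033): \000\000\000\000\000\007\000\004\004\000I\000\000
""".splitlines()

for first, repeat in find_duplicate_tx(capture):
    print(f"same request sent at {first} and again at {repeat}")
```

On this capture the repeat follows the first request by 16 ms in both seconds, which is consistent with the same read being issued twice per cycle rather than a retry after a timeout.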

AzeoTech · January 16, 2015

I'd have to see your document. I'm guessing you have duplicates somewhere.

CraigTerris · February 11, 2015

Problem solved. I chucked out the Banner radios and put Ubiquiti radios in.

AzeoTech · February 11, 2015

Great. Big fan of Ubiquiti hardware. I even use their stuff in my house. I got tired of disposable routers. And I needed multidrop due to signal loss in the walls, and using their stuff was better than cheap extenders because it appears as one coherent network rather than two, and didn't cost much more.
