Daqf Crash When Field Device Lost?


CraigTerris


1. Even after rebooting the DAQF PC and checking with Task Manager that no apps are running, the message 'It appears you are trying to run a second instance of DAQF.......' comes up.  Is there a known reason for this, or is it known to cause any problems?

 

2.  The loss of comms from one field device (a yet-to-be-determined problem with the device) causes DAQF to stop responding completely, and it has to be restarted.  Is this expected behaviour?

  


1.  It happens on some computers.  The method used to determine whether a second instance is running sometimes gets confused.  Just turn off the second-instance check by going to File - Preferences.

 

2. That should not happen unless you are doing your comms from the primary thread of the app (i.e. inside a screen component).  What release of DAQFactory are you running?  It may also help if you email us the .ctl file.


Lighthouse_Generator on 192.168.1.104 stops responding every few days and DAQF has to be restarted.  Between DAQF and this device there is a MOXA EDS-510 switch in the field at the end of SM fibre, then a radio link comprising a pair of Banner DXER9 radios out to an ADAM I/O module at the generator.  The radios are on a short, full-strength, line-of-sight path.  It's hard to tell exactly where the failure is, as once DAQF is restarted everything runs fine again.  We do know that at the time of failure a Tx from DAQF does not get an Rx back at the near-end Banner radio, then there's a DAQF crash (presumably after the timeout period on x.104).  This could be a hardware problem, though I'm more concerned about the DAQF crash, given the other stuff that's running on it.  This is the only node we have on a radio link.

 

Tx (22:08:16.054): \000\000\000\000\000\006\000\002\000\004\000\001

Rx (22:08:16.063): \000\000\000\000\000\004\000\002\001\014

Tx (22:08:21.023): \000\000\000\000\000\006\000\002\000\003\000\001

Rx (22:08:21.029): \000\000\000\000\000\004\000\002\001\028

Tx (22:08:26.023): \000\000\000\000\000\006\000\002\000\003\000\001

Tx (00:55:06.043): \000\000\000\000\000\006\000\002\000\003\000\001

Tx (03:41:46.078): \000\000\000\000\000\006\000\002\000\003\000\001

Tx (06:28:26.113): \000\000\000\000\000\006\000\002\000\000\000\003

(then I arrive at work and restart it!)
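
The captured frames above can be read as Modbus TCP, which matches the P-ModbusTCP port named later in this post. The Python sketch below is not DAQFactory script. It assumes the monitor prints non-printable bytes as three-digit decimal escapes (the `\028` in the capture is not valid octal) and printable bytes literally. The helper names `monitor_to_bytes` and `decode_modbus_tcp` are made up for illustration.

```python
import re
import struct

# Names for the common read function codes in the Modbus specification.
FUNCTION_NAMES = {
    1: "read coils",
    2: "read discrete inputs",
    3: "read holding registers",
    4: "read input registers",
}

def monitor_to_bytes(text):
    """Turn monitor notation into raw bytes.

    Assumption: "\\ddd" is a decimal byte value; any other character
    stands for its own character code.
    """
    out = bytearray()
    for esc, ch in re.findall(r"\\(\d{3})|(.)", text):
        out.append(int(esc) if esc else ord(ch))
    return bytes(out)

def decode_modbus_tcp(frame):
    """Split a Modbus TCP frame into its MBAP header, function code and data."""
    if len(frame) < 8:
        raise ValueError("frame too short for MBAP header plus function code")
    trans_id, proto_id, length, unit_id, function = struct.unpack(">HHHBB", frame[:8])
    return {
        "transaction_id": trans_id,
        "protocol_id": proto_id,
        "length": length,
        # The length field counts every byte that follows it.
        "length_ok": length == len(frame) - 6,
        "unit_id": unit_id,
        "function": function,
        "function_name": FUNCTION_NAMES.get(function & 0x7F, "other"),
        # A set high bit in the function code marks an exception response.
        "is_exception": bool(function & 0x80),
        "data": frame[8:],
    }

# First request/response pair from the capture above.
tx = decode_modbus_tcp(monitor_to_bytes(r"\000\000\000\000\000\006\000\002\000\004\000\001"))
rx = decode_modbus_tcp(monitor_to_bytes(r"\000\000\000\000\000\004\000\002\001\014"))
print(tx)
print(rx)
```

Read this way, the first Tx is a function code 2 request for 1 input starting at address 4, and the matching Rx carries a byte count of 1 followed by one data byte.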

 

 

We don't have .readHolding functions, just a .readuntil in the wind data parsing sequence.

 

We do get occasional port locked messages.  We did wonder how to pinpoint these!

12/18/14 07:36:50.003

P-ModbusTCP 0013: Port locked


I do have calls in screen components to functions in a custom protocol, and DAQFactory does occasionally become unresponsive (defined as a whited-out screen, rotating cursor, and I believe the DF title bar says "Not responding" on a 3GB Win7 box).

 

But if this is the problem, I would have assumed that, even if calls were made from within the UI thread and the controller failed to respond, control would revert to the UI as soon as the relevant port timed out.  It doesn't seem to do so.


That depends on how it's called. If it's a click action, then yes, the UI will pause and then recover after the timeout. But if it's in the paint event or any of the expressions of the component, then the UI will keep calling for the device and thus remain continually sluggish or hung.
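
The distinction above, a blocking call made once from a click action versus one re-evaluated on every repaint, is the usual argument for keeping device I/O off the UI thread. The following is a generic Python sketch of that pattern, not DAQFactory code: a background thread does the slow read, and the display only ever looks at a cached value. `CachedReader`, `read_fn` and `slow_read` are hypothetical names, and the 0.2 s sleep merely stands in for a device that is slow to answer.

```python
import threading
import time

class CachedReader:
    """Poll a slow device in the background and serve the last good value."""

    def __init__(self, read_fn, period_s):
        self._read_fn = read_fn
        self._period_s = period_s
        self._lock = threading.Lock()
        self._value = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        while not self._stop.is_set():
            try:
                value = self._read_fn()  # may block for the full port timeout
            except Exception:
                value = None             # failed read: keep the previous value
            if value is not None:
                with self._lock:
                    self._value = value
            # Sleep until the next poll, waking early if stop() is called.
            self._stop.wait(self._period_s)

    def latest(self):
        """Called from the display side; never touches the device."""
        with self._lock:
            return self._value

def slow_read():
    time.sleep(0.2)  # stand-in for a slow or timing-out device
    return 42

reader = CachedReader(slow_read, period_s=0.1)
reader.start()
print(reader.latest())  # returns at once, whether or not a read has finished
time.sleep(0.5)
print(reader.latest())
reader.stop()
```

In DAQFactory terms this roughly corresponds to displaying a periodically read channel rather than calling the device from a component expression.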


Not a click event per se, but nothing relied upon to draw the component.

 

I believe the only components where I've seen this are numerical inputs.  There is a panel whose action is a quick sequence.  I get a value with System.EntryDialog() and, if it's not empty, scale it and write it to an address in the device.  There is also a variable value display on the front of the panel that displays the current value of the register from a periodically read channel.  This confirms that the user's writes made it to the controller, and in some cases the controller may also be writing to the same register (things that have an auto/manual mode, for instance).  So the value displayed does depend on the write, but the only connection is via the controller; DF per se doesn't know they're linked.

 

I believe the timeout on the port is 2 seconds, but DF remains hung far longer than that, I believe typically indefinitely, till you kill it with Task Manager.


....and bear in mind that I don't even know for a fact that it's a failure of the device to respond that triggers this (although I've only seen it or had it reported to me after a numerical input has been made), because DF is hung, so I can't do much to troubleshoot.

 

I do know that the device sometimes fails to reply or fails to return enough bytes, but my protocol driver is supposed to be able to handle that gracefully, and I know that it does often do so.  Even so, my assumption is that this may be linked to the same behavior on the part of the device, but I don't actually know that for sure.


  • 4 weeks later...

Plus, when it is working OK, the comms traffic is doubled up.  This module is timed to be read once every second.

 

Tx (08:18:40.016): \000\000\000\000\000\006\000\004\000\004\000\002

Rx (08:18:40.019): \000\000\000\000\000\007\000\004\004\000E\000\000

Tx (08:18:40.032): \000\000\000\000\000\006\000\004\000\004\000\002

Rx (08:18:40.033): \000\000\000\000\000\007\000\004\004\000E\000\000

Tx (08:18:41.016): \000\000\000\000\000\006\000\004\000\004\000\002

Rx (08:18:41.017): \000\000\000\000\000\007\000\004\004\000I\000\000

Tx (08:18:41.032): \000\000\000\000\000\006\000\004\000\004\000\002

Rx (08:18:41.033): \000\000\000\000\000\007\000\004\004\000I\000\000
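
One way to confirm the doubling from a saved monitor capture is to look for identical Tx payloads sent within a short window of each other. The Python sketch below assumes the log line format shown above. `find_repeated_tx` and the 0.1 s window are illustrative choices, and the sketch does not handle a capture that crosses midnight.

```python
import re
from datetime import datetime

# Matches lines such as: Tx (08:18:40.016): \000\000...
LINE = re.compile(r"^(Tx|Rx) \((\d\d:\d\d:\d\d\.\d{3})\): (.*)$")

def find_repeated_tx(log_text, window_s=0.1):
    """Return (first_time, second_time, gap_seconds) for each pair of
    consecutive, identical Tx payloads sent within window_s of each other."""
    repeats = []
    prev = None  # (parsed time, payload, time as text) of the previous Tx
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m or m.group(1) != "Tx":
            continue
        stamp = datetime.strptime(m.group(2), "%H:%M:%S.%f")
        payload = m.group(3)
        if prev is not None and payload == prev[1]:
            gap = (stamp - prev[0]).total_seconds()
            if 0 <= gap <= window_s:
                repeats.append((prev[2], m.group(2), gap))
        prev = (stamp, payload, m.group(2))
    return repeats

# The Tx and Rx lines from the first second of the capture above.
sample = r"""
Tx (08:18:40.016): \000\000\000\000\000\006\000\004\000\004\000\002
Rx (08:18:40.019): \000\000\000\000\000\007\000\004\004\000E\000\000
Tx (08:18:40.032): \000\000\000\000\000\006\000\004\000\004\000\002
Rx (08:18:40.033): \000\000\000\000\000\007\000\004\004\000E\000\000
Tx (08:18:41.016): \000\000\000\000\000\006\000\004\000\004\000\002
"""

for first, second, gap in find_repeated_tx(sample):
    print(first, second, gap)
```

A repeat reported here only shows that the same request went out twice in quick succession; it does not say which part of the configuration issued the second one.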

 

And when sending individual commands to a module, I regularly get the 'Unable to send: Port Locked' message.


  • 4 weeks later...

Great.  Big fan of Ubiquiti hardware.  I even use their stuff in my house.  I got tired of disposable routers.  And I needed multidrop due to signal loss in the walls, and using their stuff was better than cheap extenders because it appears as one coherent network rather than two, and didn't cost much more.

