Errors in channel times


RodSwan


I'm having problems with Modbus Ethernet communication to an Optilogic OL4228 RTU.

I'm updating a number of drawings I inherited, where the comms to the RTU was done in about 2000 lines of almost incomprehensible sequence code. It used large multi-dimensional arrays and function calls to read and write everything in the RTU as often as it could, eating up enormous amounts of CPU time. As we only use one counter, two or three digital I/O lines and a couple of analogue channels, I figured I could replace all that code with a few simple device channels.

I frequently get timeout errors and Port Locked problems when starting or restarting communications (i.e. starting a fresh instance of Daqfactory or leaving safe mode), after which I have to restart Daqfactory, power cycle the RTU, or both. Once comms start they seem to continue OK, except for the main problem, which is...

Daqfactory doesn't return values at the time it says it does.

I have a number of channels (detailed in attached text file) but one of interest is named Ctr1.

It is a single opto sensor with a single reflector on a cylinder that turns at up to 5000 rpm. I'm counting these pulses in a high-speed counter and reading them once per second.

Mostly it works fine, with an increase of about 90 in Ctr1 every second, but occasionally (about once every 10 or 20 seconds) I get something like this (contents of the Ctr1 channel interspersed with the comms log):

Ctr1.time 10:43:50.0012

Ctr1 150227 (Change since last reading 87)

Tx (10:43:50.002): \x00\x00\x00\x00\x00\x06\x00\x02\x00\x32\x00\x02

Rx (10:43:50.008): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:50.009): \x00\x00\x00\x00\x00\x06\x00\x02\x00\x37\x00\x01

Rx (10:43:50.015): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:50.016): \x00\x00\x00\x00\x00\x06\x00\x02\x06\x87\x00\x01

Rx (10:43:50.022): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:50.023): \x00\x00\x00\x00\x00\x06\x01\x04\x00\x00\x00\x02

Rx (10:43:50.030): \x00\x00\x00\x00\x00\x07\x01\x04\x04\x4A\xD3\x00\x02 <- 150227

Ctr1.time 10:43:51.0000

Ctr1 150400 (Change since last reading 173)

Tx (10:43:51.001): \x00\x00\x00\x00\x00\x06\x00\x02\x00\x32\x00\x02

Rx (10:43:51.008): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:51.008): \x00\x00\x00\x00\x00\x06\x00\x02\x00\x37\x00\x01

Rx (10:43:51.015): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Ctr1.time 10:43:52.0352

Ctr1 150403 (Change since last reading 3)

Tx (10:43:52.015): \x00\x00\x00\x00\x00\x06\x00\x02\x06\x87\x00\x01

Rx (10:43:52.022): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:52.023): \x00\x00\x00\x00\x00\x06\x01\x04\x00\x00\x00\x02

Rx (10:43:52.030): \x00\x00\x00\x00\x00\x07\x01\x04\x04\x4B\x80\x00\x02 <-150400

Tx (10:43:52.036): \x00\x00\x00\x00\x00\x06\x00\x02\x00\x32\x00\x02

Rx (10:43:52.042): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:52.043): \x00\x00\x00\x00\x00\x06\x00\x02\x00\x37\x00\x01

Rx (10:43:52.049): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:52.050): \x00\x00\x00\x00\x00\x06\x00\x02\x06\x87\x00\x01

Rx (10:43:52.056): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:52.057): \x00\x00\x00\x00\x00\x06\x01\x04\x00\x00\x00\x02

Rx (10:43:52.064): \x00\x00\x00\x00\x00\x07\x01\x04\x04\x4B\x83\x00\x02 <- 150403

Ctr1.time 10:43:53.0000

Ctr1 150488 (Change since last reading 85)

Tx (10:43:53.001): \x00\x00\x00\x00\x00\x06\x00\x02\x00\x32\x00\x02

Rx (10:43:53.007): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:53.008): \x00\x00\x00\x00\x00\x06\x00\x02\x00\x37\x00\x01

Rx (10:43:53.015): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:53.015): \x00\x00\x00\x00\x00\x06\x00\x02\x06\x87\x00\x01

Rx (10:43:53.022): \x00\x00\x00\x00\x00\x04\x00\x02\x01\x00

Tx (10:43:53.022): \x00\x00\x00\x00\x00\x06\x01\x04\x00\x00\x00\x02

Rx (10:43:53.029): \x00\x00\x00\x00\x00\x07\x01\x04\x04\x4B\xD8\x00\x02 <- 150488

The error is that Ctr1.time reports the value 150400 as read at 10:43:51.0000, but the comms trace shows it actually being read at 10:43:52.030.

The value is correct for the actual time it read it.
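The "<- 150227" style annotations in the trace can be checked by hand. Below is a minimal Python sketch, assuming each reply is a standard Modbus TCP frame (7-byte MBAP header, function code, byte count, then register data) and that the RTU sends the low 16-bit word of the counter first, which is what the annotated values imply. The name `decode_counter` is made up for illustration.

```python
import struct

def decode_counter(frame: bytes) -> int:
    """Decode a 32-bit counter from a Modbus TCP 'read input registers' reply.

    Assumes two 16-bit registers with the low word first, as the
    annotated values in the trace suggest.
    """
    # MBAP header: transaction id, protocol id, length, unit id
    _tid, _pid, _length, _unit = struct.unpack(">HHHB", frame[:7])
    function, byte_count = frame[7], frame[8]
    if function != 0x04 or byte_count != 4:
        raise ValueError("not a two-register input read reply")
    low, high = struct.unpack(">HH", frame[9:13])
    return (high << 16) | low

# Reply logged at 10:43:50.030
rx = bytes.fromhex("00 00 00 00 00 07 01 04 04 4A D3 00 02")
print(decode_counter(rx))  # 150227
```

The same decoding applied to the reply logged at 10:43:52.030 gives 150400, the value the channel stamped as 10:43:51.0000.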

Unfortunately this completely screws up the smoothing algorithm I have, as the apparent speed suddenly doubles, then drops to almost zero, then recovers.

The values being returned by the RTU and the times logged in the comms log are correct, but not the data returned in the channel.

I have tried varying the Timeout value on the ModBus device from 1 to 20 seconds - It made no difference to the problem.

I'm using Daqfactory version 5.73b but have verified that the problem also occurs in version 5.83.

Can you help please???

Rod Swan.

DaqError.txt


The problem almost certainly has to do with the fact that you made most of your Channels D# 0 and one of your channels D#1. You should make all of them D# 1. The D# corresponds to the modbus ID of the device. Modbus ID #0 is reserved by modbus for broadcasting to all devices and should really just never be used in polling applications.

Try changing that and tell us how it works.
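The ID in question is the seventh byte of each Modbus TCP request, so the comms log itself shows which requests went out with ID 0. A small Python sketch, assuming standard MBAP framing; `parse_request` is an illustrative name, not a DAQFactory function.

```python
import struct

def parse_request(frame: bytes) -> dict:
    """Split a 12-byte Modbus TCP read request into its fields."""
    _tid, _pid, _length, unit, function, address, count = struct.unpack(
        ">HHHBBHH", frame
    )
    return {"unit": unit, "function": function, "address": address, "count": count}

# Two requests copied from the comms log in the first post
first = parse_request(bytes.fromhex("00 00 00 00 00 06 00 02 00 32 00 02"))
last = parse_request(bytes.fromhex("00 00 00 00 00 06 01 04 00 00 00 02"))

print(first["unit"], last["unit"])  # 0 1
```

Only the counter read (function 04) carries unit ID 1 in that log; the three discrete-input reads (function 02) go out with ID 0.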


Also, it appears that something is tying up the port for 1 second. It doesn't look like a timeout, since I don't see a Tx without an Rx and the last Rx before the delay seems to be OK. Did you get an alert at 10:43:52? You might set your timeout value to something much smaller. Since you are on Ethernet and your device is fast to respond, if you don't get a response within maybe 30 milliseconds then it's unlikely the device is going to respond at all, so you can probably just set your timeout to 50 instead of 1000. This will eliminate your port locked problems too.
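To put numbers on "fast to respond": the four request/response pairs logged at 10:43:50 give the round-trip times below. This is plain arithmetic on the logged timestamps, written as a Python sketch; the 50 ms figure is the suggestion above, not something the code derives.

```python
# (Tx, Rx) times in seconds past 10:43, copied from the comms log
pairs = [
    (50.002, 50.008),
    (50.009, 50.015),
    (50.016, 50.022),
    (50.023, 50.030),
]

# Round-trip time of each exchange, rounded to whole milliseconds
rtt_ms = [round((rx - tx) * 1000) for tx, rx in pairs]
print(rtt_ms)       # [6, 6, 6, 7]
print(max(rtt_ms))  # 7
```

A 50 ms timeout leaves a wide margin over the slowest exchange in this sample.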


Hi,

That seems to be ok ta.

I saw that the D# was the ModBus ID but the code I inherited never set this parameter consistently (in the function calls) and I found that setting any value in the channel configuration seemed to make little difference to the end results. I've now tested with all channels set to "1" and "2" and appear to get the same results with either value (and my device is set to 1).

I've also set the timeout to 50 ms and continue to get the occasional timeout, but not Port Locked problems that don't disappear on their own.

Overall it now looks better.

Rod.


  • 2 weeks later...

Hi,

I have progressed a long way in the last few days with the "business logic" side of this DaqFactory drawing by simulating input and am now returning to the problems I see with the ModBus communication. I have difficulty detailing precisely all the problems but the two most pressing are:

1. Port Locked.

Quite often when leaving Safe Mode after editing a drawing or sequences, the system gets stuck in a "Port Locked" mode where I usually have to exit and restart Daqfactory, and sometimes also power cycle the RTU, to restore correct operation.

The alert it then gives (every second) is similar to:

08/27/09 08:38:00.507

P-ModbusTCP 0013: Port Locked.

More significantly, and for no apparent reason, after running successfully all night (but with timing errors as detailed below) I found the system had entered this mode at 8:38 this morning.

2. Continued "spurious" readings. A digital counter on test hardware (an optical sensor on a desk fan) is read once per second. The input is consistent: a precision digital oscilloscope shows the pulses solid at 92 Hz +/- 1 or 2%.

Maybe one in every twenty readings or so I get an "early" reading. The timing in the channel.time field is reported as occurring exactly on the 1 second interval, but the reported count value is around 10% low. The following reading (also exactly on the second interval) reports a correspondingly 10% higher value (i.e. 92,92,92,92,83,101,92,92,92,92).
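One quick check on a pattern like this: if the counter itself were dropping pulses, an off-nominal reading would not be balanced by its neighbour. A sketch of that check in Python, using the example sequence above; the helper name and the 5% tolerance are arbitrary choices for illustration.

```python
def shifted_pairs(deltas, nominal, tol=0.05):
    """Return index pairs (i, i+1) where reading i is off nominal but the
    two readings together still add up to about 2 * nominal.

    Such pairs suggest the read moment shifted, not that counts were lost.
    """
    pairs = []
    for i in range(len(deltas) - 1):
        a, b = deltas[i], deltas[i + 1]
        off_nominal = abs(a - nominal) > tol * nominal
        sum_ok = abs((a + b) - 2 * nominal) <= tol * nominal
        if off_nominal and sum_ok:
            pairs.append((i, i + 1))
    return pairs

deltas = [92, 92, 92, 92, 83, 101, 92, 92, 92, 92]
print(shifted_pairs(deltas, nominal=92))  # [(4, 5)]
```

Here 83 + 101 = 184 = 2 x 92, so no pulses are missing; they have only been attributed to the wrong second.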

My natural thought would be that the counter is reporting incorrect values, as the comms log matches the channel values, but the problem gets significantly worse if I change the software (to the latest Daqfactory version, 5.83).

I also continue to get "Channel Error Timing" alerts almost every second. e.g.

08/27/09 08:37:25.101

Channel Error Timing = 1.00, Offset = 0.00: P-ModbusTCP 0010: Timeout

08/27/09 08:37:25.304

Channel Error Timing = 1.00, Offset = 0.00: P-ModbusTCP 0010: Timeout: 15

08/27/09 08:37:25.405

Channel Error Timing = 1.00, Offset = 0.00: P-ModbusTCP 0010: Timeout: 13

though the system appears to continue correctly reading and setting the remote I/O.

I have set and checked the channel parameters as per your previous comments, and I have tried the Timeout on the Modbus device at 30, 50, 100, 200 and 1000 ms, all with no improvement.

I am running Windows XP SP3, and have a very small network connecting the systems together. On the target system a dedicated network interface card is used to talk to the RTU but I am reluctant to "develop" on that as it is a small system and the DaqFactory developers licence is only on my main desktop system.

I am using Daqfactory version 5.73b (as this is as installed on customers systems) but have tried version 5.83 without success - indeed the "spurious" readings were significantly worse.

One of the causes of the earlier difficulties was the drawing creator's desire to set the priority of all the sequences to "5-Aquisition". I have changed that and now have minimal code in channel events (mostly just beginseq commands) and most Sequences now at

How best can I move forward to resolve these problems? What can I do to give you more visibility of these problems?

Any help gratefully appreciated.

Rod Swan.

I am planning to install on site next Wednesday, so I really do need to resolve this as soon as possible.


Further to my last long forum post: to isolate the problem I've created a new drawing in DaqFactory 5.83 with just one device (ModbusTCP), only a small number of channels, no sequences and just one page with a couple of graphs on it. With this I can clearly see the problem.

It does appear to be network traffic related, i.e. an increase in network traffic increases the occurrence of the errors.

Comms Trace gives:-

Tx (15:18:26.000): \x00\x00\x00\x00\x00\x06\x01\x02\x00\x30\x00\x04

Rx (15:18:26.006): \x00\x00\x00\x00\x00\x04\x01\x02\x01\x00

Tx (15:18:26.007): \x00\x00\x00\x00\x00\x06\x01\x02\x00\x37\x00\x01

Rx (15:18:26.014): \x00\x00\x00\x00\x00\x04\x01\x02\x01\x01

Tx (15:18:26.014): \x00\x00\x00\x00\x00\x06\x01\x04\x00\x55\x00\x03

Rx (15:18:26.021): \x00\x00\x00\x00\x00\x09\x01\x04\x06\x18\x79\x1A\x23\x03\x0D

Tx (15:18:26.022): \x00\x00\x00\x00\x00\x06\x01\x04\x00\x00\x00\x02

Rx (15:18:26.029): \x00\x00\x00\x00\x00\x07\x01\x04\x04\x17\xED\x00\x02 137197 92

The line above is typical of a good result, timed at 29 ms past the second. It is reported in the channel as occurring at 18:26.000.

Tx (15:18:27.000): \x00\x00\x00\x00\x00\x06\x01\x02\x00\x30\x00\x04

Rx (15:18:27.007): \x00\x00\x00\x00\x00\x04\x01\x02\x01\x00

Tx (15:18:27.007): \x00\x00\x00\x00\x00\x06\x01\x02\x00\x37\x00\x01

Rx (15:18:27.014): \x00\x00\x00\x00\x00\x04\x01\x02\x01\x01

Tx (15:18:27.116): \x00\x00\x00\x00\x00\x06\x01\x04\x00\x55\x00\x03

Rx (15:18:27.122): \x00\x00\x00\x00\x00\x09\x01\x04\x06\x18\x77\x1A\x23\x03\x8F

Tx (15:18:27.123): \x00\x00\x00\x00\x00\x06\x01\x04\x00\x00\x00\x02

Rx (15:18:27.129): \x00\x00\x00\x00\x00\x07\x01\x04\x04\x18\x52\x00\x02 137298 101

The line above is in error. The value 137298 is correct for 18:27.129, but the channel shows it as occurring at 18:27.000, which is WRONG.

This totally screws up the speed calculations requiring considerable smoothing to present acceptable screen display of speed, but that then introduces unacceptable lag in displayed values.
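The effect on a speed calculation can be worked out from the two reads above. A sketch in Python, using only numbers from the trace: the counter values 137197 and 137298, the channel timestamps (on the second) and the receive times from the comms log.

```python
# (channel timestamp, comms-log receive time, counter value)
# Times are seconds past 15:18.
good = (26.000, 26.029, 137197)
late = (27.000, 27.129, 137298)

counts = late[2] - good[2]

# Rate using the timestamps stored in the channel
rate_channel = counts / (late[0] - good[0])
# Rate using the times the replies actually arrived
rate_actual = counts / (late[1] - good[1])

print(counts)                  # 101
print(round(rate_channel, 1))  # 101.0
print(round(rate_actual, 1))   # 91.8
```

With the receive times, the apparent jump to 101 counts per second comes out at about 91.8, in line with the 92 Hz seen on the oscilloscope.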

It appears that DaqFactory version 5.83 is much more susceptible to other network traffic than version 5.73b.

Whilst it is inevitable that other traffic on the network may slow the comms down, the cause of this error is DaqFactory's labelling of the data received with the time it was requested, not the more accurate time at which it was received.

Can you correct that error?

With respect to the port locking problem. I've added the following simple sequence:

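function Timed()

global y = 0

   while(1)
      // wait for the start of the next whole second
      waituntil(floor(systime())+1)
      y = !y
      ?y
      // write the toggled value to the CaseFan output channel
      casefan = y
   endwhile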

where CaseFan is a simple channel. Prior to beginning this sequence, setting CaseFan in the Command/Alert window controls the case fan (as expected). BeginSeq(timed) just gives a Port Locked error once every second and I can no longer set it from the Command window.

I've attached the ctl file.

It's also interesting to note that I have to execute the "device.RTU1.ReadInputU16(1,38007,1)" command once (either in the startup sequence or directly in the command window) before I get any values in the Ctr1_Value channel.

Rod.

P.S. Sorry for all the aggro, but I need it to work: either identify and correct my mistakes, or those in DaqFactory if I am correct in my understanding.

Untitled_5.83.ctl


1) DAQFactory puts the same timestamp on all data points with the same device and the same timing/offset. We can't change this because it would affect data alignment for everyone else and mess up their logs. You are the first that has had an issue with it. Fortunately there is an easy workaround: set the Offset on the important channel to 0.9. This will cause it to poll separately and so get the time of the first poll, not the last.

2) ReadInputU16(1,38007,1): I think you are using the wrong Modbus notation in your channels. Doing read...(...,38007,...) will put DF in 40,000 notation, which means it strips the 10,000s place and subtracts one. You put Read Input channel #1, which is actually wrong for 0 notation and should instead be 30,001. Then you wouldn't need that line.

3) Port locked: this is because it's taking longer than your timeout to read through all the channels. I just don't see why you get this 100 ms or so delay in the middle of your read block (second example monitor dump).
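On point 2, the "strip the 10,000s place and subtract one" rule is the common Modicon-style convention, where the leading digit selects the register table and the remainder is a 1-based register number. A sketch of that conversion in Python; it illustrates the convention only and is not DAQFactory's own code. The function name is made up, and the table names are the usual Modbus ones.

```python
# Leading digit of a 5-digit Modicon-style address -> register table
TABLES = {
    0: "coil",
    1: "discrete input",
    3: "input register",
    4: "holding register",
}

def to_zero_based(address):
    """Convert e.g. 30001 to ('input register', 0)."""
    prefix, number = divmod(address, 10000)
    if prefix not in TABLES or number == 0:
        raise ValueError(f"not a Modicon-style address: {address}")
    return TABLES[prefix], number - 1

print(to_zero_based(30001))  # ('input register', 0)
print(to_zero_based(38007))  # ('input register', 8006)
```

This agrees with the comms log, where the counter request carries function code 04 (read input registers) and start address 0.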


Hi,

I think I'm beginning to see the light :) but I still have a significant problem/query.

By carefully choosing the offset in the channels I have made my simple test case work as expected.

1) Placing my counter in its own "group", i.e. in its own timeslot (the only channel with a 0.9 offset), I get accurate timing every time :)

2) The explanation about addressing using 1 versus 30001 is confusing ;) When (and why) daqfactory strips the 10000 digit and whether I should use 1 or 30001 is unclear. However, I'm now using 30001 notation everywhere and it appears to be ok.

3) Port locking.

This I feel is the more difficult problem to work around.

I now believe this is caused not by a Timeout parameter that is too small, but by the failure to handle simultaneous input and output requests.

Reading between the lines and with a bit of experimentation I see that when requesting two such simultaneous activities on a single device Daqfactory reports a Port Locked alert and loses the output command.

In my simple example, moving the offset for all my input devices away from zero allows me to set my outputs every second on the second without problem. But if I set the offset on an input channel to zero, that takes priority over the output causing the output to fail every time :o

Is there any way to detect that failure?

I've tried surrounding the statements with Try/Catch to no avail.

Is there any way to guarantee that setting an output device will happen?

For example, when an operator-triggered event sets an output immediately, how can we ensure that it won't coincide with a channel read, other than by enclosing every such set in a sequence with a lot of extra code around it to prevent the set occurring at the wrong time?

Rod


1) glad that worked for you

2) confusing, yes, but that's largely because the original Modbus spec was not well written. Even so, hardware manufacturers don't follow the spec perfectly, with half of them using one notation and half using the other. You should just stick with one notation or the other. You should not combine them. Thus to answer your comment: "whether I should use 1 or 30001 is unclear", you should just use 30001 if that is what your documentation specifies.

3) A port locked error means that DAQFactory is busy with the port and can't use the port for the output. It's not limited to outputs, but that is what you are seeing. The solution is not to try to work around it, but rather to figure out why the port is being locked up for so long. You might try setting the timeout back up to 1000. This will cause the output to wait a full second for the port to come unlocked. If you still get port locked problems, then you need to figure out why your reads are taking up a full second.
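The behaviour described in point 3 resembles a lock with a timeout: whoever wants the port waits up to the timeout for it, then gives up with an error. The Python sketch below models only that idea with `threading.Lock`. It is an analogy, not DAQFactory's implementation, and `port`, `poll_inputs` and `write_output` are made-up names.

```python
import threading
import time

port = threading.Lock()  # stands in for the single Modbus TCP connection

def poll_inputs(duration):
    """Hold the port for `duration` seconds, like a block of channel reads."""
    with port:
        time.sleep(duration)

def write_output(timeout):
    """Try to get the port within `timeout` seconds; False means 'port locked'."""
    if not port.acquire(timeout=timeout):
        return False
    try:
        return True  # the write itself would go here
    finally:
        port.release()

reader = threading.Thread(target=poll_inputs, args=(0.2,))
reader.start()
time.sleep(0.05)                    # let the reader take the port first
short = write_output(timeout=0.05)  # gives up while reads are in progress
long_ = write_output(timeout=1.0)   # waits for the reads to finish
reader.join()
print(short, long_)  # False True
```

With a short timeout the write is dropped if it collides with a read block; with a longer one it waits its turn. That is the trade-off behind the suggestion to set the timeout back up to 1000.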

