New RS485 device causes timing errors - registers notation difference?


indigo

I have recently added a new device to our RS485 network (Eastron SMD120). Previously, all the devices worked great with no timing issues (Device.Arduino.SetDelay = 20). After adding this new device I either get a timing lag error or timeout error in the command / alert box (see attachment). I tried to change the timing of the device by increasing it to 2 seconds, but it still gave the same errors. I also tried to increase the SetDelay variable of the device object (up to 100ms) but this also didn't help resolve it.

The device's datasheet numbers its registers from 30000 upward. For example, in the channel list I attached, em_active_power is a channel from the new device; its channel number in DAQFactory is set to 12, whereas the datasheet lists the register as 30012. The other devices use 0 as the starting register number. I know that the Modbus protocol internally uses 0 notation, but I suspect this difference is what is causing the issue. Any ideas? How can I work around the two types of register numbering?

 

 

timing_channel_list.PNG

timing_cmd_box.PNG


I always use 0 notation.  So when I see 30012 documented, and once I confirm that the docs aren't 0 indexed as well, I will put 11 into DAQFactory.  You strip the 10,000's place and subtract 1.
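As a rough illustration of that rule, here is a minimal Python sketch. The function name `to_zero_based` is made up for this example and is not part of DAQFactory, and it only applies once you have confirmed that the documentation is 1-based to begin with.

```python
def to_zero_based(documented: int) -> int:
    """Convert a documented register such as 30012 or 40001 to a 0-based address.

    Assumes the documentation uses the 1-based, prefixed convention.
    """
    # Strip the 10,000's place, then subtract 1.
    address = documented % 10000 - 1
    if address < 0:
        raise ValueError(f"{documented} has no 1-based register part")
    return address


print(to_zero_based(30012))  # 11
print(to_zero_based(40001))  # 0
```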

I would start by disabling that new channel (set the timing to 0) and see if it goes back to working.  SetDelay(100) is going to be too big, as with 7 devices, you are talking about 700ms of wasted time each iteration, not counting any actual comms time.  I'd set that back to 20.  I've never seen it need more than 20.  The issue is just giving the RS485 line time to release, so it doesn't need to be long.

If disabling that channel fixes it, reenable it and fix the issue with the channel #, as it should be 11, not 12.  Then set the timing for all your other channels to 0 and see if you can get the new device to work on its own.  Then slowly add back each of the old devices, maybe watching the comm monitor with time stamp enabled to see if there are any surprising delays.

Also make sure you don't have any sequences controlling outputs while you are doing all this.  If you do have sequences controlling outputs, that may be the issue.  SetDelay() applies after every query, and outputs can't be optimized, so each output command would add 100ms + comm time to the overall loop time.  If that overall loop time exceeded your Timing of 1, then you'd get Timing lags pretty quick.
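To put numbers on the loop-time argument, here is a back-of-the-envelope sketch in Python. It assumes one query per device per loop; the 30 ms comm time and the three output commands in the last line are placeholder values for illustration, not measurements from your system.

```python
def loop_time_ms(queries: int, set_delay_ms: int, comm_ms: int = 0) -> int:
    """Total time per loop if every query costs comm time plus the SetDelay pause."""
    return queries * (set_delay_ms + comm_ms)


# Delay alone, 7 devices with one query each:
print(loop_time_ms(7, 100))  # 700
print(loop_time_ms(7, 20))   # 140

# Hypothetical: 7 reads plus 3 output commands, 30 ms comm time each, SetDelay of 100.
print(loop_time_ms(7 + 3, 100, comm_ms=30))  # 1300
```

With those placeholder values the loop no longer fits inside a Timing of 1 second, which is when timing lags would start to appear.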

 


Hi, thanks for your response. I made a separate DAQFactory file to test things out, with only two devices being utilized.

1) It seems the energy monitor that is causing issues is not 0 indexed. When I changed the channel from 12 to 11, it read 0 instead of a sensible number.

2) I reverted to 20ms delay in the new file I created.

3) In this new file there are no sequences or anything being done with the channels that can control the outputs.

What I did here is first disable the sht device (timing = 0), this starts at around 14:00. There was only one timeout error in ~20 minutes. Then I disabled the energy monitor and enabled the sht device, this didn't cause any errors. Finally, I enabled both of them and you can see in the screenshot below that I start getting errors again.

I am not sure if it is something with the energy monitor itself and the way it works that it is clashing with the other devices or if it is a clash between the different notations. Do you have any clue, specifically if there is anything else software side I can do to fix it in DAQFactory?

with_both.PNG

without_sht.PNG


It's not a clash between the different notations, because the different notations don't really exist.  They are purely a figment of the documentation.  The actual Modbus packet only uses 0 notation.  The packet and the device know nothing about 40,001 notation.  This is what is so frustrating about Modbus, compounded by the fact that many manufacturers misunderstand it all and document their hardware incorrectly.  This is also why I pretty much exclusively convert my Modbus registers to 0 notation.  It also makes it easier to debug the packets when necessary, as the Channel # will match what you see in the comm monitor.  If you put 40,001 instead, DAQFactory (in most cases) will strip the 4 and subtract 1 for you before sending it out the comm port.
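To make the point about the packet concrete, here is a small Python sketch that builds a Modbus RTU "read input registers" request by hand. The function names are made up for this example, and the register count of 2 is an assumption rather than something taken from the device's datasheet. The address field on the wire is simply the 0-based number, with no 30,000 or 40,000 anywhere in it.

```python
import struct


def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def read_input_registers_request(unit_id: int, address: int, count: int) -> bytes:
    """Build an RTU request: ID, function 0x04, 0-based address, count, CRC."""
    body = struct.pack(">BBHH", unit_id, 0x04, address, count)
    # The CRC is appended low byte first.
    return body + struct.pack("<H", crc16_modbus(body))


# D# 30, 0-based address 11, two registers (the count is an assumption).
frame = read_input_registers_request(30, 11, 2)
print(frame.hex(" "))
```

The first six bytes come out as `1e 04 00 0b 00 02`, followed by the two CRC bytes. The `00 0b` is the 11, and that is the number you would look for in the comm monitor.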

What you've done is good debugging.  The next step would be to open the comm monitor and look at the packets with Show time of TX/RX enabled along with Display all ASCII chars as Codes.  It should be pretty obvious which of the two devices is having a problem.  The ID (D#) is going to be the first number when both transmitting and receiving.  You will likely see a Tx/Rx combo for one of the D#'s but only a Tx on the other.

Next thing to try would be to set the Timing to 1 on both devices, but set the Offset to 0.5 on one of them.  This will cause the queries to stagger by 0.5 seconds.  Alternatively, start incrementing SetDelay() until it works, but I like the first method.   Then, if it works with an offset of 0.5, try moving the two queries closer together by changing the 0.5 to 0.75 or 0.25.  What I believe you will find is that one of your two devices is not letting go of the transceiver in a reasonable amount of time which is blocking further comms.  This exists in general for multidrop comms, but the amount is usually a few milliseconds.  In your case, it seems to be a lot more for one device, probably that last one you added.  This will prove it.  For example, if you leave D# 17 at offset 0 and set D#30 to offset 0.95 and you get time outs, then D#30 is the issue.  If you set D#30 offset to 0.05 and you get time outs, then D#17 is likely the issue.  If you have issues for all offsets, then I'd really need to see your comm monitor to tell anything more.
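The reasoning behind the offset test can be sketched in a few lines of Python. This assumes both channels share a Timing of 1 second and that each query starts exactly at its scheduled time, which is a simplification; `gap_after_ms` is a name made up for this example.

```python
def gap_after_ms(offset_ms: int, next_offset_ms: int, timing_ms: int = 1000) -> int:
    """Time from one device's query start until the other device's next query start."""
    return (next_offset_ms - offset_ms) % timing_ms


D17_OFFSET_MS = 0
for d30_offset_ms in (500, 950, 50):
    after_d17 = gap_after_ms(D17_OFFSET_MS, d30_offset_ms)
    after_d30 = gap_after_ms(d30_offset_ms, D17_OFFSET_MS)
    print(f"D30 offset {d30_offset_ms} ms: "
          f"{after_d17} ms of quiet after D17, {after_d30} ms after D30")
```

With D30 at 950 ms, D30 gets only 50 ms before the next D17 query, so timeouts there point at D30. With D30 at 50 ms, it is D17 that gets only 50 ms, so timeouts there point at D17.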

 

 


Thanks for your detailed response. Interesting note on modbus, I am personally still quite new to the protocol so it is definitely useful to learn the small intricacies of the protocol.

I did exactly what you suggested: I enabled the settings you mentioned and could see that both Tx and Rx were being registered for D17, but for D30 (the new device) only Tx was being registered. So, apparently DAQFactory was not receiving a response from the device. I changed the offset for D30 to 0.5 and it seems to be working better now, although I still get timeouts every ~5 minutes or so. I think this means that the offset / delay is accumulating until it causes a timeout, if you understand what I mean. Ideally, this would work without any timeouts, but if there are no other solutions that you can think of then it is manageable. Thanks in advance for any additional input you can provide. :)

(I didn't adjust the offset for D17 since I didn't think it is the one causing the problem and I didn't adjust the SetDelay() method either for these tests)

with_offset_timeout.PNG


I'd have to look at the comm monitor window to have any more input.  It is possible that D30 is just really slow to respond. For example, a LabJack T7 Pro set in the highest resolution / bit count takes 157 ms just to do the actual read because it is so high res, so a query would take even longer than that.  If your D30 device is slow, then that could be holding things up.  To find out, set the timing to 0 on both channels, rearrange the windows so the Comm monitor and command / alert window can both be seen at the same time ("tear off" the Comm monitor tab and put it above), then enter:

read(em_active_power)

in the command alert window.  You'll see the Tx, and then an Rx.  Look at the difference in time.  Do this a couple times to get an average.  Maybe do the same thing with sht_17_1, just to compare.

Personally if I was having this issue, I'd get a second comm port and put D30 on it alone so that it didn't cause issues with my other 17 devices.   


I have checked the average time between Tx and Rx for both devices (8 queries each). A couple of the queries were sent immediately after each other, but in general there were some seconds between them. For D30 the average time is 54ms and for D17 it is 26ms.

Concerning the second comm port, check my screenshot below. Is this what you mean? Here I just created a new serial port (smd) with the same settings as arduino_con, but I assume I need to change some configuration, otherwise it'll literally be a copy of arduino_con, right?

em_active_power with this new smd device still provides errors by the way.

new_com.PNG


No, I mean a second physical comm port.  Creating a duplicate comm port in DAQFactory will cause one not to work because Windows only allows one connection to a serial port at once.

The other thing to watch for is termination. I am assuming you are running RS485 since you are multidrop.  Did you properly terminate both sides of the chain?  Also, the chain has to actually be a chain with only two ends.  You can't do a star arrangement or any branches with RS485.
