DAQFactory crashes


bartvandiepen

Hello guys,

We are pretty confident with DAQFactory, but there is a problem. After approximately a week the DAQFactory application crashes and doesn't respond anymore. Two customers with almost the same configuration have the same problem. We are running DAQFactory 5.78. I know this is not the latest release, but we have had no reason to move to the new release, and updating without a reason is not a good idea because it can cause unexpected trouble.

For a couple of days the system runs perfectly, but after a week....

We are also running a test at our office right now to pin down the problem. It was started 5 days ago and is still running without any problem.

In the past we also had big problems with unwanted memory growth, which caused the system to crash, but that is solved. Memory usage is stable now. Our reputation has suffered because of those earlier problems. Customers have mentioned the option of switching to other SCADA software; of course we have defended DAQFactory. I hope you understand that it is important to solve this problem as quickly as possible. We want to continue using DAQFactory, but this situation makes it difficult.

We use DAQFactory to control autoclaves. These applications are almost standard, so we developed a standard runtime configuration that we use for every customer. A standard configuration gave us the opportunity to optimize it a lot!

I know that the DAQFactory revision history doesn't describe every point that has been fixed. But do you have any idea what the reason could be? I've attached the runtime; please email bart@pro-control.nl for the password.

Thanks,

Kind regards,

Bart van Diepen

run1_v5_2_MED1110_Gert_van_Mossel_80819.ctl

First, what do they mean by crash? A hang? Unexpected quit with error message? Second, are they doing anything with the system when this happens such as loading a file or really any interaction, or does it just do it on its own? I'm assuming, of course, that memory usage is stable. Which OPC server are you using?

Hello,

Thanks for the fast reply.

The memory usage is stable. We are using the Eurotherm OPC server.

I've called the customer to get more information about when the hang occurs.

The computer hangs after 1 or 2 weeks, without anyone doing anything. The PC runs all the time, and then in the morning when he starts work, he sees that it doesn't respond anymore. How does he know that? He couldn't tell me exactly; he saw different messages / situations when this occurred.

I've told him that every detail is important for us to get closer to the cause.

Maybe some interesting notes are:

- Automatic-restore-point functionality of Windows?

- Computer performance setting --> processor scheduling --> set to programs or background services

- Version 5.78 instead of 5.79?

I hope this is the information you need to make progress in investigating this problem,

kind regards,

Bart van Diepen

Pro Control B.V.

OK, those are good notes actually:

- Automatic-restore-point functionality of Windows?

This, I believe, only occurs when you install new hardware. However, it reminds me of another probable issue: do they have Automatic Updates turned off? It really needs to be if they are going to run 24x7. Also, please see my post here: http://www.azeotech.com/board/index.php?showtopic=3570 concerning antivirus and general advice on running 24x7.

- Computer performance setting --> processor scheduling --> set to programs or background services

This doesn't matter that much, but probably should be set on programs. What happens is that on programs, Windows gives a priority boost to the foreground application. On background services, the boost is turned off and everything runs in the priority it wants to. DAQFactory runs pretty high in the priority list already, above most apps, so the boost only pushes it slightly higher.

- Version 5.78 instead of 5.79?

I doubt it, but it'd be worth a try. My money is on auto-updates / anti-virus issues, especially if the thing is always hung first thing in the morning (since auto-updates run at 2am or so).

Hey guys,

The tests I have running right now are focused on the Windows "automatic update" function, which is disabled on the test PC. At my customer's site this setting is enabled, but I know that the PC is not connected to the internet. Could the automatic update function still result in a "hang" even then? Of course Windows will at least take some action, trying to update via the internet.

kind regards,

Bart van Diepen

Pro Control B.V.

  • 2 weeks later...

Hello DAQFactory support,

Unfortunately the long-running test at our office has also crashed. DAQFactory hung, and I could easily tell because the graph was not moving anymore. On this computer Windows Update was switched off. There was no Windows message or alert.

But with the Windows Event Viewer we found a Dr. Watson alert pointing to DAQFactory. See attachment.

Fortunately our OPC server has a log file for errors and warnings. From it we could determine that DAQFactory crashes first, and only then does the OPC server log warnings that a client has been lost.

We also saw a strange coincidence with a Firewall alert. Somehow DAQFactory was no longer in the "safe" list of the Windows Firewall. I'm sure it was before! Again with the Windows Event Viewer we found that this was possibly caused by an "enrollment" coming from our server. Somebody at our office had connected the PC to our network (we've verified that).

But this enrollment action did not happen close in time to the "hang" of DAQFactory. I've looked into it closely and don't think the hang was caused by this enrollment / Firewall alert. We were also surprised that Windows complained about a Firewall action, because we don't use network access.

Hopefully this information tells you something.

greetings,

Bart van Diepen

Pro Control B.V.

daq.zip

Good afternoon,

We don't use any tan(), sin() or cos() functions. I've also checked my source code and the sequences for:

- infinite loops

- functions that call themselves

- complex functions

I found nothing that could explain the 'hang'. On the main page we do the trend update with our own sequence; that is maybe the only critical function. But in this function we only use simple mathematical functions, and the loop also has a delay of 1 second.

I think we should plan a thorough investigation to pin down this problem. Remember that our customers are still waiting for a solution; for now they cannot run long tests.

Thanks for your understanding and help,

kind regards,

Bart van Diepen

Pro Control Process Automation B.V.

OK, sounds like your scripts are fine. There are really only two things that crash DAQFactory: 1) user scripts: infinite loops without delays, functions that recursively call themselves or end up in some sort of recursive loop, etc. These are things that we can't really protect against. Sometimes you want an infinite loop without delay, and recursive functions are great when used right (and blown stacks can't be caught internally).

2) stuff in the drivers / device related. We try and catch most things, but sometimes we miss a spot. The OPC driver appears to fire off a secondary thread without completely protecting it. This means that if there is an exception in this thread, which could be caused by something outside DAQFactory, it won't be caught, the thread will crash and take DAQFactory with it. It is therefore my belief that this is what is happening. I don't know what is triggering the exception, and truthfully, given the infrequency of it and the fact that you are the only one seeing it, it is unlikely that we'll know. However, I have added some protection to the thread so that if the exception occurs, it won't crash DAQFactory. Whether it can recover communications, however, is another story. There is simply no way to know until it happens again.
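To picture the kind of protection described in point 2, here is a minimal Python sketch. It is not DAQFactory's driver code, which is not shown in this thread; `protected`, `flaky_update`, and the alert text are hypothetical names made up for illustration. Note also that in Python an unhandled exception in a thread ends only that thread, whereas in a native application it can take the whole process down, so this only illustrates the wrapping pattern, not the crash itself.

```python
import queue
import threading


def protected(target, alerts):
    """Wrap a thread target so any exception is reported instead of escaping."""
    def runner(*args, **kwargs):
        try:
            target(*args, **kwargs)
        except Exception as exc:
            # Catch-all: record the failure as an alert and let the thread end quietly.
            alerts.put(f"worker: unknown error ({exc!r})")
    return runner


def flaky_update():
    # Hypothetical stand-in for work done on a secondary thread that
    # fails because of something outside the application.
    raise RuntimeError("simulated failure")


alerts = queue.Queue()
worker = threading.Thread(target=protected(flaky_update, alerts))
worker.start()
worker.join()

# Collect whatever the protected thread reported.
reported = []
while not alerts.empty():
    reported.append(alerts.get())
print(reported)
```

As in the post, catching the exception only keeps the failure from spreading. Whether the work that thread was doing can resume afterwards is a separate question that this sketch does not address.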

The updated OPC driver file is attached. Copy it over the existing file in the DAQFactory directory. If it catches a random exception in this secondary thread, it should generate an "OPC: Unknown error processing OPC request" or "OPC update" error in the command/alert window.

That said, I don't know what threads are fired up internally in the OPC drivers and it's possible they aren't catching them all. Hopefully, however, they get passed on and we can catch them.

That all said, you can reduce the usage of secondary threads by setting all your channels to Async and not doing any Sync reads.

opc.zip

  • 1 month later...

Good afternoon DAQFactory,

Thanks for the reply and the helpful information / solutions. I waited a while before posting my reply because we were running a test.

We ran a test with all the sequences disabled, just to separate the sequences from the rest of the application. That system didn't crash anymore; we ran the test for 4 weeks. The other test crashed after 2 weeks. So the problem seems to be related to the sequences.

My colleague and I have analysed the scripts and didn't find any bad scripting practices like the ones you suggested. We have a lot of programming experience and have used many programming languages in multiple projects.

Could you have a look at the scripts? You can debug the app much more quickly with debugging tools. It would be interesting to test whether the app uses the TAN() function somewhere. I know that with Visual Basic you can add debug points in the source code.

We are putting a lot of time into tracking down this problem, but we need your help with it. We will continue running other tests and keep you informed. I've attached the runtime and sent you the password by Skype.

Thanks for all the support,

greetings Bart van Diepen

Pro Control B.V.

run1_v7_0.ctl
