bartvandiepen Posted February 23, 2009

Hello guys,

We are pretty confident with DAQFactory, but there is a problem. After approximately a week the DAQFactory application crashes and no longer responds. Two customers with almost the same configuration have the same problem.

We are running DAQFactory 5.78. I know this is not the latest release, but we have had no reason to move to the new release, and updating without a reason is not a good idea because it can cause unexpected trouble.

For a couple of days the system runs perfectly, but after a week.... We are also running a test at our office right now to track down the problem. It started 5 days ago and is still running without any problems.

In the past we also had big problems with unwanted memory growth, which caused the system to crash, but that is solved. Memory usage is stable now.

Our reputation has suffered because of the former problems. Customers have mentioned the option of switching to other SCADA software; of course we have defended DAQFactory. I hope you understand that it is important to solve this problem as quickly as possible. We want to continue using DAQFactory, but this situation makes it difficult.

We use DAQFactory to control autoclaves. These applications are almost standard, so we developed a standard runtime configuration which we use for every customer. Having a standard configuration gave us the opportunity to optimize it a lot!

I know that the DAQFactory revision history doesn't describe every point that has been fixed. But do you have any idea what the reason could be?

I've attached the runtime; please email bart@pro-control.nl for the password.

Thanks,
Kind regards,
Bart van Diepen

Attachment: run1_v5_2_MED1110_Gert_van_Mossel_80819.ctl
AzeoTech Posted February 23, 2009

First, what do they mean by crash? A hang? An unexpected quit with an error message? Second, are they doing anything with the system when this happens, such as loading a file or really any interaction, or does it just happen on its own? I'm assuming, of course, that memory usage is stable. Which OPC server are you using?
bartvandiepen Posted February 24, 2009

Hello,

Thanks for the fast reply. The memory usage is stable, and we are using the Eurotherm OPC server.

I called the customer to get more information about when the hang occurs. The computer hangs after 1 or 2 weeks, without anyone doing anything. The PC runs all the time, and in the morning when he starts work he sees that it doesn't respond anymore. How does he know that? He couldn't tell me exactly; he has seen different messages and situations when this occurs. I've told him that every detail is important for us to get closer to the cause.

Maybe some interesting notes are:
- Automatic-restore-point functionality of Windows?
- Computer performance setting --> processor scheduling --> set to programs or background services
- Version 5.78 instead of 5.79?

I hope this is the information you need to get further in investigating this problem.

Kind regards,
Bart van Diepen
Pro Control B.V.
AzeoTech Posted February 24, 2009

OK, those are good notes actually:

- Automatic-restore-point functionality of Windows?

This, I believe, only occurs when you install new hardware. However, it reminds me of another probable issue: do they have Automatic Updates turned off? It really needs to be off if they are going to run 24x7. Also, please see my post here: http://www.azeotech.com/board/index.php?showtopic=3570 concerning antivirus and general advice on running 24x7.

- Computer performance setting --> processor scheduling --> set to programs or background services

This doesn't matter that much, but it should probably be set to programs. On programs, Windows gives a priority boost to the foreground application. On background services, the boost is turned off and everything runs at the priority it asks for. DAQFactory already runs pretty high in the priority list, above most apps, so the boost only pushes it slightly higher.

- Version 5.78 instead of 5.79?

I doubt it, but it'd be worth a try.

My money is on auto-update / antivirus issues, especially if the thing is always hung first thing in the morning (since auto-updates run at 2am or so).
bartvandiepen Posted February 26, 2009

Hello,

Thanks for the advice. We will run a long-duration test with automatic updates disabled. We will keep you informed; thanks for the fast and good support.

Greetings,
Bart van Diepen
bartvandiepen Posted February 27, 2009

Hey guys,

The tests I have running right now are focused on Windows automatic updates, which are disabled on the test PC. At my customer's site this setting is enabled, but I know that the PC is not connected to the internet. Could the automatic update function still result in a "hang" in that case? Presumably Windows will at least take some action, trying to update via the internet.

Kind regards,
Bart van Diepen
Pro Control B.V.
AzeoTech Posted February 27, 2009

No idea. Windows does a lot of stuff that it probably shouldn't (or you wish it didn't). It's one of my big gripes with Windows. With every release they add more and more automatic stuff that just makes the system more bulky and unstable.
bartvandiepen Posted March 11, 2009

Hello DAQFactory support,

Unfortunately our long-running test at our office has also crashed. DAQFactory hangs; I could easily tell because the graph was no longer moving. On this computer Windows Update was switched off. There was no Windows message or alert, but in the Windows Event Viewer we found a DrWatson alert pointing to DAQFactory. See attachment.

Fortunately our OPC server keeps a log file of errors and warnings. From it we could see that DAQFactory crashes first, and then the OPC server reports warnings that a client was lost.

We also saw a strange Firewall alert. Somehow DAQFactory was not in the "safe" list of the Windows Firewall. I'm sure it was before! Again using the Windows Event Viewer, we found that this was possibly caused by an "enrollment" coming from our server. Somebody at our office connected the PC to our network (we've verified that). But this enrollment action did not happen close in time to the DAQFactory "hang". I've looked into it closely and don't think the enrollment / Firewall alert is the cause. We were also surprised that Windows complained about a Firewall action at all, because we don't use network access.

Hopefully this information tells you something.

Greetings,
Bart van Diepen
Pro Control B.V.

Attachment: daq.zip
AzeoTech Posted March 11, 2009

Are you by chance doing any trig (tan() in particular) anywhere in your script?

The firewall shouldn't really matter. DAQFactory is triggering it because it is setting itself up as a server for DAQFactory networking.
bartvandiepen Posted March 16, 2009

Good afternoon,

We don't use any tan(), sin() or cos() functions. I've also checked my source code and the sequences for:
- infinite loops
- functions which call themselves
- complex functions

I found nothing that could explain the 'hang'. On the main page we do the trend update with our own sequence; that's maybe the only critical function. But in this function we only use simple mathematical functions, and the loop has a delay of 1 second.

I think we should plan a thorough investigation to pin down this problem. Remember that our customers are still waiting for a solution; for now they cannot run long tests.

Thanks for your understanding and help.

Kind regards,
Bart van Diepen
Pro Control Process Automation B.V.
AzeoTech Posted March 16, 2009

OK, sounds like your scripts are fine. There are really only two things that crash DAQFactory:

1) User scripts: infinite loops without delays, functions that call themselves recursively or end up in some sort of recursive loop, etc. These are things that we can't really protect against. Sometimes you want an infinite loop without a delay, and recursive functions are great when used right (and blown stacks can't be caught internally).

2) Stuff in the drivers / device related. We try to catch most things, but sometimes we miss a spot. The OPC driver appears to fire off a secondary thread without completely protecting it. This means that if there is an exception in this thread, which could be caused by something outside DAQFactory, it won't be caught; the thread will crash and take DAQFactory with it.

It is therefore my belief that this is what is happening. I don't know what is triggering the exception, and truthfully, given how infrequent it is and the fact that you are the only one seeing it, it is unlikely that we'll ever know. However, I have added some protection to the thread so that if the exception occurs, it won't crash DAQFactory. Whether it can recover communications, however, is another story. There is simply no way to know until it happens again.

The updated OPC driver file is attached. Copy the file over the existing one in the DAQFactory directory. If it catches a random exception in this secondary thread, it should generate an "OPC: Unknown error processing OPC request" or "OPC update" error in the command/alert window.

That said, I don't know what threads are fired up internally in the OPC drivers, and it's possible they aren't catching all exceptions. Hopefully, however, they get passed on and we can catch them.

That all said, you can reduce the use of secondary threads by setting all your channels to Async and not doing any Sync reads.

Attachment: opc.zip
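For readers unfamiliar with the idea of "protecting" a secondary thread, here is a minimal sketch in Python. It is not DAQFactory's or the OPC driver's actual code; the names `alerts`, `guarded`, `flaky_request` and `run_guarded` are hypothetical and exist only for illustration. The thread body is wrapped so that any exception is turned into an alert message instead of escaping the thread. Note that in native code an uncaught exception in any thread typically ends the whole process, whereas in Python it would only end that one thread, so the sketch shows the pattern rather than reproducing the crash.

```python
import queue
import threading

# Hypothetical stand-in for an alert log; not a DAQFactory API.
alerts = queue.Queue()


def guarded(work, label):
    """Wrap a thread body so any exception becomes an alert instead of escaping."""
    def runner():
        try:
            work()
        except Exception as exc:
            # Report the failure; the caller decides whether to retry or reconnect.
            alerts.put(f"{label}: {exc!r}")
    return runner


def flaky_request():
    """Hypothetical request that fails because of something outside the application."""
    raise RuntimeError("server went away")


def run_guarded(work, label):
    """Run work on a secondary thread with the guard in place and wait for it."""
    thread = threading.Thread(target=guarded(work, label))
    thread.start()
    thread.join()
    return thread


run_guarded(flaky_request, "OPC: Unknown error processing OPC request")
print(alerts.get_nowait())
```

As in the post above, catching the exception only keeps the application alive; whether communications can recover afterwards is a separate question that the guard does not answer.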
bartvandiepen Posted May 11, 2009

Good afternoon DAQFactory,

Thanks for the reply and the helpful information and solutions. I waited a while before posting my reply because we were running a test.

We ran one test with all the sequences disabled, to separate the sequences from the rest of the application. That system did not crash during the 4 weeks the test ran. The other test, with the sequences enabled, crashed after 2 weeks. So the problem seems to be related to the sequences.

My colleague and I have analysed the scripts and didn't find any bad scripting strategies of the kind you described. We have a lot of programming experience and have used many programming languages in multiple projects. Could you have a look at the scripts? You can debug the app much more quickly with debugging tools. It would be interesting to test whether the app uses the TAN() function somewhere. I know that with Visual Basic you can set breakpoints in the source code.

We are putting a lot of time into tracking down this problem, but we need your help with it. We will continue running other tests and keep you informed.

I've attached the runtime and sent the password to you by Skype.

Thanks for all the support.

Greetings,
Bart van Diepen
Pro Control B.V.

Attachment: run1_v7_0.ctl