Lockup / Hang / Not Responding Issues


MrDeathStar


I have encountered two lockup/hang issues while using 5.87c that I wanted to report to determine if they are known issues, find out if they have possible workarounds, or assess if they sound like specific local environment troubles.

 

(1) During development I have learned to save before deleting component(s) from a page since doing so will sometimes trigger a lockup (i.e. Windows reports that the application is "Not Responding").  I previously thought this might have been related to deleting graph components which had been created through Ctrl-D (Duplicate), but found that it can occur with other components too and regardless of duplication.  Also, the trouble can happen regardless of sequence code status (i.e. usually not running and without acquiring channel data).

 

(2) In one of our applications we use PopupModeless to show a larger/detailed Timeline graph for the X/Y elements of a selected process graph on the main screen.  The PopupModeless page uses a graph with a left and right axis and one trace which is dynamically removed and added according to the user selection on the main screen.  Everything works great and the user can select items from the main page to update the graph in the modeless popup.  The trouble occurs randomly and only when the user closes the popup using either the 'X' button or a page button which calls PopupClose.  Windows reports that the application is "Not Responding" and the user can wait or close the application.

 

NOTES: In the case of (1), this has occurred during development of different .ctl applications.  In the case of (2), the application has a background thread / sequence with a fixed delay() that gathers data via LabJack APIs, modifies some values, and issues AddValue to store results into the channels which are graphed by the main screen and the modeless popup.  The development machine is Windows 7 SP1 / 64-bit.  The runtime machines vary.

 

The recent lockup for (2) happened with a runtime license before experiment data could be exported.  Any pertinent information would be useful. 

 

(We noticed a 5.91 (build 2203) is available and are curious if it might correct these issues, but we don't want to lose compatibility with current .ctl files if that version may be unstable.)

 

Thanks for your attention.

 

 

 


I would upgrade to 5.91, and if you are still having problems, contact us about the latest unreleased build which does graphs differently.  I'm starting to think that 5.91 isn't as unstable as we initially thought, and we'll likely release it in full soon.  There are many people using it, especially since you need it for T7 support, and we've only had a couple reported issues, both of which I believe are environment related, not DAQFactory.

 

That said, BACKUP your .ctl files before opening them in 5.91 just in case.


Thank you for your suggestion.  I have not yet tested the PopupModeless issue with 5.91, but have been testing other issues more closely (and without using my .ctl application).  Here are some quick (hopefully duplicatable) observations:

 

(1) Deleting Components:

With 5.87c, I can open DAQFactory, create a single default 1 sec "Test" channel and a basic 2D graph component plotting the sine wave test signal.  Then, I create several (about 10) variable value components (uninitialized red 'X'), as my application had several.  Next I proceed to randomly delete (i.e. Ctrl-Select + Right-Click + Delete Component / or / Shift-Delete) the variable value components.  At some random point, the delete will cause the application to stop (i.e. "Not Responding").  If all variable value components were deleted properly, I can repeat creation and deletion again.  It does not take more than a few attempts to lock up.  It seems that a graph component on the page is required.

 

After installing 5.91 on a separate (but similar) machine, I could not easily duplicate the problem with the above operations (great).  But if I instead change the procedure to open DAQFactory, create an empty graph component on the page, duplicate the graph component 10 times with Ctrl-D, and then randomly select and shift-delete each graph component, the lockup occurs easily too.

 

(2) Graphing Components / Trace Style:

I stopped using multi-colored traces in my graphs under 5.87c because I encountered issues with multi-trace graphs sometimes sharing pen styles or having segments of one trace use another trace's pen style.  In one of my applications, some of the traces in a six trace graph (having multi-color traces) would stop changing color and revert to a solid value after some interval.  Changing the thickness of the first trace would affect the thickness of random segments of the third trace, for example.  I suspected it might have been related to duplicating a graph component and its multi-color trace list, so I rebuilt all my graphs.  But the problem still occurred.  I noticed this behavior still occurs in 5.91.  Simply create a three trace graph with three channels and set a 4-color multi-color segment for each trace.  Then start changing the line style among traces.  After some time, one channel will plot with the random pattern/thickness of another trace.  In other words, parts of the trace will be correct for that trace, while other parts will assume the color or line style of an alternate trace.

 

 

The delete problem (1) feels like a race condition related to the graph component (as does the closing of PopupModeless issue under 5.87c).  The graph trace issue of (2) feels like a windows pen object is being shared somehow and possibly not getting deleted properly.  (As a side effect, I notice that after some time, changing a trace pen style sometimes no longer updates in the graph until the application is restarted...and often the object selection rectangles in DaqFactory itself will become black instead of hatched.) 

 

When I get to running our lab application in 5.91, I will report any new feedback.  I am interested in your "latest unreleased build" for graphing if you feel it may address some issues or would just like some additional testing.  However, I have worked around the delete/trace problems in development by saving often and restarting DaqFactory.  I am most concerned about the hanging problems when my application is being used in the lab (i.e. the PopupModeless closing issue).  In the meantime, if I find better steps than mentioned above, I will report them so you may track them down easily.


Unfortunately we used a third party graph control for graphing.  There is a lot of our own code that goes to make the graph work the way we want it, but the core drawing is handled by a third party tool, albeit a rather old one at this point.  Up until the "latest unreleased build", graphs were entirely drawn in a background thread.  This allows you to create a data intensive graph with complex calculations or millions of data points that updates in real time without bogging down the rest of the user interface.  But, that 3rd party control really doesn't play well in multiple threads.  If you delete the component container for the graph, and thus the 3rd party control, you are doing so from the main thread of the application.  If it happens that the graph is drawing in a background thread at the same time, it causes that control to hang DAQFactory.  What we've discovered is that with newer computers the rendering of the graph really doesn't take much time.  Instead it's the massaging of the data, i.e. doing the complex calculations and / or reducing a million data points down to something that can actually be seen on a 1920 pixel screen, that takes all the time, and all that is our native code and thus thread safe.  So, we moved the rendering of the graphs into the primary thread of the application, while all the data massaging occurs in the background thread.  This was not possible when DAQFactory first started 14 years ago because computers were a lot slower and rendering a graph was still slow.
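To illustrate the split described above, here is a minimal Python sketch (not DAQFactory code, and not the actual implementation). It assumes a simple min/max reduction per pixel column as the "data massaging" step; the names `decimate_min_max`, `reduce_in_background`, `render`, and `run_once` are hypothetical. The heavy reduction runs in a worker thread, and the drawing stand-in is only ever called from the calling (main) thread.

```python
import queue
import threading


def decimate_min_max(samples, width):
    """Reduce samples to at most `width` (min, max) pairs, one per pixel column."""
    if not samples or width <= 0:
        return []
    n = len(samples)
    columns = min(width, n)
    reduced = []
    for col in range(columns):
        start = col * n // columns
        end = (col + 1) * n // columns
        bucket = samples[start:end]  # never empty because columns <= n
        reduced.append((min(bucket), max(bucket)))
    return reduced


def reduce_in_background(samples, width, out_queue):
    """Worker-thread body: do the expensive reduction, then hand off the result."""
    out_queue.put(decimate_min_max(samples, width))


def render(columns):
    """Hypothetical stand-in for a drawing call that is not thread safe."""
    return len(columns)  # pretend we drew one vertical line per column


def run_once(samples, width=1920):
    ready = queue.Queue()
    worker = threading.Thread(
        target=reduce_in_background, args=(samples, width, ready)
    )
    worker.start()
    columns = ready.get()  # main thread waits only for the finished, reduced data
    worker.join()
    return render(columns)  # drawing happens on the calling (main) thread only
```

With this arrangement, deleting the graph from the main thread can no longer overlap a draw, because both happen on the same thread.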

 

So there's the detailed explanation.  Please email us at support@ and I'll get you a link to that new build for you to try.

 

BTW: you are probably right about the pen style not updating, but alas, that's buried in code we don't have source for.  You'll have to wait until we have natively coded graph rendering.


  • 2 weeks later...

I received the latest unreleased build from Matt (thank you) and performed some quick testing.  Here are my initial results:

 

Deleting Components - Fixed:

As previously reported, 5.91 already improved deletion of non-graph components when a graph component is visible on the page.  Deleting of graph components themselves is now improved too and does not hang the application.

 

PopupModeless with Graph Components - Fixed:

Closing PopupModeless dialogs with a graph component caused random hangs in 5.91 and prior.  This also appears to be fixed with the latest unreleased build.

 

Graph Trace Color - Can't Fix:

I understand third-party code may eliminate your options for repair.  However, just for future reference I attached a simple graph showing one issue (5.91 version).  About half-way through the trend, trace colors (purple, multi-gray, blue, green, multi-color) change.  Purple becomes blue and multi-color gray becomes partially colored.  Changing the multi-color trace to 'dash-dot' will also randomly affect the other multi-color trace. 

 

Oddly (or not), the rate of test channel acquisition (not page refresh rate) seems to modify the behavior and affect more trace colors/line types.  Once channel timing for the traces is changed, it is hard to set any trace color / line type in the graph.

 

GUI Lockout / Page Refresh Interval / Performance - New:

I am curious about the performance impact of the new graph changes in the latest unreleased build.  In prior versions I could easily achieve 1/10th second refresh interval on my pages.  Now, even without graphing, the GUI can be locked.   For example, attempting to open the "Page Properties..." dialog to change the refresh rate will result in 'Help' contents appearing, but without the Page Properties dialog itself visible.  Somewhere it has focus, as pressing ENTER will dismiss the 'Help' screen, but clicking anywhere else will lock out the application (since the dialog is modal).  I have noticed that other dialogs and operations (e.g. "Add Channel", or even File/New) have very long delays before showing/processing.  If I switch to another less complex page first, it eventually becomes visible and I can bring up the Page Properties dialog for the problem page to set a higher page refresh interval.

 

Going back to prior releases, I noticed that I can simulate this problem when page refresh intervals are set much shorter (unrealistically so).  I am on a fast Core i5 machine using 6 graphs (currently not charting) and 48 variable value fields.  I think the variable value fields are part of the issue (but not in prior releases at 1/10 second), because removing many helps the situation.  They set their background color in their OnPaint event to the RGB result of a short switch statement in a sequence function.  I suppose all scripting is slow, but I needed a way to set a 'disabled' color not dependent on the variable value and to have a user-configurable threshold.  I am presuming that page refresh is not a suggestion for a WM_PAINT interval, but possibly a tight RedrawWindow loop.

 

Any comments are welcome.  Thanks for your continued communication and resolution.

ColoredTraces.ctl


Okay.  The attached file is an example with similar count of items as one page in our application.

Using 5.91 / Build 2203 (OK):
(1) Right-click on Page_0/Page Properties.  The "Page Properties" dialog appears with a .1 refresh rate. 

(2) Click the START button on the Page_0 to begin graphing. 

(3) Right-click on Page_0/Page Properties.  The "Page Properties" dialog appears again, as expected.

(4) Set the page refresh rate to .05.

(5) Right-click on Page_0/Page Properties.  The "Page Properties" dialog appears as expected.

 

Using 5.91 / Build 2210 (latest unreleased build):
(1) Right-click on Page_0/Page Properties.  The "Page Properties" dialog MAY appear with a .1 refresh rate.  If it does not appear, you must press ENTER to dismiss the hidden dialog.  Select Page_1 and then right-click on Page_0/Page Properties to change the refresh rate. 

(2) Click the START button on Page_0 to begin graphing. 

(3) Right-click on Page_0/Page Properties.  The "Page Properties" dialog will not appear.  Press ENTER to dismiss the hidden dialog.  Select Page_1 and then right-click on Page_0/Page Properties to change the refresh rate above .1 seconds.

In both application versions, selecting menu File/New after loading (or after using) can lock the application (i.e. Not Responding).  Sometimes other operations will stall the application or there are long delays for operation dialogs to appear.

 

It seems that calling sequence code in OnPaint events may be related to the cause, since removing some of the variable value controls helps.  However, adding a few extra graphs instead can also cause the trouble in the latest build.  With our actual application, a sequence thread calls AddValue for various channels before calling delay(1) in a loop.  If the delay(x) period collects faster than .2 seconds, we can see similar locking troubles.

 

While it is not necessary for us to have .1 second refresh, we are curious about how to detect when paint performance may lock the application (i.e. what is the maximum 'safe'/achievable refresh rate for a given page).  It is understandable that a busy page would paint slower, but it seems that setting a page refresh incorrectly can cause hanging problems during other operations/development, as if painting has higher priority than UI.

 

Thanks again for your time.

 

RefreshTest.ctl


You are right, there does appear to be an issue with page properties when the screen is loaded down, and we'll chase that one down.  However, it does not appear to be related to the graphs, but is rather because of the way you did the variable value components.  It's important to understand that DAQFactory script is interpreted script, not compiled code.  It compiles a little, down to pseudo code, but doesn't compile down to machine code like C or VB or even JavaScript (on some browsers).  This gives DAQFactory many positive and unique qualities, such as the ability to edit code on the fly, and also to not get all sorts of symbol errors when you create code before the symbols exist.  But, it makes it quite a bit slower.  DAQFactory makes up for it quite a bit by having very fast array functions.  For example, in DAQFactory, if you had an array, x, of 10000 elements and wanted to add it to another array, y, of 10000 elements and plot it, you'd simply put:

 

x+y

 

in the Y expression.  In other languages you would have to prepare another array with the result you wanted to graph and use a for() loop to calculate the sum on 10000 elements.  You could in fact do this in DAQFactory too, but it would be very slow.  Doing x+y is very fast.

 

So, the problem you have is that you created a page with about 58 screen controls, and before each can paint, two lines of code execute in each event.  Those lines call functions where, depending on the val or flag, up to 7 lines of code may execute, for a total of 16 lines per control * 58 controls = 928 lines of script.  How long a line of script takes to execute varies with the line and the machine it's running on.  You can do a basic test by creating a sequence on a blank document:

 

private x = 0
private st = systime()
while(x<10000)
   x++
endwhile
? (systime() - st)/10000
 
On my machine, the result is 0.16ms per loop iteration, and a loop iteration is essentially 2 lines of script.  Well, that means 928 lines of script are going to take almost 75ms, which is most of the 0.1 second refresh rate, and we haven't even started drawing anything on the screen.  
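The estimate above can be reproduced with a few lines of Python (used here only as a calculator, not DAQFactory script). The numbers are the ones quoted in this post; the function and constant names are hypothetical.

```python
CONTROLS = 58              # screen controls on the page
LINES_PER_CONTROL = 16     # 2 event lines + up to 7 lines in each of 2 function calls
SECONDS_PER_LOOP = 0.16e-3 # measured time for one loop iteration
LINES_PER_LOOP = 2         # a loop iteration is roughly 2 lines of script
REFRESH_SECONDS = 0.1      # page refresh interval


def total_script_lines():
    return CONTROLS * LINES_PER_CONTROL


def paint_script_seconds():
    seconds_per_line = SECONDS_PER_LOOP / LINES_PER_LOOP
    return total_script_lines() * seconds_per_line


def fraction_of_refresh():
    return paint_script_seconds() / REFRESH_SECONDS
```

Here `total_script_lines()` gives 928, `paint_script_seconds()` gives roughly 0.074 s, and `fraction_of_refresh()` comes out to about three quarters of the 0.1 second budget, before any drawing is done.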
 
So, as mentioned in the docs, OnPaint events have to be really short and fast.  Yours are not.  Case() in DAQFactory is not like C, and is basically just a cleaner syntax for if/elseif/elseif.
 
The way you should be doing the coloring is one of the following: use the built-in color options of the variable value component (which it looks like won't fit all your needs), directly set the color from some other script for each component, or come up with a faster algorithm.  Although not nearly as clean, even doing this right in your event (not calling a function, which is kind of slow) would be better:
 
backcolor = iif(flag >= g_nCount, RGB(128,128,128), iif(val >= g_dValAlert, RGB(255,0,0), RGB(0,128,0)))
 
I only did 2 of your case statements.  Keep nesting the iif()'s in until you cover all cases.  One line of code, even that complex, will run a lot faster than 9.
 
But it's only flag that is keeping you from using the built-in color options of a variable value component.
 
If that's not an option, directly setting the color from script would put all this logic in background threads instead of the primary paint / UI thread and is probably the best choice.

OK, I think we have identified the problem.  In our debug build we have a profile setup for draw.  A page with just your variable value controls was taking 370 ms to paint due to all those OnPaint event scripts.  What I believe was happening was that DAQFactory was piling up paint messages in the queue every 0.1 seconds, but the system could only service them every 0.37 seconds.  This backlog would cause the paint messages for the dialog to get lost and thus the issue.  We have modified the application so the loop that tells DAQFactory to refresh won't send the message if DAQFactory is in the middle of doing a refresh already.  This basically means that if you specify a refresh rate of 0.1 seconds, but it takes 0.37 seconds to draw the screen, that the screen will actually only refresh every 0.4 seconds (approximately).  This appears to resolve the issue.
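As a rough model of the change described above (a Python simulation, not DAQFactory's actual code), assume a timer that ticks at the requested refresh interval and a paint that takes a fixed time. The function names are hypothetical. `queued_backlog` models the old behaviour, where every tick queues a paint message; `paint_start_times` models the new behaviour, where a tick is skipped while a paint is still in progress.

```python
def queued_backlog(interval_ms, paint_ms, ticks):
    """Old behaviour: every tick queues a paint; return messages still waiting."""
    elapsed_ms = ticks * interval_ms
    completed = min(ticks, elapsed_ms // paint_ms)  # paints run back to back
    return ticks - completed


def paint_start_times(interval_ms, paint_ms, ticks):
    """New behaviour: a tick starts a paint only if the previous one has finished."""
    starts = []
    busy_until = 0
    for tick in range(ticks):
        now = tick * interval_ms
        if now >= busy_until:
            starts.append(now)
            busy_until = now + paint_ms
        # otherwise this tick is dropped instead of queueing another paint
    return starts
```

With a 100 ms interval and a 370 ms paint, `paint_start_times(100, 370, 12)` yields paints starting at 0, 400 and 800 ms, which matches the "approximately 0.4 seconds" effective refresh mentioned above, while `queued_backlog(100, 370, 100)` shows 73 paint messages still waiting after 10 seconds under the old scheme.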

 

Would it be helpful if DAQFactory had a way of displaying the time it takes to draw a screen?  Similar to how video games sometimes show their refresh rate so you can optimize your video settings.


Thanks for your detailed reply.  And yes, it might be helpful to have an fps indicator.  However, I was more concerned about an aggressive refresh setting suddenly causing blocking/hanging in the application or DAQFactory (i.e. File/New, Page Properties, application buttons).  I don't mind slower paint performance to ensure other application parts operate as expected.  I use delay() instead of wait() for exactly the reason that slow processing can hang everything.  I did not expect slightly slower paint operations to cause large troubles.  I was also curious that running our application on the latest unreleased 5.91 version was slower than the released 5.91 in these issues.

 

I'm glad you found a potential paint queue filling problem.  Maybe this could explain the other troubles? (i.e. could an operation like File/New generically wait on a message queue before showing a confirmation dialog, etc.)

 

Our application manually acquires and computes data into an array of values used by the various variable value components.  I did not want to update UI components or calculate RGB values for each sample continuously in an acquire sequence because it would impact acquisition performance.  I could have created another thread to operate on the data, but its speed would need to be a function of the paint speed anyway (else wasted cycles).  So, I presumed such calculations should only need to be performed when needed (i.e. during OnPaint).  While I expect painting to be slower, I did not think the simple script would impact the application usability so much (since paint operations are usually a lower priority event). 

 

Since variable value component colors are based on a list of 'thresholds' (which seems not to be changeable by script), I could not use the built-in color tables.  This is not just because of my 'flag' requirement, but because the user can set levels in a settings dialog.  If the 'threshold' for color could be based on an expression/variable other than the variable value component itself, then it could work.

 

Thank you for your script performance tips too.  I will examine the iif(...) approach since it seems 'each line of script' vs 'complexity of script' is important (as well as avoiding sequence method calling). 

 

I really think the flexibility of DAQFactory scripting, array operation performance, and certainly your support in the forum are all powerful features of Azeotech.  I am enjoying the process of learning the application's strengths and how best to utilize them.


You have to remember that everything that you interact with, the pages, the menus, popups, etc, all run in a single thread with a single message queue.  This includes painting the screen.  Any delays, like the script you have, will cause the UI to appear sluggish.  This is why the graph calculations are done in a background thread.  This is also why I recommended putting your color logic in a sequence.  Even if that sequence runs at the same rate as the page refresh, the code will run in another thread and thus not slow the main user interface thread.
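The idea of moving the color logic off the UI thread can be sketched in Python as follows (a model only, not DAQFactory script). The color rule mirrors the two cases from the iif() example earlier in the thread; `pick_color`, `ColorCache`, and `refresh_in_background` are hypothetical names, and `count` and `val_alert` stand in for `g_nCount` and `g_dValAlert`. A background thread computes all colors, and the paint path only does a cheap lookup.

```python
import threading

GRAY, RED, GREEN = (128, 128, 128), (255, 0, 0), (0, 128, 0)


def pick_color(flag, val, count, val_alert):
    """Two of the cases from the thread: disabled (gray), alert (red), else green."""
    if flag >= count:
        return GRAY
    if val >= val_alert:
        return RED
    return GREEN


class ColorCache:
    """Colors are written by a background thread and read by the paint code."""

    def __init__(self):
        self._lock = threading.Lock()
        self._colors = {}

    def update(self, readings, count, val_alert):
        # Compute everything first, then swap in the result under the lock.
        fresh = {
            name: pick_color(flag, val, count, val_alert)
            for name, (flag, val) in readings.items()
        }
        with self._lock:
            self._colors = fresh

    def color_for(self, name, default=GRAY):
        # This is all the paint path has to do: one dictionary lookup.
        with self._lock:
            return self._colors.get(name, default)


def refresh_in_background(cache, readings, count, val_alert):
    worker = threading.Thread(
        target=cache.update, args=(readings, count, val_alert)
    )
    worker.start()
    worker.join()  # joined here only so the example is deterministic
```

In a real application the background update would run in its own loop at roughly the page refresh rate rather than being joined immediately.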

 

Yes, the paint queue filling problem is most likely the cause of your other troubles.  The queue would rather quickly get completely bogged down with Paint messages and other messages would take a long time to be processed.  The latest version resolves that.  I've put the latest build in the same place you downloaded before if you'd like to try the changes on your end.  An FPS indicator was also added in the status bar.


Wow...thanks for the update...and it works great without UI lockout!  I also like the draw speed indicator.

 

BTW: I noticed that if the page refresh rate is set faster than the indicator, graphs don't update with the fps (updates only happen if the DAQFactory GUI forces a repaint).  Maybe this is by design...it's another indicator that page refresh is set too fast.  If not, maybe graph paint speed should be included in the fps indicator.

 

BTW2: I timed a simple loop calling a sequence 1000x.  First I tried CASE() in the sequence, and then again using iif().  I found CASE() to be 2.5x faster, go figure.  I'll rework my code to use a sequence that sets RGB for the UI.

 

Thanks again.


Remember, graph data is processed in the background and that's not taken into account in FPS since it runs in Idle thread priority.  This means that if the graph data processing is slow, it won't affect the overall page refresh, but may cause graphs not to update as quickly as the rest of the page.


Archived

This topic is now archived and is closed to further replies.