Attempt at working around format() function and 10^300 output


ekanderson


The rdToDig routine seems to work as expected for data that is ALL 0's OR all > 0.

However, with a mix of 0's and values > 0, odd results occur.

If 16 digit output requested, result is about what is displayed in the channel's Table view.

? rdtodig(se_grn[0,10],16)

{0.1804683982934, 0.0705231789107, 0, 0, 0, 0, 0, 0, 0, 0, 0}

? rdtodig(se_grn[0,10],6)

{0.180468, 0.070523, 0, 0, 0, 0, 0, 0, 0, 0, 0}

? rdtodig(se_grn[0,10],1)

{0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1}

? rdtodig(se_grn[0,10],0)

{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}

? rdtodig(se_grn[0,10],1)

{0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1}

? rdtodig(se_grn[0,10],10)

{0.1804683983, 0.070523179, 0.0000000001, 0.0000000001, 0.0000000001, 0.0000000001, 0.0000000001, 0.0000000001, 0.0000000001, 0.0000000001, 0.0000000001}

? rdtodig(se_grn[0,10],6)

{0.180468, 0.070523, 0, 0, 0, 0, 0, 0, 0, 0, 0}

A different channel:

? rdtodig(v.bgr[0,10],16)

{4.552399828878, 2.540493825298, 1.118616415232, 0.1993500323392, 0, 0, 0.2310348047957, 0, 0, 0, 0}

? rdtodig(v.bgr[0,10],10)

{4.5523998289, 2.5404938253, 1.1186164153, 0.1993500324, 0.0000000001, 0.0000000001, 0.2310348048, 0.0000000001, 0.0000000001, 0.0000000001, 0.0000000001}

? rdtodig(v.bgr[0,10],6)

{4.5524, 2.540494, 1.118617, 0.199351, 0.000001, 0.000001, 0.231035, 0.000001, 0.000001, 0.000001, 0.000001}

? rdtodig(v.bgr[0,10],1)

{4.6, 2.6, 1.2, 0.2, 0.1, 0.1, 0.3, 0.1, 0.1, 0.1, 0.1}

? rdtodig(v.bgr[0,10],0)

{5, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1}

Evidently, under some circumstances z is evaluated as being >=0.51, even though when it is printed within the function, it prints as 0.

If I enter the individual formulas in the command window, the results are as anticipated.

Where am I going wrong?????

thanks

eka

// RdToDig() Round to # of digits      Last Mod   12 May 09
// 10 Jan 09 eka  implement
// 24 Apr 09 eka  return "ND" if Out of Range value passed  Mainly NaN
// 12 May 09 eka  Try for integers, change ND to o.r??
//
//*** need to get it to work for an array: works with 5.81 build 1624
//
// pass value, number of digits to return

private pp
private string ss
global r2dig
private x
private z
private y

function rdToDig(p, e)
   r2dig =numrows(p)

      // NaN doesn't work, evidently never passed thru to this
      // for an array, works for NONE arrays       12 May 09
if ((p>=10000000000) || (p ==NaN()))
   return("nd")
endif

      // try this for integer values     12 May 09
if (floor(p) ==p ) 
   return(p)
endif

  y =10^e
  x =floor(p *y)  
  z =(p*y) -x

//  ? x

 if (z >=0.51)
     x +=1
 endif

// ? z


return(x/y)


You forget that z is an array. Doing:

if (z >= 0.51)

doesn't do what you think. It doesn't apply the if to the entire array. z >= 0.51 returns an array, but the if() requires a scalar, so it just looks at the first element of z, namely z[0]. If this is >= 0.51, then it adds 1 to every element of x, which throws everything off.
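The effect can be reproduced outside DAQFactory. The following is a minimal Python sketch, not DAQFactory script: the name `rd_to_dig_buggy` is hypothetical, lists stand in for channel arrays, and the integer and NaN checks from the original routine are left out. It tests only element 0 of z, as the if() does, and then increments every element of x.

```python
import math

def rd_to_dig_buggy(p, digits):
    """Mimic the original routine: the 'if' sees only the first element of z."""
    y = 10 ** digits
    x = [math.floor(v * y) for v in p]
    z = [v * y - xi for v, xi in zip(p, x)]
    if z[0] >= 0.51:              # scalar test on element 0 only
        x = [xi + 1 for xi in x]  # but the increment is applied to every element
    return [xi / y for xi in x]

# First 11 values of the second channel, as displayed with 16 digits in the thread
v_bgr = [4.552399828878, 2.540493825298, 1.118616415232, 0.1993500323392,
         0, 0, 0.2310348047957, 0, 0, 0, 0]

print(rd_to_dig_buggy(v_bgr, 0))
```

With 0 digits, z[0] is about 0.55, so every element is bumped by 1 and the zeros come back as 1, which matches the {5, 3, 2, 1, 1, ...} output in the question.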

A few other comments:

1) You should always put the function declaration at the top, not after the variable declarations.

2) Be careful using "e" as a variable, since it's also used for the exponent in constants (i.e. 3e10). You can't use exp either, since this is a system function (like sin()).

3) you have the same issue with your other ifs. For example, the floor(p) == p, will evaluate just the first element. If you want to find out if every element is an integer, you need to do the min():

if (min(floor(p) == p))

this is because floor(p) == p returns an array of 1's and 0's. If the min() of this array is 1, then all the elements are 1, and it was a complete match.
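The same min-of-flags idea, sketched in Python rather than DAQFactory script (the name `all_integers` is hypothetical, and a list stands in for the array):

```python
import math

def all_integers(p):
    """Return True only if every element of p is a whole number."""
    # Build the array of 1's and 0's that floor(p) == p would produce
    flags = [1 if math.floor(v) == v else 0 for v in p]
    # If the minimum flag is 1, no element failed the comparison
    return min(flags) == 1

print(all_integers([1.0, 2.0, 3.0]))  # every element is whole
print(all_integers([1.0, 2.5, 3.0]))  # a first-element-only test would miss the 2.5
```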

4) You can do this script in a single line (ignoring your check for NaN()s). It's a rather common algorithm:

function rdToDig(p, ex)

return(floor((p * 10^ex)+0.5) / 10^ex)

NaN's simply remain NaN. If you want to return just "nd" if any NaN's are passed in, you can still do it in a single line:

function rdToDig(p, ex)

return(iif(max(p > 1e20),"nd",floor((p * 10^ex)+0.5) / 10^ex))

or use several:

function rdToDig(p, ex)

if (max(p > 1e20))
   return("nd")
endif
return(floor((p * 10^ex)+0.5) / 10^ex)
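For readers who want to check the arithmetic, here is a rough Python sketch of the same floor(p * 10^ex + 0.5) / 10^ex rounding applied element by element. It is an illustration only, not DAQFactory script; the name `round_half_up` is hypothetical. Python's `math.floor` raises an error on NaN, so NaN is passed through explicitly to mirror "NaN's simply remain NaN".

```python
import math

def round_half_up(p, ex):
    """Round each element of p to ex decimal places with floor(v * 10^ex + 0.5)."""
    scale = 10 ** ex
    return [v if math.isnan(v) else math.floor(v * scale + 0.5) / scale
            for v in p]

print(round_half_up([4.552399828878, 0.0, 0.2310348047957], 1))
```

Because each element is rounded on its own, zeros stay zero regardless of what the first element happens to be.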


Re: scalar vs array

If I understand your doc correctly:

private x // defines an array

private z =0 // defines a scalar

private y[] // this was what I assumed declared an array.

I'll have to review all my code, based on this. SOD.

Thanks

eka


I am evidently unclear as to the purpose of the sequence.

Have channel data WITH missing data tagged with NaN() values.

I want to output the channel data with JUST the NaN() values replaced by "nd".

The function needs to look at EACH element in the array to determine if it is NaN(), and ONLY for that element return "nd".

The code you furnished works fine, except when there is NaN() somewhere in the array. i.e., the

if ( max(P >*)

return("nd")

endif

As I mentioned elsewhere in my posts, I wrote a sequence that looked at each element, but it was VERY slow.

Any suggestions would be greatly appreciated.

ek


First, all variables are arrays. If you do z=3 you are still creating an array; it's just an array with one element, which is also called a scalar. Most functions in DAQFactory work with arrays, i.e. sin(x) where x is an array with ten elements returns an array of ten elements, but if() is not a function. It's a command statement, and can only evaluate one element. If you provide it with an array, it will only look at the [0] element.

You can't replace your NaN()'s with "nd" because "nd" is a string and the rest of your array is numbers. I've forgotten why you want the NaNs replaced so don't have an immediate alternative.


My weather data collection *.ctl flags missing or potentially Out of Range data elements with NaN() "values" in the channels' data table, so that these data are not included in max(), min(), mean() calculations.

The data is output in various ways into about 4 export files with some 16 export "sets", typically 4 export sets per actual disk file: one set for the interval values, and one each for the max, min, and mean for that interval. I started using the format() function to control the format of the output because of the problems with significant figures in DF export sets. Then I found that format() output a 1 followed by 300+ zeros for the NaN()-tagged elements prior to DFExpress 5.81 Build 1624 (thank you for that). So I wrote the RdToDig function.

In the scientific community (at least for my era) "nd", "N.D.", or "ND" meant No Data, so that is what I wanted to put into the export data fields for the missing/erroneous data elements.

DFExpress 5.81 Build 1624 (your patch to format() that returns "NaN", rather than 100000000000000000000000000 etc.) will now work, but I just wanted to keep the "nd" output AND not have to edit all the export sets to use format().

I know, I can probably modify RdToDig to use format() and return those strings.
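A sketch of that idea in Python rather than DAQFactory script (the function name `rd_to_dig_str` is hypothetical, and Python's format specifier stands in for format()): every element becomes a string, so "nd" can sit alongside the formatted numbers without mixing strings and numbers in one array.

```python
import math

def rd_to_dig_str(p, digits):
    """Format each element to a fixed number of decimals; NaN becomes 'nd'."""
    out = []
    for v in p:
        if math.isnan(v):
            out.append("nd")               # only the missing element is replaced
        else:
            out.append(f"{v:.{digits}f}")  # Python's own rounding, not floor(+0.5)
    return out

print(rd_to_dig_str([4.552399828878, float("nan"), 0.0], 2))
```

Whether a string array like this can be fed directly to an export set is not something this sketch can confirm.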

Thanks again for all your help.

ek


I am reminded of one of my favorite stories that I tell about the discovery of the ozone hole, and a lesson I've been preaching for nearly 15 years: when the ozone hole was discovered, another group of scientists had already been measuring ozone at the south pole for a number of years. When the discovery was published, this group couldn't understand how they missed it. They went back and looked at their data and remembered that they had programmed their data acquisition system to throw out any data below 50ppb because that would be "impossible" and at the time (early 80's), disk space was expensive. Of course the ozone hole is exactly that, ozone < 50ppb, and so this group missed what is probably the biggest discovery in atmospheric science in recent times.

Nowadays, disk space is cheap, and while I understand your desire to make your post-processing easier by logging processed data, I strongly recommend ALWAYS logging unprocessed data as well. You can always take unprocessed data and process it down, but it is usually impossible to take processed data and work it back.


You do like that story. :rolleyes:

Data is data whether it fits your notion of what its values are supposed to be or not. However, I do NOT want to export questionable data to the weather web page.

All data is logged. Export files are another matter. These go out to the web page in approximately "real" time (every 6 minutes). That is why I need to be able to flag questionable data as "nd" and sound an alert until such time as I can verify it. Then I can review, and if possible verify the datum and take the appropriate action(s). Since I'm not running a nuclear power plant, my audit trail is not up to those standards, but there is an audit trail of sorts on modifications to the exported data vs the logged data. The final file is then available for export.

Missing data is another matter. For ease of further processing, it's nice to have symmetrical files. So missing data is NaN'ed, and if possible exported as "nd" data.

Comparing weather data and unexplored data is similar to comparing apples and oranges (to use a trite phrase). As an extreme example: if I get a reading of 1000 degrees for air temperature, it would get flagged as "nd", exported as such, and logged as the actual value, and then at my leisure (if I'm still alive) I would try to figure out what happened. If the lawn is burned and I have a sunburn, I MAY export that datum. If I'm dead, I don't really care.

Logging files are NOT exported. These are my raw data for figuring out what is what.

So getting back on the track, do you have any suggestions about how to handle this problem??????

ek


My point was more for all the others reading these posts who might be doing things that are a bit more critical.

As for your problem, I believe I have already told you a solution, but you don't seem to like it: use an extra string channel and process the data as it comes in, putting the desired string representation into that channel from the event of the raw data channel. It's super simple and works like a champ. I know it hits your 16 channel limit, but as I've said before, we can't spend time finding a way for you to avoid buying the correct license.


"As for your problem, I believe I have already told you a solution, but you don't seem to like it: use an extra string channel and process the data as it comes in, putting the desired string representation into that channel from the event of the raw data channel. It's super simple and works like a champ...."

It isn't that I don't like it; I don't trust it. I have had NO luck using string value channels for data. See my posts regarding "Readable date and time", etc. With DFEx 5.80 and 5.81, they NEVER worked as I would expect them to. Readable date and time wrapped around after about 12 to 22 addvalues to them, depending on whether it was a virtual or "real" channel. With DFEx 5.81 1624, a similar attempt at wind direction compass letters has a similar wrap-around problem. I have tried *.addvalue() and *.addvalue(insertTime()) to no avail.

I dropped the whole issue because I found a workaround for the readable date and time, and the same for the wind direction compass letters.

ek
