Parsing an XML recipe

jrm16311 · September 7, 2009

I would like to automate (press a button, select a file, and go) the process of pulling specific data points from process recipe files in XML format and using them to auto-fill process parameters (stored in V channels). Please see the XML and the target channels below. Is this possible? What would the script look like? Thanks for your assistance.

XML Recipe Code:

<MASH>

<MASH_STEPS>

<MASH_STEP>

<NAME>Protein Rest</NAME>

<TYPE>Infusion</TYPE>

<INFUSE_AMOUNT>11.829480</INFUSE_AMOUNT>

<STEP_TIME>30</STEP_TIME>

<STEP_TEMP>50.00000000</STEP_TEMP>

<RAMP_TIME>2</RAMP_TIME>

<END_TEMP>50.00000000</END_TEMP>

<DESCRIPTION>Add 3.13 gal of water at 129.3 F</DESCRIPTION>

<WATER_GRAIN_RATIO>0.31</WATER_GRAIN_RATIO>

<DECOCTION_AMT>0.00 gal</DECOCTION_AMT>

<INFUSE_TEMP>129.3 F</INFUSE_TEMP>

<DISPLAY_STEP_TEMP>$DISPLAY_STEP_TEMP</DISPLAY_STEP_TEMP>

<DISPLAY_INFUSE_AMT>3.13 gal</DISPLAY_INFUSE_AMT>

</MASH_STEP>

</MASH_STEPS>

</MASH>

</RECIPE>

</RECIPES>

TO

DAQFactory

VChannels:

V.Name (get from <NAME>)

V.Step_Time (get from <STEP_TIME>)

V.Step_Temp (get from <STEP_TEMP>)

V.Infusion_Temp (get from <INFUSE_TEMP>)

AzeoTech · September 8, 2009

Definitely possible. DAQFactory script is powerful enough to read any file type into DF. Now in your case, there are several questions: is the data always in this format, or do you need a general XML parser, and are there carriage returns at the end of the lines as you posted it?

If you need a general XML parser, that is a bit more involved than this forum can cover. For that, I would find an open source parser and either port it over to DF or compile it into a DLL and call it from DF.

However, if you just need to read this XML schema, then you can just read the file looking for the desired tags. If there are no carriage returns and everything is in one long line, then it's a bit harder because you don't have an easy delimiter (though I suppose you could use /). Since it appears you do have carriage returns, the File.Read() function will delimit for you. It'd be something like this:

private handle = file.open("myfile.xml",1,0,0,1)
while (1)
   private string datain = file.read(handle)
   if (datain == "")
      break
   endif
   switch
      case (left(datain,6) == "<NAME>")
         v.name = parse(mid(datain,6,1000),"<",0)
      case (left(datain,11) == "<STEP_TIME>")
         v.step_time = parse(mid(datain,11,1000),"<",0)
      // add a case for each remaining tag
   endcase
   delay(0.05)
endwhile
file.close(handle)

I skipped all error handling and added the delay(0.05) which is not required but ensures that if you mess up, you don't hang the computer. Once you have it tested you can remove the delay(0.05) to make it run faster. I also did not test this.

What we do is read the file one line at a time. Since we opened the file in text mode (the last 1 in open()), doing file.read() reads a single line. We then look at each line searching for the desired opening tag. I've shown two, you can add the rest. If we find it, we use mid() to strip the open tag, then parse to find the < for the closing tag and retrieve everything in between.
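For readers who want to check the logic outside DAQFactory, here is a minimal Python sketch of the same line-by-line approach. It is not DAQFactory script; the `WANTED` table, the `read_recipe` helper, and the dictionary keys are hypothetical stand-ins for the V channels, and the sample lines are taken from the recipe excerpt in the first post.

```python
# Opening tags of interest, mapped to hypothetical destination names
# (stand-ins for the V channels in the original question).
WANTED = {
    "<NAME>": "name",
    "<STEP_TIME>": "step_time",
    "<STEP_TEMP>": "step_temp",
    "<INFUSE_TEMP>": "infusion_temp",
}


def read_recipe(lines):
    """Scan one line at a time; when a line starts with a wanted opening
    tag, keep the text between that tag and the next '<'."""
    values = {}
    for raw in lines:
        line = raw.rstrip("\r\n")  # drop only the line ending
        for tag, key in WANTED.items():
            if line.startswith(tag):
                rest = line[len(tag):]              # strip the opening tag
                values[key] = rest.split("<", 1)[0]  # stop at the closing tag
                break
    return values


sample_lines = [
    "<MASH_STEP>\n",
    "<NAME>Protein Rest</NAME>\n",
    "<TYPE>Infusion</TYPE>\n",
    "<STEP_TIME>30</STEP_TIME>\n",
    "<STEP_TEMP>50.00000000</STEP_TEMP>\n",
    "<INFUSE_TEMP>129.3 F</INFUSE_TEMP>\n",
    "</MASH_STEP>\n",
]

result = read_recipe(sample_lines)
print(result)
```

Like the script above, this only works when each tag sits on its own line and starts at the first character of that line.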

ccdubs · September 8, 2009

Hi jrm16311,

It looks to me like you are doing some beer brewing.

Is this homebrew or professional? I am planning on getting back into homebrew and was thinking about using DF for logging and controlling parts of the process.

So if what you are doing is for homebrew and not commercially sensitive I would have an interest in seeing what you have come up with.

jrm16311 · September 27, 2009

In the code above,

case (left(datain,6) == "<NAME>")

v.name = parse(mid(datain,6,1000),"<",0)

Can you elaborate on this part of the code? I'm trying to understand and trying to get it to work, but haven't been able to.

AzeoTech · September 28, 2009

Sure. I'll assume you understand the switch/case structure.

left(datain,6): left() takes a string and returns the first x characters, in this case 6. Since we opened the file in text mode, when we do read() we get one line at a time, so I'm just looking at the first 6 characters of the line. I should mention that this will not strip any spaces at the beginning, but from your post there did not appear to be any. Maybe there are but I can't see them. In that case, you'll want to use the ltrim() function, which takes a string and returns the same string without any preceding spaces:

left(ltrim(datain),6)

mid(datain,6,1000): mid() is like left(), but starts in the middle (in this case character 6, the 7th character in the string), and pulls out x characters, in this case 1000, which basically gets us to the rest of the line. Again, you may need to ltrim() it. Actually if you need to ltrim(), you should just do it in front of the switch:

datain = ltrim(datain)

then you can ignore any preceding spaces in all your logic.

parse(x, "<", 0)

The parse function splits a string on the specified character and returns the substring at the specified index. So, since you did mid(datain,6,1000), you basically end up with:

parse("myname</NAME>","<",0)

Parse will then split the string into "myname" and "/NAME>", but since I put 0, we only get "myname". If we had specified 1, we would have gotten "/NAME>". If we had specified -1, we would have gotten a string array with both strings.

This is just a shortcut to strip the </NAME>. We could have used Find() and searched for the < then used Mid() again, but this is much faster.
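The three calls can be traced step by step. The following Python sketch emulates left(), mid(), and parse() as they are described in this post; the emulations are assumptions based on that description (zero-based start index for mid(), index -1 returning all pieces for parse()), not the DAQFactory implementations.

```python
def left(s, n):
    """First n characters of s."""
    return s[:n]


def mid(s, start, count):
    """count characters of s beginning at zero-based position start."""
    return s[start:start + count]


def parse(s, delimiter, index):
    """Split s on delimiter; return piece number index, or every piece
    as a list when index is -1 (as described in the post above)."""
    parts = s.split(delimiter)
    return parts if index == -1 else parts[index]


line = "<NAME>myname</NAME>"

tag_matches = left(line, 6) == "<NAME>"  # does the line open with the tag?
remainder = mid(line, 6, 1000)           # everything after the opening tag
value = parse(remainder, "<", 0)         # text before the closing tag's '<'

print(tag_matches, remainder, value)
```

Asking for a count of 1000 when fewer characters remain simply returns what is left, which is why the large number acts as "the rest of the line".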

jrm16311 · September 28, 2009

Excellent explanation. I did get the code to finally work, however there are two issues that I would like help with.

1. The XML files contain recipe parameters that are generated from a program called BeerSmith. The XML generated contains the following two lines above the recipe information:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- BeerXML Format - Generated by BeerSmith  - see www.beersmith.com -->

The code originally did not work for me until I removed the two lines above. I'm wondering if there is a way to read the XML file without removing the two lines (as all recipes will have the lines).

2. The XML contains identically named tags at different levels, such as the <NAME> tags in the excerpt below. Currently, when I run the sequence, the value is overwritten each time a matching tag is read, so I end up with the value from the last matching tag. Is there a way to target one specific tag?

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- BeerXML Format - Generated by BeerSmith  - see www.beersmith.com -->
<RECIPES>
<RECIPE>
 <NAME>ALT2</NAME>
 <BATCH_SIZE>23.00000000</BATCH_SIZE>
 <BOIL_SIZE>27.50000000</BOIL_SIZE>
 <BOIL_TIME>60</BOIL_TIME>
 <EFFICIENCY>70.00</EFFICIENCY>
 <HOPS>
  <HOP>
   <NAME>Spalt</NAME>
   <AMOUNT>0.0420000</AMOUNT>
   <USE>Boil</USE>
   <TIME>60.000</TIME>
  </HOP>
  <HOP>
   <NAME>Spalt</NAME>
   <VERSION>1</VERSION>
   <AMOUNT>0.0080000</AMOUNT>
   <USE>Boil</USE>
   <TIME>10.000</TIME>
  </HOP>
 </HOPS>
</RECIPE>
</RECIPES>

AzeoTech · September 28, 2009

1) I'm not sure why those two lines wouldn't just get ignored, but you can simply put two:

file.read(handle)

lines after the open() and before the while() to read and discard those two header lines.

2) For this, you'll need to create more case statements to detect where you are in the XML file (based on other tags), and use variables to store that, then create some logic to pull out the desired values.
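One way to picture that bookkeeping is the Python sketch below. It is an illustration, not DAQFactory script: a variable remembers whether the reader is currently inside a <HOP> block, and the same <NAME> tag is routed differently depending on that state. The `extract` and `read_recipe` helpers and the result layout are hypothetical; the sample lines come from the BeerSmith excerpt posted above.

```python
def extract(line, tag):
    """Text between <tag> and the next '<', or None if the line does
    not start with <tag>."""
    opening = "<" + tag + ">"
    if not line.startswith(opening):
        return None
    return line[len(opening):].split("<", 1)[0]


def read_recipe(lines):
    recipe_name = None
    hops = []
    current_hop = None  # None means "not inside a <HOP> block"
    for raw in lines:
        line = raw.strip()  # drop indentation and the line ending
        if line == "<HOP>":
            current_hop = {}
        elif line == "</HOP>" and current_hop is not None:
            hops.append(current_hop)
            current_hop = None
        else:
            name = extract(line, "NAME")
            amount = extract(line, "AMOUNT")
            if name is not None:
                if current_hop is not None:
                    current_hop["name"] = name   # a hop's name
                elif recipe_name is None:
                    recipe_name = name           # the recipe's own name
            elif amount is not None and current_hop is not None:
                current_hop["amount"] = amount
    return recipe_name, hops


sample_lines = [
    '<?xml version="1.0" encoding="ISO-8859-1"?>',
    "<RECIPES>",
    "<RECIPE>",
    " <NAME>ALT2</NAME>",
    " <HOPS>",
    "  <HOP>",
    "   <NAME>Spalt</NAME>",
    "   <AMOUNT>0.0420000</AMOUNT>",
    "  </HOP>",
    "  <HOP>",
    "   <NAME>Spalt</NAME>",
    "   <AMOUNT>0.0080000</AMOUNT>",
    "  </HOP>",
    " </HOPS>",
    "</RECIPE>",
    "</RECIPES>",
]

recipe_name, hops = read_recipe(sample_lines)
print(recipe_name, hops)
```

The same pattern should carry over to a switch block: add cases for the opening and closing tags of each container, set or clear a flag there, and check the flag inside the <NAME> case.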

jrm16311 · January 31, 2011

How do I remove the XML hierarchy with a script before reading the lines?

It's the spaces before each line from the hierarchy that's messing up the code above.

AzeoTech · January 31, 2011

The ltrim() function should remove the spaces.

jrm16311 · January 31, 2011

thanks.

trial and error is not working, so specifically, where would that function go?

code looks like this:

private handle = file.open("recipes.xml",1,0,0,1)
delay(1)
while (1)
   private string datain = file.read(handle)
   if (datain == "")
      break
   endif
   switch
      case (left(datain,12) == "<BATCH_SIZE>")
         v.batchsize = parse(mid(datain,12,1000),"<",0)
      case (left(datain,11) == "<BOIL_SIZE>")
         v.boilsize = parse(mid(datain,11,1000),"<",0)
      case (left(datain,11) == "<BOIL_TIME>")
         v.boiltime = parse(mid(datain,11,1000),"<",0)
   endcase
   delay(0.05)
endwhile
file.close(handle)

AzeoTech · January 31, 2011

After "private string datain = file.read(handle)" just do:

datain = ltrim(datain)
