An example of how the XML file is structured:
<xml>
  <farm name="NAME1">
    <size x="INTEGER1" y="INTEGER2" />
    <neighbor name="NAME2" />
    <crop name="PLANT" area="INTEGER3" />
    <crop … />
    …
  </farm>
  <farm name="NAME2" …>
    …
  </farm>
  …
</xml>
Develop a scanner for the XML configuration file described above. Remember that the scanner only
splits the input into tokens.
Below is a list of categories that you will need to use in your scanner.
Category ← Tokens that fall in that category
Start Tag ← <xml, <farm, <size, <neighbor, <crop
End Start Tag ← >
End Tag ← </xml>, </farm>, />
Attribute ← name, x, y, area
Assignment ← =
Number ← Any integer
String ← Anything within double quotes
Whitespace characters should be ignored.
If an invalid token is read, the error message “Invalid Token: T” should be displayed, where T is the
invalid token that was read. After an error occurs, the program should continue scanning the file.
For each token, your lex file must output the matched category, a colon, the matched token, and a newline.
For example, the input:
<xml>
  <farm name="John">
    <size x="2" y="3" />
  </farm>
</xml>
would produce the following output:
Start Tag: <xml
End Start Tag: >
Start Tag: <farm
Attribute: name
Assignment: =
String: "John"
End Start Tag: >
Start Tag: <size
…
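
The assignment itself calls for a lex specification; the Python sketch below is only an illustration of how the category table and the error rule could fit together, not a reference solution. It imitates lex's matching policy (longest match wins, ties go to the earlier rule). Several details are assumptions that the text above does not settle: numbers are taken to be unsigned digit runs, strings may not span lines, and an invalid token is either a run of letters, digits, and underscores or a single stray character. The names RULES, scan, "Skip", and "Invalid" are hypothetical.

```python
import re

# Categories from the table above, in priority order for equal-length matches.
# "Skip" and "Invalid" are helper categories assumed for this sketch.
RULES = [
    ("End Tag", re.compile(r"</xml>|</farm>|/>")),
    ("Start Tag", re.compile(r"<(?:xml|farm|size|neighbor|crop)")),
    ("End Start Tag", re.compile(r">")),
    ("Attribute", re.compile(r"name|x|y|area")),
    ("Assignment", re.compile(r"=")),
    ("Number", re.compile(r"[0-9]+")),        # assumption: unsigned integers
    ("String", re.compile(r'"[^"\n]*"')),     # assumption: no multi-line strings
    ("Skip", re.compile(r"\s+")),             # whitespace is ignored
    ("Invalid", re.compile(r"[A-Za-z0-9_]+|.")),  # fallback: word or one character
]


def scan(text):
    """Return one output line per token, in the 'Category: token' format."""
    lines = []
    pos = 0
    while pos < len(text):
        best_cat, best_len = None, 0
        # Longest match wins; the strict '>' keeps the earlier rule on ties.
        for cat, pattern in RULES:
            m = pattern.match(text, pos)
            if m and m.end() - pos > best_len:
                best_cat, best_len = cat, m.end() - pos
        lexeme = text[pos:pos + best_len]
        pos += best_len
        if best_cat == "Skip":
            continue
        if best_cat == "Invalid":
            # Report the error and keep scanning, as the assignment requires.
            lines.append(f"Invalid Token: {lexeme}")
        else:
            lines.append(f"{best_cat}: {lexeme}")
    return lines
```

Under these assumptions, scan('<farm colour=7>') returns the lines "Start Tag: <farm", "Invalid Token: colour", "Assignment: =", "Number: 7", and "End Start Tag: >". The unknown attribute is reported as a whole word because the fallback rule matches more characters than any valid rule does at that position, and scanning then carries on with the rest of the input.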