Once you understand the PDF format, you can start writing your code. You'll want to open one file stream to read the PDF file and a second file stream to write the XML. With both streams open, you loop through the PDF file and use your knowledge of the specification to interpret each piece and write it out in the format you want.
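To make that concrete, here is a rough Python sketch of the read/loop/write shape. It is not a real PDF parser: it only looks for indirect object headers of the form `12 0 obj` and writes one XML element for each. The function name `pdf_objects_to_xml`, the helper `convert`, and the `<pdf>`/`<object>` element names are made up for illustration. It also assumes lines end in `\n`, ignores compressed object streams, and can be fooled by binary stream data that happens to look like a header.

```python
import io
import re
from xml.sax.saxutils import quoteattr

# Matches the start of an indirect object, e.g. b"12 0 obj".
OBJ_HEADER = re.compile(rb"^(\d+)\s+(\d+)\s+obj\b")


def pdf_objects_to_xml(pdf_stream, xml_stream):
    """Scan a binary PDF stream line by line and write one <object>
    element per object header found. Returns the number of headers seen.

    Hypothetical helper: the XML layout is an assumption, not a standard.
    """
    count = 0
    xml_stream.write('<?xml version="1.0" encoding="UTF-8"?>\n<pdf>\n')
    for line in pdf_stream:  # iterating a binary stream splits on b"\n"
        match = OBJ_HEADER.match(line)
        if match:
            number = match.group(1).decode("ascii")
            generation = match.group(2).decode("ascii")
            xml_stream.write(
                f"  <object number={quoteattr(number)} "
                f"generation={quoteattr(generation)}/>\n"
            )
            count += 1
    xml_stream.write("</pdf>\n")
    return count


def convert(pdf_path, xml_path):
    """Open one stream for reading the PDF and another for writing the XML."""
    with open(pdf_path, "rb") as pdf_stream, \
            open(xml_path, "w", encoding="utf-8") as xml_stream:
        return pdf_objects_to_xml(pdf_stream, xml_stream)
```

Calling `convert("input.pdf", "output.xml")` (both paths are placeholders) would run the loop over a real file. The PDF is opened in binary mode because PDF files mix text with raw bytes. Everything inside the loop is where your own interpretation of the specification would go.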
This is just an overview. If you get stuck on a particular part, post your code and tell us what it should do and what it's doing wrong, and we'll take a look.