parse http header

I am having trouble figuring out how to parse an HTTP header into a map<string,string>.
    POST /blah HTTP/1.1
    Host: example.com
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 25

My idea was to find the first ':' on each line and use the text before it as the key, then use everything after it up to the \r\n as the value. The problem is that the spec (http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html) says values can span multiple lines. Any ideas on how I would handle those? Can the value include a colon? (The spec doesn't seem to say.)
Aug 9 '10 #1

johny10151981
First separate all the lines; every line ends with "\r\n".
The first line has a fixed format. It always starts with the method: GET, POST, DELETE, HEAD or something else I can't recall right now.
From the second line on, you can split each line at ":".

If the request is a POST request, then after the "\r\n\r\n" you get the request body (the URL-encoded form data) in one line.
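
On the colon question: a value can contain colons (for example a Host value with a port), so only the first ':' on a line should be treated as the separator. A minimal C++17 sketch of that split follows. `split_header_line` and `trim` are made-up helper names, not part of any library, and the function deliberately ignores folded multi-line values.

```cpp
#include <optional>
#include <string>
#include <utility>

// Trim spaces and tabs from both ends of a string.
static std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Split "Name: value" at the FIRST colon only, so colons inside the value survive.
// Returns std::nullopt if the line has no colon or the name would be empty.
// The line is expected to have its trailing "\r\n" already removed.
std::optional<std::pair<std::string, std::string>>
split_header_line(const std::string& line) {
    const auto colon = line.find(':');
    if (colon == std::string::npos || colon == 0) return std::nullopt;
    return std::make_pair(line.substr(0, colon), trim(line.substr(colon + 1)));
}
```

With this, `split_header_line("Host: example.com:8080")` yields the pair ("Host", "example.com:8080"). The first line of the message should be skipped before calling it, since a request line such as "GET http://example.com/ HTTP/1.1" also contains a colon.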
Aug 9 '10 #2
johny10151981
Your posted header is missing one line.

Your posted header says it is a POST request and its content length is 25, but I can't see the body (the URL-encoded form data). I guess your program stopped after getting "\r\n\r\n", which is not right.
Aug 9 '10 #3
Oralloy
Parsing is always non-trivial. The line break information you need is described in section 2.2 of the spec, however.

Why are you writing this parser, rather than using one that's already been implemented?
Aug 9 '10 #4
@Oralloy
I was using a small HTTP client code sample. The sample just ignored the header data and skipped straight past the \r\n\r\n to get the response data. I am only doing GETs, and one of the URLs redirects, which is why I need the header data.

@johny10151981
It was only an example, not real data, and I am only doing GETs. Your suggestion doesn't help with multi-line values.

Thanks for referring me to section 2.2. It answered my question.
Aug 9 '10 #5
Oralloy
dschu012,

If you're using a very lightweight sample as your starting point, then you're going to have to do some work to parse the headers. Since you're only interested in one of the headers, you can cheat by reading the headers one line at a time and processing them. If you find the redirect, you're done; if you don't, you have the content.

Pseudo code:
    looking = inHeaders = true
    while (looking && inHeaders)
      read line
      if (eof)
        inHeaders = false
      else if (line == "")
        inHeaders = false          // blank line: end of the headers
      else if (strncmp(line, "Location:", 9) != 0)
        ; // no-op - not the header we want
      else
        looking = false            // found the header we want
    end while

    if (!looking)
      process redirect             // HTTP sends the redirect target in "Location:"
    else
      process result

If you read the specification, then you have a good idea of how messy parsing the headers can be.

Rather than re-inventing the wheel to parse headers, it might be worth some time to go find a little better example to start with.

If you're not stuck with C++, there is a really good Perl module for accessing web servers.

On the other hand, if you're stuck with C++, I'd say use a combination of Lex and Yacc to really simplify the work.

Another good option would be to read the entire mess into a single buffer and parse it using regex. I'm pretty sure that it'll be fairly easy to write regular expressions to parse the headers. Start by dividing the headers from the content at the first occurrence of "\r\n\r\n". Then tear the headers off of the header block one at a time using one regex, repeatedly.
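
A hedged sketch of the regex approach in C++ using `std::regex`. `parse_headers_regex` is an invented name, and the pattern is a simplification: it assumes one header per line, so folded continuation lines, quoting and comments are not handled, and a repeated header name overwrites the earlier value.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <string>

// Cut the message at the first "\r\n\r\n", skip the request/status line,
// then pull "Name: value\r\n" pairs out of the header block one at a time.
std::map<std::string, std::string> parse_headers_regex(const std::string& message) {
    std::map<std::string, std::string> headers;

    const auto first_eol = message.find("\r\n");
    if (first_eol == std::string::npos) return headers;    // no complete first line

    const auto header_end = message.find("\r\n\r\n");
    const std::size_t begin = first_eol + 2;                // start of the first header
    const std::size_t stop =
        (header_end == std::string::npos) ? message.size() : header_end + 2;
    const std::string block = message.substr(begin, stop - begin);

    // name, ':', optional spaces/tabs, value up to the end of the line.
    // Only the first colon separates; later colons stay in the value.
    static const std::regex header_re(R"(([^:\s]+):[ \t]*([^\r\n]*)\r\n)");

    for (std::sregex_iterator it(block.begin(), block.end(), header_re), last;
         it != last; ++it) {
        headers[(*it)[1].str()] = (*it)[2].str();
    }
    return headers;
}
```

Keys are stored exactly as they were sent. Since header names are case-insensitive, a real client would probably lower-case them before inserting or looking them up.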

Failing that, I'd write a simple state machine/recognizer for general headers. The problem is that by the time you're done implementing all the quoting forms and comments, you're going to have a rather complex bit of software.
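
A minimal sketch of such a recognizer, assuming the only extra rule to support is line folding: a continuation line starts with a space or tab and is joined to the previous value with a single space, per sections 2.2 and 4.2 of RFC 2616. The names `parse_headers_fsm` and `State` are made up for illustration, and quoted strings and comments are left out on purpose.

```cpp
#include <map>
#include <string>

enum class State { StartLine, LineStart, Name, SkipWs, Value };

std::map<std::string, std::string> parse_headers_fsm(const std::string& message) {
    std::map<std::string, std::string> headers;
    std::string name, value;
    State state = State::StartLine;

    // Store the header collected so far (if any) and reset the buffers.
    auto commit = [&]() {
        while (!value.empty() && (value.back() == ' ' || value.back() == '\t'))
            value.pop_back();                        // drop trailing whitespace
        if (!name.empty()) headers[name] = value;
        name.clear();
        value.clear();
    };

    for (char c : message) {
        if (c == '\r') continue;                     // treat CRLF and bare LF alike
        const bool ws = (c == ' ' || c == '\t');
        switch (state) {
        case State::StartLine:                       // skip the request/status line
            if (c == '\n') state = State::LineStart;
            break;
        case State::LineStart:
            if (ws) {
                state = State::SkipWs;               // folded line: keep the current header
            } else if (c == '\n') {
                commit();                            // blank line ends the header block
                return headers;
            } else {
                commit();                            // a new header begins
                name += c;
                state = State::Name;
            }
            break;
        case State::Name:
            if (c == ':') state = State::SkipWs;
            else if (c == '\n') { name.clear(); state = State::LineStart; }  // no colon: drop
            else name += c;
            break;
        case State::SkipWs:                          // skip whitespace after ':' or a fold
            if (c == '\n') state = State::LineStart;
            else if (!ws) {
                if (!value.empty()) value += ' ';    // a fold becomes a single space
                value += c;
                state = State::Value;
            }
            break;
        case State::Value:
            if (c == '\n') state = State::LineStart;
            else value += c;                         // colons here are part of the value
            break;
        }
    }
    commit();                                        // input ended without a blank line
    return headers;
}
```

Each additional rule from the spec (quoted strings, comments, repeated headers) would add states or bookkeeping to this loop, which is where the complexity mentioned above comes from.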

See section 2 of the document you sent...
Aug 10 '10 #6
