Best way for reading HTTP data

I have a problem reading HTTP data over a raw socket.
I connect the socket to a web site and send "GET / HTTP/1.1\n\n".
The page arrives after a while, but not all of it. Should I use a
timer to read the web page? How do I know when the page is complete
if socket.Available is sometimes 0?

The procedure is as follows:
-Socket socket=new Socket(AddressFamily.InterNetwork,SocketType.Stream,ProtocolType.Tcp);
-socket.Connect(endpoint);
-byte[] msg=Encoding.UTF8.GetBytes("GET / HTTP/1.1\n\n");
byte[] bytes=new byte[65536];
int i=socket.Send(msg,0,msg.Length,SocketFlags.None);
MessageBox.Show("Sent "+i.ToString()+" bytes. Available: "+socket.Available.ToString()+" bytes.");
socket.Receive(bytes,0,socket.Available,SocketFlags.None);
TrafficLogTextBox.Text+=Encoding.UTF8.GetString(bytes);
TrafficLogTextBox.Text+="\r\n";
MessageBox.Show(Encoding.UTF8.GetString(bytes));

How does HttpWebResponse implement this? Does it use a timer between
periods when no data arrives? *How do I know when the page is complete?*
Have I made myself clear?

Thanks a lot,
Nuno Magalhaes.

Nov 25 '05 #1
A web page can be large, so it is normal for it to arrive over several
calls to Receive(...).
To handle this you have to parse the HTTP protocol data. The size of the
body that the server will send is given in the Content-Length HTTP header.

So the algorithm is the following:
- receive the first part of the response, which contains the HTTP headers
describing the data that follows;
- receive the amount of data specified in the Content-Length header.
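
The two steps above can be sketched in C# roughly as follows. This is only a minimal sketch, not how HttpWebResponse works internally. The names HttpReader and ReadResponse are hypothetical, and it assumes the response really does carry a Content-Length header. It reads from any Stream; for a connected socket that would be new NetworkStream(socket).

```csharp
using System;
using System.IO;
using System.Text;

static class HttpReader   // hypothetical helper, for illustration only
{
    // Reads the header block (ended by an empty line, i.e. CRLF CRLF),
    // then exactly Content-Length bytes of body. Returns the body bytes.
    public static byte[] ReadResponse(Stream stream, out string headers)
    {
        var headerBytes = new MemoryStream();
        byte[] terminator = { 13, 10, 13, 10 };   // "\r\n\r\n"
        int matched = 0;                          // terminator bytes seen in a row
        while (matched < 4)
        {
            // Byte-at-a-time reading is slow but keeps the sketch short.
            int b = stream.ReadByte();
            if (b < 0) throw new EndOfStreamException("Connection closed inside the headers.");
            headerBytes.WriteByte((byte)b);
            matched = (b == terminator[matched]) ? matched + 1 : (b == 13 ? 1 : 0);
        }
        headers = Encoding.ASCII.GetString(headerBytes.ToArray());

        // Find Content-Length (header names are case-insensitive).
        int length = -1;
        foreach (string line in headers.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries))
        {
            int colon = line.IndexOf(':');
            if (colon > 0 && line.Substring(0, colon).Trim()
                    .Equals("Content-Length", StringComparison.OrdinalIgnoreCase))
                length = int.Parse(line.Substring(colon + 1).Trim());
        }
        if (length < 0) throw new InvalidDataException("No Content-Length header.");

        // A single read may return fewer bytes than requested, so loop.
        byte[] body = new byte[length];
        int received = 0;
        while (received < length)
        {
            int n = stream.Read(body, received, length - received);
            if (n == 0) throw new EndOfStreamException("Connection closed before the full body arrived.");
            received += n;
        }
        return body;
    }
}
```

The point of the loop at the end is that the page is complete when the byte count matches the header, not when socket.Available happens to be 0.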

--
Vadym Stetsyak aka Vadmyst

http://vadmyst.blogspot.com

Nov 25 '05 #2
In most cases I don't get a "Content-Length" field in the HTTP
response header.
Any hints about what I could be doing wrong, or what I should be doing
instead?

Thank you Vadym.
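
A likely explanation, offered as a sketch rather than a diagnosis: an HTTP/1.1 server may omit Content-Length and send "Transfer-Encoding: chunked" instead, in which case the body arrives as a series of chunks, each preceded by its size in hexadecimal, and a chunk of size 0 marks the end. If neither header is present, the body ends when the server closes the connection. The helper below (ChunkedBody and Decode are hypothetical names) decodes a chunked body from a Stream that is already positioned just after the response headers.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

static class ChunkedBody   // hypothetical helper, for illustration only
{
    // Reads one line ending in LF and returns it without the CR/LF.
    static string ReadLine(Stream stream)
    {
        var sb = new StringBuilder();
        while (true)
        {
            int b = stream.ReadByte();
            if (b < 0) throw new EndOfStreamException("Connection closed mid-line.");
            if (b == '\n') break;
            if (b != '\r') sb.Append((char)b);
        }
        return sb.ToString();
    }

    // Decodes a chunked body; the stream must start at the first chunk-size line.
    public static byte[] Decode(Stream stream)
    {
        var body = new MemoryStream();
        while (true)
        {
            // Chunk header: size in hex, optionally followed by ";extension".
            string sizeLine = ReadLine(stream);
            int semicolon = sizeLine.IndexOf(';');
            if (semicolon >= 0) sizeLine = sizeLine.Substring(0, semicolon);
            int size = int.Parse(sizeLine.Trim(), NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            if (size == 0) break;   // the last chunk has size 0

            byte[] chunk = new byte[size];
            int received = 0;
            while (received < size)
            {
                int n = stream.Read(chunk, received, size - received);
                if (n == 0) throw new EndOfStreamException("Connection closed mid-chunk.");
                received += n;
            }
            body.Write(chunk, 0, size);
            ReadLine(stream);       // consume the CRLF after the chunk data
        }
        // Optional trailer headers follow, ended by an empty line.
        while (ReadLine(stream).Length > 0) { }
        return body.ToArray();
    }
}
```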

Nov 27 '05 #3
I think you are taking the wrong approach here. You should use the
HttpWebRequest and HttpWebResponse classes. They are much, much easier
than raw sockets. Here is some sample code to start with:

Imports System.IO
Imports System.Net

Module Module1
    Sub Main()
        Dim objRequest As HttpWebRequest
        Dim objResponse As HttpWebResponse
        Dim srResponse As StreamReader
        Dim strUrl As String = "http://www.msn.com"

        'initialize the request
        objRequest = CType(WebRequest.Create(strUrl), HttpWebRequest)
        objRequest.Method = "GET"

        'get the response and print the whole body
        objResponse = CType(objRequest.GetResponse(), HttpWebResponse)
        srResponse = New StreamReader(objResponse.GetResponseStream())
        Console.WriteLine(srResponse.ReadToEnd())
        srResponse.Close()

        'wait for Enter so the console window stays open
        Console.ReadLine()
    End Sub
End Module

It's in VB.NET, but you should be able to convert it quite easily. To
answer your question about how to know when the reading is done: as you
can see, calling the ReadToEnd() method on the StreamReader object
handles this for you.

I hope this helps

Nov 27 '05 #4
Maybe I'm also not passing all the parameters to the server. Do you
know if sending "GET / HTTP/1.1\n\n" is enough to receive the
Content-Length field?

Nov 27 '05 #5
I can't use those higher-level functions because I'm measuring QoS
parameters such as: time to resolve DNS, time to connect, time to
receive data, time to display the whole web page, etc.

Do you know if the "GET / HTTP/1.1" is enough to receive the
"Content-Length: " parameter in the HTTP response header?

Thank you.
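
To make the kind of measurement described here concrete, a rough sketch using Stopwatch is shown below. PhaseTimer, Measure and Run are hypothetical names, port 80 is an assumption, and Run needs network access. Only the DNS and connect phases are timed; the receive phase would be wrapped in Measure the same way.

```csharp
using System;
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;

static class PhaseTimer   // hypothetical helper, for illustration only
{
    // Runs one step and returns its result together with the elapsed time.
    public static T Measure<T>(Func<T> step, out TimeSpan elapsed)
    {
        Stopwatch watch = Stopwatch.StartNew();
        T result = step();
        watch.Stop();
        elapsed = watch.Elapsed;
        return result;
    }

    // Times DNS resolution and TCP connect for a host (assumes port 80).
    public static void Run(string host)
    {
        TimeSpan dns, connect;
        IPAddress[] addresses = Measure(() => Dns.GetHostAddresses(host), out dns);

        Socket socket = new Socket(addresses[0].AddressFamily, SocketType.Stream, ProtocolType.Tcp);
        Measure(() => { socket.Connect(addresses[0], 80); return true; }, out connect);

        Console.WriteLine("DNS: {0} ms, connect: {1} ms",
            dns.TotalMilliseconds, connect.TotalMilliseconds);
        socket.Close();
    }
}
```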

Nov 27 '05 #6
OK. I felt kind of bad posting VB code in a C# newsgroup, as I forgot
what group I was in, so I will convert this on the fly for you. Sorry
about that, guys! My bad!

using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        string strUrl = "http://www.msn.com";

        // initialize the request
        HttpWebRequest objRequest = (HttpWebRequest)WebRequest.Create(strUrl);
        objRequest.Method = "GET";

        // get the response; ReadToEnd() returns once the whole body has been read
        HttpWebResponse objResponse = (HttpWebResponse)objRequest.GetResponse();
        StreamReader srResponse = new StreamReader(objResponse.GetResponseStream());
        Console.WriteLine(srResponse.ReadToEnd());
        srResponse.Close();

        // wait for Enter so the console window stays open
        Console.ReadLine();
    }
}

Nov 27 '05 #7
The GET should do it, but I would use HEAD, as it only retrieves the
headers:

HEAD / HTTP/1.1\r\n
Host: localhost (or whatever)\r\n
\r\n

Note the \r\n line endings instead of \n, and the empty line that ends
the request.

\r\n = carriage return/line feed
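
As a rough sketch of the request format described above, the helper below builds the request bytes with \r\n line endings and a Host header. The names HttpRequestText, Build and ReadUntilClosed are hypothetical. The "Connection: close" header is an addition that is not in the post above: it asks the server to close the connection after the response, so a reader can simply read until the stream ends.

```csharp
using System;
using System.IO;
using System.Text;

static class HttpRequestText   // hypothetical helper, for illustration only
{
    // Builds a minimal HTTP/1.1 request. Every line ends in CRLF and
    // an empty line marks the end of the headers.
    public static byte[] Build(string method, string host, string path)
    {
        string request =
            method + " " + path + " HTTP/1.1\r\n" +
            "Host: " + host + "\r\n" +
            "Connection: close\r\n" +   // assumption: ask the server to close when done
            "\r\n";
        return Encoding.ASCII.GetBytes(request);
    }

    // Reads until the peer closes the connection (Read returns 0).
    public static byte[] ReadUntilClosed(Stream stream)
    {
        var all = new MemoryStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = stream.Read(buffer, 0, buffer.Length)) > 0)
            all.Write(buffer, 0, n);
        return all.ToArray();
    }
}
```

With a connected socket, something like socket.Send(HttpRequestText.Build("HEAD", "localhost", "/")) followed by ReadUntilClosed(new NetworkStream(socket)) would return the raw response. The bytes still include the headers, and a GET body may still be chunk-encoded.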

Nov 27 '05 #8
