Starbeamrainbowlabs

Stardust
Blog


Archive


Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blender blog bookmarklet booting bug hunting c sharp c++ challenge chrome os cluster code codepen coding conundrums coding conundrums evolved command line compilers compiling compression conference conferences containerisation css dailyprogrammer data analysis debugging defining ai demystification distributed computing dns docker documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions freeside future game github github gist gitlab graphics guide hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs latex learning library linux lora low level lua maintenance manjaro minetest network networking nibriboard node.js open source operating systems optimisation outreach own your code pepperminty wiki performance phd photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference release releases rendering research resource review rust searching secrets security series list server software sorting source code control statistics storage svg systemquery talks technical terminal textures thoughts three thing game three.js tool tutorial twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 worldeditadditions xmpp xslt

Reading HTTP 1.1 requests from a real web server in C#

I've received rather a lot of questions recently asking the same question, so I thought that I 'd write a blog post on it. Here's the question:

Why does my network client fail to connect when it is using HTTP/1.1?

I encountered this same problem, and after half an hour of debugging I found the problem: It wasn't failing to connect at all, rather it was failing to read the response from the server. Consider the following program:

using System;
using System.IO;
using System.Net.Sockets;

class Program
{
    static void Main(string[] args)
    {
        TcpClient client = new TcpClient("host.name", 80);
        client.SendTimeout = 3000;
        client.ReceiveTimeout = 3000;
        StreamWriter writer = new StreamWriter(client.GetStream());
        StreamReader reader = new StreamReader(client.GetStream());
        writer.WriteLine("GET /path HTTP/1.1");
        writer.WriteLine("Host: server.name");
        writer.WriteLine();
        writer.Flush();

        string response = reader.ReadToEnd();

        Console.WriteLine("Got Response: '{0}'", response);
    }
}

If you change the hostname and request path, and then compile and run it, you (might) get the following error:

An unhandled exception of type 'System.IO.IOException' occurred in System.dll

Additional information: Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time or established connection failed because connected host has failed to respond.

Strange. I'm sure that we sent the request. Let's try reading the response line by line:

string response = string.Empty;
do
{
    string nextLine = reader.ReadLine();
    response += nextLine;
    Console.WriteLine("> {0}", nextLine);
} while (reader.Peek() != -1);

Here's some example output from my server:

> HTTP/1.1 200 OK
> Server: nginx/1.9.10
> Date: Tue, 09 Feb 2016 15:48:31 GMT
> Content-Type: text/html
> Transfer-Encoding: chunked
> Connection: keep-alive
> Vary: Accept-Encoding
> strict-transport-security: max-age=31536000;
>
> 2ef
> <html>
> <head><title>Index of /libraries/</title></head>
> <body bgcolor="white">
> <h1>Index of /libraries/</h1><hr><pre><a href="../">../</a>
> <a href="prism-grammars/">prism-grammars/</a>
   09-Feb-2016 13:56                   -
> <a href="blazy.js">blazy.js</a>                                           09-F
eb-2016 13:38                9750
> <a href="prism.css">prism.css</a>                                          09-
Feb-2016 13:58               11937
> <a href="prism.js">prism.js</a>                                           09-F
eb-2016 13:58               35218
> <a href="smoothscroll.js">smoothscroll.js</a>
   20-Apr-2015 17:01                3240
> </pre><hr></body>
> </html>
>
> 0
>

...but we still get the same error. Why? The reason is that the web server is keeping the connection open, just in case we want to send another request. While this would usually be helpful (say in the case of a web browser - it will probably want to download some images or something after receiving the initial response), it's rather a nuisance for us, since we don't want to send another request and it's rather awkward to detect the end of the response without detecting the end of the stream (that's what the while (reader.Peek() != -1); is for in the example above).

Thankfully, there are a few solutions to this. Firstly, the web server will sometimes (but not always - take the example response above for starters) send a content-length header. This header will tell you how many bytes follow after the double newline (\r\n\r\n) that separate the response headers from the response body. We could use this to detect the end of the message. This is the reccommended way , according to RFC2616.

Another way to cheat here is to send the connection: close header. This instructs the web server to close the connection after sending the message (Note that this will break some of the tests in the ACW, so don't use this method!). Then we can use reader.ReadToEnd() as normal.

A further cheat would be to detect the expected end of the message that we are looking for. For HTML this will practically always be </html>. We can close the connection after we receive this line (although this doesn't work when you're not receiving HTML). This is seriously not a good idea. The HTML could be malformed, and not contain </html>.

Art by Mythdael