Stardust | Starbeamrainbowlabs

Website Integrations #3: Twitter cards

(The posts featured in the above images are this one about my new Raspberry Pi 3, and the latest coding conundrums evolved post).

You have arrived at the third of three parts in my mini-series on how I implemented rich snippets. In the last two parts I tackled open graph and becoming an oEmbed provider. In this part, I'll be talking a bit about twitter cards, and how I implemented them.

Twitter's take on the problem seems to be much simpler than Facebook's, which makes for easy implementation :D As with the other two protocols, they decided to have multiple different types of, well, in this case, cards. I decided to implement the summary card type. Like open graph, it adds a bunch of <meta /> tags to the header. Sigh. Anyway, here are the property names I needed to implement:

twitter:card - The type of card. In my case this is set to summary
twitter:site - This one's confusing. Although it's called 'site', it should actually be set to your own twitter handle - mine is @SBRLabs.
twitter:title - The title of the content. Practically identical to open graph's og:title.
twitter:description - The description of the content. The same as og:description.
twitter:image - A url pointing to an image that should be displayed next to the title and description. Unlike Facebook's open graph, twitter appears to support https urls here with no problem at all.

Since I already had 90% of the infrastructure and calculations in place after implementing open graph, throwing together values for the above wasn't too difficult. Here's an example set of twitter card <meta /> tags generated by the updated code:

<meta property="twitter:card" content="summary" />
<meta property="twitter:site" content="@SBRLabs" />
<meta property="twitter:title" content="Running Prolog on Linux" />
<meta property="twitter:description" content="Hello! I hope you had a nice restful Easter. I've been a bit busy this last 6 months, but I've got a holiday at the moment, and I've just received a .... (click to read more)" />
<meta property="twitter:image" content="https://starbeamrainbowlabs.com/blog/images/20151015-learning-swi-prolog-banner.svg" />
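To illustrate how little extra work this is once the open graph values exist, here's a minimal sketch in Python (not the PHP this blog actually runs). The function name and the dictionary layout are assumptions made up for this example; only the tag names come from the list above.

```python
def twitter_tags_from_open_graph(og, site="@SBRLabs"):
    """Derive summary-card values from Open Graph values already computed.

    og is a hypothetical dict keyed by Open Graph property name.
    """
    return {
        "twitter:card": "summary",
        "twitter:site": site,
        "twitter:title": og["og:title"],
        "twitter:description": og["og:description"],
        # https image urls were fine for twitter, so prefer the secure url
        # when one is present and fall back to the plain og:image value.
        "twitter:image": og.get("og:image:secure_url") or og["og:image"],
    }
```

Each key / value pair can then be rendered as a <meta /> tag by whatever code already renders the open graph ones.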

Update October 2020: It doesn't look like Twitter likes SVG images very much. Watch out!

Easy peasy. Next up was testing time. Thankfully, Twitter made this easy too by providing an official testing tool. Interestingly, they whitelist domains based on whether the webmaster has run a url through their tool - so if you want twitter cards to show up, make sure you plug at least one of your website's page urls through their tool.

After a few tweaks, I got this:

With that, my work was complete. This brings us to the end of my mini-series on rich-snippet integrations (unless I've missed a protocol O.o Comment below if I have)! I hope you've found it useful. If you have (or even if you haven't!) please let me know in the comments below :D

Website Integrations #2: oEmbed

Welcome to part 2 of this impromptu miniseries! In this second part of three, I'll be showing you a little about how I set up and tested a simple oEmbed provider for my blog posts - I've seen lots of oEmbed client information out there, but not much in the way of provider (or server) implementations.

If you haven't read part one about the open graph protocol yet, then you might find it interesting.

oEmbed is a bit different to open graph in that instead of throwing a bunch of meta tags into your <head />, you instead use a special <link /> element that points interested parties in the direction of some nice tasty json. Personally, I find this approach to be more sensible and easier to handle - the kind of thing you'd expect from an open standard.

To start with, I took a read of their specification, as I did with open graph. It doesn't have as many examples as I'd have liked, and I had to keep jumping around, but it's certainly not the worst I've seen.

oEmbed is built on the idea of providers (that's me!) and consumers (the programs and websites you use). Providers, erm, provide machine-readable information about urls passed to them, and consumers take this information and display it to the user in a manner they think is appropriate.

To start with, I created a new PHP file to act as my provider over at https://starbeamrainbowlabs.com/blog/oembed.php and took a look at the different oEmbed types available - oEmbed has a type system of sorts, similar to open graph. I decided on link - while a rich embed would look cool, it would be almost impossible to test with every client out there, and I can't guarantee how the html would be rendered or what space it would have either.

With that decided, I made a list of the properties that I'd need to include in the json response:

version - The version of oEmbed. Currently 1.0 as of the time of typing.
type - The oEmbed type. I chose link here.
title - The title of the page
author_name - The name of the author
author_url - A link to the author's homepage.
provider_name - The provider's name.
provider_url - A link to the provider's homepage. I chose my blog index, since this script will only serve my blog.
cache_age - How long consumers should cache the response for. I put 1 hour (3600 seconds) here, since I usually correct mistakes after posting that I've missed, and I want them to go out fairly quickly.
thumbnail_url - A link to a suitable thumbnail picture.
thumbnail_width - The width of the thumbnail image, in pixels.
thumbnail_height - The height of the thumbnail image, in pixels.

Then I looked at the data I'd be getting from the client. It all comes in the form of GET parameters:

format - Either json or xml. Personally, I only support json.
url - The url to send oEmbed information for.

With all the information close at hand, I spent a happy hour or so writing code, and ended up with a script that outputs something like this:

{
    "version": "1.0",
    "type": "link",
    "title": "Website Integrations #1: Open Graph",
    "author_name": "Starbeamrainbowlabs",
    "author_url": "https:\/\/starbeamrainbowlabs.com\/",
    "provider_name": "Stardust | Starbeamrainbowlabs' Blog",
    "provider_url": "https:\/\/starbeamrainbowlabs.com\/blog\/",
    "cache_age": 3600,
    "thumbnail_url": "https:\/\/starbeamrainbowlabs.com\/images\/logos\/open-graph.png",
    "thumbnail_width": 300,
    "thumbnail_height": 300
}

(See it for yourself!)

Though the specification includes requirements for satisfying 2 extra GET parameters, maxwidth and maxheight, I chose to ignore them since writing a dynamic thumbnail rescaling script is both rather complicated and requires a not insignificant amount of processing power every time it is used.
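For anyone who wants the general shape of such a provider, here's a rough sketch in Python rather than PHP. It is not the code behind oembed.php: the lookup_post callable, the default of json when no format is given, and the plain-text error bodies are all assumptions for illustration. The 501 and 404 status codes are the ones the oEmbed specification lists for an unsupported format and an unknown url.

```python
import json

SUPPORTED_FORMATS = {"json"}

def oembed_response(params, lookup_post):
    """Return (http_status, body) for an oEmbed request.

    params      -- dict of GET parameters (format, url)
    lookup_post -- hypothetical callable: url -> dict with the title and
                   thumbnail details, or None if the url is unknown
    """
    fmt = params.get("format", "json")  # assumption: default to json
    if fmt not in SUPPORTED_FORMATS:
        # 501 is the status the spec gives for formats we can't produce.
        return 501, "Not Implemented"
    post = lookup_post(params.get("url", ""))
    if post is None:
        return 404, "Not Found"
    body = {
        "version": "1.0",
        "type": "link",
        "title": post["title"],
        "author_name": "Starbeamrainbowlabs",
        "author_url": "https://starbeamrainbowlabs.com/",
        "provider_name": "Stardust | Starbeamrainbowlabs' Blog",
        "provider_url": "https://starbeamrainbowlabs.com/blog/",
        "cache_age": 3600,  # 1 hour, in seconds
        "thumbnail_url": post["thumbnail_url"],
        "thumbnail_width": post["thumbnail_width"],
        "thumbnail_height": post["thumbnail_height"],
    }
    return 200, json.dumps(body, indent=4)
```

Like the real script, this sketch ignores maxwidth and maxheight entirely.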

After finishing the oEmbed script, I turned my attention to one final detail: The special <link /> tag required for auto-discovery. A quick bit of PHP in the article page renderer adds something like this to the header:

<link rel="alternate" type="application/json+oembed" href="https://starbeamrainbowlabs.com/blog/oembed.php?format=json&url=https%3A%2F%2Fstarbeamrainbowlabs.com%2Fblog%2Farticle.php%3Farticle%3Dposts%252F229-Website-Integrations-1-Open-Graph.html" />
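The url parameter in that tag is the page's own address, percent-encoded. Since the article url already contains a %2F, encoding it again is what produces the %252F you can see above. Here's a small Python sketch of building the tag; the function name is made up for this example, and the real renderer is written in PHP.

```python
from html import escape
from urllib.parse import quote

ENDPOINT = "https://starbeamrainbowlabs.com/blog/oembed.php"

def discovery_link(page_url):
    # safe="" makes quote() encode every reserved character, including "/"
    # and "%", so an already-encoded %2F in the page url becomes %252F.
    encoded = quote(page_url, safe="")
    href = f"{ENDPOINT}?format=json&url={encoded}"
    # escape() writes the "&" separator as "&amp;", the strictly valid form
    # inside an HTML attribute. Browsers accept the bare "&" too.
    return (
        '<link rel="alternate" type="application/json+oembed" '
        f'href="{escape(href)}" />'
    )
```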

With that, my oEmbed provider implementation is complete - but it still needs testing! Unfortunately, testing tools for oEmbed are few and far between, but I did manage to find a few:

oEmbed Tester - A basic testing tool. Appears to work well for the most part - except the preview. Not sure why it says "Preview not available." all the time.
Iframely URL Debugger - Actually a testing tool for some commercial tool or other, but it still appears to accurately test not only oEmbed, but open graph and twitter cards (more on them in the next post!) too!

After testing and fixing a few bugs, my oEmbed provider was complete! Next time, I'll be taking a look at twitter's take on the subject: Twitter cards.

Found this interesting? Comment below! Share it with a friend!

Website Integrations #1: Open Graph

The logo of the Open Graph protocol.

These days, if you share a link to a website or a blog post with a friend or on a social networking site, sometimes the link expands to a preview of the link you've just posted. Personally, I find this behaviour to be quite helpful, as it lets me get an idea as to what it is that I'm about to click on.

Unfortunately, when it comes to the code behind these previews, there are no less than 3(!) different protocols that you need to implement in order to get it to work, since facebook, twitter, and the rest of the web community haven't been talking to each other quite like they should have been.

Anyway, after implementing these 3 protocols and having a bit of trouble with them, I thought I'd write up a mini-series on the process I went through, the problems I encountered, and how I solved them. In this post, I'm going to explain Facebook's Open Graph protocol.

I decided that I'd implement these 3 protocols on my home page and each blog post's page. Open Graph was the easiest - all it requires is a bunch of meta tags. These tags are split into 2 parts - the common tags, which all page types should have, and the type-specific tags, which depend on the type of page you're implementing them on. Here's the list of common tags I implemented:

og:title - The title of your page
og:description - A short description of your page
og:image, og:image:url, and og:image:secure_url - The url of an image that would fit as a preview for the page
og:url - The url of the page (not sure why this is required, since you have to know the url in order to request the page... :P Perhaps it's to help with deduplication - I'm not sure)

These were fairly easy for my home page:

<meta property="og:title" content="Starbeamrainbowlabs" />
<meta property="og:description" content="Hi! I am a computer science student who is in their second year at Hull University. I started out teaching myself about various web technologies, and then I managed to get a place at University, where I am now." />
<meta property="og:image" content="http://starbeamrainbowlabs.com/favicon.png" />
<meta property="og:image:url" content="http://starbeamrainbowlabs.com/favicon.png" />
<meta property="og:image:secure_url" content="https://starbeamrainbowlabs.com/favicon.png" />
<meta property="og:url" content="https://starbeamrainbowlabs.com/" />

When I went to test it using Facebook's official testing tool, the biggest problem I had was that the image wouldn't show up - no matter what I did. I eventually found this stackoverflow answer which explained that Facebook doesn't support https urls in anything other than the og:image:secure_url meta tag (even though they say they do) - so changing the urls to regular http solved the problem.

Next, I took a look at the type-specific tags. There's a whole bunch of them (check out this section of the spec) - I decided on the profile type for the index page of my website here:

<meta property="og:type" content="profile" />

The profile type has a few extra specific meta tags that need setting too, so I added those:

<meta property="profile:first_name" content="Starbeamrainbowlabs" />
<meta property="profile:last_name" content="Tjovik" />
<meta property="profile:username" content="Starbeamrainbowlabs" />

With that done, I turned my attention to my blog posts. Since the page is rendered in PHP (and typing out all those meta tags was rather annoying), I created a teensy little framework to output the meta tags for me:

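$metaTags = [];
$metaTags["property"] = "value";

$renderedMetaTags = "";
foreach($metaTags as $metaKey => $metaValue) {
    // Escape both parts so a double quote or angle bracket in a title can't break the markup
    $key = htmlspecialchars($metaKey, ENT_COMPAT);
    $value = htmlspecialchars($metaValue, ENT_COMPAT);
    // End each tag with a newline so every tag sits on its own line
    $renderedMetaTags .= "\t<meta property=\"$key\" content=\"$value\" />\n";
}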

Now I can add as many meta tags as I like, with a fraction of the typing - and it looks neater too :D With that done, I implemented the basic meta tags. Here's some example output from the last post:

<meta property="og:title" content="4287 Reasons why your comments weren't posted" />
<meta property="og:description" content="I don't get a lot of real comments on here from what I can tell, as you've probably noticed. I don't particularly mind (though it's always awesome whe.... (click to read more)" />
<meta property="og:image" content="http://starbeamrainbowlabs.com/blog/images/20170406-Spammer-Mistakes.png" />
<meta property="og:image:url" content="http://starbeamrainbowlabs.com/blog/images/20170406-Spammer-Mistakes.png" />
<meta property="og:image:secure_url" content="https://starbeamrainbowlabs.com/blog/images/20170406-Spammer-Mistakes.png" />
<meta property="og:url" content="https://starbeamrainbowlabs.com/blog/article.php?article=posts%2F228-4287-Reasons-Your-Comments-Were-Not-Posted.html" />
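As a rough illustration of how the common tags fit together with the http / https image workaround described earlier, here's a Python sketch (the blog itself uses PHP). The function name, its parameters, and the idea of passing the image as a path are assumptions for this example only.

```python
from html import escape

def open_graph_tags(title, description, image_path, page_url,
                    host="starbeamrainbowlabs.com"):
    tags = [
        ("og:title", title),
        ("og:description", description),
        # Plain http for og:image and og:image:url, https only for
        # og:image:secure_url, mirroring the Facebook workaround.
        ("og:image", f"http://{host}{image_path}"),
        ("og:image:url", f"http://{host}{image_path}"),
        ("og:image:secure_url", f"https://{host}{image_path}"),
        ("og:url", page_url),
    ]
    # escape() converts quotes, ampersands and angle brackets to entities.
    return "\n".join(
        f'\t<meta property="{escape(key)}" content="{escape(value)}" />'
        for key, value in tags
    )
```

Note that Python's escape() also converts apostrophes, so its output for a title like the one above would differ slightly from the PHP-generated tags.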

That wasn't too tough. Next, I looked at the list of types again, and chose the article type for my blog posts.

<meta property="og:type" content="article" />

Like the profile type earlier, the article type also comes with a few type-specific meta tags (what they mean by not fitting into a 'vertical' I have no idea). I decided not to implement all the type-specific meta tags available here, since not all of them were practical to implement. Here's some more example output for the new tags:

<meta property="article:author" content="https://starbeamrainbowlabs.com/" />
<meta property="article:published_time" content="2017-04-08T12:56:46+01:00" />

Unfortunately, the article published time is really awkward to get hold of (even though it's outputted at the bottom of every article), so I went with the 'last modified' time instead. The published time is marked up with html microdata. Hopefully it doesn't cause too many issues later - though I can always change it :P
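The value in article:published_time is an ISO 8601 timestamp with a UTC offset. Here's a small Python sketch of producing one from a file's last modified time. The helper names and the fixed utc_offset_hours parameter are assumptions for illustration; a real implementation would want proper timezone handling so that daylight saving time is accounted for.

```python
import os
from datetime import datetime, timezone, timedelta

def iso_timestamp(epoch_seconds, utc_offset_hours=0):
    """Format a Unix timestamp as ISO 8601 with an explicit UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_seconds, tz).isoformat()

def last_modified_iso(path, utc_offset_hours=0):
    # os.path.getmtime returns the modification time in epoch seconds.
    return iso_timestamp(os.path.getmtime(path), utc_offset_hours)
```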

With that (and a final test), it looked like my Open Graph implementation was working as intended. Next time, I'll show you how I implemented a simple oEmbed provider.

Useful Links

Facebook's official testing tool

Demystifying UDP

Yesterday I was taking a look at UDP Multicast, and attempting to try it out in C#. Unfortunately, I got a little bit confused as to how it worked, and ended up spending a couple of hours wondering what I did wrong. I'm writing this post to hopefully save you the trouble of fiddling around trying to get it to work yourself.

UDP stands for User Datagram Protocol (sometimes jokingly called the Unreliable Datagram Protocol). It offers no guarantee that a message sent will be received at the other end, but is usually faster than its counterpart, TCP. Each UDP message has a source and a destination address, a source port, and a destination port.

When you send a message to a multicast address (like the 239.0.0.0/8 range or the FF00::/8 range for ipv6, but that's a little bit more complicated), your router will send a copy of the message to all the other interested hosts on your network, leaving out hosts that have not registered their interest. Note here that an exact copy of the original message is sent to all interested parties. The original source and destination addresses are NOT changed by your router.
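If you're not sure whether an address is a multicast one, it's easy to check. IPv4 multicast addresses live in 224.0.0.0/4 (the 239.0.0.0/8 range mentioned above is part of it), and IPv6 ones in FF00::/8. Here's a quick sketch using Python's standard library; the is_multicast helper name is just for this example.

```python
import ipaddress

def is_multicast(address):
    """True if the address falls in the IPv4 or IPv6 multicast ranges."""
    return ipaddress.ip_address(address).is_multicast
```

For example, is_multicast("239.1.2.3") is true, while an ordinary LAN address like 192.168.0.10 is not.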

With that in mind, we can start to write some code.

IPAddress multicastGroup = IPAddress.Parse("239.1.2.3");
int port = 43;
IPEndPoint channel = new IPEndPoint(multicastGroup, port);
// Bind to the same port that we send to, so we receive messages sent to it
UdpClient client = new UdpClient(port);
client.JoinMulticastGroup(multicastGroup);

In the above, I set up a few variables for things like the multicast address that we are going to join, the port number, and so on. I pass the port number to the new UdpClient I create, letting it know that we are interested in messages sent to that port. I also create a variable called channel, which we will be using later.

Next up, we need to figure out a way to send a message. Unfortunately, the UdpClient class only supports sending arrays of bytes, so we will have to convert anything we want to send to and from a byte array. Thankfully though this isn't too tough:

string data = "1 2, 1 2, Testing!";
byte[] payload = Encoding.UTF8.GetBytes(data);
string message = Encoding.UTF8.GetString(payload);

The above converts a simple string to and from a byte[] array. If you're interested, you can also serialise and deserialise C♯ objects to and from a byte[] array by using Binary Serialisation. Anyway, we can now write a method to send a message across the network. Here's what I came up with:

private static void Send(string data)
{
    // channel is the multicast endpoint (group address + port) set up earlier
    Console.WriteLine("Sending '{0}' to {1}.", data, channel);
    byte[] payload = Encoding.UTF8.GetBytes(data);
    Send(payload);
}
private static void Send(byte[] payload)
{
    client.Send(payload, payload.Length, channel);
}

Here I've defined a method to send stuff across the network for me. I've added an overload, too, which automatically converts strings into byte[] arrays for me.

Putting the above together will result in a multicast message being sent across the network. This won't do us much good though unless we can also receive messages from the network too. Let's fix that:

public static async Task Listen()
{
    while(true)
    {
        UdpReceiveResult result = await client.ReceiveAsync();
        string message = Encoding.UTF8.GetString(result.Buffer);
        Console.WriteLine("{0}: {1}", result.RemoteEndPoint, message);
    }
}

You might not have seen (or heard of) asynchronous C# before, but basically it's a way of doing another thing whilst you are waiting for one thing to complete. Dot net perls have a good tutorial on the subject if you want to read up on it.

For now though, here's how you call an asynchronous method from a synchronous one (like the Main() method since that one can't be async apparently):

Task.Run(() => Listen()).Wait();

If you run the above in one program while sending a message in another, you should see something appear in the console of the listener. If not, your computer may not be configured to receive multicast messages that were sent from itself. In this case try running the listener on a different machine to the sender. In theory you should be able to run the listener on as many hosts on your local network as you want and they should all receive the same message.
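For comparison, here's a rough Python equivalent of the sender and listener, in case you want to test against a second implementation. It's a sketch under a few assumptions: the port number 4343 is made up (so that it doesn't need elevated privileges), and the function names are mine. The IP_MULTICAST_LOOP option controls whether the sending host receives its own multicast packets, which is the setting hinted at above.

```python
import socket
import struct

GROUP = "239.1.2.3"   # same example group as the C# code
PORT = 4343           # hypothetical unprivileged port for this sketch

def membership_request(group):
    # ip_mreq structure: the 4-byte group address followed by the 4-byte
    # local interface address (0.0.0.0 lets the OS pick the interface).
    return struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton("0.0.0.0"))

def open_listener(group=GROUP, port=PORT):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several listeners on one machine to bind the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock

def send(text, group=GROUP, port=PORT):
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                       socket.IPPROTO_UDP) as sock:
        # 1 asks the OS to loop our own multicast packets back to this host.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
        sock.sendto(text.encode("utf-8"), (group, port))

def listen(sock):
    # Blocks forever, printing each message along with who sent it.
    while True:
        payload, sender = sock.recvfrom(65535)
        print(f"{sender}: {payload.decode('utf-8')}")
```

Run listen(open_listener()) in one process and send("1 2, 1 2, Testing!") in another.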

Jabber & XMPP: A Lost Protocol

Welcome to a special tutorial post here at starbeamrainbowlabs.com. In this post, we will be exploring an instant messaging protocol known as XMPP.

The XMPP logo

Today, you will probably use something like Skype, Gmail, or possibly FaceTime to stay in touch with your friends and family. If you were to rewind to roughly the year 2000, however, you would find that none of the above existed yet. Instead, there was something called XMPP. Originally called Jabber, XMPP is an open decentralised communications protocol (that Gmail's instant messaging service uses behind the scenes!) that allows you to stay in touch with people over the internet.

Identifying Users

There are several programs and apps that have XMPP support built in, but first let's take a look at how it works. As I mentioned above, XMPP is decentralised. This means that there is no central point at which you can get an account - in fact you can create your very own XMPP server right now! I will go into the details of that in a future post. Having multiple servers also raises the question of identification. How do you identify all these XMPP users at hundreds, possibly thousands of servers across the globe?

Several accounts at 2 different servers

Thankfully, the answer is really quite simple: We use something called a Jabber ID (JID), which looks rather like an email address, for example: [email protected]. Just like an email address, the user name comes before the @ sign, and the server name comes after the @ sign.

Connecting People

Now that we know how you identify an XMPP user, we can look at how users connect and talk to each other, even if they have accounts at different servers. Connecting users is accomplished by 2 types of connections: client to server (c2s) and server to server (s2s) connections, which are usually carried out on ports 5222 and 5269 respectively. The client to server connections connect a user to the server that they registered with originally, and the server to server connections connect the user's server to the server that hosts the account of the other user that they want to talk to. In this way an XMPP user may start a conversation with any other XMPP user at any other server!

A visualisation of the example below

Here's an example. Bob is the owner of a company called Bob's Rockets and has the XMPP account [email protected]. He wants to talk to Bill, who owns the prestigious company Bill's Boosters and has the JID [email protected]. Bob will log into his XMPP account at bobsrockets.com over port 5222 (unless he is behind a firewall, but we will cover that later). Bill will log into his account at billsboosters.com over the same port. When Bob starts a chat with Bill, the server at bobsrockets.com will automagically establish a new server to server connection with billsboosters.com in order to exchange messages.
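The routing decision in that example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real XMPP server is written: the function names are made up, the user names in the usage note are hypothetical, and real JIDs can also carry a resource part after a slash, which this sketch ignores.

```python
C2S_PORT = 5222  # client to server
S2S_PORT = 5269  # server to server

def split_jid(jid):
    """Split 'user@server' into (user, server)."""
    user, at, server = jid.partition("@")
    if not at or not user or not server:
        raise ValueError(f"not a user@server JID: {jid!r}")
    return user, server

def needs_s2s(sender_jid, recipient_jid):
    """A server to server link is only needed when the two servers differ."""
    sender_server = split_jid(sender_jid)[1].lower()
    recipient_server = split_jid(recipient_jid)[1].lower()
    return sender_server != recipient_server
```

With hypothetical user names, needs_s2s("bob@bobsrockets.com", "bill@billsboosters.com") is true, so the two servers would talk over port 5269, while two users at bobsrockets.com would only need their own c2s connections.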

Note: When starting a conversation with another user that you haven't talked to before, XMPP requires that both parties give permission to talk to one another. Depending on your client, you may see a box or notification appear somewhere, which you have to accept.

Get your own!

Now that we have taken a look at how it works, you probably want your own account. Getting one is simple: Just go to a site like jabber.org and sign up. If you stick around for the second post in this series though I will be showing you how to set up your very own XMPP server (with encryption).

As for a program or app you can use on your computer and / or your phone, I recommend Pidgin for computers and Xabber for Android phones.

Next time, I will be showing you how to set up your own XMPP server using Prosody. I will also be showing you a few of the add-ons you can plug in to add support for things like multi-user chatrooms (optionally with passwords), file transfer proxies, firewall-busting BOSH proxies, and more!

Reading HTTP 1.1 requests from a real web server in C#

I've received rather a lot of questions recently asking the same thing, so I thought that I'd write a blog post on it. Here's the question:

Why does my network client fail to connect when it is using HTTP/1.1?

I encountered this same problem, and after half an hour of debugging I found the problem: It wasn't failing to connect at all, rather it was failing to read the response from the server. Consider the following program:

using System;
using System.IO;
using System.Net.Sockets;

class Program
{
    static void Main(string[] args)
    {
        TcpClient client = new TcpClient("host.name", 80);
        client.SendTimeout = 3000;
        client.ReceiveTimeout = 3000;
        StreamWriter writer = new StreamWriter(client.GetStream());
        StreamReader reader = new StreamReader(client.GetStream());
        writer.WriteLine("GET /path HTTP/1.1");
        writer.WriteLine("Host: server.name");
        writer.WriteLine();
        writer.Flush();

        string response = reader.ReadToEnd();

        Console.WriteLine("Got Response: '{0}'", response);
    }
}

If you change the hostname and request path, and then compile and run it, you (might) get the following error:

An unhandled exception of type 'System.IO.IOException' occurred in System.dll

Additional information: Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time or established connection failed because connected host has failed to respond.

Strange. I'm sure that we sent the request. Let's try reading the response line by line:

string response = string.Empty;
do
{
    string nextLine = reader.ReadLine();
    response += nextLine;
    Console.WriteLine("> {0}", nextLine);
} while (reader.Peek() != -1);

Here's some example output from my server:

> HTTP/1.1 200 OK
> Server: nginx/1.9.10
> Date: Tue, 09 Feb 2016 15:48:31 GMT
> Content-Type: text/html
> Transfer-Encoding: chunked
> Connection: keep-alive
> Vary: Accept-Encoding
> strict-transport-security: max-age=31536000;
>
> 2ef
> <html>
> <head><title>Index of /libraries/</title></head>
> <body bgcolor="white">
> <h1>Index of /libraries/</h1><hr><pre><a href="../">../</a>
> <a href="prism-grammars/">prism-grammars/</a>
   09-Feb-2016 13:56                   -
> <a href="blazy.js">blazy.js</a>                                           09-F
eb-2016 13:38                9750
> <a href="prism.css">prism.css</a>                                          09-
Feb-2016 13:58               11937
> <a href="prism.js">prism.js</a>                                           09-F
eb-2016 13:58               35218
> <a href="smoothscroll.js">smoothscroll.js</a>
   20-Apr-2015 17:01                3240
> </pre><hr></body>
> </html>
>
> 0
>

...but we still get the same error. Why? The reason is that the web server is keeping the connection open, just in case we want to send another request. While this would usually be helpful (say in the case of a web browser - it will probably want to download some images or something after receiving the initial response), it's rather a nuisance for us, since we don't want to send another request and it's rather awkward to detect the end of the response without detecting the end of the stream (that's what the while (reader.Peek() != -1); is for in the example above).

Thankfully, there are a few solutions to this. Firstly, the web server will sometimes (but not always - take the example response above for starters) send a content-length header. This header will tell you how many bytes follow after the double newline (\r\n\r\n) that separates the response headers from the response body. We could use this to detect the end of the message. This is the recommended way, according to RFC 2616.
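Here's a sketch of that approach in Python (not C#, and not production code). It also handles the chunked case from the example response above, where there is no content-length: each chunk is prefixed with its size in hexadecimal, and a chunk of size 0 marks the end. The function names are my own, and the stream is assumed to be a binary file-like object, such as the result of sock.makefile("rb").

```python
import io

def read_headers(stream):
    """Read the status line and headers, up to the blank line."""
    status = stream.readline().decode("iso-8859-1").strip()
    headers = {}
    while True:
        line = stream.readline().decode("iso-8859-1")
        if line in ("\r\n", "\n", ""):
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers

def read_body(stream, headers):
    """Read exactly one response body without waiting for the stream to end."""
    if "chunked" in headers.get("transfer-encoding", "").lower():
        body = b""
        while True:
            # Each chunk starts with its size in hexadecimal on its own line.
            size = int(stream.readline().split(b";")[0].strip(), 16)
            if size == 0:
                # Consume any trailer lines up to the final blank line.
                while stream.readline() not in (b"\r\n", b"\n", b""):
                    pass
                return body
            body += stream.read(size)
            stream.readline()  # discard the CRLF that follows the chunk data
    length = int(headers.get("content-length", 0))
    return stream.read(length)

# Demo on an in-memory stream standing in for a socket.
raw = b"HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n5\r\nhello\r\n0\r\n\r\n"
stream = io.BytesIO(raw)
status, headers = read_headers(stream)
print(status, read_body(stream, headers))
```

Because it stops after the final chunk (or after content-length bytes), it never blocks waiting for a server that is keeping the connection alive.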

Another way to cheat here is to send the connection: close header. This instructs the web server to close the connection after sending the message (Note that this will break some of the tests in the ACW, so don't use this method!). Then we can use reader.ReadToEnd() as normal.
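A sketch of the connection: close approach in Python looks something like this. The function names are made up for the example, and the same warning applies: it's a shortcut, not the recommended way.

```python
import socket

def build_request(host, path):
    """Build a GET request that asks the server to close the connection."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",   # the two empty strings produce the blank line
        "",   # that ends the request headers
    ]
    return "\r\n".join(lines).encode("ascii")

def fetch(host, path, port=80, timeout=3.0):
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Since the server closes the connection once it has sent the response, the read loop ends by itself instead of timing out.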

A further cheat would be to detect the expected end of the message that we are looking for. For HTML this will practically always be </html>. We can close the connection after we receive this line (although this doesn't work when you're not receiving HTML). This is seriously not a good idea. The HTML could be malformed, and not contain </html>.
