Starbeamrainbowlabs

Stardust
Blog


Archive


Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blender blog bookmarklet booting bug hunting c sharp c++ challenge chrome os cluster code codepen coding conundrums coding conundrums evolved command line compilers compiling compression conference conferences containerisation css dailyprogrammer data analysis debugging defining ai demystification distributed computing dns docker documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions freeside future game github github gist gitlab graphics guide hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs latex learning library linux lora low level lua maintenance manjaro minetest network networking nibriboard node.js open source operating systems optimisation outreach own your code pepperminty wiki performance phd photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference release releases rendering research resource review rust searching secrets security series list server software sorting source code control statistics storage svg systemquery talks technical terminal textures thoughts three thing game three.js tool tutorial twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 worldeditadditions xmpp xslt

How to prevent php-fpm from overriding your PHP-based caching logic

A while ago I implemented ETag support to the dynamic preview generator in Pepperminty Wiki. While I thought it worked during testing, for some reason on a private instance of Pepperminty Wiki I discovered recently that my browser didn't seen to be taking advantage of it. This got me curious, so I decided to do a little bit of digging to find out why.

It didn't take long to find the problem. For some reason, all the responses from the server had a Cache-Control: no-cache, no-store, must-revalidate header attached to them. How strange! Even more so that it was in capital letters - my convention in Pepperminty Wiki is to always make the headers lowercase.

I checked the codebase with via the Project Find feature of Atom just to make sure that I hadn't left in a debugging statement or anything (I hadn't), and then I turned my attention to Nginx (engine-x) - the web server that I use on my server. Maybe it had added a caching header?

A quick grep later revealed that it wasn't responsible either - which leaves just one part of the system unchecked - php-fpm, the PHP FastCGI server that sits just behind Nginx that's responsible for running the various PHP scripts I have that power this website and other corners of my server. Another quick grep returned a whole bunch of garbage, so after doing some research I discovered that php-fpm, by default, is configured to send this header - and that it has to be disabled by editing your php.ini (for me it's in /etc/php/7.1/fpm/php.ini), and changing

;session.cache_limiter = nocache

to be explicitly set to an empty string, like so:

session.cache_limiter = ''

This appears to have solved by problem for now - allowing me to regain control over how the resources I send back via PHP are cached. Hopefully this saves someone else the hassle of pulling their entire web server stack apart and putting it back together again the future :P

Found this helpful? Still having issues? Got a better way of solving the problem? Post a comment below!

Deterring spammers with a comment key system

I recently found myself reimplementing the comment key system I use on this blog (I posted about it here) for a another purpose. Being more experienced now, my new implemention (which I should really put into use on this blog actually) is stand-alone in a separate file - so I'm blogging about it here both to help out anyone who reads this other than myself - and for myself as I know I'll forget otherwise :P

The basic algorithm hasn't changed much since I first invented it: take the current timestamp, apply a bunch or arbitrary transformations to it, put it as a hidden field in the comment form, and then reverse the transformations on the comment key the user submits as part of the form to discover how long they had the page loaded for. Bots will have it loaded for either less than 10-ish seconds, or more than 24 hours. Humans will be somewhere in the middle - at least according to my observations!

Of course, any determined spammer can easily bypass this system if they spend even a little bit of time analysing the system - but I'm banking on the fact that my blog is too low-value for a spammer to bother reverse-engineering my system to figure out how it works.

This time, I chose to use simple XOR encryption, followed by reversing the string, followed by base64 encoding. It should be noted that XOR encryption is certainly not secure - but in this case it doesn't really matter. If my website becomes a high-enough value target for that to matter, I'll investigate proper AES encryption - which will probably be a separate post in and of itself, as a quick look revealed that it's rather involved - and will probably require quite a bit of research and experimentation working correctly.

Let's take a look at the key generation function first:

function key_generate($pass) {
    $new_key = strval(time());
    // Repeat the key so that it's long enough to XOR the key with
    $pass_enc = str_repeat($pass, (strlen($new_key) / strlen($pass)) + 1);
    $new_key = $new_key ^ $pass_enc;
    return base64_encode(strrev($new_key));
}

As I explained above, this first XORs the timestamp against a provided 'passcode' of sorts, and then it reverses it, base64 encodes it, and then returns it. I discovered that I needed to repeat the passcode to make sure it's at least as long as the timestamp - because otherwise it cuts the timestamp short! Longer passwords are always desirable for certain, but I wanted to make sure I addressed it here - just in case I need to lift this algorithm from here for a future project.

Next up is the decoding algorithm, that reverses the transformations we apply above:

    function key_decode($key, $pass) {
    $key_dec = strrev(base64_decode($key));
    // Repeat the key so that it's long enough to XOR the key with
    $pass_dec = str_repeat($pass, (strlen($key_dec) / strlen($pass)) + 1);
    return intval($key_dec ^ $pass_dec);
}

Very similar. Again, the XOR passphrase has to be repeated to make it long enough to apply to the whole encoded key without inadvertently chopping some off the end. Additionally, we also convert the timestamp back into an integer - since it is the number of seconds since the last UNIX epoch (1st January 1970 as of the time of typing).

With the ability to create and decode keys, let's write a helper method to make the verification process a bit easier:

    function key_verify($key, $pass, $min_age, $max_age) {
    $age = time() - key_decode($key, $pass);
    return $age >= $min_age && $age <= $max_age;
}

It's fairly self-explanatory, really. It takes an encoded key, decodes it, and verifies that it's age lies between the specified bounds. Here's the code in full (it updates every time I update the code in the GitHub Gist):

(Above: The full comment key code. Can't see it? Check it out on GitHub Gist here.)

OC ReMix Albums Atom Feed

I've recently discovered the wonderful OverClocked ReMix thanks to the help of a friend (thank you for the suggestion ☺), and they have a bunch of cool albums you should totally listen to (my favourite so far is Esther's Dreams :D)!

To help keep up-to-date with the albums they release, I went looking for an RSS/Atom feed that I could subscribe to. Unfortunately, I didn't end up finding one - so I wrote a simple scraper to do the job for me, and I thought I'd share it on here. You can find it live over here.

The script itself uses my earlier atom.gen.php script, since I had it lying around. Here it is in full:

<?php

$settings = new stdClass();
$settings->version = "0.1-alpha";
$settings->ocremix_url = "https://ocremix.org";
$settings->album_url = "https://ocremix.org/albums/";
$settings->user_agent = "OCReMixAlbumsAtomFeedGenerator/$settings->version (https://starbeamrainbowlabs.com/labs/ocremix_atom/albums_atom.php; <[email protected]> ) PHP/" . phpversion() . "+DOMDocument";
$settings->ignore_parse_errors = true;
$settings->atom_gen_location = "./atom.gen.php";

// --------------------------------------------

require($settings->atom_gen_location);

ini_set("user_agent", $settings->user_agent);
$ocremix_albums_html = file_get_contents($settings->album_url);

$ocremix_dom = new DOMDocument();
libxml_use_internal_errors($settings->ignore_parse_errors);
$ocremix_dom->loadHTML($ocremix_albums_html);

$ocremix_xpath = new DOMXPath($ocremix_dom);

$ocremix_albums = $ocremix_xpath->query(".//table[contains(concat(' ', normalize-space(./@class), ' '), ' data ')]//tr[contains(concat(' ', normalize-space(./@class), ' '), ' area-link ')]");

$ocremix_feed = new atomfeed();
$ocremix_feed->title = "Albums | OC ReMix";
$ocremix_feed->id_uri = $settings->album_url;
$ocremix_feed->sq_icon_uri = "http://ocremix.org/favicon.ico";
$ocremix_feed->addauthor("OverClocked ReMix", false, "https://ocremix.org/");
$ocremix_feed->addcategories([ "music", "remixes", "albums" ]);

foreach ($ocremix_albums as $album) {
    $album_date = $album->childNodes[6]->textContent;
    $album_name = $album->childNodes[2]->childNodes[1]->textContent;
    $album_url = $album->childNodes[2]->childNodes[1]->attributes["href"]->textContent;
    $album_img_url = $album->childNodes[0]->childNodes[0]->attributes["src"]->textContent;

    // Make the urls absolute
    $album_url = $settings->ocremix_url .  $album_url;
    $album_img_url = $settings->ocremix_url .  $album_img_url;

    $content = "<p><a href='$album_url'><img src='$album_img_url' alt='$album_name' /> $album_name</a> - released $album_date";

    $ocremix_feed->addentry(
        $album_url,
        $album_name,
        strtotime($album_date),
        "OverClocked ReMix",
        $content,
        false,
        [],
        strtotime($album_date)
    );
}

header("content-type: application/atom+xml");
echo($ocremix_feed->render());

Basically, I define a bunch of settings at the top for convenience (since I might end up changing them later). Then, I pull down the OcReMix albums page, extract the table of recent albums with a (not-so) tidy XPath query - since PHP's DOMDocument class doesn't seem to support anything else at a glance (yep, I used css-to-path to convert a CSS selector, since my XPath is a bit rusty and I was in a hurry :P).

Finally, I pull it apart to get a list of albums and their release dates - all of which goes straight into atom.gen.php and then straight out to the browser.

Sources and Further Reading

Profiling PHP with XDebug

(This post is a fork of a draft version of a tutorial / guide originally written as an internal document whilst at my internship.)

The PHP and xdebug logos.

Since I've been looking into xdebug's profiling function recently, I've just been tasked with writing up a guide on how to set it up and use it, from start to finish - and I thought I'd share it here.

While I've written about xdebug before in my An easier way to debug PHP post, I didn't end up covering the profiling function - I had difficulty getting it to work properly. I've managed to get it working now - this post documents how I did it. While this is written for a standard Debian server, the instructions can easily be applied to other servers.

For the uninitiated, xdebug is an extension to PHP that aids in the debugging of PHP code. It consists of 2 parts: The php extension on the server, and a client built into your editor. With these 2 parts, you can create breakpoints, step through code and more - though these functions are not the focus of this post.

To start off, you need to install xdebug. SSH into your web server with a sudo-capable account (or just use root, though that's bad practice!), and run the following command:

sudo apt install php-debug

Windows users will need to download it from here and put it in their PHP extension direction. Users of other linux distributions and windows may need to enable xdebug in their php.ini file manually (windows users will need extension=xdebug.dll; linux systems use extension=xdebug.so instead).

Once done, xdebug should be loaded and working correctly. You can verify this by looking the php information page. To see this page, put the following in a php file and request it in your browser:

<?php
phpinfo();
?>

If it's been enabled correctly, you should see something like this somewhere on the resulting page:

Part of the php information page, showing the xdebug section.

With xdebug setup, we can now begin configuring it. Xdebug gets configured in php.ini, PHP's main configuration file. Under Virtualmin each user has their own php.ini because PHP is loaded via CGI, and it's usually located at ~/etc/php.ini. To find it on your system, check the php information page as described above - there should be a row with the name "Loaded Configuration File":

The 'loaded configuration file' directive on the php information page.

Once you've located your php.ini file, open it in your favourite editor (or type sensible-editor php.ini if you want to edit over SSH), and put something like this at the bottom:

[xdebug]
xdebug.remote_enable=1
xdebug.remote_connect_back=1
xdebug.remote_port=9000
xdebug.remote_handler=dbgp
xdebug.remote_mode=req
xdebug.remote_autostart=true

xdebug.profiler_enable=false
xdebug.profiler_enable_trigger=true
xdebug.profiler_enable_trigger_value=ZaoEtlWj50cWbBOCcbtlba04Fj
xdebug.profiler_output_dir=/tmp
xdebug.profiler_output_name=php.profile.%p-%u

Obviously, you'll want to customise the above. The xdebug.profiler_enable_trigger_value directive defines a secret key we'll use later to turn profiling on. If nothing else, make sure you change this! Profiling slows everything down a lot, and could easily bring your whole server down if this secret key falls into the wrong hands (that said, simply having xdebug loaded in the first place slows things down too, even if you're not using it - so you may want to set up a separate server for development work that has xdebug installed if you haven't already). If you're not sure on what to set it to, here's a bit of bash I used to generate my random password:

dd if=/dev/urandom bs=8 count=4 status=none | base64 |  tr -d '=' | tr '+/' '-_'

The xdebug.profiler_output_dir lets you change the folder that xdebug saves the profiling output files to - make sure that the folder you specify here is writable by the user that PHP is executing as. If you've got a lot of profiling to do, you may want to consider changing the output filename, since xdebug uses a rather unhelpful filename by default. The property you want to change here is xdebug.profiler_output_name - and it supports a number of special % substitutions, which are documented here. I can recommend something phpprofile.%t-%u.%p-%H.%R.%cachegrind - it includes a timestamp and the request uri for identification purposes, while still sorting chronologically. Remember that xdebug will overwrite the output file if you don't include something that differentiates it from request to request!

With the configuration done, we can now move on to actually profiling something :D This is actually quite simple. Simply add the XDEBUG_PROFILE GET (or POST!) parameter to the url that you want to test in your browser. Here are some examples:

https://localhost/carrots/moon-iter.php?XDEBUG_PROFILE=ZaoEtlWj50cWbBOCcbtlba04Fj
https://development.galacticaubergine.de/register?vegetable=yes&mode=plus&XDEBUG_PROFILE=ZaoEtlWj50cWbBOCcbtlba04Fj

Adding this parameter to a request will cause xdebug to profile that request, and spit out a cachegrind file according to the settings we configured above. This file can then be analysed in your favourite editor, or, if it doesn't have support, an external program like qcachegrind (Windows) or kcachegrind (Everyone else).

If you need to profile just a single AJAX request or similar, most browsers' developer tools let you copy a request as a curl or wget command (Chromium-based browsers, Firefox - has an 'edit and resend' option), allowing you to resend the request with the XDEBUG_PROFILE GET parameter.

If you need to profile everything - including all subrequests (only those that pass through PHP, of course) - then you can set the XDEBUG_PROFILE parameter as a cookie instead, and it will cause profiling to be enabled for everything on the domain you set it on. Here's a bookmarklet that set the cookie:

javascript:(function(){document.cookie='XDEBUG_PROFILE='+'insert_secret_key_here'+';expires=Mon, 05 Jul 2100 00:00:00 GMT;path=/;';})();

(Source)

Replace insert_secret_key_here with the secret key you created for the xdebug.profiler_enable_trigger_value property in your php.ini file above, create a new bookmark in your browser, paste it in (making sure that your browser doesn't auto-remove the javascript: at the beginning), and then click on it when you want to enable profiling.

Was this helpful? Got any questions? Let me know in the comments below!

Sources and further reading

How to update your linux kernel version on a KimSufi server

(Or why PHP throws random errors in the latest update)

Hello again!

Since I had a bit of a time trying to find some clear information on the subject, I'm writing the blog post so that it might help others. Basically, yesterday night I updated the packages on my server (the one that runs this website!). There was a PHP update, but I didn't think much of it.

This morning, I tried to access my ownCloud instance, only to discover that it was throwing random errors and refusing to load. I'm running PHP version 7.0.16-1+deb.sury.org~xenial+2. It was spewing errors like this one:

PHP message: PHP Fatal error:  Uncaught Exception: Could not gather sufficient random data in /srv/owncloud/lib/private/Security/SecureRandom.php:80
Stack trace:
#0 /srv/owncloud/lib/private/Security/SecureRandom.php(80): random_int(0, 63)
#1 /srv/owncloud/lib/private/AppFramework/Http/Request.php(484): OC\Security\SecureRandom->generate(20)
#2 /srv/owncloud/lib/private/Log/Owncloud.php(90): OC\AppFramework\Http\Request->getId()
#3 [internal function]: OC\Log\Owncloud::write('PHP', 'Uncaught Except...', 3)
#4 /srv/owncloud/lib/private/Log.php(298): call_user_func(Array, 'PHP', 'Uncaught Except...', 3)
#5 /srv/owncloud/lib/private/Log.php(156): OC\Log->log(3, 'Uncaught Except...', Array)
#6 /srv/owncloud/lib/private/Log/ErrorHandler.php(67): OC\Log->critical('Uncaught Except...', Array)
#7 [internal function]: OC\Log\ErrorHandler::onShutdown()
#8 {main}
  thrown in /srv/owncloud/lib/private/Security/SecureRandom.php on line 80" while reading response header from upstream, client: x.y.z.w, server: ownc

That's odd. After googling around a bit, I found this page on the Arch Linux bug tracker. I'm not using arch (Ubuntu 16.04.2 LTS actually), but it turned out that this comment shed some much-needed light on the problem.

Basically, PHP have changed the way they ask the Linux Kernel for random bytes. They now use the getrandom() kernel function instead of /dev/urandom as they did before. The trouble is that getrandom() was introduced in linux 3.17, and I was running OVH's custom 3.14.32-xxxx-grs-ipv6-64 kernel.

Thankfully, after a bit more digging, I found this article. It suggests installing the kernel you want and moving one of the grub config file generators to another directory, but I found that simply changing the permissions did the trick.

Basically, I did the following:


apt update
apt full-upgrade
apt install linux-image-generic
chmod -x /etc/grub.d/06_OVHkernel
update-grub
reboot

Basically, the above first updates everything on the system. Then it installs the linux-image-generic package. linux-image-generic is the pseudo-package that always depends on the latest stable kernel image available.

Next, I remove execute privileges on the file /etc/grub.d/06_OVHkernel. This is the file that gives the default installed OVH kernel priority over any other instalaled kernels, so it's important to exclude it from the grub configuration process.

Lastly, I update my grub configuration with update-grub and then reboot. You need to make sure that you update your grub configuration file, since if you don't it'll still use the old OVH kernel!

With that all done, I'm now running 4.4.0-62-generic according to uname -a. If follow these steps yourself, make sure you have a backup! While I am happy to try and help you out in the comments below, I'm not responsible for any consequences that may arise as a result of following this guide :-)

An easier way to debug PHP

A nice view of some trees taken by Mythdael I think, the awesome person who designed this website :D

Recently at my internship I've been writing quite a bit of PHP. The language itself is OK (I mean it does the job), but it's beginning to feel like a relic of a bygone era - especially when it comes to debugging. Up until recently I've been stuck with using echo() and var_dump() calls all over the place in order to figure out what's going on in my code - that's the equivalent of debugging your C♯ ACW with Console.WriteLine() O.o

Thankfully, whilst looking for an alternative, I found xdebug. Xdebug is like visual studio's debugging tools for C♯ (albeit a more primitive form). They allow you to add breakpoints and step though your PHP code one line at a time - inspecting the contents of variables in both the local and global scope as you go. It improves the standard error messages generated by PHP, too - adding stack traces and colour to the output in order to make it much more readable.

Best of all, I found a plugin for my primary web development editor atom. It's got some simple (ish) instructions on how to set up xdebug too - it didn't take me long to figure out how to put it to use.

I'll assume you've got PHP and Nginx already installed and configured, but this tutorial looks good (just skip over the MySQL section) if you haven't yet got it installed. This should work for other web servers and configurations too, but make sure you know where your php.ini lives.

XDebug consists of 2 components: The PHP extension for the server, and the client that's built into your editor. Firstly, you need to install the server extension. I've recorded an asciicast (terminal recording) to demonstrate the process:

(Above: An asciinema recording demonstrating how to install xdebug. Can't see it? Try viewing it on asciinema.org.)

After that's done, you should be able to simply install the client for your editor (I use php-debug for atom personally), add a breakpoint, load a php page in your web browser, and start debugging!

If you're having trouble, make sure that your server can talk directly to your local development machine. If you're sitting behind any routers or firewalls, make sure they're configured to allow traffic though on port 9000 and configured to forward it on to your machine.

Capturing and sending error reports by email in C♯

A month or two ago I put together a simple automatic updater and showed you how to do the same. This time I'm going to walk you through how to write your own error reporting system. It will be able to catch any errors through a try...catch block, ask the user whether they want to send an error report when an exception is thrown, and send the report to your email inbox.

Just like last time, this post is a starting point, not an ending point. It has a significant flaw and can be easily extended to, say, ask the user to write a short paragraph detailing what they were doing at the time of the crash, or add a proper gui, for example.

Please note that this tutorial requires that you have a server of some description to use to send the error reports to. If you want to get the system to send you an email too, you'll need a working mail server. Thankfully DigitalOcean provide free credit if you have the GitHub Student pack. This tutorial assumes that your mail server (or at least a relay one) is running on the same machine as your web server. While setting one up correctly can be a challenge, Lee Hutchinson over at Ars Technica has a great tutorial that's easy to follow.

To start with, we will need a suitable test program to work with whilst building this thing. Here's a good one riddled with holes that should throw more than a few exceptions:

using System;
using System.IO;
using System.Net;
using System.Text;

public class Program
{
    public static readonly string Name = "Dividing program";
    public static readonly string Version = "0.1";

    public static string ProgramId
    {
        get { return string.Format("{0}/{1}", Name, Version); }
    }

    public static int Main(string[] args)
    {
        float a = 0, b = 0, c = 0;

        Console.WriteLine(ProgramId);
        Console.WriteLine("This program divides one number by another.");

        Console.Write("Enter number 1: ");
        a = float.Parse(Console.ReadLine());
        Console.Write("Enter number 2: ");
        b = float.Parse(Console.ReadLine());

        c = a / b;
        Console.WriteLine("Number 1 divided by number 2 is {0}.", c);

        return 0;
}

There are a few redundant using statements at the top there - we will get to utilizing them later on.

First things first - we need to capture all exceptions and build an error report:

try
{
    // Insert your program here
}
catch(Exception error)
{
    Console.Write("Collecting data - ");
    MemoryStream dataStream  = new MemoryStream();
    StreamWriter dataIn = new StreamWriter(dataStream);
    dataIn.WriteLine("***** Error Report *****");
    dataIn.WriteLine(error.ToString());
    dataIn.WriteLine();
    dataIn.WriteLine("*** Details ***");
    dataIn.WriteLine("a: {0}", a);
    dataIn.WriteLine("b: {0}", b);
    dataIn.WriteLine("c: {0}", c);
    dataIn.Flush();

    dataStream.Seek(0, SeekOrigin.Begin);
    string errorReport = new StreamReader(dataStream).ReadToEnd();
    Console.WriteLine("done");
}

If you were doing this for real, it might be a good idea to move all of your application logic it it's own class and have a call like application.Run() instead of placing your code directly inside the try{ } block. Anyway, the above will catch the exception, and build a simple error report. I'm including the values of a few variables I created too. You might want to set up your own mechanism for storing state data so that the error reporting system can access it, like a special static class or something.

Now that we have created an error report, we need to send it to the server to processing. Before we do this, though, we ought to ask the user if this is ok with them (it is their computer in all likeliness after all!). This is easy:

Console.WriteLine("An error has occurred!");
Console.Write("Would you like to report it? [Y/n] ");

bool sendReport = false;
while(true)
{
    ConsoleKey key = Console.ReadKey().Key;
    if (key == ConsoleKey.Y) {
        sendReport = true;
        break;
    }
    else if (key == ConsoleKey.N)
        break;
}
Console.WriteLine();

if(!sendReport)
{
    Console.WriteLine("No report has been sent.");
    Console.WriteLine("Press any key to exit.");
    Console.ReadKey(true);
    return 1;
}

Since this program uses the console, I'm continuing that trend here. You will need to create your own GUI if you aren't creating a console app.

Now that's taken care of, we can go ahead and send the report to the server. Here's how I've done it:

Console.Write("Sending report - ");
HttpWebRequest reportSender = WebRequest.CreateHttp("https://starbeamrainbowlabs.com/reportSender.php");
reportSender.Method = "POST";
byte[] payload = Encoding.UTF8.GetBytes(errorReport);
reportSender.ContentType = "text/plain";
reportSender.ContentLength = payload.Length;
reportSender.UserAgent = ProgramId;
Stream requestStream = reportSender.GetRequestStream();
requestStream.Write(payload, 0, payload.Length);
requestStream.Close();

WebResponse reportResponse = reportSender.GetResponse();
Console.WriteLine("done");
Console.WriteLine("Server response: {0}", ((HttpWebResponse)reportResponse).StatusDescription);
Console.WriteLine("Press any key to exit.");
Console.ReadKey(true);
return 1;

That may look unfamiliar and complicated, so let's walk through it one step at a time.

To start with, I create a new HTTP web request and point it at an address on my server. You will use a slightly different address, but the basic principle is the same. As for what resides at that address - we will take a look at that later on.

Next I set request method to be POST so that I can send some data to the server, and set a few headers to help the server out in understanding our request. Then I prepare the error report for transport and push it down the web request's request stream.

After that I get the response from the server and tell the user that we have finished sending the error report to the server.

That pretty much completes the client side code. Here's the whole thing from start to finish:

using System;
using System.IO;
using System.Net;
using System.Text;

public class Program
{
    public static readonly string Name = "Dividing program";
    public static readonly string Version = "0.1";

    public static string ProgramId
    {
        get { return string.Format("{0}/{1}", Name, Version); }
    }

    public static int Main(string[] args)
    {
        float a = 0, b = 0, c = 0;

        try
        {
            Console.WriteLine(ProgramId);
            Console.WriteLine("This program divides one number by another.");

            Console.Write("Enter number 1: ");
            a = float.Parse(Console.ReadLine());
            Console.Write("Enter number 2: ");
            b = float.Parse(Console.ReadLine());

            c = a / b;
            Console.WriteLine("Number 1 divided by number 2 is {0}.", c);
        }
        catch(Exception error)
        {
            Console.WriteLine("An error has occurred!");
            Console.Write("Would you like to report it? [Y/n] ");

            bool sendReport = false;
            while(true)
            {
                ConsoleKey key = Console.ReadKey().Key;
                if (key == ConsoleKey.Y) {
                    sendReport = true;
                    break;
                }
                else if (key == ConsoleKey.N)
                    break;
            }
            Console.WriteLine();

            if(!sendReport)
            {
                Console.WriteLine("No report has been sent.");
                Console.WriteLine("Press any key to exit.");
                Console.ReadKey(true);
                return 1;
            }

            Console.Write("Collecting data - ");
            MemoryStream dataStream  = new MemoryStream();
            StreamWriter dataIn = new StreamWriter(dataStream);
            dataIn.WriteLine("***** Error Report *****");
            dataIn.WriteLine(error.ToString());
            dataIn.WriteLine();
            dataIn.WriteLine("*** Details ***");
            dataIn.WriteLine("a: {0}", a);
            dataIn.WriteLine("b: {0}", b);
            dataIn.WriteLine("c: {0}", c);
            dataIn.Flush();

            dataStream.Seek(0, SeekOrigin.Begin);
            string errorReport = new StreamReader(dataStream).ReadToEnd();
            Console.WriteLine("done");

            Console.Write("Sending report - ");
            HttpWebRequest reportSender = WebRequest.CreateHttp("https://starbeamrainbowlabs.com/reportSender.php");
            reportSender.Method = "POST";
            byte[] payload = Encoding.UTF8.GetBytes(errorReport);
            reportSender.ContentType = "text/plain";
            reportSender.ContentLength = payload.Length;
            reportSender.UserAgent = ProgramId;
            Stream requestStream = reportSender.GetRequestStream();
            requestStream.Write(payload, 0, payload.Length);
            requestStream.Close();

            WebResponse reportResponse = reportSender.GetResponse();
            Console.WriteLine("done");
            Console.WriteLine("Server response: {0}", ((HttpWebResponse)reportResponse).StatusDescription);
            Console.WriteLine("Press any key to exit.");
            Console.ReadKey(true);
            return 1;
        }

        return 0;
    }
}

(Pastebin, Raw)

Next up is the server side code. Since I'm familiar with it and it can be found on all popular web servers, I'm going to be using PHP here. You could write this in ASP.NET, too, but I'm not familiar with it, nor do I have the appropriate environment set up at the time of posting (though I certainly plan on looking into it).

The server code can be split up into 3 sections: the settings, receiving and extending the error report, and sending the error report on in an email. Part one is quite straightforward:

<?php
/// Settings ///
$settings = new stdClass();
$settings->fromAddress = "[email protected]";
$settings->toAddress = "[email protected]";

The above simply creates a new object and stores a few settings in it. I like to put settings at the top of small scripts like this because it both makes it easy to reconfigure them and allows for expansion later.

Next we need to receive the error report from the client:

// Get the error report from the client
$errorReport = file_get_contents("php://input");

PHP on a web server it smarter than you'd think and collects some useful information about the connected client, so we can collect a few interesting statistics and tag them onto the end of the error report like this:

// Add some extra information to it
$errorReport .= "\n*** Server Information ***\n";
$errorReport .= "Date / time reported: " . date("r") . "\n";
$errorReport .= "Reporting ip: " . $_SERVER['REMOTE_ADDR'] . "\n";
if(isset($_SERVER["HTTP_X_FORWARDED_FOR"]))
{
    $errorReport .= "The error report was forwarded through a proxy.\n";
    $errorReport .= "The proxy says that it forwarded the request from this address: " . $_SERVER['HTTP_X_FORWARDED_FOR'] . "\n\n";
}
if(isset($_SERVER["HTTP_USER_AGENT"]))
{
    $errorReport .= "The reporting client identifies themselves as: " . $_SERVER["HTTP_USER_AGENT"] . ".\n";
}

I'm adding the date and time here too just because the client could potentially fake it (they could fake everything, but that's a story for another time). I'm also collecting the client's user agent string too. This is being set in the client code above to the name and version of the program running. This information could be useful if you attach multiple programs to the same error reporting script. You could modify the client code to include the current .NET version, too by utilising Environment.Version.

Lastly, since the report has gotten this far, we really should do something with it. I decided I wanted to send it to myself in an email, but you could just as easily store it in a file using something like file_put_contents("bug_reports.txt", $errorReport, FILE_APPEND);. Here's the code I came up with:

$emailHeaders = [
    "From: $settings->fromAddress",
    "Content-Type: text/plain",
    "X-Mailer: PHP/" . phpversion()
];

$subject = "Error Report";
if(isset($_SERVER["HTTP_USER_AGENT"]))
    $subject .= " from " . $_SERVER["HTTP_USER_AGENT"];

mail($settings->toAddress, $subject, $errorReport, implode("\r\n", $emailHeaders), "-t");

?>

That completes the server side code. Here's the completed script:

<?php
/// Settings ///
$settings = new stdClass();
$settings->fromAddress = "[email protected]";
$settings->toAddress = "[email protected]";

// Get the error report from the client
$errorReport = file_get_contents("php://input");

// Add some extra information to it
$errorReport .= "\n*** Server Information ***\n";
$errorReport .= "Date / time reported: " . date("r") . "\n";
$errorReport .= "Reporting ip: " . $_SERVER['REMOTE_ADDR'] . "\n";
if(isset($_SERVER["HTTP_X_FORWARDED_FOR"]))
{
    $errorReport .= "The error report was forwarded through a proxy.\n";
    $errorReport .= "The proxy says that it forwarded the request from this address: " . $_SERVER['HTTP_X_FORWARDED_FOR'] . "\n\n";
}
if(isset($_SERVER["HTTP_USER_AGENT"]))
{
    $errorReport .= "The reporting client identifies themselves as: " . $_SERVER["HTTP_USER_AGENT"] . ".\n";
}

$emailHeaders = [
    "From: $settings->fromAddress",
    "Content-Type: text/plain",
    "X-Mailer: PHP/" . phpversion()
];

$subject = "Error Report";
if(isset($_SERVER["HTTP_USER_AGENT"]))
    $subject .= " from " . $_SERVER["HTTP_USER_AGENT"];

mail($settings->toAddress, $subject, $errorReport, implode("\r\n", $emailHeaders), "-t");

?>

(Pastebin, Raw)

The last job we need to do is to upload the PHP script to a PHP-enabled web server, and go back to the client and point it at the web address at which the PHP script is living.

If you have read this far, then you've done it! You should have by this point a simple working error reporting system. Here's an example error report email that I got whilst testing it:

***** Error Report *****
System.FormatException: Input string was not in a correct format.
  at System.Number.ParseSingle (System.String value, NumberStyles options, System.Globalization.NumberFormatInfo numfmt) <0x7fe1c97de6c0 + 0x00158> in <filename unknown>:0 
  at System.Single.Parse (System.String s, NumberStyles style, System.Globalization.NumberFormatInfo info) <0x7fe1c9858690 + 0x00016> in <filename unknown>:0 
  at System.Single.Parse (System.String s) <0x7fe1c9858590 + 0x0001d> in <filename unknown>:0 
  at Program.Main (System.String[] args) <0x407d7d60 + 0x00180> in <filename unknown>:0 

*** Details ***
a: 4
b: 0
c: 0

*** Server Information ***
Date / time reported: Mon, 11 Apr 2016 10:31:20 +0100
Reporting ip: 83.100.151.189
The reporting client identifies themselves as: Dividing program/0.1.

I mentioned at the beginning of this post that that this approach has a flaw. The main problem lies in the fact that the PHP script can be abused by a knowledgeable attacker to send you lots of spam. I can't think of any real way to properly solve this, but I'd suggest storing the PHP script at a long and complicated URL that can't be easily guessed. There are probably other flaws as well, but I can't think of any at the moment.

Found a mistake? Got an improvement? Please leave a comment below!

Converting Hashtags into Titles with PHP

Recently I have been working on a website for someone I know. Mythdael (the awesome artist who created a design for this website!) did the design for this one too, and I felt that I had to bring it to life. While I was writing it, I found that I needed to convert any given hashtag into a presentable title. Since hashtags are all one word (and usually lower case), it is more or less impossible to work out where one word starts and another ends. The solution: A wordlist (My choice was a modified enable1.txt).

Here is the solution I came up with:

/*
 * From https://terenceyim.wordpress.com/2011/02/01/all-purpose-binary-search-in-php/
 * Parameters: 
 *   $a - The sort array.
 *   $first - First index of the array to be searched (inclusive).
 *   $last - Last index of the array to be searched (exclusive).
 *   $key - The key to be searched for.
 *   $compare - A user defined function for comparison. Same definition as the one in usort
 *
 * Return:
 *   index of the search key if found, otherwise return (-insert_index - 1). 
 *   insert_index is the index of smallest element that is greater than $key or sizeof($a) if $key
 *   is larger than all elements in the array.
 */
function binary_search(array $a, $first, $last, $key, $compare) {
    $lo = $first; 
    $hi = $last - 1;

    while ($lo <= $hi) {
        $mid = (int)(($hi - $lo) / 2) + $lo;
        $cmp = call_user_func($compare, $a[$mid], $key);

        if ($cmp < 0) {
            $lo = $mid + 1;
        } elseif ($cmp > 0) {
            $hi = $mid - 1;
        } else {
            return $mid;
        }
    }
    return -($lo + 1);
}

class hashtag_parser
{
    public $wordlist_length = 0;
    public $wordlist = [];

    function __construct($wordlist_path) {
        global $settings;
        $this->wordlist = file($wordlist_path, FILE_IGNORE_NEW_LINES);
        $this->wordlist_length = count($this->wordlist);
    }

    public function in_wordlist($word)
    {
        $word = strtolower($word);

        $result = binary_search($this->wordlist, 0, $this->wordlist_length, $word, "strcmp");

        if($result > -1)
            return true;
        else
            return false;
        /*
        if(in_array($word, $this->wordlist))
            return true;
        else
            return false;
        */
    }

    public function extract_words($hashtag)
    {
        global $settings;

        // Remove the hash from the beginning if it is present
        if(substr($hashtag, 0, 1) == "#") $hashtag = substr($hashtag, 1);

        // Create an array to hold the words we find
        $words = [];

        $length = strlen($hashtag); // Cache the length of the hashtag
        $pos = 0;
        while($pos < $length)
        {
//          echo("pos: $pos\n");
            // aim: find the length of the longest substring that is a valid
            // word according to the wordlist
            $longest_word_length = 0;
            for($scan_pos = $pos + 1; $scan_pos < $length + 1; $scan_pos++)
            {
//              echo("scan_pos: $scan_pos substring: " . substr($hashtag, $pos, $scan_pos - $pos) . "\n");
                if($this->in_wordlist(substr($hashtag, $pos, $scan_pos - $pos)))
                {
                    $longest_word_length = $scan_pos - $pos;
//                  echo("found word\n");
                }
            }

            // Set the length of the longest word to the remainder of the
            // string if we don't find any valid words
            if($longest_word_length == 0) $longest_word_length = $length - $pos;

            $words[] = substr($hashtag, $pos, $longest_word_length);

            $pos += $longest_word_length;
        }

        return ucwords(implode(" ", $words));
    }
}

The code is a bit messy (perhaps I should tidy it up a bit lot), but it does the job I intended it to do. I used a class because I was concerned that reading in the wordlist every time would cause the code to take too long to complete - it is slow enough as it is. Thankfully a binary search algorithm written in PHP by terenceyim helped speed things up enormously. Anyway, it can be used like this:

$parser = new hashtag_parser("/path/to/wordlist.txt");
echo($parser->extract_words("sometext")); // Prints "Some Text"

The algorithm I used is quite simple:

  1. Loop over each character.
  2. Scan ahead of the current character and figure out the length of the longest word in the input via the wordlist.
  3. If no valid word can be found, assume that the rest of the input is all one word.
  4. Extract the longest word we can find and add it to an array.
  5. Add the length of the word we found to the character pointer.
  6. If we haven't reached the end of the given input, go to step 2.
  7. If we have reach the end of the output, return the words we found.

I am posting this here in the hopes that someone else will find this code useful :)

Probably the world's most advanced string splitting function

When I was setting up this website, I foolishly picked a custom log file format that is rather hard for computers to parse. Because of this I haven't found a server log analysis tool that is intelligent enough to parse my logs (if you know of one please let me know in the comments!).

Here is an example of a typical log file entry:

[19/May/2015:00:47:52 +0100] "starbeamrainbowlabs.com" HTTP/1.1 GET 200 162.243.87.220 0s :443 /blog/article.php article=posts/013-Terminal-Reference.html "https://starbeamrainbowlabs.com/blog/?offset=60" "Mozilla/5.0 (compatible; spbot/4.4.2; +http://OpenLinkProfiler.org/bot )"

It looks strange, doesn't it? Since I want to have some idea of how many people are visiting my site, I have finally gotten around to writing my own custom log parser. In order to do this, I needed a way to convert each line into an array of terms. None of the answers on stackoverflow seemed to cut it, so I wrote my own:

<?php
function explode_adv($openers, $closers, $togglers, $delimiters, $str)
{
    $chars = str_split($str);
    $parts = [];
    $nextpart = "";
    $toggle_states = array_fill_keys($togglers, false); // true = now inside, false = now outside
    $depth = 0;
    foreach($chars as $char)
    {
        if(in_array($char, $openers))
            $depth++;
        elseif(in_array($char, $closers))
            $depth--;
        elseif(in_array($char, $togglers))
        {
            if($toggle_states[$char])
                $depth--; // we are inside a toggle block, leave it and decrease the depth
            else
                // we are outside a toggle block, enter it and increase the depth
                $depth++;

            // invert the toggle block state
            $toggle_states[$char] = !$toggle_states[$char];
        }
        else
            $nextpart .= $char;

        if($depth < 0) $depth = 0;

        if(in_array($char, $delimiters) &&
           $depth == 0 &&
           !in_array($char, $closers))
        {
            $parts[] = substr($nextpart, 0, -1);
            $nextpart = "";
        }
    }
    if(strlen($nextpart) > 0)
        $parts[] = $nextpart;

    return $parts;
}
?>

I have also posted this on stackoverflow. This function of mine takes 5 parameters:

  1. An array of characters that open a block - e.g. [, (, etc.
  2. An array of characters that close a block - e.g. ], ), etc.
  3. An array of characters that toggle a block - e.g. ", ', etc.
  4. An array of characters that should cause a split into the next part.
  5. The string to work on.

This function probably will have flaws, but it works well enough for me.

You can also find this function on GitHub's Gist - as always suggestions and contributions are always welcome :)

IP version tester

You may have heard already - we have run out of IPv4 addresses. An IPv4 address is 32 bits long and looks like this: 37.187.192.179. If you count up all the possible combinations (considering each section may be between 0 and 255), missing out the addresses reserved for special purposes, you get about 3,706,452,992 addresses.

The new system that the world is currently moving to (very slowly mind you) is called IPv6 and is 128 bits long. They look like this: 2001:41d0:52:a00::68e. This gives us a virtually unlimited supply of addresses so we should never run out.

The problem is that the world is moving far too slowly over to it and you can never be sure if you have IPv6 connectivity or not. I built a quick IP version tester to solve this problem. I know there are others out there, but I wanted to build one myself :)

You can find it here: Ip Version Tester.

Art by Mythdael