
TCP (Client) Networking in Pure Bash

Recently I was reminded of /dev/tcp - a virtual file system built into bash that allows you to connect directly to a remote TCP endpoint, without the use of nc / netcat / ncat.

While it only allows you to connect (no listening, sadly), it's still a great bash built-in that helps avoid awkward platform-specific issues.
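
For example, here's a rough sketch of fetching a page over plain HTTP with it (example.com is just a placeholder host):

# Open a read / write connection to port 80 on file descriptor 3
exec 3<>/dev/tcp/example.com/80
# Send a minimal HTTP request...
printf 'GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n' >&3
# ...read the response back, then close the descriptor
cat <&3
exec 3>&-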

Here's how you'd listen for a connection with netcat, sending as much random data as possible to the poor unsuspecting client:

netcat -l 0.0.0.0 6666 </dev/urandom

Here's how you'd traditionally connect to that via netcat:

netcat X.Y.Z.W 6666 | pv >/dev/null

The pv command there is not installed by default, but is a great tool that shows the amount of data flowing through a pipe. It's available in the repositories for most Linux distributions without any special configuration required - so sudo apt install pv should be enough on Debian-based distributions.

Now, let's look at how we'd do this with pure bash:

pv >/dev/null </dev/tcp/X.Y.Z.W/6666

Very neat! We've been able to eliminate an extra child process. The question is though: how do they compare performance-wise? Well, that depends on how we measure it. In my case, I measured a single connection, downloading data as fast as it can for 60 seconds.

Another test would be to open many connections and download lots of small files. While I haven't done that here, I theorise that the pure-bash method would win out, as it doesn't have to spawn lots of subprocesses.
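
Such a test might look something like this rough sketch (untested; it assumes the server sends a small payload and closes each connection itself, and the loop counts are arbitrary):

# Traditional method: spawns a netcat process for every single connection
time for i in {1..1000}; do
    nc X.Y.Z.W 6666 >/dev/null
done

# Pure-bash method: no extra processes at all
time for i in {1..1000}; do
    while read -r _; do :; done </dev/tcp/X.Y.Z.W/6666
done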

For the single-connection test, I did this:

# Traditional method
timeout 60 nc X.Y.Z.W 6666 | pv >/dev/null
# Pure-Bash method
timeout 60 pv >/dev/null </dev/tcp/X.Y.Z.W/6666

The timeout command kills the process after a given number of seconds. The server I connected to was just this:

while true; do nc -l 0.0.0.0 6666 </dev/urandom; done

Running the above test, I got the following output:

$ timeout 60 pv >/dev/null </dev/tcp/172.16.230.58/6666
 652MiB 0:00:59 [11.2MiB/s] [                                      <=>         ]
$ timeout 60 nc 172.16.230.58 6666 | pv >/dev/null
 599MiB 0:01:00 [11.1MiB/s] [                                     <=>          ]
Method      | Total Data Transferred
------------|-----------------------
Traditional | 599MiB
Pure Bash   | 652MiB

As it turns out, the pure bash method is apparently faster - by ~8.8%. I think this might have something to do with the lack of the additional sub-process, or some other optimisation that bash can apply when doing the TCP networking itself.

Found this interesting? Got a cool use for it? Discovered another awesome bash built-in? Comment below!

Read / Write Disk Performance Testing in Bash

Recently I needed to quickly (and non-destructively) test the read / write performance of a flash drive of mine. Naturally, I turned my attention to my terminal. This post is me documenting what I did so that I can remember for next time :P

Firstly, to test the speed of a disk, we need some data to test with. Since lots of small files will inevitably cause slowdowns due to the overhead of writing each file's metadata and inode information, it makes the most sense to use one gigantic file rather than tons of small ones. Here's what I did to generate a 1 GiB file filled with zeroes:

dd if=/dev/zero of=/tmp/testfile.bin bs=1M count=1024

Cool. Next, we need to copy it to the target disk and measure the time it took. Then, since we know the size of the file (1073741824 bytes, to be exact), we can calculate the speed at which the copy took place. Here's my first attempt:

time dd if=/tmp/testfile.bin >testfile.bin

If you run this, you might find that it doesn't take very long at all, and that you get a speed of something like ~250MiB / sec! While impressive, I seriously doubt that my flash drive has that kind of speed behind it. Flash memory typically takes much longer to write to, and I'm pretty sure this drive can't be read from that fast either. So what's going on?

Well, it turns out that Linux is caching the disk write operations in a buffer, and then doing them in the background for us. Whilst fine for ordinary operation, this doesn't give us an accurate representation of how fast it's actually writing to the disk. Thankfully, there's something we can do about this: Use the sync command. sync will flush all cached write operations to disk for us, giving us the actual time it took to write the 1 GiB file to disk. Here's the altered command:

sync;
time sh -c 'dd if=/tmp/testfile.bin >testfile.bin; sync'

Very cool! Now we can just take the time it took and do some simple maths (file size divided by time taken) to calculate the write speed of our disk. What about the read speed though? Well, to test that, we'll first need to clear out the page cache - another one of Linux's (many) caches, which holds portions of recently-accessed files for faster retrieval - because as before, we're not interested in the speed of the cache! Here's how to do that:

echo 1 | sudo tee /proc/sys/vm/drop_caches

With the correct cache cleared, we can test the read speed accurately. Here's how I did it:

time dd if=testfile.bin of=/dev/null

Fairly simple, right? At a later date I might figure out a way of automating this, but for the occasional use now and again this works just fine :)
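
If I do get around to it, the script would probably look something like this rough, untested sketch (the defaults and the output format are purely illustrative):

#!/usr/bin/env bash
# Rough sketch of an automated read / write speed test (untested).
# Usage: ./disk-speed-test.sh /path/to/directory/on/target/disk
target="${1:-.}";
size_mib=1024;

# Generate the test file
dd if=/dev/zero of=/tmp/testfile.bin bs=1M count="${size_mib}" status=none;

# Time the write, flushing cached writes to disk with sync
sync;
write_seconds="$(TIMEFORMAT='%R'; { time sh -c "dd if=/tmp/testfile.bin of='${target}/testfile.bin' status=none; sync"; } 2>&1)";

# Clear the page cache, then time the read
echo 1 | sudo tee /proc/sys/vm/drop_caches >/dev/null;
read_seconds="$(TIMEFORMAT='%R'; { time dd if="${target}/testfile.bin" of=/dev/null status=none; } 2>&1)";

echo "Write: $(echo "scale=1; ${size_mib} / ${write_seconds}" | bc) MiB/s";
echo "Read:  $(echo "scale=1; ${size_mib} / ${read_seconds}" | bc) MiB/s";

# Clean up after ourselves
rm /tmp/testfile.bin "${target}/testfile.bin";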

Found this useful? Got a better way of doing it? Want to say hi? Post in the comments below!

How to prevent php-fpm from overriding your PHP-based caching logic

A while ago I added ETag support to the dynamic preview generator in Pepperminty Wiki. While I thought it worked during testing, I recently discovered on a private instance of Pepperminty Wiki that, for some reason, my browser didn't seem to be taking advantage of it. This got me curious, so I decided to do a little bit of digging to find out why.

It didn't take long to find the problem. For some reason, all the responses from the server had a Cache-Control: no-cache, no-store, must-revalidate header attached to them. How strange! Stranger still, it was in capital letters - my convention in Pepperminty Wiki is to always make headers lowercase.

I checked the codebase via the Project Find feature of Atom just to make sure that I hadn't left in a debugging statement or anything (I hadn't), and then I turned my attention to Nginx (engine-x) - the web server that I use on my server. Maybe it had added a caching header?

A quick grep later revealed that it wasn't responsible either - which leaves just one part of the system unchecked: php-fpm, the PHP FastCGI server that sits just behind Nginx and is responsible for running the various PHP scripts that power this website and other corners of my server. Another quick grep returned a whole bunch of garbage, so after doing some research I discovered that php-fpm is configured to send this header by default, and that it can be disabled by editing your php.ini (for me it's in /etc/php/7.1/fpm/php.ini) and changing

;session.cache_limiter = nocache

to be explicitly set to an empty string, like so:

session.cache_limiter = ''
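
After reloading php-fpm so that it picks up the new configuration, a quick curl of an affected page (the URL below is just a placeholder) confirms whether the header has gone:

curl -sI https://example.com/wiki/index.php | grep -i '^cache-control'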

This appears to have solved my problem for now - allowing me to regain control over how the resources I send back via PHP are cached. Hopefully this saves someone else the hassle of pulling their entire web server stack apart and putting it back together again in the future :P

Found this helpful? Still having issues? Got a better way of solving the problem? Post a comment below!

C Sharp Performance Tests: Squaring Numbers

The semester 2 coursework has begun, and this time there is a choice: either make a game to a specification, or produce a business-related application to a specification. I have chosen the game.

To start off, I began writing a few mathematical functions that I would need when writing the game itself. I quickly found that one of the things you do rather a lot is squaring numbers. In C♯ I have found 2 ways to square a number so far: a*a and Math.Pow(a, 2), where a is a number.

Even though it probably doesn't matter about the speed at which we square numbers (there are much better optimisations to make), I decided to test it anyway. I wrote this program:

using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("This program tests the performance of squaring numbers.\n");
        Console.WriteLine("Iterations | a*a Time | Math.Pow(a, 2) Time");
        Console.WriteLine("-----------|----------|--------------------");

        int iterations = 100000;
        double number = 3092d;
        double result;

        while(true)
        {
            Stopwatch timera = Stopwatch.StartNew();

            for(int i = 0; i < iterations; i++)
            {
                result = number*number;
            }
            timera.Stop();

            Stopwatch timerb = Stopwatch.StartNew();
            for(int i = 0; i < iterations; i++)
            {
                result = Math.Pow(number, 2);
            }
            timerb.Stop();

            Console.WriteLine(" {0,9} | {1,8} | {2}", iterations, timera.ElapsedMilliseconds.ToString(), timerb.ElapsedMilliseconds.ToString());

            iterations += 100000;

            System.Threading.Thread.Sleep(500); // make sure that we don't crash the server
        }
    }
}

(pastebin, binary)

I didn't have a Windows machine handy when writing this post, so the binary has been compiled with Mono's dmcs. Please leave a comment below if it doesn't work for you.

Everything is done with doubles to make it a fair test, because C♯'s Math.Pow method will only accept a double. The code sleeps for half a second between runs because I was writing this on the server that serves this website, and I didn't want to crash it by accident :)

The results were rather shocking. The easiest way to display the results is in a graph:

A graph of the results.

A pastebin of the results can be found here.

As you can quite clearly see, Math.Pow() is horribly inefficient compared to just multiplying a number by itself. Next time I am squaring numbers, I won't be using Math.Pow again.

Value vs Reference: A Performance Test

I have recently started learning about classes and objects in C♯. In one of my lectures, Rob Miles mentioned that passing by reference introduces a slight performance hit, so I decided to test this.

Here is the code I used:

using System;
using System.Diagnostics;

class ValueVsReference
{
    public static void SetValue(int somenumber, int maxi)
    {
        for (int i = 0; i < maxi; i++)
        {
            somenumber = i;
        }
    }

    public static void SetReference(ref int somenumber, int maxi)
    {
        for(int i = 0; i < maxi; i++)
        {
            somenumber = i;
        }
    }

    public static void Main()
    {
        int iterations = 100000;
        int a = 0;
        Console.Write("Iterations: {0} ", iterations);

        Console.Write("Value: ");
        Stopwatch valuetime = new Stopwatch();
        valuetime.Start();
        SetValue(a, iterations);
        valuetime.Stop();
        Console.Write("{0} ", valuetime.Elapsed);

        Console.Write("Reference: ");
        Stopwatch referencetime = new Stopwatch();
        referencetime.Start();
        SetReference(ref a, iterations);
        referencetime.Stop();
        Console.WriteLine(referencetime.Elapsed);
    }
}

(gist: https://gist.github.com/sbrl/b10bf5c765630562431f)

The results for 10,000 iterations can be found in this pastebin.

Average times:

  • Value: 0.0000885
  • Reference: 0.00009295

Final result: references are indeed slower than values, by ~5% (((0.00009295 / 0.0000885) - 1) * 100 ≈ 5%). Normally you would not need to worry about this performance drop though, since it is so small.

File System Performance in PHP

While writing Pepperminty Wiki, I started seeing a rather nasty increase in page load times. After looking into it, I drew the conclusion that it must have been the file system that caused the problem. At the time, I had multiple calls to PHP's glob function to find all the wiki pages in the current directory, and I was checking whether a wiki page existed before reading it into memory.

The solution: a page index. To cut down on the number of reads from the file system, I created a JSON file that contains information about every page on the wiki. This way, it only needs to check the existence of, and read in, a single file before it can start rendering any one page. If the page index doesn't exist, it is automatically rebuilt, using the glob function to find all the wiki pages in the current directory.
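
As an illustration, rebuilding such an index might look something like this minimal sketch (the file extension and field names are assumptions for the example, not Pepperminty Wiki's actual format):

<?php
// Minimal sketch: rebuild the page index with a single glob over the data directory
$index = [];
foreach(glob("*.md") as $filename) {
    $pagename = basename($filename, ".md");
    $index[$pagename] = [
        "filename" => $filename,
        "size" => filesize($filename),
        "lastmodified" => filemtime($filename)
    ];
}
// One write here saves many file system hits when rendering pages later
file_put_contents("pageindex.json", json_encode($index, JSON_PRETTY_PRINT));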

In short: to increase the performance of your PHP application, try to reduce the number of reads (and writes!) to the file system to an absolute minimum.

I still need to update the code to allow users to delete pages via the GUI though, because at present you have to have access to the server files to delete a page and then remove it from the page index manually.
