Stardust | Starbeamrainbowlabs

Summer Project Series List

At this point, it is basically the end of my summer project series - at least for a while while I start my PhD (more on that in a future post). To this end, I'm releasing the series list for it.

Summer Project Part 5: When is a function not a function?

Another post! Looks like I'm on a roll in this series :P

In the last post, I looked at the box I designed that was ready for 3D printing. That process has now been completed, and I'm now in possession of an (almost) luminous orange and pink box that could almost glow in the dark.......

I also looked at the libraries that I'll be using and how to manage the (rather limited) amount of memory available in the AVR microprocessor.

Since last time, I've somehow managed to shave a further 6% program space off (though I'm not sure how I've done it), so most recently I've been implementing 2 additional features:

An additional layer of AES encryption, to prevent The Things Network for having access to the decrypted data
GPS delta checking (as I'm calling it), to avoid sending multiple messages when the device hasn't moved

After all was said and done, I'm now at 97% program space and 47% global variable RAM usage.

To implement the additional AES encryption layer, I abused LMiC's IDEETRON AES-128 (ECB mode) implementation, which is stored in src/aes/ideetron/AES-128_V10.cpp.

It's worth noting here that if you're doing crypto yourself, it's seriously not recommended that you use ECB mode. Please don't. The only reason that I used it here is because I already had an implementation to hand that was being compiled into my program, I didn't have the program space to add another one, and my messages all start with a random 32-bit unsigned integer that will provide a measure of protection against collision attacks and other nastiness.

Specifically, it's the method with this signature:

void lmic_aes_encrypt(unsigned char *Data, unsigned char *Key);

Since this is an internal LMiC function declared in a .cpp source file with no obvious header file twin, I needed to declare the prototype in my source code as above - as the method will only be discovered by the compiler when linking the object files together (see this page for more information about the C++ compilation process. While it's for regular Linux executable binaries, it still applies here since the Arduino toolchain spits out a very similar binary that's uploaded to the microprocessor via a programmer).

However, once I'd sorted out all the typing issues, I slammed into this error:

/tmp/ccOLIbBm.ltrans0.ltrans.o: In function `transmit_send':
sketch/transmission.cpp:89: undefined reference to `lmic_aes_encrypt(unsigned char*, unsigned char*)'
collect2: error: ld returned 1 exit status

Very strange. What's going on here? I declared that method via a prototype, didn't I?

Of course, it's not quite that simple. The thing is, the file I mentioned above isn't the first place that a prototype for that method is defined in LMiC. It's actually in other.c, line 35 as a C function. Since C and C++ (for all their similarities) are decidedly different, apparently to call a C function in C++ code you need to declare the function prototype as extern "C", like this:

extern "C" void lmic_aes_encrypt(unsigned char *Data, unsigned char *Key);

This cleaned the error right up. Turns out that even if a function body is defined in C++, what matters is where the original prototype is declared.

I'm hoping to release the source code, but I need to have a discussion with my supervisor about that at the end of the project.

Found this interesting? Come across some equally nasty bugs? Comment below!

Summer Project Part 4: Threading the needle and compacting it down

In the last part, I put the circuit for the IoT device together and designed a box for said circuit to be housed inside of.

In this post, I'm going to talk a little bit about 3D printing, but I'm mostly going to discuss the software aspect of the firmware I'm writing for the Arduino Uno that's going to be control the whole operation out in the field.

Since last time, I've completed the design for the housing and sent it off to my University for 3D printing. They had some great suggestions for improving the design like making the walls slightly thicker (moving from 2mm to 4mm), and including an extra lip on the lid to keep it from shifting around. Here are some pictures:

(Left: The housing itself. Right: The lid. On the opposite side (not shown), the screw holes are indented.)

At the same time as handling sending the housing off to be 3D printed, I've also been busily iterating on the software that the Arduino will be running - and this is what I'd like to spend the majority of this post talking about.

I've been taking an iterative approach to writing it - adding a library, interfacing with it and getting it to do what I want on it's own, then integrating it into the main program.... and then compacting the whole thing down so that it'll fit inside the Arduino Uno. The thing is, the Uno is powered by an ATmega328P (datasheet). Which has 32K of program space and just 2K of RAM. Not much. At all.

The codebase I've built for the Uno is based on the following libraries.

LMiC (the matthijskooijman fork) - the (rather heavy and needlessly complicated) LoRaWAN implementation
Entropy for generating random numbers as explained in part 2
TinyGPS, for decoding NMEA messages from the NEO-6M
SdFat, for interfacing with microSD cards over SPI

Memory Management

Packing the whole program into a 32K + 2K box is not exactly an easy challenge, I discovered. I chose to first deal with the RAM issue. This was greatly aided by the FreeMemory library, which tells you how much RAM you've got left at a given point in the execution of your program. While it's a bit outdated, it's still a useful tool. It works a bit like this:

#include <MemoryFree.h>;

void setup() {
    Serial.begin(115200);
    Serial.println(freeMemory, DEC);
    char test[] = "Bobs Rockets";
    Serial.println(freeMemory, DEC); // Should be lower than the above call
}

void loop() {
    // Nothing here
}

It's worth taking a moment to revise the way stacks and heaps work - and the differences between how they work in the Arduino environment and on your desktop. This is going to get rather complicated quite quickly - so I'd advise reading this stack overflow answer first before continuing.

First, let's look at the locations in RAM for different types of allocation:

Things on the stack
Things on the heap
Global variables

Unlike on the device you're reading this on, the Arduino does not support multiple processes - and therefore the entirety of the RAM available is allocated to your program.

Since I wasn't sure about preecisly how the Arduino does it (it's processor architecture-specific), I wrote a simple test program to tell me:

#include <Arduino.h>

struct Test {
    uint32_t a;
    char b;
};

Test global_var;

void setup() {
    Serial.begin(115200);

    Test stack;
    Test* heap = new Test();

    Serial.print(F("Stack location: "));
    Serial.println((uint32_t)(&stack), DEC);

    Serial.print(F("Heap location: "));
    Serial.println((uint32_t)heap, DEC);

    Serial.print(F("Global location: "));
    Serial.println((uint32_t)&global_var, DEC);
}

void loop() {
    // Nothing here
}

This prints the following for me:

Stack location: 2295
Heap location: 461
Global location: 284

From this we can deduce that global variables are located at the beginning of the RAM space, heap allocations go on top of globals, and the stack grows down starting from the end of RAM space. It's best explained with a diagram:

Now for the differences. On a normal machine running an operating system, there's an extra layer of abstraction between where things are actually located in RAM and where the operating system tells you they are located. This is known as virtual memory address translation (see also virtual memory, virtual address space).

It's a system whereby the operating system maintains a series of tables that map physical RAM to a virtual address space that the running processes actually use. Usually each process running on a system will have it's own table (but this doesn't mean that it will have it's own physical memory - see also shared memory, but this is a topic for another time). When a process accesses an area of memory with a virtual address, the operating system will transparently translate the address using the table to the actual location in RAM (or elsewhere) that the process wants to access.

This is important (and not only for security), because under normal operation a process will probably allocate and deallocate a bunch of different lumps of memory at different times. With a virtual address space, the operating system can defragment the physical RAM space in the background and move stuff around without disturbing currently running processes. Keeping the free memory contiguous speeds up future allocations, and ensures that if a process asks for a large block of contiguous memory the operating system will be able to allocate it without issue.

As I mentioned before though, the Arduino doesn't have a virtual memory system - partly because it doesn't support multiple processes (it would need an operating system for that). The side-effect here is that it doesn't defragment the physical RAM. Since C/C++ isn't a managed language, we don't get _heap compaction_ either like in .NET environments such as Mono.

All this leads us to an environment in which heap allocation needs to be done very carefully, in order to avoid fragmenting the heap and causing a stack crash. If an object somewhere in the middle of the heap is deallocated, the heap will not shrink until everything to the right of it is also deallocated. This post has a good explanation of the problem too.

Other things we need to consider are keeping global variables to a minimum, and trying to keep most things on the stack if we can help it (though this may slow the program down if it's copying things between stack frames all the time).

To this end, we need to choose the libraries we use with care - because they can easily break these guidelines that we've set for ourselves. For example, the inbuilt SD library is out, because it uses a global variable that eats over 50% of our available RAM - and it there's no way (that I can see at least) to reclaim that RAM once we're finished with it.

This is why I chose SdFat instead, because it's at least a little better at allowing us to reclaim most of the RAM it used once we're finished with it by letting the instance fall out of scope (though in my testing I never managed to reclaim all of the RAM it used afterwards).

Alternatives like µ-Fat do exist and are even lighter, but they have restrictions such as no appending to files for example - which would make the whole thing much more complicated since we'd have to pre-allocate the space for the file (which would get rather messy).

The other major tactic you can do to save RAM is to use the F() trick. Consider the following sketch:

#include 

void setup() {
    Serial.begin(115200);
    Serial.println("Bills boosters controller, version 1");
}

void loop() {
    // Nothing here
}

On the highlighted line we've got an innocent Serial.println() call. What's not obvious here is that the string literal here is actually copied to RAM before being passed to Serial.println() - using up a huge amount of our precious working memory. Wrapping it in the F() macro forces it to stay in your program's storage space:

Serial.println(F("Bills boosters controller, version 1"));

Saving storage

With the RAM issue mostly dealt with, I then had to deal with the thorny issue of program space. Unfortunately, this is not as easy as saving RAM because we can't just 'unload' something when it's not needed.

My approach to reducing program storage space was twofold:

Picking lightweight alternatives to libraries I needed
Messing with flags of said libraries to avoid compiling parts of libraries I don't need

It is for these reasons that I ultimately went with TinyGPS instead of TinyGPS++, as it saved 1% or so of the program storage space.

It's also for this reason that I've disabled as much of LMiC as possible:

#define DISABLE_JOIN
#define DISABLE_PING
#define DISABLE_BEACONS
#define DISABLE_MCMD_DCAP_REQ
#define DISABLE_MCMD_DN2P_SET

This disables OTAA, Class B functionality (which I don't need anyway), receiving messaages, the duty cycle cap system (which I'm not sure works between reboots), and a bunch of other stuff that I'd probably find rather useful.

In the future, I'll probably dedicate an entire microcontroller to handling LoRaWAN functionality - so that I can still use the features I've had to disable here.

Even doing all this, I still had to trim down my Serial.println() calls and remove any non-essential code to bring it under the 32K limit. As of the time of typing, I've got jut 26 bytes to spare!

Next time, after tuning the TPL5110 down to the right value, we're probably going to switch gears and look at the server-side of things - and how I'm going to be storing the data I receive from the Arudino-based device I've built.

Found this interesting? Got a suggestion? Comment below!

C++ in Review

I started learning C++ back in September, and I think I've experienced enough of it in order to write a proper review post on it. I've been wanting to do this for a while, as I have quite a bit to say about it.

I think I should start off by saying that C++ is complicated. Very complicated. To the point that you could learn about it for years and still not be half way through everything that you could learn about it. While this offers unparalelled customisablility, it also makes it difficult for new programmmers to pick up the language. Even worse is that whilst something might work, it could easily contain a critical security bug that can be extremely difficult to spot.

It's also difficult (for me anyway) to remember what everything is and what it does. For example, a const int* is not the same as an int* const. I find myself constantly looking things up that I forget over and over again. On top of this, a large number of the core built in functions don't actually do what you'd think they would. For example, getline() doesn't get a line of text, and fail() checks to see if an error occurred whilst opening it. These names can't be corrected in a later version of C++ either - later C++ versions must be able to compile previous versions of the language, including pure C from 30 years ago. Perhaps this is what makes it so difficult to use?

The other thing that makes C++ annoying and difficult to use is how low level it is. You end up having to do almost everything yourself - even things that you'd think would be built into the language already. The standard template library (STL) provides a large number of methods and data strutures to fill in the gaps, but most of these are misnamed too (the vector class is actually similar to a C♯ List), so you end up implementing your own anyway because you can't find the class you want in the STL (only to have someone point it out later).

Thankfully, there is a reason why everyone is still using an old broken language to write important code - it's practically the fastest language out there short of assembly language. No other language comes close to rivalling C++ in speed. Is this the only reason we are keeping it?

Bitwise Operators in C++

A while ago I was given some directed reading on bitwise operators in C++. I've been so busy with coursework, I haven't had time to take a look! Thankfully, I found some time and thought that it would make a good post here.

Bitwise operators in C++ appear to be very similar to those in Javascript. They operate on things at a binary level - i.e. operating on the individual bits that make up a number. Apparently there are 6 different operators:

& - Bitwise AND
| - Bitwise OR
^ - Bitwise XOR (eXclusive OR)
~ - Bit inversion (monadic)
<< - Shift bits to the left
>> - Shift bits to the right

I'll explain each of these with examples below.

Bitwise AND

Bitwise AND takes a bit from each thing (usually a number), and outputs a 1 if they are both 1s, and a 0 otherwise. Here's an example:

87654321
--------
01011010 // 90
11010101 // 213
-- AND --
01010000 // 80

In the above example, the only 1s that actually make it through to the final result are positions 7 and 5, which are worth 64 and 16 respectively. This can be useful for extracting a specific bit from a number, like this:

#include <iostream>
using namespace std;
void main() {
    int c = 58, d = 15;

    cout << "c " << (c & 32) << endl;
    cout << "d " << (d & 32) << endl;
}

In the above, I create 2 variables to hold the 2 numbers that I want to test, and then I perform an AND on each one in turn, writing the result to the console. It should output something like this:

c 32
d 0

This is because 58 is 00111010, and 32 is 00100000, so only the 6th bit has a chance of making ti through to the final result.

Bitwise OR

Bitwise OR outputs a 1, so long as any of the inputs are true. Here's an example:

87654321
--------
10110101
00011101
-- OR --
10111101

Bitwise XOR

Bitwise XOR stands for exclusive OR, and outputs a 1 so long as either of the inputs are 1, but not both. For example, 1 ^ 0 = 1, but 1 ^ 1 = 0.

87654321
--------
10101101
11001110
-- XOR --
01100011

Bit inversion

Bitwise inversion is a monadic operator (i.e. it only takes one operand), and flips every bit of the input. For example 11011101 (221) would become 00100010 (34), although this greatly depends of the type and length of the number you are using.

Bit shifting

Bitshifting is the process of shifting a bit of bits to the left or right a given number of places. For example, if you shift 1010 (10) over to the left 1 place you get 10100 (20), and if you shift 0111 (7) over to the right by one place you get 0011 (3). If you play around with this, you notice that this has the effect of doubling and halving the number:

cout << "f ";
int f = 5;
for(int i = 0; i < 5; i++)
{
    f = f << 1;
    cout << f << " ";
}
cout << endl;
cout << "g ";
int g = 341;
for(int i = 0; i < 5; i++)
{
    g = g >> 1;
    cout << g << " ";
}
cout << g << endl;

It doesn't deal with the decimal place, though - for that you need something like a float or a double. Here's some example output from the above:

f 10 20 40 80 160
g 170 85 42 21 10 10

At first glance, the bit shifting operators in c++ look the same as the ones used to stream things to and from an input / output stream - I'll have to be careful when using these to make sure that I don't accidentally stream something when I meant to bitshift it.

Bitshifting can be useful when working with coloured pixels. You can set a colour like this:

unsigned int newCol = ((unsigned int) 248)<<16) +
    ((unsigned int) 0)<<8) +
    ((unsigned int) 78);

I think that just about covers bitwise operators in C++. If you're interested, you can find the source code that I've written whilst writing this post here (Raw).

Static Variable Memory Allocation

Recently we were asked in an Advanced Programming lecture to write a program that proves that static variables in C++ are located in a different area of memory to normal variables. This post is about the program I came up with.

But first I should probably explain a little bit about memory allocation in C++. I'm by no means an expert (I'm just learning!), but I'll do my best.

There are two main areas of memory in C++ - the stack and the heap. The stack is an ordered collection of data that contains all the local variables that you define. It grows downward from a given address in memory. The stack is split up into frames, too, with each frame being a specific context of execution. You can find more information about this in the Intel Software Developer's Manual, Section 6.2.

The second area of memory is the heap. The heap is a large portion of memory that can be used for storing large lumps of unordered data. I haven't taken a look at this yet, so I don't know much more about it.

Apparently variables that are declared to be static are not located in the stack (where they are located I'm not entirely sure), and we were asked to prove it. Here's what I came up with:

#include <iostream>
#include <string>

using namespace std;

void staticTest()
{
    static int staticD = 0;
    cout << "staticD: " << &staticD << endl;
}

int main()
{
    int a = 50;
    int b = 200;
    int c = 23495;

    cout << "a: " << &a << endl;
    cout << "b: " << &b << endl;
    cout << "c: " << &c << endl;

    staticTest();

    int e = 23487;
    cout << "e: " << &e << endl;

    //cin.get();

    return 0;
}

The above simply creates 3 normal variables, outputs their addresses, calls the staticTest() function to create a static variable and output it's memory address, and then creates a 4th normal variable and outputs its address too. The above program should output something like this:

a: 0091F8C4 b: 0091F8B8 c: 0091F8AC staticD: 00080320 e: 0091F8A0

In the above output, the normal variables a, b, c, and e are all located next to each other in memory between the adresses 0091F8C4 and 0091F8A0 (note how the memory address is decreasing - the stack grows downward, like a stalactite). The static variable staticD, however, is located in an entirely different area of memory over at 00080320. Since it's located in an entirely different area of memory, it's safe to assume, I think, that static variables are not stored in the stack, but somewhere else (I'm not sure where - the heap maybe?).

If you know where they are stored, please leave a comment below!

Dividing by zero doesn't always throw an exception

Yesterday I had my first Advenced Programming lab session. Part of the instructions get you to investigate how small a number has to be in order to throw a division by zero exception. I discovered, however, that if you use a double, it does not, in fact, throw any such error. I did some investigation, and thought I might share what I learnt here.

According to Google, C++ only throws a division by zero error if it is doing integer math. It doesn't seem to care if it is doing floating point math. This is because floating point numbers can hold the value Infinity, and integers can't. This means that none of the following statements will throw an error:

double x = 10.0 / 0.0;
float y = 371 / 0.0;
int z = 89.0 / 0.0;

The last line of the above looks like it should throw an error, but it doesn't. The reason for this is that it calculates 89.0 / 0.0 using floating point maths to be Inf, and then converts that into an integer. Since Inf doesn't convert directly, it picks the next best thing I'd assume. The following, however, will all throw an error:

int x = 10 / 0;
float y = 1000 / 0;
double z = 4560 / 0;

Stardust Blog

Tag Cloud