
.desktop files: Launcher icons on Linux

Heya! Just thought I'd write a quick reminder post for myself on the topic of .desktop files. In most Linux distributions, launcher icons for things are dictated by files with the file extension .desktop.

Of course, most programs these days come with a .desktop file automatically, but if you for example download an AppImage, then you might not get an auto-generated one. You might also be packaging something for your distro's package manager (go you!) - something I do semi-regularly when the apt repos for software I need aren't updated (see my apt repository!).

They can live either locally to your user account (~/.local/share/applications) or globally (/usr/share/applications), and they follow the XDG desktop entry specification (see also the Arch Linux docs page, which is fabulous as usual ✨). It's basically a fancy .ini file:

[Desktop Entry]
Encoding=UTF-8
Type=Application
Name=Krita
Comment=Krita: Professional painting and digital art creation program
Version=1.0
Terminal=false
Exec=/usr/local/bin/krita
Icon=/usr/share/icons/krita.png

Basically, leave the first line, the Type, the Version, the Terminal, and the Encoding directives alone, but the others you'll want to customise:

  • Name: The name of the application to be displayed in the launcher
  • Comment: The short 1-line description. Some distros display this as the tooltip on hover, others display it in other ways.
  • Exec: Path to the binary to execute. Prepend with env Foo=bar etc if you need to set some environment variables (e.g. running on a discrete card - 27K views? wow O.o)
  • Icon: Path to the icon to display as the launcher icon. For global .desktop files, this should be located somewhere in /usr/share/icons.

This is just the basics. There are many other directives you can include - like the Categories directive, which describes - if your launcher supports categories - which categories a given launcher icon should appear under.
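For example, here's a sketch of a complete file with those extras filled in. The Categories value and the environment variable are illustrative assumptions: Graphics and 2DGraphics are registered categories in the XDG menu specification, and DRI_PRIME=1 asks Mesa to use the discrete graphics card:

[Desktop Entry]
Encoding=UTF-8
Type=Application
Name=Krita
Comment=Krita: Professional painting and digital art creation program
Version=1.0
Terminal=false
Exec=env DRI_PRIME=1 /usr/local/bin/krita
Icon=/usr/share/icons/krita.png
Categories=Graphics;2DGraphics;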

Troubleshooting: It took me waaay too long to realise this, but if you have put your .desktop file in the right place and it isn't appearing - even after a relog - then the desktop-file-validate command could come in handy:

desktop-file-validate path/to/file.desktop

It validates the content of a given .desktop file. If it contains any errors, then it will complain about them for you - unlike your desktop environment which just ignores .desktop files that are invalid.

If you found this useful, please do leave a comment below about what you're creating launcher icons for!


Defining AI: Image segmentation

Banner showing the text 'Defining AI' on top of translucent white vertical stripes against a voronoi diagram in white against a pink/purple background. 3 progressively larger circles are present on the right-hand side.

Welcome back to Defining AI! This series is all about defining various AI-related terms in short-and-sweet blog posts.

In the last post, we took a quick look at word embeddings, which is the key behind how AI models understand text. In this one, we're going to investigate image segmentation.

Image segmentation is a particular way of framing a learning task for an AI model: the model takes an image as input, and instead of classifying the whole image into a given set of categories, it classifies every pixel by the category that pixel belongs to.

The simplest form of this is what's called semantic segmentation, where we classify each pixel via a given set of categories - e.g. building, car, sky, road, etc if we were implementing a segmentation model for some automated vehicle.

If you've been following my PhD journey for a while, you'll know that it's not just images that can be framed this way: any data that is 2D (or can be convinced to pretend to be 2D) can be treated as an image segmentation task.

The output of an image segmentation model is basically a 3D map. This map has the width and height of the input image, plus an extra dimension for the channel. This is best explained with an image:

(Above: a diagram explaining how the output of an image segmentation model is formatted. See below explanation. Extracted from my work-in-progress thesis!)

Essentially, each value in the 'channel' dimension will be the probability that the pixel belongs to that class. So, for example, a single pixel in a model for predicting the background and foreground of an image might look like this:

[ 0.3, 0.7 ]

....if we consider these classes:

[ background, foreground ]

....then this pixel has a 30% chance of being a background pixel, and a 70% chance of being a foreground pixel - so we'd likely assume it's a foreground pixel.

Built up over the course of an entire image, you end up with a classification of every pixel. This could lead to models that separate the foreground and background in live video feeds, autonomous navigation systems, defect detection in industrial processes, and more.
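To make this concrete, here's a minimal sketch in Python of collapsing such an output map into a single class per pixel - the shapes and values here are made up for illustration:

import numpy as np

# Pretend model output: 4x4 pixels, 2 channels = [ background, foreground ]
height, width, channels = 4, 4, 2
rng = np.random.default_rng(42)
output = rng.random((height, width, channels))
output /= output.sum(axis=-1, keepdims=True)  # each pixel's probabilities sum to 1

# Pick the most likely class for every pixel at once
class_map = output.argmax(axis=-1)  # shape (height, width): 0 = background, 1 = foreground
print(class_map)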

Some models, such as the Segment Anything Model (website), have even been trained to generically segment any input image - as in the above image, where we have a snail sat on top of a frog sat on top of a turtle, which is swimming in the water.

As alluded to earlier, you can also feed in other forms of 2D data. For example, this paper predicts rainfall radar data a given number of hours into the future from a sample at the present moment. Or, my own research approximates the function of a physics-based model!


That's it for this post. If you've got something you'd like defining, please do leave a comment below!

I'm not sure what it'll be next, but I might either stay at a high level and look at different ways that we can frame tasks for AI models, or jump to a lower level to look at fundamentals like loss (error) functions, backpropagation, layers (AI models are made up of multiple smaller layers), etc. What would you like to see next?

Defining AI: Word embeddings

Hey there! It's been a while. After writing my thesis for the better part of a year, I've been getting a bit burnt out on writing - so unfortunately I had to take a break from writing for this blog. My thesis is almost complete though - more on this in the next post in the PhD update blog post series. Other higher-effort posts are coming (including the belated 2nd post on NLDL-2024), but in the meantime I thought I'd start a series on defining various AI-related concepts. Each post is intended to be relatively short in length to make them easier to write.

Normal scheduling will resume soon :-)

Banner showing the text 'Defining AI' on top of translucent white vertical stripes against a voronoi diagram in white against a pink/purple background. 3 progressively larger circles are present on the right-hand side.

As you can tell by the title of this blog post, the topic for today is word embeddings.

AI models operate fundamentally on numerical values and mathematics - so naturally to process text one has to encode said text into a numerical format before it can be shoved through any kind of model.

This process of converting text to a numerically-encoded value is called word embedding!

As you might expect, there are many ways of doing this. Often, this involves looking at what other words a given word often appears next to. For example, this could be framed as a task in which a model has to predict a word given the words immediately before and after it in a sentence; the output of the last layer before the output layer is then taken as the embedding (word2vec).

Other models use matrix math to calculate this instead, producing a dictionary file as an output (GloVe | paper). Still others use large models that are trained to predict randomly masked words, and process entire sentences at once (BERT and friends) - though these are computationally expensive since every bit of text you want to embed has to get pushed through the model.

Then there are contrastive learning approaches. More on contrastive learning later in the series if anyone's interested, but essentially it learns by comparing pairs of things. This can lead to a higher-level representation of the input text, which can increase performance in some circumstances - along with other fascinating side effects that I won't go into in this post. Chief among these approaches is CLIP (blog post).

The idea here is that semantically similar words wind up having similar sorts of numbers in their numerical representation (we call this a vector). This is best illustrated with a diagram:

Words embedded with GloVe and displayed in a heatmap

I embedded some words with GloVe (really easy to use, since a GloVe model is literally just a dictionary), and then used cosine similarity to compare the different words. Once done, I plotted this in a heatmap.

As you can see, rain and water are quite similar (1 = identical; 0 = completely different), but rain and unrelated are not really alike at all.
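If you fancy reproducing something like this yourself, here's a minimal sketch of the process. The filename is an assumption - any text-format GloVe release file (one word followed by its vector per line) should work:

import numpy as np

def load_glove(filepath):
    # Parse a GloVe text file into a word → vector dictionary
    embeddings = {}
    with open(filepath, encoding="utf-8") as handle:
        for line in handle:
            word, *values = line.split()
            embeddings[word] = np.array(values, dtype=np.float32)
    return embeddings

def cosine_similarity(a, b):
    # 1 = identical direction; 0 = completely different
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

embeddings = load_glove("glove.6B.50d.txt")
print(cosine_similarity(embeddings["rain"], embeddings["water"]))      # relatively high
print(cosine_similarity(embeddings["rain"], embeddings["unrelated"]))  # much lower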


That's about the long and short of word embeddings. As always with these things, you can go into an enormous amount of detail, but I have to cut it off somewhere.

Are there any AI-related concepts or questions you would like answering? Leave a comment below and I'll write another post in this series to answer your question.

NLDL was awesome! >> NLDL-2024 writeup

A cool night sky and northern lights banner I made in Inkscape. It features mountains and AI-shaped constellations, with my logo and the text "@ NLDL 2024".

It's the week after I attended the Northern Lights Deep Learning Conference 2024, and now that I've had time to start to process everything I learnt and experienced, it's blog post time~~✨ :D

Edit: Wow, this post took a lot more effort to put together than I expected! It's now the beginning of February and I'm just finishing this post up - sorry about the wait! I think this is the longest blog post to date. Consider this a mega-post :D

In this post - which is likely to be quite long - I'm going to talk a bit about the trip itself, what happened, and the key things that I learnt. Bear in mind that this post is written while I'm still sorting my thoughts out - it's likely going to take many months to fully dive into everything I saw that interested me :P

Given I'm still working my way through everything I've learnt, it is likely that I've made some errors here. Don't take my word for any facts written here please! If you're a researcher I'm quoting here and you have spotted a mistake, please let me know.

I have lots of images of slides and posters that interested me, but they don't make a very good collage! To this end, the images shown below are of the things I saw & experienced. Images of slides & posters etc are available upon request.

Note: All paper links will be updated to DOIs when the DOIs are released. All papers have been peer-reviewed.

Day 1

(Above: A collage of images from travelling to the conference. Description below.)

Images starting top-left, going anticlockwise:

  1. The moon & Venus set against the first.... and last sunrise I would see for the next week
  2. Getting off the first plane in Amsterdam Schiphol airport
  3. Flying into Bergen airport
  4. The teeny Widerøe turboprop aircraft that took me from Bergen to Tromsø
  5. A view from the airport window when collecting my bags
  6. Walking to the hotel
  7. Departures board in Bergen airport

After 3 very long flights the day before (the views were spectacular, but they left me exhausted), the first day of the conference finally arrived. As I negotiated public transport to get myself to UiT, The Arctic University of Norway, I wasn't sure what to expect. As it turned out, academic conferences are held in (large) lecture theatres (at least this one was), with a variety of different events in sequence:

  • Opening/closing addresses: Usually at the beginning & ends of a conference. Particularly the beginning address can include useful practical information such as where and when food will be made available.
  • Keynote: A longer (usually ~45 minute) talk that sets the theme for the day or morning/afternoon. Often done by famous and/or accomplished researchers.
  • Oral Session: A session chaired by a person [usually relatively distinguished] in which multiple talks are given by individual researchers with 20 minutes per talk. Each talk is 15 minutes with 5 minutes for questions. I spoke in one of these!
  • Poster session: Posters are put up by researchers in a designated area (this time just outside the room) and people can walk around and chat with researchers about their research. If talks have a wide reach and shallow depth, posters have a narrow reach and much more depth.
    • I have a bunch of photographs of posters that interested me for one reason or another, but it will take me quite a while to properly dig into them all and extract the interesting bits.
  • Panel discussion: A number of distinguished people sit down on chairs/stools at the front, and a chair asks a series of questions and moderates the resulting discussion. Questions from the audience may also be allowed after some of the key preset questions have been asked.

This particular conference didn't have multiple events happening at once (called 'tracks' in some conferences I think), which I found very helpful as I didn't need to figure out which events I should attend or not. Some talks didn't sound very interesting but then turned out to be some of the highlights of the conference for me, as I'll discuss below. Definitely a fan of this format!

The talks started off looking at the fundamentals of AI. Naturally, this included a bunch of complex mathematics - the understanding of which in real-time is not my strong point - so while I did make some notes on these I need to go back and take a gander at the papers of some of the talks to fully grasp what was going on.

Moving on from AI fundamentals, the next topic was reinforcement learning. While not my current area of research, some interesting new uses of the technology were discussed, such as dynamic pathing/navigation based on the information gained from onboard sensors by Alouette von Hove from the University of Oslo - the presented example was determining the locations of emitters of greenhouse gases such as CO₂ & methane.

Interspersed in-between the oral sessions were poster sessions. At NLDL-2024 these were held in the afternoons and also had fruit served alongside them, which I greatly appreciated (I'm a huge fan of fruit). At these there were posters for the people who had presented earlier in the day, but also some additional posters from researchers who were presenting a talk.

If talks reach a wide audience at a shallow depth, posters reach a narrower audience at a much greater depth. I found the structure of having the talk before the poster very useful - not only for presenting my own research (more on that later), but also for picking out some of the posters I wanted to visit to learn more about their approaches.

On day 1, the standout poster for me was one on uncertainty quantification in image segmentation models - Confident Naturalness Explanation (CNE): A Framework to Explain and Assess Patterns Forming Naturalness. While their approach to increasing the explainability of image segmentation models (particularly along class borders) was applied to land use and habitat identification, I can totally see it being applicable to many other different projects in a generic 'uncertainty-aware image segmentation' form. I would very much like to look into this one deeply and consider applying lessons learnt to my rainfall radar model.

Another interesting poster worked to segment LiDAR data in a similar fashion to that of normal 'image segmentation' (that I'm abusing in my research) - Beyond Siamese KPConv: Early Encoding and Less Supervision.

Finally, an honourable mention is one which applied reinforcement learning to task scheduling - Scheduling conditional task graphs with deep reinforcement learning.

Diversity in AI

In the afternoon, the Diversity in AI event was held. The theme was fairness of AI models, and this event was hugely influential for me. Through a combination of cutting-edge research and helpful case studies and illustrations, the speakers revealed hidden sources of bias and novel ways to try and correct for them. They asked the question "what do we mean by a fair AI model?", and explored the multiple facets of that question - and how fairness in an AI model can mean different things in different contexts and to different people.

They also demonstrated how taking a naïve approach to correcting for e.g. bias in a binary classifier could actually make the problem worse!

I have not yet had time to go digging into this, but I absolutely want to spend at least an entire afternoon dedicated to digging into and reading around the subject. Previously, I had no idea how big and pervasive the problem of bias in AI was, so I most certainly want to educate myself to ensure models that I create as a consequence of the research I do are as ethical as possible.

Depending on how this research reading goes, I could write a dedicated blog post on it in the future. If this would be interesting to you, please comment below with the kinds of things you'd be interested in.

Another facet of the diversity event was that of hiring practices and diversity in academia. In the discussion panel that closed out the day, the current dilemma of low diversity (e.g. gender balance) among students taking computer science as a subject was discussed. It was suggested that how computer science is portrayed can make a difference, and that people with different backgrounds will approach and teach the subject through different lenses. Mental health was also mentioned as a factor that requires work and effort to reduce stigma, encourage discussions, and generally improve the situation.

All in all I found the diversity event to be a very useful and eye-opening event that I'm glad I attended.

(Above: A collage from day 1 of the conference)

Images starting top-left, going anticlockwise in an inwards spiral:

  1. The conference theatre during a break
  2. Coats hung up on the back wall of the conference theatre - little cultural details stood out to me and were really cool!
  3. On the way in to the UiT campus on day 1
  4. Some plants under some artificial sunlight bulbs I found while having a wander
  5. Lunch on day 1: rice (+ curry, but I don't like curry)
  6. NLDL-2024 sign
  6. View from the top of Fjellheisen
  8. Cinnamon bun that was very nice and I need to find a recipe
  9. View from the cable car on the way up Fjellheisen

Social 1: Fjellheisen

The first social after the talks closed out for the day was a trip up the local mountain Fjellheisen (pronounced fyell-hai-sen, as far as I can tell). Thankfully a cable car was available to take conference attendees (myself included) up the mountain, as it was bitterly cold and snowy - especially 420m above sea level at the top. Although it was very cloudy at the time, with a stratus cloud base around 300m (perhaps even lower), we still got some fabulous views of Tromsø and the surrounding area.

There was an indoor seating area too, in which I warmed up with a cinnamon bun and had some great conversations with some of the other conference attendees. Social events and ad-hoc discussions are, I have discovered, an integral part of the conference experience. You get to meet so many interesting people and discover so many new things that you wouldn't otherwise get the chance to explore.

Day 2

Day 2 started with AI for medical applications, along with what seemed to be an unofficial secondary theme continuing the discussion of bias and fairness - which made the talks just as interesting and fascinating as the previous day's. By this point I had figured out the conference-provided bus, resulting in more cool discussions on the way to and from the conference venue at UiT.

Every talk was interesting in its own way, with discussions of shortcut learning (where a model learns to recognise something other than your intended target - e.g. that some medical device in an X-ray is an indicator of some condition, when it wouldn't ordinarily be present at test time), techniques to utilise contrastive learning in new ways (classifying areas of interest in very large images from microscopes), and applications of the earlier discussions of bias and fairness to understanding bias in contrastive learning systems - and what we can do about it through framing the task the model is presented with.

The research project that stood out to me was entitled Local gamma augmentation for ischemic stroke lesion segmentation on MRI by Jon Middleton at the University of Copenhagen. Essentially, they correct for differing ranges of brightness in images from MRI scans of brains before training a model, to increase accuracy and reduce bias.

The poster session again had some amazing projects that are worth mentioning. Of course, as with this entire blog post this is just my own personal recount of the things that I found interesting - I encourage you to go to a conference in person at some point if you can!

The highlight was a poster entitled LiDAR-based Norwegian Tree Species Detection Using Deep Learning. The authors segment LiDAR data by tree species, but have also invented a clever augmentation technique they call 'cowmix augmentation' to stretch the model's attention to detail on class borders and the diversity of their dataset.

Another cool poster was Automatic segmentation of ice floes in SAR images for floe size distribution. By training an autoencoder to reconstruct SAR (Synthetic Aperture Radar) images, they use the resulting output to analyse the distribution of ice floe sizes in Antarctica.

I found that NLDL-2024 had quite a number of people working in various aspects of computer vision and image segmentation as you can probably tell from the research projects that have stood out to me so far. Given I went to present my rainfall radar data (more on that later), image handling projects stood out to me more easily than others. There seemed to be less of a focus on Natural Language Processing - which, although discussed at points, wasn't nearly as prominent a theme.

One NLP project that did feature was a talk on anonymising data in medical records before they are e.g. used in research projects. The researcher presented an approach using a generative text model to identify personal information in medical records. By combining it with a regular expression system, more personal information could be identified and removed than before.

While I'm not immediately working with textual data at the minute, part of my PhD does involve natural language processing. Maybe in the future when I have some more NLP-based research to present it might be nice to attend an NLP-focused conference too.

(Above: A collage of photos from day 2 of the conference.)

Images from top-left in an anticlockwise inwards spiral:

  1. Fjellheisen from the hotel at which the conference dinner took place
  2. A cool church building in a square I walked through to get to the bus
  3. The hallway from the conference area up to the plants in the day 1 collage
  4. Books in the store on the left of #3. I got postcards here!
  5. Everyone walking down towards the front of the conference theatre to have the group photo taken. I hope they release that photo publicly! I want a copy so bad...
  6. The lobby of the conference dinner hotel. It's easily the fanciest place I have ever seen....!
  7. The northern lights!!! The clouds parted for half an hour and it was such a magical experience.
  8. Moar northern lights of awesomeness
  9. They served fruit during the afternoon poster sessions! I am a big fan. I wonder if the University of Hull could do this in their events?
  10. Lunch on day 2: fish pie. It was very nice!

Social 2: Conference dinner

The second social event that was arranged was a conference dinner. It was again nice to have a chance to chat with others in my field in a smaller, more focused setting - each table had about 7 people sitting at it. The food served was also very fancy - nothing like what I'm used to eating on a day-to-day basis.

The thing I will always remember though is shortly before the final address, someone came running back into the conference dinner hall to tell us they'd seen the northern lights!

Grabbing my coat and rushing out the door to some bewildered looks, I looked up and.... there they were.

As if they had always been there.

I saw the northern lights!

Seeing them has always been something I wanted to do, so I am so happy to have had the chance. The rest of the time it was cloudy, but the clouds parted for half an hour that evening and it was such a magical moment.

They were simultaneously exactly and nothing like what I expected. They danced around the sky a lot, so you really had to have a good view in all directions and keep scanning the sky. They also moved much faster than I expected: some would flash and be gone in just moments, while others would stick around seemingly doing nothing for over a minute.

A technique I found helpful was to scan the sky with my phone's camera. It could see 'more' of the northern lights than my eyes could, so I could find a spot in the sky with a faint green glow on my phone and then stare at it - and more often than not it would brighten up quickly enough to see with my own eyes.

Day 3

It felt like the conference went by at lightning speed. For the entire time I was focused on learning and experiencing as much as I could, and just as soon as the conference started we all reached the last day.

The theme for the third day started with self-supervised learning. As I'm increasingly discovering, self-supervised learning is all about framing the learning task you give an AI model in a clever way that partially or completely does away with the need for traditional labels. There were certainly some clever solutions on show at NLDL-2024:

An honourable mention goes to the paper on a new speech-editing model called FastStitch. Unfortunately the primary researcher was not able to attend - a shame, cause it would have been cool to meet up and chat - but their approach looks useful for correcting speech in films, anime, etc after filming... even though it could also be used for some much more nefarious purposes too (though that goes for all of the new next-gen generative AI models coming out at the moment).

This was also the day I presented my research! As I write this, I realise that this post is now significantly long so I will dedicate a separate post to my experiences presenting my research. Suffice to say it was a very useful experience - both from the talks and the poster sessions.

Speaking of poster sessions, there was a really interesting poster that day entitled Deep Perceptual Similarity is Adaptable to Ambiguous Contexts, which proposes that image similarity is more complicated than just comparing pixels: it's about shapes and the object shown - not just the style a given image is drawn in. To this end, they use some kind of contrastive approach that compares how similar a series of augmentations are to the original source image as a training task.

Panel Discussion

Before the final poster session of the (main) conference, the talks were closed out by a panel discussion between 6 academics (1 chairing) who sounded very distinguished (sadly I did not completely catch their names, and I will save everyone the embarrassment of the nicknames I had to assign them to keep track of the discussion in my notes). There was no set theme that jumped out to me (other than AI, of course), but like the Diversity in AI discussion panel on day 1, the chair had some set questions to ask the academics making up the discussion.

The role of universities and academics was discussed at some length. Recently, large tech companies like OpenAI, Google, and others have been driven by profit to race to put next-generation foundation models (a term new to me that describes large general models like GPT, Stable Diffusion, Segment Anything, CLIP, etc) to work in anything and everything they can get their hands on..... often to the detriment of user privacy.

It was mentioned that researchers in academia have a unique freedom to choose what they research in a way that those working in industry do not. It was suggested that academia must be one step ahead of industry, understanding the strengths/weaknesses of the new technologies - such as foundation models - and how they impact society. With this freedom, researchers in academia can ask the how and why, which industry can't spare the resources for.

The weaknesses of academia were also touched on: academia is very project-based, and funding for long-term initiatives can be very difficult to come by. It was also mentioned that academia can get stuck on optimising e.g. a benchmark in the field of AI specifically. To this end, I would guess creativity is really important to invent innovative new ways of solving existing real-world problems, rather than focusing too much on abstract benchmarks.

The topic of the risks of AI in the future also came up. While the currently-scifi concept of an Artificial General Intelligence (AGI) that is smarter than humans is a hot topic at the moment, whether or not it's actually possible is not clear (personally, it seems rather questionable that it's possible at all) - and certainly not in the next few decades. Rather than worrying about AGI, everyone agreed that bias and unfairness in AI models is already a problem that needs to be urgently addressed.

The panel agreed that people believing the media hype generated by the large tech companies is arguably more dangerous than AGI itself... even if it were right around the corner.

The EU AI Act is right around the corner, which requires transparency about the data used to train a given AI model, among many other things. This is a positive step forwards, but the panel was concerned that the act could lead to companies cutting corners on safety to tick boxes. They were also concerned that an industry would spring up around the act, providing services to help other businesses comply - which risks raising the barrier to entry significantly. How the act is actually implemented will have a large effect on its effectiveness.

While the act risks e.g. ChatGPT being forced to pull out of the EU if it does not comply with the transparency rules, the panel agreed that we must take an alternate path to that of closed-source models. Open source alternatives to e.g. ChatGPT do exist, and are only about 1.5 years behind the current state of the art. It appears at first that privacy and openness are at odds, but in Europe we need both.

The panel was asked what advice they had for young early-career researchers (like me!) in the audience, and had a range of helpful tips:

  • Don't just follow trends because everyone else is. You might see something different in a neglected area, and that's important too!
  • Multidisciplinary research is a great way to see different perspectives.
  • Speak to real people on the ground as to what their problems are, and use that as inspiration.
  • Don't keep chasing after the 'next big thing' (although keeping up to date in your field is important, I think).
  • Work on the projects you want to work on - academia affords a unique freedom to the researchers working within it.

All in all, the panel was a fascinating big-picture discussion, covering the role of academia in current global issues in a way I hadn't really seen before this point.

AI in Industry Event

The last event of the conference came around much faster than I expected - I suppose spending every waking moment focused on conferencey things will make time fly by! This event was run by 4 different people from 4 different companies involved in AI in one way or another.

It was immediately obvious that these talks were by industry professionals rather than researchers, since they somehow managed to say a lot without revealing all that much about the internals of their companies. It was also interesting that some of the talks were almost a pitch to the researchers present, asking whether they had any ideas or solutions to the companies' problems.

This is not to say that the talks weren't useful. They were a useful insight into how industry works, and how the impact of research can be multiplied by being applied in an industry context.

It was especially interesting to listen to the discussion panel that was held between the 4 presenters / organisers of the industry event. 1 of them served as chair, moderating the discussion and asking the questions to direct the discussion. They discussed issues like silos of knowledge in industry vs academia, the importance of sharing knowledge between the 2 disciplines, and the challenges of AI explainability in practice. The panellists had valuable insights into the realities of implementing research outputs on the ground, the importance of communication, and some advice for PhD students in the audience considering a move into industry after their PhD.

(Above: A collage of photos I took during day 3 of the conference)

Images starting top-left, going anticlockwise in an inwards spiral:

  • A cool ship I saw on the way to the bus that morning
  • A neat building I saw on the way to the bus. The building design is just so different to what I'm used to.... it gives me loads of building inspiration for Minetest!
  • The cafeteria in which we ate lunch each day! It was so well designed, and the self-clear system was very functional and cool!
  • The conference theatre during one of the talks.
  • Day 3's lunch: lasagna! They do cool food in UiT in Tromsø, Norway!
  • The last meal I ate (I don't count breakfast :P) in the evening before leaving the following day
  • The library building I went past on the way back to the hotel in the evening. The integrated design with books + tables and interactive areas is just so cool - we don't have anything like that I know of over here!
  • A postbox! I almost didn't find it, but I'm glad I was able to send a postcard or two.
  • The final closing address! It was a bittersweet moment that the conference was already coming to a close.

Closing thoughts

Once the industry event was wrapped up, it was time for the closing address. Just as soon as it started, the conference was over! I felt a strange mix of exhaustion, disbelief that the conference was already over, and sadness that everyone would be going their separate ways.

The very first thing I did after eating something and getting back to my hotel was collapse in bed and sleep until some horribly early hour in the morning (~4-5am, anyone?) when I needed to catch the flight home.

Overall it was an amazing conference, and I've learnt so much! It's felt so magical, like anything is possible ✨ I've met so many cool people and been introduced to so many interesting ideas, it's gonna take some time to process them all.

I apologise for how long and rambly this post has turned out to be! I wanted to get all my thoughts down in something coherent enough I can refer to it in the future. This conference has changed my outlook on AI and academia, and I'm hugely grateful to my institution for finding the money to make it possible for me to go.

It feels impossible to summarise the entire conference in 4 bullet points, but here goes:

  • Fairness: What do we mean by 'fair'? Hidden biases, etc. Explainable AI sounds like an easy solution, but it can mislead you - and attempts to improve perceived 'fairness' can in actuality make the problem worse without you ever knowing!
  • Self-supervised learning: Clustering, contrastive learning, also tying in with the fairness theme (ref sample weighting and other techniques).
  • Foundation models: Large language models etc that are very large and expensive to train, sometimes available pretrained and sometimes only as an API. They can zero-shot many different tasks, but what about fairness, bias, ethics?
  • Reinforcement learning: ...and its applications

Advice

I'm going to end this mammoth post with some advice to prospective first-time conference goers. I'm still rather inexperienced with these sortsa things, but I do have a few things I've picked up.

If you're unsure about going to a conference, I can thoroughly recommend attending one. If you don't know which conference you'd like to attend, I recommend seeking advice from someone with more experience in your field. What I can say is that I really appreciated how NLDL-2024 was not too big and not too small: it had an estimated 250 attendees, and I'm very thankful it did not have multiple tracks - this way I didn't hafta sort through which talks interested me and which didn't. The talks that didn't sound interesting sometimes surprised me: if I had the choice I would have picked an alternative, but in the end I'm glad I sat through all of them.

Next, speak to people! You're all in this together. Speak to people at lunch. On the bus/train/whatever. Anywhere and everywhere you see someone with the same conference lanyard as you, strike up a conversation! The other conference attendees have likely worked just as hard as you to get there & will likely be totally willing to chat. You'll meet all sorts of new people who are just as passionate about your field as you are, which is an amazing experience.

TAKE NOTES. I used Obsidian for this purpose, but use anything that works for you. Take them both formally from talks, panel discussions, and other formal events, and informally during chats and discussions you hold with other conference attendees. Don't forget to include who you spoke to as well - I'm bad at names and faces, but your notes will serve as a permanent record of the things you learnt and experienced that you can refer back to later. You aren't going to remember everything you see (unless you have a perfect photographic memory, of course), so make notes!

On the subject of recording your experiences, take photos too. I'm now finding it very useful to have photos of important slides and posters that I saw to refer back to. I later developed a habit of photographing the first slide of every talk, which has also proved to be invaluable.

Having business cards to hand out can be extremely useful to follow up conversations. If you have time, get some made before you go and take them with you. I included some pretty graphics from my research on mine, which served as useful talking points to get conversations started.

Finally, have fun! You've worked hard to be here. Enjoy it!

If you have any thoughts / comments about my experiences at NLDL-2024, I'd love to hear them! Do leave a comment below.

LaTeX templates for writing with the University of Hull's referencing style

Hello, 2024! I'm writing this while it is still some time before the new year, but I realised just now (a few weeks ago for you) that I never blogged about the LaTeX templates I have been maintaining for a few years now.

It's no secret that I do all of my formal academic writing in LaTeX - a typesetting language that is the industry standard in the field of Computer Science (and others too, I gather). While it's a very flexible (and at times obtuse, but that's a tale for another time) language, actually getting started is a pain. To this end, I have developed over the years a pair of templates that make starting off much easier.

A key issue (and skill) in academic writing is properly referencing things, and most places have their own specific referencing style you have to follow. The University of Hull is no different, so I knew from the very beginning that I needed a solution.

I can't remember who I received it from, but someone (comment below if you remember who it was, and I'll properly credit!) gave me a .bst BibTeX referencing style file that matches the University of Hull's referencing style.

I've been using it ever since, and I have also applied a few patches to it for some edge cases I have encountered that it doesn't handle. I do plan on keeping it up to date for the foreseeable future with any changes they make to the aforementioned referencing style.

My templates also include this .bst file, serving as a complete starting point. There's one with a full page title (e.g. for theses, dissertations, etc), and another with just a heading that sits at the top of the document, just like a paper you might find on Semantic Scholar.

Note that I do not guarantee that the referencing style matches the University of Hull's style. All I can say is that it works for me and implements this specific referencing style.

With that in mind, I'll leave the README of the git repository to explain the specifics of how to get started with them:

https://git.starbeamrainbowlabs.com/Demos/latex-templates

They are stored on my personal git server, but you should be able to clone them just fine. Patches are most welcome via email (check the homepage of my website!)!

Happy Christmas / New Year 2023!

Heya!

I hope everyone has been having a restful winter break. This is just a quick post to wish you a happy Christmas and a great new year!

This year has been a very busy one for me - primarily with going part-time and focusing on writing my PhD thesis. I've still done a bunch of cool things this year though. Here are some of my highlights:

I'll be back in the new year with a bunch of cool things!

Looking ahead

The first event on my list is the Northern Lights Deep Learning Conference 2024, so expect to see coverage of that. Not sure how much time I'll have, but I'll definitely post on my Fediverse(/Mastodon) account: https://fediscience.org/@sbrl and write up later if I don't get a chance at the time.

Speaking of social media, my aforementioned account on the fediverse is my primary account. I post things there that do not make it to Twitter (especially given my automatic reposter has broken since API access has been restricted), so it's worth following me over there.

I will continue making Twitter/X → Fediscience redirect posts on Twitter, but as previously mentioned no Twitter/X API access means reposting content takes not-insignificant effort.

I'm also working with some PhD friends in my research group on SemEval 2024 Task 4. My role in this is identifying which word embedding system is best - and since this ties into my currently-partially-written blog post about the research side of word embeddings, I think I'm ultimately going to wait until I've crunched some numbers for this project before redrafting that post with some fancier visuals.

Looking further ahead, I'm hoping that 2024 will be the year I finally finish my PhD!

Final thoughts

As always, there's likely a lot I've been doing that I have forgotten to blog about here. Please do comment below if there's anything that you'd like to see on here that I haven't posted about / have posted about but it's confusing. I'll do my best to blog about it :D

See you in 2024!

A bauble from a friend's Christmas tree.

PhD Update 17: Light at the end of the tunnel

Wow..... it's been what, 5 months since I last wrote one of these? Oops. I'll do my best to write them at the proper frequency in the future! Things have been busy. Before I talk about what's been happening, here's the ever-lengthening list of posts in this series:

As I sit here at the very bitter end of the very last day of a long but fulfilling semester, I'm feeling quite reflective about the past year and how things have gone on my PhD. One of these posts is definitely long overdue.

Timescales

Naturally the first question here is about timescales. "What happened?" I hear you ask. "I thought you said you were aiming for intent to submit September 2023 for December 2023 finish?"

Well, about that.......

As it turns out, spending half of one's week working as Experimental Officer throws off one's estimation of how much work they do. To this end, it's looking more likely that I will be submitting my thesis in early-mid semester 2 this year. In other words, that's around about March 2024 time - give or take a month or two.

After submission, the next step will be my viva. Assuming I pass, it will likely be followed by corrections that must be completed based on the feedback from the viva.

What is a viva though? From what I understand, it is an oral exam in which you, your primary supervisor, and 2 examiners comb through your thesis with a fine-tooth comb and ask you lots of questions. I've heard it can take several hours to complete. While the standard is to have 1 examiner chosen internally from your department / institute and 1 chosen externally (picked by your primary supervisor), in my case both will be chosen from external sources, as I am now a (part-time) staff member in the Department of Computer Science at the University of Hull (my home institution).

While it's still a little ways out yet, I can't deny that the thought of my viva is making me rather nervous - having everything I've done over the past 4.5 years scrutinised by completely unknown people. In a sense, it feels like once it is time for my viva, there will be nothing more I can do. I will either know the answers to their questions.... or I will not.

Writing

As you might have guessed by now, writing has been the name - and, indeed, aim - of the game since the last post in this series. Everything is coming together rather nicely. It's looking like I'm going to end up with the following structure:

  1. Introduction (not written*)
  2. Background (almost there! currently working on this)
  3. Rainfall radar for 2d flood forecasting (needs expanding)
  4. Social media sentiment analysis (done!)
  5. Conclusion
  6. Acknowledgements, Appendices, etc
  7. Dictionary of terms; List of acronyms (grows organically as I write - I need to go through and make sure I \gls all the terms I've added later)
  8. Bibliography (currently 27 pages and counting O.o)
  * Technically I have written it, it's just outdated and very bad and needs throwing out of the window of the tallest building I can find. Rewrite is pending - see below.

A sneak preview of my thesis as a PDF.

(Above: A sneak preview of my thesis PDF. I'm writing in LaTeX - check out my templates with the University of Hull reference style here! Evidently the pictured section needs some work.....)

I've finished the chapter on the social media work, barring some minor adjustments I need to apply to ensure consistency. My current focus is the background chapter. This is most of the way there, but I need some more detail in several sections, so I'm working my way through them one at a time. This is resulting in a bunch more reading (especially for vision-based water detection via satellite data), so it is taking some time.

Once I've wrapped up the background section, it will be time to turn my attention to the content chapter #2: Rainfall radar for 2d flood forecasting. Currently, it sits at halfway between a conference paper (check it out! You can read it now, though a DOI is pending and should be available after the conference) and a thesis chapter - so I need to push (pull? drag?) it the rest of the way to the finish line. This will primarily entail 2 things:

  • Filling out the chapter-specific related works, which are currently rather brief given space and time limitations in a conference paper
  • Elaborating on things like the data preprocessing, experiments, discussion, etc.

This will also take some time, which together with the background section explains the uncertainty I still have in my finish date. Once these are both complete, I will be submitting my intent to submit! This will start a 3 month timer, by the end of which I must have submitted my thesis. During this timer period, I will be working on the introduction and conclusion chapters, which I do not expect to take nearly as long as any of the other chapters.

Once I am done writing and have submitted my thesis, I will do everything I can to ensure it is available under an open source licence for everyone to read. I believe strongly in the power of open source (and, open science) to benefit everyone, and want to share everything I've learned with all of you reading this.

At 102 single-spaced A4 pages so far and counting though (not including the aforementioned bibliography), it's a big time investment to read. To this end, I have various publications I've written and posted about here previously that cover most of the stuff I've done (namely the rainfall radar conference paper and social media journal article), and I also want to somehow condense the content of my thesis down into a 'mini-thesis' that's about 3-6 pages ish, and post that alongside my main thesis here on my website. I hope that this should provide the broad strokes and a navigation aid for the main document.

Predicting Persuasive Posts

All this writing is going to drive me crazy if I don't do something practical alongside it. Unfortunately I have long since run out of excuses to run more experiments on my PhD work, so a good friend of mine who is also doing a PhD (they've published this paper) came along at the perfect time the other day, asking for some help with a challenge competition submission they want to do. Of course, I had to agree to help out in a support role, as the project sounds really interesting1.

The official title of the challenge is thus: Multilingual Detection of Persuasion Techniques in Memes

The challenge is part of SemEval-2024 and it's basically about classifying memes from some social media network (it's unclear which one they are from) as to which persuasion tactic they are employing to manipulate the reader's opinions / beliefs.

The full challenge page can be found here: https://propaganda.math.unipd.it/semeval2024task4/index.html

We had a meeting earlier this week to discuss, and one of the key problems we identified was that to score challengers, the organisers will be using posts in multiple unseen languages. To this end, it strikes me that it is important to have multiple languages embedded in the same space for optimal results.

This is not what GloVe does - it embeds each language into a different 'space', so a model trained on data in one language won't necessarily work well with another - as I discovered in my demo for the Hull Science Festival (I definitely want to write about this in the final post in that series). So, as my role in the team, I'm going to push a number of different word embeddings through the system I developed for the aforementioned science demo, to identify which one is best for embedding multilingual text. Expect some additional entries to be added to the demo and an associated blog post on my findings very soon!

Currently, I have the following word embedding systems on my list:

  • Word2vec
  • FastText
  • CLIP
  • BERT/mBERT
  • XLM/XLM-RoBERTa

If you know of any other good word embedding models / algorithms, please do leave a comment below.

It also occurs to me while writing this that I'll have to make sure the multilingual dataset I used for the online demo has the same or similar words translated to every language to rule out any difference in embeddings there.
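To illustrate the kind of check I have in mind, here's a hedged sketch using the sentence-transformers library - the model name is just one multilingual example, not necessarily one of my final candidates:

import numpy as np
from sentence_transformers import SentenceTransformer

# A model advertised as embedding many languages into a shared space
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
vectors = model.encode(["water", "eau", "Wasser"])  # English, French, German

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# If the embedding space really is shared, translations of the same word
# should land close together (i.e. high cosine similarity)
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))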

A nice challenge for the Christmas holidays! My experience of collaborating with other researchers is rather limited at the moment, so I'm looking forward to working in a team to achieve a goal much faster than would otherwise be possible.

Beyond the edge

Something that has been a constant nagging presence in my mind - and steadily growing - is the question of what happens after my thesis. While the details have not been confirmed yet, once everything PhD-related is wrapped up I will most likely be increasing my hours such that I work Monday - Friday, rather than just Monday - Wednesday lunchtime as I have been doing so far.

This extra time will consist of 2 main activities. To the best of my current understanding, this will include some additional teaching responsibilities - I will probably be teaching a module that lies squarely within 1 of my strong points. It will also, crucially, include some dedicated time for research.

I believe I will be able to spend this research time on research-related activities: for example, collaborating with other researchers, reading papers, designing and running experiments, and writing up results into publication form. Essentially what I've been doing on my PhD, just minus the thesis writing!

Of course, the things I talk about here are not set in stone, and me talking about them here is not a declaration of such.

Either way, I do feel that the technical side is a strong point of mine that I am rather passionate about, so I very much desire to continue dedicating a significant portion of my energy towards practical research tasks.

I'm not sure how much I am allowed to talk about the teaching I will be doing, but do expect some updates on that here on my blog too - however high-level and broad strokesy they happen to be. What kind of teaching-related things would you be interested in being updated about here? Please do leave a comment below.

Talking more specifically, I do have a number of research ideas - one of which I have alluded to above - that I want to explore after my PhD. Most of these are based on what I have learnt from doing my PhD and the logical next steps to analyse complex real-time data sources with a view to extracting and processing information to increase situational awareness in natural disaster scenarios. When I get around to this, I will be blogging about my progress in detail here on my blog.

It should probably be mentioned that I am still quite a long way off actually putting any of these ideas into practice (I would definitely not recommend trusting any predictions my current rainfall radar → binarised water depth model makes in the real world yet!), but if you or someone you know works in the field of managing natural disasters, I would be fascinated to know what you would find most useful related to this - please leave a comment below.

Conclusion

This post has ended up being a lot longer than I expected! I've talked about my current writing progress, a rather interesting side-project (more details in a future blog post!), and initial conceptual future plans - both researchy and otherwise.

While my thesis is drawing close to completion (relatively, at least), I hope you will join me beyond the end of this long journey that is almost at an end. As one book closes, another opens. A new journey is only just beginning - one I can't wait to share with everyone here in future blog posts.

If you've got any thoughts, it would be cool if you could share them below.


  1. It goes without saying, but I won't let it impact my writing progress. I divide my day up into multiple slices - one of which is dedicated to focused PhD work - and I'll be pulling from a different slice of time than the one for my PhD writing to help out with this project. 

NLDL 2024: My rainfall radar paper is out!

A cool night sky and northern lights banner I made in Inkscape. It features mountains and AI-shaped constellations, with my logo and the text "@ NLDL 2024".

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task

...that's the title of the conference paper I've written about the rainfall radar research I've been doing as part of my PhD - and now that the review process is complete, my supervisor tells me I can share it!

This paper is the culmination of one half of my PhD (the other half is multimodal social media sentiment analysis, which resulted in a journal article). Essentially, the idea behind the whole project was to ask: "Can we make flood predictions all at once, in 2D?"

The answer, as it turns out, is yes*... but with a few caveats and a lot more work required before it's anywhere near ready to come to a smartphone near you.

It all sort of spiralled from there - and resulted in the development of a DeepLabV3+-based image semantic segmentation model that learns to approximate a physics-based water simulation.

The abstract of the paper is as follows:

Traditional predictive simulations and remote sensing techniques for forecasting floods are based on fixed and spatially restricted physics-based models. These models are computationally expensive and can take many hours to run, resulting in predictions made based on outdated data. They are also spatially fixed, and unable to scale to unknown areas.

By modelling the task as an image segmentation problem, an alternative approach using artificial intelligence to approximate the parameters of a physics-based model in 2D is demonstrated, enabling rapid predictions to be made in real-time.

I'll let the paper explain the work I've done in detail (I've tried my best to make it understandable by a wide audience). You can read it here:

https://openreview.net/forum?id=TpOsdB4gwR

(Direct link to PDF)

Long-time readers of my blog here will know that I haven't had an easy time of getting the model to work. If you'd like to read about the struggles of developing this and other models over the course of my PhD so far, I've been blogging about the whole process semi-regularly. We're currently up to part 16:

PhD Update 16: Realising the possibilities of the past

Speaking of which, it's high time I wrote another PhD update blog post, isn't it? A lot has been going on, and I'd really like to document it all here on my blog. I've also found that writing these updates really helps me take a step back and look at the big picture of my research - something that has been valuable in more ways than one. I'll definitely discuss this and my progress in the next part of my PhD update blog post series, which I tag with PhD to make the posts easy to find.

Until then, I'll see you in the next post!

Website update: Share2Fediverse, and you can do it too!

Heya! Got another short post for you here. You might notice that all posts now have a new share button (those buttons that take you to different places with a link to this site, so you can share it elsewhere) that looks like this:

The 5-pointed rainbow fediverse logo

If you haven't seen it before, this is the logo for the Fediverse, a decentralised network of servers and software that all interoperate (find out more here: https://fedi.tips/).

Since creating my Mastodon account, I've wanted some way to allow everyone here to share my posts on the Fediverse if they feel so inclined. Unlike centralised social media platforms like Reddit though, the Fediverse doesn't have a 'central' server that you can link to.

Instead, you need a landing page to act as a middleman. There are a few options out there already (e.g. share2fedi), but I wanted something specific and static, so I built my own solution. It looks like this:

A screenshot of Share2Fediverse. The background is rainbow like the fediverse logo, with translucent pentagons scattered across it. The landing page window is centred, with a title and a share form.

(Above: A screenshot of Share2Fediverse.)

It's basically a bit of HTML + CSS for styling, plus a splash of Javascript to make the interface function and to remember - via localStorage - the instance and software you select for next time.
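As a rough sketch of the remember-for-next-time part, the localStorage logic boils down to something like this. Note that the key name and function names here are hypothetical - they aren't necessarily what the actual site uses:

// Persist the user's chosen instance + software so the form can be
// pre-filled on their next visit. Key/function names are hypothetical.
const STORAGE_KEY = "share2fediverse-settings";

function save_settings(instance, software) {
	localStorage.setItem(STORAGE_KEY, JSON.stringify({ instance, software }));
}

function load_settings() {
	const raw = localStorage.getItem(STORAGE_KEY);
	return raw === null ? null : JSON.parse(raw);
}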

Check it out at this demo link:

https://starbeamrainbowlabs.com/share2fediverse/#text=The%20fediverse%20is%20cool!%20%E2%9C%A8

Currently, it supports sharing to Mastodon, GNU Social, and Diaspora. As it turns out, finding the share URL (e.g. for Mastodon on fediscience.org it's https://fediscience.org/share?text=some%20text%20here) is more difficult than it sounds, as I haven't found them to be well advertised. I'd love to add e.g. Pixelfed, Kbin, GoToSocial, Pleroma, and more... but I need the share URL! If you know the share URL for any piece of Fediverse software, please do leave a comment below.
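For the curious, turning an instance domain and some text into one of these share links is just string interpolation plus URL-encoding. Here's a minimal sketch, assuming the Mastodon-style /share?text= endpoint shown above (the function name is my own, for illustration):

// Build a Mastodon-style share link from an instance domain and some text.
// Assumes the /share?text= endpoint format mentioned above.
function make_share_url(instance, text) {
	return `https://${instance}/share?text=${encodeURIComponent(text)}`;
}

console.log(make_share_url("fediscience.org", "The fediverse is cool! ✨"));
// → https://fediscience.org/share?text=The%20fediverse%20is%20cool!%20%E2%9C%A8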

If you're interested in the source code, you can find it here:

https://github.com/sbrl/Share2Fediverse/

...if you'd really like to help out, you could even open a pull request! The file you want to edit is src/lib/software_db.mjs - though if you leave a comment here or open an issue I'll pick it up and add any requests.
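I won't reproduce the real structure of that file here, but conceptually each entry just needs to map a piece of software to its share URL pattern - something along these lines (all names below are hypothetical; check src/lib/software_db.mjs in the repo for the actual shape):

// Hypothetical sketch only - see src/lib/software_db.mjs for the real structure.
export default {
	mastodon: {
		name: "Mastodon",
		// Given an instance domain and the text to share, return the share URL
		make_share_url: (instance, text) =>
			`https://${instance}/share?text=${encodeURIComponent(text)}`,
	},
	// ...one entry per supported piece of software
};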

See you on the Fediverse! o/

I'm going to NLDL 2024!

A cool night sky and northern lights banner I made in Inkscape. It features mountains and AI-shaped constellations, with my logo and the text "@ NLDL 2024".

Heya! Been a moment 'cause I've been doing a lot of writing and revising of writing for my PhD recently (the promised last part of the scifest demo series is coming eventually, promise!), but I'm here to announce that, as you might have guessed by the cool new banner, I have today (yesterday? time is weird when you stay up this late) had a paper accepted for the Northern Lights Deep Learning Conference 2024, which is to be held on 9th - 11th January 2024 in Tromsø, Norway!

I have a lot of paperwork to do between now and then (and many ducks to get in a row), but I have every intention of attending the conference in person to present the rainfall radar research I've been rambling on about in my PhD update series.

I am unsure whether I'm allowed to share the paper at this stage - if anyone knows, please do get in touch. In the meantime, I'm pretty sure I can share the title without breaking any rules:

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task

I also have a Cool Poster, which I'll share here after the event in the new (work-in-progress) Research section of the main homepage - once I've thrown some CSS at it.

I do hope that this cool new banner gets some use bringing you more posts about (and, hopefully, from!) NLDL 2024 :D

--Starbeamrainbowlabs
