Starbeamrainbowlabs

Stardust
Blog


Archive


Mailing List Articles Atom Feed Comments Atom Feed Twitter Reddit Facebook

Tag Cloud

3d 3d printing account algorithms android announcement architecture archives arduino artificial intelligence artix assembly async audio automation backups bash batch blender blog bookmarklet booting bug hunting c sharp c++ challenge chrome os cluster code codepen coding conundrums coding conundrums evolved command line compilers compiling compression conference conferences containerisation css dailyprogrammer data analysis debugging defining ai demystification distributed computing dns docker documentation downtime electronics email embedded systems encryption es6 features ethics event experiment external first impressions freeside future game github github gist gitlab graphics guide hardware hardware meetup holiday holidays html html5 html5 canvas infrastructure interfaces internet interoperability io.js jabber jam javascript js bin labs latex learning library linux lora low level lua maintenance manjaro minetest network networking nibriboard node.js open source operating systems optimisation outreach own your code pepperminty wiki performance phd photos php pixelbot portable privacy problem solving programming problems project projects prolog protocol protocols pseudo 3d python reddit redis reference release releases rendering research resource review rust searching secrets security series list server software sorting source code control statistics storage svg systemquery talks technical terminal textures thoughts three thing game three.js tool tutorial twitter ubuntu university update updates upgrade version control virtual reality virtualisation visual web website windows windows 10 worldeditadditions xmpp xslt

NLDL was awesome! >> NLDL-2024 writeup

A cool night sky and northern lights banner I made in Inkscape. It features mountains and AI-shaped constellations, with my logo and the text "@ NLDL 2024".

It's the week after I attended the Northern Lights Deep Learning Conference 2024, and now that I've had time to start to process everything I leant and experienced, it's blog post time~~✨ :D

Edit: Wow, this post took a lot more effort to put together than I expected! It's now the beginning of February and I'm just finishing this post up - sorry about the wait! I think this is the longest blog post to date. Consider this a mega-post :D

In this post that is likely to be quite long I'm going to talk a bit about the trip itself, and about what happened, and the key things that I learnt. Bear in mind that this post is written while I've still sorting my thoughts out - it's likely going to take many months to fully dive into everything I saw that interested me :P

Given I'm still working my way through everything I've learnt, it is likely that I've made some errors here. Don't take my word for any facts written here please! If you're a researcher I'm quoting here and you have spotted a mistake, please let me know.

I have lots of images of slides and posters that interested me, but they don't make a very good collage! To this end images shown below are from the things I saw & experienced. Images of slides & posters etc available upon request.

Note: All paper links will be updated to DOIs when the DOIs are released. All papers have been peer-reviewed.

Day 1

A collage of images from travelling to the conference. Description below. (Above: A collage of images from the trip travelling to the conference. Description below.)

Images starting top-left, going anticlockwise:

  1. The moon & venus set against the first.... and last sunrise I would see for the next week
  2. Getting off the first plane in Amsterdam Schiphol airport
  3. Flying into Bergen airport
  4. The teeny Widerøe turboprop aircraft that took me from Bergen to Tromsø
  5. A view from the airport window when collecting my bags
  6. Walking to the hotel
  7. Departures board in Bergen airport

After 3 very long flights the day before (the views were spectacular, but they left me exhausted), the first day of the conference finally arrived. As I negotiated public transport to get myself to UiT, The Arctic University of Norway I wasn't sure what to expect, but as it turned out academic conferences are held in (large) lecture theatres (at least this one was) with a variety of different events in sequence:

  • Opening/closing addresses: Usually at the beginning & ends of a conference. Particularly the beginning address can include useful practical information such as where and when food will be made available.
  • Keynote: A longer (usually ~45 minute) talk that sets the theme for the day or morning/afternoon. Often done by famous and/or accomplished researchers.
  • Oral Session: A session chaired by a person [usually relatively distinguished] in which multiple talks are given by individual researchers with 20 minutes per talk. Each talk is 15 minutes with 5 minutes for questions. I spoke in one of these!
  • Poster session: Posters are put up by researchers in a designated area (this time just outside the room) and people can walk around and chat with researchers about their researchers. If talks have a wide reach and shallow depth, posters have a narrow reach and much more depth.
    • I have bunch of photographs of posters that interested me for 1 reason or another, but it will take me quite a while to work through them all to properly dig into them all and extract the interesting bits.
  • Panel discussion: This was where a number of distinguished people sit down on chairs/stools at the front, and a chair asks a series of questions and moderates the resulting discussion. Questions from the audience may also be allowed after some of the key preset questions have been asked.

This particular conference didn't have multiple events happening at once (called 'tracks' in some conferences I think), which I found very helpful as I didn't need to figure out which events I should attend or not. Some talks didn't sound very interesting but then turned out to be some of the highlights of the conference for me, as I'll discuss below. Definitely a fan of this format!

The talks started off looking at the fundamentals of AI. Naturally, this included a bunch of complex mathematics - the understanding of which in real-time is not my strong point - so while I did make some notes on these I need to go back and take a gander at the papers of some of the talks to fully grasp what was going on.

Moving on from AI fundamentals, the next topic was reinforcement learning. While not my current area of research area, some interesting new uses of the technology were discussed, such as dynamic pathing/navigation based on the information gained from onboard sensors by Alouette von Hove from the University of Oslo - the presented example was determining the locations of emitters of greenhouse gasses such as CO₂ & methane.

Interspersed in-between the oral sessions were poster sessions. At NLDL-2024 these were held in the afternoons and also had fruit served alongside them, which I greatly appreciated (I'm a huge fan of fruit). At these there were posters for the people who had presented earlier in the day, but also some additional posters from researchers who were presenting a talk.

If talks research a wide audience at a shallow depth, the posters reached a narrower audience but at a much greater depth. I found the structure of having the talk before the poster very useful - not only for presenting my own research (more on that later), but also for picking out some of the posters I wanted to visit to learn more about their approaches.

On day 1, the standout poster for me was one on uncertainty quantification in image segmentation models - Confident Naturalness Explanation (CNE): A Framework to Explain and Assess Patterns Forming Naturalness. While their approach to increasing the explainability of image segmentation models (particularly along class borders) was applied to land use and habitat identification, I can totally see it being applicable to many other different projects in a generic 'uncertainty-aware image segmentation' form. I would very much like to look into this one deeply and consider applying lessons learnt to my rainfall radar model.

Another interesting poster worked to segment LiDAR data in a similar fashion to that of normal 'image segmentation' (that I'm abusing in my research) - Beyond Siamese KPConv: Early Encoding and Less Supervision.

Finally, an honourable mention is one which applied reinforcement learning to task scheduling - Scheduling conditional task graphs with deep reinforcement learning.

Diversity in AI

In the afternoon, the Diversity in AI event was held. The theme was fairness of AI models, and this event was hugely influential for me. Through a combination of cutting edge research and helpful case-studies and illustrations, the speakers revealed hidden sources of bias and novel ways to try and correct for them. They asked the question of "what do we mean by a fair AI model?", and discovered the multiple different facets to the question and how fairness in an AI model can mean different things in different contexts and to different people.

They also demonstrated how taking a naïve approach to correcting for e.g. bias in a binary classifier could actually make the problem worse!

I have not yet had time to go digging into this, but I absolutely want to spend at least an entire afternoon dedicated to digging into and reading around the subject. Previously, I had no idea how big and pervasive the problem of bias in AI was, so I most certainly want to educate myself to ensure models that I create as a consequence of the research I do are as ethical as possible.

Depending on how this research reading goes, I could write a dedicated blog post on it in the future. If this would be interesting to you, please comment below with the kinds of things you'd be interesting in.

Another facet of the diversity event was that of hiring practices and diversity in academia. In the discussion panel that closed out the day, the current dilemma of low diversity (e.g. gender balance) in students taking computer science as a subject. It was suggested that how computer science is portrayed can make a difference, and that people with different backgrounds on the subject will approach and teach the subject through different lenses. Mental health was also mentioned as being a factor that requires work and effort to reduce stigma, encourage discussions, and generally improve the situation.

All in all I found the diversity event to be a very useful and eye-opening event that I'm glad I attended.

A collage from day 1 of the conference

(Above: A collage from day 1 of the conference)

Images starting top-left, going anticlockwise in an inwards spiral:

  1. The conference theatre during a break
  2. Coats hung up on the back wall of the conference theatre - little cultural details stood out to me and were really cool!
  3. On the way in to the UiT campus on day 1
  4. Some plants under some artificial sunlight bulbs I found while having a wander
  5. Lunch on day 1: rice (+ curry, but I don't like curry)
  6. NLDL-2024 sign
  7. View from the top on Fjellheisen
  8. Cinnamon bun that was very nice and I need to find a recipe
  9. View from the cable car on the way up Fjellheisen

Social 1: Fjellheisen

The first social after the talks closed out for the day was that of the local mountain Fjellheisen (pronounced fyell-hai-sen as far as I can tell). Thankfully a cable car was available to take conference attendees (myself included) up the mountain, as it was significantly cold and snowy - especially 420m above sea level at the top. Although it was very cloudy at the time with a stratus cloud base around 300m (perhaps even lower than that), we still got some fabulous views of Tromsø and the surrounding area.

There was an indoor seating area too, in which I warmed up with a cinnamon bun and had some great conversations with some of the other conference attendees. Social events and ad-hoc discussions are, I have discovered, an integral part of the conference experience. You get to meet so many interesting people and discover so many new things that you wouldn't otherwise get the chance to explore.

Day 2

Day 2 started with AI for medical applications, and what seemed to be an unofficial secondary theme continuing the discussion of bias and fairness which made the talks just as interesting and fascinating as the previous day. By this point I figured out the conference-provided bus, resulting in more cool discussions on the way to and from the conference venue at UiT.

Every talk was interesting in it's own way, with discussions of shortcut learning (where a model learns to recognise something else other than your intended target - e.g. that some medical device in an X-Ray is an indicator of some condition when it wouldn't ordinarily present at test time), techniques to utilise contrastive learning in new ways (classifying areas of interest in very large images from microscopes) and applying the previous discussion of bias and fairness to understanding bias in contrastive learning systems, and what we can do about it through framing the task the model is presented with.

The research project that stood out to be was entitled Local gamma augmentation for ischemic stroke lesion segmentation on MRI by Jon Middleton at the University of Copenhagen. Essentially they correct for differing ranges of brightness in images from MRI scans of brains before training a model to increase accuracy and reduce bias.

The poster session again had some amazing projects that are worth mentioning. Of course, as with this entire blog post this is just my own personal recount of the things that I found interesting - I encourage you to go to a conference in person at some point if you can!

The highlight was a poster entitled LiDAR-based Norwegian Tree Species Detection Using Deep Learning. The authors segment LiDAR data by tree species, but have also invented a clever augmentation technique they call 'cowmix augmentation' to stretch the model's attention to detail on class borders and the diversity of their dataset.

Another cool poster was Automatic segmentation of ice floes in SAR images for floe size distribution. By training an autoencoder to reconstruct SAR (Synthetic Aperture Radar) images, they use the resulting output to analyse the distribution in sizes of icebergs in Antarctica.

I found that NLDL-2024 had quite a number of people working in various aspects of computer vision and image segmentation as you can probably tell from the research projects that have stood out to me so far. Given I went to present my rainfall radar data (more on that later), image handling projects stood out to me more easily than others. There seemed to be less of a focus on Natural Language Processing - which, although discussed at points, wasn't nearly as prominent a theme.

One NLP project that was a thing though was a talk on anonymising data in medical records before they are e.g. used in research projects. The researcher presented an approach using a generative text model to identify personal information in medical records. By combining it with a regular expression system, more personal information could be identified and removed than before.

While I'm not immediately working with textual data at the minute, part of my PhD does involve natural language processing. Maybe in the future when I have some more NLP-based research to present it might be nice to attend an NLP-focused conference too.

A collage of photos from day 2 of the conference.

(Above: A collage of photos from day 2 of the conference.)

Images from top-left in an anticlockwise inwards spiral:

  1. Fjellheisen from the hotel at which the conference dinner took place
  2. A cool church building in a square I walked through to get to the bus
  3. The hallway from the conference area up to the plants in the day 1 collage
  4. Books in the store on the left of #3. I got postcards here!
  5. Everyone walking down towards the front of the conference theatre to have the group photo taken. I hope they release that photo publicly! I want a copy so bad...
  6. The lobby of the conference dinner hotel. It's easily the fanciest place I have ever seen....!
  7. The northern lights!!! The clouds parted for half and hour and it was such a magical experience.
  8. Moar northern lights of awesomeness
  9. They served fruit during the afternoon poster sessions! I am a big fan. I wonder if the University of Hull could do this in their events?
  10. Lunch on day 2: fish pie. It was very nice!

Social 2: Conference dinner

The second social event that was arranged was a conference dinner. It was again nice to have a chance to chat with others in my field in a smaller, more focused setting - each table had about 7 people sitting at it. The food served was also very fancy - nothing like what I'm used to eating on a day-to-day basis.

The thing I will always remember though is shortly before the final address, someone came running back into the conference dinner hall to tell us they'd seen the northern lights!

Grabbing my coat and rushing out the door to some bewildered looks, I looked up and.... there they were.

As if they had always been there.

I saw the northern lights!

Seeing them has always been something I wanted to do, so I am so happy to have a chance to see them. The rest of the time it was cloudy, but the clouds parted for half an hour that evening and it was such a magical moment.

They were simultaneously exactly and nothing like what I expected. They danced around the sky a lot, so you really had to have a very good view in all directions and keep scanning the sky. They also moved much faster than I expected. They could flash and be gone in just moments, while others would just stick and hang around seemingly doing nothing for over a minute.

A technique I found to be helpful was to scan the sky with my phone's camera. It could see 'more' of the northern lights than you can see with your eyes, so you could find a spot in the sky that had a faint green glow with your phone and then stare at it - and more often than not it would brighten up quickly so you could see it with your own eyes.

Day 3

It felt like the conference went by at lightning speed. For the entire time I was focused on learning and experiencing as much as I could, and just as soon as the conference started we all reached the last day.

The theme for the third day started with self-supervised learning. As I'm increasingly discovering, self-supervised learning is all about framing the learning task you give an AI model in a clever way that partially or completely does away with the need for traditional labels. There were certainly some clever solutions on show at NLDL-2024:

An honourable mention goes to the paper on a new speech-editing model called FastStitch. Unfortunately the primary researcher was not able to attend - a shame, cause it would have been cool to meet up and chat - but their approach looks useful for correcting speech in films, anime, etc after filming... even though it could also be used for some much more nefarious purposes too (though that goes for all of the new next-gen generative AI models coming out at the moment).

This was also the day I presented my research! As I write this, I realise that this post is now significantly long so I will dedicate a separate post to my experiences presenting my research. Suffice to say it was a very useful experience - both from the talks and the poster sessions.

Speaking of poster sessions, there was a really interesting poster today entitled Deep Perceptual Similarity is Adaptable to Ambiguous Contexts which proposes that image similarity is more complicated just comparing pixels: it's about shapes and the object shown -- not just the style a given image is drawn in. To this end, they use some kind of contrastive approach to compare how similar a series of augmentations are to the original source image as a training task.

Panel Discussion

Before the final poster session of the (main) conference, a panel discussion between 6 academics (1 chairing) who sounded very distinguished (sadly I did not completely catch their names, and I will save everyone the embarrassment of the nickname I had to assign them to keep track of the discussion in my notes) closed out the talks. There was no set theme that jumped out to me (other than AI of course), but like the diversity in AI conference discussion panel on day 1 the chair had some set questions to ask the academics making up the discussion.

The role of Universities and academics was discussed at some length. Recently large tech companies like OpenAI, Google, and others are driven by profit to race to put next-generation foundational models (a term new to me that describes large general models like GPT, Stable Diffusion, Segment Anything, CLIP, etc) to work in anything and everything they can get their hands on..... and often to the detriment of user privacy.

It was mentioned that researchers in academia have a unique freedom to choose what they research in a way that those working in industry do not. It was suggested that academia must be one step ahead of industry, and understand the strengths/weaknesses of the new technologies -- such as foundational models, and how they impact society. With this freedom, researchers in academia can ask the how and why, which industry can't spare the resources for.

The weaknesses of academia was also touched on, in that academia is very project-based - and funding for long-term initiatives can be very difficult come by. It was also mentioned that academia can get stuck on optimising e.g. a benchmark in the field of AI specifically. To this end, I would guess creativity is really important to invent innovative new ways of solving existing real-world problems rather than focusing too much on abstract benchmarks.

The topic of the risks of AI in the future also came up. While the currently-scifi concept of the Artificial General Intelligence (AGI) that is smarter than humans is a hot topic at the moment, whether or not it's actually possible is not clear (personally, it seem rather questionable that it's even possible at all) - and certainly not in the next few decades. Rather than worrying about AGI, everyone agreed that bias and unfairness in AI models is already a problem that needs to be urgently addressed.

The panel agreed that people believing the media hype generated by the large tech companies is arguably more dangerous than AGI itself... even if it were right around the corner.

The EU AI Act is right around the corner, which requires transparency of data used to train a given AI model, among many other things. This is a positive step forwards, but the panel was concerned that the act could lead to companies cutting corners on safety to tick boxes. They were also concerned that an industry would spring up around the act providing services of helping other businesses to comply with the act, which risked raising the barrier to entry significantly. How the act is actually implemented with have a large effect on its effectiveness.

While the act risks e.g. ChatGPT being forced to pull out of the EU if it does not comply with the transparency rules, the panel agreed that we must take alternate path than that of closed-source models. Open source alternatives to e.g. ChatGPT do exist and are only about 1.5 years behind the current state of the art. It appears at first that privacy and openness are at odds, but in Europe we need both.

The panel was asked what advice they had for young early-career researchers (like me!) in the audience, and had a range of helpful tips:

  • Don't just follow trends because everyone else is. You might see something different in a neglected area, and that's important too!
  • Multidisciplinary research is a great way to see different perspectives.
  • Speak to real people on the ground as to what their problems are, and use that as inspiration
  • Don't keep chasing after the 'next big thing' (although keeping up to date in your field is important, I think).
  • Work on the projects you want to work on - academia affords a unique freedom to researchers working within

All in all, the panel was a fascinating big-picture discussion, and there was discussion of the bigger picture of the role of academia in big-picture current global issues I haven't really seen before this point.

AI in Industry Event

The last event of the conference came around much faster than I expected - I suppose spending every waking moment focused on conferencey things will make time fly by! This event was run by 4 different people from 4 different companies involved in AI in one way or another.

It was immediately obvious that these talks were by industry professionals rather than researchers, since they somehow managed say a lot without revealing all that much about the internals of their companies. It was also interesting that some of them were almost a pitch to the researchers present to ask if they had any ideas or solutions to their problems.

This is not to say that the talks weren't useful. They were a useful insight into how industry works, and how the impact of research can be multiplied by being applied in an industry context.

It was especially interesting to listen to the discussion panel that was held between the 4 presenters / organisers of the industry event. 1 of them served as chair, moderating the discussion and asking the questions to direct the discussion. They discussed issues like silos of knowledge in industry vs academia, the importance of sharing knowledge between the 2 disciplines, and the challenges of AI explainability in practice. The panellists had valuable insights into the realities of implementing research outputs on the ground, the importance of communication, and some advice for PhD students in the audience considering a move into industry after their PhD.

A collage of photos I took during day 3

(Above: A collage from day 1 of the conference)

Images starting top-left, going anticlockwise in an inwards spiral:

  • A cool ship I saw on the way to the bus that morning
  • A neat building I saw on the way to the bus. The building design is just so different to what I'm used to.... it gives me loads of building inspiration for Minetest!
  • The cafeteria in which we ate lunch each day! It was so well designed, and the self-clear system was very functional and cool!
  • The conference theatre during one of the talks.
  • Day 3's lunch: lasagna! They do cool food in UiT in Tromsø, Norway!
  • The last meal I ate (I don't count breakfast :P) in the evening before leaving the following day
  • The library building I went past on the way back to the hotel in the evening. The integrated design with books + tables and interactive areas is just so cool - we don't have anything like that I know of over here!
  • A postbox! I almost didn't find it, but I'm glad I was able to sent a postcard or two.
  • The final closing address! It was a bittersweet moment that the conference was already coming to a close.

Closing thoughts

Once the industry event was wrapped up, it was time for the closing address. Just as soon as it started, the conference was over! I felt a strange mix of exhaustion, disbelief that the conference was already over, and sadness that everyone would be going their separate ways.

The very first thing I did after eating something and getting back to my hotel was collapse in bed and sleep until some horribly early hour in the morning (~4-5am, anyone?) when I needed to catch the flight home.

Overall it was an amazing conference, and I've learnt so much! It's felt so magical, like anything is possible ✨ I've met so many cool people and been introduced to so many interesting ideas, it's gonna take some time to process them all.

I apologise for how long and rambly this post has turned out to be! I wanted to get all my thoughts down in something coherent enough I can refer to it in the future. This conference has changed my outlook on AI and academia, and I'm hugely grateful to my institution for finding the money to make it possible for me to go.

It feels impossible to summarise the entire conference in 4 bullet points, but here goes:

  • Fairness: What do we mean by 'fair'? Hidden biases, etc. Explainable AI sounds like an easy solution but can not only mislead you but attempts to improve perceived 'fairness' can in actuality make the problem worse and you would never know!
  • Self-supervised learning: Clustering, contrastive learning, also tying in with the fairness theme ref sample weighting and other techniques etc.
  • Fundamental models: Large language models etc that are very large and expensive to train, sometimes available pretrained sand sometimes only as an API. They can zero-shot many different tasks, but what about fairness, bias, ethics?
  • Reinforcement learning: ...and it's applications

Advice

I'm going to end this mammoth post with some advice to prospective first-time conference goers. I'm still rather inexperienced with these sortsa things, but I do have a few things I've picked up.

If you've unsure about going to a conference, I can thoroughly recommend attending one. If you don't know which conference you'd like to attend, I recommend seeking someone with more experience than you in your field but what I can say is that I really appreciated how NLDL-2024 was not too big and not too small. It had an estimated 250 conference attendees, and I'm very thankful it did not have multiple tracks - this way I didn't hafta sort through which talks interested me and which ones didn't. The talks that did interest me sometimes surprised me: if I had the choice I would have picked an alternative, but in the end I'm glad I sat through all of them.

Next, speak to people! You're all in this together. Speak to people at lunch. On the bus/train/whatever. Anywhere and everywhere you see someone with the same conference lanyard as you, strike up a conversation! The other conference attendees have likely worked just as hard as you to get here & will likely be totally willing to you. You'll meet all sorts of new people who are just as passionate about your field as you are, which is an amazing experience.

TAKE NOTES. I used Obsidian for this purpose, but use anything that works for you. This includes both formally from talks, panel discussions, and other formal events, but also informally during chats and discussions you hold with other conference attendees. Don't forget to include who you spoke to as well! I'm bad at names and faces, but your notes will serve as a permanent record of the things you learnt and experienced at the time that you can refer back to again later. You aren't going to remember everything you see (unless you have a perfect photographic memory of course), so make notes!

On the subject of recording your experiences, take photos too. I'm now finding it very useful to have photos of important slides and posters that I saw to refer back to. I later developed a habit of photographing the first slide of every talk, which has also proved to be invaluable.

Having business cards to hand out can be extremely useful to follow up conversations. If you have time, get some made before you go and take them with you. I included some pretty graphics from my research on mine, which served as useful talking points to get conversations started.

Finally, have fun! You've worked hard to be here. Enjoy it!

If you have any thoughts / comments about my experiences at NLDL-2024, I'd love to hear them! Do leave a comment below.

LaTeX templates for writing with the University of Hull's referencing style

Hello, 2024! I'm writing this while it is still some time before the new year, but I realised just now (a few weeks ago for you), that I never blogged about the LaTeX templates I have been maintaining for a few years by now.

It's no secret that I do all of my formal academic writing in LaTeX - a typesetting language that is the industry standard in the field of Computer Science (and others too, I gather). While it's a very flexible (and at times obtuse, but this is a tale for another time) language, actually getting started is a pain. To make this process easier, I have developed over the years a pair of templates for writing that make starting off much easier.

A key issue (and skill) in academic writing is properly referencing things, and most places have their own specific referencing style you have to follow. The University of Hull is no different, so I knew from the very beginning that I needed a solution.

I can't remember who I received it from, but someone (comment below if you remember who it was, and I'll properly credit!) gave me a .bst BibTeX referencing style file that matches the University of Hull's referencing style.

I've been using it ever since, and I have also applied a few patches to it for some edge cases I have encountered that it doesn't handle. I do plan on keeping it up to date for the forseeable future with any changes they make to the aforementioned referencing style.

My templates also include this .bst file to serve as a complete starting template. There's one with a full page title (e.g. for thesis, dissertations, etc), and another with just a heading that sits at the top of the document just like a paper you might find on Semantic Scholar.

Note that I do not guarantee that the referencing style matches the University of Hull's style. All I can say is that it works for me and implements this specific referencing style.

With that in mind, I'll leave the README of the git repository to explain the specifics of how to get started with them:

https://git.starbeamrainbowlabs.com/Demos/latex-templates

They are stored on my personal git server, but you should be able to clne them just fine. Patches are most welcome via email (check the homepage of my website!)!

NLDL 2024: My rainfall radar paper is out!

A cool night sky and northern lights banner I made in Ink
scape. It features mountains and AI-shaped constellations, with my logo and the text "@ NLDL 2024".

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task

......that's the title of the conference paper I've written about my rainfall radar research that I've been doing as part of my PhD, and now that the review process is complete I'm told by my supervisor that I can now share it!

This paper is the culmination of one half of my PhD (the other is multimodal social media sentiment analysis and it resulted in a journal article). Essentially, the idea behind the whole project was asking the question of "Can we make flood predictions all at once in 2D?".

The answer, as it turns out, is yes*.... but with a few caveats and a lot more work required before it's anywhere near ready to be coming to a smartphone near you.

It all sort of spiralled from there - and resulted in the development of a DeepLabV3+-based image semantic segmentation model that learns to approximate a physics-based water simulation.

The abstract of the paper is as follows:

Traditional predictive simulations and remote sensing techniques for forecasting floods are based on fixed and spatially restricted physics-based models. These models are computationally expensive and can take many hours to run, resulting in predictions made based on outdated data. They are also spatially fixed, and unable to scale to unknown areas.

By modelling the task as an image segmentation problem, an alternative approach using artificial intelligence to approximate the parameters of a physics-based model in 2D is demonstrated, enabling rapid predictions to be made in real-time.

I'll let the paper explain the work I've done in detail (I've tried my best to make it understandable by a wide audience). You can read it here:

https://openreview.net/forum?id=TpOsdB4gwR

(Direct link to PDF)

Long-time readers of my blog here will know that I haven't had an easy time of getting the model to work. If you'd like to read about the struggles of developing this and other models over the course of my PhD so far, I've been blogging about the whole process semi-regularly. We're currently up to part 16:

PhD Update 16: Realising the possibilities of the past

Speaking of which, it's high time I wrote another PhD update blog post, isn't it? A lot has been going on, and I'd really like to document it all here on my blog. I've also been finding it's been really useful to get me to take a step back to look at the big picture of my research - something that I've found very helpful in more ways than one. I'll definitely discuss this and my progress in the next part of my PhD update blog post series, which I tag with PhD to make them easy to find.

Until then, I'll see you in the next post!

I'm going to NLDL 2024!

A cool night sky and northern lights banner I made in Ink
scape. It features mountains and AI-shaped constellations, with my logo and the text "@ NLDL 2024".

Heya! Been a moment 'cause I've been doing a lot of writing and revising of writing for my PhD recently (the promised last part of the scifest demo series is coming eventually, promise!), but I'm here to announce that, as you might have guessed by the cool new banner, I have today (yesterday? time is weird when you stay up this late) had a paper accepted for the Northern Lights Deep Learning Conference 2024, which is to be held on 9th - 11th January 2024 in Tromsø, Norway!

I have a lot of paperwork to do between now and then (and many ducks to get in a row), but I have every intention of attending the conference in person to present my rainfall radar research I've been rambling on about in my PhD update series.

I am unsure whether I'm allowed to share the paper at this stage - if anyone knows, please do get in touch. In the meantime, I'm pretty sure I can share the title without breaking any rules:

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task

I also have a Cool Poster, which I'll share here after the event too in the new (work-in-progress) Research section of the main homepage that I need to throw some CSS at.

I do hope that this cool new banner gets some use bringing you more posts about (and, hopefully, from!) NLDL 2024 :D

--Starbeamrainbowlabs

Building the science festival demo: technical overview

Banner showing gently coloured point clouds of words against a dark background on the left, with the Humber Science Festival logo, fading into a screenshot of the attract screen of the actual demo on the right.

Hello and welcome to the technical overview of the hull science festival demo I did on the 9th September 2023. If you haven't already, I recommend reading the main release post for context, and checking out the live online demo here: https://starbeamrainbowlabs.com/labs/research-smflooding-vis/

I suspect that a significant percentage of the readers of my blog here love technical nuts and bolts of things (though you're usually very quiet in the comments :P), so I'm writing a series of posts about various aspects of the demo, because it's was a very interesting project.

In this post, we'll cover the technical nuts and bolts of how I put it together, the software and libraries I used, and the approach I took. I also have another post written I'll be posting after this one on monkeypatching npm packages after you install them, because I wanted that to be it's own post. In a post after that we'll look look at the research and the theory behind the project and how it fits into my PhD and wider research.

To understand the demo, we'll work backwards and deconstruct it piece by piece - starting with what you see on screen.

Browsing for a solution

As longtime readers of my blog here will know, I'm very partial to cramming things into the browser that probably shouldn't run in one. This is also the case for this project, which uses WebGL and the HTML5 Canvas.

Of course, I didn't implement using the WebGL API directly. That's far too much effort. Instead, I used a browser-based game engine called Babylon.js. Babylon abstracts the complicated bits away, so I can just focus on implementing the demo itself and not reinventing the wheel.

Writing code in Javascript is often an exercise in putting lego bricks together (which makes it very enjoyable, since you rarely have to deviate from your actual task due to the existence of npm). To this end, in the process of implementing the demo I collected a bunch of other npm packages together to which I could then delegate various tasks:

Graphics are easy

After picking a game engine, it is perhaps unsurprising that the graphics were easy to implement - even with 70K points to display. I achieved this with Babylon's PointsCloudSystem class, which made the display of the point cloud a trivial exercise.

After adapting and applying a clever plugin (thanks, @sebavan!), I had points that were further away displaying smaller and closer ones larger. Dropping in a perceptually uniform colour map (I wonder if anyone's designed a perceptually uniform mesh map for a 3D volume?) and some fog made the whole thing look pretty cool and intuitive to navigate.

Octopain

Now that I had the points displaying, the next step was to get the text above easy point displaying properly. Clearly with 70K points (140K in the online demo!) I can't display text for all of them at once (and it would look very messy if I did), so I needed to index them somehow and efficiently determine which points were near to the player in real time. This is actually quite a well studied problem, and from prior knowledge I remember that Octrees were reasonably efficient. If I had some tine to sit down and read papers (a great pastime), this one (some kind of location recognition from point clouds; potentially indoor/outdoor tracking) and this one (AI semantic segmentation of point clouds) look very interesting.

Unfortunately, the task of extracting a list of points within a given radius was not something commonly implemented in octree implementations on npm, and combined with a bit of headache figuring out the logic of this and how to hook it up to the existing Babylon renderer resulted in this step taking some effort before I found octree-es and got it working the way I wanted it to.

In the end, I had the octree as a completely separate point indexing data structure, and I used the word as a key to link it with the PointsCloudSystem in babylon.

Gasp, is that a memory leaks I see?!

Given I was in a bit of a hurry to get the whole demo thing working, it should come as no surprise that I ended up with a memory leak. I didn't actually have time to fix it before the big day either, so I had the demo on the big monitor while I kept an eye on the memory usage of my laptop on my laptop screen!

A photo of my demo up and running on a PC with a PS4 controller on a wooden desk. An Entroware laptop sits partially obscured by a desktop PC monitor, the latter of which has the demo full screen.

(Above: A photo of my demo in action.... I kept an eye on the memory graph the taskbar on my laptop the whole time. It only crashed once!)

Anyone who has done anything with graphics and game engines probably suspects where the memory leak was already. When rendering the text above each point with a DynamicTexture, I didn't reuse the instance when the player moved, leading to a build-up of unused textures in memory that would eventually crash the machine. After the day was over, I had time to sit down and implement a pool to re-use these textures over and over again, which didn't take nearly as long as I thought it would.

Gamepad support

You would think that being a well known game engine that Babylon would have working gamepad support. The documentation even suggests as such, but sadly this is not the case. When I discovered that gamepad support was broken in Babylon (at least for my PS4 controller), I ended up monkeypatching Babylon to disable the inbuilt support (it caused a crash even when disabled O.o) and then hacking together a custom implementation.

This custom implementation is actually quite flexible, so if I ever have some time I'd like to refactor it into its own npm package. Believe it or not I tried multiple other npm packages for wrapping the Gamepad API, and none worked reliably (it's a polling API, which can make designing an efficient and stable wrapper an interesting challenge).

To do that though I would need to have some other controllers to test with, as currently it's designed only for the PS4 dualshock controller I have on hand. Some time ago I initially purchased an Xbox 360 controller wanting something that worked out of the box with Linux, but it didn't work out so well so I ended up selling it on and buying a white PS4 dualshock controller instead (pictured below).

I'm really impressed with how well the PS4 dualshock works with Linux - it functions perfectly out of the box in the browser (useful test website) just fine, and even appears to have native Linux mainline kernel support which is a big plus. The little touchpad on it is cute and helpful in some situations too, but most of the time you'd use a real pointing device.

A white PS4 dualshock controller.

(Above: A white PS4 dualshock controller.)

How does it fit in a browser anyway?!

Good question. The primary answer to this is the magic of esbuild: a magical build tool that packages your Javascript and CSS into a single file. It can also handle other associated files like images too, and on top of that it's suuuper easy to use. It tree-shakes by default, and just all-around a joy to use.

Putting it to use resulted in my ~1.5K lines of code (wow, I thought it was more than that) along with ~300K lines in libraries being condensed into a single 4MiB .js and a 0.7KiB .css file, which I could serve to the browser along with the the main index.html file. It's event really easy to implement subresource integrity, so I did that just for giggles.

Datasets, an origin story

Using the Fetch API, I could fetch a pre-prepared dataset from the server, unpack it, and do cool things with it as described above. The dataset itself was prepared using a little Python script I wrote (source).

The script uses GloVe to vectorise words (I think I used 50 dimensions since that's what fit inside my laptop at the time), and then UMAP (paper, good blog post on why UMAP is better than tSNE) to do dimensionality reduction down to 3 dimensions, whilst still preserving global structure. Judging by the experiences we had on the day, I'd say it was pretty accurate, if not always obvious why given words were related (more on this why this is the case in a separate post).

My social media data, plotted in 2D with PCA (left), tSNE (centre), and UMAP (right). Points are blue against a white background, plotted with the Python datashader package.

_(Above: My social media data, plotted in 2D with PCA (left), tSNE (centre), and UMAP (right). Points are blue against a white background, plotted with the Python datashader package.)_

I like Javascript, but I had the code written in Python due to prior research, so I just used Python (looking now there does seem to be a package that implementing UMAP in JS, so I might look at that another time). The script is generic enough that I should be able to adapt it for other projects in the future to do similar kinds of analyses.

For example, if I were to look at a comparative analysis of e.g. language used by social media posts from different hashtags or something, I could use the same pipeline and just label each group with a different colour to see the difference between the 2 visually.

The data itself comes from 2 different places, depending on where you see the demo. If you were luck enough to see it in person, then it's directly extracted from my social media data. The online one comes from page abstracts from various Wikipedia language dumps to preserve privacy of the social media dataset, just in case.

With the data converted, the last piece of the puzzle is that of how it ends up in the browser. My answer is a gzip-compressed headerless tab-separated-values file that looks something like this (uncompressed, of course):

cat    -10.147051      2.3838716       2.9629934
apple   -4.798643       3.1498482       -2.8428414
tree -2.1351748      1.7223179       5.5107193

With the data stored in this format, it was relatively trivial to load it into the browser, decompressed as mentioned previously, and then display it with Babylon.js. There's also room here to expand and add additional columns later if needed, to e.g. control the colour of each point, label each word with a group, or something else.

Conclusion

We've pulled the demo apart piece by piece, and seen at a high level how it's put together and the decisions I made while implementing it. We've seen how I implemented the graphics - aided by Babylon.js and a clever hack. I've explained how I optimised the location polling using achieve real-time performance with an octree, and how reusing textures is very important. Finally, we took a brief look at the dataset and where it came from.

In the next post, we'll take a look at how to monkeypatch an npm package and when you'd want to do so. In a later post, we'll look at the research behind the demo, what makes it tick, what I learnt while building and showing it off, and how that fits in with the wider field from my perspective.

Until then, I'll see you in the next post!

Edit 2023-11-30: Oops! I forgot to link to the source code....! If you'd like to take a gander at the source code behind the demo, you can find it here: https://github.com/sbrl/research-smflooding-vis

My Hull Science Festival Demo: How do AIs understand text?

Banner showing gently coloured point clouds of words against a dark background on the left, with the Humber Science Festival logo, fading into a screenshot of the attract screen of the actual demo on the right.

Hello there! On Saturday 9th September 2023, I was on the supercomputing stand for the Hull Science Festival with a cool demo illustrating how artificial intelligences understand and process text. Since then, I've been hard at work tidying that demo up, and today I can announce that it's available to view online here on my website!

This post is a general high-level announcement post. A series of technical posts will follow on the nuts and bolts of both the theory behind the demo and the actual code itself and how its put together, because it's quite interesting and I want to talk about it.

I've written this post to serve as a foreword / quick explanation of what you're looking at (similar to the explanation I gave in person), but if you're impatient you can just find it here.

All AIs currently developed are essentially complex parametrised mathematical models. We train these models by updating their parameters little by little until the output of the model is similar to the output of some ground truth label.

In other words, and AI is just a bunch of maths. So how does it understand text? The answer to this question lies in converting text to numbers - a process often called 'word embedding'.

This is done by splitting an input sentence into words, and then individually converting each word into a series of numbers, which is what you will see in the demo at the link below - just convert with some magic to 3 dimensions to make it look fancy.

Similar sorts of words will have similar sorts of numbers (or positions in 3D space in the demo). As an example here, at the science festival we found a group of footballers, a group of countries, and so on.

In the demo below, you will see clouds of words processed from Wikipedia. I downloaded a bunch of page abstracts for Wikipedia in a number of different languages (source), extracted a list of words, converted them to numbers (GloVeUMAP), and plotted them in 3D space. Can you identify every language displayed here?


Find the demo here: https://starbeamrainbowlabs.com/labs/research-smflooding-vis/

A screenshot of the initial attract screen of the demo. A central box allows one to choose a file to load, with a large load button directly beneath it. The background is a blurred + bloomed screenshot of a point cloud from the demo itself.

Find the demo here: https://starbeamrainbowlabs.com/labs/research-smflooding-vis/


If you were one of the lucky people to see my demo in person, you may notice that this online demo looks very different to the one I originally presented at the science festival. That's because the in-person demo uses data from social media, but this one uses data from Wikipedia to preserve privacy, just in case.

I hope you enjoy the demo! Time permitting, I will be back with some more posts soon to explain how I did this and the AI/NLP theory behind it at a more technical level. Some topics I want to talk about, in no particular order:

  • General technical outline of the nuts and bolts of how the demo works and what technologies I used to throw it together
  • How I monkeypatched Babylon.js's gamepad support
  • A detailed and technical explanation of the AI + NLP theory behind the demo, the things I've learnt about word embeddings while doing it, and what future research could look like to improve word embeddings based on what I've learnt
  • Word embeddings, the options available, how they differ, and which one to choose.

Until next time, I'll leave you with 2 pictures I took on the day. See you in the next post!

Edit 2023-11-30: Oops! I forgot to link to the source code....! If you'd like to take a gander at the source code behind the demo, you can find it here: https://github.com/sbrl/research-smflooding-vis

A photo of my demo up and running on a PC with a PS4 controller on a wooden desk. An Entroware laptop sits partially obscured by a desktop PC monitor, the latter of which has the demo full screen.

(Above: A photo of my demo in action!)

A photo of some piles of postcards arranged on a light wooden desk. My research is not shown, but visuals from other researchers' projects are printed, such as microbiology to disease research to jellyfish galaxies.

(Above: A photo of the postcards on the desk next to my demo. My research is not shown, but visuals from other researchers' projects are printed, with everything from microbiology to disease research to jellyfish galaxies.)

I've submitted a paper on my rainfall radar research to NLDL 2024!

A screenshot of the nldl.org conference website.

(Above: A screenshot of the NLDL website)

Hey there! I'm excited that last week I submitted a paper to what I hope will become my very first conference! I've attended the AAAI-22 doctoral consortium online, but I haven't had the opportunity to attend a conference until now. Of course, I had to post about it here.

First things first, which conference have I chosen? With the help of my supervisor, we chose the Northern Lights Deep Learning Conference. It's relatively close by the UK (where I live), it's relevant to my area and the paper I wanted to submit (I've been working on the paper since ~July/August 2023), and the deadline wasn't too tight. There were a few other conferences I was considering, but they either had really awkward deadlines (sorry, HADR! I've missed you twice now), or got moved to an unsafe country (IJCAI → China).

The timeline is roughly as follows:

  • ~early - ~mid November 2023: acceptance / rejection notification
  • somewhere in the middle: paper revision time
  • 9th - 11th January 2024: conference time!

Should I get accepted, I'll be attending in person! I hope to meet some cool new people in the field of AI/machine learning and have lots of fascinating discussions about the field.

As longtime readers of my blog here might have guessed, the paper I've submitted is on my research using rainfall radar data and abusing image segmentation to predict floods. The exact title is as follows:

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task

As the paper is unreviewed, I don't feel comfortable with releasing it publicly yet. However, feel free to contact me if you'd like to read it and I'm happy to hand out a copy of the unreviewed paper individually.

Most of the content has been covered quite casually in my phd update blog post series (16 posts in the series so far! easily my longest series by now), just explained in formal language.

This paper will also form the foundation of the second of two big meaty chapters of my thesis, the first being based on my social media journal article. I'm currently at 80 pages of thesis (including appendices, excluding bibliography, single spaced a4), and I still have a little way to go before it's done.

I'll be back soon with another PhD update blog post with more details about the thesis writing process and everything else I've been up to over the last 2 months. I may also write a post on the Hull Science Festival, which I'll be attending on the supercomputing stand with a Cool Demo™, 'cause the demo is indeed very cool.

See you then!

How to read a paper

So you've got a paper. Maybe even a few papers. Okay, it's a whole stack of them and you don't have the time to read them all (they do have a habit of multiplying when you're not looking). What is one to do? I've had this question asked of me a few times, so I thought I'd write up a quick post to answer it, organise my thoughts, and explain my personal process for sorting through and reading scientific papers (I generally find regular 'news'papers to be of questionable reliability, lacking in depth, and just not worth the effort).

A bunch of papers

(A bunch of papers I've read.... and one that I've written.)

Finding papers

If you are in a position where you don't have any papers to begin with, then search engines are your best friend. Just like DuckDuckGo, Ecosia, and others provide an interface to search the web, there are special search engines designed to search for scientific papers. The two main ones I suggest are:

Personally, Semantic Scholar is my paper search engine of choice. Enter some general search terms for the field / thing you want to read about, and relevant papers will be displayed. It can be useful to change the sort order from relevance to citation count or most influential papers to get a look at what are likely to be the seminal papers in that field (i.e. the ones that first introduced a thing - e.g. the Attention is all you need paper, which first introduced the transformer) - though they may be less relevant.

The other nice feature these search engines have is copying out BibTeX to paste into your bibliography in LaTeX (see also the LaTeX templates I maintain for reports/papers/dissertations/theses).
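
If you prefer to script this kind of search, Semantic Scholar also exposes a public API. Below is a minimal sketch in Python using the requests library - the endpoint, parameters, and field names reflect my understanding of their Graph API, so treat it as an illustration and check the official documentation before relying on it:

# A minimal sketch of searching for papers programmatically.
# Assumes Semantic Scholar's public Graph API - check their documentation
# for the current endpoint and field names before relying on this.
import requests

def search_papers(query, limit=10):
    """Return papers matching the query, most-cited first."""
    response = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={
            "query": query,
            "limit": limit,
            "fields": "title,year,citationCount,externalIds",
        },
        timeout=30,
    )
    response.raise_for_status()
    papers = response.json().get("data", [])
    # Sorting by citation count surfaces likely seminal papers first
    return sorted(papers, key=lambda p: p.get("citationCount") or 0, reverse=True)

if __name__ == "__main__":
    for paper in search_papers("rainfall radar flood prediction deep learning"):
        print(f"{paper.get('year')}  [{paper.get('citationCount')} citations]  {paper.get('title')}")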

A note on reliability: Papers on preprint servers like arXiv have not been peer reviewed. Avoid these unless there's no other option.

Sorting through them

So you know how to find papers now, but how do you actually read them? Personally, I use a tiered system for this.

Reading the abstract: Firstly, I'll read the abstract. Just like you read the title of a search result to decide whether you want to click on the search result, so do I read the abstract to decide whether a paper is worth my time to read it.

Sometimes I'll stop there. Maybe the paper isn't what I thought it was, or I've simply got all the information I need from it. The latter is most common when I'm writing some paper or report: often I'll need a paper as a reliable source for something, and I won't need to read the whole paper to know that it has the information I need.

Okay, so suppose a paper passes a quick look at the title and abstract, and I want to go deeper. You'd think it's time to jump right in and read it from top to bottom, but you'd be wrong. Reading an entire paper in detail is significantly time consuming, and I want to be really sure it's worth the effort before I commit to it.

Skim reading: The next test is a quick skim read. If it's a journal article, there might be some key contributions listed at the top of the paper - these are a good place to start. If not, then they can often be found at the end of the introduction - this goes for conference papers as well. The introduction is usually my second stop (though remember I'm still not reading it word for word yet), followed by the end of the results/experimental discussion section to understand the key points of what they did and how that went for them.

AI summarisation: Another option if a paper is dense and/or long is to use an AI summarisation tool. These must always be taken with a grain of salt, but they can help to direct my search when I'm having difficulty extracting a specific piece of information. AI summarisation can also be a good start if an abstract is bad or missing the information I want but the subject itself is interesting. I often find AI-generated summaries can be quite generic though, so it's not a complete solution.

A note on ChatGPT: ChatGPT is a generic language model, and as such isn't ideal for generating summaries of documents. It's best to use a model specifically trained for this purpose, and to take any output you get with a grain of salt.
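
To make the distinction concrete, here's a minimal sketch of summarising part of a paper with a model trained specifically for summarisation. I'm assuming the Hugging Face transformers library and the facebook/bart-large-cnn checkpoint here purely as an illustration - substitute whatever summarisation tool you actually have access to:

# A minimal sketch of summarising paper text with a dedicated summarisation
# model rather than a general-purpose chat model. Assumes the Hugging Face
# transformers library and the facebook/bart-large-cnn checkpoint.
from transformers import pipeline

summariser = pipeline("summarization", model="facebook/bart-large-cnn")

# In practice you'd extract this text from the paper's PDF. Most summarisation
# models have a fairly small input limit, so long papers need splitting into
# chunks and summarising section by section - here we just truncate.
paper_text = open("paper_introduction.txt").read()[:3000]

result = summariser(paper_text, max_length=150, min_length=40, do_sample=False)
print(result[0]["summary_text"])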

AI document discussion: Occasionally the abstract of a paper suggests that it contains a significantly interesting nugget of information I'm interested in acquiring (again, most often when writing a paper rather than initial research), but the paper is long, dense, I'm having difficulty finding it, or some combination of the three.

This is where AI-driven document discussion can be invaluable. As I noted earlier, AI-generated summaries tend to be quite generic, so it's not great if there's something highly specific I'm after. The only place I'm currently aware of that ships this feature in a useful form is Kagi, a paid-for search engine with AI features (document summarisation and document discussion) built-in. I'm sure others have shipped the feature, but I haven't seen them yet.

Essentially, AI-driven document discussion is where you ask a natural language question about the target paper, and it does the reading comprehension for you by answering your question with useful quotes from the paper. Then once you have the answer you can go and look at that specific part of the paper (use your browser's find tool) to get additional context.

I've found this to be a great time saver. It can also be useful if I'm unsure if a paper actually talks about the thing I'm interested in or not.
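
Kagi's implementation isn't public, but the general idea - pulling out the parts of a paper most relevant to a natural language question - can be sketched roughly as follows. This is only an illustration under my own assumptions (the sentence-transformers library and the all-MiniLM-L6-v2 embedding model), not how any particular product actually works:

# A rough sketch of question-driven retrieval over a paper: embed the paper's
# paragraphs and the question, then show the closest-matching paragraphs.
# Assumes the sentence-transformers library and the all-MiniLM-L6-v2 model.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# In practice these would be paragraphs extracted from the paper's PDF
paragraphs = [p for p in open("paper.txt").read().split("\n\n") if p.strip()]
question = "What dataset did the authors train on?"

paragraph_embeddings = model.encode(paragraphs, convert_to_tensor=True)
question_embedding = model.encode(question, convert_to_tensor=True)

# Rank paragraphs by cosine similarity to the question and show the top 3
scores = util.cos_sim(question_embedding, paragraph_embeddings)[0]
for index in scores.argsort(descending=True)[:3].tolist():
    print(f"score={scores[index].item():.2f}\n{paragraphs[index]}\n")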

Kagi: Specifically, Kagi (my current main search engine) implements both of the aforementioned features. They can be accessed via the Discuss document option next to search results, or by dedicated !bangs (Kagi implements all of DuckDuckGo's !bangs too), which are significantly helpful as I touched on above.

  • AI summarisation: !sum <url_of_paper_or_webpage>
  • AI discuss document: !discuss <url_of_paper_or_webpage>

A disclaimer: I have received no money or other forms of compensation for mentioning Kagi here. Kagi have not asked me to mention them here at all - I just think their product is helpful, useful, produces good search results, and saves me time. AI models can be computationally expensive, so I speculate it would be difficult to find a free version without strings attached.

A screenshot of a sample discuss document discussion about the paper Attention is all you need.

(Above: A screenshot of a sample discuss document discussion about the paper Attention is all you need)

How to read a paper effectively

So a paper has somehow made it through all of those steps unscathed, and yet I still haven't extracted everything I want to know from it. By this point it must be a significantly interesting paper that I likely want lots of details from.

The process of actually reading a paper from top to bottom is an inherently time consuming one: hence all the other steps above to filter papers out with minimal effort before I commit to spending what is typically an hour or more of my time to a single paper.

My general advice is to do a re-read of the abstract to confirm, and then start with the introduction and make your way down. Take it slow.

Making notes: When I do read a paper, I always make notes while doing so. Having 2 monitors is also helpful, as I can make notes on one and have the paper on the other. My current tool of choice here is Obsidian, a fabulous note-taking system that I'll wholeheartedly recommend to everyone. It's Markdown-based and has a tagging system (nested tags are supported too!) to keep papers organised. The directed graph and canvas features are also pretty cool. The general template I currently use for making notes on papers is as follows:

---
tags: some, tags/here
---

> - URL: <https://example.com/paper_url_here/doi_if_possible.pdf>
> - Year: YEAR_PAPER_WAS_PUBLISHED

- Bulleted notes go here
    - I nest bullet points based on the topic
        - To as many levels as needed
    - These notes are very casual
- [I contain my own thoughts in square brackets]
    - This keeps the things that the paper says separate from the things that I think about it
- Sometimes if I'm making a lot of notes I'll split them up into sections derived from the paper


## PDF
The last section contains the PDF of the paper itself. Obsidian supports dragging and dropping PDFs in, and it also has a dedicated PDF viewer.

Complete with an explanation of what each section is for!

You don't have to use Obsidian (it's the best one I've found), but I strongly recommend making notes while you read a paper. This way you have some distilled notes in your own words to refer back to later. It also helps to further your own understanding of the topic of a paper by putting it into your own words. Other tools I'm aware of include OneNote and QOwnNotes (I still use the latter for making notes in meetings and recording random stuff that's not necessarily related to research - I keep Obsidian quite focused atm).

Make sure these notes are digital. You'll thank me later. The number of times I've used Obsidian's search function to find the notes I made about a specific paper is absolutely unreal. Over time you'll get a good sense of what you need to make notes on, so that you avoid both having to refer back to the paper again later and having so many notes that searching them takes longer than hunting around in the source paper for the information you were after.

A screenshot of my obsidian workspace.

(Above: A screenshot of my Obsidian workspace.)

Sometimes your research project will change direction, and the notes you made are suddenly less relevant. Or you've learned something elsewhere and now come back with fresh and more experienced eyes. I often update the notes I took initially to add more information, or references to other related papers that go together.

Continual evaluation: As I read, I'm continually evaluating in the back of my mind whether it's worth continuing to read. I'm asking questions like "is this paper going off on a tangent?", "is the solution the researchers employed actually interesting to me?", "is this paper getting too dense for me to understand?", and "is the explanation the paper gives actually intelligible?" (yes, papers do vary in explanatory quality). If the exercise of reading a paper stops being worth the time, stop reading it and move on.

Sometimes it's worth dropping into skim-reading mode for a bit if a section seems irrelevant, to see if it gets better later.

But I don't understand something!

This is a normal part of reading a paper. This can be for a number of reasons:

  1. The paper is bad
  2. The paper is good, but is terrible at explaining things
  3. The paper contains more maths than explanation of the variables contained therein
  4. I'm lacking some prerequisite knowledge that the paper doesn't properly explain
  5. Some other issue

It is not always obvious which of these cases I find myself in when I encounter difficulty reading a paper. Nevertheless, I employ a number of strategies to deal with the situation:

  • Reading around: As in most things, reading around the area of the paper that is causing an issue may yield additional information. Sometimes returning to the related works / background / approach section can help.
  • Search for related papers: There are many papers out there, so it can be worth going looking for a related one. It might be a better paper, or worded differently in a way that makes it easier to understand.
  • Look through the paper's references: This can also be a good way to trace back to the source of an idea. Semantic Scholar's References tab below the abstract lists all the references too, and the related works section of a paper will tell you how each cited work is relevant to the problem, motivations, and subsequent method and results thereof.
  • Look for seminal papers: See above. Finding the original paper on a given idea can help a lot, as it's often explained in much more detail than later papers that assume you've read the so-called seminal work.
  • Web search: For specific terms or concepts. Sometimes just a quick definition is needed. Other times it's more substantial and requires reading an entire separate blog post - compare Attention is all you need with the blog post The Illustrated Transformer. Each provides a different perspective; in that case I actually read both at the same time to fully understand the topic. Make sure you properly assess anything you find for reliability as usual.

Supervision: It's very unlikely that after all of these steps I'll still be stumped on how to proceed, but it has happened. In these situations it can be extremely helpful to have someone more experienced in the field to discuss things with. For me, this is my PhD supervisor Nina.

Whoever they are, keeping in regular contact is best as you work through a project. Frequency varies, but for my PhD supervision meetings have fallen somewhere between 1 and 3 weeks apart, with each meeting no less than an hour long. Their advice and insight can guide your efforts as you progress through a research project.

They will also likely be busy people, so make sure you properly prepare before meeting them. Summarise what you've read and how it relates to your project and what you want to do. Make a list of questions that you want to ask them. Gather your thoughts. This will help you make the most of your discussion with them.

Conclusion

I've outlined the personal process I employ when reading a paper (in perhaps more detail than was necessary). It's designed to save me time and allow me to cover ground relatively quickly (though "quickly" is still a relative term: in the worst case, with a completely new and broad field, it can take weeks to cover enough ground to gain a good understanding of it).

This is my process: you need to find something that works for you. It's okay if this takes time. Maybe lots of time... but you'll get there in the end. The more you read, the more you'll get an instinctive sense of the stuff I ramble about here. My method isn't perfect either - I'm still learning, so my process will likely evolve over time.

If you've got any comments or questions, do leave them in the comments section below and I'll do my best to answer them.

The journal article about my social media research is out now!

This is just a quick little post to announce that I have published my first journal article! This has been a significantly long time in the making, with the review process and all associated corrections alone taking from October 2022 until a week or two ago.

It has been published in the Elsevier journal Computers and Geosciences, with the following title:

Real-time social media sentiment analysis for rapid impact assessment of floods

The article is open access, so everyone should be able to read it. I must thank everyone who has helped and contributed to the process of putting this journal article together - their names are on the journal article.

Hopefully this is the first of many!
