
Teaching this September

A banner from a game long lost. Maybe I'll remake it and finish it someday.

Hello!

Believe it or not, I'm going to be teachificatinating a thing at University this semester, which starts at the end of this month and lasts until around December-ish time (yeah, I'm surprised too).

It's called Secure Digital Infrastructure, and I'll be teaching Linux and system administration skills, so that includes the following sorta-areas:

(related posts aren't necessarily the exact content I'm going to cover, but are related)

Preparing for all of this is quite stressful and is taking significantly more energy than I expected.

I definitely want to talk about it here, but that will likely happen after the fact - probably some time in January or February.

Please be patient with me as I navigate this new and unexpected experience :-)

--Starbeamrainbowlabs


PhD Update 19: The Reckoning

The inevitability of all PhDs. At first it seems distant and ephemeral, but it is also the inescapable and unavoidable destination for all on the epic journey of the PhD.

Sit down and listen as I tell my own tale of the event I speak of.

I am, of course, talking about the PhD Viva. It differs from country to country, but here in the UK the viva is an "exam" that happens a few months after you have submitted your thesis (PhD Update 18: The end and the beginning). Unlike across the pond in the US, in the UK vivas are a much more private affair, with only you, the chair, and your internal and external examiners normally attending.

In my case, that was 2 externals (as I am also staff, ref Achievement get: Experimental Officer Position!), an internal, and of course the chair. I won't name them as I'm unsure of policy there, but they were experts in the field and very kind people.

I write this a few weeks removed from the actual event (see also my post on Fediscience at the time), and I thought that my viva itself deserved a special entry in this series dedicated to it.

My purpose in this post is to talk about my experience as honestly and candidly as I can, and offer some helpful advice from someone who has now been through the process.

The Structure

The viva itself took about 4 hours. It's actually a pretty complicated affair: all your examiners (both internal and external) have to read your thesis and come up with a list of questions (hidden from you of course). Then, on the day but before you enter the room they have to debate who is going to ask what to avoid duplication.

In practice this usually means that the examiners will meet in the morning to discuss, before having lunch and then convening for the actual viva bit where they ask the questions. In my case, I entered the room to meet the examiners and say hi, before leaving again for them to sort out who was going to ask what.

Then, the main part of the viva simply consists of you answering all the questions that they have for you. Once all the questions are answered, then the viva is done.

You are usually allowed a copy of your thesis in one form or another to assist you while answering their questions. The exact form this will take varies from institution to institution, so I recommend always checking this with someone in charge (e.g. the Doctoral College in my case) well in advance - you don't want to be hit with paperwork and confusion minutes before your viva is scheduled to start!

After the questions, you leave the room again for the examiners to deliberate over what the outcome will be, before calling you back into the room to give you the news.

Once they have done this: the whole thing is over and you can go sleep (trust me, you will not want to do anything else).

My experience

As I alluded to in the aforementioned post on fediscience (a node in the fediverse), I found the viva a significantly intense experience - and one I'm not keen on repeating any time soon. I strongly recommend having someone nearby as emotional support for after the viva and during those periods when you have to step out of the room. I am not ashamed to admit that there were tears after the exam had ended.

More of the questions than I expected focused on the 'big picture' kinda stuff, like how my research questions linked in with the rest of the thesis, and how the thesis flowed. I was prepared for technical questions -- and there were some technical questions -- but the 'fluffy stuff' kinda questions caught me a little off guard. For example, there were some questions about my introduction and how while I introduced the subject matter well, the jump into the technical stuff with the research questions was quite jarring, with concepts mentioned that weren't introduced beforehand.

To this end, I can recommend looking over the 'big picture' stuff beforehand so that you are prepared for questions that quiz you on your motivations for doing your research in the first place and probe different aspects of your research questions.

It can also feel quite demoralising, being questioned for hours on what has been your entire life for multiple years. It can feel like all you have done is pointless, and you need to start over. While it is true that you could improve upon your methods if you started from scratch, remember that you have worked hard to get to this point! You have discovered things that were not known to the world before your research began, and that is a significant accomplishment!

Try not to think too hard about the corrections you will need to make once the viva is done. Institutions differ, but in my case it is the job of the chair to compile the list of corrections and then send them to you (in one form or another). The list of corrections - even if they are explained to you verbally when you go back in to receive the result - may surprise you.

Outcome

I am sure that most of you reading this are wondering: what was my result?! Before I tell you, I will preface the answer to your burning question with a list of the possible outcomes:

  • Pass with no corrections (extremely rare)
  • Pass with X months corrections (common, where X is a multiple of 3)
  • Fail (also extremely rare)

In my case, I passed with corrections!

The situation is complicated by the fact that while the panel decided that I had 6 months of corrections to do, I am not able to spend 100% of my time doing them. As a result, it is currently undetermined how long I will have to complete them - paperwork is still being sorted out.

The reasons for this are many, but chief among them is that I will be doing some teaching in September - more to come on my experience doing that in a separate post (series?) just as soon as I have clarified what I can talk about and what I can't.

I have yet to receive a list of the corrections themselves (although I have not checked my email recently as I'm on holiday now as I write this), but it is likely that the corrections will include re-running some experiments - a process I have begun already.

Looking ahead

So here we are. I have passed my viva with corrections! This is not the end of this series - I will keep everyone updated in future posts as I work through the corrections.

I also intend to write a post or two about my experience learning to teach - a (side)quest that I am currently pursuing in my capacity as Experimental Officer (research is still my focus - don't worry!).

Hopefully this post has provided some helpful insight into the process of the PhD viva - and my experience in mine.

The viva is not a destination: only a waypoint on a longer journey.

If you have any questions, I am happy to answer them in the comments, on the fediverse, and via other related channels.

PhD Update 18: The end and the beginning

Hello! It has been a while. Things have been most certainly happening, and I'm sorry I haven't had the energy to update my blog here as often as I'd like. Most notably, I submitted my thesis last week (gasp!)! This does not mean the end of this series though - see below.

Before we continue, here's our traditional list of past posts:

Since last time, that detecting persuasive tactic challenge has ended too, and we have a paper going through at the moment: BDA at SemEval-2024 Task 4: Detection of Persuasion in Memes Across Languages with Ensemble Learning and External Knowledge.

Theeeeeeeeeeeeesis

Hi! A wild thesis appeared! Final counts are 35,417 words, 443 separate sources, 167 pages, and 50 pages of bibliography - making that 217 pages in total. No wonder it took so long to write! I submitted at 2:35pm BST on Friday 10th May 2024.

I. can. finally. rest.

It has been such a long process, and taken a lot of energy to complete it, especially since large amounts of formal academic writing isn't usually my thing. I would like to extend a heartfelt thanks especially to my supervisor for being there from beginning to end and beyond to support me through this endeavour - and everyone else who has helped out in one way or another (you know who you are).

Next step is the viva, which will be some time in July. I know who my examiners are going to be, but I'm unsure whether it would be wise to say here. Between now and then, I want to stalk investigate my examiners' research histories, which should give me an insight into their perspective on my research.

Once the viva is done, I expect to have a bunch of corrections to do. Once those are completed, I will to the best of my ability be releasing my thesis for all to read for free. I still need to talk to people to figure out how to do that, but rest assured that if you can't get enough of my research via the papers I've written for some reason, then my thesis will not be far behind.

Coming to the end of my PhD and submitting my thesis has been surprisingly emotionally demanding, so I thank everyone who is still here for sticking around and being patient as I navigate these unfamiliar events.

Researchy things

While my PhD may be coming to a close (I still can't believe this is happening), I have confirmed that I will have dedicated time for research-related activities. Yay!

This means, of course, that as one ending draws near, a new beginning is also starting. Today's task after writing this post is to readificate around my chosen idea to figure out where there's a gap in existing research for me to make a meaningful contribution. In a very real way, it's almost like I am searching for directions as I did in my very first post in this series.

My idea is connected to the social media research that I did previously on multimodal natural language processing of flooding tweets and images with respect to sentiment analysis (it sounded better in my head).

Specifically, I think I can do better than just sentiment analysis. Imagine an image of a street that's partially underwater. Is there a rescue team on a boat rescuing someone? What about the person on the roof waving for help? Perhaps it's a bridge that's about to be swept away, or a tree that has fallen down? Can we both identify these things in images and map them to physical locations?

Existing approaches to e.g. detect where the water is in the image are prone to misidentifying water that is in fact where it should be for once, such as in rivers and lakes. To this end, I propose looking for the people and things in the water rather than the water itself, going for a people-centred approach to flood information management.

I imagine I'll probably use the social media data I already have (getting hold of new data from social media is very difficult at the moment) - filtered for memes and misinformation this time - but if you know of any relevant sources of data or datasets, I'm absolutely interested: please get in touch. It would be helpful but not required if it's related to a specific natural disaster event (I'm currently looking at floods; branching out to others is absolutely possible and on the cards, but I will need to submit a new ethics form for that before touching any data).

Another challenge I anticipate is that of unlabelled data. It is often the case that large volumes of data are generated during an unfolding natural disaster, and processing it all can be a challenge. To this end, somehow I want my approach here to make sense of unlabelled images. Of course, generalist foundational models like CLIP are great, but lack the ability to be specific and accurate enough with natural disaster images.

I also intend that this idea would be applicable to images from a range of sources, and not just with respect to social media. I don't know what those sources could be just yet, but if you have some ideas, please let me know.

Finally, I am particularly interested if you or someone you know are in any way involved in natural disaster management. What kinds of challenges do you face? Would this be in any way useful? Please do get in touch either in the comments below or sending me an email (my email address is on the homepage of this website).

Persuasive tactics challenge

The research group I'm part of were successful in completing the SemEval Task 4: Multilingual Detection of Persuasion Techniques in Memes! I implemented the 'late fusion engine', which is a fancy name for an algorithm that uses basic probability to combine categorical predictions from multiple different models depending on how accurate each model was on a per-category basis.
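The paper describes the real thing properly, but as a rough sketch of the general idea (the details below are illustrative assumptions, not the actual implementation), per-category weighted late fusion looks something like this:

// Minimal sketch of per-category weighted late fusion - illustrative only.
// `predictions` is an array of probability vectors, one per model, and
// `accuracies` holds each model's per-category accuracy on a validation set.
function lateFuse(predictions, accuracies) {
    const categories = predictions[0].length;
    const fused = new Array(categories).fill(0);

    for (let model = 0; model < predictions.length; model++) {
        for (let category = 0; category < categories; category++) {
            // Weight each model's vote by how good it is at this specific category
            fused[category] += predictions[model][category] * accuracies[model][category];
        }
    }

    // Normalise so the fused scores sum to 1 again
    const total = fused.reduce((sum, value) => sum + value, 0);
    return fused.map(value => value / total);
}

// Example: 2 models, 3 categories
console.log(lateFuse(
    [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2]],
    [[0.9, 0.6, 0.7], [0.5, 0.8, 0.6]]
));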

I'm unsure of the status of the paper, but I think it's been through peer-review so you can find that here: BDA at SemEval-2024 Task 4: Detection of Persuasion in Memes Across Languages with Ensemble Learning and External Knowledge.

I wasn't the lead on that challenge, but I believe the lead person (a friend of mine, if you are reading this and want me to link to somewhere here get in touch) on that project will be going to Mexico to present it.

Teaching

I'm still not sure what I can say and what I can't, but starting in September I have been asked to teach a module on basic system administration skills. It's a rather daunting prospect, but I have a bunch of people much more experienced than me to guide me through the process. At the moment the plan is for 21 lecture-ish things, 9 labs, and the assessment stuff, so I'm rather nervous about preparing all of this content.

Of course, as a disclaimer nothing written in this section should be taken as absolute. (Hopefully) more information at some point, though unfortunately I doubt that I would be allowed to share the content created given it's University course material.

As always though, if there's a specific topic that lies anywhere within my expertise that you'd like explaining, I'm happy to write a blog post about it (in my own time, of course).

Conclusion

We've taken a little look at what has been going on since I last posted, and while this post has been rather talky (will try for some kewl graphics next time!), I hope it has nonetheless been an interesting read. I've submitted my thesis, started initial readificating for my next research project (the ideas for which we've explored here), helped out with a group research challenge project thingy, and been invited to do some teaching!

Hopefully the next post in this series will come out on time - long-term the plan is to absolutely continue blogging about the research I'm doing.

Until next time, the journey continues!

(Oh yeah! and finally finally, to the person who asked a question by email about this old post (I think?), I'm sorry for the delay and I'll try to get back to you soon.)

LaTeX templates for writing with the University of Hull's referencing style

Hello, 2024! I'm writing this while it is still some time before the new year, but I realised just now (a few weeks ago for you) that I never blogged about the LaTeX templates I have been maintaining for a few years now.

It's no secret that I do all of my formal academic writing in LaTeX - a typesetting language that is the industry standard in the field of Computer Science (and others too, I gather). While it's a very flexible (and at times obtuse, but that is a tale for another time) language, actually getting started is a pain. To make this process easier, I have developed over the years a pair of templates that make starting off much simpler.

A key issue (and skill) in academic writing is properly referencing things, and most places have their own specific referencing style you have to follow. The University of Hull is no different, so I knew from the very beginning that I needed a solution.

I can't remember who I received it from, but someone (comment below if you remember who it was, and I'll properly credit!) gave me a .bst BibTeX referencing style file that matches the University of Hull's referencing style.

I've been using it ever since, and I have also applied a few patches to it for some edge cases I have encountered that it doesn't handle. I do plan on keeping it up to date for the foreseeable future with any changes they make to the aforementioned referencing style.

My templates also include this .bst file to serve as a complete starting template. There's one with a full title page (e.g. for theses, dissertations, etc.), and another with just a heading that sits at the top of the document, just like a paper you might find on Semantic Scholar.

Note that I do not guarantee that the referencing style matches the University of Hull's style. All I can say is that it works for me and implements this specific referencing style.

With that in mind, I'll leave the README of the git repository to explain the specifics of how to get started with them:

https://git.starbeamrainbowlabs.com/Demos/latex-templates

They are stored on my personal git server, but you should be able to clone them just fine. Patches are most welcome via email (check the homepage of my website!)!

PhD Update 17: Light at the end of the tunnel

Wow..... it's been what, 5 months since I last wrote one of these? Oops. I'll do my best to write them at the proper frequency in the future! Things have been busy. Before I talk about what's been happening, here's the ever-lengthening list of posts in this series:

As I sit here at the very bitter end of the very last day of a long but fulfilling semester, I'm feeling quite reflective about the past year and how things have gone on my PhD. One of these posts is definitely long overdue.

Timescales

Naturally the first question here is about timescales. "What happened?" I hear you ask. "I thought you said you were aiming for intent to submit September 2023 for December 2023 finish?"

Well, about that.......

As it turns out, spending half of one's week working as Experimental Officer throws off one's estimation of how much work they do. To this end, it's looking more likely that I will be submitting my thesis in early-mid semester 2 this year. In other words, that's around about March 2024 time - give or take a month or two.

After submission, the next step will be my viva. Assuming I pass, it is then likely to be followed by corrections that must be completed based on the feedback from the viva.

What is a viva though? From what I understand, it is an oral exam in which you, your primary supervisor, and 2 examiners comb through your thesis with a fine-tooth comb and ask you lots of questions. I've heard it can take several hours to complete. While the standard is to have 1 examiner chosen internally from your department / institute and one chosen externally (picked by your primary supervisor), in my case I will be having both chosen from external sources as I am now a (part-time) staff member in the Department of Computer Science at the University of Hull (my home institution).

While it's still a little ways out yet, I can't deny that the thought of my viva is making me rather nervous - having everything I've done over the past 4.5 years scrutinised by completely unknown people. In a sense, it feels like once it is time for my viva, there will be nothing more I can do. I will either know the answers to their questions.... or I will not.

Writing

As you might have guessed by now, writing has been the name - and, indeed, aim - of the game since the last post in this series. Everything is coming together rather nicely. It's looking like I'm going to end up with the following structure:

  1. Introduction (not written*)
  2. Background (almost there! currently working on this)
  3. Rainfall radar for 2d flood forecasting (needs expanding)
  4. Social media sentiment analysis (done!)
  5. Conclusion
  6. Acknowledgements, Appendices, etc
  7. Dictionary of terms; List of acronyms (grows organically as I write - I need to go through and make sure I \gls all the terms I've added later)
  8. Bibliography (currently 27 pages and counting O.o)
  * Technically I have written it, it's just outdated and very bad and needs throwing out the window of the tallest building I can find. A rewrite is pending - see below.

A sneak preview of my thesis as a PDF.

(Above: A sneak preview of my thesis PDF. I'm writing in LaTeX - check out my templates with the University of Hull reference style here! Evidently the pictured section needs some work.....)

I've finished the chapter on the social media work, barring some minor adjustments I need to apply to ensure consistency. My current focus is the background chapter. This is most of the way there, but I need some more detail in several sections, so I'm working my way through them one at a time. This is resulting in a bunch more reading (especially for vision-based water detection via satellite data), so it is taking some time.

Once I've wrapped up the background section, it will be time to turn my attention to the content chapter #2: Rainfall radar for 2d flood forecasting. Currently, it sits at halfway between a conference paper (check it out! You can read it now, though a DOI is pending and should be available after the conference) and a thesis chapter - so I need to push (pull? drag?) it the rest of the way to the finish line. This will primarily entail 2 things:

  • Filling out the chapter-specific related works, which are currently rather brief given space and time limitations in a conference paper
  • Elaborating on things like the data preprocessing, experiments, discussion, etc.

This will also take some time, which together with the background section explains the uncertainty I still have in my finish date. Once these are both complete, I will be submitting my intent to submit! This will start a 3 month timer, by the end of which I must have submitted my thesis. During this timer period, I will be working on the introduction and conclusion chapters, which I do not expect to take nearly as long as any of the other chapters.

Once I am done writing and have submitted my thesis, I will do everything I can to ensure it is available under an open source licence for everyone to read. I believe strongly in the power of open source (and, open science) to benefit everyone, and want to share everything I've learned with all of you reading this.

At 102 pages of single-spaced A4 so far and counting (not including the aforementioned bibliography), it's a big time investment to read. To this end, I have various publications I've written and posted about here previously that cover most of the stuff I've done (namely the rainfall radar conference paper and social media journal article), and I also want to somehow condense the content of my thesis down into a 'mini-thesis' that's about 3-6 pages ish and post that alongside my main thesis here on my website. I hope that this will provide the broad strokes and a navigation aid for the main document.

Predicting Persuasive Posts

All this writing is going to drive me crazy if I don't do something practical alongside it. Unfortunately I have long since run out of excuses to run more experiments on my PhD work, so a good friend of mine who is also doing a PhD (they've published this paper) came along at the perfect time the other day asking for some help with a challenge competition submission they want to do. Of course, I had to agree to help out in a support role as the project sounds really interesting1.

The official title of the challenge is thus: Multilingual Detection of Persuasion Techniques in Memes

The challenge is part of SemEval-2024 and it's basically about classifying memes from some social media network (it's unclear which one they are from) as to which persuasion tactic they are employing to manipulate the reader's opinions / beliefs.

The full challenge page can be found here: https://propaganda.math.unipd.it/semeval2024task4/index.html

We had a meeting earlier this week to discuss, and one of the key problems we identified was that to score challengers they will be using posts in multiple unseen languages. To this end, it strikes me that it is important to have multiple languages embedded in the same space for optimal results.

This is not what GloVe does (it embeds each language into a different 'space', so a model trained on data in 1 language won't necessarily work well with another) - as I discovered in my demo for the Hull Science Festival (I definitely want to write about this in the final post in that series). So, as my role in the team, I'm going to push a number of different word embeddings through the system I have developed for the aforementioned science demo to identify which one is best for embedding multilingual text. Expect some additional entries to be added to the demo and an associated blog post on my findings very soon!

Currently, I have the following word embedding systems on my list:

  • Word2vec
  • FastText
  • CLIP
  • BERT/mBERT
  • XLM/XLM-RoBERTa

If you know of any other good word embedding models / algorithms, please do leave a comment below.

It also occurs to me while writing this that I'll have to make sure the multilingual dataset I used for the online demo has the same or similar words translated to every language to rule out any difference in embeddings there.
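As a rough illustration of the kind of check I have in mind (the tiny lookup table below is made-up toy data, not real model output), one can measure the cosine similarity between translation pairs and see whether they land close together in the shared space:

// Sketch: do translated word pairs land close together in a shared embedding space?
// The toy vectors here are for illustration only - in practice they would come
// from whichever embedding model is under test.
function cosineSimilarity(a, b) {
    let dot = 0, magA = 0, magB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

const embeddings = new Map([
    ["cat",   [0.9, 0.1, 0.3]],
    ["Katze", [0.8, 0.2, 0.3]],
    ["tree",  [0.1, 0.9, 0.4]],
    ["Baum",  [0.2, 0.8, 0.5]],
]);

for (const [a, b] of [["cat", "Katze"], ["tree", "Baum"]]) {
    const similarity = cosineSimilarity(embeddings.get(a), embeddings.get(b));
    console.log(`${a} <-> ${b}: ${similarity.toFixed(3)}`);
}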

A nice challenge for the Christmas holidays! My experience of collaborating with other researchers is rather limited at the moment, so I'm looking forward to working in a team to achieve a goal much faster than would otherwise be possible.

Beyond the edge

Something that has been a constant and steadily growing nagging presence in my mind is the question of what happens next after my thesis. While the details have not been confirmed yet, once everything PhD-related is wrapped up I will most likely be increasing my hours by some amount such that I work Monday - Friday rather than just Monday - Wednesday lunchtime as I have been doing so far.

This extra time will consist of 2 main activities. To the best of my current understanding, this will include some additional teaching responsibilities - I will probably be teaching a module that lies squarely within 1 of my strong points. It will also, crucially, include some dedicated time for research.

This time for research I believe I will be able to spend on research related activities, including for example collaborating with other researchers, reading papers, designing and running experiments, and writing up results into publication form. Essentially what I've been doing on my PhD, just minus the thesis writing!

Of course, the things I talk about here are not set in stone, and me talking about them here is not a declaration of such.

Either way, I do feel that the technical side is a strong point of mine that I am rather passionate about, so I do desire very much to continue dedicating a significant portion of my energy towards doing practical research tasks.

I'm not sure how much I am allowed to talk about the teaching I will be doing, but do expect some updates on that here on my blog too - however high-level and broad strokesy they happen to be. What kind of teaching-related things would you be interested in being updated about here? Please do leave a comment below.

Talking more specifically, I do have a number of research ideas - one of which I have alluded to above - that I want to explore after my PhD. Most of these are based on what I have learnt from doing my PhD and the logical next steps to analyse complex real-time data sources with a view to extracting and processing information to increase situational awareness in natural disaster scenarios. When I get around to this, I will be blogging about my progress in detail here on my blog.

It should probably be mentioned that I am still quite a long way off actually putting any of these ideas into practice (I would definitely not recommend trusting any predictions my current rainfall radar → binarised water depth model makes in the real world yet!), but if you or someone you know works in the field of managing natural disasters, I would be fascinated to know what you would find most useful related to this - please leave a comment below.

Conclusion

This post has ended up being a lot longer than I expected! I've talked about my current writing progress, a rather interesting side-project (more details in a future blog post!), and initial conceptual future plans - both researchy and otherwise.

While my thesis is drawing close to completion (relatively, at least), I hope you will join me here beyond the end of this long journey that is almost at an end. As one book closes, so another opens. A new journey is / will be only just beginning - one I can't wait to share with everyone here in future blog posts.

If you've got any thoughts, it would be cool if you could share them below.


  1. It goes without saying, but I won't let it impact my writing progress. I divide my day up into multiple slices - one of which is dedicated to focused PhD work - and I'll be pulling from a different slice of time other than the one for my PhD writing to help out with this project. 

Building the science festival demo: How to monkeypatch an npm package

A pink background dotted with bananas, with the patch-package logo front and centre, and the npm logo small in the top-left. Small brown package boxes are present in the bottom 2 corners.

In a previous post, I talked about the nuts and bolts of the demo on a technical level, and how it's all put together. I alluded to the fact that I had to monkeypatch Babylon.js to disable the gamepad support because it was horribly broken, and I wanted to dedicate an entire post to the subject.

Partly because it's a clever hack I used, and partly because if I ever need to do something similar again I want a dedicated tutorially-style post on how I did it so I can repeat the process.

Monkeypatching an npm package after installation in a reliable way is an inherently fragile task: it is not something you want to do if you can avoid it. In some cases though, it's unavoidable:

  1. If you're short on time, and need something to work
  2. If you are going to submit a pull request to fix something now, but need an interim workaround until your pull request is accepted upstream
  3. If upstream doesn't want to fix the problem, and you're forced to either maintain a patch or fork upstream into a new project (which is a lot more work).

We'll assume that one of these 3 cases is true.

In the game Factorio, there's a saying 'there's a mod for that' that is often repeated in response to questions in discourse about the game. The same is true of Javascript: If you need to do a non-trivial thing, there's usually an npm package that does it that you can lean on instead of reinventing the wheel.

In this case, that package is called patch-package. patch-package is a lovely little tool that enables you to do 2 related things:

a) Generate patch files simply by editing a given npm package in-situ

b) Automatically and transparently apply generated patch files on npm install, requiring no additional setup steps should you clone your project down from its repository and run npm install.

Assuming you have a working setup with the target npm package you want to patch already installed, first install patch-package:

npm install --save patch-package

Note: We don't --save-dev here, because patch-package needs to run any time the target package is installed... not just in your development environment - unless the target package to patch is also a development dependency.

Next, delve into node_modules/ and directly edit the files associated with the target package you want to edit.

Sometimes, projects will ship multiple npm packages, with one containing the pre-minified build distribution, and the other distributing the raw source - e.g. if you have your own build system like esbuild and want to tree-shake it.

This is certainly the case for Babylon.js, so I had to switch from the main babylonjs package to @babylonjs/core, which contains the source. Unfortunately, the official documentation for Babylon.js is rather inconsistent, which can lead to confusion when using the latter, but once I figured out how the imports worked it all came out in the wash.

Once done, generate the patch file for the target package like so:

npx patch-package your-package-name-here

This should create a patch file in the directory patches/ alongside your package.json file.

The final step is to enable automatic and transparent application of the new patch file on package installation. To do this, open up your package.json for editing, and add the following to the scripts object:

"scripts": {
    "postinstall": "patch-package"
}

...so a complete example might look a bit like this:

{
    "name": "research-smflooding-vis",
    "version": "1.0.0",
    "description": "Visualisations of the main smflooding research for outreach purposes",
    "main": "src/index.mjs",

    // ....

    "scripts": {
        "postinstall": "patch-package",
        "test": "echo \"No tests have been written yet.\"",
        "build": "node src/esbuild.mjs",
        "watch": "ESBUILD_WATCH=yes node src/esbuild.mjs"
    },

    // ......

    "dependencies": {
        // .....
    }
}

That's really all you need to do!

After you've applied the patch like this, don't forget to commit your changes to your git/mercurial/whatever repository.

I would also advise being a bit careful installing updates to any packages you've patched in future, in case of changes - though of course installing dependency package updates is vitally important to keep your code updated and secure.

As a rule of thumb, I recommend actively working to minimise the number of patches you apply to packages, and only use this method as a last resort.

That's all for this post. In future posts, I want to look more at the AI theory behind the demo, its implications, and what it could mean for research in the field in the future (is there even a kind of paper one writes about things one learns from outreach activities that accidentally have a bearing on my actual research? and would it even be worth writing something formal? a question for my supervisor and commenters on that blog post when it comes out, I think).

See you in the next post!

(Background to post banner: Unsplash)

Building the science festival demo: technical overview

Banner showing gently coloured point clouds of words against a dark background on the left, with the Humber Science Festival logo, fading into a screenshot of the attract screen of the actual demo on the right.

Hello and welcome to the technical overview of the Hull Science Festival demo I did on the 9th September 2023. If you haven't already, I recommend reading the main release post for context, and checking out the live online demo here: https://starbeamrainbowlabs.com/labs/research-smflooding-vis/

I suspect that a significant percentage of the readers of my blog here love the technical nuts and bolts of things (though you're usually very quiet in the comments :P), so I'm writing a series of posts about various aspects of the demo, because it was a very interesting project.

In this post, we'll cover the technical nuts and bolts of how I put it together, the software and libraries I used, and the approach I took. I also have another post written that I'll be publishing after this one on monkeypatching npm packages after you install them, because I wanted that to be its own post. In a post after that, we'll look at the research and the theory behind the project and how it fits into my PhD and wider research.

To understand the demo, we'll work backwards and deconstruct it piece by piece - starting with what you see on screen.

Browsing for a solution

As longtime readers of my blog here will know, I'm very partial to cramming things into the browser that probably shouldn't run in one. This is also the case for this project, which uses WebGL and the HTML5 Canvas.

Of course, I didn't implement it using the WebGL API directly. That's far too much effort. Instead, I used a browser-based game engine called Babylon.js. Babylon abstracts the complicated bits away, so I can just focus on implementing the demo itself and not reinventing the wheel.

Writing code in Javascript is often an exercise in putting lego bricks together (which makes it very enjoyable, since you rarely have to deviate from your actual task due to the existence of npm). To this end, in the process of implementing the demo I collected a bunch of other npm packages together to which I could then delegate various tasks:

Graphics are easy

After picking a game engine, it is perhaps unsurprising that the graphics were easy to implement - even with 70K points to display. I achieved this with Babylon's PointsCloudSystem class, which made the display of the point cloud a trivial exercise.

After adapting and applying a clever plugin (thanks, @sebavan!), I had points that were further away displaying smaller and closer ones larger. Dropping in a perceptually uniform colour map (I wonder if anyone's designed a perceptually uniform mesh map for a 3D volume?) and some fog made the whole thing look pretty cool and intuitive to navigate.
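For the curious, the general pattern looks something like the following - a minimal sketch rather than the demo's actual code, assuming a scene object and an array of { word, x, y, z } objects already exist:

// Minimal sketch of displaying a point cloud with Babylon.js' PointsCloudSystem.
// Not the demo's actual code - just the general pattern.
import { PointsCloudSystem, Vector3, Color4 } from "@babylonjs/core";

async function buildPointCloud(scene, words) {
    const pcs = new PointsCloudSystem("words", 2, scene); // 2 = point size

    pcs.addPoints(words.length, (particle, i) => {
        const { x, y, z } = words[i];
        particle.position = new Vector3(x, y, z);
        particle.color = new Color4(0.5, 0.8, 1.0, 1.0); // or map position → colour
    });

    return await pcs.buildMeshAsync();
}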

Octopain

Now that I had the points displaying, the next step was to get the text above each point displaying properly. Clearly with 70K points (140K in the online demo!) I can't display text for all of them at once (and it would look very messy if I did), so I needed to index them somehow and efficiently determine which points were near to the player in real time. This is actually quite a well-studied problem, and from prior knowledge I remember that octrees were reasonably efficient. If I had some time to sit down and read papers (a great pastime), this one (some kind of location recognition from point clouds; potentially indoor/outdoor tracking) and this one (AI semantic segmentation of point clouds) look very interesting.

Unfortunately, the task of extracting a list of points within a given radius was not something commonly implemented in the octree implementations on npm. Combined with a bit of a headache figuring out the logic of this and how to hook it up to the existing Babylon renderer, this step took some effort before I found octree-es and got it working the way I wanted it to.

In the end, I had the octree as a completely separate point indexing data structure, and I used the word as a key to link it with the PointsCloudSystem in Babylon.
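To illustrate just the linking idea (a naive linear scan stands in for octree-es here, since it's the word-as-key lookup that matters), a sketch might look like this:

// Sketch of linking a spatial index to the rendered points via the word string.
// A naive linear scan stands in for the octree purely for illustration - the real
// thing uses octree-es so the query doesn't cost O(n) every frame.
const index = [
    { word: "cat",   x: -10.1, y: 2.4, z: 3.0 },
    { word: "apple", x: -4.8,  y: 3.1, z: -2.8 },
    { word: "tree",  x: -2.1,  y: 1.7, z: 5.5 },
];

function wordsNear(playerPos, radius) {
    return index.filter(p => {
        const dx = p.x - playerPos.x, dy = p.y - playerPos.y, dz = p.z - playerPos.z;
        return dx * dx + dy * dy + dz * dz <= radius * radius;
    }).map(p => p.word); // each word is the key used to find the matching label/point
}

console.log(wordsNear({ x: -3, y: 2, z: 4 }, 5));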

Gasp, is that a memory leak I see?!

Given I was in a bit of a hurry to get the whole demo thing working, it should come as no surprise that I ended up with a memory leak. I didn't actually have time to fix it before the big day either, so I had the demo on the big monitor while I kept an eye on the memory usage of my laptop on my laptop screen!

A photo of my demo up and running on a PC with a PS4 controller on a wooden desk. An Entroware laptop sits partially obscured by a desktop PC monitor, the latter of which has the demo full screen.

(Above: A photo of my demo in action.... I kept an eye on the memory graph in the taskbar on my laptop the whole time. It only crashed once!)

Anyone who has done anything with graphics and game engines probably suspects where the memory leak was already. When rendering the text above each point with a DynamicTexture, I didn't reuse the instance when the player moved, leading to a build-up of unused textures in memory that would eventually crash the machine. After the day was over, I had time to sit down and implement a pool to re-use these textures over and over again, which didn't take nearly as long as I thought it would.
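The pool itself boils down to a simple acquire/release pattern. Here's a minimal sketch (not the exact implementation), with a factory function standing in for the DynamicTexture creation:

// Minimal sketch of a texture pool: instead of creating a new texture every time
// the player moves, spare textures are recycled. Illustrative only.
class TexturePool {
    constructor(createTexture) {
        this.createTexture = createTexture; // factory, e.g. () => new DynamicTexture(...)
        this.free = [];
    }

    acquire() {
        // Reuse a spare texture if one is available, otherwise make a new one
        return this.free.pop() ?? this.createTexture();
    }

    release(texture) {
        // Hand the texture back for reuse instead of letting it pile up in memory
        this.free.push(texture);
    }
}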

Gamepad support

You would think that, being a well-known game engine, Babylon would have working gamepad support. The documentation even suggests as much, but sadly this is not the case. When I discovered that gamepad support was broken in Babylon (at least for my PS4 controller), I ended up monkeypatching Babylon to disable the inbuilt support (it caused a crash even when disabled O.o) and then hacking together a custom implementation.

This custom implementation is actually quite flexible, so if I ever have some time I'd like to refactor it into its own npm package. Believe it or not I tried multiple other npm packages for wrapping the Gamepad API, and none worked reliably (it's a polling API, which can make designing an efficient and stable wrapper an interesting challenge).
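For reference, the core of a polling-based wrapper around the browser Gamepad API looks roughly like this (a minimal sketch - the button and axis mappings are controller-specific and assumed here):

// Sketch of polling the browser Gamepad API once per frame.
let gamepadIndex = null;

window.addEventListener("gamepadconnected", event => {
    gamepadIndex = event.gamepad.index;
});

function pollGamepad() {
    if (gamepadIndex !== null) {
        // getGamepads() must be re-read every frame - gamepad objects are snapshots
        const pad = navigator.getGamepads()[gamepadIndex];
        if (pad) {
            const [leftX, leftY] = pad.axes;             // analogue stick position
            const crossPressed = pad.buttons[0].pressed; // mapping assumed, not guaranteed
            // ...feed these into the camera / movement code here
        }
    }
    requestAnimationFrame(pollGamepad);
}

requestAnimationFrame(pollGamepad);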

To do that though I would need to have some other controllers to test with, as currently it's designed only for the PS4 dualshock controller I have on hand. Some time ago I initially purchased an Xbox 360 controller wanting something that worked out of the box with Linux, but it didn't work out so well so I ended up selling it on and buying a white PS4 dualshock controller instead (pictured below).

I'm really impressed with how well the PS4 dualshock works with Linux - it functions out of the box in the browser (useful test website) just fine, and even appears to have native Linux mainline kernel support, which is a big plus. The little touchpad on it is cute and helpful in some situations too, but most of the time you'd use a real pointing device.

A white PS4 dualshock controller.

(Above: A white PS4 dualshock controller.)

How does it fit in a browser anyway?!

Good question. The primary answer to this is the magic of esbuild: a magical build tool that packages your Javascript and CSS into a single file. It can also handle other associated files like images too, and on top of that it's suuuper easy to use. It tree-shakes by default, and is just all-around a joy to use.

Putting it to use resulted in my ~1.5K lines of code (wow, I thought it was more than that) along with ~300K lines in libraries being condensed into a single 4MiB .js and a 0.7KiB .css file, which I could serve to the browser along with the main index.html file. It's even really easy to implement subresource integrity, so I did that just for giggles.
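For anyone who hasn't used it before, a build script using esbuild's Javascript API can be as short as this (an illustrative sketch, not the project's actual src/esbuild.mjs - file names and options here are assumptions):

// Minimal example of esbuild's Javascript API.
import * as esbuild from "esbuild";

await esbuild.build({
    entryPoints: ["src/index.mjs"],
    bundle: true,        // follow imports and pack everything into one file
    minify: true,
    sourcemap: true,
    outfile: "dist/app.js",
    loader: { ".png": "file" }, // non-JS assets get handled too
});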

Datasets, an origin story

Using the Fetch API, I could fetch a pre-prepared dataset from the server, unpack it, and do cool things with it as described above. The dataset itself was prepared using a little Python script I wrote (source).

The script uses GloVe to vectorise words (I think I used 50 dimensions, since that's what fit inside my laptop at the time), and then UMAP (paper, good blog post on why UMAP is better than tSNE) to do dimensionality reduction down to 3 dimensions whilst still preserving global structure. Judging by the experiences we had on the day, I'd say it was pretty accurate, even if it was not always obvious why given words were related (more on why this is the case in a separate post).

My social media data, plotted in 2D with PCA (left), tSNE (centre), and UMAP (right). Points are blue against a white background, plotted with the Python datashader package.

(Above: My social media data, plotted in 2D with PCA (left), tSNE (centre), and UMAP (right). Points are blue against a white background, plotted with the Python datashader package.)

I like Javascript, but I had the code written in Python due to prior research, so I just used Python (looking now, there does seem to be a package implementing UMAP in JS, so I might look at that another time). The script is generic enough that I should be able to adapt it for other projects in the future to do similar kinds of analyses.

For example, if I were to look at a comparative analysis of e.g. language used by social media posts from different hashtags or something, I could use the same pipeline and just label each group with a different colour to see the difference between the 2 visually.

The data itself comes from 2 different places, depending on where you see the demo. If you were lucky enough to see it in person, then it's directly extracted from my social media data. The online one comes from page abstracts from various Wikipedia language dumps to preserve the privacy of the social media dataset, just in case.

With the data converted, the last piece of the puzzle is that of how it ends up in the browser. My answer is a gzip-compressed headerless tab-separated-values file that looks something like this (uncompressed, of course):

cat    -10.147051      2.3838716       2.9629934
apple   -4.798643       3.1498482       -2.8428414
tree -2.1351748      1.7223179       5.5107193

With the data stored in this format, it was relatively trivial to load it into the browser, decompressed as mentioned previously, and then display it with Babylon.js. There's also room here to expand and add additional columns later if needed, to e.g. control the colour of each point, label each word with a group, or something else.
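As a sketch of what that loading step can look like in the browser (assuming the server doesn't already decompress it transparently - the demo's real loader may well differ):

// Sketch: fetch the gzipped, headerless TSV and parse it into point objects.
async function loadPoints(url) {
    const response = await fetch(url);
    const decompressed = response.body.pipeThrough(new DecompressionStream("gzip"));
    const text = await new Response(decompressed).text();

    return text.trim().split("\n").map(line => {
        const [word, x, y, z] = line.split("\t");
        return { word, x: parseFloat(x), y: parseFloat(y), z: parseFloat(z) };
    });
}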

Conclusion

We've pulled the demo apart piece by piece, and seen at a high level how it's put together and the decisions I made while implementing it. We've seen how I implemented the graphics - aided by Babylon.js and a clever hack. I've explained how I optimised the location polling with an octree to achieve real-time performance, and how reusing textures is very important. Finally, we took a brief look at the dataset and where it came from.

In the next post, we'll take a look at how to monkeypatch an npm package and when you'd want to do so. In a later post, we'll look at the research behind the demo, what makes it tick, what I learnt while building and showing it off, and how that fits in with the wider field from my perspective.

Until then, I'll see you in the next post!

Edit 2023-11-30: Oops! I forgot to link to the source code....! If you'd like to take a gander at the source code behind the demo, you can find it here: https://github.com/sbrl/research-smflooding-vis

My Hull Science Festival Demo: How do AIs understand text?

Banner showing gently coloured point clouds of words against a dark background on the left, with the Humber Science Festival logo, fading into a screenshot of the attract screen of the actual demo on the right.

Hello there! On Saturday 9th September 2023, I was on the supercomputing stand for the Hull Science Festival with a cool demo illustrating how artificial intelligences understand and process text. Since then, I've been hard at work tidying that demo up, and today I can announce that it's available to view online here on my website!

This post is a general high-level announcement post. A series of technical posts will follow on the nuts and bolts of both the theory behind the demo and the actual code itself and how it's put together, because it's quite interesting and I want to talk about it.

I've written this post to serve as a foreword / quick explanation of what you're looking at (similar to the explanation I gave in person), but if you're impatient you can just find it here.

All AIs currently developed are essentially complex parametrised mathematical models. We train these models by updating their parameters little by little until the output of the model is similar to some ground truth label.

In other words, an AI is just a bunch of maths. So how does it understand text? The answer to this question lies in converting text to numbers - a process often called 'word embedding'.

This is done by splitting an input sentence into words, and then individually converting each word into a series of numbers, which is what you will see in the demo at the link below - just converted with some magic to 3 dimensions to make it look fancy.

Similar sorts of words will have similar sorts of numbers (or positions in 3D space in the demo). As an example here, at the science festival we found a group of footballers, a group of countries, and so on.
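If you're curious what that looks like in practice, here's a toy example with completely made-up numbers (real embeddings have tens or hundreds of dimensions, not 3):

// Toy example with made-up numbers: similar words get similar vectors.
const toyEmbeddings = {
    cat:    [0.9, 0.1, 0.2],
    dog:    [0.8, 0.2, 0.2],
    banana: [0.1, 0.9, 0.7],
};

// Euclidean distance: smaller = more similar
function distance(a, b) {
    return Math.sqrt(a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0));
}

console.log(distance(toyEmbeddings.cat, toyEmbeddings.dog));    // small → similar
console.log(distance(toyEmbeddings.cat, toyEmbeddings.banana)); // large → dissimilar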

In the demo below, you will see clouds of words processed from Wikipedia. I downloaded a bunch of page abstracts for Wikipedia in a number of different languages (source), extracted a list of words, converted them to numbers (GloVe → UMAP), and plotted them in 3D space. Can you identify every language displayed here?


Find the demo here: https://starbeamrainbowlabs.com/labs/research-smflooding-vis/

A screenshot of the initial attract screen of the demo. A central box allows one to choose a file to load, with a large load button directly beneath it. The background is a blurred + bloomed screenshot of a point cloud from the demo itself.

Find the demo here: https://starbeamrainbowlabs.com/labs/research-smflooding-vis/


If you were one of the lucky people to see my demo in person, you may notice that this online demo looks very different to the one I originally presented at the science festival. That's because the in-person demo uses data from social media, but this one uses data from Wikipedia to preserve privacy, just in case.

I hope you enjoy the demo! Time permitting, I will be back with some more posts soon to explain how I did this and the AI/NLP theory behind it at a more technical level. Some topics I want to talk about, in no particular order:

  • General technical outline of the nuts and bolts of how the demo works and what technologies I used to throw it together
  • How I monkeypatched Babylon.js's gamepad support
  • A detailed and technical explanation of the AI + NLP theory behind the demo, the things I've learnt about word embeddings while doing it, and what future research could look like to improve word embeddings based on what I've learnt
  • Word embeddings, the options available, how they differ, and which one to choose.

Until next time, I'll leave you with 2 pictures I took on the day. See you in the next post!

Edit 2023-11-30: Oops! I forgot to link to the source code....! If you'd like to take a gander at the source code behind the demo, you can find it here: https://github.com/sbrl/research-smflooding-vis

A photo of my demo up and running on a PC with a PS4 controller on a wooden desk. An Entroware laptop sits partially obscured by a desktop PC monitor, the latter of which has the demo full screen.

(Above: A photo of my demo in action!)

A photo of some piles of postcards arranged on a light wooden desk. My research is not shown, but visuals from other researchers' projects are printed, from microbiology to disease research to jellyfish galaxies.

(Above: A photo of the postcards on the desk next to my demo. My research is not shown, but visuals from other researchers' projects are printed, with everything from microbiology to disease research to jellyfish galaxies.)

I've submitted a paper on my rainfall radar research to NLDL 2024!

A screenshot of the nldl.org conference website.

(Above: A screenshot of the NLDL website)

Hey there! I'm excited that last week I submitted a paper to what I hope will become my very first conference! I've attended the AAAI-22 doctoral consortium online, but I haven't had the opportunity to attend a conference until now. Of course, I had to post about it here.

First things first, which conference have I chosen? With the help of my supervisor, we chose the Northern Lights Deep Learning Conference. It's relatively close to the UK (where I live), it's relevant to my area and the paper I wanted to submit (I've been working on the paper since ~July/August 2023), and the deadline wasn't too tight. There were a few other conferences I was considering, but they either had really awkward deadlines (sorry, HADR! I've missed you twice now), or got moved to an unsafe country (IJCAI → China).

The timeline is roughly as follows:

  • ~early - ~mid November 2023: acceptance / rejection notification
  • somewhere in the middle: paper revision time
  • 9th - 11th January 2024: conference time!

Should I get accepted, I'll be attending in person! I hope to meet some cool new people in the field of AI/machine learning and have lots of fascinating discussions about the field.

As longtime readers of my blog here might have guessed, the paper I've submitted is on my research using rainfall radar data and abusing image segmentation to predict floods. The exact title is as follows:

Towards AI for approximating hydrodynamic simulations as a 2D segmentation task

As the paper is unreviewed, I don't feel comfortable with releasing it publicly yet. However, feel free to contact me if you'd like to read it and I'm happy to hand out a copy of the unreviewed paper individually.

Most of the content has been covered quite casually in my phd update blog post series (16 posts in the series so far! easily my longest series by now), just explained in formal language.

This paper will also form the foundation of the second of two big meaty chapters of my thesis, the first being based on my social media journal article. I'm currently at 80 pages of thesis (including appendices, excluding bibliography, single spaced a4), and I still have a little way to go before it's done.

I'll be back soon with another PhD update blog post with more details about the thesis writing process and everything else I've been up to over the last 2 months. I may also write a post on the Hull Science Festival, which I'll be attending on the supercomputing stand with a Cool Demo™, 'cause the demo is indeed very cool.

See you then!

How to read a paper

So you've got a paper. Maybe even a few papers. Okay, it's a whole stack of them and you don't have the time to read them all (they do have a habit of multiplying when you're not looking). What is one to do? I've had this question asked of me a few times, so I thought I'd write up a quick post to answer it, organise my thoughts, and explain my personal process for sorting through and reading scientific papers (I generally find regular 'news'papers to be of questionable reliability, lacking in depth, and just not worth the effort).

A bunch of papers

(A bunch of papers I've read.... and one that I've written.)

Finding papers

If you are in a position where you don't have any papers to begin with, then search engines are your best friend. Just like DuckDuckGo, Ecosia, and others provide an interface to search the web, there are special search engines designed to search for scientific papers. The two main ones I suggest are:

Personally, Semantic Scholar is my paper search engine of choice. Enter some general search terms for the field / thing you want to read about, and relevant papers will be displayed. It can be useful to change the sort order from relevance to citation count or most influential papers to get a look at what are likely to be the seminal papers in that field (i.e. the ones that first introduced a thing - e.g. how the Attention Is All You Need paper first introduced the transformer) - though they may be less relevant.

The other nice feature these search engines have is copying out BibTeX to paste into your bibliography in LaTeX (see also the LaTeX templates I maintain for reports/papers/dissertations/theses).
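
For illustration, a copied entry looks something like this (hand-written here rather than copied from any particular search engine, so treat the exact fields as an example only):

```bibtex
@inproceedings{vaswani2017attention,
  title     = {Attention Is All You Need},
  author    = {Vaswani, Ashish and others},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2017}
}
```

Once that's in your .bib file, citing it in the LaTeX source is just a matter of \cite{vaswani2017attention}.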

A note on reliability: Papers on preprint servers like arXiv have not been peer reviewed. Avoid these unless there's no other option.

Sorting through them

So you know how to find papers now, but how do you actually read them? Personally, I use a tiered system for this.

Reading the abstract: Firstly, I'll read the abstract. Just like you read the title of a search result to decide whether you want to click on it, I read the abstract to decide whether a paper is worth my time to read in full.

Sometimes I'll stop there. Maybe the paper isn't what I thought it was, or I've simply got all the information I need from it. The latter is most common when I'm writing some paper or report: often I'll need a paper as a reliable source for something, and I won't need to read the whole paper to know that it has the information I need.

Okay, so suppose a paper passes a quick look at the title and abstract, and I want to go deeper. You'd think it's time to jump right in and read it from top to bottom, but you'd be wrong. Reading an entire paper in detail is significantly time-consuming, and I want to be really sure it's worth the effort before I commit to it.

Skim reading: The next test is a quick skim read. If it's a journal article, there might be some key contributions listed at the top of the paper - these are a good place to start. If not, then they can often be found at the end of the introduction - this goes for conference papers too. The introduction is usually my second stop (though remember I'm still not reading it word for word yet), followed by the end of the results/experimental discussion section to understand the key points of what they did and how it went for them.

AI summarisation: Another option if a paper is dense and/or long is to use an AI summarisation tool. These must always be taken with a grain of salt, but can help to direct my search when I'm having difficulty extracting a specific piece of information. AI summarisation can also be a good start if an abstract is bad or missing the information I want but the subject itself is interesting. I often find AI-generated summaries can be quite generic, so it's not a complete solution.

A note on ChatGPT: ChatGPT is a generic language model, and as such isn't ideal for generating summaries of documents. It's best to use a model specifically trained for this purpose, and to take any output you get with a grain of salt.

AI document discussion: Occasionally the abstract of a paper suggests that it contains a significantly interesting nugget of information I'm interested in acquiring (again, most often when writing a paper rather than during initial research), but the paper is long, or dense, or I'm having difficulty finding the nugget in question - or some combination of the three.

This is where AI-driven document discussion can be invaluable. As I noted earlier, AI-generated summaries tend to be quite generic, so it's not great if there's something highly specific I'm after. The only place I'm currently aware of that ships this feature in a useful form is Kagi, a paid-for search engine with AI features (document summarisation and document discussion) built-in. I'm sure others have shipped the feature, but I haven't seen them yet.

Essentially, AI-driven document discussion is where you ask a natural language question about the target paper, and it does the reading comprehension for you by answering your question with useful quotes from the paper. Then once you have the answer you can go and look at that specific part of the paper (use your browser's find tool) to get additional context.

I've found this to be a great time saver. It can also be useful if I'm unsure if a paper actually talks about the thing I'm interested in or not.

Kagi: Specifically, Kagi (my current main search engine) implements both of the aforementioned features. They can be accessed via the Discuss Document option next to search results, or by dedicated !bangs (Kagi implements all of DuckDuckGo's !bangs too), which are significantly helpful as I touched on above. There's a quick example after the list below.

  • AI summarisation: !sum <url_of_paper_or_webpage>
  • AI discuss document: !discuss <url_of_paper_or_webpage>
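
For example, to summarise or discuss a paper at a given URL, you'd type something like the following straight into the search box (the URL here is just a placeholder):

```
!sum https://example.com/path/to/paper.pdf
!discuss https://example.com/path/to/paper.pdf
```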

A disclaimer: I have received no money or other forms of compensation for mentioning Kagi here. Kagi have not asked me to mention them here at all - I just think their product is helpful, useful, produces good search results, and saves me time. AI models can be computationally expensive, so I speculate it would be difficult to find a free version without strings attached.

A screenshot of a sample discuss document discussion about the paper Attention is all you need.

(Above: A screenshot of a sample discuss document discussion about the paper Attention is all you need)

How to read a paper effectively

So a paper has somehow made it through all of those steps unscathed, and yet I still haven't extracted everything I want to know from it. By this point it must be a significantly interesting paper that I likely want lots of details from.

The process of actually reading a paper from top to bottom is an inherently time-consuming one: hence all the other steps above to filter papers out with minimal effort before I commit to spending what is typically an hour or more of my time on a single paper.

My general advice is to do a re-read of the abstract to confirm, and then start with the introduction and make your way down. Take it slow.

Making notes: When I do read a paper, I always make notes while doing so. Having 2 monitors is also helpful, as I can make notes on one and have the paper open on the other. My current tool of choice here is Obsidian, a fabulous note-taking system that I'll wholeheartedly recommend to everyone. It's Markdown-based and has a tagging system (nested tags are supported too!) to keep papers organised. The directed graph and canvas features are also pretty cool. The general template I use at the moment for making notes on papers is as follows:

---
tags: some, tags/here
---

> - URL: <https://example.com/paper_url_here/doi_if_possible.pdf>
> - Year: YEAR_PAPER_WAS_PUBLISHED

- Bulleted notes go here
    - I nest bullet points based on the topic
        - To as many levels as needed
    - These notes are very casual
- [I contain my own thoughts in square brackets]
    - This keeps the things that the paper says separate from the things that I think about it
- Sometimes if I'm making a lot of notes I'll split them up into sections derived from the paper


## PDF
The last section contains the PDF of the paper itself. Obsidian supports dragging and dropping PDFs in, and it also has a dedicated PDF viewer.

Complete with an explanation of what each section is for!
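
To make that more concrete, here's a hypothetical and heavily abbreviated note following this template for the Attention Is All You Need paper - the tags and bullet content are purely illustrative, not my actual notes:

```markdown
---
tags: papers/nlp, papers/architectures
---

> - URL: <https://arxiv.org/abs/1706.03762>
> - Year: 2017

- Introduces the transformer architecture
    - Drops recurrence entirely in favour of (multi-head) self-attention
    - Positional encodings are added to the embeddings, since attention alone has no sense of token order
- [Wonder how well this transfers to non-text inputs - follow up later]

## PDF
```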

You don't have to use Obsidian (it's just the best one I've found), but I strongly recommend making notes while you read a paper. This way you have some distilled notes in your own words to refer back to later. It also helps to further your own understanding of the topic of a paper by putting it into your own words. Other tools I'm aware of include OneNote and QOwnNotes (I still use the latter for making notes in meetings and recording random stuff that's not necessarily related to research - I keep Obsidian quite focused atm).

Make sure these notes are digital. You'll thank me later. The number of times I've used Obsidian's search function to find the notes I made about a specific paper is absolutely unreal. Over time you'll get a good sense of what you need to make notes on: enough that you avoid having to refer back to the paper again later, but not so much that finding something in your notes takes longer than hunting through the source paper would have.

A screenshot of my Obsidian workspace.

(Above: A screenshot of my Obsidian workspace.)

Sometimes your research project will change direction, and the notes you made are suddenly less relevant. Or you've learned something elsewhere and now come back with fresh and more experienced eyes. I often update the notes I took initially to add more information, or references to other related papers that go together.

Continual evaluation: As I read, I'm continually evaluating in the back of my mind whether it's worth continuing to read. I'm asking questions like "is this paper going on a tangent?", and "is the solution to their problem the researchers employed actually interesting to me?", and "is this paper getting too dense for me to understand?", and "is the explanation the paper gives actually intelligible?" (yes, papers do vary in explanatory quality). If the exercise of reading a paper becomes not worth the time, stop reading it and move on.

Sometimes it's worth jumping into skim-reading mode for a bit if something seems irrelevant, to see if it gets better later on.

But I don't understand something!

This is a normal part of reading a paper. This can be for a number of reasons:

  1. The paper is bad
  2. The paper is good, but is terrible at explaining things
  3. The paper contains more maths than explanation of the variables contained therein
  4. I'm lacking some prerequisite knowledge that the paper doesn't properly explain
  5. Some other issue

It is not always obvious which of these cases I find myself in when I encounter difficulty reading a paper. Nevertheless, I employ a number of strategies to deal with the situation:

  • Reading around: As in most things, reading around the area of the paper that is causing an issue may yield additional information. Sometimes returning to the related works / background / approach section can help.
  • Search for related papers: There are many papers that have been written, so it can be worth going looking for a related one. It might be a better paper, or be worded differently in a way that makes it easier to understand.
  • Look through the paper's references: This can also be a good way to trace back to the source of an idea. Semantic scholar's References tab below the abstract lists all the references too, and the related works section of a paper will tell you how each cited work is relevant to the problem, motivations, and subsequent method and results thereof.
  • Look for seminal papers: See above. Finding the original paper on a given idea can help a lot, as it's often explained in much more detail than later papers that assume you've read the so-called seminal work.
  • Web search: For specific terms or concepts. Sometimes just a quick definition is needed. Other times it's more substantial and requires reading an entire separate blog post - compare Attention Is All You Need with the blog post The Illustrated Transformer. Each provides a different perspective, and in this case I actually read both at the same time to fully understand the topic. Make sure you properly assess anything you find for reliability as usual.

Supervision: It's very unlikely that after all of these steps I'll still be stumped on how to proceed, but it has happened. In these situations it can be extremely helpful to have someone more experienced in the field to discuss things with. For me, this is my PhD supervisor Nina.

Whoever they are, keeping in regular contact as you work through a project is best. Frequency varies, but for my PhD supervision meetings have fallen somewhere between 1 and 3 weeks apart, and each meeting is no less than an hour long. Their advice and insight can guide your efforts as you progress through a research project.

They will also likely be busy people, so make sure you properly prepare before meeting them. Summarise what you've read and how it relates to your project and what you want to do. Make a list of questions that you want to ask them. Gather your thoughts. This will help you make the most of your discussion with them.

Conclusion

I've outlined the personal process I employ when reading a paper (in perhaps more detail than was necessary). It's designed to save me time and allow me to cover ground relatively quickly (though quickly is still a relative term: in the worst case, with a completely new and broad field, it can take weeks of reading to gain a good understanding of it).

This is my process: you need to find something that works for you. It's okay if this takes time. Maybe lots of time... but you'll get there in the end. The more you read, the more you'll get an instinctive sense of the stuff I ramble about here. My method isn't perfect either - I'm still learning, so my process will likely evolve over time.

If you've got any comments or questions, do leave them in the comments section below and I'll do my best to answer them.

Art by Mythdael