PhD Update 18: The end and the beginning
Hello! It has been a while. Things have been most certainly happening, and I'm sorry I haven't had the energy to update my blog here as often as I'd like. Most notably, I submitted my thesis last week (gasp!)! This does not mean the end of this series though - see below.
Before we continue, here's our traditional list of past posts:
Since last time, the persuasive tactics detection challenge has ended too, and we have a paper going through at the moment: BDA at SemEval-2024 Task 4: Detection of Persuasion in Memes Across Languages with Ensemble Learning and External Knowledge.
Theeeeeeeeeeeeesis
Hi! A wild thesis appeared! Final counts are 35,417 words, 443 separate sources, 167 pages, and 50 pages of bibliography - making that 217 pages in total. No wonder it took so long to write! I submitted at 2:35pm BST on Friday 10th May 2024.
I. can. finally. rest.
It has been such a long process, and it has taken a lot of energy to complete, especially since writing large amounts of formal academic prose isn't usually my thing. I would like to extend a heartfelt thanks especially to my supervisor for being there from beginning to end and beyond to support me through this endeavour - and everyone else who has helped out in one way or another (you know who you are).
Next step is the viva, which will be some time in July. I know who my examiners are going to be, but I'm unsure whether it would be wise to say here. Between now and then, I want to stalk, I mean investigate, my examiners' research histories, which should give me an insight into their perspective on my research.
Once the viva is done, I expect to have a bunch of corrections to do. Once those are completed, I will to the best of my ability be releasing my thesis for all to read for free. I still need to talk to people to figure out how to do that, but rest assured that if you can't get enough of my research via the papers I've written for some reason, then my thesis will not be far behind.
Coming to the end of my PhD and submitting my thesis has been surprisingly emotionally demanding, so I thank everyone who is still here for sticking around and being patient as I navigate these unfamiliar events.
Researchy things
While my PhD may be coming to a close (I still can't believe this is happening), I have confirmed that I will have dedicated time for research-related activities. Yay!
This means, of course, that as one ending draws near, a new beginning is also starting. Today's task after writing this post is to readificate around my chosen idea to figure out where there's a gap in existing research for me to make a meaningful contribution. In a very real way, it's almost like I am searching for directions as I did in my very first post in this series.
My idea is connected to the social media research that I did previously on multimodal natural language processing of flooding tweets and images with respect to sentiment analysis (it sounded better in my head).
Specifically, I think I can do better than just sentiment analysis. Imagine an image of a street that's partially underwater. Is there a rescue team on a boat rescuing someone? What about the person on the roof waving for help? Perhaps it's a bridge that's about to be swept away, or a tree that has fallen down? Can we both identify these things in images and map them to physical locations?
Existing approaches to e.g. detect where the water is in the image are prone to misidentifying water that is in fact where it should be for once, such as in rivers and lakes. To this end, I propose looking for the people and things in the water rather than the water itself, and going for a people-centred approach to flood information management.
I imagine that I'll probably use data from social media I already have (getting a hold of new data from social media is very difficult at the moment) - filtered for memes and misinformation this time. That said, if you know of any relevant sources of data or datasets, I'm absolutely interested, so please get in touch. It would be helpful but not required if it's related to a specific natural disaster event (I'm currently looking at floods; branching out to others is absolutely possible and on the cards, but I will need to submit a new ethics form for that before touching any data).
Another challenge I anticipate is that of unlabelled data. It is often the case that large volumes of data are generated during an unfolding natural disaster, and processing it all can be a challenge. To this end, somehow I want my approach here to make sense of unlabelled images. Of course, generalist foundational models like CLIP are great, but lack the ability to be specific and accurate enough with natural disaster images.
I also intend that this idea would be applicable to images from a range of sources, and not just with respect to social media. I don't know what those sources could be just yet, but if you have some ideas, please let me know.
Finally, I am particularly interested if you or someone you know is in any way involved in natural disaster management. What kinds of challenges do you face? Would this be in any way useful? Please do get in touch either in the comments below or by sending me an email (my email address is on the homepage of this website).
Persuasive tactics challenge
The research group I'm part of were successful in completing the SemEval Task 4: Multilingual Detection of Persuasion Techniques in Memes! I implemented the 'late fusion engine', which is a fancy name for an algorithm that uses basic probability to combine categorical predictions from multiple different models, depending on how accurate each model was on a per-category basis.
I'm unsure of the status of the paper, but I think it's been through peer-review so you can find that here: BDA at SemEval-2024 Task 4: Detection of Persuasion in Memes Across Languages with Ensemble Learning and External Knowledge.
I wasn't the lead on that challenge, but I believe the lead person on that project will be going to Mexico to present it (they're a friend of mine - if you are reading this and want me to link to somewhere here, get in touch).
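For the curious, here's a rough idea of what late fusion can look like in code. To be clear, this is a minimal sketch and not the actual engine from the paper: I'm assuming here that each model gives a probability per category, that each model's per-category accuracy was measured on some validation data, and that the fused score is simply an accuracy-weighted average. The function names, category names, and numbers are all made up for illustration.

```python
from typing import Dict, List

# Maps a category name to a value between 0 and 1.
Scores = Dict[str, float]


def fuse_scores(predictions: List[Scores], accuracies: List[Scores]) -> Scores:
    """Combine per-category probabilities from several models.

    Each model's vote for a category is weighted by how accurate that
    model was on that category (hypothetical validation accuracies).
    """
    if len(predictions) != len(accuracies):
        raise ValueError("Need exactly one accuracy table per model")

    categories = sorted({cat for pred in predictions for cat in pred})
    fused: Scores = {}
    for category in categories:
        weighted_sum = 0.0
        weight_total = 0.0
        for pred, acc in zip(predictions, accuracies):
            if category not in pred:
                continue  # this model has no opinion on this category
            weight = acc.get(category, 0.0)
            weighted_sum += weight * pred[category]
            weight_total += weight
        # If no model has any weight for this category, fall back to 0.
        fused[category] = weighted_sum / weight_total if weight_total > 0 else 0.0
    return fused


def decide(fused: Scores, threshold: float = 0.5) -> Dict[str, bool]:
    """Turn fused scores into yes/no labels (threshold is an assumption)."""
    return {category: score >= threshold for category, score in fused.items()}


# Two made-up models and two made-up categories.
predictions = [
    {"loaded_language": 0.9, "smears": 0.2},
    {"loaded_language": 0.3, "smears": 0.6},
]
accuracies = [
    {"loaded_language": 0.8, "smears": 0.4},
    {"loaded_language": 0.2, "smears": 0.6},
]

fused = fuse_scores(predictions, accuracies)
labels = decide(fused)
print(fused)
print(labels)
```

In this toy example the first model is trusted more for one category and the second model more for the other, so neither model gets to dominate the final answer across the board.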
Teaching
I'm still not sure what I can say and what I can't, but starting in September I have been asked to teach a module on basic system administration skills. It's a rather daunting prospect, but I have a bunch of people much more experienced than me to guide me through the process. At the moment the plan is for 21 lecture-ish things, 9 labs, and the assessment stuff, so I'm rather nervous about preparing all of this content.
Of course, as a disclaimer nothing written in this section should be taken as absolute. (Hopefully) more information at some point, though unfortunately I doubt that I would be allowed to share the content created given it's University course material.
As always though, if there's a specific topic that lies anywhere within my expertise that you'd like explaining, I'm happy to write a blog post about it (in my own time, of course).
Conclusion
We've taken a little look at what has been going on since I last posted, and while this post has been rather talky (will try for some kewl graphics next time!), I hope this has nonetheless been an interesting read. I've submitted my thesis, started initial readificating for my next research project (the ideas for which we've explored here), helped out with a group research challenge project thingy, and been invited to do some teaching!
Hopefully the next post in this series will come out on time - long-term the plan is to absolutely continue blogging about the research I'm doing.
Until next time, the journey continues!
(Oh yeah! and finally finally, to the person who asked a question by email about this old post (I think?), I'm sorry for the delay and I'll try to get back to you soon.)