I’m 50 days away from thesis submission, 182 days into my master’s project. I’ve mentioned in a few previous posts that my project is on deforestation, but have never really written about it, and since I’m approaching the time for more writing, I thought I should finally get around to it.
This project isn’t my first computational one, since the two projects I did in my third year at Cambridge also involved using available data and computational modelling. The huge difference, though, is that those projects were only about two to three months long each, and there honestly isn’t much that one can do in such a short time (we were advised to spend no more than 70 hours on each project, but I reckon I spent at least 120 hours…), while I’ve got 8+ months to deal with this one. Those projects also had to be done on top of lectures and essays and supervisions, whereas I can concentrate fully on this project and really go deeper, but even so, I wish I had more time. My other experiences with longer research projects were from secondary school (15-16 years old) and junior college (17-18 years old) and involved field work, with minor amounts of data processing and analysis towards the end; these kinds of ecology projects were what I was more familiar with. So despite having had some (very limited) experience with computational projects previously, this project has really been an eye-opener for me, and I got a glimpse of what the other end of the spectrum of ecological studies is really like. I’ll also admit that, though I really enjoy field work and have always looked to do field studies, I am hooked by the sheer amount one can do with computational studies.
Over the few years that I’ve been in the UK, I’ve been increasingly drawn into the computational world, finding data analysis and visualisation fascinating, and coding and programming pretty amazing. Perhaps it was the dawning realisation that I’m not particularly good at species identification (compared to peers of my age), and that while I’m nowhere near good at programming (compared to everyone else in the world), I can still get good at it. And it’s broadly applicable to a variety of fields.
My project involves identifying global deforestation hotspots, modelling the drivers and projecting future deforestation hotspots. I used a freely available global dataset of forest cover and forest loss from 2000 to 2014, and processed that to obtain a map of deforestation hotspots. Between downloading the dataset and plotting a map of deforestation hotspots, a lot of my time was spent considering the definition of a forest, figuring out how best to calculate deforestation and what a deforestation hotspot meant, and double-checking my code and visualising my data to ensure it was giving the right output. I also did a literature search to inform my choice of explanatory variables for the modelling, then obtained and processed the relevant data to get them into a suitable format for modelling.
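To make those three questions concrete (what counts as forest, how to calculate deforestation, what makes a hotspot), here is a toy sketch in Python. It is not my actual workflow, which was done in R on global rasters. The 2x2 grids, the 30% tree-cover threshold and the "top quarter of forest cells" hotspot rule are all made-up assumptions, just to show how much the answer depends on these choices.

```python
# Toy illustration only: hypothetical grids and thresholds, not real data.

def loss_fraction(cover_2000, lost, forest_threshold=30.0):
    """Fraction of each cell's year-2000 tree cover that was lost.

    cover_2000 and lost are grids (lists of rows) of percentages of cell area.
    Cells below forest_threshold are treated as non-forest and get None.
    """
    result = []
    for cover_row, lost_row in zip(cover_2000, lost):
        row = []
        for cover, lost_pct in zip(cover_row, lost_row):
            if cover < forest_threshold:
                row.append(None)  # not a forest under this definition
            else:
                # Loss cannot exceed the cover that was there to begin with.
                row.append(min(lost_pct, cover) / cover)
        result.append(row)
    return result


def hotspots(fractions, top_share=0.25):
    """Flag forest cells whose loss fraction is in the top share of forest cells."""
    values = sorted(
        (v for row in fractions for v in row if v is not None), reverse=True
    )
    if not values:
        return [[False] * len(row) for row in fractions]
    n_top = max(1, round(len(values) * top_share))
    cutoff = values[n_top - 1]
    return [
        [v is not None and v > 0 and v >= cutoff for v in row]
        for row in fractions
    ]


# Hypothetical % tree cover in 2000 and % of cell area lost by 2014.
cover = [[80.0, 10.0], [60.0, 90.0]]
lost = [[40.0, 5.0], [6.0, 9.0]]

fractions = loss_fraction(cover, lost)
flags = hotspots(fractions)
```

Changing the forest threshold or the hotspot share changes which cells get flagged, which is exactly why so much of the time went into thinking rather than coding.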
There has been a fair amount of wading through statistics and math, trying to understand the theories and the logic behind some numbers and values, instead of just running the relevant function and accepting the number that magically appears. I tried not to be daunted by the tonnes of equations (I’m someone whose mind automatically blanks when I see an equation, rather than trying to engage with it), by the fact that I didn’t quite understand most of it, and by the knowledge that some modelling entails rather high-level computational skills. I have accepted that there are some things I will probably not be able to fully grasp at this stage, given the lack of time and resources, and I’ve made my peace with that by telling myself I will look into them later. Most of my time is probably spent Googling and reading up on statistics, R package vignettes, and specific functions in packages.
I have learned how to download files and run Python scripts using bash in Terminal on my Mac (though not much else apart from that). I’ve been using R extensively, mainly for manipulating raster objects, but also for data frames, plotting, saving output etc. I’m still not great at it, and could definitely improve on making my code neater, faster and more legible, but it’s a marked step up from before. I’m also trying to write in LaTeX and use GitHub for version control, but I’m still a long way from incorporating those into my workflow.
I realised, though, that doing a computational project did not mean I only picked up computational skills. The actual running of code and processing of data seldom took much time; as mentioned earlier, time was mostly spent thinking, reading, and troubleshooting (or just managing my data and writing ReadMe files to keep track of what I’ve done). But during the occasional periods when my computer was doing my work for me, I tried to hone my other skills: articulating a rabbit skeleton I picked up during my Pennine Way hike (still a work in progress), drawing, coming up with graphics using Inkscape, creating a Tumblr for Silwood Park, cooking, and blanching vegetables. Though developing these skills is perhaps more an artefact of being at Silwood Park than of doing a computational project…
I did this one-year master’s partly because I wasn’t sure if going into a PhD and research was what I really wanted to do. I did have the occasional existential crisis and questioned my research skills and interests, but overall, I have quite enjoyed working on my project and do want to get deeper into some of the topics. And so now, alongside finishing up my project and writing my thesis, PhD hunting is underway…