When we produced our last article for Waterwise we were about 3 weeks into the COVID-19 lockdown and just starting to hear anecdotal evidence about household consumption increasing and business use decreasing. In this newsletter we examine some of the evidence we are starting to see, think about the drivers behind the changes, and consider what this means for the future. It also gives us an opportunity to show off some neat data visualisations.
To explore what has been happening to consumption since the COVID-19 lockdown started on 23rd March 2020, we are going to look at three sets of data: aggregated demand data from about 200 predominantly domestic DMAs (district metered areas), aggregated consumption data from about 1000 non-households, and Google Trends data. The data is anonymous, and the household and non-household data come from different regions of the country, but it is consistent with what has been going on around the UK.
So, let us start with household consumption, illustrated in Figure 1, which covers June 2019 through to 7th June 2020. The figure is a spectrogram, a visualisation that shows how water use varies across time. It allows the viewer to see trends in night use, morning peak use, and afternoon and evening use at a high level. In the spectrogram the x-axis shows individual days throughout the period, and the y-axis shows the hour of the day, from just after midnight at the bottom to midnight at the top. Demand, or aggregate flow, is shown by the colour scale in litres/property/hour (low flows are dark, high flows are bright). The data has been corrected for the BST/GMT time change, so the time plotted represents the actual clock time that consumers would respond to.
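The layout described above can be sketched in a few lines of Python. This is only an illustration, not the method used to produce Figure 1: it assumes the readings are stored as UTC timestamps with a flow in litres/property/hour, and the function names (`to_uk_clock_time`, `hour_by_day_grid`) and the two sample readings are made up for the example. The clock correction uses the rule that BST runs from 01:00 UTC on the last Sunday of March to 01:00 UTC on the last Sunday of October.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def last_sunday_0100_utc(year, month):
    """01:00 UTC on the last Sunday of a 31-day month (March or October)."""
    d = datetime(year, month, 31, 1, tzinfo=timezone.utc)
    # weekday(): Monday=0 ... Sunday=6, so step back to the nearest Sunday.
    return d - timedelta(days=(d.weekday() + 1) % 7)


def to_uk_clock_time(ts_utc):
    """Shift an aware UTC timestamp to UK clock time (GMT or BST)."""
    bst_start = last_sunday_0100_utc(ts_utc.year, 3)
    bst_end = last_sunday_0100_utc(ts_utc.year, 10)
    offset = timedelta(hours=1) if bst_start <= ts_utc < bst_end else timedelta(0)
    return (ts_utc + offset).replace(tzinfo=None)


def hour_by_day_grid(readings):
    """Build the spectrogram grid from (utc_timestamp, flow) pairs.

    Returns (days, grid): days is a sorted list of local dates (x-axis) and
    grid[hour][i] is the mean flow in that clock hour on days[i] (y-axis is
    hour 0 to 23), or None where there is no data.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ts_utc, flow in readings:
        local = to_uk_clock_time(ts_utc)
        key = (local.date(), local.hour)
        totals[key] += flow
        counts[key] += 1
    days = sorted({day for day, _ in counts})
    grid = [
        [totals[(day, hour)] / counts[(day, hour)] if (day, hour) in counts else None
         for day in days]
        for hour in range(24)
    ]
    return days, grid


# Two made-up readings, both at 07:30 UTC: one in winter (GMT), one in summer (BST).
readings = [
    (datetime(2020, 1, 15, 7, 30, tzinfo=timezone.utc), 20.0),
    (datetime(2020, 6, 1, 7, 30, tzinfo=timezone.utc), 26.0),
]
days, grid = hour_by_day_grid(readings)
```

With these sample readings the winter value lands in the 07:00 row and the summer value in the 08:00 row, which is the clock-time alignment the correction is meant to give. The grid can then be passed to any plotting library that draws a matrix as a colour image, with hour 0 at the bottom.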
In Figure 1 you can see the regular pattern of morning consumption on weekdays and weekends before the lockdown, and how the morning peak is diffused during school holiday periods, e.g. July and August, and the Xmas and New Year period. The COVID-19 lockdown starts at the end of March 2020, and immediately the strong morning peak pattern changes, with the morning peak starting later in the day. In addition, with the hot dry weather in April and May, there is high water use through the day and evening, at levels higher than the peak in the summer of 2019 (which can be seen at the left-hand side of the plot).
Figure 1 Aggregate consumption from 200,000 households from June 2019 to June 2020
Clearly consumption has increased and, in this case, the peak daily consumption at the end of May is about 35% higher than it was pre-lockdown. But we also see changes in the daily profile. In ‘normal times’, we see the highest daily peak during the morning (around 8am on a weekday), but post lockdown we see the evening peak often being higher than the morning peak. The weekday morning peak is also later, between 9 and 10am. Clearly some of these changes in consumption are driven by changes in behaviour from working at home and not needing to get children up and ready for school in the mornings. But there are other drivers for the changes to consumption, and we will return to these later.
Next, we turn our attention to non-household use, which has changed dramatically due to businesses shutting during lockdown or people working from home, with only essential premises remaining open. Figure 2 shows a similar spectrogram for the aggregated consumption of 1000 non-households. This shows data from April 2019 through to mid-April 2020. Again, there is a regular weekly pattern to the consumption, with a peak between 11am and 3pm. There is a slight reduction visible during the August school holiday period, and a much clearer reduction in consumption over the Xmas and New Year period. Towards the right-hand side of the graph the impact of the COVID-19 lockdown can be clearly seen.
Figure 2 Aggregate consumption from 1,000 non-households from April 2019 to mid-April 2020
Immediately after the 23rd March, consumption drops significantly. There is still some consumption from 7am through to about 4pm, but after this there is very little consumption evident. No doubt there has been a huge reduction in consumption from the hospitality, entertainment and retail sectors. We are still analysing this non-household data and hope to publish more on this soon.
We now return to household consumption (Figure 1) and the possible drivers for the increase. This is a challenging area to unpick, as there is so much going on. Obviously, there are more people at home during the day, including adults working from home or furloughed (or home early from university), and school-aged children, also mostly at home. There are changes in water use such as more handwashing (the impact is difficult to quantify, but we know it will have an effect). Then, soon after lockdown, there was a mini heatwave, with hotter temperatures than normal during April and May 2020, and extremely dry weather through March, April and May 2020. Based on our work on the 2018 summer peak, we expect this to result in an increase in outside use. So, we did some analysis using Google Trends, looking for increased interest in paddling pools, hosepipes and sprinklers. This is presented in Figure 3, alongside the increase in household consumption.
Figure 3 Analysis from Google Trends data on search interest in outside water use products
In Figure 3 the units are normalised between 0 and 100 for each of the individual graphs, i.e. they cannot be directly compared in terms of scale, but can be in terms of trends.
The top three lines on the graph show the relative interest (from before lockdown) in searches for paddling pools, hosepipes and sprinklers; the bottom line (pink) shows the relative change in household consumption over the period. The similarity is striking, but although the patterns for these search terms look reasonably well correlated with consumption, this does not necessarily mean that they are the cause. It does, however, highlight that peak external use might have just as much to do with garden recreational use as with more traditional garden watering.
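To illustrate why series scaled to 0 to 100 can be compared in terms of trend but not magnitude, here is a minimal Python sketch. It assumes a simple min-max rescaling, which may not be exactly how the lines in Figure 3 were prepared, and the weekly values are invented purely for the example; the function names are hypothetical.

```python
def scale_0_100(values):
    """Min-max rescale a series so its lowest value is 0 and its highest is 100."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a flat series carries no trend
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


# Invented weekly values for illustration only (not the data behind Figure 3).
search_interest = [12, 15, 14, 40, 75, 100, 60]                 # arbitrary units
consumption = [130.0, 131.0, 129.0, 140.0, 155.0, 165.0, 150.0]  # litres/property/day

scaled_search = scale_0_100(search_interest)
scaled_consumption = scale_0_100(consumption)

# Rescaling discards the original units but leaves the correlation unchanged.
r_raw = pearson(search_interest, consumption)
r_scaled = pearson(scaled_search, scaled_consumption)
```

Because the rescaling is a linear stretch, `r_raw` and `r_scaled` agree, so the shapes of the lines remain comparable even though their heights are not. A high correlation computed this way still says nothing about cause.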
Additional factors that may also be responsible for increases in household consumption post lockdown include increased occupancy (older children returning home when colleges and universities closed), and also less movement of people between areas (people not going to work and not going away on holiday).
In summary, household consumption has seen a huge increase since lockdown, with business use going in the other direction. The reasons for the increase in household use are complex. Some are obvious, but if we are to understand how to predict the impact going forward and how to deliver water efficiency, a lot more detailed analysis is needed. It is far from clear how lockdown will end, and how long its impacts will last. Water companies have a very uncertain summer ahead; if it turns out to be very warm, there are likely to be some big challenges. We are also likely to see some prolonged impacts from home working, and from whatever the new normal turns out to be.