In this post, you will first of all be able to download weather station data from all over the world formatted in an appropriate manner for easy manipulation using the package GSODR. And get quick view on some variable. Secondly, interpolate these data over a given area using the inverse distance weighted approach.
The code is executable in the R programming tool.
Global Surface Summary of the Day (‘GSOD’)
Daily global surface summary product are produced by the National Climatic Data Center (NCDC) These data include quality controlled daily, monthly, seasonal, and yearly measurements of temperature, precipitation, wind, and degree days as well as radar data and 30-year Climate Normals. The data can be downloaded from the Climate Data Online using this link : www.ncdc.noaa.gov/cdo-web/ Historical data for 30809 stations are available.
Framework of GSOD
The data summaries here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program according to WMO Resolution 40 (Cg-XII). Get further for the description in this link: http://www7.ncdc.noaa.gov/CDO/GSOD_DESC.txt.
LOAD Packages
#Load library
library(tmap)
library(GSODR) #Global Surface Summary of the Day ('GSOD') Weather
library("dplyr")
library(hydroTSM)
library(skimr)#summary statistics
library(raster)
library(maps)
library(ggplot2)
Set working directory
setwd("C:/Users/DEROOT/Blog/N2")
GSOD Weather Station Data Inventories
inventory <- get_inventory()
as_tibble(inventory)
Retrieve stations information of a particular country
As example I will use Benin republic (in West-Africa).
Print Benin station descriptive information: station id,name,latitude and longitude
Benin_stat_id=unique(inventory[which(inventory$COUNTRY_NAME=='BENIN'),1:4])
Benin_stat_id
Download all weather stations for the year 2018 in Benin using get_GSOD
tbar_benin_2018<-get_GSOD(years=2018, station=Benin_stat_id$STNID)
as_tibble(tbar_benin_2018)
#Sub important station ID character (remove *- 99999*)
tbar_benin_2018$STNID<-substr(tbar_benin_2018$STNID,1,6)
Benin_stat_id$STNID<-substr(Benin_stat_id$STNID,1,6)
as_tibble(Benin_stat_id)
Climograph based on precipitation and temperature data for a given each station
Quick view on data
tbar_benin_2018$DATE<-as.Date(tbar_benin_2018$YEARMODA,"%Y-%m-%d")
#Subset of tbar_benin_2018
climdf=tbar_benin_2018[,c('TEMP',"MAX",'MIN','PRCP','LONGITUDE','LATITUDE','STNID')]
clim_xts=xts(x = climdf, order.by =tbar_benin_2018$DATE)
#convert content to numeric
storage.mode(clim_xts) <- "double"
# Plot climograph for the first station as example
climdt=subset(clim_xts,STNID==Benin_stat_id$STNID[1])
climograph(pcp=climdt$PRCP,tmx=climdt$MAX,tmn=climdt$TEMP,
main=paste("Climograph",Benin_stat_id$NAME[1]),
na.rm=TRUE,pcp.col='green',tmean.col='red')
2. Perform an inverse distance weighted interpolation using the new tmap packages
Create new dataset
#Create new dataset
clim_dt=climdf
clim_dt$Year=format(tbar_benin_2018$DATE,'%Y')
#summarise weather paramters in a yearly basis
clim_dt=clim_dt%>% group_by(STNID,LONGITUDE,LATITUDE,Year) %>% summarise(
TEMP=mean(TEMP),
MAX=mean(MAX, na.rm=TRUE),
MIN=mean(MIN, na.rm=TRUE),
PRCP=sum(PRCP, na.rm=TRUE))
clim_dt
Get geographic data
study_area<- getData("GADM", country=c("Benin"), level=0)
#study_area1<- getData("GADM", country=c("Benin"), level=1)#first level subdivision
plot(study_area,main='BENIN')
Grid point distribution
attach(clim_dt)
(spatial_Benin<- ggplot(clim_dt, aes(x =LONGITUDE, y =LATITUDE,
colour =TEMP))+ geom_point())
detach(clim_dt)
Format the grid points (stations) to right coordinate system
stations=SpatialPoints(clim_dt[,c("LONGITUDE","LATITUDE")],
proj4string=CRS("+proj=utm +zone=30 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0"))
#Creation of a SpatialPointsDataFrame variable
stations = SpatialPointsDataFrame(stations,clim_dt)
class(stations)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
Check if the study area and the grid poinst (stattions) have same coordinate
crs(study_area)
## CRS arguments:
## +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
crs(stations)
## CRS arguments:
## +proj=utm +zone=30 +datum=WGS84 +units=m +no_defs +ellps=WGS84
## +towgs84=0,0,0
Distribution of grid point on the study area
plot(study_area, col='lightblue')
plot(stations, main="Map of Plot Locations",add=TRUE)
IDW : Interpolation function
study_area: geographic file in SpatialPointsDataFrame
format
param:parameter interpolated
idp : inverse distance weighting power
stations: weather data in SpatialPointsDataFrame
format
idw_interpol=function(study_area,stations,param,idp){
lon_range <- as.numeric(c(extent(study_area)[1],extent(study_area)[2])) # min/max longitude of the interpolation area
lat_range <- as.numeric(c(extent(study_area)[3], extent(study_area)[4])) # min/max latitude of the interpolation area
# Grid creation
grd <- expand.grid(x = seq(from = lon_range[1],to = lon_range[2],by = 0.1),
y = seq(from = lat_range[1], to = lat_range[2],by = 0.1))
# Convert grd object to a matrix and then turn into a spatial points object
coordinates(grd) <- ~x + y
# Create SpatialPixel object
gridded(grd) <- TRUE
fullgrid(grd) <- TRUE
#### view grid with points overlayed
plot(grd, cex = 1.5, col = "grey")
plot(stations,pch = 15,col = "red",cex = 1, add = TRUE)
plot(study_area,add = TRUE)
#Extend projection attributes of stations to the empty grid
proj4string(grd) <- proj4string(stations)
# interpolate the maximum temperature
clim_idw<- gstat::idw(formula =param ~ 1,
locations = stations,
newdata = grd,
idp = 1)
# Create raster object of the interplotaion and then clip to study area
r <- raster(clim_idw)
r.m <- mask(r,study_area)
return(r.m)
}
Perform idw_interpol
function on maxium temperature
r.m=idw_interpol(study_area,stations,stations$MAX,idp = 1)
## [inverse distance weighted interpolation]
Thematic map of the interpolated maximum temperature
pal =c("#FFFFC8", "#FCE7A0", "#F7C252" ,"#F39300", "#EB5500" ,"#BE1D00", "#7D0025")
tmap_mode("plot")
## tmap mode set to plotting
tm_shape(r.m,xlim=c(extent(r.m)[1],extent(r.m)[2]+0.5),
ylim=c(extent(r.m)[3]-0.3,extent(r.m)[4]+0.5),unit='km')+
tm_scale_bar(position=c("left", "bottom"))+
tm_raster(n=10,palette =pal,auto.palette.mapping = FALSE, title="Temperature \n[C]") +
tm_shape(stations)+tm_dots(size=0.4) +
tm_compass(position=c("right", "top"))+tm_grid(lines=FALSE)+
tm_layout(scale = .8,legend.position= c("right","bottom"),
legend.bg.color = "skyblue", legend.bg.alpha = .2,
legend.frame = "gray50")
Well done Thierry !
ReplyDeleteOk boss, I'm going to try
ReplyDeleteI'm already shared it
Good job Thierry!
ReplyDeleteAll the best
Thierry, thank you for your post. It is very good and nice work. Keep it up. However, I have a question regarding your work. Where did you get the data for precipitation and temperature for Benin? You just stated "Climograph based on precipitation and temperature data for a given each station" but where this data came from I have not seen where it started. In additional to that, the format required in the analysis is also not shown.
ReplyDeleteSorry for asking such a poor question but I am novel in R
Thierry,
ReplyDeleteOnce again, the question above has an answer dont bother about it. Now, I am getting an error when I am trying to plot the Climograph with Malawi data as shown below. What could be the problem?
....................
...................
04-12" "2018-04-30" "2018-06-04" "2018-06-11" "2018-07-18"
[6] "2018-08-15"
> clim_xts=as.xts(tbar_malawi_2018$DATE,climdf)
> storage.mode(clim_xts) <- "double"
> climdt=subset(clim_xts,STNID==Malawi_stat_id$STNID[1])
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 0, 309
In addition: Warning message:
In as.data.frame.numeric(coredata(x), row.names, optional, ...) :
'row.names' is not a character vector of length 0 -- omitting it. Will be an error!
> head(Malawi_stat_id)
*** FEDERAL CLIMATE COMPLEX INTEGRATED SURFACE DATA INVENTORY ***
This inventory provides the number of weather observations by
STATION-YEAR-MONTH for beginning of record through July 2020
STNID NAME LAT LON
1: 674210 CHITIPA -9.700 33.267
2: 674230 KARONGA -9.954 33.893
3: 674240 BOLERO -10.967 33.733
4: 674850 MZIMBA -11.883 33.617
5: 674890 MZUZU -11.445 34.012
6: 674930 NKHATA BAY -11.600 34.300
> climdt=subset(clim_xts,STNID==Malawi_stat_id$STNID[1])
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 0, 309
In addition: Warning message:
In as.data.frame.numeric(coredata(x), row.names, optional, ...) :
'row.names' is not a character vector of length 0 -- omitting it. Will be an error!
>
Thank you very much for your comments. It helps me to find out that, there are some missing stations in your country. In fact, for the first six stations, you are requesting in Malawi, there is no data saved on the Global Summary Of the Day (GSOD)repository site.
DeleteThe package GSODR uses two online repositories in downloading the weather files. If the number of stations requested in using GSODR Command are below 10, then it downloads the files by accessing this website: https://www.ncei.noaa.gov/data/global-summary-of-the-day/access/2018/
After thorough checking among the station’s CSV files there, only a few stations in your country are in. In particular stations (676930-99999; CHILEKA INTL) and (677910-99999; BVUMBWE) are available. One can report it to them for consideration in future update (ncei.orders@noaa.gov). Sometimes, there are some convention which prevents this organization of sharing certain country stations data. Therefore, preventing some stations data to be shared in an open-source repository.
Nevertheless, you can access hourly weather data for all stations in Malawi from this link: https://www.ncei.noaa.gov/data/global-hourly/access/2018/
Hope the above will help. Please, do not hesitate to post your comments for further clarification.
Run the code below for one the station mentined above:
------------------------------------
# GSOD Weather Station Data Inventories
inventory <- get_inventory()
head(inventory)
# Retrieve stations information of a particular country
# As example I will use Benin republic (in West-Africa).
#print Malawi station descriptive information: station id,name,latitude and longitude
MALAWI_stat_id=unique(inventory[which(inventory$COUNTRY_NAME=='MALAWI'),1:4])
MALAWI_stat_id
## Download station 676930-99999 data of the year 2018
tbar_Malawi_2018<-get_GSOD(years= 2018, station='676930-99999')
#Sub important station ID character
tbar_Malawi_2018$STNID<-substr(tbar_Malawi_2018$STNID,1,6)
MALAWI_stat_id$STNID<-substr(MALAWI_stat_id$STNID,1,6)
# Quick view on data
# Climograph based on precipitation and temperature data for a given each station
#Subset of temperature,precipitation
tbar_Malawi_2018$DATE<-as.Date(tbar_Malawi_2018$YEARMODA,"%Y-%m-%d")
climdf=tbar_Malawi_2018[,c('TEMP',"MAX",'MIN','PRCP','LONGITUDE','LATITUDE','STNID')]
clim_xts=xts(x = climdf, order.by =tbar_Malawi_2018$DATE)
#convert content to numeric
storage.mode(clim_xts) <- "double"
# Plot climograph for the first station as example
climdt=subset(clim_xts,STNID==unique(clim_xts$STNID)[1])
climograph(pcp=climdt$PRCP,
tmx=climdt$MAX,
tmn=climdt$TEMP,
main=paste("Climograph",unique(clim_xts$STNID)[1]),
na.rm=TRUE,
pcp.col='green',
tmean.col='red')
-------------------------------------------------------------------------
Thierry, this is very encouraging. Thank you for your prompt response. I will make a follow up with my data section to make sure that all stations are reported to WMO hub or NOAA.
ReplyDeleteLet me give the code a try and will come back should I face a challenge. Greetings from Malawi
Thierry, this is very encouraging. Thank you for your prompt response. I will make a follow up with my data section to make sure that all stations are reported to WMO hub or NOAA.
ReplyDeleteLet me give the code a try and will come back should I face a challenge. Greetings from Malawi
Dear Thierry,
ReplyDeleteHere I come again; how can I subset the data period so that instead of just plotting for a year, may be I want to plot Climographs for 10years or 30 years. I would like to compare for a station how it behaved the past 10years and now.
The other question is how can I read the hourly data just the same way as we are doing with the 24 hour data from synops.
Thanks in advance.
Sorry for the late reply.
DeleteThank Charles for your comments. I am working on the hourly data. I will share a post on that in the coming days. For the climograph, you can just make a loop to plot for the 10 years. I will share with you a code on that soon.
I am happy that my blog reach you in Malawi.
Sorry for my late reply. In response to your comments I have prepared a post on that. Check the link below to access it.
Deletehttps://climdatanalyst.blogspot.com/2020/09/post-processing-of--hourly-isd.html
Do not hesitate for further if clarifications are needed.
Too good information that you shared. It would be so useful post for those are searching for climate data online.
ReplyDeleteThank for your comment. Happy that it helps.
Delete