Women doctors outnumber men in only one UK country...

…Scotland! This is based on the number of licensed doctors on the General Medical Council’s (GMC) medical register, and is subject to some uncertainty about where doctors are located.

By Tom Franklin, 29th May 2018

The demographics of the medical profession are changing. In 2017, the latest data from the GMC showed that the number of licensed female doctors in Scotland was greater than the number of male doctors for the first time.

Their annual report, The state of medical education and practice 2017, showed that 51.01% of all licensed doctors in Scotland were female, compared with just 44.13% in Wales.

This demographic shift highlights the changing medical profession in the UK. In 2017, 57.64% of doctors in training, the future of the UK’s medical workforce, were female. This suggests that we are likely to see a female-dominated medical profession in the UK in the future.

Creating the above interactive map in R

We’ll learn how to make the above interactive map step by step to visualise UK countries with majority male and female doctors. This has been inspired by an excellent rpubs post I found by Bhaskar V. Karambelkar, which visualises predicted US election results prior to the 2016 election.

A repository with the full code to make the interactive map can be found here for those who prefer that to reading a blog post.

Process to create interactive map

  1. Load libraries and shapefiles and filter for UK countries
  2. Load doctor data and clean
  3. Join data to map
  4. Build map components and final map

1. Load libraries and shapefiles and filter for UK countries

So, there are quite a few packages to load first…

library(tidyverse); library(leaflet); library(geojsonio); library(rgdal); 
library(dplyr); library(plyr); library(data.table); library(RColorBrewer);
library(raster); library(ggplot2); library(rgeos); library(readr);
library(mapproj); library(tictoc); library(ggmap); library(maps);
library(ggthemes); library(htmlwidgets); library(tidyr); library(sp);

Once we’ve done that, we can load up our shapefiles which have come from Natural Earth Mapping. It took quite a while to figure out which map shapefiles I needed to use, but it turned out to be the “map admin units” ones, with a link here.

These are for the whole world, so we need to filter for UK countries using the %like% operator from data.table to look for country names (known as SUBUNITs in our shapefile data) which match UK country names.

world_countries_shapefile <- shapefile("shapefiles/ne_10m_admin_0_map_units.shp")

# Keep only the subunits whose names match one of the four UK countries
uk_countries <- subset(world_countries_shapefile, SUBUNIT %like% "England" | 
                        SUBUNIT %like% "Wales" |
                        SUBUNIT %like% "Scotland" |
                        SUBUNIT %like% "Northern Ireland")

# Reproject to WGS84 longitude/latitude, the coordinate system leaflet expects
uk_countries <- spTransform(uk_countries, CRS("+proj=longlat +ellps=WGS84"))
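
As far as I understand it, %like% is a thin wrapper around grepl(), so it matches substrings rather than whole names. Here is a minimal sketch in base R of how that differs from an exact match with %in%. The vector of subunit names is made up for illustration and is not read from the shapefile.

```r
# Hypothetical subunit names, for illustration only
subunits <- c("England", "Wales", "Scotland", "Northern Ireland", "Ireland", "France")
uk_names <- c("England", "Wales", "Scotland", "Northern Ireland")

# Pattern match (the behaviour %like% gives you):
# the pattern "Ireland" matches both "Northern Ireland" and "Ireland"
pattern_hits <- subunits[grepl("Ireland", subunits)]

# Exact match: only names that appear in uk_names are kept
exact_hits <- subunits[subunits %in% uk_names]

print(pattern_hits)
print(exact_hits)
```

For the four patterns used above this makes no difference, but an exact match is a safer default if a shorter pattern could also hit a non-UK subunit.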

Now that we have our shapefiles in good shape, we can move on to analysing the data on doctor numbers in each UK country.

2. Load doctor data and clean

This data is from the GMC’s State of Medical Education and Practice 2017 annual report. It’s not in the most helpful format for machine reading, so we’ll manipulate it into a form we can add to our shapefiles.

doctor_data <- read.csv("data/doctors.csv")

# Select only 2017 data, necessary variables and put in wide format
doctor_data %>% 
  dplyr::filter(Year == 2017) %>% 
  dplyr::select(Country, Gender, Year, Number) %>%
  tidyr::spread(Gender, Number) -> doctor_data
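
To make "wide format" concrete, here is a small sketch using base R's reshape() as a stand-in for tidyr::spread(). The column names mirror the ones above, but the data frame and its numbers are invented for illustration.

```r
# Toy long-format data; the numbers are made up
long <- data.frame(
  Country = c("England", "England", "Scotland", "Scotland"),
  Gender  = c("Male", "Female", "Male", "Female"),
  Year    = 2017,
  Number  = c(10, 8, 4, 5),
  stringsAsFactors = FALSE
)

# One row per Country/Year, one column per Gender
wide <- reshape(long, idvar = c("Country", "Year"),
                timevar = "Gender", direction = "wide")

# reshape() names the new columns Number.Male and Number.Female;
# strip the prefix so they match the Male/Female columns used in this post
names(wide) <- sub("^Number\\.", "", names(wide))

print(wide)
```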

R has seen commas in what we would treat as numeric fields and decided that they are characters. Let’s remove the commas and convert the fields to numeric.

# gsub() removes every comma; sub() would only remove the first one
doctor_data$Male <- as.numeric(gsub(",", "", doctor_data$Male, fixed = TRUE))
doctor_data$Female <- as.numeric(gsub(",", "", doctor_data$Female, fixed = TRUE))
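
A quick check of how sub() and gsub() differ once a number has more than one comma. The values below are made up; the real counts may never be large enough for this to matter, but it is a cheap thing to guard against.

```r
raw <- c("952", "12,345", "1,234,567")  # made-up values

first_only <- sub(",", "", raw, fixed = TRUE)   # removes the first comma only
all_commas <- gsub(",", "", raw, fixed = TRUE)  # removes every comma

clean <- as.numeric(all_commas)

# The leftover comma in "1234,567" cannot be parsed, so it becomes NA
# (suppressWarnings() hides the coercion warning)
broken <- suppressWarnings(as.numeric(first_only))

print(clean)
print(broken)
```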

Now we’ll use a nested ifelse() call to work out which gender has the greater number of doctors in each country and store a statement saying just that. We’ll also add proportions based on the total per country (assuming the total is male plus female, which isn’t the case in real life). We’re doing this so that we have data to show in the map’s hover-over tooltip.

doctor_data %>%
  mutate(majority_gender = ifelse(Male > Female, "Majority of doctors are male", 
                           ifelse(Female > Male, "Majority of doctors are female", "Gender Equality")))  %>%
  mutate(total = Male + Female) %>%
  mutate(Male_prop = (Male / total)*100) %>%
  mutate(Female_prop = (Female / total)*100) -> doctor_data
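
Here is the same logic worked through on a toy data frame in base R, so each branch of the nested ifelse() can be seen. The countries A, B and C and their counts are invented.

```r
# Invented counts: one male majority, one female majority, one tie
toy <- data.frame(Country = c("A", "B", "C"),
                  Male    = c(60, 45, 50),
                  Female  = c(40, 55, 50),
                  stringsAsFactors = FALSE)

# The inner ifelse() is only reached when Male is not greater than Female
toy$majority_gender <- ifelse(toy$Male > toy$Female, "Majority of doctors are male",
                       ifelse(toy$Female > toy$Male, "Majority of doctors are female",
                              "Gender Equality"))

# Proportions as percentages of the male-plus-female total
toy$total       <- toy$Male + toy$Female
toy$Male_prop   <- toy$Male / toy$total * 100
toy$Female_prop <- toy$Female / toy$total * 100

print(toy)
```

Because the total is defined as male plus female, the two proportions always sum to 100 for every row.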

Final bit of data tidying, rounding numbers and general housekeeping before we add the data to the map and generate the map.

# Round proportions to 2 d.p. (format() returns character strings)
doctor_data$Male_prop <- format(round(doctor_data$Male_prop, 2), nsmall = 2)
doctor_data$Female_prop <- format(round(doctor_data$Female_prop, 2), nsmall = 2)
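
One thing worth knowing here is that format() hands back text, not numbers, which is why the tooltip later uses %s placeholders for these columns. A small sketch with made-up percentages:

```r
x <- c(51.0132, 44.1)  # made-up percentages

rounded <- round(x, 2)                  # still numeric
as_text <- format(rounded, nsmall = 2)  # now character, padded to 2 decimal places

# sprintf() with "%.2f" rounds and formats in one step
one_step <- sprintf("%.2f", x)

print(as_text)
print(one_step)
```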

# General tidying
doctor_data$Country <- as.character(doctor_data$Country)
doctor_data <- droplevels(doctor_data)
doctor_data$majority_gender <- as.factor(doctor_data$majority_gender)

3. Join data to map

We’ll use the merge function from the sp package to add our doctor data onto the map shapefiles.

data_for_mapping <- sp::merge(uk_countries,
                              doctor_data,
                              by.x = 'SUBUNIT',
                              by.y = 'Country',
                              duplicateGeoms = TRUE)
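
A merge like this only works if the country names match exactly on both sides, so it can help to compare the keys first. A minimal sketch in base R; both vectors are stand-ins, and the mismatched "N. Ireland" spelling is hypothetical.

```r
# Stand-in for uk_countries$SUBUNIT
map_names  <- c("England", "Wales", "Scotland", "Northern Ireland")
# Stand-in for doctor_data$Country, with a deliberately mismatched spelling
data_names <- c("England", "Wales", "Scotland", "N. Ireland")

on_map_without_data <- setdiff(map_names, data_names)
in_data_without_map <- setdiff(data_names, map_names)

print(on_map_without_data)
print(in_data_without_map)
```

If both results are empty, every polygon should pick up a row of doctor data.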

Now to check it’s working…

leaflet(data_for_mapping) %>%
  addPolygons()

4. Build map components and final map

We need to add some colour to our map.

map_pal = colorFactor(c('purple', '#4169e1'), data_for_mapping$majority_gender)

This object called map_pal will know to split the colours of the factor data_for_mapping$majority_gender into the two colours stated. I’ve used one in word format and the other in hex format just to show that R can interpret both fine!
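
To see that R really does treat the colour name and the hex code alike, and to check which level is likely to get which colour, here is a small base R sketch. I am assuming that colorFactor() hands out palette colours in the order of the factor levels, so treat that mapping as something to confirm against the rendered map.

```r
majority <- factor(c("Majority of doctors are male",
                     "Majority of doctors are female",
                     "Majority of doctors are male"))

# Factor levels are sorted alphabetically, so "...female" comes before "...male"
level_order <- levels(majority)

# col2rgb() accepts both colour names and hex codes
rgb_values <- col2rgb(c("purple", "#4169e1"))

# Convert back to hex to compare the two on equal terms
hex_values <- rgb(t(rgb_values), maxColorValue = 255)

print(level_order)
print(hex_values)
```

Under that assumption, 'purple' would go to the female-majority level and '#4169e1' to the male-majority level.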

leaflet(data_for_mapping) %>%
  addPolygons(fillColor=~map_pal(data_for_mapping$majority_gender))

Now let’s give the map a tooltip. I won’t go into too much detail, but in essence sprintf() builds one block of HTML per country: the template sets up the styling and leaves a %s placeholder for each piece of data, and the arguments that follow fill those placeholders in order. We then use lapply() to wrap each string in htmltools::HTML() so that leaflet renders it as HTML rather than plain text.

hoverText <- sprintf("<div style='font-size:12px;width:200px;float:left'>
            <span style='font-size:18px;font-weight:bold'>%s</span><br/> 
            <div style='width:95%%'>
              <span style='float:left'>Male</span>
                     <span style='float:right'>Female</span>
                     <br/>
                     <span style='color:black;float:left'>%s%%</span>
                     <span style='color:black;float:right'>%s%%</span><br clear='all'/>
                     <span style='background:#D4DCF7;width:%s%%;float:left'>&nbsp;</span>
                     <span style='background:#E7CCFC;width:%s%%;float:right'>&nbsp;</span>
                     </div>
                     <br/><span style='font-size:10px'>%s</span>
                     </div>",
                      data_for_mapping$SUBUNIT, 
                      data_for_mapping$Male_prop, data_for_mapping$Female_prop,
                      data_for_mapping$Male_prop, data_for_mapping$Female_prop,
                     data_for_mapping$majority_gender) %>%
  lapply(htmltools::HTML)
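
If the big template above is hard to follow, this cut-down sketch shows the two sprintf() features it relies on: it is vectorised, so it returns one string per country, and a literal percent sign has to be written as %%. The two percentages are the female shares quoted at the top of this post; the template itself is simplified for illustration.

```r
country     <- c("Scotland", "Wales")
female_prop <- c("51.01", "44.13")  # character, as produced by format()

# %s is filled from the arguments in order; %% prints a single % sign
labels <- sprintf("<b>%s</b>: %s%% female", country, female_prop)

print(labels)
```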

Adding this hoverText will give us a hover over tooltip effect like below…

leaflet(data_for_mapping) %>%
  addPolygons(fillColor=~map_pal(data_for_mapping$majority_gender),
              label = ~hoverText)

Bring it all together…

The rest is simply tidying: stopping users from dragging the view away from the main focus of the image, making the background white and setting the boundaries of the view. Hope this quick guide helps you in your projects - happyR’ing!

leaflet(data_for_mapping,
        options=leafletOptions(attributionControl = FALSE, 
                               dragging = FALSE, zoomControl = FALSE, minZoom = 5.2, maxZoom = 5.2)) %>%
  addPolygons(fillColor=~map_pal(data_for_mapping$majority_gender),
              weight = 1,
              label = ~hoverText,
              color = "grey",
              labelOptions = labelOptions(
                offset = c(-100,-140),
                #direction='bottom',
                textOnly = TRUE,
                style=list(
                  'background'='rgba(255,255,255,0.95)',
                  'border-color' = 'rgba(0,0,0,1)',
                  'border-radius' = '4px',
                  'border-style' = 'solid',
                  'border-width' = '4px')),
              highlightOptions = highlightOptions(weight = 3, bringToFront = TRUE)) %>%
  setMaxBounds(lat1 = 60, lng1 = 8.05, lat2 = 50, lng2 = -15.) %>%
  htmlwidgets::onRender(
    "function(el, t) {
    var myMap = this;
    // get rid of the ugly grey background
    myMap._container.style['background'] = '#ffffff';
    }") 
devtools::session_info()
## Session info -------------------------------------------------------------
##  setting  value                       
##  version  R version 3.4.3 (2017-11-30)
##  system   x86_64, darwin15.6.0        
##  ui       X11                         
##  language (EN)                        
##  collate  en_GB.UTF-8                 
##  tz       Europe/London               
##  date     2018-07-28
## Packages -----------------------------------------------------------------
##  package      * version    date       source
##  assertthat     0.2.0      2017-04-11 CRAN (R 3.4.0)
##  backports      1.1.2      2017-12-13 cran (@1.1.2)
##  base         * 3.4.3      2017-12-07 local
##  bindr          0.1        2016-11-13 CRAN (R 3.4.0)
##  bindrcpp     * 0.2        2017-06-17 CRAN (R 3.4.0)
##  blogdown       0.5        2018-01-24 CRAN (R 3.4.3)
##  bookdown       0.5        2017-08-20 cran (@0.5)
##  colorspace     1.3-2      2016-12-14 CRAN (R 3.4.0)
##  compiler       3.4.3      2017-12-07 local
##  crosstalk      1.0.0      2016-12-21 CRAN (R 3.4.0)
##  curl           2.8.1      2017-07-21 CRAN (R 3.4.1)
##  data.table   * 1.10.4-3   2017-10-27 CRAN (R 3.4.2)
##  datasets     * 3.4.3      2017-12-07 local
##  devtools     * 1.13.4     2017-11-09 cran (@1.13.4)
##  digest         0.6.15     2018-01-28 cran (@0.6.15)
##  dplyr        * 0.7.4      2017-09-28 CRAN (R 3.4.2)
##  evaluate       0.10.1     2017-06-24 CRAN (R 3.4.1)
##  foreign        0.8-69     2017-06-22 CRAN (R 3.4.3)
##  geojsonio    * 0.4.2      2017-09-01 CRAN (R 3.4.1)
##  geosphere      1.5-7      2017-11-05 CRAN (R 3.4.2)
##  ggmap        * 2.6.1      2016-01-23 CRAN (R 3.4.0)
##  ggplot2      * 2.2.1.9000 2018-06-25 Github (tidyverse/ggplot2@348b26f)
##  ggthemes     * 3.4.0      2017-02-19 CRAN (R 3.4.0)
##  glue           1.2.0      2017-10-29 cran (@1.2.0)
##  graphics     * 3.4.3      2017-12-07 local
##  grDevices    * 3.4.3      2017-12-07 local
##  grid           3.4.3      2017-12-07 local
##  gtable         0.2.0      2016-02-26 CRAN (R 3.4.0)
##  hms            0.4.1      2018-01-24 cran (@0.4.1)
##  htmltools      0.3.6      2017-04-28 CRAN (R 3.4.0)
##  htmlwidgets  * 1.2.1      2018-07-24 Github (ramnathv/htmlwidgets@29ca4f7)
##  httpuv         1.4.5      2018-07-19 cran (@1.4.5)
##  httr           1.3.1      2017-08-20 CRAN (R 3.4.1)
##  jpeg           0.1-8      2014-01-23 CRAN (R 3.4.0)
##  jsonlite       1.5        2017-06-01 CRAN (R 3.4.0)
##  knitr          1.20       2018-02-20 cran (@1.20)
##  later          0.7.3      2018-06-08 cran (@0.7.3)
##  lattice        0.20-35    2017-03-25 CRAN (R 3.4.3)
##  lazyeval       0.2.1      2017-10-29 cran (@0.2.1)
##  leaflet      * 1.1.0      2017-02-21 CRAN (R 3.4.0)
##  magrittr       1.5        2014-11-22 CRAN (R 3.4.0)
##  mapproj      * 1.2-5      2017-06-08 CRAN (R 3.4.0)
##  maps         * 3.2.0      2017-06-08 CRAN (R 3.4.0)
##  maptools       0.9-2      2017-03-25 CRAN (R 3.4.0)
##  memoise        1.1.0      2017-04-21 CRAN (R 3.4.0)
##  methods      * 3.4.3      2017-12-07 local
##  mime           0.5        2016-07-07 CRAN (R 3.4.0)
##  munsell        0.4.3      2016-02-13 CRAN (R 3.4.0)
##  pillar         1.1.0      2018-01-14 CRAN (R 3.4.3)
##  pkgconfig      2.0.1      2017-03-21 CRAN (R 3.4.0)
##  plyr         * 1.8.4      2016-06-08 CRAN (R 3.4.0)
##  png            0.1-7      2013-12-03 CRAN (R 3.4.0)
##  promises       1.0.1      2018-04-13 cran (@1.0.1)
##  proto          1.0.0      2016-10-29 CRAN (R 3.4.0)
##  purrr          0.2.4      2017-10-18 cran (@0.2.4)
##  R6             2.2.2      2017-06-17 CRAN (R 3.4.0)
##  raster       * 2.5-8      2016-06-02 CRAN (R 3.4.0)
##  RColorBrewer * 1.1-2      2014-12-07 CRAN (R 3.4.0)
##  Rcpp           0.12.17    2018-05-18 cran (@0.12.17)
##  readr        * 1.1.1      2017-05-16 CRAN (R 3.4.0)
##  reshape2       1.4.3      2017-12-11 cran (@1.4.3)
##  rgdal        * 1.2-20     2018-05-07 cran (@1.2-20)
##  rgeos        * 0.3-26     2017-10-31 cran (@0.3-26)
##  RgoogleMaps    1.4.1      2016-09-18 CRAN (R 3.4.0)
##  rjson          0.2.15     2014-11-03 CRAN (R 3.4.0)
##  rlang          0.2.1      2018-05-30 cran (@0.2.1)
##  rmarkdown      1.9        2018-03-01 cran (@1.9)
##  rprojroot      1.3-2      2018-01-03 cran (@1.3-2)
##  scales         0.5.0.9000 2017-10-04 Github (hadley/scales@d767915)
##  shiny          1.1.0      2018-05-17 cran (@1.1.0)
##  sp           * 1.2-7      2018-01-19 cran (@1.2-7)
##  stats        * 3.4.3      2017-12-07 local
##  stringi        1.2.2      2018-05-02 cran (@1.2.2)
##  stringr        1.3.1      2018-05-10 cran (@1.3.1)
##  tibble         1.4.2      2018-01-22 cran (@1.4.2)
##  tictoc       * 1.0        2014-06-17 CRAN (R 3.4.0)
##  tidyr        * 0.7.2      2017-10-16 cran (@0.7.2)
##  tidyselect     0.2.3      2017-11-06 cran (@0.2.3)
##  tools          3.4.3      2017-12-07 local
##  utils        * 3.4.3      2017-12-07 local
##  V8             1.5        2017-04-25 CRAN (R 3.4.0)
##  withr          2.1.2      2018-06-25 Github (jimhester/withr@fe56f20)
##  xfun           0.1        2018-01-22 CRAN (R 3.4.3)
##  xtable         1.8-2      2016-02-05 CRAN (R 3.4.0)
##  yaml           2.1.19     2018-05-01 cran (@2.1.19)