
The following project was created in association with the MUSA 801 Practicum at the University of Pennsylvania, taught by Ken Steif, Michael Fichman, and Matt Harris. We would like to thank Charlie Catlett of the Argonne National Laboratory for providing feedback to help us create a meaningful application. All products from this class should be considered proofs of concept and works in progress.

This document is split into two parts. The first part addresses the policy implications of our project, as well as the concepts of data reliability that underpinned our methodology. The second part presents our methodology together with the codeblocks necessary for its replication. The policy implications and concepts are explicated in the sections Introduction and Defining Data Reliability. Our methods can then be replicated by following the sections Scoring Data Reliability and Scoring Data Reliability After Imputation.

1. Introduction

A Senseable Smart City

“…sound and touch and taste all have a place in the tools which…will define digital planning in the near future.” - Michael Batty

In a foreword for Robert Laurini’s Information Systems for Urban Planning: A Hypermedia Co-operative Approach, Michael Batty, Professor of Spatial Analysis and Planning at University College London, identifies the role that our senses play in the future of urban planning. As urban citizens rely on their senses to interpret and experience their urban environment through sights, sounds, smells, and touch, a planner needs a good understanding of the same sensorial stimuli that shape urban experience and quality of life in order to make meaningful improvements to it.

This is why the rising ubiquity and decreasing cost of sensing tools have the potential to change the way planners plan, structure, and manage the city. Sensor devices are often designed for tasks that either emulate or extend beyond the human senses. More importantly, they collect valuable data that help us approximate the human sensory experience in an urban environment, generating large volumes of feedback at spatial and temporal scales that can be further analysed for detailed insights. As cities around the world strive to be ‘smart’ in the ways they enhance quality of life for their citizens, sensor networks are also increasingly deployed to collect data that can be used to understand and manage the urban experience. Here, new technology plays a role in transforming efforts for sustainable urban growth and smart city planning.

The Array of Things (AoT)

As an urban-scale sensing network that collects real-time environmental data in cities, the AoT initiative exemplifies this trend. The initiative is led by Charlie Catlett and researchers from the Urban Center for Computation and Data, a joint initiative of the Argonne National Laboratory and the University of Chicago. It was launched in 2016 and is currently implemented in Chicago, and the data it collects is open and free to the public.

The AoT could be the first sensing project of this geographic scale and level of temporal and data type specificity. As presented in the figure below, the AoT network comprises nodes, which are sensor boxes containing up to 15 sensors measuring different sensory data types, or parameters. These parameters include temperature, humidity, pressure, PM 2.5 concentration, and concentrations of other hazardous gases such as carbon monoxide (CO), nitrogen dioxide (NO2) and sulphur dioxide (SO2). As of this writing, there are 86 nodes citywide, as seen in the figure below. When fully implemented, the AoT network will consist of 500 nodes across Chicago.

This opens up a whole array of ways in which individuals, organisations, researchers, engineers and scientists can study the urban environment and urban living. This is the main objective of the AoT initiative. In particular, the data presents valuable insights for urban policy planners and researchers interested in devising urban policies that are sensitive to the unavoidable human-environment dynamics that shape urban behaviour and livability.

Importance of sensor data reliability

Extracting such valuable insights from sensor data requires the raw data to be processed and analysed. Here, the extent of data processing and quality of data analysis critically depends on the reliability of the data itself.

Therefore, our project seeks to evaluate the reliability of the data collected by the AoT network. In Section 2, we first define criteria for what it means for AoT data to be reliable. In Section 3 and Section 4, based on these criteria, we then provide a method to numerically score the daily data reliability of the network for different data parameters.

Based on this score metric, we hope that planners can easily identify segments of the large AoT dataset relevant to their analysis. We also hope that the transparent evaluation factors and scores behind this data reliability analysis can promote a more informed use of data in the increasingly data-driven planning process. We see this application as additionally useful in improving research efficiency, considering the increasingly large stores of sensor data available - planners will be able to scope the spatial and temporal scale of their research according to where and when reliable segments of the data are available, instead of having to explore many different datasets to finalise a suitable scope of analysis.

1.2 Setup

Below we set the working directory, and load the libraries needed for the analysis as well as a plotTheme. We also set the memory limit to a high value, given the size of data we are processing here.

library(dplyr)
library(DBI)
library(RSQLite)
library(dbplyr)
library(lubridate)
library(leaflet)
library(tmap)
library(ggplot2)
library(sf)
library(plotly)
library(sp)
library(spatstat)
library(rgeos)
library(rgdal)
library(tidyr)
library(gridExtra)
library(stringr)
library(tidyverse)
library(caret)
library(FNN)
library(spdep)
library(knitr)
library(kableExtra)
library(htmlwidgets)
library(htmltools)
library(openair)

setwd("~/Capstone/Exploratory")
# memory.limit() is Windows-only and is no longer supported from R 4.2.0 onwards
memory.limit(100000000000)

## [1] 1e+11

plotTheme <- function(base_size = 12) {
  theme(
    text = element_text( color = "black"),
    plot.title = element_text(size = 14,colour = "black"),
    plot.subtitle=element_text(face="italic"),
    plot.caption=element_text(hjust=0),
    axis.ticks.x = element_blank(),
    axis.ticks.y = element_line( size=.1, color="#ababab" ),
    panel.grid.major.y = element_line( size=.1, color="#ababab" ),
    panel.grid.major.x = element_blank(),
    panel.background = element_blank(),
    panel.border = element_blank()
  )
}

2. Defining Data Reliability

In order to use the data, planners need to know if the data is reliable. Here, we identify 4 ideal criteria for AoT data to be considered reliable and useful:

Sensor Value Reliability: All the data collected should be within sensing range, as determined by the sensor specifications.
Spatial Reliability: Reliable data should be collected for the whole spatial extent of the study area, which is Chicago in this case.
Temporal Reliability: Reliable data should be collected at consistent time intervals throughout the day.
Imputability: Missing and unreliable data records should be easily substitutable by the next nearest record in time and space. This substitution, or imputation, should improve sensor value, spatial, and temporal reliability.

Each criterion is demonstrated below, together with the method used to score it. These numerical scores are metrics that facilitate easy comparison between different datasets.

Each of these 4 criteria is conceptualised in detail in the following sub-sections.

2.2 Sensor Value Reliability

This is the first and most important criterion defining data reliability in our method.

According to the AoT metadata site, each sensor has a specific detection range. This means that a well-functioning sensor should record values within this range. Values recorded outside this range indicate that the sensor is faulty - and these values are unreliable and unusable.

The table below presents the specific detection ranges of different sensors recording different data parameters.

Data Type	Parameter	Minimum senseable value	Maximum senseable value	Unit
Weather	Temperature	-55	125	deg Celsius
Weather	Relative Humidity	0	100	Percent
Weather	Pressure	300	1100	hPa
Air Quality	PM2.5 concentration	0	-	PPM
Air Quality	CO concentration	0	1000	PPM
Air Quality	H2S concentration	0	50	PPM
Air Quality	NO2 concentration	0	20	PPM
Air Quality	O3 concentration	0	20	PPM
Air Quality	SO2 concentration	0	20	PPM

It should be noted that temperature values recorded by the sensors should also fall within logical seasonal ranges. Intuitively, we know that temperature cannot fluctuate between -55 deg Celsius and 125 deg Celsius in a day. Also, we expect temperature values collected during winter in Chicago to be around or below the freezing point of 0 deg Celsius, while summers should record higher temperature values. Therefore, to determine sensor value reliability for temperature values, we also reference daily temperature ranges in Chicago published by the National Weather Service.

Based on these, we can label each sensor value as reliable or not. If the value falls within the sensor specification range (and daily range for temperature values), reliability == 1, and if not, reliability == 0.
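The labelling rule above can be sketched in a few lines of base R. This is only a minimal illustration: the helper name label_reliability and the sample readings are hypothetical, and missing readings are assumed to count as unreliable.

```r
# Hypothetical helper: label each value as reliable (1) or not (0),
# given the lower and upper bounds of the sensor's detection range.
# Missing readings (NA) are treated as unreliable.
label_reliability <- function(value, low, high) {
  ifelse(is.na(value) | value < low | value > high, 0, 1)
}

# Illustrative relative humidity readings checked against 0-100 percent
humidity <- c(77.5, 118.99, NA, 0, 100)
humidity_flag <- label_reliability(humidity, low = 0, high = 100)
humidity_flag
```

For a parameter with no upper bound in the table, such as PM2.5 concentration, high = Inf could be passed so that only the lower bound applies.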

To evaluate the network sensor value reliability, we are interested to know on average the proportion of sensor values measured in each node that are reliable. The more sensor values measured that are reliable, the more reliable the overall network is in terms of sensor value reliability.

Based on this criterion of Sensor Value Reliability, we can define active nodes and inactive nodes in the network.

Active nodes record at least one reliable value that falls within the sensor specification range during the course of a day.
Inactive nodes do not record a single reliable value that falls within the sensor specification range during the course of a day. These include nodes containing sensors that do not record any value at all.
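As a rough sketch of this definition, the snippet below flags nodes as active or inactive from a toy table of labelled records. The data frame and its values are made up for illustration; only the column names node_id and val_qual follow the ones used later in this document.

```r
# Toy labelled records for one day: node id and reliability flag (1/0)
records <- data.frame(
  node_id  = c("a", "a", "b", "b", "c"),
  val_qual = c(1, 0, 0, 0, 1)
)

# A node is active if at least one of its values that day is reliable
active <- sapply(split(records$val_qual, records$node_id),
                 function(q) any(q == 1))
active
```

Nodes that record no value at all never appear in such a table, so in practice they would have to be added from the full node list before being counted as inactive.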

This helps us define two other relevant reliability criteria, Spatial Reliability and Temporal Reliability, that will be elaborated on in Section 2.3 and Section 2.4 respectively.

2.3 Spatial Reliability

To evaluate the spatial reliability of the network, we are interested to know whether active nodes are distributed across Chicago. Here, the area covered by active nodes is considered to have reliable data collected for it. A network that is fully spatially reliable is one that has its nodes distributed across the whole of Chicago, such that the total spatial extent of the nodes spans the area of the city. A network that is not fully spatially reliable is one whose total spatial extent spans only part of the city. The figure below illustrates this point:

Therefore, the average proportion of Chicago’s area covered by the network extent at any one time serves as a metric for spatial reliability here.
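One way to picture this metric is sketched below. It assumes, purely for illustration, that the network extent is the convex hull of the active nodes and that coordinates are already projected to a planar system; the coordinates, the city area and the helper polygon_area are all hypothetical, and the extent definition used in this project may differ.

```r
# Shoelace formula: area of a polygon from its ordered vertices
polygon_area <- function(x, y) {
  n <- length(x)
  nxt <- c(2:n, 1)                       # index of the next vertex
  abs(sum(x * y[nxt] - x[nxt] * y)) / 2
}

# Hypothetical projected coordinates (km) of the active nodes
nodes <- data.frame(x = c(0, 4, 4, 0, 2), y = c(0, 0, 3, 3, 1))

# Convex hull of the active nodes, used here as a stand-in for the extent
hull <- chull(nodes$x, nodes$y)
extent_area <- polygon_area(nodes$x[hull], nodes$y[hull])

city_area <- 20                          # hypothetical city area (sq km)
coverage <- min(extent_area / city_area, 1)
coverage
```

Repeating this for each time interval and averaging the resulting proportions would give a daily figure in the spirit of the metric described above.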

2.4 Temporal Reliability

To evaluate the temporal reliability of the network, we are interested to know the average proportion of the day-duration that a node within the network is active for i.e. collecting reliable data. A network that is fully temporally reliable is one that has all its nodes collecting reliable data consistently across all time intervals during a day. A network that is not fully temporally reliable is one that has at least one of its nodes not collecting reliable data at some point during the day. The figure below illustrates this type of network - while some of its nodes are consistently collecting reliable data throughout the day, others have periods during which no reliable data is collected at all:

Therefore, the average proportion of the day-duration during which reliable data is being collected serves as a metric for temporal reliability here.
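The metric can be illustrated with a small base R sketch. It assumes the day is divided into equal time intervals (10-minute intervals are used later in this document, giving 144 per day); the toy table and the shortened day of 4 intervals are hypothetical.

```r
# Toy data: one row per node and interval in which the node recorded
# at least one reliable value
reliable <- data.frame(
  node_id  = c("a", "a", "a", "b"),
  interval = c(1, 2, 3, 2)
)
n_intervals <- 4   # would be 144 for a full day of 10-minute intervals

# Share of the day's intervals with reliable data, per node
per_node <- sapply(split(reliable$interval, reliable$node_id),
                   function(i) length(unique(i)) / n_intervals)

# Network score: average share across nodes
temporal_score <- mean(per_node)
temporal_score
```

Here node a is reliable for three of four intervals and node b for one, so the network average sits halfway between the two.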

2.5 Imputability

Imputability refers to the possibility and effectiveness of replacing missing or as-if-missing values with observed ones. Here, unreliable data is considered as-if-missing.

The figure below presents our imputation method to replace missing and unreliable data in the dataset we retrieve from the AoT database.

To evaluate imputability, we first apply the procedure above to our original dataset to obtain an imputed dataset. We then apply the score metrics for the other 3 reliability criteria to this imputed dataset, and compare the scores with those of the original. Ideally, the scores for the imputed dataset should be higher - this would suggest that imputing for unreliable and missing data in the retrieved dataset is effective. If the scores are not higher, this indicates that imputation cannot be used to ‘salvage’ an original dataset that consists of too many unreliable data points. In this case, planners might be advised not to use the dataset for that day and data type at all.
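To make the idea of substitution concrete, the sketch below replaces each unreliable value in a single node's series with the reliable value nearest in time. It is a simplified stand-in and not the procedure shown in the figure: it ignores nearness in space, and the function name impute_nearest and the sample series are hypothetical.

```r
# Hypothetical helper: replace each unreliable value with the reliable
# value nearest in time within the same series
impute_nearest <- function(time, value, val_qual) {
  good <- which(val_qual == 1)
  if (length(good) == 0) return(value)   # nothing to impute from
  out <- value
  for (i in which(val_qual != 1)) {
    nearest <- good[which.min(abs(time[good] - time[i]))]
    out[i] <- value[nearest]
  }
  out
}

# Illustrative series: minutes since midnight, reading, reliability flag
time     <- c(0, 10, 20, 30, 40)
value    <- c(35.1, 241.0, NA, 36.0, 36.4)
val_qual <- c(1, 0, 0, 1, 1)

imputed <- impute_nearest(time, value, val_qual)
imputed
```

After a step like this, the reliability scores can be recomputed on the imputed series and compared with the scores of the original one.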

3. Scoring Data Reliability

In this section, we will present the method of scoring Data Reliability based on the criteria of Sensor Value Reliability, Spatial Reliability and Temporal Reliability. This will be demonstrated using data collected on 2018-12-15 for the different data types listed for weather and air quality in their respective sections.

The flow chart below illustrates the common workflow we adopt for scoring data reliability for each day:

3.2 Pre-scoring Data Retrieval and Processing

To manage the large AoT dataset, we first download the dataset for December 2018 from the AoT data site and import it into a database using SQL Server Management Studio (instructions on this can be found here). We then save this database as Chicago2018-12.db. It is this database that we will connect to using the dbConnect function available from the DBI R package.

dbname<-'Chicago2018-12.db'

To facilitate the workflow, we provide a function below for users to retrieve AoT data from the SQL database and then determine if each data point is reliable or not.

defValid<-function(dbname, system, parameter1, sensor1=NULL, high=NULL, low=NULL, actual1=NULL){
  
  #1. Connect to SQL database
  
  con<-dbConnect(SQLite(), dbname=dbname)
  
  #2. Send query and retrieve data from database
  ##dbGetQuery sends the query, fetches all rows and clears the result set
  weather<-
    dbGetQuery(con,
                paste0(
                  "SELECT data.timestamp, data.node_id, data.subsystem,data.sensor, data.parameter, data.value_hrf, nodes.lat, nodes.lon
                  FROM data
                  JOIN nodes
                  ON data.node_id = nodes.node_id
                  WHERE data.subsystem 
                  IN ('",system,"')"))%>%
    mutate(timestamp2=ymd_hms(timestamp),
           date=date(timestamp),
           value_hrf=as.numeric(value_hrf))%>%
    filter(parameter==parameter1)%>%
    mutate(by10=cut(timestamp2, breaks='10 min'))%>%
    mutate(time=ymd_hms(by10))
  
  ##close the connection once the data has been retrieved
  dbDisconnect(con)
  
  #3. Determine for each data point whether it is reliable or not, 
  ##sensor reliability specification differs according to different data parameters
  
  if(parameter1=='humidity'){
    weather%>%
      filter(sensor==sensor1)%>%
      filter(parameter==parameter1)%>%
      mutate(val_qual=ifelse(is.na(value_hrf), 0, 
                             ifelse(value_hrf>100|value_hrf<0,0,1)))%>%
      group_by(date,node_id)%>%
      mutate(val_qual=ifelse(mean(val_qual)!=1, 0,1))->df
  }else if(parameter1=='pressure'){
    weather%>%
      filter(sensor==sensor1)%>%
      filter(parameter==parameter1)%>%
      mutate(val_qual=ifelse(is.na(value_hrf), 0, 
                             ifelse(value_hrf>1100|value_hrf<300,0,1)))%>%
      group_by(date,node_id)%>%
      mutate(val_qual=ifelse(mean(val_qual)!=1, 0,1))->df
  }else if(parameter1=='pm2_5'){
    weather%>%
      filter(parameter==parameter1)%>%
      mutate(val_qual=ifelse(is.na(value_hrf), 0, 
                             ifelse(value_hrf<0,0,1)))->df
  }else if(parameter1=='concentration'){
    weather%>%
      filter(parameter==parameter1)%>%
      filter(sensor==sensor1)%>%
      mutate(val_qual=ifelse(is.na(value_hrf), 0, 
                             ifelse(value_hrf<low|value_hrf>high,0,1)))->df
  }else if(parameter1=='temperature'){
    actual<-read.csv(actual1)
    actual$date<-ymd(actual$date)
    
    weather%>%
      filter(parameter==parameter1)%>%
      left_join(actual, by='date')%>%
      mutate(high_bound = high + 5,
             low_bound = low - 5) %>%
      ##missing values are labelled unreliable, as in the other branches
      mutate(val_qual = ifelse(is.na(value_hrf) | value_hrf > high_bound | value_hrf < low_bound, 0, 1))->df
    
    df%>%
      filter(val_qual==1)%>%
      group_by(by10, node_id)%>%
      mutate(quant75= quantile(value_hrf, probs=0.75),
             quant25= quantile(value_hrf, probs=0.25))%>%
      mutate(val_qual= ifelse(value_hrf > quant75 | value_hrf < quant25, 0,1))->df1
    
    df%>%
      filter(val_qual==0)%>%
      bind_rows(df1)->df
    
    rm(df1)
    
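  }else{
    ##fail early instead of returning an undefined object
    stop("Unsupported parameter1: ", parameter1)
  }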
  
  return(df)
}

The function calls for the following inputs:

dbname: Name of the SQL database file containing a month’s worth of AoT data
system: The sensor system within the node - see the AoT site for the specific system types.
parameter1: The data type - see the AoT site for the specific parameter types.
sensor1: The sensor model to filter on - required when parameter1 is humidity, pressure or concentration. See the AoT site for the specific sensor types.
high: The highest value sensor1 can record - only required when parameter1 = concentration
low: The lowest value sensor1 can record - only required when parameter1 = concentration
actual1: Path to a CSV file containing the daily temperature ranges obtained from the National Weather Service, with the columns date, high and low that the function refers to - only required when temperature data is retrieved (parameter1 = temperature)

The function implements the following procedure:

Connect to SQL database
Send query and retrieve data from database
Determine for each data point whether it is reliable or not

The function returns a data frame with the following columns:

timestamp: Date and time at which data observation is recorded.
node_id: Unique ID number of the node at which data is being recorded
subsystem: Sensor subsystem within the node that records the data
sensor: Sensor model
parameter: Data type
value_hrf: Measurement
lat: Latitude of the node
lon: Longitude of the node
timestamp2: timestamp in datetime format
date: Date extracted from timestamp2
by10: Start of the 10-minute interval into which timestamp2 falls
time: by10 in datetime format
val_qual: 1 if data record is reliable, 0 if not
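The by10 column is worth a closer look. With breaks given as '10 min', cut() starts its bins at the earliest timestamp truncated to the minute, so the bins line up with the clock's 10-minute marks only when the data begins on such a mark, as it does in the tables in this document. The sketch below, using made-up timestamps, shows the binning alongside an alternative that floors each timestamp to the clock's 10-minute mark regardless of where the data starts; the alternative is offered as an assumption-free cross-check, not as the method used in defValid.

```r
# Illustrative timestamps (UTC)
ts <- as.POSIXct(c("2018-12-15 00:00:01", "2018-12-15 00:09:59",
                   "2018-12-15 00:10:00", "2018-12-15 00:27:30"),
                 tz = "UTC")

# Binning as in defValid: the integer codes show which 10-minute bin
# each timestamp falls into
bin_id <- as.integer(cut(ts, breaks = "10 min"))
bin_id

# Alternative: floor to the clock's 10-minute mark (600 seconds)
floor10 <- as.POSIXct(floor(as.numeric(ts) / 600) * 600,
                      origin = "1970-01-01", tz = "UTC")
format(floor10, "%H:%M:%S")
```

Both approaches assign a timestamp to the interval that starts at or before it, so values are never rounded up into the next interval.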

This function has to be applied before each scoring process in the following sections. Click on the tabs below to view the specific inputs that will yield the relevant data parameter types. The data retrieved here will then be scored in the following sections.

Temperature

Apply function to retrieve temperature data and label reliability

systemTemp<-'metsense'
actualTemp<-'december_weather.csv'
parameterTemp<-'temperature'
dfTemp<-defValid(dbname, system=systemTemp, parameter1=parameterTemp, actual1=actualTemp)

#because we are only scoring for a day here, filter data for 2018-12-15
dfTemp%>%
  filter(date=='2018-12-15')->dfTemp

Observe first 10 rows

dfTemp%>%
  select(timestamp, node_id, lat, lon, by10, value_hrf, val_qual)%>%
  head(10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

timestamp	node_id	lat	lon	by10	value_hrf
2018/12/15 00:00:01	001e0610bc12	41.75034	-87.663518	2018-12-15 00:00:00	40.20
2018/12/15 00:00:01	001e06113f54	41.884607	-87.624577	2018-12-15 00:00:00	35.60
2018/12/15 00:00:01	001e0611537d	41.794167	-87.601646	2018-12-15 00:00:00	29.70
2018/12/15 00:00:04	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	35.10
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	241.00
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	128.86
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	-254.00
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	214.75
2018/12/15 00:00:07	001e06113cf1	41.884688	-87.627864	2018-12-15 00:00:00	37.20
2018/12/15 00:00:08	001e061146bc	41.918733	-87.668257	2018-12-15 00:00:00	241.00

Humidity

Apply function to retrieve humidity data and label reliability

systemHumidity<-'metsense'
parameterHumidity<-'humidity'
sensorHumidity<-'htu21d'
dfHumidity<-defValid(dbname, system=systemHumidity, parameter1=parameterHumidity, sensor1=sensorHumidity)

#because we are only scoring for a day here, filter data for 2018-12-15
dfHumidity%>%
  filter(date=='2018-12-15')->dfHumidity

Observe first 10 rows

dfHumidity%>%
  select(timestamp, node_id, lat, lon, by10, value_hrf, val_qual)%>%
  head(10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

## Adding missing grouping variables: `date`

date	timestamp	node_id	lat	lon	by10	value_hrf	val_qual
2018-12-15	2018/12/15 00:00:00	001e0610ee36	41.751295	-87.605288	2018-12-15 00:00:00	77.50	1
2018-12-15	2018/12/15 00:00:00	001e0610ee43	41.788608	-87.598713	2018-12-15 00:00:00	75.80	1
2018-12-15	2018/12/15 00:00:01	001e0610bc12	41.75034	-87.663518	2018-12-15 00:00:00	80.76	1
2018-12-15	2018/12/15 00:00:01	001e06113f54	41.884607	-87.624577	2018-12-15 00:00:00	81.80	1
2018-12-15	2018/12/15 00:00:01	001e0611537d	41.794167	-87.601646	2018-12-15 00:00:00	118.99	0
2018-12-15	2018/12/15 00:00:04	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	118.99	0
2018-12-15	2018/12/15 00:00:05	001e061130f4	41.896157	-87.662391	2018-12-15 00:00:00	86.06	1
2018-12-15	2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	118.99	0
2018-12-15	2018/12/15 00:00:07	001e06113cf1	41.884688	-87.627864	2018-12-15 00:00:00	86.75	1
2018-12-15	2018/12/15 00:00:08	001e061146bc	41.918733	-87.668257	2018-12-15 00:00:00	118.99	0

Pressure

Apply function to retrieve pressure data and label reliability

systemPressure<-'metsense'
parameterPressure<-'pressure'
sensorPressure<-'bmp180'
dfPressure<-defValid(dbname, system=systemPressure, parameter1=parameterPressure, sensor1=sensorPressure)

#because we are only scoring for a day here, filter data for 2018-12-15
dfPressure %>%
  filter(date=='2018-12-15')->dfPressure

Observe first 10 rows

dfPressure%>%
  select(timestamp, node_id, lat, lon, by10, value_hrf, val_qual)%>%
  head(10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

## Adding missing grouping variables: `date`

date	timestamp	node_id	lat	lon	by10	value_hrf	val_qual
2018-12-15	2018/12/15 00:00:00	001e0610ee36	41.751295	-87.605288	2018-12-15 00:00:00	1042.56	1
2018-12-15	2018/12/15 00:00:00	001e0610ee43	41.788608	-87.598713	2018-12-15 00:00:00	1011.77	1
2018-12-15	2018/12/15 00:00:01	001e0610bc12	41.75034	-87.663518	2018-12-15 00:00:00	1119.05	0
2018-12-15	2018/12/15 00:00:01	001e06113f54	41.884607	-87.624577	2018-12-15 00:00:00	1017.66	1
2018-12-15	2018/12/15 00:00:01	001e0611537d	41.794167	-87.601646	2018-12-15 00:00:00	1016.86	1
2018-12-15	2018/12/15 00:00:04	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	1080.70	1
2018-12-15	2018/12/15 00:00:05	001e061130f4	41.896157	-87.662391	2018-12-15 00:00:00	997.53	1
2018-12-15	2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	2361.56	0
2018-12-15	2018/12/15 00:00:07	001e06113cf1	41.884688	-87.627864	2018-12-15 00:00:00	1036.71	1
2018-12-15	2018/12/15 00:00:08	001e061146bc	41.918733	-87.668257	2018-12-15 00:00:00	2361.56	0

PM2.5 Concentration

Apply function to retrieve PM 2.5 Concentration data and label reliability.

systemPM25<-'alphasense'
parameterPM25<-'pm2_5'

dfPM25<-defValid(dbname, systemPM25, parameterPM25)

#because we are only scoring for a day here, filter data for 2018-12-15
dfPM25%>%
  filter(date=='2018-12-15')->dfPM25

Observe first 10 rows

dfPM25%>%
  select(timestamp, node_id, lat, lon, by10, value_hrf, val_qual)%>%
  head(10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

timestamp	node_id	lat	lon	by10	value_hrf	val_qual
2018/12/15 00:00:04	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:09	001e0610f05c	41.924903	-87.687703	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:11	001e06113107	41.751142	-87.71299	2018-12-15 00:00:00	12.179	1
2018/12/15 00:00:13	001e06113dbc	41.713867	-87.536509	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:18	001e0610bc10	41.736314	-87.624179	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:21	001e0610ba15	41.722457	-87.57535	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:31	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:31	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:35	001e0610f05c	41.924903	-87.687703	2018-12-15 00:00:00	NA	0

CO Concentration

Apply function to retrieve CO concentration data and label reliability.

systemCO<-'chemsense'
parameterCO<-'concentration'

sensorCO<-'co'
highCO<-1000
lowCO<-0
dfCO<-defValid(dbname, system=systemCO, parameter1=parameterCO, sensor1=sensorCO, high=highCO, low=lowCO)

#because we are only scoring for a day here, filter data for 2018-12-15
dfCO %>% 
  filter(date=='2018-12-15')->dfCO

Observe first 10 rows

dfCO%>%
  select(timestamp, node_id, lat, lon, by10, value_hrf, val_qual)%>%
  head(10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

timestamp	node_id	lat	lon	by10	value_hrf	val_qual
2018/12/15 00:00:00	001e0610ee43	41.788608	-87.598713	2018-12-15 00:00:00	-0.10126	0
2018/12/15 00:00:04	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	-0.52656	0
2018/12/15 00:00:05	001e0610ef27	41.846579	-87.685557	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:05	001e061130f4	41.896157	-87.662391	2018-12-15 00:00:00	0.27291	1
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	0.08977	1
2018/12/15 00:00:07	001e06113cf1	41.884688	-87.627864	2018-12-15 00:00:00	0.12475	1
2018/12/15 00:00:08	001e061146bc	41.918733	-87.668257	2018-12-15 00:00:00	-0.07333	0
2018/12/15 00:00:09	001e0610f05c	41.924903	-87.687703	2018-12-15 00:00:00	-0.08132	0
2018/12/15 00:00:09	001e06114503	41.666078	-87.539374	2018-12-15 00:00:00	-0.11219	0
2018/12/15 00:00:11	001e06113107	41.751142	-87.71299	2018-12-15 00:00:00	0.14908	1

H2S Concentration

Apply function to retrieve H2S concentration data and label reliability.

systemH2S<-'chemsense'
parameterH2S<-'concentration'

sensorH2S<-'h2s'
highH2S<-50
lowH2S<-0
dfH2S<-defValid(dbname, system=systemH2S, parameter1=parameterH2S, sensor1=sensorH2S, high=highH2S, low=lowH2S)

#because we are only scoring for a day here, filter data for 2018-12-15
dfH2S %>%
  filter(date=='2018-12-15')->dfH2S

Observe first 10 rows

dfH2S%>%
  select(timestamp, node_id, lat, lon, by10, value_hrf, val_qual)%>%
  head(10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

timestamp	node_id	lat	lon	by10	value_hrf	val_qual
2018/12/15 00:00:00	001e0610ee43	41.788608	-87.598713	2018-12-15 00:00:00	0.00271	1
2018/12/15 00:00:04	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	-0.13310	0
2018/12/15 00:00:05	001e0610ef27	41.846579	-87.685557	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:05	001e061130f4	41.896157	-87.662391	2018-12-15 00:00:00	0.46145	1
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	0.11730	1
2018/12/15 00:00:07	001e06113cf1	41.884688	-87.627864	2018-12-15 00:00:00	-0.02195	0
2018/12/15 00:00:08	001e061146bc	41.918733	-87.668257	2018-12-15 00:00:00	-0.04071	0
2018/12/15 00:00:09	001e0610f05c	41.924903	-87.687703	2018-12-15 00:00:00	0.02893	1
2018/12/15 00:00:09	001e06114503	41.666078	-87.539374	2018-12-15 00:00:00	0.16494	1
2018/12/15 00:00:11	001e06113107	41.751142	-87.71299	2018-12-15 00:00:00	-0.06809	0

NO2 Concentration

Apply function to retrieve NO2 concentration data and label reliability.

systemNO2<-'chemsense'
parameterNO2<-'concentration'

sensorNO2<-'no2'
highNO2<-20
lowNO2<-0
dfNO2<-defValid(dbname, system=systemNO2, parameter1=parameterNO2, sensor1=sensorNO2, high=highNO2, low=lowNO2)

#because we are only scoring for a day here, filter data for 2018-12-15
dfNO2 %>%
  filter(date=='2018-12-15')->dfNO2

Observe first 10 rows

dfNO2%>%
  select(timestamp, node_id, lat, lon, by10, value_hrf, val_qual)%>%
  head(10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

timestamp	node_id	lat	lon	by10	value_hrf	val_qual
2018/12/15 00:00:00	001e0610ee43	41.788608	-87.598713	2018-12-15 00:00:00	0.00470	1
2018/12/15 00:00:04	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	0.00000	1
2018/12/15 00:00:05	001e0610ef27	41.846579	-87.685557	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:05	001e061130f4	41.896157	-87.662391	2018-12-15 00:00:00	0.00000	1
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	0.00000	1
2018/12/15 00:00:07	001e06113cf1	41.884688	-87.627864	2018-12-15 00:00:00	0.01549	1
2018/12/15 00:00:08	001e061146bc	41.918733	-87.668257	2018-12-15 00:00:00	0.02010	1
2018/12/15 00:00:09	001e0610f05c	41.924903	-87.687703	2018-12-15 00:00:00	0.07814	1
2018/12/15 00:00:09	001e06114503	41.666078	-87.539374	2018-12-15 00:00:00	0.00000	1
2018/12/15 00:00:11	001e06113107	41.751142	-87.71299	2018-12-15 00:00:00	0.00543	1

O3 Concentration

Apply function to retrieve O3 concentration data and label reliability.

systemO3<-'chemsense'
parameterO3<-'concentration'

sensorO3<-'o3'
highO3<-20
lowO3<-0
dfO3<-defValid(dbname, system=systemO3, parameter1=parameterO3, sensor1=sensorO3, high=highO3, low=lowO3)

#because we are only scoring for a day here, filter data for 2018-12-15
dfO3 %>%
  filter(date=='2018-12-15')->dfO3

Observe first 10 rows

dfO3%>%
  select(timestamp, node_id, lat, lon, by10, value_hrf, val_qual)%>%
  head(10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

timestamp	node_id	lat	lon	by10	value_hrf	val_qual
2018/12/15 00:00:00	001e0610ee43	41.788608	-87.598713	2018-12-15 00:00:00	0.00000	1
2018/12/15 00:00:04	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	0.03330	1
2018/12/15 00:00:05	001e0610ef27	41.846579	-87.685557	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:05	001e061130f4	41.896157	-87.662391	2018-12-15 00:00:00	0.08645	1
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	0.02858	1
2018/12/15 00:00:07	001e06113cf1	41.884688	-87.627864	2018-12-15 00:00:00	0.00000	1
2018/12/15 00:00:08	001e061146bc	41.918733	-87.668257	2018-12-15 00:00:00	0.00000	1
2018/12/15 00:00:09	001e0610f05c	41.924903	-87.687703	2018-12-15 00:00:00	0.08059	1
2018/12/15 00:00:09	001e06114503	41.666078	-87.539374	2018-12-15 00:00:00	-0.01805	0
2018/12/15 00:00:11	001e06113107	41.751142	-87.71299	2018-12-15 00:00:00	0.00000	1

SO2 Concentration

Apply function to retrieve SO2 concentration data and label reliability.

systemSO2<-'chemsense'
parameterSO2<-'concentration'

sensorSO2<-'so2'
highSO2<-20
lowSO2<-0
dfSO2<-defValid(dbname, system=systemSO2, parameter1=parameterSO2, sensor1=sensorSO2, high=highSO2, low=lowSO2)

#because we are only scoring for a day here, filter data for 2018-12-15
dfSO2 %>%
  filter(date=='2018-12-15')->dfSO2

Observe first 10 rows

dfSO2%>%
  select(timestamp, node_id, lat, lon, by10, value_hrf, val_qual)%>%
  head(10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c("striped", "hover"))

timestamp	node_id	lat	lon	by10	value_hrf	val_qual
2018/12/15 00:00:00	001e0610ee43	41.788608	-87.598713	2018-12-15 00:00:00	-0.08692	0
2018/12/15 00:00:04	001e061144c0	41.764122	-87.72242	2018-12-15 00:00:00	1.06933	1
2018/12/15 00:00:05	001e0610ef27	41.846579	-87.685557	2018-12-15 00:00:00	NA	0
2018/12/15 00:00:05	001e061130f4	41.896157	-87.662391	2018-12-15 00:00:00	-1.40657	0
2018/12/15 00:00:06	001e06114fd4	41.794477	-87.615957	2018-12-15 00:00:00	-0.60472	0
2018/12/15 00:00:07	001e06113cf1	41.884688	-87.627864	2018-12-15 00:00:00	0.05132	1
2018/12/15 00:00:08	001e061146bc	41.918733	-87.668257	2018-12-15 00:00:00	0.11572	1
2018/12/15 00:00:09	001e0610f05c	41.924903	-87.687703	2018-12-15 00:00:00	-0.05358	0
2018/12/15 00:00:09	001e06114503	41.666078	-87.539374	2018-12-15 00:00:00	-0.20640	0
2018/12/15 00:00:11	001e06113107	41.751142	-87.71299	2018-12-15 00:00:00	0.39536	1
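
The val_qual labels in the table above are consistent with a simple range check against the low and high bounds passed to defValid. The base R sketch below illustrates that rule under stated assumptions: label_reliability is a hypothetical helper and not the project's defValid, the bounds are assumed to be inclusive, and missing readings are assumed to be labelled unreliable, as the NA row above suggests.

```r
# Hypothetical helper (not the project's defValid): label a reading as
# reliable (1) when it lies within [low, high], and unreliable (0) otherwise.
# Assumptions: bounds are inclusive and NA readings count as unreliable.
label_reliability <- function(value, low, high) {
  in_range <- !is.na(value) & value >= low & value <= high
  as.integer(in_range)
}

# First five SO2 readings from the table above, with lowSO2 = 0 and highSO2 = 20
so2 <- c(-0.08692, 1.06933, NA, -1.40657, -0.60472)
so2_qual <- label_reliability(so2, low = 0, high = 20)
print(so2_qual)
```

Applied to these five readings, the rule flags the negative and missing values as unreliable and keeps only the in-range reading, matching the val_qual column shown above.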

3.3 Scoring Sensor Value Reliability

In this section, the method of scoring Sensor Value Reliability is presented for each data parameter type for the day of 2018-12-15. Two scores are obtained for this criterion. In summary, this section scores sensor value reliability for the entire network by analysing the amount of reliable data measured by the sensors in each node in each 10-minute time interval during the day. This amount is expressed as a proportion to account for the different total amounts of data measured by the sensors in different nodes and/or at different times of the day. The node sensor value reliability of each node is obtained by taking the mean of these proportions across the time intervals of the day. The network sensor value reliability (Score 1) is then obtained by taking the mean of all the nodes’ average sensor value reliability. To also observe whether this reliability in sensor values is consistent throughout the day for each node, standard deviation metrics are used to score node sensor value reliability consistency. The overall consistency in sensor value reliability (Score 2) for the network is then obtained as the mean of these nodes’ consistency scores.
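
The first score can be written compactly. The notation below is introduced here for illustration only and is not the authors' own: r_{i,t} is the number of reliable measurements from node i in 10-minute interval t, n_{i,t} is the total number of measurements from that node in that interval, T = 144 is the number of 10-minute intervals in a day, and N is the number of nodes.

```latex
X_{i,t} = 100 \cdot \frac{r_{i,t}}{n_{i,t}}, \qquad
\bar{X}_i = \frac{1}{T} \sum_{t=1}^{T} X_{i,t}, \qquad
\text{Score 1} = \frac{1}{N} \sum_{i=1}^{N} \bar{X}_i
```

In the code that follows, the sum for each node runs over the intervals in which that node reported data while the divisor stays fixed at 144, which is equivalent to setting X to 0 for intervals with no data.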

The flowchart below illustrates the scoring process in this section:

The subsections below show in detail how the scores were constructed for each data parameter.

Temperature

We begin by observing how reliable and unreliable data measurements are distributed across the day for each node in the network on 2018-12-15. The figure below shows how the number of reliable and unreliable measurements varies over the course of the day for each node collecting temperature data in the network. It can be observed that most nodes collect more than 100 data measurements in every 10-minute time interval.

dfTemp%>%
  ggplot()+
  geom_bar(aes(x=by10, fill=as.factor(val_qual)))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Number collected', x='Time', 
       title='Number of Reliable and Unreliable Temperature Data Collected on 2018-12-15 For Each Node\n- By Time')+
  scale_fill_manual(values=c('indianred1', 'cornflowerblue'), 
                    labels=c('Unreliable', 
                            'Reliable'), 
                    name="")+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

Calculate Proportion of Reliable Data Measurements

As the flowchart shows, we need to calculate X, which represents the proportion of reliable data collected by each node at each 10-minute time interval.

From X, we can then calculate the Node Sensor Value Reliability, which represents the average proportion of reliable data collected by each node during the day.

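# X: percentage of reliable measurements per node and 10-minute interval
dfTemp%>%
  group_by(node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(node_id, by10, lat, lon, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  # NodeMeanX: daily mean of X; 144 is the number of 10-minute intervals in a day,
  # so intervals in which a node reported nothing contribute 0 to the mean
  mutate(NodeMeanX = sum(X)/144)->dfTemp1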
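
The toy example below sketches the same two steps in base R on made-up data, so that the effect of the fixed divisor is easy to see. The data frame toy, the two-interval "day" and the variable n_intervals are illustrative assumptions; the code above uses dplyr and a divisor of 144.

```r
# Toy records: one row per measurement, val_qual is 1 (reliable) or 0 (unreliable).
# Node "A" reports in two intervals, node "B" in only one.
toy <- data.frame(
  node_id  = c("A", "A", "A", "A", "A", "A", "B", "B"),
  by10     = c("00:00", "00:00", "00:00", "00:00", "00:10", "00:10", "00:00", "00:00"),
  val_qual = c(1, 1, 0, 0, 1, 0, 1, 1)
)

# X: percentage of reliable measurements per node and 10-minute interval
X <- aggregate(val_qual ~ node_id + by10, data = toy,
               FUN = function(q) 100 * sum(q) / length(q))
names(X)[names(X) == "val_qual"] <- "X"

# Node mean with a fixed divisor. The toy day is assumed to have 2 intervals
# (144 in the real data), so an interval with no data counts as 0.
n_intervals <- 2
node_mean <- sapply(split(X$X, X$node_id), function(x) sum(x) / n_intervals)

print(X)
print(node_mean)
```

Node B is fully reliable in the one interval it reports, yet its node mean is halved because the missing interval is counted as zero. Under this convention, gaps in reporting lower a node's Sensor Value Reliability in the same way that unreliable readings do.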

The table below shows how X varies across the 10-minute time intervals (by10) for a single node, 001e0610ba13. The Node Sensor Value Reliability (NodeMeanX) is the same in every row because it is the daily average of X for that node.

dfTemp1%>%
  arrange(node_id)%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	by10	lat	lon	X	NodeMeanX
001e0610ba13	2018-12-15 00:00:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 00:10:00	41.751238	-87.712990	44.34783	41.35618
001e0610ba13	2018-12-15 00:20:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 00:30:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 00:40:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 00:50:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 01:00:00	41.751238	-87.712990	44.16667	41.35618
001e0610ba13	2018-12-15 01:10:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 01:20:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 01:30:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 01:40:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 01:50:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 02:00:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 02:10:00	41.751238	-87.712990	43.47826	41.35618
001e0610ba13	2018-12-15 02:20:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 02:30:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 02:40:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 02:50:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 03:00:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 03:10:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 03:20:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 03:30:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 03:40:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 03:50:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 04:00:00	41.751238	-87.712990	44.16667	41.35618
001e0610ba13	2018-12-15 04:10:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 04:20:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 04:30:00	41.751238	-87.712990	42.60870	41.35618
001e0610ba13	2018-12-15 04:40:00	41.751238	-87.712990	43.33333	41.35618
001e0610ba13	2018-12-15 04:50:00	41.751238	-87.712990	43.33333	41.35618
001e0610ba13	2018-12-15 05:00:00	41.751238	-87.712990	43.33333	41.35618
001e0610ba13	2018-12-15 05:10:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 05:20:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 05:30:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 05:40:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 05:50:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 06:00:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 06:10:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 06:20:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 06:30:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 06:40:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 06:50:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 07:00:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 07:10:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 07:20:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 07:30:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 07:40:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 07:50:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 08:00:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 08:10:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 08:20:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 08:30:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 08:40:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 08:50:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 09:00:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 09:10:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 09:20:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 09:30:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 09:40:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 09:50:00	41.751238	-87.712990	41.73913	41.35618
001e0610ba13	2018-12-15 10:00:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 10:10:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 10:20:00	41.751238	-87.712990	45.00000	41.35618
001e0610ba13	2018-12-15 10:30:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 10:40:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 10:50:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 11:00:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 11:10:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 11:20:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 11:30:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 11:40:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 11:50:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 12:00:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 12:10:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 12:20:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 12:30:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 12:40:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 12:50:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 13:00:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 13:10:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 13:20:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 13:30:00	41.751238	-87.712990	43.47826	41.35618
001e0610ba13	2018-12-15 13:40:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 13:50:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 14:00:00	41.751238	-87.712990	44.16667	41.35618
001e0610ba13	2018-12-15 14:10:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 14:20:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 14:30:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 14:40:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 14:50:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 15:00:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 15:10:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 15:20:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 15:30:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 15:40:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 15:50:00	41.751238	-87.712990	41.73913	41.35618
001e0610ba13	2018-12-15 16:00:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 16:10:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 16:20:00	41.751238	-87.712990	43.33333	41.35618
001e0610ba13	2018-12-15 16:30:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 16:40:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 16:50:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 17:00:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 17:10:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 17:20:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 17:30:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 17:40:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 17:50:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 18:00:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 18:10:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 18:20:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 18:30:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 18:40:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 18:50:00	41.751238	-87.712990	40.86957	41.35618
001e0610ba13	2018-12-15 19:00:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 19:10:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 19:20:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 19:30:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 19:40:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 19:50:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 20:00:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 20:10:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 20:20:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 20:30:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 20:40:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 20:50:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 21:00:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 21:10:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 21:20:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 21:30:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 21:40:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 21:50:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 22:00:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 22:10:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 22:20:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 22:30:00	41.751238	-87.712990	42.50000	41.35618
001e0610ba13	2018-12-15 22:40:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 22:50:00	41.751238	-87.712990	40.83333	41.35618
001e0610ba13	2018-12-15 23:00:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 23:10:00	41.751238	-87.712990	45.00000	41.35618
001e0610ba13	2018-12-15 23:20:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 23:30:00	41.751238	-87.712990	41.66667	41.35618
001e0610ba13	2018-12-15 23:40:00	41.751238	-87.712990	40.00000	41.35618
001e0610ba13	2018-12-15 23:50:00	41.751238	-87.712990	48.69565	41.35618

The plot below presents how X varies around each node’s Node Sensor Value Reliability.

ggplot()+
  geom_line(data=dfTemp1,aes(x=by10, y=NodeMeanX, group=1), col='black', size=1, linetype='dashed')+
  geom_line(data=dfTemp1,aes(x=by10, y=X, group=1, col="Proportion of Reliable Data"), size=1, alpha=0.5)+
      scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  scale_color_manual('',
                    values=c("Proportion of Reliable Data"='indianred'))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Proportion Reliable', x='Time', 
       title='Proportion of Reliable Temperature Data Collected on 2018-12-15 For Each Node - By Time',
       subtitle='Mean proportion for each node denoted by dashed line.')+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

The table below presents the Node Sensor Value Reliability of each node in the network. From the table, it can be observed that the average sensor value reliability of the nodes ranges from 11.3% to 65.0%.

dfTemp1%>%
  select(-X, -by10)%>%
  unique()%>%
  as.data.frame()%>%
  arrange(desc(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeMeanX
001e0610ef27	41.846579	-87.685557	64.99769
001e061135cb	41.779369	-87.664421	64.12399
001e0610ee33	41.965089	-87.679076	52.79469
001e0610e532	41.857959	-87.656427	52.28412
001e0610f05c	41.924903	-87.687703	52.00181
001e0610e537	41.961622	-87.665948	51.58313
001e061130f4	41.896157	-87.662391	51.45028
001e0610f732	41.895005	-87.745817	51.39065
001e0610eef4	41.912681	-87.681052	51.17562
001e0610ee43	41.788608	-87.598713	51.14156
001e0610ba46	41.878377	-87.627678	51.13476
001e0610ee36	41.751295	-87.605288	51.08645
001e0610ee5d	41.923996	-87.761072	51.03694
001e0610bbf9	41.768319	-87.683396	50.87284
001e0610bc10	41.736314	-87.624179	50.67658
001e0610f6db	41.791329	-87.598677	48.42593
001e06113dbc	41.713867	-87.536509	42.10656
001e06113f54	41.884607	-87.624577	41.87248
001e0610bc12	41.75034	-87.663518	41.78039
001e06113a48	41.943263	-87.688069	41.65358
001e0610ba13	41.751238	-87.712990	41.35618
001e0610ba15	41.722457	-87.57535	41.22434
001e06113cf1	41.884688	-87.627864	40.98682
001e0611537d	41.794167	-87.601646	40.77823
001e06113107	41.751142	-87.71299	38.42869
001e061144c0	41.764122	-87.72242	34.21231
001e0610e538	41.736593	-87.604759	32.02960
001e0610fb4c	41.913583	-87.682414	30.72025
001e06114503	41.666078	-87.539374	30.06114
001e06113ace	41.83107	-87.617298	13.88310
001e0610f703	41.87148	-87.67644	13.38366
001e06114500	41.714494	-87.643099	12.53336
001e0611462f	41.823527	-87.641054	12.48516
001e06113d22	41.800846	-87.703739	12.46162
001e0610f8f4	41.832579	-87.646133	12.09063
001e0611536c	41.88575	-87.62969	12.05440
001e06114fd4	41.794477	-87.615957	12.03402
001e061146bc	41.918733	-87.668257	11.86846
001e0610eef2	41.965256	-87.66672	11.84833
001e061146ba	41.96759	-87.76257	11.50035
001e0610e835	41.968757	-87.679174	11.29906

The spatial distribution of these varying sensor value reliability levels by node is visualised in the following map. It can generally be observed that nodes with similar average sensor value reliability levels tend to be located close to one another. The nodes with the lowest sensor value reliability levels are mostly located around the city centre, with the rest located in isolation at the city periphery.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Blues", dfTemp1$NodeMeanX, n = 5)

dfTemp1$lat<-as.numeric(dfTemp1$lat)
dfTemp1$lon<-as.numeric(dfTemp1$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfTemp1,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeMeanX),fillOpacity = 0.5, 
                   popup = paste("Node:", dfTemp1$node_id, "<br>",
                                 "Mean Proportion of Reliable Data Collected:", round(dfTemp1$NodeMeanX), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfTemp1$NodeMeanX, opacity = 0.7, title = NULL,position = "topright")

Constructing Score 1

The density plot below shows the distribution of node sensor value reliability relative to its average. This average is taken as Score 1, which represents the overall network sensor value reliability of the AoT network for temperature data on 2018-12-15. The network sensor value reliability is essentially the average proportion of reliable data collected by each node in each time interval of the day.

dfTemp1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 1`= mean(NodeMeanX))%>%
  ggplot()+
  geom_density(aes(NodeMeanX), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 1`), size = 2)+
  geom_text(aes(x= `Score 1`, y=0), label='Score 1:\n Average Proportion of\nReliable Data Collected\n by Each Node', size = 4, vjust= -2, hjust=-0.1)+
  labs(x= 'Node Sensor Value Reliability',
       y= 'Density',
       title = 'Distribution of Node Sensor Value Reliability Scores')+
  xlim(0, 100)+
  plotTheme()

dfTemp1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  summarise(Score = mean(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = "striped")

Score
36.3617
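
As a cross-check, Score 1 can be recomputed directly from the NodeMeanX column of the node table shown earlier. The sketch below copies those 41 published values into a vector; the variable names node_mean_x and score1 are introduced here for illustration, and small differences in the last decimal places are possible because the table values are rounded.

```r
# NodeMeanX values copied from the node table above (41 nodes)
node_mean_x <- c(
  64.99769, 64.12399, 52.79469, 52.28412, 52.00181, 51.58313, 51.45028,
  51.39065, 51.17562, 51.14156, 51.13476, 51.08645, 51.03694, 50.87284,
  50.67658, 48.42593, 42.10656, 41.87248, 41.78039, 41.65358, 41.35618,
  41.22434, 40.98682, 40.77823, 38.42869, 34.21231, 32.02960, 30.72025,
  30.06114, 13.88310, 13.38366, 12.53336, 12.48516, 12.46162, 12.09063,
  12.05440, 12.03402, 11.86846, 11.84833, 11.50035, 11.29906
)

# Score 1 is the unweighted mean of the node means
score1 <- mean(node_mean_x)
print(round(score1, 4))
```

Because the mean is unweighted, every node counts equally regardless of how many measurements it contributed during the day.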

Constructing Score 2

Besides the average level of sensor value reliability (Score 1), we are also interested in whether this level is consistent across the day for each node, and ultimately in the average consistency of sensor value reliability levels in the network. Here, the value of consistency needs to be considered in relation to the average level of sensor value reliability scored. While a high level of consistency is desirable when the nodes are generally recording reliable data, a similarly high level of consistency is not at all desirable when the nodes are generally recording unreliable data. This is the basis for the second score.

The table below presents the node sensor value reliability consistency score for each node in the network. Here, the score is first standardised to fit a scale of 0 to 100, with a score of 100 indicating that the level of sensor value reliability is perfectly consistent across the day. In other words, the proportion of reliable values collected is identical for all the 10-minute time intervals of the day for the node. Then, depending on the average level of sensor value reliability, this consistency score is adjusted to reflect its desirability.

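dfTemp1%>%
  select(node_id, lat, lon, by10, X)%>%
  group_by(node_id, lat, lon)%>%
  # cap the standard deviation of X at the node mean so the ratio below stays within 0 to 1
  mutate(NodeSD=ifelse(abs(sd(X))>mean(X), 
                       mean(X),
                       abs(sd(X))))%>%
  # mean(X) >= 50: score is 100 minus the relative spread (steady and reliable scores high)
  # mean(X) < 50: score is the relative spread itself (steady but unreliable scores low)
  mutate(NodeSDScore= ifelse(mean(X)==0, 
                             0, 
                             ifelse(mean(X)<50, 
                                    abs(100-abs(100-100*(NodeSD/mean(X)))),
                                    abs(100-100*(NodeSD/mean(X))))))%>%
  select(node_id, lat, lon, NodeSDScore)%>%
  unique()%>%
  as.data.frame()%>%
  # Score 2: mean of the node consistency scores across the network
  mutate(`Score 2`= mean(NodeSDScore))->dfTemp2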
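
To make the adjustment concrete, the base R sketch below applies the same rule to the X values of a single node. consistency_score is a hypothetical helper written for illustration, not part of the project code. It relies on the observation that, once the standard deviation is capped at the mean, the nested abs() calls above reduce to the relative spread (for a mean below 50) or to 100 minus the relative spread (for a mean of 50 or more).

```r
# Hypothetical helper mirroring the scoring rule above for one node.
# x: that node's X values (percent reliable per 10-minute interval).
consistency_score <- function(x) {
  m <- mean(x)
  if (m == 0) return(0)          # no reliable data at all
  s <- min(sd(x), m)             # cap the standard deviation at the mean
  spread_pct <- 100 * s / m      # relative spread, between 0 and 100
  if (m < 50) spread_pct else 100 - spread_pct
}

steady_good <- consistency_score(c(90, 90, 90, 90))  # consistent and reliable
steady_bad  <- consistency_score(c(10, 10, 10, 10))  # consistent but unreliable
erratic_bad <- consistency_score(c(0, 0, 0, 40))     # unreliable and erratic

print(c(steady_good = steady_good, steady_bad = steady_bad, erratic_bad = erratic_bad))
```

The steady, reliable node scores 100 and the steady, unreliable node scores 0. The erratic, unreliable node also scores 100, because its standard deviation exceeds its mean and is capped, which shows that for nodes below 50 the score rewards variation rather than steadiness.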

dfTemp2%>%
  arrange(desc(NodeSDScore))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeSDScore	Score 2
001e0610bbf9	41.76832	-87.68340	97.726680	42.29874
001e0610ee36	41.75129	-87.60529	97.613689	42.29874
001e0610ee5d	41.92400	-87.76107	97.552482	42.29874
001e0610f732	41.89500	-87.74582	97.426711	42.29874
001e0610ba46	41.87838	-87.62768	97.340617	42.29874
001e0610ee43	41.78861	-87.59871	97.333544	42.29874
001e061130f4	41.89616	-87.66239	97.026105	42.29874
001e0610e537	41.96162	-87.66595	96.662634	42.29874
001e0610e532	41.85796	-87.65643	96.553289	42.29874
001e0610ee33	41.96509	-87.67908	95.940712	42.29874
001e0610f05c	41.92490	-87.68770	95.548531	42.29874
001e0610eef4	41.91268	-87.68105	93.955537	42.29874
001e0610bc10	41.73631	-87.62418	91.976894	42.29874
001e061135cb	41.77937	-87.66442	85.808949	42.29874
001e0610ef27	41.84658	-87.68556	84.491752	42.29874
001e06114503	41.66608	-87.53937	50.173085	42.29874
001e0610f6db	41.79133	-87.59868	19.087700	42.29874
001e0610fb4c	41.91358	-87.68241	16.947028	42.29874
001e06113d22	41.80085	-87.70374	16.149122	42.29874
001e0611462f	41.82353	-87.64105	15.878560	42.29874
001e0610f703	41.87148	-87.67644	14.701805	42.29874
001e06113ace	41.83107	-87.61730	14.566941	42.29874
001e06114500	41.71449	-87.64310	14.513772	42.29874
001e0610eef2	41.96526	-87.66672	14.415931	42.29874
001e061146bc	41.91873	-87.66826	13.812588	42.29874
001e061144c0	41.76412	-87.72242	13.356711	42.29874
001e0610f8f4	41.83258	-87.64613	13.153401	42.29874
001e0611536c	41.88575	-87.62969	12.352170	42.29874
001e061146ba	41.96759	-87.76257	12.231410	42.29874
001e0610e835	41.96876	-87.67917	12.074803	42.29874
001e06114fd4	41.79448	-87.61596	12.032181	42.29874
001e0610e538	41.73659	-87.60476	10.604387	42.29874
001e06113dbc	41.71387	-87.53651	5.784395	42.29874
001e06113a48	41.94326	-87.68807	4.675932	42.29874
001e06113107	41.75114	-87.71299	4.509346	42.29874
001e06113f54	41.88461	-87.62458	4.467886	42.29874
001e0610bc12	41.75034	-87.66352	4.015151	42.29874
001e0610ba15	41.72246	-87.57535	3.388502	42.29874
001e0610ba13	41.75124	-87.71299	3.166652	42.29874
001e06113cf1	41.88469	-87.62786	2.666945	42.29874
001e0611537d	41.79417	-87.60165	2.563674	42.29874

The spatial distribution of this consistency in sensor value reliability by node is visualised in the following map. There, node 001e06114503, the only node with a mid-range consistency score (50.2), can be picked out at the southernmost end of Chicago.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Purples", dfTemp2$NodeSDScore, n = 5)

dfTemp2$lat<-as.numeric(dfTemp2$lat)
dfTemp2$lon<-as.numeric(dfTemp2$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfTemp2,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeSDScore),fillOpacity = 0.5, 
                   popup = paste("Node:", dfTemp2$node_id, "<br>",
                                 "Consistency Score for Level of Sensor Value Reliability:", round(dfTemp2$NodeSDScore), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfTemp2$NodeSDScore, opacity = 0.7, title = NULL,position = "topright")

The density plot below shows the distribution of node sensor value reliability consistency scores relative to the network average. This average is taken as Score 2, which represents the overall consistency in sensor value reliability of the AoT network for temperature data on 2018-12-15.

dfTemp2%>%
  ggplot()+
  geom_density(aes(NodeSDScore), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 2`), size = 2)+
  geom_text(aes(x= `Score 2`, y=0), label='Score 2:\n Overall Consistency\nin Sensor Value Reliability', size = 4, vjust= -2, hjust=1)+
  labs(x='Node Sensor Value Reliability Consistency Score', 
       y='Density',
       title='Distribution of Node Sensor Value Reliability Consistency Scores')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT Temperature Network on 2018-12-15,

Score 1 = 36.4 : On average, only 36.4% of the temperature data measured by the nodes in the network in each 10-minute interval is reliable. This is a low score.
Score 2 = 42.3 : From the density distribution, it can be observed that the node sensor value reliability is consistently bad more often than consistently good, hence the moderate score here.

Humidity

We begin by observing how reliable and unreliable data measurements are distributed across the day for each node in the network on 2018-12-15. The figure below shows how the number of reliable and unreliable measurements varies over the course of the day for each node collecting humidity data in the network. It can be observed that most nodes collect more than 20 data measurements in every 10-minute time interval. It can also be observed that each node collects either only reliable data or only unreliable data - there is no mix of both.

dfHumidity%>%
  ggplot()+
  geom_bar(aes(x=by10, fill=as.factor(val_qual)))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Number collected', x='Time', 
       title='Number of Reliable and Unreliable Humidity Data Collected on 2018-12-15 For Each Node\n- By Time')+
  scale_fill_manual(values=c('indianred1', 'cornflowerblue'), 
                    labels=c('Unreliable', 
                            'Reliable'), 
                    name="")+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

Calculate Proportion of Reliable Data Measurements

As the flowchart shows, we need to calculate X, which represents the proportion of reliable data collected by each node at each 10-minute time interval.

From X, we can then calculate the Node Sensor Value Reliability, which represents the average proportion of reliable data collected by each node during the day.

dfHumidity%>%
  group_by(node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(node_id, by10, lat, lon, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  mutate(NodeMeanX = sum(X)/144)->dfHumidity1

dfHumidity1%>%
  arrange(node_id)%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	by10	lat	lon	X	NodeMeanX
001e0610ba13	2018-12-15 00:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:50:00	41.751238	-87.712990	100	100

The plot below presents how X varies around each node’s Node Sensor Value Reliability.

  ggplot()+
  geom_line(data=dfHumidity1,aes(x=by10, y=NodeMeanX, group=1), col='black', size=1, linetype='dashed')+
  geom_line(data=dfHumidity1,aes(x=by10, y=X, group=1), col='indianred', size=1, alpha=0.5)+
      scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  scale_color_manual('',
                    values=c("Proportion of Reliable Data"='indianred'))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Proportion Reliable', x='Time', 
       title='Proportion of Reliable Humidity Data Collected on 2018-12-15 For Each Node - By Time',
       subtitle='Mean proportion for each node denoted by dashed line.')+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

dfHumidity1%>%
  select(-X, -by10)%>%
  unique()%>%
  as.data.frame()%>%
  arrange(desc(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeMeanX
001e0610ee36	41.751295	-87.605288	100.00000
001e0610ee43	41.788608	-87.598713	100.00000
001e0610bc12	41.75034	-87.663518	100.00000
001e06113f54	41.884607	-87.624577	100.00000
001e061130f4	41.896157	-87.662391	100.00000
001e06113cf1	41.884688	-87.627864	100.00000
001e0610ee5d	41.923996	-87.761072	100.00000
001e06113dbc	41.713867	-87.536509	100.00000
001e0610e532	41.857959	-87.656427	100.00000
001e0610ba46	41.878377	-87.627678	100.00000
001e0610ee33	41.965089	-87.679076	100.00000
001e0610f732	41.895005	-87.745817	100.00000
001e0610bbf9	41.768319	-87.683396	100.00000
001e06113a48	41.943263	-87.688069	100.00000
001e0610ba13	41.751238	-87.712990	100.00000
001e0610e537	41.961622	-87.665948	100.00000
001e0610f6db	41.791329	-87.598677	100.00000
001e06113107	41.751142	-87.71299	92.36111
001e0611537d	41.794167	-87.601646	0.00000
001e061144c0	41.764122	-87.72242	0.00000
001e06114fd4	41.794477	-87.615957	0.00000
001e061146bc	41.918733	-87.668257	0.00000
001e0610f05c	41.924903	-87.687703	0.00000
001e06114503	41.666078	-87.539374	0.00000
001e0611536c	41.88575	-87.62969	0.00000
001e0611462f	41.823527	-87.641054	0.00000
001e0610f8f4	41.832579	-87.646133	0.00000
001e0610f703	41.87148	-87.67644	0.00000
001e06113d22	41.800846	-87.703739	0.00000
001e0610e538	41.736593	-87.604759	0.00000
001e0610bc10	41.736314	-87.624179	0.00000
001e0610eef4	41.912681	-87.681052	0.00000
001e06113ace	41.83107	-87.617298	0.00000
001e061146ba	41.96759	-87.76257	0.00000
001e0610ba15	41.722457	-87.57535	0.00000
001e0610e835	41.968757	-87.679174	0.00000
001e0610eef2	41.965256	-87.66672	0.00000
001e06114500	41.714494	-87.643099	0.00000

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Blues", dfHumidity1$NodeMeanX)

dfHumidity1$lat<-as.numeric(dfHumidity1$lat)
dfHumidity1$lon<-as.numeric(dfHumidity1$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfHumidity1,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeMeanX),fillOpacity = 0.5, 
                   popup = paste("Node:", dfHumidity1$node_id, "<br>",
                                 "Mean Proportion of Reliable Data Collected:", round(dfHumidity1$NodeMeanX), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfHumidity1$NodeMeanX, opacity = 0.7, title = NULL,position = "topright")

Constructing Score 1

The density plot below shows the distribution of node sensor value reliability relative to its average. This average is taken as Score 1, which represents the overall network sensor value reliability of the AoT network for humidity data on 2018-12-15. In essence, the network sensor value reliability is the average proportion of reliable data collected by each node at each time interval of the day.

dfHumidity1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 1`= mean(NodeMeanX))%>%
  ggplot()+
  geom_density(aes(NodeMeanX), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 1`), size = 2)+
  geom_text(aes(x= `Score 1`, y=0), label='Score 1:\n Average Proportion of\nReliable Data Collected\n by Each Node', size = 4, vjust= -2, hjust=-0.1)+
  labs(x= 'Node Sensor Value Reliability',
       y= 'Density',
       title = 'Distribution of Node Sensor Value Reliability Scores')+
  xlim(0, 100)+
  plotTheme()

dfHumidity1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  summarise(Score = mean(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = "striped")

Score
47.1674
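
As a quick check on the figure above, Score 1 can be reproduced directly from the per-node table. The sketch below is only an illustration in base R: the vector `node_mean_x` is typed in by hand from the Node Sensor Value Reliability table for humidity (17 nodes at 100, one at 92.36111 and 20 at 0) rather than read from `dfHumidity1`.

```r
# Node Sensor Value Reliability values copied from the humidity table:
# 17 nodes at 100, one node at 92.36111 and 20 nodes at 0 (38 nodes in total).
node_mean_x <- c(rep(100, 17), 92.36111, rep(0, 20))

# Score 1 is the unweighted mean of the node values.
score_1 <- mean(node_mean_x)
round(score_1, 4)   # 47.1674

# Share of nodes that collected no reliable humidity data at all.
share_zero <- mean(node_mean_x == 0)
round(100 * share_zero, 1)   # 52.6
```

Because the node values cluster at 0 and 100, the mean sits between the two clusters and mainly reflects how many nodes fall into each one.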

Constructing Score 2

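# Score 2: how consistent each node's proportion of reliable data (X) is across the day
dfHumidity1%>%
  select(node_id, lat, lon, by10, X)%>%
  group_by(node_id, lat, lon)%>%
  # Cap the standard deviation of X at the node's mean of X,
  # so that NodeSD/mean(X) cannot exceed 1
  mutate(NodeSD=ifelse(abs(sd(X))>mean(X), 
                       mean(X),
                       abs(sd(X))))%>%
  # Nodes with a mean of 0 score 0; all other nodes are scored from NodeSD/mean(X)
  mutate(NodeSDScore= ifelse(mean(X)==0, 
                             0, 
                             ifelse(mean(X)<50, 
                                    abs(100-abs(100-100*(NodeSD/mean(X)))),
                                    abs(100-100*(NodeSD/mean(X))))))%>%
  select(node_id, lat, lon, NodeSDScore)%>%
  unique()%>%
  as.data.frame()%>%
  # Score 2 is the mean of the node scores across the network
  mutate(`Score 2`= mean(NodeSDScore))->dfHumidity2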

dfHumidity2%>%
  arrange(desc(NodeSDScore))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeSDScore	Score 2
001e0610ee36	41.75129	-87.60529	100	47.36842
001e0610ee43	41.78861	-87.59871	100	47.36842
001e0610bc12	41.75034	-87.66352	100	47.36842
001e06113f54	41.88461	-87.62458	100	47.36842
001e061130f4	41.89616	-87.66239	100	47.36842
001e06113cf1	41.88469	-87.62786	100	47.36842
001e0610ee5d	41.92400	-87.76107	100	47.36842
001e06113107	41.75114	-87.71299	100	47.36842
001e06113dbc	41.71387	-87.53651	100	47.36842
001e0610e532	41.85796	-87.65643	100	47.36842
001e0610ba46	41.87838	-87.62768	100	47.36842
001e0610ee33	41.96509	-87.67908	100	47.36842
001e0610f732	41.89500	-87.74582	100	47.36842
001e0610bbf9	41.76832	-87.68340	100	47.36842
001e06113a48	41.94326	-87.68807	100	47.36842
001e0610ba13	41.75124	-87.71299	100	47.36842
001e0610e537	41.96162	-87.66595	100	47.36842
001e0610f6db	41.79133	-87.59868	100	47.36842
001e0611537d	41.79417	-87.60165	0	47.36842
001e061144c0	41.76412	-87.72242	0	47.36842
001e06114fd4	41.79448	-87.61596	0	47.36842
001e061146bc	41.91873	-87.66826	0	47.36842
001e0610f05c	41.92490	-87.68770	0	47.36842
001e06114503	41.66608	-87.53937	0	47.36842
001e0611536c	41.88575	-87.62969	0	47.36842
001e0611462f	41.82353	-87.64105	0	47.36842
001e0610f8f4	41.83258	-87.64613	0	47.36842
001e0610f703	41.87148	-87.67644	0	47.36842
001e06113d22	41.80085	-87.70374	0	47.36842
001e0610e538	41.73659	-87.60476	0	47.36842
001e0610bc10	41.73631	-87.62418	0	47.36842
001e0610eef4	41.91268	-87.68105	0	47.36842
001e06113ace	41.83107	-87.61730	0	47.36842
001e061146ba	41.96759	-87.76257	0	47.36842
001e0610ba15	41.72246	-87.57535	0	47.36842
001e0610e835	41.96876	-87.67917	0	47.36842
001e0610eef2	41.96526	-87.66672	0	47.36842
001e06114500	41.71449	-87.64310	0	47.36842
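
To make the scoring rule behind this table easier to follow, the sketch below restates the `NodeSDScore` calculation for a single node in base R. The helper `node_sd_score` is hypothetical and not part of the original code. It relies on an algebraic simplification of the nested `abs()` calls, which holds because the capped ratio lies between 0 and 1, and it assumes the node has at least two values of X.

```r
# Hypothetical helper (not part of the original code): consistency score for
# one node, given x, its X values across the day (at least two values).
node_sd_score <- function(x) {
  m <- mean(x)
  if (m == 0) return(0)        # node never collected reliable data
  ratio <- min(sd(x), m) / m   # standard deviation capped at the mean, so 0 <= ratio <= 1
  if (m < 50) 100 * ratio else 100 * (1 - ratio)
}

node_sd_score(rep(100, 144))                # always reliable: 100
node_sd_score(rep(0, 144))                  # never reliable: 0
node_sd_score(c(rep(100, 72), rep(0, 72)))  # reliable for half the day: 0

# Score 2 from the table above: 18 nodes at 100 and 20 nodes at 0.
score_2 <- mean(c(rep(100, 18), rep(0, 20)))
round(score_2, 5)                           # 47.36842
```

Under this rule a node that is always reliable scores 100 and a node that is never reliable scores 0, which is why the humidity table contains only these two values.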

The spatial distribution of this consistency in sensor value reliability by node is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Purples", dfHumidity2$NodeSDScore)

dfHumidity2$lat<-as.numeric(dfHumidity2$lat)
dfHumidity2$lon<-as.numeric(dfHumidity2$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfHumidity2,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeSDScore),fillOpacity = 0.5, 
                   popup = paste("Node:", dfHumidity2$node_id, "<br>",
                                 "Consistency Score for Level of Sensor Value Reliability:", round(dfHumidity2$NodeSDScore), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfHumidity2$NodeSDScore, opacity = 0.7, title = NULL,position = "topright")

dfHumidity2%>%
  ggplot()+
  geom_density(aes(NodeSDScore), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 2`), size = 2)+
  geom_text(aes(x= `Score 2`, y=0), label='Score 2:\n Overall Consistency\nin Sensor Value Reliability', size = 4, vjust= -2, hjust=1)+
  labs(x='Node Sensor Value Reliability Consistency Score', 
       y='Density',
       title='Distribution of Node Sensor Value Reliability Consistency Scores')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT Humidity Network on 2018-12-15,

Score 1 = 47.2 : On average, only 47.2% of the humidity data measured by the nodes in the network in each 10-minute interval is reliable. This is a low score.
Score 2 = 47.4 : From the density distribution, it can be observed that node sensor value reliability is consistently bad more often than it is consistently good, hence the moderate score here.

Pressure

We begin by observing how reliable and unreliable data measurements are distributed across the day for each node in the network on 2018-12-15. The figure below shows how the numbers of reliable and unreliable measurements vary across the day for each node collecting pressure data in the network. It can be observed that most nodes collect more than 20 data measurements in every 10-minute time interval. It can also be observed that each node collects either reliable data or unreliable data, with no mix of both.

dfPressure%>%
   ggplot()+
  geom_bar(aes(x=by10, fill=as.factor(val_qual)))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Number collected', x='Time', 
       title='Number of Reliable and Unreliable Pressure Data Collected on 2018-12-15 For Each Node\n- By Time')+
  scale_fill_manual(values=c('indianred1', 'cornflowerblue'), 
                    labels=c('Unreliable', 
                            'Reliable'), 
                    name="")+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))
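
The observation that no node mixes reliable and unreliable measurements can also be checked numerically rather than read off the plot. The sketch below shows one way to do this in base R on a small made-up data frame. The object `toy` and its values are hypothetical, and `val_qual` is assumed to be coded 1 for reliable and 0 for unreliable, as the later `sum(val_qual)/n()` calculation implies.

```r
# Toy stand-in for dfPressure (hypothetical values);
# val_qual is assumed to be 1 (reliable) or 0 (unreliable).
toy <- data.frame(
  node_id  = c("A", "A", "A", "B", "B", "C", "C"),
  val_qual = c(1, 1, 1, 0, 0, 1, 0)
)

# Count the distinct val_qual values recorded by each node.
n_flags <- tapply(toy$val_qual, toy$node_id, function(v) length(unique(v)))

# Nodes with more than one distinct value mix reliable and unreliable readings.
mixed_nodes <- names(n_flags)[n_flags > 1]
mixed_nodes   # "C"
```

Applied to the real data, an empty result would be consistent with the pattern described in the text.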

Calculate Proportion of Reliable Data Measurements

As the flowchart shows, we need to calculate X, which represents the proportion of reliable data collected by each node at each 10-minute time interval.

From X, we can then calculate the Node Sensor Value Reliability, which represents the average proportion of reliable data collected by each node during the day.

dfPressure%>%
  group_by(node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(node_id, by10, lat, lon, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  mutate(NodeMeanX = sum(X)/144)->dfPressure1
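
The two quantities computed above can be illustrated on a small example. The sketch below uses base R and made-up data: the data frame `toy` and the vector `gappy_node` are hypothetical. It also shows what the fixed divisor of 144 implies, since a day has 24 * 6 = 144 ten-minute intervals and any interval without measurements contributes nothing to `sum(X)`.

```r
# Toy stand-in for dfPressure (hypothetical values): one row per measurement.
toy <- data.frame(
  node_id  = c("A", "A", "A", "A", "B", "B"),
  by10     = c("00:00", "00:00", "00:10", "00:10", "00:00", "00:00"),
  val_qual = c(1, 1, 1, 0, 0, 0)
)

# X: percentage of reliable measurements per node and 10-minute interval.
x_by_interval <- aggregate(val_qual ~ node_id + by10, data = toy,
                           FUN = function(v) 100 * sum(v) / length(v))
names(x_by_interval)[names(x_by_interval) == "val_qual"] <- "X"
x_by_interval

# Hypothetical node with X = 100 in 133 intervals and no rows for the other 11.
intervals_per_day <- 24 * 6
gappy_node <- rep(100, 133)
sum(gappy_node) / intervals_per_day   # 92.36111: missing intervals count as 0
mean(gappy_node)                      # 100: missing intervals are ignored
```

Dividing by the fixed number of intervals therefore lowers the score of nodes that stop reporting for part of the day, whereas a plain `mean(X)` would not.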

dfPressure1%>%
  arrange(node_id)%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	by10	lat	lon	X	NodeMeanX
001e0610ba13	2018-12-15 00:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:50:00	41.751238	-87.712990	100	100

The plot below presents how X varies around each node’s Node Sensor Value Reliability.

  ggplot()+
  geom_line(data=dfPressure1,aes(x=by10, y=NodeMeanX, group=1), col='black', size=1, linetype='dashed')+
  geom_line(data=dfPressure1,aes(x=by10, y=X, group=1), col='indianred', size=1, alpha=0.5)+
      scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  scale_color_manual('',
                    values=c("Proportion of Reliable Data"='indianred'))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Proportion Reliable', x='Time', 
       title='Proportion of Reliable Pressure Data Collected on 2018-12-15 For Each Node - By Time',
       subtitle='Mean proportion for each node denoted by dashed line.')+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

dfPressure1%>%
  select(-X, -by10)%>%
  unique()%>%
  as.data.frame()%>%
  arrange(desc(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeMeanX
001e0610ee36	41.751295	-87.605288	100.00000
001e0610ee43	41.788608	-87.598713	100.00000
001e06113f54	41.884607	-87.624577	100.00000
001e0611537d	41.794167	-87.601646	100.00000
001e061130f4	41.896157	-87.662391	100.00000
001e06113cf1	41.884688	-87.627864	100.00000
001e0610f05c	41.924903	-87.687703	100.00000
001e0610ee5d	41.923996	-87.761072	100.00000
001e06113dbc	41.713867	-87.536509	100.00000
001e0610e532	41.857959	-87.656427	100.00000
001e0610ba46	41.878377	-87.627678	100.00000
001e0610ee33	41.965089	-87.679076	100.00000
001e0610f732	41.895005	-87.745817	100.00000
001e0610bc10	41.736314	-87.624179	100.00000
001e0610eef4	41.912681	-87.681052	100.00000
001e0610bbf9	41.768319	-87.683396	100.00000
001e06113a48	41.943263	-87.688069	100.00000
001e0610ba15	41.722457	-87.57535	100.00000
001e0610ba13	41.751238	-87.712990	100.00000
001e0610e537	41.961622	-87.665948	100.00000
001e0610f6db	41.791329	-87.598677	100.00000
001e0610e538	41.736593	-87.604759	93.05556
001e061144c0	41.764122	-87.72242	92.36111
001e06113107	41.751142	-87.71299	92.36111
001e0610bc12	41.75034	-87.663518	0.00000
001e06114fd4	41.794477	-87.615957	0.00000
001e061146bc	41.918733	-87.668257	0.00000
001e06114503	41.666078	-87.539374	0.00000
001e0611536c	41.88575	-87.62969	0.00000
001e0611462f	41.823527	-87.641054	0.00000
001e0610f8f4	41.832579	-87.646133	0.00000
001e0610f703	41.87148	-87.67644	0.00000
001e06113d22	41.800846	-87.703739	0.00000
001e06113ace	41.83107	-87.617298	0.00000
001e061146ba	41.96759	-87.76257	0.00000
001e0610e835	41.968757	-87.679174	0.00000
001e0610eef2	41.965256	-87.66672	0.00000
001e06114500	41.714494	-87.643099	0.00000

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Blues", dfPressure1$NodeMeanX)

dfPressure1$lat<-as.numeric(dfPressure1$lat)
dfPressure1$lon<-as.numeric(dfPressure1$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfPressure1,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeMeanX),fillOpacity = 0.5, 
                   popup = paste("Node:", dfPressure1$node_id, "<br>",
                                 "Mean Proportion of Reliable Data Collected:", round(dfPressure1$NodeMeanX), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfPressure1$NodeMeanX, opacity = 0.7, title = NULL,position = "topright")

Constructing Score 1

The density plot below shows the distribution of node sensor value reliability relative to its average. This average is taken as Score 1, which represents the overall sensor value reliability of the AoT network for pressure data on 2018-12-15. In essence, the network sensor value reliability is the average proportion of reliable data collected by each node across the 10-minute time intervals of the day.

dfPressure1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 1`= mean(NodeMeanX))%>%
  ggplot()+
  geom_density(aes(NodeMeanX), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 1`), size = 2)+
  geom_text(aes(x= `Score 1`, y=0), label='Score 1:\n Average Proportion of\nReliable Data Collected\n by Each Node', size = 4, vjust= -2, hjust=-0.1)+
  labs(x= 'Node Sensor Value Reliability',
       y= 'Density',
       title = 'Distribution of Node Sensor Value Reliability Scores')+
  xlim(0, 100)+
  plotTheme()

dfPressure1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  summarise(Score = mean(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = "striped")

Score
62.5731
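
As a quick sanity check, Score 1 can be recomputed by hand from the node-level table above, which lists 21 nodes at 100, one at 93.05556, two at 92.36111 and 14 at 0. The sketch below uses base R only; the vector `node_mean_x` is typed in from that table and is not an object created by the original code.

```r
# NodeMeanX values transcribed from the node-level pressure table (38 nodes).
node_mean_x <- c(rep(100, 21), 93.05556, rep(92.36111, 2), rep(0, 14))

# Score 1 is the unweighted mean of the node-level reliability values.
score_1 <- mean(node_mean_x)

length(node_mean_x)   # 38 nodes
round(score_1, 4)     # 62.5731
```

Because more than a third of the nodes (14 of 38) sit at exactly 0, the mean lands well below the value at which most of the working nodes cluster.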

Constructing Score 2

dfPressure1%>%
  select(node_id, lat, lon, by10, X)%>%
  group_by(node_id, lat, lon)%>%
  mutate(NodeSD=ifelse(abs(sd(X))>mean(X), 
                       mean(X),
                       abs(sd(X))))%>%
  mutate(NodeSDScore= ifelse(mean(X)==0, 
                             0, 
                             ifelse(mean(X)<50, 
                                    abs(100-abs(100-100*(NodeSD/mean(X)))),
                                    abs(100-100*(NodeSD/mean(X))))))%>%
  select(node_id, lat, lon, NodeSDScore)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 2`= mean(NodeSDScore))->dfPressure2

dfPressure2%>%
  arrange(desc(NodeSDScore))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeSDScore	Score 2
001e0610ee36	41.75129	-87.60529	100	63.15789
001e0610ee43	41.78861	-87.59871	100	63.15789
001e06113f54	41.88461	-87.62458	100	63.15789
001e0611537d	41.79417	-87.60165	100	63.15789
001e061144c0	41.76412	-87.72242	100	63.15789
001e061130f4	41.89616	-87.66239	100	63.15789
001e06113cf1	41.88469	-87.62786	100	63.15789
001e0610f05c	41.92490	-87.68770	100	63.15789
001e0610ee5d	41.92400	-87.76107	100	63.15789
001e06113107	41.75114	-87.71299	100	63.15789
001e06113dbc	41.71387	-87.53651	100	63.15789
001e0610e532	41.85796	-87.65643	100	63.15789
001e0610ba46	41.87838	-87.62768	100	63.15789
001e0610ee33	41.96509	-87.67908	100	63.15789
001e0610e538	41.73659	-87.60476	100	63.15789
001e0610f732	41.89500	-87.74582	100	63.15789
001e0610bc10	41.73631	-87.62418	100	63.15789
001e0610eef4	41.91268	-87.68105	100	63.15789
001e0610bbf9	41.76832	-87.68340	100	63.15789
001e06113a48	41.94326	-87.68807	100	63.15789
001e0610ba15	41.72246	-87.57535	100	63.15789
001e0610ba13	41.75124	-87.71299	100	63.15789
001e0610e537	41.96162	-87.66595	100	63.15789
001e0610f6db	41.79133	-87.59868	100	63.15789
001e0610bc12	41.75034	-87.66352	0	63.15789
001e06114fd4	41.79448	-87.61596	0	63.15789
001e061146bc	41.91873	-87.66826	0	63.15789
001e06114503	41.66608	-87.53937	0	63.15789
001e0611536c	41.88575	-87.62969	0	63.15789
001e0611462f	41.82353	-87.64105	0	63.15789
001e0610f8f4	41.83258	-87.64613	0	63.15789
001e0610f703	41.87148	-87.67644	0	63.15789
001e06113d22	41.80085	-87.70374	0	63.15789
001e06113ace	41.83107	-87.61730	0	63.15789
001e061146ba	41.96759	-87.76257	0	63.15789
001e0610e835	41.96876	-87.67917	0	63.15789
001e0610eef2	41.96526	-87.66672	0	63.15789
001e06114500	41.71449	-87.64310	0	63.15789
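
The nested `ifelse()` calls that produce NodeSDScore can be easier to follow as a function of a single node's X values. The sketch below restates the same expression in base R; the function name `node_sd_score` and the example vectors are hypothetical, and it assumes each node has at least two time intervals so that `sd()` is defined.

```r
# Restates the NodeSDScore expression for one node's vector of X values (0-100).
node_sd_score <- function(x) {
  m <- mean(x)
  if (m == 0) return(0)                    # a node with no reliable data scores 0
  s <- min(sd(x), m)                       # cap the standard deviation at the mean
  ratio <- s / m                           # capped ratio of sd to mean, in [0, 1]
  if (m < 50) {
    abs(100 - abs(100 - 100 * ratio))      # branch used when mean(X) < 50
  } else {
    abs(100 - 100 * ratio)                 # branch used when mean(X) >= 50
  }
}

node_sd_score(rep(100, 144))   # 100: always fully reliable
node_sd_score(rep(0, 144))     # 0: never reliable
node_sd_score(rep(30, 144))    # 0: constant, but mean(X) is below 50

# Score 2 is the mean of the node scores; the table above has 24 nodes at 100 and 14 at 0.
score_2 <- mean(c(rep(100, 24), rep(0, 14)))
round(score_2, 5)              # 63.15789
```

Since the capped ratio always lies between 0 and 1, the first branch reduces to 100 * ratio and the second to 100 * (1 - ratio), which may be a more convenient form when checking individual nodes.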

The spatial distribution of this consistency in sensor value reliability by node is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Purples", dfPressure2$NodeSDScore)

dfPressure2$lat<-as.numeric(dfPressure2$lat)
dfPressure2$lon<-as.numeric(dfPressure2$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfPressure2,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeSDScore),fillOpacity = 0.5, 
                   popup = paste("Node:", dfPressure2$node_id, "<br>",
                                 "Consistency Score for Level of Sensor Value Reliability:", round(dfPressure2$NodeSDScore), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfPressure2$NodeSDScore, opacity = 0.7, title = NULL,position = "topright")

dfPressure2%>%
  ggplot()+
  geom_density(aes(NodeSDScore), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 2`), size = 2)+
  geom_text(aes(x= `Score 2`, y=0), label='Score 2:\n Overall Consistency\nin Sensor Value Reliability', size = 4, vjust= -2, hjust=1)+
  labs(x='Node Sensor Value Reliability Consistency Score', 
       y='Density',
       title='Distribution of Node Sensor Value Reliability Consistency Scores')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT Pressure Network on 2018-12-15,

Score 1 = 62.6 : On average, only 62.6% of the pressure data measured by the nodes in the network in each 10-minute interval is reliable. This is a moderate score.
Score 2 = 63.2 : The density distribution shows that node sensor value reliability is consistently good more often than it is consistently bad, hence the above-moderate score here.

PM2.5 Concentration

We begin by observing how reliable and unreliable data measurements are distributed across the day for each node in the network on 2018-12-15. The figure below shows how the number of reliable and unreliable measurements varies over the course of the day for each node collecting PM2.5 concentration data. Most nodes collect more than 20 measurements in every 10-minute time interval.

dfPM25%>%
  ggplot()+
  geom_bar(aes(x=by10, fill=as.factor(val_qual)))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Number collected', x='Time', 
       title='Number of Reliable and Unreliable PM 2.5 Concentration Data Collected on 2018-12-15 For Each Node\n- By Time')+
  scale_fill_manual(values=c('indianred1', 'cornflowerblue'), 
                    labels=c('Unreliable', 
                            'Reliable'), 
                    name="")+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

Calculate Proportion of Reliable Data Measurements

As the flowchart shows, we need to calculate X, which represents the proportion of reliable data collected by each node at each 10-minute time interval.

From X, we can then calculate the Node Sensor Value Reliability, which represents the average proportion of reliable data collected by each node during the day.
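
One detail worth noting about the chunk that follows: NodeMeanX divides by a fixed 144, the number of 10-minute intervals in a day, rather than by the number of intervals in which a node actually reported. Intervals with no readings therefore count as 0% reliable. The sketch below illustrates this on a hypothetical toy data frame (`toy`, `node_a` and its values are invented for illustration), using base R in place of dplyr.

```r
# Hypothetical readings: one node that reported in only 3 of the day's 144 intervals.
toy <- data.frame(
  node_id  = "node_a",
  by10     = c("00:00", "00:00", "00:10", "00:10", "00:20"),
  val_qual = c(1, 0, 1, 1, 1)   # 1 = reliable, 0 = unreliable
)

# X: percentage of reliable readings within each node/interval pair.
x_by_interval <- aggregate(val_qual ~ node_id + by10, data = toy,
                           FUN = function(v) 100 * sum(v) / length(v))
names(x_by_interval)[names(x_by_interval) == "val_qual"] <- "X"
x_by_interval$X               # 50 100 100

# Node Sensor Value Reliability with the fixed daily denominator.
sum(x_by_interval$X) / 144    # about 1.74

# Mean over reported intervals only, shown for comparison.
mean(x_by_interval$X)         # about 83.33
```

Under this convention, a node-level value such as 92.36111 equals 100 × 133/144, which would be consistent with a node that was fully reliable in 133 intervals and reported nothing in the remaining 11.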

dfPM25%>%
  group_by(node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(node_id, by10, lat, lon, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  mutate(NodeMeanX = sum(X)/144)->dfPM251

dfPM251%>%
  arrange(node_id)%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	by10	lat	lon
001e0610ba15	2018-12-15 00:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 00:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 00:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 00:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 00:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 00:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 01:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 01:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 01:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 01:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 01:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 01:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 02:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 02:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 02:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 02:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 02:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 02:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 03:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 03:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 03:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 03:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 03:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 03:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 04:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 04:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 04:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 04:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 04:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 04:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 05:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 05:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 05:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 05:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 05:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 05:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 06:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 06:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 06:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 06:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 06:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 06:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 07:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 07:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 07:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 07:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 07:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 07:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 08:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 08:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 08:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 08:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 08:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 08:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 09:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 09:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 09:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 09:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 09:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 09:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 10:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 10:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 10:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 10:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 10:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 10:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 11:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 11:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 11:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 11:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 11:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 11:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 12:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 12:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 12:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 12:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 12:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 12:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 13:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 13:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 13:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 13:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 13:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 13:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 14:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 14:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 14:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 14:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 14:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 14:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 15:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 15:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 15:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 15:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 15:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 15:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 16:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 16:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 16:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 16:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 16:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 16:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 17:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 17:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 17:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 17:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 17:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 17:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 18:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 18:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 18:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 18:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 18:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 18:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 19:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 19:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 19:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 19:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 19:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 19:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 20:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 20:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 20:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 20:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 20:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 20:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 21:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 21:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 21:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 21:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 21:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 21:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 22:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 22:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 22:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 22:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 22:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 22:50:00	41.722457	-87.57535
001e0610ba15	2018-12-15 23:00:00	41.722457	-87.57535
001e0610ba15	2018-12-15 23:10:00	41.722457	-87.57535
001e0610ba15	2018-12-15 23:20:00	41.722457	-87.57535
001e0610ba15	2018-12-15 23:30:00	41.722457	-87.57535
001e0610ba15	2018-12-15 23:40:00	41.722457	-87.57535
001e0610ba15	2018-12-15 23:50:00	41.722457	-87.57535

The plot below presents how X varies around each node’s Node Sensor Value Reliability.

  ggplot()+
  geom_line(data=dfPM251,aes(x=by10, y=NodeMeanX, group=1), col='black', size=1, linetype='dashed')+
  geom_line(data=dfPM251,aes(x=by10, y=X, group=1), col='indianred', size=1, alpha=0.5)+
      scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  scale_color_manual('',
                    values=c("Proportion of Reliable Data"='indianred'))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Proportion Reliable', x='Time', 
       title='Proportion of Reliable PM 2.5 Concentration Data Collected on 2018-12-15 For Each Node - By Time',
       subtitle='Mean proportion for each node denoted by dashed line.')+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

dfPM251%>%
  select(-X, -by10)%>%
  unique()%>%
  as.data.frame()%>%
  arrange(desc(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeMeanX
001e06113107	41.751142	-87.71299	92.36111
001e0610bc10	41.736314	-87.624179	43.22665
001e061144c0	41.764122	-87.72242	0.00000
001e06114fd4	41.794477	-87.615957	0.00000
001e0610f05c	41.924903	-87.687703	0.00000
001e06113dbc	41.713867	-87.536509	0.00000
001e0610ba15	41.722457	-87.57535	0.00000
001e06114500	41.714494	-87.643099	0.00000

The spatial distribution of these varying levels of sensor value reliability by node is then visualised in the following map. The map also shows that the few PM2.5 nodes are concentrated on the south side of the city, with only one node located in the north.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Blues", dfPM251$NodeMeanX)

dfPM251$lat<-as.numeric(dfPM251$lat)
dfPM251$lon<-as.numeric(dfPM251$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfPM251,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeMeanX),fillOpacity = 0.5, 
                   popup = paste("Node:", dfPM251$node_id, "<br>",
                                 "Mean Proportion of Reliable Data Collected:", round(dfPM251$NodeMeanX), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfPM251$NodeMeanX, opacity = 0.7, title = NULL,position = "topright")

Constructing Score 1

The density plot below shows the distribution of node sensor value reliability relative to its average. This average is taken as Score 1, which represents the overall sensor value reliability of the AoT network for PM 2.5 concentration data on 2018-12-15. In essence, the network sensor value reliability is the average proportion of reliable data collected by each node across the 10-minute time intervals of the day.

dfPM251%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 1`= mean(NodeMeanX))%>%
  ggplot()+
  geom_density(aes(NodeMeanX), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 1`), size = 2)+
  geom_text(aes(x= `Score 1`, y=0), label='Score 1:\n Average Proportion of\nReliable Data Collected\n by Each Node', size = 4, vjust= -2, hjust=-0.1)+
  labs(x= 'Node Sensor Value Reliability',
       y= 'Density',
       title = 'Distribution of Node Sensor Value Reliability Scores')+
  xlim(0, 100)+
  plotTheme()

dfPM251%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  summarise(Score = mean(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = "striped")

Score
16.94847

Constructing Score 2

dfPM251%>%
  select(node_id, lat, lon, by10, X)%>%
  group_by(node_id, lat, lon)%>%
  mutate(NodeSD=ifelse(abs(sd(X))>mean(X), 
                       mean(X),
                       abs(sd(X))))%>%
  mutate(NodeSDScore= ifelse(mean(X)==0, 
                             0, 
                             ifelse(mean(X)<50, 
                                    abs(100-abs(100-100*(NodeSD/mean(X)))),
                                    abs(100-100*(NodeSD/mean(X))))))%>%
  select(node_id, lat, lon, NodeSDScore)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 2`= mean(NodeSDScore))->dfPM252

dfPM252%>%
  arrange(desc(NodeSDScore))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeSDScore	Score 2
001e06113107	41.75114	-87.71299	100	25
001e0610bc10	41.73631	-87.62418	100	25
001e061144c0	41.76412	-87.72242	0	25
001e06114fd4	41.79448	-87.61596	0	25
001e0610f05c	41.92490	-87.68770	0	25
001e06113dbc	41.71387	-87.53651	0	25
001e0610ba15	41.72246	-87.57535	0	25
001e06114500	41.71449	-87.64310	0	25

The spatial distribution of this consistency in sensor value reliability by node is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Purples", dfPM252$NodeSDScore)

dfPM252$lat<-as.numeric(dfPM252$lat)
dfPM252$lon<-as.numeric(dfPM252$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfPM252,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeSDScore),fillOpacity = 0.5, 
                   popup = paste("Node:", dfPM252$node_id, "<br>",
                                 "Consistency Score for Level of Sensor Value Reliability:", round(dfPM252$NodeSDScore), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfPM252$NodeSDScore, opacity = 0.7, title = NULL,position = "topright")

dfPM252%>%
  ggplot()+
  geom_density(aes(NodeSDScore), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 2`), size = 2)+
  geom_text(aes(x= `Score 2`, y=0), label='Score 2:\n Overall Consistency\nin Sensor Value Reliability', size = 4, vjust= -2, hjust=1)+
  labs(x='Node Sensor Value Reliability Consistency Score', 
       y='Density',
       title='Distribution of Node Sensor Value Reliability Consistency Scores')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT PM 2.5 Network on 2018-12-15,

Score 1 = 16.9 : On average, only 16.9% of the PM 2.5 concentration data measured by the nodes in the network in each 10-minute interval is reliable. This is a low score.
Score 2 = 25.0 : Only 2 nodes consistently collected reliable data. The other nodes consistently collected unreliable data, hence the low score here.

CO Concentration

We begin by observing how reliable and unreliable data measurements are distributed across the day for each node in the network on 2018-12-15. The figure below shows how the number of reliable and unreliable measurements varies over the course of the day for each node collecting CO concentration data. Most nodes collect around 20 measurements in every 10-minute time interval.

dfCO%>%
  ggplot()+
  geom_bar(aes(x=by10, fill=as.factor(val_qual)))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Number collected', x='Time', 
       title='Number of Reliable and Unreliable CO concentration Data Collected on 2018-12-15 For Each Node\n- By Time')+
  scale_fill_manual(values=c('indianred1', 'cornflowerblue'), 
                    labels=c('Unreliable', 
                            'Reliable'), 
                    name="")+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

Calculate Proportion of Reliable Data Measurements

As the flowchart shows, we need to calculate X, which represents the proportion of reliable data collected by each node at each 10-minute time interval.

From X, we can then calculate the Node Sensor Value Reliability, which represents the average proportion of reliable data collected by each node during the day.

dfCO%>%
  group_by(node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(node_id, by10, lat, lon, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  mutate(NodeMeanX = sum(X)/144)->dfCO1

dfCO1%>%
  arrange(node_id)%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	by10	lat	lon	X	NodeMeanX
001e0610ba13	2018-12-15 00:00:00	41.751238	-87.712990	37.500000	39.41967
001e0610ba13	2018-12-15 00:10:00	41.751238	-87.712990	21.739130	39.41967
001e0610ba13	2018-12-15 00:20:00	41.751238	-87.712990	45.833333	39.41967
001e0610ba13	2018-12-15 00:30:00	41.751238	-87.712990	50.000000	39.41967
001e0610ba13	2018-12-15 00:40:00	41.751238	-87.712990	45.833333	39.41967
001e0610ba13	2018-12-15 00:50:00	41.751238	-87.712990	25.000000	39.41967
001e0610ba13	2018-12-15 01:00:00	41.751238	-87.712990	16.666667	39.41967
001e0610ba13	2018-12-15 01:10:00	41.751238	-87.712990	66.666667	39.41967
001e0610ba13	2018-12-15 01:20:00	41.751238	-87.712990	87.500000	39.41967
001e0610ba13	2018-12-15 01:30:00	41.751238	-87.712990	37.500000	39.41967
001e0610ba13	2018-12-15 01:40:00	41.751238	-87.712990	25.000000	39.41967
001e0610ba13	2018-12-15 01:50:00	41.751238	-87.712990	16.666667	39.41967
001e0610ba13	2018-12-15 02:00:00	41.751238	-87.712990	41.666667	39.41967
001e0610ba13	2018-12-15 02:10:00	41.751238	-87.712990	13.043478	39.41967
001e0610ba13	2018-12-15 02:20:00	41.751238	-87.712990	16.666667	39.41967
001e0610ba13	2018-12-15 02:30:00	41.751238	-87.712990	37.500000	39.41967
001e0610ba13	2018-12-15 02:40:00	41.751238	-87.712990	16.666667	39.41967
001e0610ba13	2018-12-15 02:50:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 03:00:00	41.751238	-87.712990	45.833333	39.41967
001e0610ba13	2018-12-15 03:10:00	41.751238	-87.712990	12.500000	39.41967
001e0610ba13	2018-12-15 03:20:00	41.751238	-87.712990	37.500000	39.41967
001e0610ba13	2018-12-15 03:30:00	41.751238	-87.712990	45.833333	39.41967
001e0610ba13	2018-12-15 03:40:00	41.751238	-87.712990	54.166667	39.41967
001e0610ba13	2018-12-15 03:50:00	41.751238	-87.712990	29.166667	39.41967
001e0610ba13	2018-12-15 04:00:00	41.751238	-87.712990	37.500000	39.41967
001e0610ba13	2018-12-15 04:10:00	41.751238	-87.712990	62.500000	39.41967
001e0610ba13	2018-12-15 04:20:00	41.751238	-87.712990	79.166667	39.41967
001e0610ba13	2018-12-15 04:30:00	41.751238	-87.712990	60.869565	39.41967
001e0610ba13	2018-12-15 04:40:00	41.751238	-87.712990	62.500000	39.41967
001e0610ba13	2018-12-15 04:50:00	41.751238	-87.712990	70.833333	39.41967
001e0610ba13	2018-12-15 05:00:00	41.751238	-87.712990	54.166667	39.41967
001e0610ba13	2018-12-15 05:10:00	41.751238	-87.712990	100.000000	39.41967
001e0610ba13	2018-12-15 05:20:00	41.751238	-87.712990	79.166667	39.41967
001e0610ba13	2018-12-15 05:30:00	41.751238	-87.712990	83.333333	39.41967
001e0610ba13	2018-12-15 05:40:00	41.751238	-87.712990	91.666667	39.41967
001e0610ba13	2018-12-15 05:50:00	41.751238	-87.712990	91.666667	39.41967
001e0610ba13	2018-12-15 06:00:00	41.751238	-87.712990	100.000000	39.41967
001e0610ba13	2018-12-15 06:10:00	41.751238	-87.712990	100.000000	39.41967
001e0610ba13	2018-12-15 06:20:00	41.751238	-87.712990	100.000000	39.41967
001e0610ba13	2018-12-15 06:30:00	41.751238	-87.712990	100.000000	39.41967
001e0610ba13	2018-12-15 06:40:00	41.751238	-87.712990	100.000000	39.41967
001e0610ba13	2018-12-15 06:50:00	41.751238	-87.712990	100.000000	39.41967
001e0610ba13	2018-12-15 07:00:00	41.751238	-87.712990	100.000000	39.41967
001e0610ba13	2018-12-15 07:10:00	41.751238	-87.712990	95.833333	39.41967
001e0610ba13	2018-12-15 07:20:00	41.751238	-87.712990	100.000000	39.41967
001e0610ba13	2018-12-15 07:30:00	41.751238	-87.712990	83.333333	39.41967
001e0610ba13	2018-12-15 07:40:00	41.751238	-87.712990	95.833333	39.41967
001e0610ba13	2018-12-15 07:50:00	41.751238	-87.712990	83.333333	39.41967
001e0610ba13	2018-12-15 08:00:00	41.751238	-87.712990	65.217391	39.41967
001e0610ba13	2018-12-15 08:10:00	41.751238	-87.712990	58.333333	39.41967
001e0610ba13	2018-12-15 08:20:00	41.751238	-87.712990	50.000000	39.41967
001e0610ba13	2018-12-15 08:30:00	41.751238	-87.712990	50.000000	39.41967
001e0610ba13	2018-12-15 08:40:00	41.751238	-87.712990	54.166667	39.41967
001e0610ba13	2018-12-15 08:50:00	41.751238	-87.712990	70.833333	39.41967
001e0610ba13	2018-12-15 09:00:00	41.751238	-87.712990	62.500000	39.41967
001e0610ba13	2018-12-15 09:10:00	41.751238	-87.712990	54.166667	39.41967
001e0610ba13	2018-12-15 09:20:00	41.751238	-87.712990	66.666667	39.41967
001e0610ba13	2018-12-15 09:30:00	41.751238	-87.712990	75.000000	39.41967
001e0610ba13	2018-12-15 09:40:00	41.751238	-87.712990	58.333333	39.41967
001e0610ba13	2018-12-15 09:50:00	41.751238	-87.712990	60.869565	39.41967
001e0610ba13	2018-12-15 10:00:00	41.751238	-87.712990	41.666667	39.41967
001e0610ba13	2018-12-15 10:10:00	41.751238	-87.712990	33.333333	39.41967
001e0610ba13	2018-12-15 10:20:00	41.751238	-87.712990	37.500000	39.41967
001e0610ba13	2018-12-15 10:30:00	41.751238	-87.712990	50.000000	39.41967
001e0610ba13	2018-12-15 10:40:00	41.751238	-87.712990	62.500000	39.41967
001e0610ba13	2018-12-15 10:50:00	41.751238	-87.712990	58.333333	39.41967
001e0610ba13	2018-12-15 11:00:00	41.751238	-87.712990	66.666667	39.41967
001e0610ba13	2018-12-15 11:10:00	41.751238	-87.712990	95.833333	39.41967
001e0610ba13	2018-12-15 11:20:00	41.751238	-87.712990	58.333333	39.41967
001e0610ba13	2018-12-15 11:30:00	41.751238	-87.712990	33.333333	39.41967
001e0610ba13	2018-12-15 11:40:00	41.751238	-87.712990	45.454546	39.41967
001e0610ba13	2018-12-15 11:50:00	41.751238	-87.712990	29.166667	39.41967
001e0610ba13	2018-12-15 12:00:00	41.751238	-87.712990	25.000000	39.41967
001e0610ba13	2018-12-15 12:10:00	41.751238	-87.712990	29.166667	39.41967
001e0610ba13	2018-12-15 12:20:00	41.751238	-87.712990	20.833333	39.41967
001e0610ba13	2018-12-15 12:30:00	41.751238	-87.712990	29.166667	39.41967
001e0610ba13	2018-12-15 12:40:00	41.751238	-87.712990	41.666667	39.41967
001e0610ba13	2018-12-15 12:50:00	41.751238	-87.712990	50.000000	39.41967
001e0610ba13	2018-12-15 13:00:00	41.751238	-87.712990	50.000000	39.41967
001e0610ba13	2018-12-15 13:10:00	41.751238	-87.712990	54.166667	39.41967
001e0610ba13	2018-12-15 13:20:00	41.751238	-87.712990	25.000000	39.41967
001e0610ba13	2018-12-15 13:30:00	41.751238	-87.712990	86.956522	39.41967
001e0610ba13	2018-12-15 13:40:00	41.751238	-87.712990	70.833333	39.41967
001e0610ba13	2018-12-15 13:50:00	41.751238	-87.712990	62.500000	39.41967
001e0610ba13	2018-12-15 14:00:00	41.751238	-87.712990	54.166667	39.41967
001e0610ba13	2018-12-15 14:10:00	41.751238	-87.712990	50.000000	39.41967
001e0610ba13	2018-12-15 14:20:00	41.751238	-87.712990	66.666667	39.41967
001e0610ba13	2018-12-15 14:30:00	41.751238	-87.712990	58.333333	39.41967
001e0610ba13	2018-12-15 14:40:00	41.751238	-87.712990	29.166667	39.41967
001e0610ba13	2018-12-15 14:50:00	41.751238	-87.712990	20.833333	39.41967
001e0610ba13	2018-12-15 15:00:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 15:10:00	41.751238	-87.712990	33.333333	39.41967
001e0610ba13	2018-12-15 15:20:00	41.751238	-87.712990	20.833333	39.41967
001e0610ba13	2018-12-15 15:30:00	41.751238	-87.712990	25.000000	39.41967
001e0610ba13	2018-12-15 15:40:00	41.751238	-87.712990	4.166667	39.41967
001e0610ba13	2018-12-15 15:50:00	41.751238	-87.712990	17.391304	39.41967
001e0610ba13	2018-12-15 16:00:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 16:10:00	41.751238	-87.712990	29.166667	39.41967
001e0610ba13	2018-12-15 16:20:00	41.751238	-87.712990	12.500000	39.41967
001e0610ba13	2018-12-15 16:30:00	41.751238	-87.712990	8.695652	39.41967
001e0610ba13	2018-12-15 16:40:00	41.751238	-87.712990	4.166667	39.41967
001e0610ba13	2018-12-15 16:50:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 17:00:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 17:10:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 17:20:00	41.751238	-87.712990	4.166667	39.41967
001e0610ba13	2018-12-15 17:30:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 17:40:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 17:50:00	41.751238	-87.712990	4.166667	39.41967
001e0610ba13	2018-12-15 18:00:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 18:10:00	41.751238	-87.712990	12.500000	39.41967
001e0610ba13	2018-12-15 18:20:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 18:30:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 18:40:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 18:50:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 19:00:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 19:10:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 19:20:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 19:30:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 19:40:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 19:50:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 20:00:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 20:10:00	41.751238	-87.712990	4.166667	39.41967
001e0610ba13	2018-12-15 20:20:00	41.751238	-87.712990	4.166667	39.41967
001e0610ba13	2018-12-15 20:30:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 20:40:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 20:50:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 21:00:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 21:10:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 21:20:00	41.751238	-87.712990	4.166667	39.41967
001e0610ba13	2018-12-15 21:30:00	41.751238	-87.712990	0.000000	39.41967
001e0610ba13	2018-12-15 21:40:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 21:50:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 22:00:00	41.751238	-87.712990	4.166667	39.41967
001e0610ba13	2018-12-15 22:10:00	41.751238	-87.712990	8.333333	39.41967
001e0610ba13	2018-12-15 22:20:00	41.751238	-87.712990	4.166667	39.41967
001e0610ba13	2018-12-15 22:30:00	41.751238	-87.712990	37.500000	39.41967
001e0610ba13	2018-12-15 22:40:00	41.751238	-87.712990	33.333333	39.41967
001e0610ba13	2018-12-15 22:50:00	41.751238	-87.712990	30.434783	39.41967
001e0610ba13	2018-12-15 23:00:00	41.751238	-87.712990	50.000000	39.41967
001e0610ba13	2018-12-15 23:10:00	41.751238	-87.712990	58.333333	39.41967
001e0610ba13	2018-12-15 23:20:00	41.751238	-87.712990	25.000000	39.41967
001e0610ba13	2018-12-15 23:30:00	41.751238	-87.712990	12.500000	39.41967
001e0610ba13	2018-12-15 23:40:00	41.751238	-87.712990	33.333333	39.41967
001e0610ba13	2018-12-15 23:50:00	41.751238	-87.712990	78.260870	39.41967

The plot below presents how X varies around each node’s Node Sensor Value Reliability.

  ggplot()+
  geom_line(data=dfCO1,aes(x=by10, y=NodeMeanX, group=1), col='black', size=1, linetype='dashed')+
  geom_line(data=dfCO1,aes(x=by10, y=X, group=1), col='indianred', size=1, alpha=0.5)+
      scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  scale_color_manual('',
                    values=c("Proportion of Reliable Data"='indianred'))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Proportion Reliable', x='Time', 
       title='Proportion of Reliable CO Concentration Data Collected on 2018-12-15 For Each Node - By Time',
       subtitle='Mean proportion for each node denoted by dashed line.')+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

dfCO1%>%
  select(-X, -by10)%>%
  unique()%>%
  as.data.frame()%>%
  arrange(desc(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeMeanX
001e0610eef2	41.965256	-87.66672	75.455415
001e06114500	41.714494	-87.643099	63.883350
001e06113ace	41.83107	-87.617298	51.487017
001e0610f05c	41.924903	-87.687703	43.870773
001e0610bc10	41.736314	-87.624179	43.631743
001e06114fd4	41.794477	-87.615957	43.574673
001e0610ba13	41.751238	-87.712990	39.419672
001e0610e537	41.961622	-87.665948	39.168176
001e0610ee43	41.788608	-87.598713	37.835900
001e061146bc	41.918733	-87.668257	34.312669
001e06113107	41.751142	-87.71299	33.855425
001e061130f4	41.896157	-87.662391	31.257091
001e06113cf1	41.884688	-87.627864	28.629479
001e06114503	41.666078	-87.539374	11.781424
001e0610f6db	41.791329	-87.598677	9.636675
001e061144c0	41.764122	-87.72242	8.485044
001e0610ba15	41.722457	-87.57535	3.145815
001e0610ba46	41.878377	-87.627678	2.435588
001e0610ef27	41.846579	-87.685557	0.000000
001e0610ee33	41.965089	-87.679076	0.000000
001e0610e532	41.857959	-87.656427	0.000000

The spatial distribution of these varying node-level sensor value reliability levels is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Blues", dfCO1$NodeMeanX)

dfCO1$lat<-as.numeric(dfCO1$lat)
dfCO1$lon<-as.numeric(dfCO1$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfCO1,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeMeanX),fillOpacity = 0.5, 
                   popup = paste("Node:", dfCO1$node_id, "<br>",
                                 "Mean Proportion of Reliable Data Collected:", round(dfCO1$NodeMeanX), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfCO1$NodeMeanX, opacity = 0.7, title = NULL,position = "topright")

Constructing Score 1

The density plot below shows the distribution of node sensor value reliability relative to its average. This average is taken as Score 1, which represents the overall network sensor value reliability of the AoT network for CO concentration data on 2018-12-15. In essence, the network sensor value reliability is the average proportion of reliable data collected by each node at each time interval of the day.

dfCO1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 1`= mean(NodeMeanX))%>%
  ggplot()+
  geom_density(aes(NodeMeanX), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 1`), size = 2)+
  geom_text(aes(x= `Score 1`, y=0), label='Score 1:\n Average Proportion of\nReliable Data Collected\n by Each Node', size = 4, vjust= -2, hjust=-0.1)+
  labs(x= 'Node Sensor Value Reliability',
       y= 'Density',
       title = 'Distribution of Node Sensor Value Reliability Scores')+
  xlim(0, 100)+
  plotTheme()

dfCO1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  summarise(Score = mean(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = "striped")

Score
28.66028
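As a quick cross-check, Score 1 can be recomputed directly from the node table printed earlier. The sketch below simply re-types the 21 NodeMeanX values from that table (the vector name node_mean_x is hypothetical) and takes their mean, which should agree with the score reported above up to rounding.

```r
# NodeMeanX values copied from the CO node table above (one per node).
node_mean_x <- c(75.455415, 63.883350, 51.487017, 43.870773, 43.631743,
                 43.574673, 39.419672, 39.168176, 37.835900, 34.312669,
                 33.855425, 31.257091, 28.629479, 11.781424, 9.636675,
                 8.485044, 3.145815, 2.435588, 0, 0, 0)

# Score 1 is the unweighted mean of the node-level reliabilities.
score_1 <- mean(node_mean_x)
round(score_1, 5)
```

Because the mean is unweighted, the three nodes with 0% reliable data pull the score down as much as any other node would.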

Constructing Score 2

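# Per node: cap the standard deviation of X at the node's mean, then turn the
# SD-to-mean ratio into a 0-100 consistency score (0 when the node's mean is 0).
dfCO1%>%
  select(node_id, lat, lon, by10, X)%>%
  group_by(node_id, lat, lon)%>%
  mutate(NodeSD=ifelse(abs(sd(X))>mean(X), 
                       mean(X),
                       abs(sd(X))))%>%
  mutate(NodeSDScore= ifelse(mean(X)==0, 
                             0, 
                             ifelse(mean(X)<50, 
                                    abs(100-abs(100-100*(NodeSD/mean(X)))),
                                    abs(100-100*(NodeSD/mean(X))))))%>%
  select(node_id, lat, lon, NodeSDScore)%>%
  unique()%>%
  as.data.frame()%>%
  # Score 2: mean of the node-level consistency scores across the network
  mutate(`Score 2`= mean(NodeSDScore))->dfCO2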
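To make the per-node calculation above easier to follow, here is a minimal sketch that restates it as a standalone base R function. The function name node_sd_score and the toy input vectors are hypothetical and not part of the original code. The arithmetic is intended to mirror the nested ifelse() calls for a single node's vector of X values.

```r
# Hypothetical helper: consistency score for one node, where x holds the
# node's X values (percent of reliable readings per 10-minute interval).
node_sd_score <- function(x) {
  m <- mean(x)
  if (m == 0) return(0)     # node never produced a reliable reading
  s <- min(sd(x), m)        # standard deviation capped at the mean
  ratio <- s / m            # lies between 0 and 1 because of the cap
  if (m < 50) {
    # Same expression as the original; reduces to 100 * ratio since ratio <= 1
    abs(100 - abs(100 - 100 * ratio))
  } else {
    # Reduces to 100 * (1 - ratio)
    abs(100 - 100 * ratio)
  }
}

# Toy inputs covering the main branches
always_reliable <- node_sd_score(rep(100, 144))         # mean 100, no spread
never_reliable  <- node_sd_score(rep(0, 144))           # mean 0
alternating     <- node_sd_score(rep(c(0, 100), 72))    # mean 50, large spread
one_spike       <- node_sd_score(c(rep(0, 143), 100))   # low mean, SD above mean
```

Under this reading, a node with a mean of at least 50 scores higher the less its X values fluctuate, while for a node with a mean below 50 the score equals the capped SD-to-mean ratio expressed as a percentage.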

dfCO2%>%
  arrange(desc(NodeSDScore))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeSDScore	Score 2
001e061144c0	41.76412	-87.72242	100.00000	63.56703
001e0610ba46	41.87838	-87.62768	100.00000	63.56703
001e0610ba15	41.72246	-87.57535	100.00000	63.56703
001e0610f6db	41.79133	-87.59868	100.00000	63.56703
001e06114503	41.66608	-87.53937	96.64364	63.56703
001e06113107	41.75114	-87.71299	79.69208	63.56703
001e0610ba13	41.75124	-87.71299	79.05702	63.56703
001e06114500	41.71449	-87.64310	72.35331	63.56703
001e0610e537	41.96162	-87.66595	68.39149	63.56703
001e061130f4	41.89616	-87.66239	64.92731	63.56703
001e0610bc10	41.73631	-87.62418	64.42289	63.56703
001e0610f05c	41.92490	-87.68770	62.70975	63.56703
001e06114fd4	41.79448	-87.61596	62.23816	63.56703
001e061146bc	41.91873	-87.66826	62.19979	63.56703
001e0610eef2	41.96526	-87.66672	61.35660	63.56703
001e06113cf1	41.88469	-87.62786	61.11279	63.56703
001e0610ee43	41.78861	-87.59871	55.78541	63.56703
001e06113ace	41.83107	-87.61730	44.01734	63.56703
001e0610ef27	41.84658	-87.68556	0.00000	63.56703
001e0610ee33	41.96509	-87.67908	0.00000	63.56703
001e0610e532	41.85796	-87.65643	0.00000	63.56703

The spatial distribution of this consistency in sensor value reliability by node is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Purples", dfCO2$NodeSDScore)

dfCO2$lat<-as.numeric(dfCO2$lat)
dfCO2$lon<-as.numeric(dfCO2$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfCO2,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeSDScore),fillOpacity = 0.5, 
                   popup = paste("Node:", dfCO2$node_id, "<br>",
                                 "Consistency Score for Level of Sensor Value Reliability:", round(dfCO2$NodeSDScore), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfCO2$NodeSDScore, opacity = 0.7, title = NULL,position = "topright")

dfCO2%>%
  ggplot()+
  geom_density(aes(NodeSDScore), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 2`), size = 2)+
  geom_text(aes(x= `Score 2`, y=0), label='Score 2:\nOverall Consistency\nin Sensor Value Reliability', size = 4, vjust= -1.5, hjust=-1)+
  labs(x='Node Sensor Value Reliability Consistency Score', 
       y='Density',
       title='Distribution of Node Sensor Value Reliability Consistency Scores')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT CO Concentration Network on 2018-12-15,

Score 1 = 28.7 : On average, only 28.7% of the CO concentration data measured by the nodes in the network in each 10-minute interval is reliable. This is a low score.
Score 2 = 63.6 : The density distribution shows that node sensor value reliability is consistently good more often than it is consistently bad, hence the above-moderate score here.

H2S Concentration

We begin by observing how reliable and unreliable data measurements are distributed across the day for each node in the network on 2018-12-15. The figure below shows how the number of reliable and unreliable measurements varies over the day for each node collecting H2S concentration data. Most nodes collect around 20 measurements in every 10-minute time interval.

dfH2S%>%
    ggplot()+
  geom_bar(aes(x=by10, fill=as.factor(val_qual)))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Number collected', x='Time', 
       title='Number of Reliable and Unreliable H2S Concentration Data Collected on 2018-12-15 For Each Node\n- By Time')+
  scale_fill_manual(values=c('indianred1', 'cornflowerblue'), 
                    labels=c('Unreliable', 
                            'Reliable'), 
                    name="")+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))
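The six axis breaks in the plot code above are typed out by hand. A minimal sketch of generating them instead is given below, assuming that by10 holds timestamps formatted as "YYYY-MM-DD HH:MM:SS" strings, as the printed tables suggest. The names bins, breaks_4h and labels_4h are hypothetical.

```r
# Assumption: by10 labels look like "2018-12-15 00:00:00".
start <- as.POSIXct("2018-12-15 00:00:00", tz = "UTC")

# All 144 ten-minute bins of the day, formatted as strings
bins <- format(seq(start, by = "10 min", length.out = 144),
               "%Y-%m-%d %H:%M:%S")

# Every 24th bin is a 4-hour mark; keep only HH:MM for the axis labels
breaks_4h <- bins[seq(1, length(bins), by = 24)]
labels_4h <- substr(breaks_4h, 12, 16)
```

These vectors could then be passed as scale_x_discrete(breaks = breaks_4h, labels = labels_4h), so the same axis definition can be reused for other dates by changing start.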

Calculate Proportion of Reliable Data Measurements

As the flowchart shows, we need to calculate X, which represents the proportion of reliable data collected by each node at each 10-minute time interval.

From X, we can then calculate the Node Sensor Value Reliability, which represents the average proportion of reliable data collected by each node during the day.

dfH2S%>%
  group_by(node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(node_id, by10, lat, lon, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  mutate(NodeMeanX = sum(X)/144)->dfH2S1
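The two steps above can be illustrated on a toy data set. The sketch below uses base R only so that it runs without extra packages, and it assumes that val_qual is coded 1 for a reliable reading and 0 for an unreliable one. The data frame toy, its values, and the use of two intervals instead of 144 are all hypothetical.

```r
# Toy records: node "a" reports in both intervals, node "b" only in the first.
toy <- data.frame(
  node_id  = c("a", "a", "a", "a", "b", "b"),
  by10     = c("00:00", "00:00", "00:10", "00:10", "00:00", "00:00"),
  val_qual = c(1, 0, 1, 1, 1, 1)
)

# X: percentage of reliable readings per node and interval
x_tab <- aggregate(val_qual ~ node_id + by10, data = toy,
                   FUN = function(v) 100 * sum(v) / length(v))
names(x_tab)[3] <- "X"

# NodeMeanX: sum of X divided by the full number of intervals in the period
# (2 in this toy example, 144 in the code above)
n_intervals <- 2
node_mean <- tapply(x_tab$X, x_tab$node_id, sum) / n_intervals

# For comparison: a plain mean over only the intervals that have data
plain_mean <- tapply(x_tab$X, x_tab$node_id, mean)
```

Dividing by the fixed interval count means that an interval with no readings at all contributes 0 to the node's average. In the toy data, node "b" has NodeMeanX of 50 even though every reading it did report was reliable.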

dfH2S1%>%
  arrange(node_id)%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	by10	lat	lon	X	NodeMeanX
001e0610ba13	2018-12-15 00:00:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 00:10:00	41.751238	-87.712990	26.086956	36.72332
001e0610ba13	2018-12-15 00:20:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 00:30:00	41.751238	-87.712990	54.166667	36.72332
001e0610ba13	2018-12-15 00:40:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 00:50:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 01:00:00	41.751238	-87.712990	54.166667	36.72332
001e0610ba13	2018-12-15 01:10:00	41.751238	-87.712990	12.500000	36.72332
001e0610ba13	2018-12-15 01:20:00	41.751238	-87.712990	16.666667	36.72332
001e0610ba13	2018-12-15 01:30:00	41.751238	-87.712990	41.666667	36.72332
001e0610ba13	2018-12-15 01:40:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 01:50:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 02:00:00	41.751238	-87.712990	29.166667	36.72332
001e0610ba13	2018-12-15 02:10:00	41.751238	-87.712990	17.391304	36.72332
001e0610ba13	2018-12-15 02:20:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 02:30:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 02:40:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 02:50:00	41.751238	-87.712990	16.666667	36.72332
001e0610ba13	2018-12-15 03:00:00	41.751238	-87.712990	4.166667	36.72332
001e0610ba13	2018-12-15 03:10:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 03:20:00	41.751238	-87.712990	41.666667	36.72332
001e0610ba13	2018-12-15 03:30:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 03:40:00	41.751238	-87.712990	58.333333	36.72332
001e0610ba13	2018-12-15 03:50:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 04:00:00	41.751238	-87.712990	70.833333	36.72332
001e0610ba13	2018-12-15 04:10:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 04:20:00	41.751238	-87.712990	12.500000	36.72332
001e0610ba13	2018-12-15 04:30:00	41.751238	-87.712990	30.434783	36.72332
001e0610ba13	2018-12-15 04:40:00	41.751238	-87.712990	70.833333	36.72332
001e0610ba13	2018-12-15 04:50:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 05:00:00	41.751238	-87.712990	54.166667	36.72332
001e0610ba13	2018-12-15 05:10:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 05:20:00	41.751238	-87.712990	8.333333	36.72332
001e0610ba13	2018-12-15 05:30:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 05:40:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 05:50:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 06:00:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 06:10:00	41.751238	-87.712990	54.545454	36.72332
001e0610ba13	2018-12-15 06:20:00	41.751238	-87.712990	70.833333	36.72332
001e0610ba13	2018-12-15 06:30:00	41.751238	-87.712990	54.166667	36.72332
001e0610ba13	2018-12-15 06:40:00	41.751238	-87.712990	62.500000	36.72332
001e0610ba13	2018-12-15 06:50:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 07:00:00	41.751238	-87.712990	58.333333	36.72332
001e0610ba13	2018-12-15 07:10:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 07:20:00	41.751238	-87.712990	66.666667	36.72332
001e0610ba13	2018-12-15 07:30:00	41.751238	-87.712990	29.166667	36.72332
001e0610ba13	2018-12-15 07:40:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 07:50:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 08:00:00	41.751238	-87.712990	26.086956	36.72332
001e0610ba13	2018-12-15 08:10:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 08:20:00	41.751238	-87.712990	29.166667	36.72332
001e0610ba13	2018-12-15 08:30:00	41.751238	-87.712990	16.666667	36.72332
001e0610ba13	2018-12-15 08:40:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 08:50:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 09:00:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 09:10:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 09:20:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 09:30:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 09:40:00	41.751238	-87.712990	58.333333	36.72332
001e0610ba13	2018-12-15 09:50:00	41.751238	-87.712990	30.434783	36.72332
001e0610ba13	2018-12-15 10:00:00	41.751238	-87.712990	41.666667	36.72332
001e0610ba13	2018-12-15 10:10:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 10:20:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 10:30:00	41.751238	-87.712990	12.500000	36.72332
001e0610ba13	2018-12-15 10:40:00	41.751238	-87.712990	41.666667	36.72332
001e0610ba13	2018-12-15 10:50:00	41.751238	-87.712990	54.166667	36.72332
001e0610ba13	2018-12-15 11:00:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 11:10:00	41.751238	-87.712990	16.666667	36.72332
001e0610ba13	2018-12-15 11:20:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 11:30:00	41.751238	-87.712990	41.666667	36.72332
001e0610ba13	2018-12-15 11:40:00	41.751238	-87.712990	27.272727	36.72332
001e0610ba13	2018-12-15 11:50:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 12:00:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 12:10:00	41.751238	-87.712990	29.166667	36.72332
001e0610ba13	2018-12-15 12:20:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 12:30:00	41.751238	-87.712990	41.666667	36.72332
001e0610ba13	2018-12-15 12:40:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 12:50:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 13:00:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 13:10:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 13:20:00	41.751238	-87.712990	12.500000	36.72332
001e0610ba13	2018-12-15 13:30:00	41.751238	-87.712990	21.739130	36.72332
001e0610ba13	2018-12-15 13:40:00	41.751238	-87.712990	62.500000	36.72332
001e0610ba13	2018-12-15 13:50:00	41.751238	-87.712990	8.333333	36.72332
001e0610ba13	2018-12-15 14:00:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 14:10:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 14:20:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 14:30:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 14:40:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 14:50:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 15:00:00	41.751238	-87.712990	16.666667	36.72332
001e0610ba13	2018-12-15 15:10:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 15:20:00	41.751238	-87.712990	54.166667	36.72332
001e0610ba13	2018-12-15 15:30:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 15:40:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 15:50:00	41.751238	-87.712990	26.086956	36.72332
001e0610ba13	2018-12-15 16:00:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 16:10:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 16:20:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 16:30:00	41.751238	-87.712990	26.086956	36.72332
001e0610ba13	2018-12-15 16:40:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 16:50:00	41.751238	-87.712990	12.500000	36.72332
001e0610ba13	2018-12-15 17:00:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 17:10:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 17:20:00	41.751238	-87.712990	29.166667	36.72332
001e0610ba13	2018-12-15 17:30:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 17:40:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 17:50:00	41.751238	-87.712990	16.666667	36.72332
001e0610ba13	2018-12-15 18:00:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 18:10:00	41.751238	-87.712990	16.666667	36.72332
001e0610ba13	2018-12-15 18:20:00	41.751238	-87.712990	54.166667	36.72332
001e0610ba13	2018-12-15 18:30:00	41.751238	-87.712990	58.333333	36.72332
001e0610ba13	2018-12-15 18:40:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 18:50:00	41.751238	-87.712990	13.043478	36.72332
001e0610ba13	2018-12-15 19:00:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 19:10:00	41.751238	-87.712990	29.166667	36.72332
001e0610ba13	2018-12-15 19:20:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 19:30:00	41.751238	-87.712990	66.666667	36.72332
001e0610ba13	2018-12-15 19:40:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 19:50:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 20:00:00	41.751238	-87.712990	62.500000	36.72332
001e0610ba13	2018-12-15 20:10:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 20:20:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 20:30:00	41.751238	-87.712990	45.833333	36.72332
001e0610ba13	2018-12-15 20:40:00	41.751238	-87.712990	16.666667	36.72332
001e0610ba13	2018-12-15 20:50:00	41.751238	-87.712990	54.166667	36.72332
001e0610ba13	2018-12-15 21:00:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 21:10:00	41.751238	-87.712990	41.666667	36.72332
001e0610ba13	2018-12-15 21:20:00	41.751238	-87.712990	62.500000	36.72332
001e0610ba13	2018-12-15 21:30:00	41.751238	-87.712990	60.869565	36.72332
001e0610ba13	2018-12-15 21:40:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 21:50:00	41.751238	-87.712990	58.333333	36.72332
001e0610ba13	2018-12-15 22:00:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 22:10:00	41.751238	-87.712990	58.333333	36.72332
001e0610ba13	2018-12-15 22:20:00	41.751238	-87.712990	37.500000	36.72332
001e0610ba13	2018-12-15 22:30:00	41.751238	-87.712990	70.833333	36.72332
001e0610ba13	2018-12-15 22:40:00	41.751238	-87.712990	33.333333	36.72332
001e0610ba13	2018-12-15 22:50:00	41.751238	-87.712990	30.434783	36.72332
001e0610ba13	2018-12-15 23:00:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 23:10:00	41.751238	-87.712990	29.166667	36.72332
001e0610ba13	2018-12-15 23:20:00	41.751238	-87.712990	20.833333	36.72332
001e0610ba13	2018-12-15 23:30:00	41.751238	-87.712990	50.000000	36.72332
001e0610ba13	2018-12-15 23:40:00	41.751238	-87.712990	25.000000	36.72332
001e0610ba13	2018-12-15 23:50:00	41.751238	-87.712990	43.478261	36.72332

The plot below presents how X varies around each node’s Node Sensor Value Reliability.

  ggplot()+
  geom_line(data=dfH2S1,aes(x=by10, y=NodeMeanX, group=1), col='black', size=1, linetype='dashed')+
  geom_line(data=dfH2S1,aes(x=by10, y=X, group=1), col='indianred', size=1, alpha=0.5)+
      scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  scale_color_manual('',
                    values=c("Proportion of Reliable Data"='indianred'))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Proportion Reliable', x='Time', 
       title='Proportion of Reliable H2S Concentration Data Collected on 2018-12-15 For Each Node - By Time',
       subtitle='Mean proportion for each node denoted by dashed line.')+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

dfH2S1%>%
  select(-X, -by10)%>%
  unique()%>%
  as.data.frame()%>%
  arrange(desc(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeMeanX
001e061130f4	41.896157	-87.662391	100.00000
001e0610bc10	41.736314	-87.624179	100.00000
001e06113ace	41.83107	-87.617298	100.00000
001e06114500	41.714494	-87.643099	94.44444
001e06114503	41.666078	-87.539374	93.66823
001e0610eef2	41.965256	-87.66672	92.06799
001e06114fd4	41.794477	-87.615957	85.80014
001e0610f05c	41.924903	-87.687703	62.21945
001e0610f6db	41.791329	-87.598677	57.51434
001e0610ba46	41.878377	-87.627678	53.15382
001e0610e537	41.961622	-87.665948	47.71538
001e06113cf1	41.884688	-87.627864	43.84561
001e061146bc	41.918733	-87.668257	41.73094
001e06113107	41.751142	-87.71299	41.00556
001e0610ba13	41.751238	-87.712990	36.72332
001e0610ee43	41.788608	-87.598713	27.72871
001e061144c0	41.764122	-87.72242	18.28327
001e0610ba15	41.722457	-87.57535	11.83541
001e0610ef27	41.846579	-87.685557	0.00000
001e0610ee33	41.965089	-87.679076	0.00000
001e0610e532	41.857959	-87.656427	0.00000

The spatial distribution of these varying node-level sensor value reliability levels is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Blues", dfH2S1$NodeMeanX)

dfH2S1$lat<-as.numeric(dfH2S1$lat)
dfH2S1$lon<-as.numeric(dfH2S1$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfH2S1,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeMeanX),fillOpacity = 0.5, 
                   popup = paste("Node:", dfH2S1$node_id, "<br>",
                                 "Mean Proportion of Reliable Data Collected:", round(dfH2S1$NodeMeanX), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfH2S1$NodeMeanX, opacity = 0.7, title = NULL,position = "topright")

Constructing Score 1

The density plot below shows the distribution of node sensor value reliability relative to its average. This average is taken as Score 1, which represents the overall network sensor value reliability of the AoT network for H2S concentration data on 2018-12-15. In essence, the network sensor value reliability is the average proportion of reliable data collected by each node at each time interval of the day.

dfH2S1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 1`= mean(NodeMeanX))%>%
  ggplot()+
  geom_density(aes(NodeMeanX), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 1`), size = 2)+
  geom_text(aes(x= `Score 1`, y=0), label='Score 1:\n Average Proportion of\nReliable Data Collected\n by Each Node', size = 4, vjust= -2, hjust=-0.1)+
  labs(x= 'Node Sensor Value Reliability',
       y= 'Density',
       title = 'Distribution of Node Sensor Value Reliability Scores')+
  xlim(0, 100)+
  plotTheme()

dfH2S1%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  summarise(Score = mean(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = "striped")

Score
52.74936

Constructing Score 2

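# Per node: cap the standard deviation of X at the node's mean, then turn the
# SD-to-mean ratio into a 0-100 consistency score (0 when the node's mean is 0).
dfH2S1%>%
  select(node_id, lat, lon, by10, X)%>%
  group_by(node_id, lat, lon)%>%
  mutate(NodeSD=ifelse(abs(sd(X))>mean(X), 
                       mean(X),
                       abs(sd(X))))%>%
  mutate(NodeSDScore= ifelse(mean(X)==0, 
                             0, 
                             ifelse(mean(X)<50, 
                                    abs(100-abs(100-100*(NodeSD/mean(X)))),
                                    abs(100-100*(NodeSD/mean(X))))))%>%
  select(node_id, lat, lon, NodeSDScore)%>%
  unique()%>%
  as.data.frame()%>%
  # Score 2: mean of the node-level consistency scores across the network
  mutate(`Score 2`= mean(NodeSDScore))->dfH2S2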

dfH2S2%>%
  arrange(desc(NodeSDScore))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeSDScore	Score 2
001e061144c0	41.76412	-87.72242	100.00000	61.80346
001e061130f4	41.89616	-87.66239	100.00000	61.80346
001e0610bc10	41.73631	-87.62418	100.00000	61.80346
001e06113ace	41.83107	-87.61730	100.00000	61.80346
001e06114500	41.71449	-87.64310	100.00000	61.80346
001e0610ba15	41.72246	-87.57535	99.84139	61.80346
001e06114503	41.66608	-87.53937	85.96484	61.80346
001e0610eef2	41.96526	-87.66672	83.67821	61.80346
001e06114fd4	41.79448	-87.61596	79.35105	61.80346
001e0610f6db	41.79133	-87.59868	74.27648	61.80346
001e0610ba46	41.87838	-87.62768	71.07506	61.80346
001e0610f05c	41.92490	-87.68770	68.82749	61.80346
001e0610ee43	41.78861	-87.59871	50.58038	61.80346
001e0610ba13	41.75124	-87.71299	41.95153	61.80346
001e06113107	41.75114	-87.71299	39.92343	61.80346
001e061146bc	41.91873	-87.66826	36.70309	61.80346
001e0610e537	41.96162	-87.66595	33.36216	61.80346
001e06113cf1	41.88469	-87.62786	32.33757	61.80346
001e0610ef27	41.84658	-87.68556	0.00000	61.80346
001e0610ee33	41.96509	-87.67908	0.00000	61.80346
001e0610e532	41.85796	-87.65643	0.00000	61.80346
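
Read as written, the Score 2 code above appears to reduce to the rule below. The symbols $\bar{X}_i$, $s_i$, $r_i$ and $N$ are notation introduced here for illustration and are not the authors' own: $\bar{X}_i$ and $s_i$ are the mean and sample standard deviation of X over the intervals in which node $i$ reported, and $N$ is the number of nodes.

```latex
r_i = \frac{\min(s_i,\ \bar{X}_i)}{\bar{X}_i}, \qquad
\text{NodeSDScore}_i =
\begin{cases}
0 & \text{if } \bar{X}_i = 0 \\
100\, r_i & \text{if } 0 < \bar{X}_i < 50 \\
100\,(1 - r_i) & \text{if } \bar{X}_i \ge 50
\end{cases}
\qquad
\text{Score 2} = \frac{1}{N} \sum_{i=1}^{N} \text{NodeSDScore}_i
```

Because the standard deviation is capped at the mean, $r_i$ lies between 0 and 1, which is why the nested absolute values in the code collapse to these two branches. Averaging the 21 node scores in the table above gives the reported Score 2 of 61.80.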

The spatial distribution of this consistency in sensor value reliability by node is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Purples", dfH2S2$NodeSDScore, n = 5)

dfH2S2$lat<-as.numeric(dfH2S2$lat)
dfH2S2$lon<-as.numeric(dfH2S2$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfH2S2,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeSDScore),fillOpacity = 0.5, 
                   popup = paste("Node:", dfH2S2$node_id, "<br>",
                                 "Consistency Score for Level of Sensor Value Reliability:", round(dfH2S2$NodeSDScore), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfH2S2$NodeSDScore, opacity = 0.7, title = NULL,position = "topright")

dfH2S2%>%
  ggplot()+
  geom_density(aes(NodeSDScore), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 2`), size = 2)+
  geom_text(aes(x= `Score 2`, y=0), label='Score 2:\n Overall Consistency\nin Sensor Value Reliability', size = 4, vjust= -2, hjust=1)+
  labs(x='Node Sensor Value Reliability Consistency Score', 
       y='Density',
       title='Distribution of Node Sensor Value Reliability Consistency Scores')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT H2S concentration network on 2018-12-15:

Score 1 = 52.7: On average, only 52.7% of the H2S concentration data measured by the nodes in the network in each 10-minute interval is reliable. This is a low score.
Score 2 = 61.8: The density distribution shows that node sensor value reliability is consistently good more often than it is consistently bad, hence the above-moderate score.

NO2 Concentration

We begin by observing how reliable and unreliable data measurements are distributed across the day for each node in the network on 2018-12-15. The figure below shows how the counts of reliable and unreliable measurements vary over the course of the day for each node collecting NO2 concentration data. Most nodes collect around 20 measurements in every 10-minute interval.

dfNO2%>%
    ggplot()+
  geom_bar(aes(x=by10, fill=as.factor(val_qual)))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Number collected', x='Time', 
       title='Number of Reliable and Unreliable NO2 Concentration Data Collected on 2018-12-15 For Each Node\n- By Time')+
  scale_fill_manual(values=c('indianred1', 'cornflowerblue'), 
                    labels=c('Unreliable', 
                            'Reliable'), 
                    name="")+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

Calculate Proportion of Reliable Data Measurements

As the flowchart shows, we need to calculate X, which represents the proportion of reliable data collected by each node at each 10-minute time interval.

From X, we can then calculate the Node Sensor Value Reliability, which represents the average proportion of reliable data collected by each node during the day.
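
As a compact restatement of these two definitions, consistent with the code that follows: the subscripts $i$ (node) and $t$ (10-minute interval) and the symbols $r_{i,t}$ (number of readings with `val_qual` equal to 1) and $n_{i,t}$ (total number of readings) are notation introduced here for illustration.

```latex
X_{i,t} = 100 \cdot \frac{r_{i,t}}{n_{i,t}}, \qquad
\text{NodeMeanX}_i = \frac{1}{144} \sum_{t} X_{i,t}
```

For instance, a node with 8 reliable readings out of 24 in one interval would have X = 33.3 for that interval. The divisor 144 is the number of 10-minute intervals in a day, so an interval in which a node reports nothing contributes zero to that node's average.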

dfNO2%>%
  group_by(node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(node_id, by10, lat, lon, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  mutate(NodeMeanX = sum(X)/144)->dfNO21

dfNO21%>%
  arrange(node_id)%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	by10	lat	lon	X	NodeMeanX
001e0610ba13	2018-12-15 00:00:00	41.751238	-87.712990	33.333333	40.26611
001e0610ba13	2018-12-15 00:10:00	41.751238	-87.712990	30.434783	40.26611
001e0610ba13	2018-12-15 00:20:00	41.751238	-87.712990	33.333333	40.26611
001e0610ba13	2018-12-15 00:30:00	41.751238	-87.712990	8.333333	40.26611
001e0610ba13	2018-12-15 00:40:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 00:50:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 01:00:00	41.751238	-87.712990	16.666667	40.26611
001e0610ba13	2018-12-15 01:10:00	41.751238	-87.712990	45.833333	40.26611
001e0610ba13	2018-12-15 01:20:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 01:30:00	41.751238	-87.712990	12.500000	40.26611
001e0610ba13	2018-12-15 01:40:00	41.751238	-87.712990	33.333333	40.26611
001e0610ba13	2018-12-15 01:50:00	41.751238	-87.712990	41.666667	40.26611
001e0610ba13	2018-12-15 02:00:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 02:10:00	41.751238	-87.712990	26.086956	40.26611
001e0610ba13	2018-12-15 02:20:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 02:30:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 02:40:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 02:50:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 03:00:00	41.751238	-87.712990	8.333333	40.26611
001e0610ba13	2018-12-15 03:10:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 03:20:00	41.751238	-87.712990	50.000000	40.26611
001e0610ba13	2018-12-15 03:30:00	41.751238	-87.712990	41.666667	40.26611
001e0610ba13	2018-12-15 03:40:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 03:50:00	41.751238	-87.712990	45.833333	40.26611
001e0610ba13	2018-12-15 04:00:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 04:10:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 04:20:00	41.751238	-87.712990	0.000000	40.26611
001e0610ba13	2018-12-15 04:30:00	41.751238	-87.712990	30.434783	40.26611
001e0610ba13	2018-12-15 04:40:00	41.751238	-87.712990	16.666667	40.26611
001e0610ba13	2018-12-15 04:50:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 05:00:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 05:10:00	41.751238	-87.712990	8.333333	40.26611
001e0610ba13	2018-12-15 05:20:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 05:30:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 05:40:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 05:50:00	41.751238	-87.712990	25.000000	40.26611
001e0610ba13	2018-12-15 06:00:00	41.751238	-87.712990	4.166667	40.26611
001e0610ba13	2018-12-15 06:10:00	41.751238	-87.712990	9.090909	40.26611
001e0610ba13	2018-12-15 06:20:00	41.751238	-87.712990	8.333333	40.26611
001e0610ba13	2018-12-15 06:30:00	41.751238	-87.712990	8.333333	40.26611
001e0610ba13	2018-12-15 06:40:00	41.751238	-87.712990	4.166667	40.26611
001e0610ba13	2018-12-15 06:50:00	41.751238	-87.712990	8.333333	40.26611
001e0610ba13	2018-12-15 07:00:00	41.751238	-87.712990	16.666667	40.26611
001e0610ba13	2018-12-15 07:10:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 07:20:00	41.751238	-87.712990	4.166667	40.26611
001e0610ba13	2018-12-15 07:30:00	41.751238	-87.712990	0.000000	40.26611
001e0610ba13	2018-12-15 07:40:00	41.751238	-87.712990	16.666667	40.26611
001e0610ba13	2018-12-15 07:50:00	41.751238	-87.712990	16.666667	40.26611
001e0610ba13	2018-12-15 08:00:00	41.751238	-87.712990	17.391304	40.26611
001e0610ba13	2018-12-15 08:10:00	41.751238	-87.712990	25.000000	40.26611
001e0610ba13	2018-12-15 08:20:00	41.751238	-87.712990	33.333333	40.26611
001e0610ba13	2018-12-15 08:30:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 08:40:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 08:50:00	41.751238	-87.712990	8.333333	40.26611
001e0610ba13	2018-12-15 09:00:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 09:10:00	41.751238	-87.712990	25.000000	40.26611
001e0610ba13	2018-12-15 09:20:00	41.751238	-87.712990	25.000000	40.26611
001e0610ba13	2018-12-15 09:30:00	41.751238	-87.712990	12.500000	40.26611
001e0610ba13	2018-12-15 09:40:00	41.751238	-87.712990	16.666667	40.26611
001e0610ba13	2018-12-15 09:50:00	41.751238	-87.712990	26.086956	40.26611
001e0610ba13	2018-12-15 10:00:00	41.751238	-87.712990	25.000000	40.26611
001e0610ba13	2018-12-15 10:10:00	41.751238	-87.712990	45.833333	40.26611
001e0610ba13	2018-12-15 10:20:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 10:30:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 10:40:00	41.751238	-87.712990	4.166667	40.26611
001e0610ba13	2018-12-15 10:50:00	41.751238	-87.712990	16.666667	40.26611
001e0610ba13	2018-12-15 11:00:00	41.751238	-87.712990	8.333333	40.26611
001e0610ba13	2018-12-15 11:10:00	41.751238	-87.712990	33.333333	40.26611
001e0610ba13	2018-12-15 11:20:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 11:30:00	41.751238	-87.712990	50.000000	40.26611
001e0610ba13	2018-12-15 11:40:00	41.751238	-87.712990	27.272727	40.26611
001e0610ba13	2018-12-15 11:50:00	41.751238	-87.712990	25.000000	40.26611
001e0610ba13	2018-12-15 12:00:00	41.751238	-87.712990	16.666667	40.26611
001e0610ba13	2018-12-15 12:10:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 12:20:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 12:30:00	41.751238	-87.712990	33.333333	40.26611
001e0610ba13	2018-12-15 12:40:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 12:50:00	41.751238	-87.712990	33.333333	40.26611
001e0610ba13	2018-12-15 13:00:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 13:10:00	41.751238	-87.712990	54.166667	40.26611
001e0610ba13	2018-12-15 13:20:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 13:30:00	41.751238	-87.712990	8.695652	40.26611
001e0610ba13	2018-12-15 13:40:00	41.751238	-87.712990	33.333333	40.26611
001e0610ba13	2018-12-15 13:50:00	41.751238	-87.712990	62.500000	40.26611
001e0610ba13	2018-12-15 14:00:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 14:10:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 14:20:00	41.751238	-87.712990	58.333333	40.26611
001e0610ba13	2018-12-15 14:30:00	41.751238	-87.712990	54.166667	40.26611
001e0610ba13	2018-12-15 14:40:00	41.751238	-87.712990	41.666667	40.26611
001e0610ba13	2018-12-15 14:50:00	41.751238	-87.712990	20.833333	40.26611
001e0610ba13	2018-12-15 15:00:00	41.751238	-87.712990	66.666667	40.26611
001e0610ba13	2018-12-15 15:10:00	41.751238	-87.712990	66.666667	40.26611
001e0610ba13	2018-12-15 15:20:00	41.751238	-87.712990	45.833333	40.26611
001e0610ba13	2018-12-15 15:30:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 15:40:00	41.751238	-87.712990	70.833333	40.26611
001e0610ba13	2018-12-15 15:50:00	41.751238	-87.712990	69.565217	40.26611
001e0610ba13	2018-12-15 16:00:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 16:10:00	41.751238	-87.712990	50.000000	40.26611
001e0610ba13	2018-12-15 16:20:00	41.751238	-87.712990	54.166667	40.26611
001e0610ba13	2018-12-15 16:30:00	41.751238	-87.712990	47.826087	40.26611
001e0610ba13	2018-12-15 16:40:00	41.751238	-87.712990	54.166667	40.26611
001e0610ba13	2018-12-15 16:50:00	41.751238	-87.712990	37.500000	40.26611
001e0610ba13	2018-12-15 17:00:00	41.751238	-87.712990	79.166667	40.26611
001e0610ba13	2018-12-15 17:10:00	41.751238	-87.712990	75.000000	40.26611
001e0610ba13	2018-12-15 17:20:00	41.751238	-87.712990	70.833333	40.26611
001e0610ba13	2018-12-15 17:30:00	41.751238	-87.712990	70.833333	40.26611
001e0610ba13	2018-12-15 17:40:00	41.751238	-87.712990	66.666667	40.26611
001e0610ba13	2018-12-15 17:50:00	41.751238	-87.712990	75.000000	40.26611
001e0610ba13	2018-12-15 18:00:00	41.751238	-87.712990	58.333333	40.26611
001e0610ba13	2018-12-15 18:10:00	41.751238	-87.712990	75.000000	40.26611
001e0610ba13	2018-12-15 18:20:00	41.751238	-87.712990	70.833333	40.26611
001e0610ba13	2018-12-15 18:30:00	41.751238	-87.712990	66.666667	40.26611
001e0610ba13	2018-12-15 18:40:00	41.751238	-87.712990	50.000000	40.26611
001e0610ba13	2018-12-15 18:50:00	41.751238	-87.712990	39.130435	40.26611
001e0610ba13	2018-12-15 19:00:00	41.751238	-87.712990	66.666667	40.26611
001e0610ba13	2018-12-15 19:10:00	41.751238	-87.712990	66.666667	40.26611
001e0610ba13	2018-12-15 19:20:00	41.751238	-87.712990	79.166667	40.26611
001e0610ba13	2018-12-15 19:30:00	41.751238	-87.712990	95.833333	40.26611
001e0610ba13	2018-12-15 19:40:00	41.751238	-87.712990	66.666667	40.26611
001e0610ba13	2018-12-15 19:50:00	41.751238	-87.712990	75.000000	40.26611
001e0610ba13	2018-12-15 20:00:00	41.751238	-87.712990	87.500000	40.26611
001e0610ba13	2018-12-15 20:10:00	41.751238	-87.712990	91.666667	40.26611
001e0610ba13	2018-12-15 20:20:00	41.751238	-87.712990	91.666667	40.26611
001e0610ba13	2018-12-15 20:30:00	41.751238	-87.712990	54.166667	40.26611
001e0610ba13	2018-12-15 20:40:00	41.751238	-87.712990	66.666667	40.26611
001e0610ba13	2018-12-15 20:50:00	41.751238	-87.712990	70.833333	40.26611
001e0610ba13	2018-12-15 21:00:00	41.751238	-87.712990	75.000000	40.26611
001e0610ba13	2018-12-15 21:10:00	41.751238	-87.712990	100.000000	40.26611
001e0610ba13	2018-12-15 21:20:00	41.751238	-87.712990	70.833333	40.26611
001e0610ba13	2018-12-15 21:30:00	41.751238	-87.712990	60.869565	40.26611
001e0610ba13	2018-12-15 21:40:00	41.751238	-87.712990	50.000000	40.26611
001e0610ba13	2018-12-15 21:50:00	41.751238	-87.712990	58.333333	40.26611
001e0610ba13	2018-12-15 22:00:00	41.751238	-87.712990	87.500000	40.26611
001e0610ba13	2018-12-15 22:10:00	41.751238	-87.712990	62.500000	40.26611
001e0610ba13	2018-12-15 22:20:00	41.751238	-87.712990	41.666667	40.26611
001e0610ba13	2018-12-15 22:30:00	41.751238	-87.712990	79.166667	40.26611
001e0610ba13	2018-12-15 22:40:00	41.751238	-87.712990	83.333333	40.26611
001e0610ba13	2018-12-15 22:50:00	41.751238	-87.712990	86.956522	40.26611
001e0610ba13	2018-12-15 23:00:00	41.751238	-87.712990	54.166667	40.26611
001e0610ba13	2018-12-15 23:10:00	41.751238	-87.712990	50.000000	40.26611
001e0610ba13	2018-12-15 23:20:00	41.751238	-87.712990	29.166667	40.26611
001e0610ba13	2018-12-15 23:30:00	41.751238	-87.712990	54.166667	40.26611
001e0610ba13	2018-12-15 23:40:00	41.751238	-87.712990	45.833333	40.26611
001e0610ba13	2018-12-15 23:50:00	41.751238	-87.712990	43.478261	40.26611
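
The effect of the fixed divisor can be seen in a small base-R sketch. The data frame `readings`, its two nodes and its two intervals are made up for illustration, and the example uses `aggregate()` and `tapply()` instead of the dplyr pipeline above so that it runs without extra packages.

```r
# Toy readings (hypothetical): node "a" reports in both intervals,
# node "b" reports only in the second one.
readings <- data.frame(
  node_id  = c("a", "a", "a", "a", "b", "b"),
  by10     = c("00:00", "00:00", "00:10", "00:10", "00:10", "00:10"),
  val_qual = c(1, 0, 1, 1, 0, 1),
  stringsAsFactors = FALSE
)

# X: percentage of reliable readings for each node and interval.
x_tab <- aggregate(val_qual ~ node_id + by10, data = readings,
                   FUN = function(v) 100 * sum(v) / length(v))
names(x_tab)[names(x_tab) == "val_qual"] <- "X"

# Fixed divisor: the number of intervals in the period
# (2 in this toy example, 144 for a full day).
n_intervals <- 2
node_mean_fixed <- tapply(x_tab$X, x_tab$node_id,
                          function(x) sum(x) / n_intervals)

# For comparison only: average over the intervals that were observed.
node_mean_observed <- tapply(x_tab$X, x_tab$node_id, mean)

print(node_mean_fixed)
print(node_mean_observed)
```

In this toy case node "b" scores 25 with the fixed divisor and 50 when averaged over observed intervals only, so the fixed divisor also penalises a node for the intervals in which it reported nothing.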

The plot below presents how X varies around each node’s Node Sensor Value Reliability.

  ggplot()+
  geom_line(data=dfNO21,aes(x=by10, y=NodeMeanX, group=1), col='black', size=1, linetype='dashed')+
  geom_line(data=dfNO21,aes(x=by10, y=X, group=1), col='indianred', size=1, alpha=0.5)+
       scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  scale_color_manual('',
                    values=c("Proportion of Reliable Data"='indianred'))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Proportion Reliable', x='Time', 
       title='Proportion of Reliable NO2 Concentration Data Collected on 2018-12-15 For Each Node - By Time',
       subtitle='Mean proportion for each node denoted by dashed line.')+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

dfNO21%>%
  select(-X, -by10)%>%
  unique()%>%
  as.data.frame()%>%
  arrange(desc(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeMeanX
001e0610f05c	41.924903	-87.687703	100.000000
001e0610ba46	41.878377	-87.627678	100.000000
001e06113ace	41.83107	-87.617298	99.767261
001e0610eef2	41.965256	-87.66672	99.621326
001e06114fd4	41.794477	-87.615957	99.417522
001e061146bc	41.918733	-87.668257	99.069042
001e06113cf1	41.884688	-87.627864	98.714271
001e0610f6db	41.791329	-87.598677	98.231179
001e061130f4	41.896157	-87.662391	95.391757
001e061144c0	41.764122	-87.72242	91.421916
001e06113107	41.751142	-87.71299	85.478311
001e06114503	41.666078	-87.539374	85.264099
001e06114500	41.714494	-87.643099	81.703356
001e0610ee43	41.788608	-87.598713	77.654489
001e0610bc10	41.736314	-87.624179	68.007750
001e0610ba15	41.722457	-87.57535	53.672024
001e0610ba13	41.751238	-87.712990	40.266112
001e0610e537	41.961622	-87.665948	4.322665
001e0610ef27	41.846579	-87.685557	0.000000
001e0610ee33	41.965089	-87.679076	0.000000
001e0610e532	41.857959	-87.656427	0.000000

The spatial distribution of these varying levels of sensor value reliability by node is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Blues", dfNO21$NodeMeanX, n = 5)

dfNO21$lat<-as.numeric(dfNO21$lat)
dfNO21$lon<-as.numeric(dfNO21$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfNO21,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeMeanX),fillOpacity = 0.5, 
                   popup = paste("Node:", dfNO21$node_id, "<br>",
                                 "Mean Proportion of Reliable Data Collected:", round(dfNO21$NodeMeanX), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfNO21$NodeMeanX, opacity = 0.7, title = NULL,position = "topright")

Constructing Score 1

The density plot below shows the distribution of node sensor value reliability around its average. This average is taken as Score 1, which represents the overall sensor value reliability of the AoT network for NO2 concentration data on 2018-12-15. In essence, the network sensor value reliability is the average proportion of reliable data collected by each node across the 10-minute intervals of the day.

dfNO21%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 1`= mean(NodeMeanX))%>%
  ggplot()+
  geom_density(aes(NodeMeanX), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 1`), size = 2)+
  geom_text(aes(x= `Score 1`, y=0), label='Score 1:\n Average Proportion of\nReliable Data Collected\n by Each Node', size = 4, vjust= -2, hjust=-0.1)+
  labs(x= 'Node Sensor Value Reliability',
       y= 'Density',
       title = 'Distribution of Node Sensor Value Reliability Scores')+
  xlim(0, 100)+
  plotTheme()

dfNO21%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  summarise(Score = mean(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = "striped")

Score
70.3811
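
As a quick check, Score 1 can be reproduced directly from the per-node table above. The sketch below copies the 21 NodeMeanX values by hand into a vector (the name `node_mean_x` is introduced here for illustration) and takes their unweighted mean.

```r
# Node Sensor Value Reliability values copied from the NO2 node table above.
node_mean_x <- c(100.000000, 100.000000, 99.767261, 99.621326, 99.417522,
                 99.069042, 98.714271, 98.231179, 95.391757, 91.421916,
                 85.478311, 85.264099, 81.703356, 77.654489, 68.007750,
                 53.672024, 40.266112, 4.322665, 0, 0, 0)

# Score 1 is the unweighted mean across nodes.
score_1 <- mean(node_mean_x)
print(round(score_1, 4))
```

This returns 70.3811, matching the reported score. Every node carries equal weight, so the three nodes with a reliability of 0 pull the network score down as much as any other node would.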

Constructing Score 2

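dfNO21%>%
  select(node_id, lat, lon, by10, X)%>%
  group_by(node_id, lat, lon)%>%
  # Cap each node's standard deviation of X at its mean, so that NodeSD/mean(X) stays within [0, 1]
  mutate(NodeSD=ifelse(abs(sd(X))>mean(X), 
                       mean(X),
                       abs(sd(X))))%>%
  # Nodes with a mean of 0 score 0; all other nodes are scored from the ratio NodeSD/mean(X),
  # with a separate branch for nodes whose mean is below 50
  mutate(NodeSDScore= ifelse(mean(X)==0, 
                             0, 
                             ifelse(mean(X)<50, 
                                    abs(100-abs(100-100*(NodeSD/mean(X)))),
                                    abs(100-100*(NodeSD/mean(X))))))%>%
  select(node_id, lat, lon, NodeSDScore)%>%
  unique()%>%
  as.data.frame()%>%
  # Score 2 is the unweighted mean of the node scores
  mutate(`Score 2`= mean(NodeSDScore))->dfNO22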

dfNO22%>%
  arrange(desc(NodeSDScore))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeSDScore	Score 2
001e0610f05c	41.92490	-87.68770	100.00000	74.85045
001e0610ba46	41.87838	-87.62768	100.00000	74.85045
001e0610e537	41.96162	-87.66595	100.00000	74.85045
001e06113ace	41.83107	-87.61730	98.91567	74.85045
001e0610eef2	41.96526	-87.66672	98.21697	74.85045
001e06114fd4	41.79448	-87.61596	97.79367	74.85045
001e061144c0	41.76412	-87.72242	97.33161	74.85045
001e061146bc	41.91873	-87.66826	96.95073	74.85045
001e0610f6db	41.79133	-87.59868	96.90857	74.85045
001e06113cf1	41.88469	-87.62786	96.46912	74.85045
001e061130f4	41.89616	-87.66239	91.77244	74.85045
001e06113107	41.75114	-87.71299	90.37978	74.85045
001e06114503	41.66608	-87.53937	78.54444	74.85045
001e06114500	41.71449	-87.64310	78.01626	74.85045
001e0610ee43	41.78861	-87.59871	75.60495	74.85045
001e0610ba15	41.72246	-87.57535	68.24256	74.85045
001e0610ba13	41.75124	-87.71299	59.31641	74.85045
001e0610bc10	41.73631	-87.62418	47.39623	74.85045
001e0610ef27	41.84658	-87.68556	0.00000	74.85045
001e0610ee33	41.96509	-87.67908	0.00000	74.85045
001e0610e532	41.85796	-87.65643	0.00000	74.85045
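
To see how individual node scores in this table can arise, the per-node rule from the Score 2 code can be restated as a plain function of one node's X values. This is a sketch, not the authors' implementation: the helper name `node_sd_score` is hypothetical, and it assumes at least two intervals per node, since `sd()` returns `NA` for a single value.

```r
# Scalar restatement of the per-node rule (hypothetical helper name).
node_sd_score <- function(x) {
  m <- mean(x)
  if (m == 0) return(0)   # the node never collected reliable data
  s <- min(sd(x), m)      # cap the standard deviation at the mean
  ratio <- s / m          # lies in [0, 1] after capping
  if (m < 50) {
    abs(100 - abs(100 - 100 * ratio))
  } else {
    abs(100 - 100 * ratio)
  }
}

print(node_sd_score(c(100, 100, 100)))  # constant and fully reliable
print(node_sd_score(c(0, 0, 0)))        # never reliable
print(node_sd_score(c(10, 0, 0, 0)))    # low mean, spread larger than the mean
```

The three calls return 100, 0 and 100. The third case may explain why a node such as 001e0610e537, whose Node Sensor Value Reliability is only 4.3, still receives a NodeSDScore of 100: below a mean of 50, the rule scores a node by how large its capped standard deviation is relative to its mean.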

The spatial distribution of this consistency in sensor value reliability by node is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Purples", dfNO22$NodeSDScore, n = 5)

dfNO22$lat<-as.numeric(dfNO22$lat)
dfNO22$lon<-as.numeric(dfNO22$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfNO22,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeSDScore),fillOpacity = 0.5, 
                   popup = paste("Node:", dfNO22$node_id, "<br>",
                                 "Consistency Score for Level of Sensor Value Reliability:", round(dfNO22$NodeSDScore), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfNO22$NodeSDScore, opacity = 0.7, title = NULL,position = "topright")

dfNO22%>%
  ggplot()+
  geom_density(aes(NodeSDScore), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 2`), size = 2)+
  geom_text(aes(x= `Score 2`, y=0), label='Score 2:\n Overall Consistency\nin Sensor Value Reliability', size = 4, vjust= -2, hjust=1)+
  labs(x='Node Sensor Value Reliability Consistency Score', 
       y='Density',
       title='Distribution of Node Sensor Value Reliability Consistency Scores')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT NO2 concentration network on 2018-12-15:

Score 1 = 70.4: On average, only 70.4% of the NO2 concentration data measured by the nodes in the network in each 10-minute interval is reliable. This is a low score.
Score 2 = 74.9: The density distribution shows that node sensor value reliability is consistently good more often than it is consistently bad, hence the above-moderate score.

O3 Concentration

We begin by observing how reliable and unreliable data measurements are distributed across the day for each node in the network on 2018-12-15. The figure below shows how the counts of reliable and unreliable measurements vary over the course of the day for each node collecting O3 concentration data. Most nodes collect around 20 measurements in every 10-minute interval.

dfO3%>%
    ggplot()+
  geom_bar(aes(x=by10, fill=as.factor(val_qual)))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Number collected', x='Time', 
       title='Number of Reliable and Unreliable O3 Concentration Data Collected on 2018-12-15 For Each Node\n- By Time')+
  scale_fill_manual(values=c('indianred1', 'cornflowerblue'), 
                    labels=c('Unreliable', 
                            'Reliable'), 
                    name="")+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

Calculate Proportion of Reliable Data Measurements

As the flowchart shows, we need to calculate X, which represents the proportion of reliable data collected by each node at each 10-minute time interval.

From X, we can then calculate the Node Sensor Value Reliability, which represents the average proportion of reliable data collected by each node during the day.

dfO3%>%
  group_by(node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(node_id, by10, lat, lon, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  mutate(NodeMeanX = sum(X)/144)->dfO31

dfO31%>%
  arrange(node_id)%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	by10	lat	lon	X	NodeMeanX
001e0610ba13	2018-12-15 00:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 00:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 01:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 02:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 03:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 04:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 05:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 06:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 07:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 08:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 09:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 10:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 11:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 12:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 13:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 14:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 15:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 16:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 17:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 18:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 19:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 20:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 21:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 22:50:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:00:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:10:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:20:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:30:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:40:00	41.751238	-87.712990	100	100
001e0610ba13	2018-12-15 23:50:00	41.751238	-87.712990	100	100

The plot below presents how X varies around each node’s Node Sensor Value Reliability.

  ggplot()+
  geom_line(data=dfO31,aes(x=by10, y=NodeMeanX, group=1), col='black', size=1, linetype='dashed')+
  geom_line(data=dfO31,aes(x=by10, y=X, group=1), col='indianred', size=1, alpha=0.5)+
        scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  scale_color_manual('',
                    values=c("Proportion of Reliable Data"='indianred'))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Proportion Reliable', x='Time', 
       title='Proportion of Reliable O3 Concentration Data Collected on 2018-12-15 For Each Node - By Time',
       subtitle='Mean proportion for each node denoted by dashed line.')+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

dfO31%>%
  select(-X, -by10)%>%
  unique()%>%
  as.data.frame()%>%
  arrange(desc(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeMeanX
001e06113cf1	41.884688	-87.627864	100.00000
001e061146bc	41.918733	-87.668257	100.00000
001e0610f05c	41.924903	-87.687703	100.00000
001e0610ba46	41.878377	-87.627678	100.00000
001e0610bc10	41.736314	-87.624179	100.00000
001e0610ba13	41.751238	-87.712990	100.00000
001e0610f6db	41.791329	-87.598677	100.00000
001e0610eef2	41.965256	-87.66672	99.97106
001e061130f4	41.896157	-87.662391	99.88300
001e0610e537	41.961622	-87.665948	99.50684
001e06114fd4	41.794477	-87.615957	97.98209
001e0610ee43	41.788608	-87.598713	96.63345
001e06113107	41.751142	-87.71299	92.27179
001e061144c0	41.764122	-87.72242	92.12202
001e0610ba15	41.722457	-87.57535	92.10561
001e06113ace	41.83107	-87.617298	68.69716
001e06114500	41.714494	-87.643099	68.24957
001e06114503	41.666078	-87.539374	57.50222
001e0610ef27	41.846579	-87.685557	0.00000
001e0610ee33	41.965089	-87.679076	0.00000
001e0610e532	41.857959	-87.656427	0.00000

The spatial distribution of these varying sensor value reliability levels by node is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Blues", dfO31$NodeMeanX, n = 5)

dfO31$lat<-as.numeric(dfO31$lat)
dfO31$lon<-as.numeric(dfO31$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfO31,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeMeanX),fillOpacity = 0.5, 
                   popup = paste("Node:", dfO31$node_id, "<br>",
                                 "Mean Proportion of Reliable Data Collected:", round(dfO31$NodeMeanX), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfO31$NodeMeanX, opacity = 0.7, title = NULL,position = "topright")

Constructing Score 1

The density plot below shows the distribution of node sensor value reliability relative to its average. This average is taken as Score 1, which represents the overall network sensor value reliability of the AoT network for O3 Concentration data on 2018-12-15. In essence, the network sensor value reliability is the average proportion of reliable data collected by each node at each time interval of the day.

dfO31%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 1`= mean(NodeMeanX))%>%
  ggplot()+
  geom_density(aes(NodeMeanX), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 1`), size = 2)+
  geom_text(aes(x= `Score 1`, y=0), label='Score 1:\n Average Proportion of\nReliable Data Collected\n by Each Node', size = 4, vjust= -2, hjust=-0.1)+
  labs(x= 'Node Sensor Value Reliability',
       y= 'Density',
       title = 'Distribution of Node Sensor Value Reliability Scores')+
  xlim(0, 100)+
  plotTheme()

dfO31%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  summarise(Score = mean(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = "striped")

Score
79.28213
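
As a quick arithmetic check, Score 1 can be reproduced from the NodeMeanX column of the node table printed earlier. This is only a sketch: the values are copied by hand from that table, and node_mean_x and score1 are names introduced here for illustration.

```r
# NodeMeanX values copied from the O3 node table above (21 nodes).
node_mean_x <- c(rep(100, 7),
                 99.97106, 99.88300, 99.50684, 97.98209,
                 96.63345, 92.27179, 92.12202, 92.10561,
                 68.69716, 68.24957, 57.50222,
                 rep(0, 3))

# Score 1 is the plain mean of the per-node values.
score1 <- mean(node_mean_x)
round(score1, 5)  # 79.28213, matching the Score printed above
```

Every node carries equal weight in this mean, so the three nodes reporting 0 count as much as the seven nodes reporting 100.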

Constructing Score 2

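# Score 2: per-node consistency of X across the day
dfO31%>%
  select(node_id, lat, lon, by10, X)%>%
  group_by(node_id, lat, lon)%>%
  # cap each node's standard deviation of X at its mean, so SD/mean stays within [0, 1]
  mutate(NodeSD=ifelse(abs(sd(X))>mean(X), 
                       mean(X),
                       abs(sd(X))))%>%
  # nodes with a mean X of 0 score 0; otherwise the score is derived from SD/mean
  mutate(NodeSDScore= ifelse(mean(X)==0, 
                             0, 
                             ifelse(mean(X)<50, 
                                    abs(100-abs(100-100*(NodeSD/mean(X)))),
                                    abs(100-100*(NodeSD/mean(X))))))%>%
  # keep one row per node, then average the node scores into Score 2
  select(node_id, lat, lon, NodeSDScore)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 2`= mean(NodeSDScore))->dfO32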

dfO32%>%
  arrange(desc(NodeSDScore))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeSDScore	Score 2
001e06113cf1	41.88469	-87.62786	100.00000	77.58294
001e061146bc	41.91873	-87.66826	100.00000	77.58294
001e0610f05c	41.92490	-87.68770	100.00000	77.58294
001e0610ba46	41.87838	-87.62768	100.00000	77.58294
001e0610bc10	41.73631	-87.62418	100.00000	77.58294
001e0610ba13	41.75124	-87.71299	100.00000	77.58294
001e0610f6db	41.79133	-87.59868	100.00000	77.58294
001e0610eef2	41.96526	-87.66672	99.65268	77.58294
001e06113107	41.75114	-87.71299	99.36023	77.58294
001e061130f4	41.89616	-87.66239	99.14727	77.58294
001e061144c0	41.76412	-87.72242	98.84547	77.58294
001e0610e537	41.96162	-87.66595	98.31807	77.58294
001e06114fd4	41.79448	-87.61596	95.93814	77.58294
001e0610ee43	41.78861	-87.59871	93.79447	77.58294
001e0610ba15	41.72246	-87.57535	88.93739	77.58294
001e06113ace	41.83107	-87.61730	59.97926	77.58294
001e06114500	41.71449	-87.64310	57.28921	77.58294
001e06114503	41.66608	-87.53937	37.97947	77.58294
001e0610ef27	41.84658	-87.68556	0.00000	77.58294
001e0610ee33	41.96509	-87.67908	0.00000	77.58294
001e0610e532	41.85796	-87.65643	0.00000	77.58294
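
Read for a single node, the NodeSDScore expression above reduces to a piecewise function of the ratio SD/mean, capped at 1. The following is a minimal sketch of that reading and is not part of the original pipeline: node_sd_score and its zero_mean_score argument are hypothetical names introduced here, and the function takes one node's vector of X values.

```r
# Hypothetical helper restating the NodeSDScore logic for one node.
# x: numeric vector of X values (percent reliable per 10-minute interval).
# zero_mean_score: value assigned when mean(x) is 0 (0 in the O3 code above).
node_sd_score <- function(x, zero_mean_score = 0) {
  m <- mean(x)
  if (m == 0) return(zero_mean_score)
  ratio <- min(sd(x), m) / m          # SD capped at the mean, so ratio is in [0, 1]
  if (m < 50) {
    100 * ratio                       # same as abs(100 - abs(100 - 100 * ratio))
  } else {
    100 * (1 - ratio)                 # same as abs(100 - 100 * ratio)
  }
}

node_sd_score(rep(100, 144))  # constant at 100: SD is 0, score is 100
node_sd_score(rep(0, 144))    # mean of 0: returns zero_mean_score
node_sd_score(c(60, 80))      # mean of 70, so the 100 * (1 - ratio) branch applies
```

Because the ratio never exceeds 1, both absolute-value expressions in the pipeline simplify to the two branches shown: 100 × ratio when the node's mean is below 50, and 100 × (1 − ratio) otherwise.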

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Purples", dfO32$NodeSDScore, n = 5)

dfO32$lat<-as.numeric(dfO32$lat)
dfO32$lon<-as.numeric(dfO32$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfO32,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeSDScore),fillOpacity = 0.5, 
                   popup = paste("Node:", dfO32$node_id, "<br>",
                                 "Consistency Score for Level of Sensor Value Reliability:", round(dfO32$NodeSDScore), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfO32$NodeSDScore, opacity = 0.7, title = NULL,position = "topright")

dfO32%>%
  ggplot()+
  geom_density(aes(NodeSDScore), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 2`), size = 2)+
  geom_text(aes(x= `Score 2`, y=0), label='Score 2:\n Overall Consistency\nin Sensor Value Reliability', size = 4, vjust= -2, hjust=1)+
  labs(x='Node Sensor Value Reliability Consistency Score', 
       y='Density',
       title='Distribution of Node Sensor Value Reliability Consistency Scores')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT O3 Concentration Network on 2018-12-15,

Score 1 = 79.3 : On average, only 79.3% of the O3 Concentration data measured by the nodes in the network in each 10-minute interval is reliable. This is a low score.
Score 2 = 77.6 : From the density distribution, it can be observed that node sensor value reliability is consistently good more often than it is consistently bad, hence the above-moderate score here.

SO2 Concentration

We begin by observing how reliable and unreliable data measurements are distributed across the day for each node in the network on 2018-12-15. The figure below shows how the numbers of reliable and unreliable measurements vary over the day for each node collecting SO2 Concentration data. Most nodes collect around 20 measurements in every 10-minute interval.

dfSO2%>%
  ggplot()+
  geom_bar(aes(x=by10, fill=as.factor(val_qual)))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Number collected', x='Time', 
       title='Number of Reliable and Unreliable SO2 Concentration Data Collected on 2018-12-15 For Each Node\n- By Time')+
  scale_fill_manual(values=c('indianred1', 'cornflowerblue'), 
                    labels=c('Unreliable', 
                            'Reliable'), 
                    name="")+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

Calculate Proportion of Reliable Data Measurements

As the flowchart shows, we need to calculate X, which represents the proportion of reliable data collected by each node at each 10-minute time interval.

From X, we can then calculate the Node Sensor Value Reliability, which represents the average proportion of reliable data collected by each node during the day.
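
To make the two quantities concrete, here is a small base R sketch on made-up data. The data frame toy and the names x_tab and node_mean_x are assumptions introduced for illustration only. The column names node_id, by10 and val_qual follow the pipeline in this section, with val_qual taken to be 1 for a reliable measurement and 0 otherwise.

```r
# Toy data (made up): one node, two 10-minute intervals.
toy <- data.frame(
  node_id  = "node_a",
  by10     = rep(c("00:00", "00:10"), times = c(24, 23)),
  val_qual = c(rep(c(1, 0), times = c(17, 7)),   # 17 of 24 reliable
               rep(c(1, 0), times = c(17, 6)))   # 17 of 23 reliable
)

# X: percentage of reliable measurements per node and interval.
x_tab <- aggregate(val_qual ~ node_id + by10, data = toy,
                   FUN = function(v) 100 * sum(v) / length(v))
names(x_tab)[3] <- "X"

# Node Sensor Value Reliability: sum of X divided by the 144 intervals in a
# day, so intervals with no rows at all contribute 0.
node_mean_x <- tapply(x_tab$X, x_tab$node_id, function(x) sum(x) / 144)

x_tab
node_mean_x
```

Dividing by a fixed 144 rather than by the number of intervals observed means that a node which is silent for part of the day is penalised for the missing intervals.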

dfSO2%>%
  group_by(node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(node_id, by10, lat, lon, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  mutate(NodeMeanX = sum(X)/144)->dfSO21

dfSO21%>%
  arrange(node_id)%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	by10	lat	lon	X	NodeMeanX
001e0610ba13	2018-12-15 00:00:00	41.751238	-87.712990	70.83333	60.72535
001e0610ba13	2018-12-15 00:10:00	41.751238	-87.712990	73.91304	60.72535
001e0610ba13	2018-12-15 00:20:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 00:30:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 00:40:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 00:50:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 01:00:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 01:10:00	41.751238	-87.712990	87.50000	60.72535
001e0610ba13	2018-12-15 01:20:00	41.751238	-87.712990	79.16667	60.72535
001e0610ba13	2018-12-15 01:30:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 01:40:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 01:50:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 02:00:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 02:10:00	41.751238	-87.712990	82.60870	60.72535
001e0610ba13	2018-12-15 02:20:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 02:30:00	41.751238	-87.712990	70.83333	60.72535
001e0610ba13	2018-12-15 02:40:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 02:50:00	41.751238	-87.712990	83.33333	60.72535
001e0610ba13	2018-12-15 03:00:00	41.751238	-87.712990	87.50000	60.72535
001e0610ba13	2018-12-15 03:10:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 03:20:00	41.751238	-87.712990	58.33333	60.72535
001e0610ba13	2018-12-15 03:30:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 03:40:00	41.751238	-87.712990	41.66667	60.72535
001e0610ba13	2018-12-15 03:50:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 04:00:00	41.751238	-87.712990	25.00000	60.72535
001e0610ba13	2018-12-15 04:10:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 04:20:00	41.751238	-87.712990	83.33333	60.72535
001e0610ba13	2018-12-15 04:30:00	41.751238	-87.712990	69.56522	60.72535
001e0610ba13	2018-12-15 04:40:00	41.751238	-87.712990	25.00000	60.72535
001e0610ba13	2018-12-15 04:50:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 05:00:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 05:10:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 05:20:00	41.751238	-87.712990	91.66667	60.72535
001e0610ba13	2018-12-15 05:30:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 05:40:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 05:50:00	41.751238	-87.712990	70.83333	60.72535
001e0610ba13	2018-12-15 06:00:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 06:10:00	41.751238	-87.712990	45.45455	60.72535
001e0610ba13	2018-12-15 06:20:00	41.751238	-87.712990	29.16667	60.72535
001e0610ba13	2018-12-15 06:30:00	41.751238	-87.712990	41.66667	60.72535
001e0610ba13	2018-12-15 06:40:00	41.751238	-87.712990	37.50000	60.72535
001e0610ba13	2018-12-15 06:50:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 07:00:00	41.751238	-87.712990	37.50000	60.72535
001e0610ba13	2018-12-15 07:10:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 07:20:00	41.751238	-87.712990	33.33333	60.72535
001e0610ba13	2018-12-15 07:30:00	41.751238	-87.712990	70.83333	60.72535
001e0610ba13	2018-12-15 07:40:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 07:50:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 08:00:00	41.751238	-87.712990	73.91304	60.72535
001e0610ba13	2018-12-15 08:10:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 08:20:00	41.751238	-87.712990	70.83333	60.72535
001e0610ba13	2018-12-15 08:30:00	41.751238	-87.712990	79.16667	60.72535
001e0610ba13	2018-12-15 08:40:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 08:50:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 09:00:00	41.751238	-87.712990	79.16667	60.72535
001e0610ba13	2018-12-15 09:10:00	41.751238	-87.712990	70.83333	60.72535
001e0610ba13	2018-12-15 09:20:00	41.751238	-87.712990	29.16667	60.72535
001e0610ba13	2018-12-15 09:30:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 09:40:00	41.751238	-87.712990	37.50000	60.72535
001e0610ba13	2018-12-15 09:50:00	41.751238	-87.712990	69.56522	60.72535
001e0610ba13	2018-12-15 10:00:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 10:10:00	41.751238	-87.712990	58.33333	60.72535
001e0610ba13	2018-12-15 10:20:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 10:30:00	41.751238	-87.712990	87.50000	60.72535
001e0610ba13	2018-12-15 10:40:00	41.751238	-87.712990	58.33333	60.72535
001e0610ba13	2018-12-15 10:50:00	41.751238	-87.712990	41.66667	60.72535
001e0610ba13	2018-12-15 11:00:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 11:10:00	41.751238	-87.712990	83.33333	60.72535
001e0610ba13	2018-12-15 11:20:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 11:30:00	41.751238	-87.712990	58.33333	60.72535
001e0610ba13	2018-12-15 11:40:00	41.751238	-87.712990	72.72727	60.72535
001e0610ba13	2018-12-15 11:50:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 12:00:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 12:10:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 12:20:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 12:30:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 12:40:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 12:50:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 13:00:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 13:10:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 13:20:00	41.751238	-87.712990	83.33333	60.72535
001e0610ba13	2018-12-15 13:30:00	41.751238	-87.712990	73.91304	60.72535
001e0610ba13	2018-12-15 13:40:00	41.751238	-87.712990	33.33333	60.72535
001e0610ba13	2018-12-15 13:50:00	41.751238	-87.712990	91.66667	60.72535
001e0610ba13	2018-12-15 14:00:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 14:10:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 14:20:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 14:30:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 14:40:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 14:50:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 15:00:00	41.751238	-87.712990	83.33333	60.72535
001e0610ba13	2018-12-15 15:10:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 15:20:00	41.751238	-87.712990	41.66667	60.72535
001e0610ba13	2018-12-15 15:30:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 15:40:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 15:50:00	41.751238	-87.712990	73.91304	60.72535
001e0610ba13	2018-12-15 16:00:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 16:10:00	41.751238	-87.712990	70.83333	60.72535
001e0610ba13	2018-12-15 16:20:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 16:30:00	41.751238	-87.712990	73.91304	60.72535
001e0610ba13	2018-12-15 16:40:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 16:50:00	41.751238	-87.712990	79.16667	60.72535
001e0610ba13	2018-12-15 17:00:00	41.751238	-87.712990	79.16667	60.72535
001e0610ba13	2018-12-15 17:10:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 17:20:00	41.751238	-87.712990	70.83333	60.72535
001e0610ba13	2018-12-15 17:30:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 17:40:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 17:50:00	41.751238	-87.712990	83.33333	60.72535
001e0610ba13	2018-12-15 18:00:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 18:10:00	41.751238	-87.712990	83.33333	60.72535
001e0610ba13	2018-12-15 18:20:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 18:30:00	41.751238	-87.712990	37.50000	60.72535
001e0610ba13	2018-12-15 18:40:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 18:50:00	41.751238	-87.712990	86.95652	60.72535
001e0610ba13	2018-12-15 19:00:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 19:10:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 19:20:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 19:30:00	41.751238	-87.712990	33.33333	60.72535
001e0610ba13	2018-12-15 19:40:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 19:50:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 20:00:00	41.751238	-87.712990	37.50000	60.72535
001e0610ba13	2018-12-15 20:10:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 20:20:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 20:30:00	41.751238	-87.712990	50.00000	60.72535
001e0610ba13	2018-12-15 20:40:00	41.751238	-87.712990	79.16667	60.72535
001e0610ba13	2018-12-15 20:50:00	41.751238	-87.712990	41.66667	60.72535
001e0610ba13	2018-12-15 21:00:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 21:10:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 21:20:00	41.751238	-87.712990	37.50000	60.72535
001e0610ba13	2018-12-15 21:30:00	41.751238	-87.712990	34.78261	60.72535
001e0610ba13	2018-12-15 21:40:00	41.751238	-87.712990	54.16667	60.72535
001e0610ba13	2018-12-15 21:50:00	41.751238	-87.712990	41.66667	60.72535
001e0610ba13	2018-12-15 22:00:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 22:10:00	41.751238	-87.712990	37.50000	60.72535
001e0610ba13	2018-12-15 22:20:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 22:30:00	41.751238	-87.712990	25.00000	60.72535
001e0610ba13	2018-12-15 22:40:00	41.751238	-87.712990	62.50000	60.72535
001e0610ba13	2018-12-15 22:50:00	41.751238	-87.712990	69.56522	60.72535
001e0610ba13	2018-12-15 23:00:00	41.751238	-87.712990	75.00000	60.72535
001e0610ba13	2018-12-15 23:10:00	41.751238	-87.712990	66.66667	60.72535
001e0610ba13	2018-12-15 23:20:00	41.751238	-87.712990	79.16667	60.72535
001e0610ba13	2018-12-15 23:30:00	41.751238	-87.712990	45.83333	60.72535
001e0610ba13	2018-12-15 23:40:00	41.751238	-87.712990	70.83333	60.72535
001e0610ba13	2018-12-15 23:50:00	41.751238	-87.712990	47.82609	60.72535

The plot below presents how X varies around each node’s Node Sensor Value Reliability.

  ggplot()+
  geom_line(data=dfSO21,aes(x=by10, y=NodeMeanX, group=1), col='black', size=1, linetype='dashed')+
  geom_line(data=dfSO21,aes(x=by10, y=X, group=1), col='indianred', size=1, alpha=0.5)+
        scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  scale_color_manual('',
                    values=c("Proportion of Reliable Data"='indianred'))+
  facet_wrap(~node_id, ncol=5)+
  labs(y='Proportion Reliable', x='Time', 
       title='Proportion of Reliable SO2 Data Collected on 2018-12-15 For Each Node - By Time',
       subtitle='Mean proportion for each node denoted by dashed line.')+
  plotTheme()+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

dfSO21%>%
  select(-X, -by10)%>%
  unique()%>%
  as.data.frame()%>%
  arrange(desc(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeMeanX
001e0610ba15	41.722457	-87.57535	82.9051383
001e061144c0	41.764122	-87.72242	75.6648315
001e0610ba13	41.751238	-87.712990	60.7253468
001e061146bc	41.918733	-87.668257	51.6541090
001e0610ee43	41.788608	-87.598713	49.8188406
001e06113cf1	41.884688	-87.627864	48.1645028
001e0610f05c	41.924903	-87.687703	46.3013285
001e06113107	41.751142	-87.71299	45.6653835
001e0610e537	41.961622	-87.665948	42.4152073
001e0610ba46	41.878377	-87.627678	42.2813964
001e0610f6db	41.791329	-87.598677	38.1982186
001e06114503	41.666078	-87.539374	31.3967346
001e06114fd4	41.794477	-87.615957	15.5725049
001e0610eef2	41.965256	-87.66672	7.8741445
001e061130f4	41.896157	-87.662391	0.1169988
001e0610ef27	41.846579	-87.685557	0.0000000
001e0610ee33	41.965089	-87.679076	0.0000000
001e0610bc10	41.736314	-87.624179	0.0000000
001e06113ace	41.83107	-87.617298	0.0000000
001e0610e532	41.857959	-87.656427	0.0000000
001e06114500	41.714494	-87.643099	0.0000000

The spatial distribution of these varying sensor value reliability levels by node is then visualised in the following map.

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Blues", dfSO21$NodeMeanX, n = 5)

dfSO21$lat<-as.numeric(dfSO21$lat)
dfSO21$lon<-as.numeric(dfSO21$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfSO21,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeMeanX),fillOpacity = 0.5, 
                   popup = paste("Node:", dfSO21$node_id, "<br>",
                                 "Mean Proportion of Reliable Data Collected:", round(dfSO21$NodeMeanX), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfSO21$NodeMeanX, opacity = 0.7, title = NULL,position = "topright")

Constructing Score 1

The density plot below shows the distribution of node sensor value reliability relative to its average. This average is taken as Score 1, which represents the overall network sensor value reliability of the AoT network for SO2 Concentration data on 2018-12-15. In essence, the network sensor value reliability is the average proportion of reliable data collected by each node at each time interval of the day.

dfSO21%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 1`= mean(NodeMeanX))%>%
  ggplot()+
  geom_density(aes(NodeMeanX), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 1`), size = 2)+
  geom_text(aes(x= `Score 1`, y=0), label='Score 1:\n Average Proportion of\nReliable Data Collected\n by Each Node', size = 4, vjust= -2, hjust=-0.1)+
  labs(x= 'Node Sensor Value Reliability',
       y= 'Density',
       title = 'Distribution of Node Sensor Value Reliability Scores')+
  xlim(0, 100)+
  plotTheme()

dfSO21%>%
  select(node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  summarise(Score = mean(NodeMeanX))%>%
  kable()%>%
  kable_styling(bootstrap_options = "striped")

Score
30.41689

Constructing Score 2

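# Score 2: per-node consistency of X across the day
dfSO21%>%
  select(node_id, lat, lon, by10, X)%>%
  group_by(node_id, lat, lon)%>%
  # cap each node's standard deviation of X at its mean, so SD/mean stays within [0, 1]
  mutate(NodeSD=ifelse(abs(sd(X))>mean(X), 
                       mean(X),
                       abs(sd(X))))%>%
  # nodes with a mean X of 0 score 100 here; otherwise the score is derived from SD/mean
  mutate(NodeSDScore= ifelse(mean(X)==0, 
                             100, 
                             ifelse(mean(X)<50, 
                                    abs(100-abs(100-100*(NodeSD/mean(X)))),
                                    abs(100-100*(NodeSD/mean(X))))))%>%
  # keep one row per node, then average the node scores into Score 2
  select(node_id, lat, lon, NodeSDScore)%>%
  unique()%>%
  as.data.frame()%>%
  mutate(`Score 2`= mean(NodeSDScore))->dfSO22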

dfSO22%>%
  arrange(desc(NodeSDScore))%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	lat	lon	NodeSDScore	Score 2
001e0610ef27	41.84658	-87.68556	100.00000	71.45969
001e061130f4	41.89616	-87.66239	100.00000	71.45969
001e06114fd4	41.79448	-87.61596	100.00000	71.45969
001e0610ee33	41.96509	-87.67908	100.00000	71.45969
001e0610bc10	41.73631	-87.62418	100.00000	71.45969
001e06113ace	41.83107	-87.61730	100.00000	71.45969
001e0610eef2	41.96526	-87.66672	100.00000	71.45969
001e0610e532	41.85796	-87.65643	100.00000	71.45969
001e06114500	41.71449	-87.64310	100.00000	71.45969
001e06114503	41.66608	-87.53937	79.17938	71.45969
001e0610ba15	41.72246	-87.57535	78.84579	71.45969
001e0610ba13	41.75124	-87.71299	74.04494	71.45969
001e061146bc	41.91873	-87.66826	71.19508	71.45969
001e061144c0	41.76412	-87.72242	60.37703	71.45969
001e0610f6db	41.79133	-87.59868	36.65302	71.45969
001e0610e537	41.96162	-87.66595	36.55186	71.45969
001e0610ba46	41.87838	-87.62768	36.14718	71.45969
001e0610f05c	41.92490	-87.68770	36.13700	71.45969
001e06113107	41.75114	-87.71299	33.48121	71.45969
001e0610ee43	41.78861	-87.59871	29.46617	71.45969
001e06113cf1	41.88469	-87.62786	28.57482	71.45969

chig<-readOGR('.', 'chigBound')

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\leech\OneDrive\Documents\Capstone\Exploratory", layer: "chigBound"
## with 1 features
## It has 1 fields

pal <- colorNumeric("Purples", dfSO22$NodeSDScore, n = 5)

dfSO22$lat<-as.numeric(dfSO22$lat)
dfSO22$lon<-as.numeric(dfSO22$lon)

leaflet() %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(data=chig, fillOpacity = 0, weight=2, color='black')%>%
  addCircleMarkers(data=dfSO22,
                   lng = ~lon, lat = ~lat, weight = 2,
                   radius = 7, opacity = 0.2,
                   fillColor= ~pal(NodeSDScore),fillOpacity = 0.5, 
                   popup = paste("Node:", dfSO22$node_id, "<br>",
                                 "Consistency Score for Level of Sensor Value Reliability:", round(dfSO22$NodeSDScore), "%", "<br>"))%>%
  addLegend(pal = pal, values = dfSO22$NodeSDScore, opacity = 0.7, title = NULL,position = "topright")

dfSO22%>%
  ggplot()+
  geom_density(aes(NodeSDScore), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 2`), size = 2)+
  geom_text(aes(x= `Score 2`, y=0), label='Score 2:\n Overall Consistency\nin Sensor Value Reliability', size = 4, vjust= -2, hjust=1)+
  labs(x='Node Sensor Value Reliability Consistency Score', 
       y='Density',
       title='Distribution of Node Sensor Value Reliability Consistency Scores')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT SO2 Concentration Network on 2018-12-15,

Score 1 = 30.4 : On average, only 30.4% of the SO2 concentration data measured by the nodes in the network in each 10-minute interval is reliable. This is a low score.
Score 2 = 71.5 : The density distribution shows that node sensor value reliability is consistently good more often than it is consistently bad, hence the above-moderate score.
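
To make the per-node rule behind Score 2 easier to inspect, the sketch below restates the NodeSDScore logic from the pipeline above as a standalone function. The function name `node_sd_score` and the example inputs are hypothetical and introduced only for illustration; the sketch assumes `x` holds one node's reliability percentages (0 to 100) for several intervals.

```r
# Per-node consistency score, transcribed from the NodeSDScore step above.
# node_sd_score is a hypothetical helper name, not part of the original code.
# x: reliability percentages (0-100) for one node, one value per interval.
node_sd_score <- function(x) {
  m <- mean(x)
  if (m == 0) return(100)   # mean of zero is scored 100, as in the pipeline
  s  <- min(sd(x), m)       # standard deviation capped at the mean
  cv <- s / m               # capped ratio, always between 0 and 1
  if (m < 50) {
    abs(100 - abs(100 - 100 * cv))  # simplifies to 100 * cv since cv <= 1
  } else {
    abs(100 - 100 * cv)             # simplifies to 100 * (1 - cv)
  }
}

# Illustrative inputs (made up for this example)
always_reliable <- node_sd_score(c(100, 100, 100))
never_reliable  <- node_sd_score(c(0, 0, 0))
mostly_reliable <- node_sd_score(c(80, 100))
c(always_reliable, never_reliable, mostly_reliable)
```

Because the standard deviation is capped at the mean, the nested `abs()` calls reduce to 100 × (1 − sd/mean) for nodes with a mean of 50 or more and to 100 × sd/mean for nodes with a mean below 50, which may be a useful reading aid when interpreting the table of node scores.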

3.4 Scoring Spatial Reliability

In this section, the method of scoring Spatial Reliability is presented for each data parameter type for the day of 2018-12-15. Two scores are obtained for this criterion: the average proportion of the network active at any time interval (Score 3) and the average proportion of Chicago's area covered at any time interval during the day (Score 4).

While Score 3 is not strictly spatial, it helps us interpret the extent of spatial coverage indicated by Score 4 in relation to the number of nodes collecting reliable data. For instance, a tightly clustered pattern of many nodes in the network will have a high Score 3 but a low Score 4, while a widely dispersed pattern of a few nodes in the network will have a low Score 3 but a high Score 4. These two scores have to be interpreted together, and are therefore scored under the same criterion.
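
The contrast between the two scores can be illustrated with a toy example. The sketch below is not part of the original analysis: it assumes a hypothetical 10 x 10 square "city" and a 10-node network, and the helper name `coverage` is made up for illustration. It computes an analogue of Score 3 (share of nodes active) and of Score 4 (share of the city covered by the bounding box of the active nodes) for a clustered and a dispersed pattern.

```r
# Hypothetical toy setup: a 10 x 10 square city and a 10-node network.
city_area <- 10 * 10
n_nodes   <- 10

# coverage() is an illustrative helper: x and y are the coordinates of the
# nodes that are active in one time interval.
coverage <- function(x, y) {
  bbox_area <- diff(range(x)) * diff(range(y))  # area of the bounding box
  c(prop_active = 100 * length(x) / n_nodes,    # Score 3 analogue
    prop_area   = 100 * bbox_area / city_area)  # Score 4 analogue
}

# Eight active nodes packed into a 1 x 1 block in the middle of the city
clustered <- coverage(x = c(4, 4.2, 4.5, 4.7, 5, 5, 4.1, 4.9),
                      y = c(4, 4.6, 5, 4.3, 4, 5, 4.8, 4.4))

# Two active nodes at opposite corners of the city
dispersed <- coverage(x = c(0, 10), y = c(0, 10))

rbind(clustered, dispersed)
```

In this toy case the clustered pattern has 80% of nodes active but covers only 1% of the area, while the dispersed pattern has 20% of nodes active but a bounding box spanning the whole city.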

The flowchart below illustrates the scoring process in this section:

The scores constructed for each data parameter are presented below, one parameter at a time.

Temperature

Constructing Score 3

Before we calculate the relevant proportions, it is useful to observe how the absolute number of nodes collecting reliable data in the network varies across the different time intervals of the day. The figure below shows how the number of nodes collecting reliable temperature data in the AoT network varies over the course of the day.

dfTemp%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))->dfTemp3

ggplot(data=dfTemp3, aes(x=by10, y=count))+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  geom_col(fill='indianred', col=NA)+
  ylim(0, 86)+
  geom_hline(yintercept=86, col='black', size=1)+
  geom_hline(yintercept=c(1:85), col='white')+
  geom_vline(aes(xintercept=as.numeric(by10)), col='white', size=2)+
  geom_hline(aes(yintercept=mean(count)), col='black', size=1)+
  geom_text(aes(y=86, x=0), label='Full network size: 86 nodes', size=4, hjust=-1, vjust=-1)+
  geom_text(aes(y=mean(count), x=0), label=paste('Average network size:', round(mean(dfTemp3$count)), 'nodes'), size=4, hjust=-1, vjust=-1)+
  labs(x='Time', y='Number of Active Nodes', 
       title='Number of nodes collecting reliable temperature data throughout the day',
       subtitle='Each x-axis tick represents a 10-minute time interval')+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

The table below shows the number of nodes collecting reliable temperature data at each time interval during the day, the proportion of these active nodes in relation to the full network of 86 nodes, and the average proportion during the full day. This average proportion is Score 3.

dfTemp3%>%
  arrange(by10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

by10	count	propActive	Score 3
2018-12-15 00:00:00	40	46.51163	47.27067
2018-12-15 00:10:00	40	46.51163	47.27067
2018-12-15 00:20:00	40	46.51163	47.27067
2018-12-15 00:30:00	40	46.51163	47.27067
2018-12-15 00:40:00	40	46.51163	47.27067
2018-12-15 00:50:00	40	46.51163	47.27067
2018-12-15 01:00:00	40	46.51163	47.27067
2018-12-15 01:10:00	40	46.51163	47.27067
2018-12-15 01:20:00	40	46.51163	47.27067
2018-12-15 01:30:00	40	46.51163	47.27067
2018-12-15 01:40:00	40	46.51163	47.27067
2018-12-15 01:50:00	40	46.51163	47.27067
2018-12-15 02:00:00	40	46.51163	47.27067
2018-12-15 02:10:00	40	46.51163	47.27067
2018-12-15 02:20:00	40	46.51163	47.27067
2018-12-15 02:30:00	40	46.51163	47.27067
2018-12-15 02:40:00	40	46.51163	47.27067
2018-12-15 02:50:00	40	46.51163	47.27067
2018-12-15 03:00:00	40	46.51163	47.27067
2018-12-15 03:10:00	41	47.67442	47.27067
2018-12-15 03:20:00	41	47.67442	47.27067
2018-12-15 03:30:00	41	47.67442	47.27067
2018-12-15 03:40:00	41	47.67442	47.27067
2018-12-15 03:50:00	41	47.67442	47.27067
2018-12-15 04:00:00	41	47.67442	47.27067
2018-12-15 04:10:00	41	47.67442	47.27067
2018-12-15 04:20:00	41	47.67442	47.27067
2018-12-15 04:30:00	41	47.67442	47.27067
2018-12-15 04:40:00	41	47.67442	47.27067
2018-12-15 04:50:00	41	47.67442	47.27067
2018-12-15 05:00:00	41	47.67442	47.27067
2018-12-15 05:10:00	41	47.67442	47.27067
2018-12-15 05:20:00	41	47.67442	47.27067
2018-12-15 05:30:00	41	47.67442	47.27067
2018-12-15 05:40:00	41	47.67442	47.27067
2018-12-15 05:50:00	41	47.67442	47.27067
2018-12-15 06:00:00	41	47.67442	47.27067
2018-12-15 06:10:00	41	47.67442	47.27067
2018-12-15 06:20:00	41	47.67442	47.27067
2018-12-15 06:30:00	41	47.67442	47.27067
2018-12-15 06:40:00	41	47.67442	47.27067
2018-12-15 06:50:00	41	47.67442	47.27067
2018-12-15 07:00:00	41	47.67442	47.27067
2018-12-15 07:10:00	41	47.67442	47.27067
2018-12-15 07:20:00	41	47.67442	47.27067
2018-12-15 07:30:00	41	47.67442	47.27067
2018-12-15 07:40:00	41	47.67442	47.27067
2018-12-15 07:50:00	41	47.67442	47.27067
2018-12-15 08:00:00	41	47.67442	47.27067
2018-12-15 08:10:00	41	47.67442	47.27067
2018-12-15 08:20:00	41	47.67442	47.27067
2018-12-15 08:30:00	40	46.51163	47.27067
2018-12-15 08:40:00	40	46.51163	47.27067
2018-12-15 08:50:00	40	46.51163	47.27067
2018-12-15 09:00:00	40	46.51163	47.27067
2018-12-15 09:10:00	40	46.51163	47.27067
2018-12-15 09:20:00	40	46.51163	47.27067
2018-12-15 09:30:00	40	46.51163	47.27067
2018-12-15 09:40:00	40	46.51163	47.27067
2018-12-15 09:50:00	40	46.51163	47.27067
2018-12-15 10:00:00	40	46.51163	47.27067
2018-12-15 10:10:00	41	47.67442	47.27067
2018-12-15 10:20:00	41	47.67442	47.27067
2018-12-15 10:30:00	41	47.67442	47.27067
2018-12-15 10:40:00	41	47.67442	47.27067
2018-12-15 10:50:00	41	47.67442	47.27067
2018-12-15 11:00:00	41	47.67442	47.27067
2018-12-15 11:10:00	41	47.67442	47.27067
2018-12-15 11:20:00	41	47.67442	47.27067
2018-12-15 11:30:00	41	47.67442	47.27067
2018-12-15 11:40:00	41	47.67442	47.27067
2018-12-15 11:50:00	41	47.67442	47.27067
2018-12-15 12:00:00	41	47.67442	47.27067
2018-12-15 12:10:00	41	47.67442	47.27067
2018-12-15 12:20:00	41	47.67442	47.27067
2018-12-15 12:30:00	41	47.67442	47.27067
2018-12-15 12:40:00	41	47.67442	47.27067
2018-12-15 12:50:00	41	47.67442	47.27067
2018-12-15 13:00:00	41	47.67442	47.27067
2018-12-15 13:10:00	41	47.67442	47.27067
2018-12-15 13:20:00	41	47.67442	47.27067
2018-12-15 13:30:00	41	47.67442	47.27067
2018-12-15 13:40:00	41	47.67442	47.27067
2018-12-15 13:50:00	41	47.67442	47.27067
2018-12-15 14:00:00	41	47.67442	47.27067
2018-12-15 14:10:00	41	47.67442	47.27067
2018-12-15 14:20:00	41	47.67442	47.27067
2018-12-15 14:30:00	41	47.67442	47.27067
2018-12-15 14:40:00	41	47.67442	47.27067
2018-12-15 14:50:00	41	47.67442	47.27067
2018-12-15 15:00:00	40	46.51163	47.27067
2018-12-15 15:10:00	39	45.34884	47.27067
2018-12-15 15:20:00	39	45.34884	47.27067
2018-12-15 15:30:00	39	45.34884	47.27067
2018-12-15 15:40:00	39	45.34884	47.27067
2018-12-15 15:50:00	39	45.34884	47.27067
2018-12-15 16:00:00	39	45.34884	47.27067
2018-12-15 16:10:00	39	45.34884	47.27067
2018-12-15 16:20:00	39	45.34884	47.27067
2018-12-15 16:30:00	39	45.34884	47.27067
2018-12-15 16:40:00	39	45.34884	47.27067
2018-12-15 16:50:00	41	47.67442	47.27067
2018-12-15 17:00:00	41	47.67442	47.27067
2018-12-15 17:10:00	41	47.67442	47.27067
2018-12-15 17:20:00	41	47.67442	47.27067
2018-12-15 17:30:00	41	47.67442	47.27067
2018-12-15 17:40:00	41	47.67442	47.27067
2018-12-15 17:50:00	41	47.67442	47.27067
2018-12-15 18:00:00	41	47.67442	47.27067
2018-12-15 18:10:00	41	47.67442	47.27067
2018-12-15 18:20:00	41	47.67442	47.27067
2018-12-15 18:30:00	41	47.67442	47.27067
2018-12-15 18:40:00	41	47.67442	47.27067
2018-12-15 18:50:00	41	47.67442	47.27067
2018-12-15 19:00:00	41	47.67442	47.27067
2018-12-15 19:10:00	41	47.67442	47.27067
2018-12-15 19:20:00	41	47.67442	47.27067
2018-12-15 19:30:00	41	47.67442	47.27067
2018-12-15 19:40:00	41	47.67442	47.27067
2018-12-15 19:50:00	41	47.67442	47.27067
2018-12-15 20:00:00	41	47.67442	47.27067
2018-12-15 20:10:00	41	47.67442	47.27067
2018-12-15 20:20:00	41	47.67442	47.27067
2018-12-15 20:30:00	41	47.67442	47.27067
2018-12-15 20:40:00	41	47.67442	47.27067
2018-12-15 20:50:00	41	47.67442	47.27067
2018-12-15 21:00:00	41	47.67442	47.27067
2018-12-15 21:10:00	41	47.67442	47.27067
2018-12-15 21:20:00	41	47.67442	47.27067
2018-12-15 21:30:00	41	47.67442	47.27067
2018-12-15 21:40:00	41	47.67442	47.27067
2018-12-15 21:50:00	41	47.67442	47.27067
2018-12-15 22:00:00	41	47.67442	47.27067
2018-12-15 22:10:00	41	47.67442	47.27067
2018-12-15 22:20:00	41	47.67442	47.27067
2018-12-15 22:30:00	41	47.67442	47.27067
2018-12-15 22:40:00	41	47.67442	47.27067
2018-12-15 22:50:00	41	47.67442	47.27067
2018-12-15 23:00:00	41	47.67442	47.27067
2018-12-15 23:10:00	41	47.67442	47.27067
2018-12-15 23:20:00	41	47.67442	47.27067
2018-12-15 23:30:00	41	47.67442	47.27067
2018-12-15 23:40:00	41	47.67442	47.27067
2018-12-15 23:50:00	41	47.67442	47.27067
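
As a check on the table above, Score 3 can be recomputed from the interval counts alone. The sketch below tallies the counts by hand from the table (104 intervals with 41 nodes, 30 with 40 nodes, and 10 with 39 nodes) and uses the full network size of 86 nodes stated in the text; the variable names are illustrative.

```r
# Counts read off the table: 104 intervals with 41 active nodes,
# 30 intervals with 40, and 10 intervals with 39.
counts <- c(rep(41, 104), rep(40, 30), rep(39, 10))
stopifnot(length(counts) == 144)  # 24 hours x 6 ten-minute intervals

network_size <- 86                # full network size used in the text
prop_active  <- 100 * counts / network_size
score3       <- mean(prop_active) # average proportion of network active

round(score3, 5)  # 47.27067, matching the Score 3 column
```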

The density plot below shows the distribution of propActive (the proportion of active nodes) recorded at each time interval of the day, relative to the daily average. This average is taken as Score 3, the average proportion of the AoT network active for temperature data on 2018-12-15.

dfTemp3%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 3`), size = 1)+
  geom_text(aes(x= `Score 3`, y=0), label='Score 3:\n Average Proportion of\nNetwork Active', size = 4, vjust= -2, hjust=1.4)+
  labs(x='Proportions of active nodes',
       y='Density',
       title='Distribution of Proportions of Active Nodes')+
  xlim(0, 100)+
  plotTheme()

Constructing Score 4

To calculate the average proportion of Chicago's area covered by the active nodes, we begin by extracting the latitude and longitude of these nodes. This is done by collecting the locations at which every reliable data point is recorded and keeping the unique combinations of node, latitude, and longitude for each time interval.

The table below shows the first 41 rows of this result: the locations of the 40 nodes active at midnight (00:00) on 2018-12-15, followed by the first node listed for 00:10.

dfTemp%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> dfTemp4

dfTemp4%>%
  arrange(by10)%>%
  head(41)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped'))%>%
  scroll_box(height='300px')

by10	node_id	lat	lon
2018-12-15 00:00:00	001e0610ee36	41.75129	-87.60529
2018-12-15 00:00:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:00:00	001e0610bc12	41.75034	-87.66352
2018-12-15 00:00:00	001e06113f54	41.88461	-87.62458
2018-12-15 00:00:00	001e0611537d	41.79417	-87.60165
2018-12-15 00:00:00	001e061144c0	41.76412	-87.72242
2018-12-15 00:00:00	001e0610ef27	41.84658	-87.68556
2018-12-15 00:00:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:00:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:00:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:00:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:00:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:00:00	001e06114503	41.66608	-87.53937
2018-12-15 00:00:00	001e0611536c	41.88575	-87.62969
2018-12-15 00:00:00	001e0610ee5d	41.92400	-87.76107
2018-12-15 00:00:00	001e06113107	41.75114	-87.71299
2018-12-15 00:00:00	001e06113dbc	41.71387	-87.53651
2018-12-15 00:00:00	001e0610e532	41.85796	-87.65643
2018-12-15 00:00:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:00:00	001e0610f703	41.87148	-87.67644
2018-12-15 00:00:00	001e0610fb4c	41.91358	-87.68241
2018-12-15 00:00:00	001e06113d22	41.80085	-87.70374
2018-12-15 00:00:00	001e0610ee33	41.96509	-87.67908
2018-12-15 00:00:00	001e0610e538	41.73659	-87.60476
2018-12-15 00:00:00	001e0610f732	41.89500	-87.74582
2018-12-15 00:00:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:00:00	001e0610eef4	41.91268	-87.68105
2018-12-15 00:00:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:00:00	001e0610bbf9	41.76832	-87.68340
2018-12-15 00:00:00	001e06113a48	41.94326	-87.68807
2018-12-15 00:00:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:00:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:00:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:00:00	001e0611462f	41.82353	-87.64105
2018-12-15 00:00:00	001e0610f8f4	41.83258	-87.64613
2018-12-15 00:00:00	001e061135cb	41.77937	-87.66442
2018-12-15 00:00:00	001e0610e835	41.96876	-87.67917
2018-12-15 00:00:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:00:00	001e061146ba	41.96759	-87.76257
2018-12-15 00:00:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:10:00	001e0610ba15	41.72246	-87.57535

Using the latitude and longitude locations, we can obtain a point distribution of the active nodes in the network at any time interval. However, as we are interested in the area covered by these nodes rather than their point locations, we construct a spatial bounding box around the active node points at every time interval and clip it to the Chicago boundary. The ratio of the area of this clipped bounding box to the whole area of Chicago gives the proportion of Chicago's area covered (AreaProp) for each time interval. The average proportion of Chicago's area covered (Score 4) is the mean of these proportions across all time intervals, expressed as a percentage. This is all presented in the table below.

chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

dfTemp4a<-NULL
for(i in unique(dfTemp4$by10)){

  # Active nodes for this 10-minute interval
  subset <- 
    dfTemp4%>%
    filter(by10==i)

  # Node locations as WGS84 points, reprojected to the CRS of the Chicago boundary
  subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
  subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))

  # Rectangle traced around the bounding box of the active nodes
  P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                 subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                 subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                 subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                 subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
               ncol = 2, byrow = TRUE) %>%
    Polygon()

  Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

  # Clip the bounding box to the Chicago boundary
  Ps2<-gIntersection(Ps2, chig, byid=FALSE)

  # Share of Chicago's area covered in this interval
  Ps2AreaProp<-gArea(Ps2)/chigArea

  df1<-NULL
  df1$by10<-i
  df1$AreaProp<-Ps2AreaProp
  df1<-as.data.frame(df1)

  dfTemp4a<-rbind(dfTemp4a, df1)

}

dfTemp4a%>%
  mutate(`Score 4`= 100*mean(AreaProp))->dfTemp4a
dfTemp4a%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = 'striped')%>%
  scroll_box(height = "300px")

by10	AreaProp	Score 4
2018-12-15 00:00:00	0.7217209	72.17209
2018-12-15 00:10:00	0.7217209	72.17209
2018-12-15 00:20:00	0.7217209	72.17209
2018-12-15 00:30:00	0.7217209	72.17209
2018-12-15 00:40:00	0.7217209	72.17209
2018-12-15 00:50:00	0.7217209	72.17209
2018-12-15 01:00:00	0.7217209	72.17209
2018-12-15 01:10:00	0.7217209	72.17209
2018-12-15 01:20:00	0.7217209	72.17209
2018-12-15 01:30:00	0.7217209	72.17209
2018-12-15 01:40:00	0.7217209	72.17209
2018-12-15 01:50:00	0.7217209	72.17209
2018-12-15 02:00:00	0.7217209	72.17209
2018-12-15 02:10:00	0.7217209	72.17209
2018-12-15 02:20:00	0.7217209	72.17209
2018-12-15 02:30:00	0.7217209	72.17209
2018-12-15 02:40:00	0.7217209	72.17209
2018-12-15 02:50:00	0.7217209	72.17209
2018-12-15 03:00:00	0.7217209	72.17209
2018-12-15 03:10:00	0.7217209	72.17209
2018-12-15 03:20:00	0.7217209	72.17209
2018-12-15 03:30:00	0.7217209	72.17209
2018-12-15 03:40:00	0.7217209	72.17209
2018-12-15 03:50:00	0.7217209	72.17209
2018-12-15 04:00:00	0.7217209	72.17209
2018-12-15 04:10:00	0.7217209	72.17209
2018-12-15 04:20:00	0.7217209	72.17209
2018-12-15 04:30:00	0.7217209	72.17209
2018-12-15 04:40:00	0.7217209	72.17209
2018-12-15 04:50:00	0.7217209	72.17209
2018-12-15 05:00:00	0.7217209	72.17209
2018-12-15 05:10:00	0.7217209	72.17209
2018-12-15 05:20:00	0.7217209	72.17209
2018-12-15 05:30:00	0.7217209	72.17209
2018-12-15 05:40:00	0.7217209	72.17209
2018-12-15 05:50:00	0.7217209	72.17209
2018-12-15 06:00:00	0.7217209	72.17209
2018-12-15 06:10:00	0.7217209	72.17209
2018-12-15 06:20:00	0.7217209	72.17209
2018-12-15 06:30:00	0.7217209	72.17209
2018-12-15 06:40:00	0.7217209	72.17209
2018-12-15 06:50:00	0.7217209	72.17209
2018-12-15 07:00:00	0.7217209	72.17209
2018-12-15 07:10:00	0.7217209	72.17209
2018-12-15 07:20:00	0.7217209	72.17209
2018-12-15 07:30:00	0.7217209	72.17209
2018-12-15 07:40:00	0.7217209	72.17209
2018-12-15 07:50:00	0.7217209	72.17209
2018-12-15 08:00:00	0.7217209	72.17209
2018-12-15 08:10:00	0.7217209	72.17209
2018-12-15 08:20:00	0.7217209	72.17209
2018-12-15 08:30:00	0.7217209	72.17209
2018-12-15 08:40:00	0.7217209	72.17209
2018-12-15 08:50:00	0.7217209	72.17209
2018-12-15 09:00:00	0.7217209	72.17209
2018-12-15 09:10:00	0.7217209	72.17209
2018-12-15 09:20:00	0.7217209	72.17209
2018-12-15 09:30:00	0.7217209	72.17209
2018-12-15 09:40:00	0.7217209	72.17209
2018-12-15 09:50:00	0.7217209	72.17209
2018-12-15 10:00:00	0.7217209	72.17209
2018-12-15 10:10:00	0.7217209	72.17209
2018-12-15 10:20:00	0.7217209	72.17209
2018-12-15 10:30:00	0.7217209	72.17209
2018-12-15 10:40:00	0.7217209	72.17209
2018-12-15 10:50:00	0.7217209	72.17209
2018-12-15 11:00:00	0.7217209	72.17209
2018-12-15 11:10:00	0.7217209	72.17209
2018-12-15 11:20:00	0.7217209	72.17209
2018-12-15 11:30:00	0.7217209	72.17209
2018-12-15 11:40:00	0.7217209	72.17209
2018-12-15 11:50:00	0.7217209	72.17209
2018-12-15 12:00:00	0.7217209	72.17209
2018-12-15 12:10:00	0.7217209	72.17209
2018-12-15 12:20:00	0.7217209	72.17209
2018-12-15 12:30:00	0.7217209	72.17209
2018-12-15 12:40:00	0.7217209	72.17209
2018-12-15 12:50:00	0.7217209	72.17209
2018-12-15 13:00:00	0.7217209	72.17209
2018-12-15 13:10:00	0.7217209	72.17209
2018-12-15 13:20:00	0.7217209	72.17209
2018-12-15 13:30:00	0.7217209	72.17209
2018-12-15 13:40:00	0.7217209	72.17209
2018-12-15 13:50:00	0.7217209	72.17209
2018-12-15 14:00:00	0.7217209	72.17209
2018-12-15 14:10:00	0.7217209	72.17209
2018-12-15 14:20:00	0.7217209	72.17209
2018-12-15 14:30:00	0.7217209	72.17209
2018-12-15 14:40:00	0.7217209	72.17209
2018-12-15 14:50:00	0.7217209	72.17209
2018-12-15 15:00:00	0.7217209	72.17209
2018-12-15 15:10:00	0.7217209	72.17209
2018-12-15 15:20:00	0.7217209	72.17209
2018-12-15 15:30:00	0.7217209	72.17209
2018-12-15 15:40:00	0.7217209	72.17209
2018-12-15 15:50:00	0.7217209	72.17209
2018-12-15 16:00:00	0.7217209	72.17209
2018-12-15 16:10:00	0.7217209	72.17209
2018-12-15 16:20:00	0.7217209	72.17209
2018-12-15 16:30:00	0.7217209	72.17209
2018-12-15 16:40:00	0.7217209	72.17209
2018-12-15 16:50:00	0.7217209	72.17209
2018-12-15 17:00:00	0.7217209	72.17209
2018-12-15 17:10:00	0.7217209	72.17209
2018-12-15 17:20:00	0.7217209	72.17209
2018-12-15 17:30:00	0.7217209	72.17209
2018-12-15 17:40:00	0.7217209	72.17209
2018-12-15 17:50:00	0.7217209	72.17209
2018-12-15 18:00:00	0.7217209	72.17209
2018-12-15 18:10:00	0.7217209	72.17209
2018-12-15 18:20:00	0.7217209	72.17209
2018-12-15 18:30:00	0.7217209	72.17209
2018-12-15 18:40:00	0.7217209	72.17209
2018-12-15 18:50:00	0.7217209	72.17209
2018-12-15 19:00:00	0.7217209	72.17209
2018-12-15 19:10:00	0.7217209	72.17209
2018-12-15 19:20:00	0.7217209	72.17209
2018-12-15 19:30:00	0.7217209	72.17209
2018-12-15 19:40:00	0.7217209	72.17209
2018-12-15 19:50:00	0.7217209	72.17209
2018-12-15 20:00:00	0.7217209	72.17209
2018-12-15 20:10:00	0.7217209	72.17209
2018-12-15 20:20:00	0.7217209	72.17209
2018-12-15 20:30:00	0.7217209	72.17209
2018-12-15 20:40:00	0.7217209	72.17209
2018-12-15 20:50:00	0.7217209	72.17209
2018-12-15 21:00:00	0.7217209	72.17209
2018-12-15 21:10:00	0.7217209	72.17209
2018-12-15 21:20:00	0.7217209	72.17209
2018-12-15 21:30:00	0.7217209	72.17209
2018-12-15 21:40:00	0.7217209	72.17209
2018-12-15 21:50:00	0.7217209	72.17209
2018-12-15 22:00:00	0.7217209	72.17209
2018-12-15 22:10:00	0.7217209	72.17209
2018-12-15 22:20:00	0.7217209	72.17209
2018-12-15 22:30:00	0.7217209	72.17209
2018-12-15 22:40:00	0.7217209	72.17209
2018-12-15 22:50:00	0.7217209	72.17209
2018-12-15 23:00:00	0.7217209	72.17209
2018-12-15 23:10:00	0.7217209	72.17209
2018-12-15 23:20:00	0.7217209	72.17209
2018-12-15 23:30:00	0.7217209	72.17209
2018-12-15 23:40:00	0.7217209	72.17209
2018-12-15 23:50:00	0.7217209	72.17209
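
The bounding-box calculation behind AreaProp can also be sketched without any spatial packages. The example below is a simplified illustration only: it assumes a hypothetical rectangular city and made-up node coordinates in projected units, whereas the analysis above clips against the actual Chicago boundary polygon with gIntersection. For two axis-aligned rectangles, the clipped shape is again a rectangle, so its area can be computed directly.

```r
# Hypothetical rectangular "city" in projected units (xmin, xmax, ymin, ymax).
city <- c(xmin = 0, xmax = 100, ymin = 0, ymax = 200)
city_area <- (city[["xmax"]] - city[["xmin"]]) * (city[["ymax"]] - city[["ymin"]])

# Made-up coordinates of the nodes active in one time interval;
# the last node lies outside the city to show why clipping matters.
nodes <- data.frame(x = c(10, 90, 50, 120), y = c(20, 180, 100, 60))

# Bounding box of the active nodes
bbox <- c(xmin = min(nodes$x), xmax = max(nodes$x),
          ymin = min(nodes$y), ymax = max(nodes$y))

# Clip the bounding box to the city: overlap width and height, floored at 0
clip_w <- max(0, min(bbox[["xmax"]], city[["xmax"]]) - max(bbox[["xmin"]], city[["xmin"]]))
clip_h <- max(0, min(bbox[["ymax"]], city[["ymax"]]) - max(bbox[["ymin"]], city[["ymin"]]))

# Proportion of the city covered in this interval (AreaProp analogue)
area_prop <- (clip_w * clip_h) / city_area
area_prop
```

Averaging such proportions over all time intervals and multiplying by 100 gives the Score 4 analogue for this toy setup.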

We can also plot the area covered relative to the whole of Chicago for visual inspection. The 4 plots below are obtained for 00:00 (top left), 07:00 (top right), 13:00 (bottom left), and 19:00 (bottom right). In the case of the temperature network on this day, the area covered remained constant throughout the day - the 4 plots are therefore identical.

p<-list()

by10<-as.data.frame(unique(dfTemp4$by10))
by10$no<-row_number(by10)
colnames(by10)<-c('by10', 'no')

dfTemp4b<-merge(dfTemp4, by10, by='by10', all.x=TRUE)

for(i in unique(dfTemp4b$by10)){

  # Active nodes for this 10-minute interval
  subset <- 
    dfTemp4b%>%
    filter(by10==i)

  # Node locations as WGS84 points, reprojected to the CRS of the Chicago boundary
  subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
  subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))

  # Rectangle traced around the bounding box of the active nodes
  P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                 subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                 subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                 subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                 subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
               ncol = 2, byrow = TRUE) %>%
    Polygon()

  Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

  # Clip the bounding box to the Chicago boundary
  Ps2<-gIntersection(Ps2, chig, byid=FALSE)

  # Index of this interval, used as the plot's position in the list
  j<-unique(subset$no)

  # Chicago in red with the covered area overlaid in blue
  p[[j]]<-spplot(chig, colorkey=FALSE, col.regions='red', 
                 sp.layout=list(list(Ps2, fill='blue', first=FALSE)))
}

library(gridExtra)
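# Intervals are 10 minutes apart, so 00:00, 07:00, 13:00 and 19:00
# are the 1st, 43rd, 79th and 115th intervals of the day
grid.arrange(p[[1]], p[[43]], p[[79]], p[[115]])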

In summary, for the AoT Temperature Network on 2018-12-15,

Score 3 = 47.3 : At any given 10-minute time interval, an average of 47.3% of the nodes in the network are collecting reliable data. This is a low score.
Score 4 = 72.2 : At any given 10-minute time interval, reliable data is collected across an average of 72.2% of Chicago’s area. Read together with Score 3, this indicates that the relatively small number of active nodes is widely dispersed across Chicago.

Humidity

Constructing Score 3

Before we calculate the relevant proportions, it is useful to observe how the absolute number of nodes collecting reliable data in the network varies across the different time intervals of the day. The figure below shows how the number of nodes collecting reliable humidity data in the AoT network varies over the course of the day.

dfHumidity%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))->dfHumidity3

## Adding missing grouping variables: `date`

ggplot(data=dfHumidity3, aes(x=by10, y=count))+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  geom_col(fill='indianred', col=NA)+
  ylim(0, 86)+
  geom_hline(yintercept=86, col='black', size=1)+
  geom_hline(yintercept=c(1:85), col='white')+
  geom_vline(aes(xintercept=as.numeric(by10)), col='white', size=2)+
  geom_hline(aes(yintercept=mean(count)), col='black', size=1)+
  geom_text(aes(y=86, x=0), label='Full network size: 86 nodes', size=4, hjust=-1, vjust=-1)+
  geom_text(aes(y=mean(count), x=0), label=paste('Average network size:', round(mean(dfHumidity3$count)), 'nodes'), size=4, hjust=-1, vjust=-1)+
  labs(x='Time', y='Number of Active Nodes', 
       title='Number of nodes collecting reliable humidity data throughout the day',
       subtitle='Each x-axis tick represents a 10-minute time interval')+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

The table below shows the number of nodes collecting reliable humidity data at each time interval during the day, the proportion of these active nodes in relation to the full network of 86 nodes, and the average proportion during the full day. This average proportion is Score 3.

dfHumidity3%>%
  arrange(by10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

by10	count	propActive	Score 3
2018-12-15 00:00:00	18	20.93023	20.84141
2018-12-15 00:10:00	18	20.93023	20.84141
2018-12-15 00:20:00	18	20.93023	20.84141
2018-12-15 00:30:00	18	20.93023	20.84141
2018-12-15 00:40:00	18	20.93023	20.84141
2018-12-15 00:50:00	18	20.93023	20.84141
2018-12-15 01:00:00	18	20.93023	20.84141
2018-12-15 01:10:00	18	20.93023	20.84141
2018-12-15 01:20:00	18	20.93023	20.84141
2018-12-15 01:30:00	18	20.93023	20.84141
2018-12-15 01:40:00	18	20.93023	20.84141
2018-12-15 01:50:00	18	20.93023	20.84141
2018-12-15 02:00:00	18	20.93023	20.84141
2018-12-15 02:10:00	18	20.93023	20.84141
2018-12-15 02:20:00	18	20.93023	20.84141
2018-12-15 02:30:00	18	20.93023	20.84141
2018-12-15 02:40:00	18	20.93023	20.84141
2018-12-15 02:50:00	18	20.93023	20.84141
2018-12-15 03:00:00	18	20.93023	20.84141
2018-12-15 03:10:00	18	20.93023	20.84141
2018-12-15 03:20:00	18	20.93023	20.84141
2018-12-15 03:30:00	18	20.93023	20.84141
2018-12-15 03:40:00	18	20.93023	20.84141
2018-12-15 03:50:00	18	20.93023	20.84141
2018-12-15 04:00:00	18	20.93023	20.84141
2018-12-15 04:10:00	18	20.93023	20.84141
2018-12-15 04:20:00	18	20.93023	20.84141
2018-12-15 04:30:00	18	20.93023	20.84141
2018-12-15 04:40:00	18	20.93023	20.84141
2018-12-15 04:50:00	18	20.93023	20.84141
2018-12-15 05:00:00	18	20.93023	20.84141
2018-12-15 05:10:00	18	20.93023	20.84141
2018-12-15 05:20:00	18	20.93023	20.84141
2018-12-15 05:30:00	18	20.93023	20.84141
2018-12-15 05:40:00	18	20.93023	20.84141
2018-12-15 05:50:00	18	20.93023	20.84141
2018-12-15 06:00:00	18	20.93023	20.84141
2018-12-15 06:10:00	18	20.93023	20.84141
2018-12-15 06:20:00	18	20.93023	20.84141
2018-12-15 06:30:00	18	20.93023	20.84141
2018-12-15 06:40:00	18	20.93023	20.84141
2018-12-15 06:50:00	18	20.93023	20.84141
2018-12-15 07:00:00	18	20.93023	20.84141
2018-12-15 07:10:00	18	20.93023	20.84141
2018-12-15 07:20:00	18	20.93023	20.84141
2018-12-15 07:30:00	18	20.93023	20.84141
2018-12-15 07:40:00	18	20.93023	20.84141
2018-12-15 07:50:00	18	20.93023	20.84141
2018-12-15 08:00:00	18	20.93023	20.84141
2018-12-15 08:10:00	18	20.93023	20.84141
2018-12-15 08:20:00	18	20.93023	20.84141
2018-12-15 08:30:00	18	20.93023	20.84141
2018-12-15 08:40:00	18	20.93023	20.84141
2018-12-15 08:50:00	18	20.93023	20.84141
2018-12-15 09:00:00	18	20.93023	20.84141
2018-12-15 09:10:00	18	20.93023	20.84141
2018-12-15 09:20:00	18	20.93023	20.84141
2018-12-15 09:30:00	18	20.93023	20.84141
2018-12-15 09:40:00	18	20.93023	20.84141
2018-12-15 09:50:00	18	20.93023	20.84141
2018-12-15 10:00:00	18	20.93023	20.84141
2018-12-15 10:10:00	18	20.93023	20.84141
2018-12-15 10:20:00	18	20.93023	20.84141
2018-12-15 10:30:00	18	20.93023	20.84141
2018-12-15 10:40:00	18	20.93023	20.84141
2018-12-15 10:50:00	18	20.93023	20.84141
2018-12-15 11:00:00	18	20.93023	20.84141
2018-12-15 11:10:00	18	20.93023	20.84141
2018-12-15 11:20:00	18	20.93023	20.84141
2018-12-15 11:30:00	18	20.93023	20.84141
2018-12-15 11:40:00	18	20.93023	20.84141
2018-12-15 11:50:00	18	20.93023	20.84141
2018-12-15 12:00:00	18	20.93023	20.84141
2018-12-15 12:10:00	18	20.93023	20.84141
2018-12-15 12:20:00	18	20.93023	20.84141
2018-12-15 12:30:00	18	20.93023	20.84141
2018-12-15 12:40:00	18	20.93023	20.84141
2018-12-15 12:50:00	18	20.93023	20.84141
2018-12-15 13:00:00	18	20.93023	20.84141
2018-12-15 13:10:00	18	20.93023	20.84141
2018-12-15 13:20:00	18	20.93023	20.84141
2018-12-15 13:30:00	18	20.93023	20.84141
2018-12-15 13:40:00	18	20.93023	20.84141
2018-12-15 13:50:00	18	20.93023	20.84141
2018-12-15 14:00:00	18	20.93023	20.84141
2018-12-15 14:10:00	18	20.93023	20.84141
2018-12-15 14:20:00	18	20.93023	20.84141
2018-12-15 14:30:00	18	20.93023	20.84141
2018-12-15 14:40:00	18	20.93023	20.84141
2018-12-15 14:50:00	18	20.93023	20.84141
2018-12-15 15:00:00	17	19.76744	20.84141
2018-12-15 15:10:00	17	19.76744	20.84141
2018-12-15 15:20:00	17	19.76744	20.84141
2018-12-15 15:30:00	17	19.76744	20.84141
2018-12-15 15:40:00	17	19.76744	20.84141
2018-12-15 15:50:00	17	19.76744	20.84141
2018-12-15 16:00:00	17	19.76744	20.84141
2018-12-15 16:10:00	17	19.76744	20.84141
2018-12-15 16:20:00	17	19.76744	20.84141
2018-12-15 16:30:00	17	19.76744	20.84141
2018-12-15 16:40:00	17	19.76744	20.84141
2018-12-15 16:50:00	18	20.93023	20.84141
2018-12-15 17:00:00	18	20.93023	20.84141
2018-12-15 17:10:00	18	20.93023	20.84141
2018-12-15 17:20:00	18	20.93023	20.84141
2018-12-15 17:30:00	18	20.93023	20.84141
2018-12-15 17:40:00	18	20.93023	20.84141
2018-12-15 17:50:00	18	20.93023	20.84141
2018-12-15 18:00:00	18	20.93023	20.84141
2018-12-15 18:10:00	18	20.93023	20.84141
2018-12-15 18:20:00	18	20.93023	20.84141
2018-12-15 18:30:00	18	20.93023	20.84141
2018-12-15 18:40:00	18	20.93023	20.84141
2018-12-15 18:50:00	18	20.93023	20.84141
2018-12-15 19:00:00	18	20.93023	20.84141
2018-12-15 19:10:00	18	20.93023	20.84141
2018-12-15 19:20:00	18	20.93023	20.84141
2018-12-15 19:30:00	18	20.93023	20.84141
2018-12-15 19:40:00	18	20.93023	20.84141
2018-12-15 19:50:00	18	20.93023	20.84141
2018-12-15 20:00:00	18	20.93023	20.84141
2018-12-15 20:10:00	18	20.93023	20.84141
2018-12-15 20:20:00	18	20.93023	20.84141
2018-12-15 20:30:00	18	20.93023	20.84141
2018-12-15 20:40:00	18	20.93023	20.84141
2018-12-15 20:50:00	18	20.93023	20.84141
2018-12-15 21:00:00	18	20.93023	20.84141
2018-12-15 21:10:00	18	20.93023	20.84141
2018-12-15 21:20:00	18	20.93023	20.84141
2018-12-15 21:30:00	18	20.93023	20.84141
2018-12-15 21:40:00	18	20.93023	20.84141
2018-12-15 21:50:00	18	20.93023	20.84141
2018-12-15 22:00:00	18	20.93023	20.84141
2018-12-15 22:10:00	18	20.93023	20.84141
2018-12-15 22:20:00	18	20.93023	20.84141
2018-12-15 22:30:00	18	20.93023	20.84141
2018-12-15 22:40:00	18	20.93023	20.84141
2018-12-15 22:50:00	18	20.93023	20.84141
2018-12-15 23:00:00	18	20.93023	20.84141
2018-12-15 23:10:00	18	20.93023	20.84141
2018-12-15 23:20:00	18	20.93023	20.84141
2018-12-15 23:30:00	18	20.93023	20.84141
2018-12-15 23:40:00	18	20.93023	20.84141
2018-12-15 23:50:00	18	20.93023	20.84141

dfHumidity3%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 3`), size = 1)+
  geom_text(aes(x= `Score 3`, y=0), label='Score 3:\n Average Proportion of\nNetwork Active', size = 4, vjust= -2, hjust=1.4)+
  labs(x='Proportions of active nodes',
       y='Density',
       title='Distribution of Proportions of Active Nodes')+
  xlim(0, 100)+
  plotTheme()

Constructing Score 4

The table below lists the latitude and longitude of every node collecting reliable humidity data in each 10-minute interval. Only the first 41 rows are shown; the first 18 of these are the nodes active at 12 midnight on 2018-12-15.

dfHumidity%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> dfHumidity4

## Adding missing grouping variables: `date`

dfHumidity4%>%
  arrange(by10)%>%
  head(41)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped'))%>%
  scroll_box(height='300px')

date	by10	node_id	lat	lon
2018-12-15	2018-12-15 00:00:00	001e0610ee36	41.75129	-87.60529
2018-12-15	2018-12-15 00:00:00	001e0610ee43	41.78861	-87.59871
2018-12-15	2018-12-15 00:00:00	001e0610bc12	41.75034	-87.66352
2018-12-15	2018-12-15 00:00:00	001e06113f54	41.88461	-87.62458
2018-12-15	2018-12-15 00:00:00	001e061130f4	41.89616	-87.66239
2018-12-15	2018-12-15 00:00:00	001e06113cf1	41.88469	-87.62786
2018-12-15	2018-12-15 00:00:00	001e0610ee5d	41.92400	-87.76107
2018-12-15	2018-12-15 00:00:00	001e06113107	41.75114	-87.71299
2018-12-15	2018-12-15 00:00:00	001e06113dbc	41.71387	-87.53651
2018-12-15	2018-12-15 00:00:00	001e0610e532	41.85796	-87.65643
2018-12-15	2018-12-15 00:00:00	001e0610ba46	41.87838	-87.62768
2018-12-15	2018-12-15 00:00:00	001e0610ee33	41.96509	-87.67908
2018-12-15	2018-12-15 00:00:00	001e0610f732	41.89500	-87.74582
2018-12-15	2018-12-15 00:00:00	001e0610bbf9	41.76832	-87.68340
2018-12-15	2018-12-15 00:00:00	001e06113a48	41.94326	-87.68807
2018-12-15	2018-12-15 00:00:00	001e0610ba13	41.75124	-87.71299
2018-12-15	2018-12-15 00:00:00	001e0610e537	41.96162	-87.66595
2018-12-15	2018-12-15 00:00:00	001e0610f6db	41.79133	-87.59868
2018-12-15	2018-12-15 00:10:00	001e0610ee36	41.75129	-87.60529
2018-12-15	2018-12-15 00:10:00	001e06113f54	41.88461	-87.62458
2018-12-15	2018-12-15 00:10:00	001e0610e537	41.96162	-87.66595
2018-12-15	2018-12-15 00:10:00	001e06113107	41.75114	-87.71299
2018-12-15	2018-12-15 00:10:00	001e0610ee43	41.78861	-87.59871
2018-12-15	2018-12-15 00:10:00	001e061130f4	41.89616	-87.66239
2018-12-15	2018-12-15 00:10:00	001e0610f6db	41.79133	-87.59868
2018-12-15	2018-12-15 00:10:00	001e0610ee5d	41.92400	-87.76107
2018-12-15	2018-12-15 00:10:00	001e06113cf1	41.88469	-87.62786
2018-12-15	2018-12-15 00:10:00	001e0610e532	41.85796	-87.65643
2018-12-15	2018-12-15 00:10:00	001e06113dbc	41.71387	-87.53651
2018-12-15	2018-12-15 00:10:00	001e0610ba46	41.87838	-87.62768
2018-12-15	2018-12-15 00:10:00	001e0610f732	41.89500	-87.74582
2018-12-15	2018-12-15 00:10:00	001e0610bbf9	41.76832	-87.68340
2018-12-15	2018-12-15 00:10:00	001e0610ee33	41.96509	-87.67908
2018-12-15	2018-12-15 00:10:00	001e06113a48	41.94326	-87.68807
2018-12-15	2018-12-15 00:10:00	001e0610ba13	41.75124	-87.71299
2018-12-15	2018-12-15 00:10:00	001e0610bc12	41.75034	-87.66352
2018-12-15	2018-12-15 00:20:00	001e0610ba13	41.75124	-87.71299
2018-12-15	2018-12-15 00:20:00	001e0610ee36	41.75129	-87.60529
2018-12-15	2018-12-15 00:20:00	001e0610ee5d	41.92400	-87.76107
2018-12-15	2018-12-15 00:20:00	001e06113f54	41.88461	-87.62458
2018-12-15	2018-12-15 00:20:00	001e0610e537	41.96162	-87.66595

chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

dfHumidity4a<-NULL
for(i in unique(dfHumidity4$by10)){
  
subset <- 
      dfHumidity4%>%
      filter(by10==i)
    
subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    
P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    
Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

#clip using chicago
Ps2<-gIntersection(Ps2, chig, byid=FALSE)

Ps2AreaProp<-gArea(Ps2)/chigArea

df1<-NULL
df1$by10<-i
df1$AreaProp<-Ps2AreaProp
df1<-as.data.frame(df1)

dfHumidity4a<-rbind(dfHumidity4a, df1)
  
}

dfHumidity4a%>%
  mutate(`Score 4`= 100*mean(AreaProp))->dfHumidity4a
dfHumidity4a%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = 'striped')%>%
  scroll_box(height = "300px")

by10	AreaProp	Score 4
2018-12-15 00:00:00	0.5907152	59.07152
2018-12-15 00:10:00	0.5907152	59.07152
2018-12-15 00:20:00	0.5907152	59.07152
2018-12-15 00:30:00	0.5907152	59.07152
2018-12-15 00:40:00	0.5907152	59.07152
2018-12-15 00:50:00	0.5907152	59.07152
2018-12-15 01:00:00	0.5907152	59.07152
2018-12-15 01:10:00	0.5907152	59.07152
2018-12-15 01:20:00	0.5907152	59.07152
2018-12-15 01:30:00	0.5907152	59.07152
2018-12-15 01:40:00	0.5907152	59.07152
2018-12-15 01:50:00	0.5907152	59.07152
2018-12-15 02:00:00	0.5907152	59.07152
2018-12-15 02:10:00	0.5907152	59.07152
2018-12-15 02:20:00	0.5907152	59.07152
2018-12-15 02:30:00	0.5907152	59.07152
2018-12-15 02:40:00	0.5907152	59.07152
2018-12-15 02:50:00	0.5907152	59.07152
2018-12-15 03:00:00	0.5907152	59.07152
2018-12-15 03:10:00	0.5907152	59.07152
2018-12-15 03:20:00	0.5907152	59.07152
2018-12-15 03:30:00	0.5907152	59.07152
2018-12-15 03:40:00	0.5907152	59.07152
2018-12-15 03:50:00	0.5907152	59.07152
2018-12-15 04:00:00	0.5907152	59.07152
2018-12-15 04:10:00	0.5907152	59.07152
2018-12-15 04:20:00	0.5907152	59.07152
2018-12-15 04:30:00	0.5907152	59.07152
2018-12-15 04:40:00	0.5907152	59.07152
2018-12-15 04:50:00	0.5907152	59.07152
2018-12-15 05:00:00	0.5907152	59.07152
2018-12-15 05:10:00	0.5907152	59.07152
2018-12-15 05:20:00	0.5907152	59.07152
2018-12-15 05:30:00	0.5907152	59.07152
2018-12-15 05:40:00	0.5907152	59.07152
2018-12-15 05:50:00	0.5907152	59.07152
2018-12-15 06:00:00	0.5907152	59.07152
2018-12-15 06:10:00	0.5907152	59.07152
2018-12-15 06:20:00	0.5907152	59.07152
2018-12-15 06:30:00	0.5907152	59.07152
2018-12-15 06:40:00	0.5907152	59.07152
2018-12-15 06:50:00	0.5907152	59.07152
2018-12-15 07:00:00	0.5907152	59.07152
2018-12-15 07:10:00	0.5907152	59.07152
2018-12-15 07:20:00	0.5907152	59.07152
2018-12-15 07:30:00	0.5907152	59.07152
2018-12-15 07:40:00	0.5907152	59.07152
2018-12-15 07:50:00	0.5907152	59.07152
2018-12-15 08:00:00	0.5907152	59.07152
2018-12-15 08:10:00	0.5907152	59.07152
2018-12-15 08:20:00	0.5907152	59.07152
2018-12-15 08:30:00	0.5907152	59.07152
2018-12-15 08:40:00	0.5907152	59.07152
2018-12-15 08:50:00	0.5907152	59.07152
2018-12-15 09:00:00	0.5907152	59.07152
2018-12-15 09:10:00	0.5907152	59.07152
2018-12-15 09:20:00	0.5907152	59.07152
2018-12-15 09:30:00	0.5907152	59.07152
2018-12-15 09:40:00	0.5907152	59.07152
2018-12-15 09:50:00	0.5907152	59.07152
2018-12-15 10:00:00	0.5907152	59.07152
2018-12-15 10:10:00	0.5907152	59.07152
2018-12-15 10:20:00	0.5907152	59.07152
2018-12-15 10:30:00	0.5907152	59.07152
2018-12-15 10:40:00	0.5907152	59.07152
2018-12-15 10:50:00	0.5907152	59.07152
2018-12-15 11:00:00	0.5907152	59.07152
2018-12-15 11:10:00	0.5907152	59.07152
2018-12-15 11:20:00	0.5907152	59.07152
2018-12-15 11:30:00	0.5907152	59.07152
2018-12-15 11:40:00	0.5907152	59.07152
2018-12-15 11:50:00	0.5907152	59.07152
2018-12-15 12:00:00	0.5907152	59.07152
2018-12-15 12:10:00	0.5907152	59.07152
2018-12-15 12:20:00	0.5907152	59.07152
2018-12-15 12:30:00	0.5907152	59.07152
2018-12-15 12:40:00	0.5907152	59.07152
2018-12-15 12:50:00	0.5907152	59.07152
2018-12-15 13:00:00	0.5907152	59.07152
2018-12-15 13:10:00	0.5907152	59.07152
2018-12-15 13:20:00	0.5907152	59.07152
2018-12-15 13:30:00	0.5907152	59.07152
2018-12-15 13:40:00	0.5907152	59.07152
2018-12-15 13:50:00	0.5907152	59.07152
2018-12-15 14:00:00	0.5907152	59.07152
2018-12-15 14:10:00	0.5907152	59.07152
2018-12-15 14:20:00	0.5907152	59.07152
2018-12-15 14:30:00	0.5907152	59.07152
2018-12-15 14:40:00	0.5907152	59.07152
2018-12-15 14:50:00	0.5907152	59.07152
2018-12-15 15:00:00	0.5907152	59.07152
2018-12-15 15:10:00	0.5907152	59.07152
2018-12-15 15:20:00	0.5907152	59.07152
2018-12-15 15:30:00	0.5907152	59.07152
2018-12-15 15:40:00	0.5907152	59.07152
2018-12-15 15:50:00	0.5907152	59.07152
2018-12-15 16:00:00	0.5907152	59.07152
2018-12-15 16:10:00	0.5907152	59.07152
2018-12-15 16:20:00	0.5907152	59.07152
2018-12-15 16:30:00	0.5907152	59.07152
2018-12-15 16:40:00	0.5907152	59.07152
2018-12-15 16:50:00	0.5907152	59.07152
2018-12-15 17:00:00	0.5907152	59.07152
2018-12-15 17:10:00	0.5907152	59.07152
2018-12-15 17:20:00	0.5907152	59.07152
2018-12-15 17:30:00	0.5907152	59.07152
2018-12-15 17:40:00	0.5907152	59.07152
2018-12-15 17:50:00	0.5907152	59.07152
2018-12-15 18:00:00	0.5907152	59.07152
2018-12-15 18:10:00	0.5907152	59.07152
2018-12-15 18:20:00	0.5907152	59.07152
2018-12-15 18:30:00	0.5907152	59.07152
2018-12-15 18:40:00	0.5907152	59.07152
2018-12-15 18:50:00	0.5907152	59.07152
2018-12-15 19:00:00	0.5907152	59.07152
2018-12-15 19:10:00	0.5907152	59.07152
2018-12-15 19:20:00	0.5907152	59.07152
2018-12-15 19:30:00	0.5907152	59.07152
2018-12-15 19:40:00	0.5907152	59.07152
2018-12-15 19:50:00	0.5907152	59.07152
2018-12-15 20:00:00	0.5907152	59.07152
2018-12-15 20:10:00	0.5907152	59.07152
2018-12-15 20:20:00	0.5907152	59.07152
2018-12-15 20:30:00	0.5907152	59.07152
2018-12-15 20:40:00	0.5907152	59.07152
2018-12-15 20:50:00	0.5907152	59.07152
2018-12-15 21:00:00	0.5907152	59.07152
2018-12-15 21:10:00	0.5907152	59.07152
2018-12-15 21:20:00	0.5907152	59.07152
2018-12-15 21:30:00	0.5907152	59.07152
2018-12-15 21:40:00	0.5907152	59.07152
2018-12-15 21:50:00	0.5907152	59.07152
2018-12-15 22:00:00	0.5907152	59.07152
2018-12-15 22:10:00	0.5907152	59.07152
2018-12-15 22:20:00	0.5907152	59.07152
2018-12-15 22:30:00	0.5907152	59.07152
2018-12-15 22:40:00	0.5907152	59.07152
2018-12-15 22:50:00	0.5907152	59.07152
2018-12-15 23:00:00	0.5907152	59.07152
2018-12-15 23:10:00	0.5907152	59.07152
2018-12-15 23:20:00	0.5907152	59.07152
2018-12-15 23:30:00	0.5907152	59.07152
2018-12-15 23:40:00	0.5907152	59.07152
2018-12-15 23:50:00	0.5907152	59.07152
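
The area calculation in the loop above can be sketched without the spatial packages. The following is a minimal base-R illustration, assuming the study area is itself an axis-aligned rectangle (Chicago's boundary is not, which is why the original code clips with gIntersection). The function name bbox_area_prop, the study-area extents, and the node coordinates are all hypothetical.

```r
# Hypothetical helper: share of a rectangular study area covered by the
# bounding box of a set of projected node coordinates.
bbox_area_prop <- function(x, y, study) {
  # Bounding box of the active nodes
  xmin <- min(x); xmax <- max(x)
  ymin <- min(y); ymax <- max(y)

  # Clip the bounding box to the study rectangle (zero if they do not overlap)
  w <- max(0, min(xmax, study$xmax) - max(xmin, study$xmin))
  h <- max(0, min(ymax, study$ymax) - max(ymin, study$ymin))

  study_area <- (study$xmax - study$xmin) * (study$ymax - study$ymin)
  (w * h) / study_area
}

# Made-up projected coordinates, not AoT data
study  <- list(xmin = 0, xmax = 100, ymin = 0, ymax = 200)
node_x <- c(10, 40, 90, 120)   # the last node lies outside the study area
node_y <- c(20, 150, 60, 100)

area_prop <- bbox_area_prop(node_x, node_y, study)
area_prop
```

Averaging such proportions over all intervals and multiplying by 100 gives a value on the same scale as Score 4.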

We can also plot the area covered relative to the whole of Chicago for a visual check. The 4 plots below are obtained for 00:00 (top left), 07:00 (top right), 13:00 (bottom left), and 19:00 (bottom right). In the case of the humidity network on this day, the area covered remained constant throughout the day - the 4 plots are therefore identical.

p<-list()

by10<-as.data.frame(unique(dfHumidity4$by10))
by10$no<-row_number(by10)
colnames(by10)<-c('by10', 'no')

dfHumidity4b<-merge(dfHumidity4, by10, by='by10', all.x=TRUE)

for(i in unique(dfHumidity4b$by10)){

subset <- 
      dfHumidity4b%>%
      filter(by10==i)
    
subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    
P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    
Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

#clip using chicago
Ps2<-gIntersection(Ps2, chig, byid=FALSE)

j<-unique(subset$no)

p[[j]]<-spplot(chig, colorkey=FALSE, col.regions='red', 
       sp.layout=list(list(Ps2, fill='blue', first=FALSE)))
}

library(gridExtra)
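# p is indexed by 10-minute interval (assuming `no` follows chronological order):
# 1 = 00:00, 43 = 07:00, 79 = 13:00, 115 = 19:00
grid.arrange(p[[1]], p[[43]], p[[79]], p[[115]], ncol=2)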
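
One reason the covered area can stay constant while the node count changes (the Score 3 table shows 17 rather than 18 active humidity nodes between 15:00 and 16:40) is that a bounding box depends only on its outermost nodes. The sketch below checks this on the 18 humidity node locations listed for 00:00 in the table above. It works on unprojected latitude and longitude extents rather than the projected, clipped polygon, the helper name bbox_of is hypothetical, and the source does not state which node went offline, so every interior node is tried in turn.

```r
# Humidity nodes reporting reliable data at 2018-12-15 00:00:00 (from the table above)
nodes <- data.frame(
  node_id = c("001e0610ee36", "001e0610ee43", "001e0610bc12", "001e06113f54",
              "001e061130f4", "001e06113cf1", "001e0610ee5d", "001e06113107",
              "001e06113dbc", "001e0610e532", "001e0610ba46", "001e0610ee33",
              "001e0610f732", "001e0610bbf9", "001e06113a48", "001e0610ba13",
              "001e0610e537", "001e0610f6db"),
  lat = c(41.75129, 41.78861, 41.75034, 41.88461, 41.89616, 41.88469,
          41.92400, 41.75114, 41.71387, 41.85796, 41.87838, 41.96509,
          41.89500, 41.76832, 41.94326, 41.75124, 41.96162, 41.79133),
  lon = c(-87.60529, -87.59871, -87.66352, -87.62458, -87.66239, -87.62786,
          -87.76107, -87.71299, -87.53651, -87.65643, -87.62768, -87.67908,
          -87.74582, -87.68340, -87.68807, -87.71299, -87.66595, -87.59868),
  stringsAsFactors = FALSE
)

# Hypothetical helper: lat/lon extent of a set of points
bbox_of <- function(df) {
  c(lon_min = min(df$lon), lon_max = max(df$lon),
    lat_min = min(df$lat), lat_max = max(df$lat))
}

full_box <- bbox_of(nodes)

# Nodes that sit on an edge of the bounding box
edge_nodes <- nodes$node_id[
  nodes$lon %in% full_box[c("lon_min", "lon_max")] |
  nodes$lat %in% full_box[c("lat_min", "lat_max")]
]
edge_nodes

# Dropping any single non-edge node leaves the extent unchanged
interior  <- setdiff(nodes$node_id, edge_nodes)
unchanged <- sapply(interior, function(id) {
  identical(bbox_of(nodes[nodes$node_id != id, ]), full_box)
})
all(unchanged)
```

Only three nodes define the extent here, so losing any of the other fifteen would not change the bounding box in this unprojected check.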

In summary, for the AoT humidity network on 2018-12-15,

Score 3 = 20.8: In any given 10-minute interval, an average of 20.8% of the nodes in the network are collecting reliable data. This is a low score.
Score 4 = 59.1: In any given 10-minute interval, the bounding box of the nodes collecting reliable data covers an average of 59.1% of Chicago’s area. Read together with Score 3, this indicates that the small number of active nodes is widely dispersed across Chicago.

Pressure

Constructing Score 3

Before we calculate the relevant proportions, it is useful to observe how the absolute number of nodes collecting reliable data in the network varies across the time intervals of the day. The figure below shows how the number of nodes collecting reliable pressure data in the AoT network varies over the course of the day.

dfPressure%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))->dfPressure3

## Adding missing grouping variables: `date`

ggplot(data=dfPressure3, aes(x=by10, y=count))+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  geom_col(fill='indianred', col=NA)+
  ylim(0, 86)+
  geom_hline(yintercept=86, col='black', size=1)+
  geom_hline(yintercept=c(1:85), col='white')+
  geom_vline(aes(xintercept=as.numeric(by10)), col='white', size=2)+
  geom_hline(aes(yintercept=mean(count)), col='black', size=1)+
  geom_text(aes(y=86, x=0), label='Full network size: 86 nodes', size=4, hjust=-1, vjust=-1)+
  geom_text(aes(y=mean(count), x=0), label=paste('Average network size:', round(mean(dfPressure3$count)), 'nodes'), size=4, hjust=-1, vjust=-1)+
  labs(x='Time', y='Number of Active Nodes', 
       title='Number of nodes collecting reliable pressure data throughout the day',
       subtitle='Each x-axis tick represents a 10-minute time interval')+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

The table below shows the number of nodes collecting reliable pressure data at each time interval during the day, the proportion of these active nodes in relation to the full network of 86 nodes, and the average proportion during the full day. This average proportion is Score 3.

dfPressure3%>%
  arrange(by10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

by10	count	propActive	Score 3
2018-12-15 00:00:00	24	27.90698	27.64858
2018-12-15 00:10:00	24	27.90698	27.64858
2018-12-15 00:20:00	24	27.90698	27.64858
2018-12-15 00:30:00	24	27.90698	27.64858
2018-12-15 00:40:00	24	27.90698	27.64858
2018-12-15 00:50:00	24	27.90698	27.64858
2018-12-15 01:00:00	24	27.90698	27.64858
2018-12-15 01:10:00	24	27.90698	27.64858
2018-12-15 01:20:00	23	26.74419	27.64858
2018-12-15 01:30:00	23	26.74419	27.64858
2018-12-15 01:40:00	23	26.74419	27.64858
2018-12-15 01:50:00	23	26.74419	27.64858
2018-12-15 02:00:00	23	26.74419	27.64858
2018-12-15 02:10:00	23	26.74419	27.64858
2018-12-15 02:20:00	23	26.74419	27.64858
2018-12-15 02:30:00	23	26.74419	27.64858
2018-12-15 02:40:00	23	26.74419	27.64858
2018-12-15 02:50:00	23	26.74419	27.64858
2018-12-15 03:00:00	23	26.74419	27.64858
2018-12-15 03:10:00	24	27.90698	27.64858
2018-12-15 03:20:00	24	27.90698	27.64858
2018-12-15 03:30:00	24	27.90698	27.64858
2018-12-15 03:40:00	24	27.90698	27.64858
2018-12-15 03:50:00	24	27.90698	27.64858
2018-12-15 04:00:00	24	27.90698	27.64858
2018-12-15 04:10:00	24	27.90698	27.64858
2018-12-15 04:20:00	24	27.90698	27.64858
2018-12-15 04:30:00	24	27.90698	27.64858
2018-12-15 04:40:00	24	27.90698	27.64858
2018-12-15 04:50:00	24	27.90698	27.64858
2018-12-15 05:00:00	24	27.90698	27.64858
2018-12-15 05:10:00	24	27.90698	27.64858
2018-12-15 05:20:00	24	27.90698	27.64858
2018-12-15 05:30:00	24	27.90698	27.64858
2018-12-15 05:40:00	24	27.90698	27.64858
2018-12-15 05:50:00	24	27.90698	27.64858
2018-12-15 06:00:00	24	27.90698	27.64858
2018-12-15 06:10:00	24	27.90698	27.64858
2018-12-15 06:20:00	24	27.90698	27.64858
2018-12-15 06:30:00	24	27.90698	27.64858
2018-12-15 06:40:00	24	27.90698	27.64858
2018-12-15 06:50:00	24	27.90698	27.64858
2018-12-15 07:00:00	24	27.90698	27.64858
2018-12-15 07:10:00	24	27.90698	27.64858
2018-12-15 07:20:00	24	27.90698	27.64858
2018-12-15 07:30:00	24	27.90698	27.64858
2018-12-15 07:40:00	24	27.90698	27.64858
2018-12-15 07:50:00	24	27.90698	27.64858
2018-12-15 08:00:00	24	27.90698	27.64858
2018-12-15 08:10:00	24	27.90698	27.64858
2018-12-15 08:20:00	24	27.90698	27.64858
2018-12-15 08:30:00	24	27.90698	27.64858
2018-12-15 08:40:00	24	27.90698	27.64858
2018-12-15 08:50:00	24	27.90698	27.64858
2018-12-15 09:00:00	24	27.90698	27.64858
2018-12-15 09:10:00	24	27.90698	27.64858
2018-12-15 09:20:00	24	27.90698	27.64858
2018-12-15 09:30:00	24	27.90698	27.64858
2018-12-15 09:40:00	24	27.90698	27.64858
2018-12-15 09:50:00	24	27.90698	27.64858
2018-12-15 10:00:00	24	27.90698	27.64858
2018-12-15 10:10:00	24	27.90698	27.64858
2018-12-15 10:20:00	24	27.90698	27.64858
2018-12-15 10:30:00	24	27.90698	27.64858
2018-12-15 10:40:00	24	27.90698	27.64858
2018-12-15 10:50:00	24	27.90698	27.64858
2018-12-15 11:00:00	24	27.90698	27.64858
2018-12-15 11:10:00	24	27.90698	27.64858
2018-12-15 11:20:00	24	27.90698	27.64858
2018-12-15 11:30:00	24	27.90698	27.64858
2018-12-15 11:40:00	24	27.90698	27.64858
2018-12-15 11:50:00	24	27.90698	27.64858
2018-12-15 12:00:00	24	27.90698	27.64858
2018-12-15 12:10:00	24	27.90698	27.64858
2018-12-15 12:20:00	24	27.90698	27.64858
2018-12-15 12:30:00	24	27.90698	27.64858
2018-12-15 12:40:00	24	27.90698	27.64858
2018-12-15 12:50:00	24	27.90698	27.64858
2018-12-15 13:00:00	24	27.90698	27.64858
2018-12-15 13:10:00	24	27.90698	27.64858
2018-12-15 13:20:00	24	27.90698	27.64858
2018-12-15 13:30:00	24	27.90698	27.64858
2018-12-15 13:40:00	24	27.90698	27.64858
2018-12-15 13:50:00	24	27.90698	27.64858
2018-12-15 14:00:00	24	27.90698	27.64858
2018-12-15 14:10:00	24	27.90698	27.64858
2018-12-15 14:20:00	24	27.90698	27.64858
2018-12-15 14:30:00	24	27.90698	27.64858
2018-12-15 14:40:00	24	27.90698	27.64858
2018-12-15 14:50:00	24	27.90698	27.64858
2018-12-15 15:00:00	23	26.74419	27.64858
2018-12-15 15:10:00	22	25.58140	27.64858
2018-12-15 15:20:00	22	25.58140	27.64858
2018-12-15 15:30:00	22	25.58140	27.64858
2018-12-15 15:40:00	22	25.58140	27.64858
2018-12-15 15:50:00	22	25.58140	27.64858
2018-12-15 16:00:00	22	25.58140	27.64858
2018-12-15 16:10:00	22	25.58140	27.64858
2018-12-15 16:20:00	22	25.58140	27.64858
2018-12-15 16:30:00	22	25.58140	27.64858
2018-12-15 16:40:00	22	25.58140	27.64858
2018-12-15 16:50:00	24	27.90698	27.64858
2018-12-15 17:00:00	24	27.90698	27.64858
2018-12-15 17:10:00	24	27.90698	27.64858
2018-12-15 17:20:00	24	27.90698	27.64858
2018-12-15 17:30:00	24	27.90698	27.64858
2018-12-15 17:40:00	24	27.90698	27.64858
2018-12-15 17:50:00	24	27.90698	27.64858
2018-12-15 18:00:00	24	27.90698	27.64858
2018-12-15 18:10:00	24	27.90698	27.64858
2018-12-15 18:20:00	24	27.90698	27.64858
2018-12-15 18:30:00	24	27.90698	27.64858
2018-12-15 18:40:00	24	27.90698	27.64858
2018-12-15 18:50:00	24	27.90698	27.64858
2018-12-15 19:00:00	24	27.90698	27.64858
2018-12-15 19:10:00	24	27.90698	27.64858
2018-12-15 19:20:00	24	27.90698	27.64858
2018-12-15 19:30:00	24	27.90698	27.64858
2018-12-15 19:40:00	24	27.90698	27.64858
2018-12-15 19:50:00	24	27.90698	27.64858
2018-12-15 20:00:00	24	27.90698	27.64858
2018-12-15 20:10:00	24	27.90698	27.64858
2018-12-15 20:20:00	24	27.90698	27.64858
2018-12-15 20:30:00	24	27.90698	27.64858
2018-12-15 20:40:00	24	27.90698	27.64858
2018-12-15 20:50:00	24	27.90698	27.64858
2018-12-15 21:00:00	24	27.90698	27.64858
2018-12-15 21:10:00	24	27.90698	27.64858
2018-12-15 21:20:00	24	27.90698	27.64858
2018-12-15 21:30:00	24	27.90698	27.64858
2018-12-15 21:40:00	24	27.90698	27.64858
2018-12-15 21:50:00	24	27.90698	27.64858
2018-12-15 22:00:00	24	27.90698	27.64858
2018-12-15 22:10:00	24	27.90698	27.64858
2018-12-15 22:20:00	24	27.90698	27.64858
2018-12-15 22:30:00	24	27.90698	27.64858
2018-12-15 22:40:00	24	27.90698	27.64858
2018-12-15 22:50:00	24	27.90698	27.64858
2018-12-15 23:00:00	24	27.90698	27.64858
2018-12-15 23:10:00	24	27.90698	27.64858
2018-12-15 23:20:00	24	27.90698	27.64858
2018-12-15 23:30:00	24	27.90698	27.64858
2018-12-15 23:40:00	24	27.90698	27.64858
2018-12-15 23:50:00	24	27.90698	27.64858

dfPressure3%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 3`), size = 1)+
  geom_text(aes(x= `Score 3`, y=0), label='Score 3:\n Average Proportion of\nNetwork Active', size = 4, vjust= -2, hjust=1.4)+
  labs(x='Proportions of active nodes',
       y='Density',
       title='Distribution of Proportions of Active Nodes')+
  xlim(0, 100)+
  plotTheme()
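
As a check on the table above, Score 3 can be reproduced from the run lengths of the node counts. The counts and run lengths below are read off the pressure table (runs of 24, 23, and 22 nodes in chronological order); the variable names are illustrative only.

```r
network_size <- 86  # full AoT network size used in the original code

# Node counts per 10-minute interval, as consecutive runs read from the table:
# 00:00-01:10 (24), 01:20-03:00 (23), 03:10-14:50 (24),
# 15:00 (23), 15:10-16:40 (22), 16:50-23:50 (24)
counts <- rep(c(24, 23, 24, 23, 22, 24),
              times = c(8, 11, 71, 1, 10, 43))

length(counts)                               # 144 intervals in the day
prop_active <- counts * 100 / network_size   # same formula as propActive
score3 <- mean(prop_active)
round(score3, 5)                             # 27.64858
```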

Constructing Score 4

The table below lists the latitude and longitude of every node collecting reliable pressure data in each 10-minute interval. Only the first 41 rows are shown; the first 24 of these are the nodes active at 12 midnight on 2018-12-15.

dfPressure%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> dfPressure4

## Adding missing grouping variables: `date`

dfPressure4%>%
  arrange(by10)%>%
  head(41)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped'))%>%
  scroll_box(height='300px')

date	by10	node_id	lat	lon
2018-12-15	2018-12-15 00:00:00	001e0610ee36	41.75129	-87.60529
2018-12-15	2018-12-15 00:00:00	001e0610ee43	41.78861	-87.59871
2018-12-15	2018-12-15 00:00:00	001e06113f54	41.88461	-87.62458
2018-12-15	2018-12-15 00:00:00	001e0611537d	41.79417	-87.60165
2018-12-15	2018-12-15 00:00:00	001e061144c0	41.76412	-87.72242
2018-12-15	2018-12-15 00:00:00	001e061130f4	41.89616	-87.66239
2018-12-15	2018-12-15 00:00:00	001e06113cf1	41.88469	-87.62786
2018-12-15	2018-12-15 00:00:00	001e0610f05c	41.92490	-87.68770
2018-12-15	2018-12-15 00:00:00	001e0610ee5d	41.92400	-87.76107
2018-12-15	2018-12-15 00:00:00	001e06113107	41.75114	-87.71299
2018-12-15	2018-12-15 00:00:00	001e06113dbc	41.71387	-87.53651
2018-12-15	2018-12-15 00:00:00	001e0610e532	41.85796	-87.65643
2018-12-15	2018-12-15 00:00:00	001e0610ba46	41.87838	-87.62768
2018-12-15	2018-12-15 00:00:00	001e0610ee33	41.96509	-87.67908
2018-12-15	2018-12-15 00:00:00	001e0610e538	41.73659	-87.60476
2018-12-15	2018-12-15 00:00:00	001e0610f732	41.89500	-87.74582
2018-12-15	2018-12-15 00:00:00	001e0610bc10	41.73631	-87.62418
2018-12-15	2018-12-15 00:00:00	001e0610eef4	41.91268	-87.68105
2018-12-15	2018-12-15 00:00:00	001e0610bbf9	41.76832	-87.68340
2018-12-15	2018-12-15 00:00:00	001e06113a48	41.94326	-87.68807
2018-12-15	2018-12-15 00:00:00	001e0610ba15	41.72246	-87.57535
2018-12-15	2018-12-15 00:00:00	001e0610ba13	41.75124	-87.71299
2018-12-15	2018-12-15 00:00:00	001e0610e537	41.96162	-87.66595
2018-12-15	2018-12-15 00:00:00	001e0610f6db	41.79133	-87.59868
2018-12-15	2018-12-15 00:10:00	001e0610ba15	41.72246	-87.57535
2018-12-15	2018-12-15 00:10:00	001e0610ee36	41.75129	-87.60529
2018-12-15	2018-12-15 00:10:00	001e06113f54	41.88461	-87.62458
2018-12-15	2018-12-15 00:10:00	001e0610e537	41.96162	-87.66595
2018-12-15	2018-12-15 00:10:00	001e06113107	41.75114	-87.71299
2018-12-15	2018-12-15 00:10:00	001e0610ee43	41.78861	-87.59871
2018-12-15	2018-12-15 00:10:00	001e061130f4	41.89616	-87.66239
2018-12-15	2018-12-15 00:10:00	001e0610e538	41.73659	-87.60476
2018-12-15	2018-12-15 00:10:00	001e0610f6db	41.79133	-87.59868
2018-12-15	2018-12-15 00:10:00	001e061144c0	41.76412	-87.72242
2018-12-15	2018-12-15 00:10:00	001e0610ee5d	41.92400	-87.76107
2018-12-15	2018-12-15 00:10:00	001e06113cf1	41.88469	-87.62786
2018-12-15	2018-12-15 00:10:00	001e0610e532	41.85796	-87.65643
2018-12-15	2018-12-15 00:10:00	001e06113dbc	41.71387	-87.53651
2018-12-15	2018-12-15 00:10:00	001e0610ba46	41.87838	-87.62768
2018-12-15	2018-12-15 00:10:00	001e0610eef4	41.91268	-87.68105
2018-12-15	2018-12-15 00:10:00	001e0611537d	41.79417	-87.60165

chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

dfPressure4a<-NULL
for(i in unique(dfPressure4$by10)){
  
subset <- 
      dfPressure4%>%
      filter(by10==i)
    
subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    
P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    
Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

#clip using chicago
Ps2<-gIntersection(Ps2, chig, byid=FALSE)

Ps2AreaProp<-gArea(Ps2)/chigArea

df1<-NULL
df1$by10<-i
df1$AreaProp<-Ps2AreaProp
df1<-as.data.frame(df1)

dfPressure4a<-rbind(dfPressure4a, df1)
  
}

dfPressure4a%>%
  mutate(`Score 4`= 100*mean(AreaProp))->dfPressure4a
dfPressure4a%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = 'striped')%>%
  scroll_box(height = "300px")

by10	AreaProp	Score 4
2018-12-15 00:00:00	0.5907152	59.07152
2018-12-15 00:10:00	0.5907152	59.07152
2018-12-15 00:20:00	0.5907152	59.07152
2018-12-15 00:30:00	0.5907152	59.07152
2018-12-15 00:40:00	0.5907152	59.07152
2018-12-15 00:50:00	0.5907152	59.07152
2018-12-15 01:00:00	0.5907152	59.07152
2018-12-15 01:10:00	0.5907152	59.07152
2018-12-15 01:20:00	0.5907152	59.07152
2018-12-15 01:30:00	0.5907152	59.07152
2018-12-15 01:40:00	0.5907152	59.07152
2018-12-15 01:50:00	0.5907152	59.07152
2018-12-15 02:00:00	0.5907152	59.07152
2018-12-15 02:10:00	0.5907152	59.07152
2018-12-15 02:20:00	0.5907152	59.07152
2018-12-15 02:30:00	0.5907152	59.07152
2018-12-15 02:40:00	0.5907152	59.07152
2018-12-15 02:50:00	0.5907152	59.07152
2018-12-15 03:00:00	0.5907152	59.07152
2018-12-15 03:10:00	0.5907152	59.07152
2018-12-15 03:20:00	0.5907152	59.07152
2018-12-15 03:30:00	0.5907152	59.07152
2018-12-15 03:40:00	0.5907152	59.07152
2018-12-15 03:50:00	0.5907152	59.07152
2018-12-15 04:00:00	0.5907152	59.07152
2018-12-15 04:10:00	0.5907152	59.07152
2018-12-15 04:20:00	0.5907152	59.07152
2018-12-15 04:30:00	0.5907152	59.07152
2018-12-15 04:40:00	0.5907152	59.07152
2018-12-15 04:50:00	0.5907152	59.07152
2018-12-15 05:00:00	0.5907152	59.07152
2018-12-15 05:10:00	0.5907152	59.07152
2018-12-15 05:20:00	0.5907152	59.07152
2018-12-15 05:30:00	0.5907152	59.07152
2018-12-15 05:40:00	0.5907152	59.07152
2018-12-15 05:50:00	0.5907152	59.07152
2018-12-15 06:00:00	0.5907152	59.07152
2018-12-15 06:10:00	0.5907152	59.07152
2018-12-15 06:20:00	0.5907152	59.07152
2018-12-15 06:30:00	0.5907152	59.07152
2018-12-15 06:40:00	0.5907152	59.07152
2018-12-15 06:50:00	0.5907152	59.07152
2018-12-15 07:00:00	0.5907152	59.07152
2018-12-15 07:10:00	0.5907152	59.07152
2018-12-15 07:20:00	0.5907152	59.07152
2018-12-15 07:30:00	0.5907152	59.07152
2018-12-15 07:40:00	0.5907152	59.07152
2018-12-15 07:50:00	0.5907152	59.07152
2018-12-15 08:00:00	0.5907152	59.07152
2018-12-15 08:10:00	0.5907152	59.07152
2018-12-15 08:20:00	0.5907152	59.07152
2018-12-15 08:30:00	0.5907152	59.07152
2018-12-15 08:40:00	0.5907152	59.07152
2018-12-15 08:50:00	0.5907152	59.07152
2018-12-15 09:00:00	0.5907152	59.07152
2018-12-15 09:10:00	0.5907152	59.07152
2018-12-15 09:20:00	0.5907152	59.07152
2018-12-15 09:30:00	0.5907152	59.07152
2018-12-15 09:40:00	0.5907152	59.07152
2018-12-15 09:50:00	0.5907152	59.07152
2018-12-15 10:00:00	0.5907152	59.07152
2018-12-15 10:10:00	0.5907152	59.07152
2018-12-15 10:20:00	0.5907152	59.07152
2018-12-15 10:30:00	0.5907152	59.07152
2018-12-15 10:40:00	0.5907152	59.07152
2018-12-15 10:50:00	0.5907152	59.07152
2018-12-15 11:00:00	0.5907152	59.07152
2018-12-15 11:10:00	0.5907152	59.07152
2018-12-15 11:20:00	0.5907152	59.07152
2018-12-15 11:30:00	0.5907152	59.07152
2018-12-15 11:40:00	0.5907152	59.07152
2018-12-15 11:50:00	0.5907152	59.07152
2018-12-15 12:00:00	0.5907152	59.07152
2018-12-15 12:10:00	0.5907152	59.07152
2018-12-15 12:20:00	0.5907152	59.07152
2018-12-15 12:30:00	0.5907152	59.07152
2018-12-15 12:40:00	0.5907152	59.07152
2018-12-15 12:50:00	0.5907152	59.07152
2018-12-15 13:00:00	0.5907152	59.07152
2018-12-15 13:10:00	0.5907152	59.07152
2018-12-15 13:20:00	0.5907152	59.07152
2018-12-15 13:30:00	0.5907152	59.07152
2018-12-15 13:40:00	0.5907152	59.07152
2018-12-15 13:50:00	0.5907152	59.07152
2018-12-15 14:00:00	0.5907152	59.07152
2018-12-15 14:10:00	0.5907152	59.07152
2018-12-15 14:20:00	0.5907152	59.07152
2018-12-15 14:30:00	0.5907152	59.07152
2018-12-15 14:40:00	0.5907152	59.07152
2018-12-15 14:50:00	0.5907152	59.07152
2018-12-15 15:00:00	0.5907152	59.07152
2018-12-15 15:10:00	0.5907152	59.07152
2018-12-15 15:20:00	0.5907152	59.07152
2018-12-15 15:30:00	0.5907152	59.07152
2018-12-15 15:40:00	0.5907152	59.07152
2018-12-15 15:50:00	0.5907152	59.07152
2018-12-15 16:00:00	0.5907152	59.07152
2018-12-15 16:10:00	0.5907152	59.07152
2018-12-15 16:20:00	0.5907152	59.07152
2018-12-15 16:30:00	0.5907152	59.07152
2018-12-15 16:40:00	0.5907152	59.07152
2018-12-15 16:50:00	0.5907152	59.07152
2018-12-15 17:00:00	0.5907152	59.07152
2018-12-15 17:10:00	0.5907152	59.07152
2018-12-15 17:20:00	0.5907152	59.07152
2018-12-15 17:30:00	0.5907152	59.07152
2018-12-15 17:40:00	0.5907152	59.07152
2018-12-15 17:50:00	0.5907152	59.07152
2018-12-15 18:00:00	0.5907152	59.07152
2018-12-15 18:10:00	0.5907152	59.07152
2018-12-15 18:20:00	0.5907152	59.07152
2018-12-15 18:30:00	0.5907152	59.07152
2018-12-15 18:40:00	0.5907152	59.07152
2018-12-15 18:50:00	0.5907152	59.07152
2018-12-15 19:00:00	0.5907152	59.07152
2018-12-15 19:10:00	0.5907152	59.07152
2018-12-15 19:20:00	0.5907152	59.07152
2018-12-15 19:30:00	0.5907152	59.07152
2018-12-15 19:40:00	0.5907152	59.07152
2018-12-15 19:50:00	0.5907152	59.07152
2018-12-15 20:00:00	0.5907152	59.07152
2018-12-15 20:10:00	0.5907152	59.07152
2018-12-15 20:20:00	0.5907152	59.07152
2018-12-15 20:30:00	0.5907152	59.07152
2018-12-15 20:40:00	0.5907152	59.07152
2018-12-15 20:50:00	0.5907152	59.07152
2018-12-15 21:00:00	0.5907152	59.07152
2018-12-15 21:10:00	0.5907152	59.07152
2018-12-15 21:20:00	0.5907152	59.07152
2018-12-15 21:30:00	0.5907152	59.07152
2018-12-15 21:40:00	0.5907152	59.07152
2018-12-15 21:50:00	0.5907152	59.07152
2018-12-15 22:00:00	0.5907152	59.07152
2018-12-15 22:10:00	0.5907152	59.07152
2018-12-15 22:20:00	0.5907152	59.07152
2018-12-15 22:30:00	0.5907152	59.07152
2018-12-15 22:40:00	0.5907152	59.07152
2018-12-15 22:50:00	0.5907152	59.07152
2018-12-15 23:00:00	0.5907152	59.07152
2018-12-15 23:10:00	0.5907152	59.07152
2018-12-15 23:20:00	0.5907152	59.07152
2018-12-15 23:30:00	0.5907152	59.07152
2018-12-15 23:40:00	0.5907152	59.07152
2018-12-15 23:50:00	0.5907152	59.07152

We can also plot the area covered relative to the whole of Chicago for a visual check. The 4 plots below are drawn from list positions 1, 7, 13, and 19, which correspond to 00:00 (top left), 01:00 (top right), 02:00 (bottom left), and 03:00 (bottom right) when every 10-minute interval is present in the data. In the case of the pressure network on this day, the area covered remained constant throughout the day, so the 4 plots are identical.

p<-list()

by10<-as.data.frame(unique(dfPressure4$by10))
by10$no<-row_number(by10)
colnames(by10)<-c('by10', 'no')

dfPressure4b<-merge(dfPressure4, by10, by='by10', all.x=TRUE)

for(i in unique(dfPressure4b$by10)){

subset <- 
      dfPressure4b%>%
      filter(by10==i)
    
subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    
P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    
Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

#clip using chicago
Ps2<-gIntersection(Ps2, chig, byid=FALSE)

j<-unique(subset$no)

p[[j]]<-spplot(chig, colorkey=FALSE, col.regions='red', 
       sp.layout=list(list(Ps2, fill='blue', first=FALSE)))
}

library(gridExtra)
grid.arrange(p[[1]], p[[7]], p[[13]], p[[19]])

In summary, for the AoT pressure network on 2018-12-15,

Score 3 = 27.6 : In any given 10-minute time interval, an average of 27.6% of the nodes in the network are collecting reliable data. This is a low score.
Score 4 = 59.1 : In any given 10-minute time interval, reliable data is collected for an average of 59.1% of Chicago’s area. Read together with Score 3, this indicates that the small number of active nodes are spread widely across Chicago.

PM2.5 Concentration

Constructing Score 3

Before we calculate the relevant proportions, it is useful to observe how the absolute number of nodes collecting reliable data in the network varies across the different time intervals of the day. The figure below shows how the number of nodes collecting reliable PM2.5 concentration data in the AoT network varies over the course of the day.

dfPM25%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))->dfPM253

ggplot(data=dfPM253, aes(x=by10, y=count))+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  geom_col(fill='indianred', col=NA)+
  ylim(0, 86)+
  geom_hline(yintercept=86, col='black', size=1)+
  geom_hline(yintercept=c(1:85), col='white')+
  geom_vline(aes(xintercept=as.numeric(by10)), col='white', size=2)+
  geom_hline(aes(yintercept=mean(count)), col='black', size=1)+
  geom_text(aes(y=86, x=0), label='Full network size: 86 nodes', size=4, hjust=-1, vjust=-1)+
  geom_text(aes(y=mean(count), x=0), label=paste('Average network size:', round(mean(dfPM253$count)), 'nodes'), size=4, hjust=-1, vjust=-1)+
  labs(x='Time', y='Number of Active Nodes', 
       title='Number of nodes collecting reliable PM 2.5 concentration data throughout the day',
       subtitle='Each x-axis tick represents a 10-minute time interval')+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

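The table below shows the number of nodes collecting reliable PM2.5 concentration data at each time interval during the day, the proportion of these active nodes in relation to the full network of 86 nodes, and the average proportion during the full day. This average proportion is Score 3. Intervals in which no node collected reliable data (16:20 to 16:40 on this day) do not appear in the table and therefore do not enter the average.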

dfPM253%>%
  arrange(by10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

by10	count	propActive	Score 3
2018-12-15 00:00:00	1	1.162791	1.657595
2018-12-15 00:10:00	1	1.162791	1.657595
2018-12-15 00:20:00	1	1.162791	1.657595
2018-12-15 00:30:00	1	1.162791	1.657595
2018-12-15 00:40:00	1	1.162791	1.657595
2018-12-15 00:50:00	1	1.162791	1.657595
2018-12-15 01:00:00	1	1.162791	1.657595
2018-12-15 01:10:00	1	1.162791	1.657595
2018-12-15 01:20:00	1	1.162791	1.657595
2018-12-15 01:30:00	1	1.162791	1.657595
2018-12-15 01:40:00	1	1.162791	1.657595
2018-12-15 01:50:00	1	1.162791	1.657595
2018-12-15 02:00:00	1	1.162791	1.657595
2018-12-15 02:10:00	1	1.162791	1.657595
2018-12-15 02:20:00	1	1.162791	1.657595
2018-12-15 02:30:00	1	1.162791	1.657595
2018-12-15 02:40:00	1	1.162791	1.657595
2018-12-15 02:50:00	1	1.162791	1.657595
2018-12-15 03:00:00	1	1.162791	1.657595
2018-12-15 03:10:00	1	1.162791	1.657595
2018-12-15 03:20:00	1	1.162791	1.657595
2018-12-15 03:30:00	1	1.162791	1.657595
2018-12-15 03:40:00	1	1.162791	1.657595
2018-12-15 03:50:00	1	1.162791	1.657595
2018-12-15 04:00:00	1	1.162791	1.657595
2018-12-15 04:10:00	1	1.162791	1.657595
2018-12-15 04:20:00	1	1.162791	1.657595
2018-12-15 04:30:00	1	1.162791	1.657595
2018-12-15 04:40:00	1	1.162791	1.657595
2018-12-15 04:50:00	1	1.162791	1.657595
2018-12-15 05:00:00	1	1.162791	1.657595
2018-12-15 05:10:00	1	1.162791	1.657595
2018-12-15 05:20:00	1	1.162791	1.657595
2018-12-15 05:30:00	1	1.162791	1.657595
2018-12-15 05:40:00	1	1.162791	1.657595
2018-12-15 05:50:00	1	1.162791	1.657595
2018-12-15 06:00:00	1	1.162791	1.657595
2018-12-15 06:10:00	1	1.162791	1.657595
2018-12-15 06:20:00	1	1.162791	1.657595
2018-12-15 06:30:00	2	2.325581	1.657595
2018-12-15 06:40:00	2	2.325581	1.657595
2018-12-15 06:50:00	2	2.325581	1.657595
2018-12-15 07:00:00	2	2.325581	1.657595
2018-12-15 07:10:00	2	2.325581	1.657595
2018-12-15 07:20:00	2	2.325581	1.657595
2018-12-15 07:30:00	2	2.325581	1.657595
2018-12-15 07:40:00	2	2.325581	1.657595
2018-12-15 07:50:00	2	2.325581	1.657595
2018-12-15 08:00:00	2	2.325581	1.657595
2018-12-15 08:10:00	2	2.325581	1.657595
2018-12-15 08:20:00	2	2.325581	1.657595
2018-12-15 08:30:00	2	2.325581	1.657595
2018-12-15 08:40:00	2	2.325581	1.657595
2018-12-15 08:50:00	2	2.325581	1.657595
2018-12-15 09:00:00	2	2.325581	1.657595
2018-12-15 09:10:00	2	2.325581	1.657595
2018-12-15 09:20:00	2	2.325581	1.657595
2018-12-15 09:30:00	2	2.325581	1.657595
2018-12-15 09:40:00	2	2.325581	1.657595
2018-12-15 09:50:00	2	2.325581	1.657595
2018-12-15 10:00:00	2	2.325581	1.657595
2018-12-15 10:10:00	2	2.325581	1.657595
2018-12-15 10:20:00	2	2.325581	1.657595
2018-12-15 10:30:00	2	2.325581	1.657595
2018-12-15 10:40:00	2	2.325581	1.657595
2018-12-15 10:50:00	2	2.325581	1.657595
2018-12-15 11:00:00	2	2.325581	1.657595
2018-12-15 11:10:00	2	2.325581	1.657595
2018-12-15 11:20:00	2	2.325581	1.657595
2018-12-15 11:30:00	2	2.325581	1.657595
2018-12-15 11:40:00	2	2.325581	1.657595
2018-12-15 11:50:00	2	2.325581	1.657595
2018-12-15 12:00:00	2	2.325581	1.657595
2018-12-15 12:10:00	2	2.325581	1.657595
2018-12-15 12:20:00	2	2.325581	1.657595
2018-12-15 12:30:00	2	2.325581	1.657595
2018-12-15 12:40:00	2	2.325581	1.657595
2018-12-15 12:50:00	2	2.325581	1.657595
2018-12-15 13:00:00	2	2.325581	1.657595
2018-12-15 13:10:00	2	2.325581	1.657595
2018-12-15 13:20:00	2	2.325581	1.657595
2018-12-15 13:30:00	2	2.325581	1.657595
2018-12-15 13:40:00	2	2.325581	1.657595
2018-12-15 13:50:00	2	2.325581	1.657595
2018-12-15 14:00:00	2	2.325581	1.657595
2018-12-15 14:10:00	2	2.325581	1.657595
2018-12-15 14:20:00	2	2.325581	1.657595
2018-12-15 14:30:00	2	2.325581	1.657595
2018-12-15 14:40:00	2	2.325581	1.657595
2018-12-15 14:50:00	2	2.325581	1.657595
2018-12-15 15:00:00	1	1.162791	1.657595
2018-12-15 15:10:00	1	1.162791	1.657595
2018-12-15 15:20:00	1	1.162791	1.657595
2018-12-15 15:30:00	1	1.162791	1.657595
2018-12-15 15:40:00	1	1.162791	1.657595
2018-12-15 15:50:00	1	1.162791	1.657595
2018-12-15 16:00:00	1	1.162791	1.657595
2018-12-15 16:10:00	1	1.162791	1.657595
2018-12-15 16:50:00	1	1.162791	1.657595
2018-12-15 17:00:00	1	1.162791	1.657595
2018-12-15 17:10:00	1	1.162791	1.657595
2018-12-15 17:20:00	1	1.162791	1.657595
2018-12-15 17:30:00	1	1.162791	1.657595
2018-12-15 17:40:00	1	1.162791	1.657595
2018-12-15 17:50:00	1	1.162791	1.657595
2018-12-15 18:00:00	1	1.162791	1.657595
2018-12-15 18:10:00	1	1.162791	1.657595
2018-12-15 18:20:00	1	1.162791	1.657595
2018-12-15 18:30:00	1	1.162791	1.657595
2018-12-15 18:40:00	1	1.162791	1.657595
2018-12-15 18:50:00	1	1.162791	1.657595
2018-12-15 19:00:00	1	1.162791	1.657595
2018-12-15 19:10:00	1	1.162791	1.657595
2018-12-15 19:20:00	1	1.162791	1.657595
2018-12-15 19:30:00	1	1.162791	1.657595
2018-12-15 19:40:00	1	1.162791	1.657595
2018-12-15 19:50:00	1	1.162791	1.657595
2018-12-15 20:00:00	1	1.162791	1.657595
2018-12-15 20:10:00	1	1.162791	1.657595
2018-12-15 20:20:00	1	1.162791	1.657595
2018-12-15 20:30:00	1	1.162791	1.657595
2018-12-15 20:40:00	1	1.162791	1.657595
2018-12-15 20:50:00	1	1.162791	1.657595
2018-12-15 21:00:00	1	1.162791	1.657595
2018-12-15 21:10:00	1	1.162791	1.657595
2018-12-15 21:20:00	1	1.162791	1.657595
2018-12-15 21:30:00	1	1.162791	1.657595
2018-12-15 21:40:00	1	1.162791	1.657595
2018-12-15 21:50:00	1	1.162791	1.657595
2018-12-15 22:00:00	1	1.162791	1.657595
2018-12-15 22:10:00	1	1.162791	1.657595
2018-12-15 22:20:00	2	2.325581	1.657595
2018-12-15 22:30:00	1	1.162791	1.657595
2018-12-15 22:40:00	2	2.325581	1.657595
2018-12-15 22:50:00	2	2.325581	1.657595
2018-12-15 23:00:00	2	2.325581	1.657595
2018-12-15 23:10:00	2	2.325581	1.657595
2018-12-15 23:20:00	2	2.325581	1.657595
2018-12-15 23:30:00	2	2.325581	1.657595
2018-12-15 23:40:00	2	2.325581	1.657595
2018-12-15 23:50:00	2	2.325581	1.657595
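
As a quick check, Score 3 can be reproduced from the interval counts alone. The sketch below uses base R and the counts read off the table above (81 intervals with one active node and 60 with two); the variable names are ours and serve only as illustration. It also shows, as an assumption rather than part of the original method, what the score would be if the three intervals absent from the table were counted as having zero active nodes.

```r
network_size <- 86

# Counts read from the table: 81 intervals with 1 node, 60 intervals with 2 nodes
counts_observed <- c(rep(1, 81), rep(2, 60))

# Score 3 as computed in the document: mean over intervals that appear in the data
score3_observed <- mean(100 * counts_observed / network_size)

# Assumption: treat the intervals missing from the table as 0 active nodes
counts_full_day <- c(counts_observed, rep(0, 144 - length(counts_observed)))
score3_full_day <- mean(100 * counts_full_day / network_size)

round(score3_observed, 6)  # 1.657595, matching the table
score3_full_day            # slightly lower: the empty intervals pull the mean down
```

Whether empty intervals should count against the score is a design choice; the original pipeline leaves them out.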

dfPM253%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 3`), size = 1)+
  geom_text(aes(x= `Score 3`, y=0), label='Score 3:\n Average Proportion of\nNetwork Active', size = 4, vjust= -2, hjust=1.4)+
  labs(x='Proportions of active nodes',
       y='Density',
       title='Distribution of Proportions of Active Nodes')+
  xlim(0, 100)+
  plotTheme()

Constructing Score 4

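The table below lists the first 41 rows of the node locations used for Score 4: the latitude and longitude of each node collecting reliable PM2.5 concentration data in each 10-minute interval on 2018-12-15. Only one node (001e06113107) is active from midnight until 06:30, when a second node (001e0610bc10) also begins collecting reliable data.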

dfPM25%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> dfPM254

dfPM254%>%
  arrange(by10)%>%
  head(41)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped'))%>%
  scroll_box(height='300px')

by10	node_id	lat	lon
2018-12-15 00:00:00	001e06113107	41.75114	-87.71299
2018-12-15 00:10:00	001e06113107	41.75114	-87.71299
2018-12-15 00:20:00	001e06113107	41.75114	-87.71299
2018-12-15 00:30:00	001e06113107	41.75114	-87.71299
2018-12-15 00:40:00	001e06113107	41.75114	-87.71299
2018-12-15 00:50:00	001e06113107	41.75114	-87.71299
2018-12-15 01:00:00	001e06113107	41.75114	-87.71299
2018-12-15 01:10:00	001e06113107	41.75114	-87.71299
2018-12-15 01:20:00	001e06113107	41.75114	-87.71299
2018-12-15 01:30:00	001e06113107	41.75114	-87.71299
2018-12-15 01:40:00	001e06113107	41.75114	-87.71299
2018-12-15 01:50:00	001e06113107	41.75114	-87.71299
2018-12-15 02:00:00	001e06113107	41.75114	-87.71299
2018-12-15 02:10:00	001e06113107	41.75114	-87.71299
2018-12-15 02:20:00	001e06113107	41.75114	-87.71299
2018-12-15 02:30:00	001e06113107	41.75114	-87.71299
2018-12-15 02:40:00	001e06113107	41.75114	-87.71299
2018-12-15 02:50:00	001e06113107	41.75114	-87.71299
2018-12-15 03:00:00	001e06113107	41.75114	-87.71299
2018-12-15 03:10:00	001e06113107	41.75114	-87.71299
2018-12-15 03:20:00	001e06113107	41.75114	-87.71299
2018-12-15 03:30:00	001e06113107	41.75114	-87.71299
2018-12-15 03:40:00	001e06113107	41.75114	-87.71299
2018-12-15 03:50:00	001e06113107	41.75114	-87.71299
2018-12-15 04:00:00	001e06113107	41.75114	-87.71299
2018-12-15 04:10:00	001e06113107	41.75114	-87.71299
2018-12-15 04:20:00	001e06113107	41.75114	-87.71299
2018-12-15 04:30:00	001e06113107	41.75114	-87.71299
2018-12-15 04:40:00	001e06113107	41.75114	-87.71299
2018-12-15 04:50:00	001e06113107	41.75114	-87.71299
2018-12-15 05:00:00	001e06113107	41.75114	-87.71299
2018-12-15 05:10:00	001e06113107	41.75114	-87.71299
2018-12-15 05:20:00	001e06113107	41.75114	-87.71299
2018-12-15 05:30:00	001e06113107	41.75114	-87.71299
2018-12-15 05:40:00	001e06113107	41.75114	-87.71299
2018-12-15 05:50:00	001e06113107	41.75114	-87.71299
2018-12-15 06:00:00	001e06113107	41.75114	-87.71299
2018-12-15 06:10:00	001e06113107	41.75114	-87.71299
2018-12-15 06:20:00	001e06113107	41.75114	-87.71299
2018-12-15 06:30:00	001e06113107	41.75114	-87.71299
2018-12-15 06:30:00	001e0610bc10	41.73631	-87.62418

chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

dfPM254a<-NULL

for(i in unique(dfPM254$by10)){

  # nodes with reliable readings in this 10-minute interval
  subset <- 
    dfPM254%>%
    filter(by10==i)

  if(nrow(subset)>1){
    # project the node locations from WGS84 (EPSG:4326) to EPSG:3435
    subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
    subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    # build the bounding box of the active nodes as a closed polygon
    P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))
    # clip the bounding box to the Chicago boundary
    Ps2<-gIntersection(Ps2, chig, byid=FALSE)
    # share of Chicago's area covered in this interval
    Ps2AreaProp<-gArea(Ps2)/chigArea
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-Ps2AreaProp
    df1<-as.data.frame(df1)
    dfPM254a<-rbind(dfPM254a, df1)
  }else{
    # a single node has a bounding box with no area, so coverage is set to 0
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-0
    df1<-as.data.frame(df1)
    dfPM254a<-rbind(dfPM254a, df1)
  }

}
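
The loop assigns a coverage of 0 whenever only one node is active. A minimal base R sketch of why: the bounding box of a single point has no width or height, so its area is zero. The function name `bbox_area` and the coordinates are hypothetical, and the sketch omits the clipping to the Chicago boundary that the loop performs with gIntersection.

```r
# Area of the axis-aligned bounding box around a set of projected points.
# x and y are assumed to be in a projected CRS, so the result is in squared map units.
bbox_area <- function(x, y) {
  diff(range(x)) * diff(range(y))
}

# Hypothetical projected coordinates, for illustration only
one_node  <- bbox_area(x = 1150000, y = 1850000)
two_nodes <- bbox_area(x = c(1150000, 1174000), y = c(1850000, 1845000))

one_node   # 0: a single point spans no area
two_nodes  # 1.2e+08: 24000 wide by 5000 tall
```

The same logic means that two or more nodes lying on one horizontal or vertical line would also yield zero coverage under a bounding-box measure.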

dfPM254a%>%
  mutate(`Score 4`= 100*mean(AreaProp))->dfPM254a
dfPM254a%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = 'striped')%>%
  scroll_box(height = "300px")

by10	AreaProp	Score 4
2018-12-15 00:00:00	0.0000000	0.7976292
2018-12-15 00:10:00	0.0000000	0.7976292
2018-12-15 00:20:00	0.0000000	0.7976292
2018-12-15 00:30:00	0.0000000	0.7976292
2018-12-15 00:40:00	0.0000000	0.7976292
2018-12-15 00:50:00	0.0000000	0.7976292
2018-12-15 01:00:00	0.0000000	0.7976292
2018-12-15 01:10:00	0.0000000	0.7976292
2018-12-15 01:20:00	0.0000000	0.7976292
2018-12-15 01:30:00	0.0000000	0.7976292
2018-12-15 01:40:00	0.0000000	0.7976292
2018-12-15 01:50:00	0.0000000	0.7976292
2018-12-15 02:00:00	0.0000000	0.7976292
2018-12-15 02:10:00	0.0000000	0.7976292
2018-12-15 02:20:00	0.0000000	0.7976292
2018-12-15 02:30:00	0.0000000	0.7976292
2018-12-15 02:40:00	0.0000000	0.7976292
2018-12-15 02:50:00	0.0000000	0.7976292
2018-12-15 03:00:00	0.0000000	0.7976292
2018-12-15 03:10:00	0.0000000	0.7976292
2018-12-15 03:20:00	0.0000000	0.7976292
2018-12-15 03:30:00	0.0000000	0.7976292
2018-12-15 03:40:00	0.0000000	0.7976292
2018-12-15 03:50:00	0.0000000	0.7976292
2018-12-15 04:00:00	0.0000000	0.7976292
2018-12-15 04:10:00	0.0000000	0.7976292
2018-12-15 04:20:00	0.0000000	0.7976292
2018-12-15 04:30:00	0.0000000	0.7976292
2018-12-15 04:40:00	0.0000000	0.7976292
2018-12-15 04:50:00	0.0000000	0.7976292
2018-12-15 05:00:00	0.0000000	0.7976292
2018-12-15 05:10:00	0.0000000	0.7976292
2018-12-15 05:20:00	0.0000000	0.7976292
2018-12-15 05:30:00	0.0000000	0.7976292
2018-12-15 05:40:00	0.0000000	0.7976292
2018-12-15 05:50:00	0.0000000	0.7976292
2018-12-15 06:00:00	0.0000000	0.7976292
2018-12-15 06:10:00	0.0000000	0.7976292
2018-12-15 06:20:00	0.0000000	0.7976292
2018-12-15 06:30:00	0.0187443	0.7976292
2018-12-15 06:40:00	0.0187443	0.7976292
2018-12-15 06:50:00	0.0187443	0.7976292
2018-12-15 07:00:00	0.0187443	0.7976292
2018-12-15 07:10:00	0.0187443	0.7976292
2018-12-15 07:20:00	0.0187443	0.7976292
2018-12-15 07:30:00	0.0187443	0.7976292
2018-12-15 07:40:00	0.0187443	0.7976292
2018-12-15 07:50:00	0.0187443	0.7976292
2018-12-15 08:00:00	0.0187443	0.7976292
2018-12-15 08:10:00	0.0187443	0.7976292
2018-12-15 08:20:00	0.0187443	0.7976292
2018-12-15 08:30:00	0.0187443	0.7976292
2018-12-15 08:40:00	0.0187443	0.7976292
2018-12-15 08:50:00	0.0187443	0.7976292
2018-12-15 09:00:00	0.0187443	0.7976292
2018-12-15 09:10:00	0.0187443	0.7976292
2018-12-15 09:20:00	0.0187443	0.7976292
2018-12-15 09:30:00	0.0187443	0.7976292
2018-12-15 09:40:00	0.0187443	0.7976292
2018-12-15 09:50:00	0.0187443	0.7976292
2018-12-15 10:00:00	0.0187443	0.7976292
2018-12-15 10:10:00	0.0187443	0.7976292
2018-12-15 10:20:00	0.0187443	0.7976292
2018-12-15 10:30:00	0.0187443	0.7976292
2018-12-15 10:40:00	0.0187443	0.7976292
2018-12-15 10:50:00	0.0187443	0.7976292
2018-12-15 11:00:00	0.0187443	0.7976292
2018-12-15 11:10:00	0.0187443	0.7976292
2018-12-15 11:20:00	0.0187443	0.7976292
2018-12-15 11:30:00	0.0187443	0.7976292
2018-12-15 11:40:00	0.0187443	0.7976292
2018-12-15 11:50:00	0.0187443	0.7976292
2018-12-15 12:00:00	0.0187443	0.7976292
2018-12-15 12:10:00	0.0187443	0.7976292
2018-12-15 12:20:00	0.0187443	0.7976292
2018-12-15 12:30:00	0.0187443	0.7976292
2018-12-15 12:40:00	0.0187443	0.7976292
2018-12-15 12:50:00	0.0187443	0.7976292
2018-12-15 13:00:00	0.0187443	0.7976292
2018-12-15 13:10:00	0.0187443	0.7976292
2018-12-15 13:20:00	0.0187443	0.7976292
2018-12-15 13:30:00	0.0187443	0.7976292
2018-12-15 13:40:00	0.0187443	0.7976292
2018-12-15 13:50:00	0.0187443	0.7976292
2018-12-15 14:00:00	0.0187443	0.7976292
2018-12-15 14:10:00	0.0187443	0.7976292
2018-12-15 14:20:00	0.0187443	0.7976292
2018-12-15 14:30:00	0.0187443	0.7976292
2018-12-15 14:40:00	0.0187443	0.7976292
2018-12-15 14:50:00	0.0187443	0.7976292
2018-12-15 15:00:00	0.0000000	0.7976292
2018-12-15 15:10:00	0.0000000	0.7976292
2018-12-15 15:20:00	0.0000000	0.7976292
2018-12-15 15:30:00	0.0000000	0.7976292
2018-12-15 15:40:00	0.0000000	0.7976292
2018-12-15 15:50:00	0.0000000	0.7976292
2018-12-15 16:00:00	0.0000000	0.7976292
2018-12-15 16:10:00	0.0000000	0.7976292
2018-12-15 16:50:00	0.0000000	0.7976292
2018-12-15 17:00:00	0.0000000	0.7976292
2018-12-15 17:10:00	0.0000000	0.7976292
2018-12-15 17:20:00	0.0000000	0.7976292
2018-12-15 17:30:00	0.0000000	0.7976292
2018-12-15 17:40:00	0.0000000	0.7976292
2018-12-15 17:50:00	0.0000000	0.7976292
2018-12-15 18:00:00	0.0000000	0.7976292
2018-12-15 18:10:00	0.0000000	0.7976292
2018-12-15 18:20:00	0.0000000	0.7976292
2018-12-15 18:30:00	0.0000000	0.7976292
2018-12-15 18:40:00	0.0000000	0.7976292
2018-12-15 18:50:00	0.0000000	0.7976292
2018-12-15 19:00:00	0.0000000	0.7976292
2018-12-15 19:10:00	0.0000000	0.7976292
2018-12-15 19:20:00	0.0000000	0.7976292
2018-12-15 19:30:00	0.0000000	0.7976292
2018-12-15 19:40:00	0.0000000	0.7976292
2018-12-15 19:50:00	0.0000000	0.7976292
2018-12-15 20:00:00	0.0000000	0.7976292
2018-12-15 20:10:00	0.0000000	0.7976292
2018-12-15 20:20:00	0.0000000	0.7976292
2018-12-15 20:30:00	0.0000000	0.7976292
2018-12-15 20:40:00	0.0000000	0.7976292
2018-12-15 20:50:00	0.0000000	0.7976292
2018-12-15 21:00:00	0.0000000	0.7976292
2018-12-15 21:10:00	0.0000000	0.7976292
2018-12-15 21:20:00	0.0000000	0.7976292
2018-12-15 21:30:00	0.0000000	0.7976292
2018-12-15 21:40:00	0.0000000	0.7976292
2018-12-15 21:50:00	0.0000000	0.7976292
2018-12-15 22:00:00	0.0000000	0.7976292
2018-12-15 22:10:00	0.0000000	0.7976292
2018-12-15 22:20:00	0.0187443	0.7976292
2018-12-15 22:30:00	0.0000000	0.7976292
2018-12-15 22:40:00	0.0187443	0.7976292
2018-12-15 22:50:00	0.0187443	0.7976292
2018-12-15 23:00:00	0.0187443	0.7976292
2018-12-15 23:10:00	0.0187443	0.7976292
2018-12-15 23:20:00	0.0187443	0.7976292
2018-12-15 23:30:00	0.0187443	0.7976292
2018-12-15 23:40:00	0.0187443	0.7976292
2018-12-15 23:50:00	0.0187443	0.7976292

We can also plot the area covered relative to the whole of Chicago for a visual check. The 4 plots below are drawn from list positions 1, 7, 71, and 141, which correspond to 00:00 (top left), 01:00 (top right), 11:40 (bottom left), and 23:50 (bottom right) in the table above. At 00:00 and 01:00, only one node is collecting reliable PM2.5 concentration data, so the bounding box has no area and no coverage can be displayed. At 11:40 and 23:50, however, the small coverage area spanned by the two active nodes in the south of the city can be observed.

p<-list()

by10<-as.data.frame(unique(dfPM254$by10))
by10$no<-row_number(by10)
colnames(by10)<-c('by10', 'no')

dfPM254b<-merge(dfPM254, by10, by='by10', all.x=TRUE)

for(i in unique(dfPM254b$by10)){

subset <- 
      dfPM254b%>%
      filter(by10==i)
    
subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    
P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    
Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

#clip using chicago
Ps2<-gIntersection(Ps2, chig, byid=FALSE)

j<-unique(subset$no)

p[[j]]<-spplot(chig, colorkey=FALSE, col.regions='red', 
       sp.layout=list(list(Ps2, fill='blue', first=FALSE)))
}

library(gridExtra)
grid.arrange(p[[1]], p[[7]], p[[71]], p[[141]])
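
The list p is indexed by position among the intervals present in the data, not by hour, so position 7 is 01:00 rather than 07:00. A minimal base R sketch of looking up the position for a given clock time follows. It assumes by10 is stored as character timestamps in the format shown in the tables, and it rebuilds the PM2.5 intervals from the table above (16:20 to 16:40 absent); the helper name `plot_index` is hypothetical.

```r
# All 144 ten-minute intervals of the day, as character timestamps
all_intervals <- format(
  seq(as.POSIXct("2018-12-15 00:00:00", tz = "UTC"), by = "10 min", length.out = 144),
  "%Y-%m-%d %H:%M:%S"
)

# Intervals present in the PM2.5 table: 16:20 to 16:40 are absent
missing <- paste("2018-12-15", c("16:20:00", "16:30:00", "16:40:00"))
present <- setdiff(all_intervals, missing)

# Hypothetical helper: list position of the plot for a timestamp (NA if absent)
plot_index <- function(timestamp, intervals = present) {
  match(timestamp, intervals)
}

plot_index("2018-12-15 07:00:00")  # 43
plot_index("2018-12-15 23:50:00")  # 141
plot_index("2018-12-15 16:30:00")  # NA: no reliable data in this interval
```

With such a lookup, plots could be selected by clock time, for example p[[plot_index('2018-12-15 07:00:00')]], instead of by a hard-coded position.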

In summary, for the AoT PM2.5 concentration network on 2018-12-15,

Score 3 = 1.7 : In any given 10-minute time interval, an average of 1.7% of the nodes in the network are collecting reliable data. This is a very low score.
Score 4 = 0.8 : In any given 10-minute time interval, reliable data is collected for an average of 0.8% of Chicago’s area.

CO Concentration

Constructing Score 3

Before we calculate the relevant proportions, it is useful to observe how the absolute number of nodes collecting reliable data in the network varies across the different time intervals of the day. The figure below shows how the number of nodes collecting reliable CO concentration data in the AoT network varies over the course of the day.

dfCO%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))->dfCO3

ggplot(data=dfCO3, aes(x=by10, y=count))+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  geom_col(fill='indianred', col=NA)+
  ylim(0, 86)+
  geom_hline(yintercept=86, col='black', size=1)+
  geom_hline(yintercept=c(1:85), col='white')+
  geom_vline(aes(xintercept=as.numeric(by10)), col='white', size=2)+
  geom_hline(aes(yintercept=mean(count)), col='black', size=1)+
  geom_text(aes(y=86, x=0), label='Full network size: 86 nodes', size=4, hjust=-1, vjust=-1)+
  geom_text(aes(y=mean(count), x=0), label=paste('Average network size:', round(mean(dfCO3$count)), 'nodes'), size=4, hjust=-1, vjust=-1)+
  labs(x='Time', y='Number of Active Nodes', 
       title='Number of nodes collecting reliable CO concentration data throughout the day',
       subtitle='Each x-axis tick represents a 10-minute time interval')+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

The table below shows the number of nodes collecting reliable CO concentration data at each time interval during the day, the proportion of these active nodes in relation to the full network of 86 nodes, and the average proportion during the full day. This average proportion is Score 3.

dfCO3%>%
  arrange(by10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

by10	count	propActive	Score 3
2018-12-15 00:00:00	16	18.604651	17.11886
2018-12-15 00:10:00	14	16.279070	17.11886
2018-12-15 00:20:00	16	18.604651	17.11886
2018-12-15 00:30:00	16	18.604651	17.11886
2018-12-15 00:40:00	15	17.441861	17.11886
2018-12-15 00:50:00	14	16.279070	17.11886
2018-12-15 01:00:00	17	19.767442	17.11886
2018-12-15 01:10:00	16	18.604651	17.11886
2018-12-15 01:20:00	17	19.767442	17.11886
2018-12-15 01:30:00	17	19.767442	17.11886
2018-12-15 01:40:00	16	18.604651	17.11886
2018-12-15 01:50:00	15	17.441861	17.11886
2018-12-15 02:00:00	15	17.441861	17.11886
2018-12-15 02:10:00	16	18.604651	17.11886
2018-12-15 02:20:00	15	17.441861	17.11886
2018-12-15 02:30:00	14	16.279070	17.11886
2018-12-15 02:40:00	16	18.604651	17.11886
2018-12-15 02:50:00	16	18.604651	17.11886
2018-12-15 03:00:00	16	18.604651	17.11886
2018-12-15 03:10:00	17	19.767442	17.11886
2018-12-15 03:20:00	15	17.441861	17.11886
2018-12-15 03:30:00	15	17.441861	17.11886
2018-12-15 03:40:00	17	19.767442	17.11886
2018-12-15 03:50:00	14	16.279070	17.11886
2018-12-15 04:00:00	16	18.604651	17.11886
2018-12-15 04:10:00	16	18.604651	17.11886
2018-12-15 04:20:00	16	18.604651	17.11886
2018-12-15 04:30:00	16	18.604651	17.11886
2018-12-15 04:40:00	18	20.930233	17.11886
2018-12-15 04:50:00	17	19.767442	17.11886
2018-12-15 05:00:00	17	19.767442	17.11886
2018-12-15 05:10:00	17	19.767442	17.11886
2018-12-15 05:20:00	18	20.930233	17.11886
2018-12-15 05:30:00	18	20.930233	17.11886
2018-12-15 05:40:00	16	18.604651	17.11886
2018-12-15 05:50:00	17	19.767442	17.11886
2018-12-15 06:00:00	17	19.767442	17.11886
2018-12-15 06:10:00	17	19.767442	17.11886
2018-12-15 06:20:00	17	19.767442	17.11886
2018-12-15 06:30:00	17	19.767442	17.11886
2018-12-15 06:40:00	17	19.767442	17.11886
2018-12-15 06:50:00	16	18.604651	17.11886
2018-12-15 07:00:00	18	20.930233	17.11886
2018-12-15 07:10:00	17	19.767442	17.11886
2018-12-15 07:20:00	18	20.930233	17.11886
2018-12-15 07:30:00	17	19.767442	17.11886
2018-12-15 07:40:00	16	18.604651	17.11886
2018-12-15 07:50:00	18	20.930233	17.11886
2018-12-15 08:00:00	17	19.767442	17.11886
2018-12-15 08:10:00	15	17.441861	17.11886
2018-12-15 08:20:00	18	20.930233	17.11886
2018-12-15 08:30:00	17	19.767442	17.11886
2018-12-15 08:40:00	17	19.767442	17.11886
2018-12-15 08:50:00	17	19.767442	17.11886
2018-12-15 09:00:00	17	19.767442	17.11886
2018-12-15 09:10:00	18	20.930233	17.11886
2018-12-15 09:20:00	16	18.604651	17.11886
2018-12-15 09:30:00	15	17.441861	17.11886
2018-12-15 09:40:00	14	16.279070	17.11886
2018-12-15 09:50:00	15	17.441861	17.11886
2018-12-15 10:00:00	16	18.604651	17.11886
2018-12-15 10:10:00	16	18.604651	17.11886
2018-12-15 10:20:00	16	18.604651	17.11886
2018-12-15 10:30:00	16	18.604651	17.11886
2018-12-15 10:40:00	16	18.604651	17.11886
2018-12-15 10:50:00	16	18.604651	17.11886
2018-12-15 11:00:00	16	18.604651	17.11886
2018-12-15 11:10:00	16	18.604651	17.11886
2018-12-15 11:20:00	15	17.441861	17.11886
2018-12-15 11:30:00	14	16.279070	17.11886
2018-12-15 11:40:00	16	18.604651	17.11886
2018-12-15 11:50:00	14	16.279070	17.11886
2018-12-15 12:00:00	15	17.441861	17.11886
2018-12-15 12:10:00	16	18.604651	17.11886
2018-12-15 12:20:00	15	17.441861	17.11886
2018-12-15 12:30:00	14	16.279070	17.11886
2018-12-15 12:40:00	14	16.279070	17.11886
2018-12-15 12:50:00	15	17.441861	17.11886
2018-12-15 13:00:00	16	18.604651	17.11886
2018-12-15 13:10:00	15	17.441861	17.11886
2018-12-15 13:20:00	17	19.767442	17.11886
2018-12-15 13:30:00	15	17.441861	17.11886
2018-12-15 13:40:00	17	19.767442	17.11886
2018-12-15 13:50:00	16	18.604651	17.11886
2018-12-15 14:00:00	14	16.279070	17.11886
2018-12-15 14:10:00	16	18.604651	17.11886
2018-12-15 14:20:00	14	16.279070	17.11886
2018-12-15 14:30:00	16	18.604651	17.11886
2018-12-15 14:40:00	14	16.279070	17.11886
2018-12-15 14:50:00	14	16.279070	17.11886
2018-12-15 15:00:00	14	16.279070	17.11886
2018-12-15 15:10:00	12	13.953488	17.11886
2018-12-15 15:20:00	13	15.116279	17.11886
2018-12-15 15:30:00	14	16.279070	17.11886
2018-12-15 15:40:00	14	16.279070	17.11886
2018-12-15 15:50:00	12	13.953488	17.11886
2018-12-15 16:00:00	13	15.116279	17.11886
2018-12-15 16:10:00	12	13.953488	17.11886
2018-12-15 16:20:00	13	15.116279	17.11886
2018-12-15 16:30:00	13	15.116279	17.11886
2018-12-15 16:40:00	13	15.116279	17.11886
2018-12-15 16:50:00	13	15.116279	17.11886
2018-12-15 17:00:00	13	15.116279	17.11886
2018-12-15 17:10:00	11	12.790698	17.11886
2018-12-15 17:20:00	13	15.116279	17.11886
2018-12-15 17:30:00	12	13.953488	17.11886
2018-12-15 17:40:00	13	15.116279	17.11886
2018-12-15 17:50:00	11	12.790698	17.11886
2018-12-15 18:00:00	13	15.116279	17.11886
2018-12-15 18:10:00	12	13.953488	17.11886
2018-12-15 18:20:00	10	11.627907	17.11886
2018-12-15 18:30:00	11	12.790698	17.11886
2018-12-15 18:40:00	11	12.790698	17.11886
2018-12-15 18:50:00	10	11.627907	17.11886
2018-12-15 19:00:00	10	11.627907	17.11886
2018-12-15 19:10:00	10	11.627907	17.11886
2018-12-15 19:20:00	11	12.790698	17.11886
2018-12-15 19:30:00	7	8.139535	17.11886
2018-12-15 19:40:00	10	11.627907	17.11886
2018-12-15 19:50:00	11	12.790698	17.11886
2018-12-15 20:00:00	10	11.627907	17.11886
2018-12-15 20:10:00	13	15.116279	17.11886
2018-12-15 20:20:00	9	10.465116	17.11886
2018-12-15 20:30:00	12	13.953488	17.11886
2018-12-15 20:40:00	12	13.953488	17.11886
2018-12-15 20:50:00	10	11.627907	17.11886
2018-12-15 21:00:00	13	15.116279	17.11886
2018-12-15 21:10:00	13	15.116279	17.11886
2018-12-15 21:20:00	12	13.953488	17.11886
2018-12-15 21:30:00	13	15.116279	17.11886
2018-12-15 21:40:00	13	15.116279	17.11886
2018-12-15 21:50:00	13	15.116279	17.11886
2018-12-15 22:00:00	13	15.116279	17.11886
2018-12-15 22:10:00	14	16.279070	17.11886
2018-12-15 22:20:00	15	17.441861	17.11886
2018-12-15 22:30:00	16	18.604651	17.11886
2018-12-15 22:40:00	16	18.604651	17.11886
2018-12-15 22:50:00	16	18.604651	17.11886
2018-12-15 23:00:00	16	18.604651	17.11886
2018-12-15 23:10:00	14	16.279070	17.11886
2018-12-15 23:20:00	16	18.604651	17.11886
2018-12-15 23:30:00	16	18.604651	17.11886
2018-12-15 23:40:00	16	18.604651	17.11886
2018-12-15 23:50:00	16	18.604651	17.11886

dfCO3%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 3`), size = 1)+
  geom_text(aes(x= `Score 3`, y=0), label='Score 3:\n Average Proportion of\nNetwork Active', size = 4, vjust= -2, hjust=1.4)+
  labs(x='Proportions of active nodes',
       y='Density',
       title='Distribution of Proportions of Active Nodes')+
  xlim(0, 100)+
  plotTheme()

Constructing Score 4

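To construct Score 4, we first extract the location of every node collecting reliable CO data in each 10-minute interval. The table below shows the first 41 rows of the result, listing the latitude and longitude of the active nodes in the first three intervals of 2018-12-15 (16 nodes at 00:00, 14 at 00:10, and the first 11 listed for 00:20).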

dfCO%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> dfCO4

dfCO4%>%
  arrange(by10)%>%
  head(41)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped'))%>%
  scroll_box(height='300px')

by10	node_id	lat	lon
2018-12-15 00:00:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:00:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:00:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:00:00	001e06113107	41.75114	-87.71299
2018-12-15 00:00:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:00:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:00:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:00:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:00:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:00:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:00:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:00:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:00:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:00:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:00:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:00:00	001e06114503	41.66608	-87.53937
2018-12-15 00:10:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:10:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:10:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:10:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:10:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:10:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:10:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:10:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:10:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:10:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:10:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:10:00	001e06113107	41.75114	-87.71299
2018-12-15 00:10:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:10:00	001e06114503	41.66608	-87.53937
2018-12-15 00:20:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:20:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:20:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:20:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:20:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:20:00	001e06113107	41.75114	-87.71299
2018-12-15 00:20:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:20:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:20:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:20:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:20:00	001e0610ee43	41.78861	-87.59871

chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

dfCO4a<-NULL

for(i in unique(dfCO4$by10)){
  
subset <- 
      dfCO4%>%
      filter(by10==i)

if(nrow(subset)>1){
  subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
  subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
  P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))
    #clip using chicago
    Ps2<-gIntersection(Ps2, chig, byid=FALSE)
    Ps2AreaProp<-gArea(Ps2)/chigArea
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-Ps2AreaProp
    df1<-as.data.frame(df1)
    dfCO4a<-rbind(dfCO4a, df1)
}else{
  df1<-NULL
  df1$by10<-i
  df1$AreaProp<-0
  df1<-as.data.frame(df1)
  dfCO4a<-rbind(dfCO4a, df1)
}
  
}

dfCO4a%>%
  mutate(`Score 4`= 100*mean(AreaProp))->dfCO4a
dfCO4a%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = 'striped')%>%
  scroll_box(height = "300px")

by10	AreaProp	Score 4
2018-12-15 00:00:00	0.5639259	52.29076
2018-12-15 00:10:00	0.5639259	52.29076
2018-12-15 00:20:00	0.6010409	52.29076
2018-12-15 00:30:00	0.5639259	52.29076
2018-12-15 00:40:00	0.5639259	52.29076
2018-12-15 00:50:00	0.5639259	52.29076
2018-12-15 01:00:00	0.6010409	52.29076
2018-12-15 01:10:00	0.5639259	52.29076
2018-12-15 01:20:00	0.5639259	52.29076
2018-12-15 01:30:00	0.5639259	52.29076
2018-12-15 01:40:00	0.5639259	52.29076
2018-12-15 01:50:00	0.5639259	52.29076
2018-12-15 02:00:00	0.5639259	52.29076
2018-12-15 02:10:00	0.5639259	52.29076
2018-12-15 02:20:00	0.5639259	52.29076
2018-12-15 02:30:00	0.4172133	52.29076
2018-12-15 02:40:00	0.5639259	52.29076
2018-12-15 02:50:00	0.5639259	52.29076
2018-12-15 03:00:00	0.5639259	52.29076
2018-12-15 03:10:00	0.5639259	52.29076
2018-12-15 03:20:00	0.5639259	52.29076
2018-12-15 03:30:00	0.5639259	52.29076
2018-12-15 03:40:00	0.6010409	52.29076
2018-12-15 03:50:00	0.5639259	52.29076
2018-12-15 04:00:00	0.5639259	52.29076
2018-12-15 04:10:00	0.5639259	52.29076
2018-12-15 04:20:00	0.5639259	52.29076
2018-12-15 04:30:00	0.6010409	52.29076
2018-12-15 04:40:00	0.6010409	52.29076
2018-12-15 04:50:00	0.6010409	52.29076
2018-12-15 05:00:00	0.5639259	52.29076
2018-12-15 05:10:00	0.6010409	52.29076
2018-12-15 05:20:00	0.6010409	52.29076
2018-12-15 05:30:00	0.6010409	52.29076
2018-12-15 05:40:00	0.6010409	52.29076
2018-12-15 05:50:00	0.6010409	52.29076
2018-12-15 06:00:00	0.6010409	52.29076
2018-12-15 06:10:00	0.6010409	52.29076
2018-12-15 06:20:00	0.4496355	52.29076
2018-12-15 06:30:00	0.6010409	52.29076
2018-12-15 06:40:00	0.6010409	52.29076
2018-12-15 06:50:00	0.6010409	52.29076
2018-12-15 07:00:00	0.6010409	52.29076
2018-12-15 07:10:00	0.6010409	52.29076
2018-12-15 07:20:00	0.6010409	52.29076
2018-12-15 07:30:00	0.6010409	52.29076
2018-12-15 07:40:00	0.6010409	52.29076
2018-12-15 07:50:00	0.6010409	52.29076
2018-12-15 08:00:00	0.6010409	52.29076
2018-12-15 08:10:00	0.5639259	52.29076
2018-12-15 08:20:00	0.6010409	52.29076
2018-12-15 08:30:00	0.6010409	52.29076
2018-12-15 08:40:00	0.6010409	52.29076
2018-12-15 08:50:00	0.6010409	52.29076
2018-12-15 09:00:00	0.5639259	52.29076
2018-12-15 09:10:00	0.6010409	52.29076
2018-12-15 09:20:00	0.6010409	52.29076
2018-12-15 09:30:00	0.5639259	52.29076
2018-12-15 09:40:00	0.6010409	52.29076
2018-12-15 09:50:00	0.6010409	52.29076
2018-12-15 10:00:00	0.6010409	52.29076
2018-12-15 10:10:00	0.6010409	52.29076
2018-12-15 10:20:00	0.6010409	52.29076
2018-12-15 10:30:00	0.6010409	52.29076
2018-12-15 10:40:00	0.6010409	52.29076
2018-12-15 10:50:00	0.5639259	52.29076
2018-12-15 11:00:00	0.5639259	52.29076
2018-12-15 11:10:00	0.5639259	52.29076
2018-12-15 11:20:00	0.5639259	52.29076
2018-12-15 11:30:00	0.5639259	52.29076
2018-12-15 11:40:00	0.6010409	52.29076
2018-12-15 11:50:00	0.5639259	52.29076
2018-12-15 12:00:00	0.6010409	52.29076
2018-12-15 12:10:00	0.6010409	52.29076
2018-12-15 12:20:00	0.5639259	52.29076
2018-12-15 12:30:00	0.5639259	52.29076
2018-12-15 12:40:00	0.4114511	52.29076
2018-12-15 12:50:00	0.6010409	52.29076
2018-12-15 13:00:00	0.6010409	52.29076
2018-12-15 13:10:00	0.6010409	52.29076
2018-12-15 13:20:00	0.5639259	52.29076
2018-12-15 13:30:00	0.5639259	52.29076
2018-12-15 13:40:00	0.5639259	52.29076
2018-12-15 13:50:00	0.5639259	52.29076
2018-12-15 14:00:00	0.5639259	52.29076
2018-12-15 14:10:00	0.5639259	52.29076
2018-12-15 14:20:00	0.3790289	52.29076
2018-12-15 14:30:00	0.6010409	52.29076
2018-12-15 14:40:00	0.5639259	52.29076
2018-12-15 14:50:00	0.4172133	52.29076
2018-12-15 15:00:00	0.5639259	52.29076
2018-12-15 15:10:00	0.5639259	52.29076
2018-12-15 15:20:00	0.3790289	52.29076
2018-12-15 15:30:00	0.5639259	52.29076
2018-12-15 15:40:00	0.5639259	52.29076
2018-12-15 15:50:00	0.3790195	52.29076
2018-12-15 16:00:00	0.3790289	52.29076
2018-12-15 16:10:00	0.3790289	52.29076
2018-12-15 16:20:00	0.3790289	52.29076
2018-12-15 16:30:00	0.3790289	52.29076
2018-12-15 16:40:00	0.3790289	52.29076
2018-12-15 16:50:00	0.3790258	52.29076
2018-12-15 17:00:00	0.5639259	52.29076
2018-12-15 17:10:00	0.3790289	52.29076
2018-12-15 17:20:00	0.3790289	52.29076
2018-12-15 17:30:00	0.5639222	52.29076
2018-12-15 17:40:00	0.3790289	52.29076
2018-12-15 17:50:00	0.3790289	52.29076
2018-12-15 18:00:00	0.4730930	52.29076
2018-12-15 18:10:00	0.3790195	52.29076
2018-12-15 18:20:00	0.2992375	52.29076
2018-12-15 18:30:00	0.4730930	52.29076
2018-12-15 18:40:00	0.5639222	52.29076
2018-12-15 18:50:00	0.3790164	52.29076
2018-12-15 19:00:00	0.2640792	52.29076
2018-12-15 19:10:00	0.2992375	52.29076
2018-12-15 19:20:00	0.2992469	52.29076
2018-12-15 19:30:00	0.2525195	52.29076
2018-12-15 19:40:00	0.2574762	52.29076
2018-12-15 19:50:00	0.2992375	52.29076
2018-12-15 20:00:00	0.3790164	52.29076
2018-12-15 20:10:00	0.4172133	52.29076
2018-12-15 20:20:00	0.3790195	52.29076
2018-12-15 20:30:00	0.5639222	52.29076
2018-12-15 20:40:00	0.3790195	52.29076
2018-12-15 20:50:00	0.2679517	52.29076
2018-12-15 21:00:00	0.3790195	52.29076
2018-12-15 21:10:00	0.6010409	52.29076
2018-12-15 21:20:00	0.3790195	52.29076
2018-12-15 21:30:00	0.5639222	52.29076
2018-12-15 21:40:00	0.3790195	52.29076
2018-12-15 21:50:00	0.3790195	52.29076
2018-12-15 22:00:00	0.4114417	52.29076
2018-12-15 22:10:00	0.5639259	52.29076
2018-12-15 22:20:00	0.4114511	52.29076
2018-12-15 22:30:00	0.4496355	52.29076
2018-12-15 22:40:00	0.6010409	52.29076
2018-12-15 22:50:00	0.6010409	52.29076
2018-12-15 23:00:00	0.6010409	52.29076
2018-12-15 23:10:00	0.5639259	52.29076
2018-12-15 23:20:00	0.6010409	52.29076
2018-12-15 23:30:00	0.6010409	52.29076
2018-12-15 23:40:00	0.4496355	52.29076
2018-12-15 23:50:00	0.6010409	52.29076
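
The coverage calculation above can be illustrated with a minimal base R sketch. It is a simplification under stated assumptions, not the code used for the results: the city is a hypothetical rectangle rather than the Chicago boundary, coordinates are already projected, and the helper names `rect_overlap_area` and `coverage_prop` are invented for illustration. The logic is the same as in the loop above: take the bounding box of the active nodes, clip it to the city, divide by the city's area, and average over intervals.

```r
# Area of the overlap between two axis-aligned rectangles,
# each given as c(xmin, ymin, xmax, ymax).
rect_overlap_area <- function(a, b) {
  w <- min(a[3], b[3]) - max(a[1], b[1])
  h <- min(a[4], b[4]) - max(a[2], b[2])
  if (w <= 0 || h <= 0) 0 else w * h
}

# Coverage for one interval: bounding box of the active nodes, clipped to
# the city, as a share of the city's area. Fewer than two nodes count as
# zero coverage, mirroring the else-branch of the loop above.
coverage_prop <- function(x, y, city) {
  if (length(x) < 2) return(0)
  bbox <- c(min(x), min(y), max(x), max(y))
  city_area <- (city[3] - city[1]) * (city[4] - city[2])
  rect_overlap_area(bbox, city) / city_area
}

# Hypothetical rectangular city in projected units (assumption).
city <- c(0, 0, 10, 20)

# Two toy intervals: three spread-out nodes, then two nearby nodes.
intervals <- list(
  list(x = c(1, 9, 5), y = c(2, 18, 10)),
  list(x = c(4, 6),    y = c(9, 11))
)

area_prop <- sapply(intervals, function(s) coverage_prop(s$x, s$y, city))

# Score 4 is the mean coverage over all intervals, in percent.
score4 <- 100 * mean(area_prop)
print(area_prop)
print(score4)
```

In the toy data, three widely spaced nodes already cover 64% of the city, while two nearby nodes cover 2%. This is why Score 4 can be much higher than Score 3 when only a few, well-separated nodes are active.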

We can also plot the area covered relative to the whole of Chicago for a visual check. The 4 plots below are obtained for 00:00 (top left), 07:00 (top right), 13:00 (bottom left), and 19:00 (bottom right). In the case of the CO concentration network on this day, the area covered was not constant: the table above shows it ranging from about 25% of Chicago's area in the evening to about 60% in the morning, so the plots differ between times of day.

p<-list()

by10<-as.data.frame(unique(dfCO4$by10))
by10$no<-row_number(by10)
colnames(by10)<-c('by10', 'no')

dfCO4b<-merge(dfCO4, by10, by='by10', all.x=TRUE)

for(i in unique(dfCO4b$by10)){

subset <- 
      dfCO4b%>%
      filter(by10==i)
    
subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    
P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    
Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

#clip using chicago
Ps2<-gIntersection(Ps2, chig, byid=FALSE)

j<-unique(subset$no)

p[[j]]<-spplot(chig, colorkey=FALSE, col.regions='red', 
       sp.layout=list(list(Ps2, fill='blue', first=FALSE)))
}

library(gridExtra)
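# With all 144 intervals present in order, positions 1, 43, 79 and 115
# correspond to 00:00, 07:00, 13:00 and 19:00
grid.arrange(p[[1]], p[[43]], p[[79]], p[[115]])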
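
The panels are picked from the list `p` by position, so it helps to check which position a given clock time occupies. The sketch below assumes that the day contains all 144 ten-minute intervals and that the plots are stored in chronological order; `interval_index` is a hypothetical helper, not part of the original code.

```r
# Position of a clock time ("HH:MM") within a full day of 10-minute
# intervals, assuming all 144 intervals are present and in order.
interval_index <- function(hhmm) {
  parts <- as.integer(strsplit(hhmm, ":", fixed = TRUE)[[1]])
  parts[1] * 6 + parts[2] %/% 10 + 1
}

idx <- sapply(c("00:00", "07:00", "13:00", "19:00"), interval_index)
print(idx)
```

Under these assumptions, 00:00, 07:00, 13:00 and 19:00 sit at positions 1, 43, 79 and 115. If some intervals are missing, looking plots up by timestamp rather than by position would be safer.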

In summary, for the AoT CO concentration network on 2018-12-15,

Score 3 = 17.1 : In any given 10-minute interval, an average of 17.1% of the nodes in the network are collecting reliable data. This is a low score.
Score 4 = 52.3 : In any given 10-minute interval, the bounding box of the nodes collecting reliable data covers an average of 52.3% of Chicago’s area. Read alongside Score 3, this indicates that the few active nodes are widely dispersed across Chicago.

H2S Concentration

Constructing Score 3

Before we calculate the relevant proportions, it is useful to observe how the absolute number of nodes collecting reliable data in the network varies across the 10-minute intervals of the day. The figure below shows how the number of nodes collecting reliable H2S concentration data in the AoT network varies over the course of the day.

dfH2S%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))->dfH2S3

ggplot(data=dfH2S3, aes(x=by10, y=count))+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  geom_col(fill='indianred', col=NA)+
  ylim(0, 86)+
  geom_hline(yintercept=86, col='black', size=1)+
  geom_hline(yintercept=c(1:85), col='white')+
  geom_vline(aes(xintercept=as.numeric(by10)), col='white', size=2)+
  geom_hline(aes(yintercept=mean(count)), col='black', size=1)+
  geom_text(aes(y=86, x=0), label='Full network size: 86 nodes', size=4, hjust=-1, vjust=-1)+
  # paste() separates its arguments with a space by default, giving 'Average network size: N nodes'
  geom_text(aes(y=mean(count), x=0), label=paste('Average network size:', round(mean(dfH2S3$count)), 'nodes'), size=4, hjust=-1, vjust=-1)+
  labs(x='Time', y='Number of Active Nodes', 
       title='Number of nodes collecting reliable H2S concentration data throughout the day',
       subtitle='Each x-axis tick represents a 10-minute time interval')+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

The table below shows the number of nodes collecting reliable H2S concentration data at each time interval during the day, the proportion of these active nodes in relation to the full network of 86 nodes, and the average proportion during the full day. This average proportion is Score 3.

dfH2S3%>%
  arrange(by10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

by10	count	propActive	Score 3
2018-12-15 00:00:00	15	17.44186	19.80782
2018-12-15 00:10:00	16	18.60465	19.80782
2018-12-15 00:20:00	15	17.44186	19.80782
2018-12-15 00:30:00	15	17.44186	19.80782
2018-12-15 00:40:00	15	17.44186	19.80782
2018-12-15 00:50:00	16	18.60465	19.80782
2018-12-15 01:00:00	16	18.60465	19.80782
2018-12-15 01:10:00	15	17.44186	19.80782
2018-12-15 01:20:00	17	19.76744	19.80782
2018-12-15 01:30:00	16	18.60465	19.80782
2018-12-15 01:40:00	17	19.76744	19.80782
2018-12-15 01:50:00	17	19.76744	19.80782
2018-12-15 02:00:00	17	19.76744	19.80782
2018-12-15 02:10:00	17	19.76744	19.80782
2018-12-15 02:20:00	17	19.76744	19.80782
2018-12-15 02:30:00	16	18.60465	19.80782
2018-12-15 02:40:00	17	19.76744	19.80782
2018-12-15 02:50:00	16	18.60465	19.80782
2018-12-15 03:00:00	17	19.76744	19.80782
2018-12-15 03:10:00	16	18.60465	19.80782
2018-12-15 03:20:00	17	19.76744	19.80782
2018-12-15 03:30:00	17	19.76744	19.80782
2018-12-15 03:40:00	17	19.76744	19.80782
2018-12-15 03:50:00	17	19.76744	19.80782
2018-12-15 04:00:00	17	19.76744	19.80782
2018-12-15 04:10:00	17	19.76744	19.80782
2018-12-15 04:20:00	16	18.60465	19.80782
2018-12-15 04:30:00	16	18.60465	19.80782
2018-12-15 04:40:00	17	19.76744	19.80782
2018-12-15 04:50:00	17	19.76744	19.80782
2018-12-15 05:00:00	17	19.76744	19.80782
2018-12-15 05:10:00	16	18.60465	19.80782
2018-12-15 05:20:00	16	18.60465	19.80782
2018-12-15 05:30:00	17	19.76744	19.80782
2018-12-15 05:40:00	17	19.76744	19.80782
2018-12-15 05:50:00	17	19.76744	19.80782
2018-12-15 06:00:00	17	19.76744	19.80782
2018-12-15 06:10:00	17	19.76744	19.80782
2018-12-15 06:20:00	17	19.76744	19.80782
2018-12-15 06:30:00	17	19.76744	19.80782
2018-12-15 06:40:00	17	19.76744	19.80782
2018-12-15 06:50:00	16	18.60465	19.80782
2018-12-15 07:00:00	16	18.60465	19.80782
2018-12-15 07:10:00	17	19.76744	19.80782
2018-12-15 07:20:00	17	19.76744	19.80782
2018-12-15 07:30:00	17	19.76744	19.80782
2018-12-15 07:40:00	17	19.76744	19.80782
2018-12-15 07:50:00	16	18.60465	19.80782
2018-12-15 08:00:00	17	19.76744	19.80782
2018-12-15 08:10:00	17	19.76744	19.80782
2018-12-15 08:20:00	17	19.76744	19.80782
2018-12-15 08:30:00	16	18.60465	19.80782
2018-12-15 08:40:00	17	19.76744	19.80782
2018-12-15 08:50:00	17	19.76744	19.80782
2018-12-15 09:00:00	18	20.93023	19.80782
2018-12-15 09:10:00	18	20.93023	19.80782
2018-12-15 09:20:00	18	20.93023	19.80782
2018-12-15 09:30:00	18	20.93023	19.80782
2018-12-15 09:40:00	18	20.93023	19.80782
2018-12-15 09:50:00	17	19.76744	19.80782
2018-12-15 10:00:00	17	19.76744	19.80782
2018-12-15 10:10:00	17	19.76744	19.80782
2018-12-15 10:20:00	17	19.76744	19.80782
2018-12-15 10:30:00	17	19.76744	19.80782
2018-12-15 10:40:00	17	19.76744	19.80782
2018-12-15 10:50:00	18	20.93023	19.80782
2018-12-15 11:00:00	17	19.76744	19.80782
2018-12-15 11:10:00	18	20.93023	19.80782
2018-12-15 11:20:00	17	19.76744	19.80782
2018-12-15 11:30:00	17	19.76744	19.80782
2018-12-15 11:40:00	17	19.76744	19.80782
2018-12-15 11:50:00	17	19.76744	19.80782
2018-12-15 12:00:00	17	19.76744	19.80782
2018-12-15 12:10:00	18	20.93023	19.80782
2018-12-15 12:20:00	18	20.93023	19.80782
2018-12-15 12:30:00	18	20.93023	19.80782
2018-12-15 12:40:00	17	19.76744	19.80782
2018-12-15 12:50:00	17	19.76744	19.80782
2018-12-15 13:00:00	17	19.76744	19.80782
2018-12-15 13:10:00	17	19.76744	19.80782
2018-12-15 13:20:00	18	20.93023	19.80782
2018-12-15 13:30:00	18	20.93023	19.80782
2018-12-15 13:40:00	18	20.93023	19.80782
2018-12-15 13:50:00	17	19.76744	19.80782
2018-12-15 14:00:00	17	19.76744	19.80782
2018-12-15 14:10:00	18	20.93023	19.80782
2018-12-15 14:20:00	17	19.76744	19.80782
2018-12-15 14:30:00	18	20.93023	19.80782
2018-12-15 14:40:00	17	19.76744	19.80782
2018-12-15 14:50:00	17	19.76744	19.80782
2018-12-15 15:00:00	17	19.76744	19.80782
2018-12-15 15:10:00	16	18.60465	19.80782
2018-12-15 15:20:00	17	19.76744	19.80782
2018-12-15 15:30:00	16	18.60465	19.80782
2018-12-15 15:40:00	16	18.60465	19.80782
2018-12-15 15:50:00	17	19.76744	19.80782
2018-12-15 16:00:00	16	18.60465	19.80782
2018-12-15 16:10:00	17	19.76744	19.80782
2018-12-15 16:20:00	16	18.60465	19.80782
2018-12-15 16:30:00	17	19.76744	19.80782
2018-12-15 16:40:00	16	18.60465	19.80782
2018-12-15 16:50:00	17	19.76744	19.80782
2018-12-15 17:00:00	17	19.76744	19.80782
2018-12-15 17:10:00	17	19.76744	19.80782
2018-12-15 17:20:00	17	19.76744	19.80782
2018-12-15 17:30:00	17	19.76744	19.80782
2018-12-15 17:40:00	17	19.76744	19.80782
2018-12-15 17:50:00	17	19.76744	19.80782
2018-12-15 18:00:00	17	19.76744	19.80782
2018-12-15 18:10:00	18	20.93023	19.80782
2018-12-15 18:20:00	17	19.76744	19.80782
2018-12-15 18:30:00	17	19.76744	19.80782
2018-12-15 18:40:00	17	19.76744	19.80782
2018-12-15 18:50:00	17	19.76744	19.80782
2018-12-15 19:00:00	17	19.76744	19.80782
2018-12-15 19:10:00	17	19.76744	19.80782
2018-12-15 19:20:00	17	19.76744	19.80782
2018-12-15 19:30:00	18	20.93023	19.80782
2018-12-15 19:40:00	17	19.76744	19.80782
2018-12-15 19:50:00	17	19.76744	19.80782
2018-12-15 20:00:00	18	20.93023	19.80782
2018-12-15 20:10:00	17	19.76744	19.80782
2018-12-15 20:20:00	17	19.76744	19.80782
2018-12-15 20:30:00	18	20.93023	19.80782
2018-12-15 20:40:00	17	19.76744	19.80782
2018-12-15 20:50:00	18	20.93023	19.80782
2018-12-15 21:00:00	18	20.93023	19.80782
2018-12-15 21:10:00	18	20.93023	19.80782
2018-12-15 21:20:00	18	20.93023	19.80782
2018-12-15 21:30:00	18	20.93023	19.80782
2018-12-15 21:40:00	18	20.93023	19.80782
2018-12-15 21:50:00	18	20.93023	19.80782
2018-12-15 22:00:00	18	20.93023	19.80782
2018-12-15 22:10:00	18	20.93023	19.80782
2018-12-15 22:20:00	18	20.93023	19.80782
2018-12-15 22:30:00	18	20.93023	19.80782
2018-12-15 22:40:00	17	19.76744	19.80782
2018-12-15 22:50:00	18	20.93023	19.80782
2018-12-15 23:00:00	18	20.93023	19.80782
2018-12-15 23:10:00	18	20.93023	19.80782
2018-12-15 23:20:00	18	20.93023	19.80782
2018-12-15 23:30:00	18	20.93023	19.80782
2018-12-15 23:40:00	17	19.76744	19.80782
2018-12-15 23:50:00	18	20.93023	19.80782
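
The arithmetic behind the table can be sketched in a few lines of base R. The counts below are toy values rather than the AoT data, and the function name `score3` is a hypothetical helper; only the network size of 86 and the formula come from the text above.

```r
# Score 3: mean share of the network (in percent) that is collecting
# reliable data, averaged over the 10-minute intervals supplied.
score3 <- function(counts, network_size) {
  prop_active <- counts * 100 / network_size
  mean(prop_active)
}

# Toy counts of active nodes in six intervals (illustrative values only).
counts <- c(15, 16, 15, 17, 18, 17)
network_size <- 86  # full network size used in the text

prop_active <- counts * 100 / network_size
print(round(prop_active, 2))
print(round(score3(counts, network_size), 2))
```

One point to keep in mind: in the pipeline above, an interval with no reliable readings produces no row after `filter()` and `summarise()`, so it would not enter the mean as a zero unless it is added back explicitly.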

dfH2S3%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 3`), size = 1)+
  geom_text(aes(x= `Score 3`, y=0), label='Score 3:\n Average Proportion of\nNetwork Active', size = 4, vjust= -2, hjust=1.4)+
  labs(x='Proportions of active nodes',
       y='Density',
       title='Distribution of Proportions of Active Nodes')+
  xlim(0, 100)+
  plotTheme()

Constructing Score 4

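To construct Score 4, we first extract the location of every node collecting reliable H2S data in each 10-minute interval. The table below shows the first 41 rows of the result, listing the latitude and longitude of the active nodes in the first three intervals of 2018-12-15 (15 nodes at 00:00, 16 at 00:10, and the first 10 listed for 00:20).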

dfH2S%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> dfH2S4

dfH2S4%>%
  arrange(by10)%>%
  head(41)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped'))%>%
  scroll_box(height='300px')

by10	node_id	lat	lon
2018-12-15 00:00:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:00:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:00:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:00:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:00:00	001e06114503	41.66608	-87.53937
2018-12-15 00:00:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:00:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:00:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:00:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:00:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:00:00	001e06113107	41.75114	-87.71299
2018-12-15 00:00:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:00:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:00:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:00:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:10:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:10:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:10:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:10:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:10:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:10:00	001e06114503	41.66608	-87.53937
2018-12-15 00:10:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:10:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:10:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:10:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:10:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:10:00	001e06113107	41.75114	-87.71299
2018-12-15 00:10:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:10:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:10:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:10:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:20:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:20:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:20:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:20:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:20:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:20:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:20:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:20:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:20:00	001e06114503	41.66608	-87.53937
2018-12-15 00:20:00	001e06113107	41.75114	-87.71299

chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

dfH2S4a<-NULL

for(i in unique(dfH2S4$by10)){
  
subset <- 
      dfH2S4%>%
      filter(by10==i)

if(nrow(subset)>1){
  subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
  subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
  P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))
    #clip using chicago
    Ps2<-gIntersection(Ps2, chig, byid=FALSE)
    Ps2AreaProp<-gArea(Ps2)/chigArea
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-Ps2AreaProp
    df1<-as.data.frame(df1)
    dfH2S4a<-rbind(dfH2S4a, df1)
}else{
  df1<-NULL
  df1$by10<-i
  df1$AreaProp<-0
  df1<-as.data.frame(df1)
  dfH2S4a<-rbind(dfH2S4a, df1)
}
  
}

dfH2S4a%>%
  mutate(`Score 4`= 100*mean(AreaProp))->dfH2S4a
dfH2S4a%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = 'striped')%>%
  scroll_box(height = "300px")

by10	AreaProp	Score 4
2018-12-15 00:00:00	0.5639259	57.91328
2018-12-15 00:10:00	0.5639259	57.91328
2018-12-15 00:20:00	0.5639259	57.91328
2018-12-15 00:30:00	0.5639259	57.91328
2018-12-15 00:40:00	0.5639259	57.91328
2018-12-15 00:50:00	0.5639259	57.91328
2018-12-15 01:00:00	0.5639259	57.91328
2018-12-15 01:10:00	0.5639259	57.91328
2018-12-15 01:20:00	0.5639259	57.91328
2018-12-15 01:30:00	0.5639259	57.91328
2018-12-15 01:40:00	0.5639259	57.91328
2018-12-15 01:50:00	0.5639259	57.91328
2018-12-15 02:00:00	0.5639259	57.91328
2018-12-15 02:10:00	0.5639259	57.91328
2018-12-15 02:20:00	0.5639259	57.91328
2018-12-15 02:30:00	0.5639259	57.91328
2018-12-15 02:40:00	0.5639259	57.91328
2018-12-15 02:50:00	0.5639259	57.91328
2018-12-15 03:00:00	0.5639259	57.91328
2018-12-15 03:10:00	0.5639259	57.91328
2018-12-15 03:20:00	0.5639259	57.91328
2018-12-15 03:30:00	0.5639259	57.91328
2018-12-15 03:40:00	0.5639259	57.91328
2018-12-15 03:50:00	0.5639259	57.91328
2018-12-15 04:00:00	0.5639259	57.91328
2018-12-15 04:10:00	0.5639259	57.91328
2018-12-15 04:20:00	0.5639259	57.91328
2018-12-15 04:30:00	0.5639259	57.91328
2018-12-15 04:40:00	0.5639259	57.91328
2018-12-15 04:50:00	0.5639259	57.91328
2018-12-15 05:00:00	0.5639259	57.91328
2018-12-15 05:10:00	0.5639259	57.91328
2018-12-15 05:20:00	0.5639259	57.91328
2018-12-15 05:30:00	0.5639259	57.91328
2018-12-15 05:40:00	0.5639259	57.91328
2018-12-15 05:50:00	0.5639259	57.91328
2018-12-15 06:00:00	0.5639259	57.91328
2018-12-15 06:10:00	0.5639259	57.91328
2018-12-15 06:20:00	0.5639259	57.91328
2018-12-15 06:30:00	0.5639259	57.91328
2018-12-15 06:40:00	0.5639259	57.91328
2018-12-15 06:50:00	0.5639259	57.91328
2018-12-15 07:00:00	0.5639259	57.91328
2018-12-15 07:10:00	0.5639259	57.91328
2018-12-15 07:20:00	0.5639259	57.91328
2018-12-15 07:30:00	0.5639259	57.91328
2018-12-15 07:40:00	0.5639259	57.91328
2018-12-15 07:50:00	0.5639259	57.91328
2018-12-15 08:00:00	0.5639259	57.91328
2018-12-15 08:10:00	0.5639259	57.91328
2018-12-15 08:20:00	0.5639259	57.91328
2018-12-15 08:30:00	0.5639259	57.91328
2018-12-15 08:40:00	0.5639259	57.91328
2018-12-15 08:50:00	0.5639259	57.91328
2018-12-15 09:00:00	0.6010409	57.91328
2018-12-15 09:10:00	0.6010409	57.91328
2018-12-15 09:20:00	0.6010409	57.91328
2018-12-15 09:30:00	0.6010409	57.91328
2018-12-15 09:40:00	0.6010409	57.91328
2018-12-15 09:50:00	0.6010409	57.91328
2018-12-15 10:00:00	0.6010409	57.91328
2018-12-15 10:10:00	0.6010409	57.91328
2018-12-15 10:20:00	0.6010409	57.91328
2018-12-15 10:30:00	0.6010409	57.91328
2018-12-15 10:40:00	0.6010409	57.91328
2018-12-15 10:50:00	0.6010409	57.91328
2018-12-15 11:00:00	0.6010409	57.91328
2018-12-15 11:10:00	0.6010409	57.91328
2018-12-15 11:20:00	0.6010409	57.91328
2018-12-15 11:30:00	0.6010409	57.91328
2018-12-15 11:40:00	0.6010409	57.91328
2018-12-15 11:50:00	0.6010409	57.91328
2018-12-15 12:00:00	0.6010409	57.91328
2018-12-15 12:10:00	0.6010409	57.91328
2018-12-15 12:20:00	0.6010409	57.91328
2018-12-15 12:30:00	0.6010409	57.91328
2018-12-15 12:40:00	0.6010409	57.91328
2018-12-15 12:50:00	0.5639259	57.91328
2018-12-15 13:00:00	0.5639259	57.91328
2018-12-15 13:10:00	0.5639259	57.91328
2018-12-15 13:20:00	0.6010409	57.91328
2018-12-15 13:30:00	0.6010409	57.91328
2018-12-15 13:40:00	0.6010409	57.91328
2018-12-15 13:50:00	0.6010409	57.91328
2018-12-15 14:00:00	0.6010409	57.91328
2018-12-15 14:10:00	0.6010409	57.91328
2018-12-15 14:20:00	0.5639259	57.91328
2018-12-15 14:30:00	0.6010409	57.91328
2018-12-15 14:40:00	0.6010409	57.91328
2018-12-15 14:50:00	0.5639259	57.91328
2018-12-15 15:00:00	0.6010409	57.91328
2018-12-15 15:10:00	0.5639259	57.91328
2018-12-15 15:20:00	0.6010409	57.91328
2018-12-15 15:30:00	0.5639259	57.91328
2018-12-15 15:40:00	0.5639259	57.91328
2018-12-15 15:50:00	0.6010409	57.91328
2018-12-15 16:00:00	0.5639259	57.91328
2018-12-15 16:10:00	0.6010409	57.91328
2018-12-15 16:20:00	0.5639259	57.91328
2018-12-15 16:30:00	0.6010409	57.91328
2018-12-15 16:40:00	0.5639259	57.91328
2018-12-15 16:50:00	0.5639259	57.91328
2018-12-15 17:00:00	0.5639259	57.91328
2018-12-15 17:10:00	0.5639259	57.91328
2018-12-15 17:20:00	0.5639259	57.91328
2018-12-15 17:30:00	0.5639259	57.91328
2018-12-15 17:40:00	0.5639259	57.91328
2018-12-15 17:50:00	0.5639259	57.91328
2018-12-15 18:00:00	0.5639259	57.91328
2018-12-15 18:10:00	0.6010409	57.91328
2018-12-15 18:20:00	0.5639259	57.91328
2018-12-15 18:30:00	0.5639259	57.91328
2018-12-15 18:40:00	0.5639259	57.91328
2018-12-15 18:50:00	0.5639259	57.91328
2018-12-15 19:00:00	0.5639259	57.91328
2018-12-15 19:10:00	0.5639259	57.91328
2018-12-15 19:20:00	0.5639259	57.91328
2018-12-15 19:30:00	0.6010409	57.91328
2018-12-15 19:40:00	0.5639259	57.91328
2018-12-15 19:50:00	0.5639259	57.91328
2018-12-15 20:00:00	0.6010409	57.91328
2018-12-15 20:10:00	0.5639259	57.91328
2018-12-15 20:20:00	0.5639259	57.91328
2018-12-15 20:30:00	0.6010409	57.91328
2018-12-15 20:40:00	0.5639259	57.91328
2018-12-15 20:50:00	0.6010409	57.91328
2018-12-15 21:00:00	0.6010409	57.91328
2018-12-15 21:10:00	0.6010409	57.91328
2018-12-15 21:20:00	0.6010409	57.91328
2018-12-15 21:30:00	0.6010409	57.91328
2018-12-15 21:40:00	0.6010409	57.91328
2018-12-15 21:50:00	0.6010409	57.91328
2018-12-15 22:00:00	0.6010409	57.91328
2018-12-15 22:10:00	0.6010409	57.91328
2018-12-15 22:20:00	0.6010409	57.91328
2018-12-15 22:30:00	0.6010409	57.91328
2018-12-15 22:40:00	0.6010409	57.91328
2018-12-15 22:50:00	0.6010409	57.91328
2018-12-15 23:00:00	0.6010409	57.91328
2018-12-15 23:10:00	0.6010409	57.91328
2018-12-15 23:20:00	0.6010409	57.91328
2018-12-15 23:30:00	0.6010409	57.91328
2018-12-15 23:40:00	0.6010409	57.91328
2018-12-15 23:50:00	0.6010409	57.91328

We can also plot the area covered relative to the whole of Chicago for visual inspection. The four plots below are for 00:00 (top left), 07:00 (top right), 13:00 (bottom left), and 19:00 (bottom right). For the H2S concentration network on this day, the area covered varied only slightly (AreaProp in the table above moves between about 0.564 and 0.601), so the four plots look nearly identical.

p<-list()

by10<-as.data.frame(unique(dfH2S4$by10))
by10$no<-row_number(by10)
colnames(by10)<-c('by10', 'no')

dfH2S4b<-merge(dfH2S4, by10, by='by10', all.x=TRUE)

for(i in unique(dfH2S4b$by10)){

subset <- 
      dfH2S4b%>%
      filter(by10==i)
    
subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    
P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    
Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

#clip using chicago
Ps2<-gIntersection(Ps2, chig, byid=FALSE)

j<-unique(subset$no)

p[[j]]<-spplot(chig, colorkey=FALSE, col.regions='red', 
       sp.layout=list(list(Ps2, fill='blue', first=FALSE)))
}

library(gridExtra)
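# p holds one plot per 10-minute interval, so 00:00, 07:00, 13:00 and 19:00
# are the 1st, 43rd, 79th and 115th entries (assuming no interval is missing)
grid.arrange(p[[1]], p[[43]], p[[79]], p[[115]])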
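Because the list p holds one plot per 10-minute interval in chronological order, the position of a given clock time follows from simple arithmetic. The snippet below is a minimal sketch rather than part of the original workflow: the helper name interval_index is hypothetical, and it assumes the day starts at 00:00 and that all 144 intervals are present.

```r
# Hypothetical helper: position of a clock time among the day's 10-minute
# intervals, assuming the day starts at 00:00 and no interval is missing.
interval_index <- function(hhmm) {
  parts <- strsplit(hhmm, ":", fixed = TRUE)
  hours <- as.integer(vapply(parts, `[`, character(1), 1))
  minutes <- as.integer(vapply(parts, `[`, character(1), 2))
  # Six intervals per hour, plus the offset within the hour, 1-based
  hours * 6L + minutes %/% 10L + 1L
}

interval_index(c("00:00", "07:00", "13:00", "19:00"))  # 1 43 79 115
```

For example, 07:00 is the 43rd interval of the day, whereas the 7th interval is 01:00.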

In summary, for the AoT H2S concentration network on 2018-12-15:

Score 3 = 19.8 : Averaged over the day’s 10-minute time intervals, 19.8% of the nodes in the network were collecting reliable data. This is a low score.
Score 4 = 57.9 : Averaged over the day’s 10-minute time intervals, the nodes collecting reliable data spanned 57.9% of Chicago’s area. Read together with Score 3, this indicates that the few active nodes are widely dispersed across Chicago.

NO2 Concentration

Constructing Score 3

Before we calculate the relevant proportions, it is useful to observe how the absolute number of nodes collecting reliable data varies across the day’s time intervals. The figure below shows how the number of nodes collecting reliable NO2 concentration data in the AoT network varies over the course of the day.

dfNO2%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))->dfNO23

ggplot(data=dfNO23, aes(x=by10, y=count))+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  geom_col(fill='indianred', col=NA)+
  ylim(0, 86)+
  geom_hline(yintercept=86, col='black', size=1)+
  geom_hline(yintercept=c(1:85), col='white')+
  geom_vline(aes(xintercept=as.numeric(by10)), col='white', size=2)+
  geom_hline(aes(yintercept=mean(count)), col='black', size=1)+
  geom_text(aes(y=86, x=0), label='Full network size: 86 nodes', size=4, hjust=-1, vjust=-1)+
  geom_text(aes(y=mean(count), x=0), label=paste('Average network size:', round(mean(dfNO23$count)), 'nodes'), size=4, hjust=-1, vjust=-1)+
  labs(x='Time', y='Number of Active Nodes', 
       title='Number of nodes collecting reliable NO2 concentration data throughout the day',
       subtitle='Each x-axis tick represents a 10-minute time interval')+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

The table below shows the number of nodes collecting reliable NO2 concentration data at each time interval during the day, the proportion of these active nodes in relation to the full network of 86 nodes, and the average proportion during the full day. This average proportion is Score 3.

dfNO23%>%
  arrange(by10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

by10	count	propActive	Score 3
2018-12-15 00:00:00	16	18.60465	20.00162
2018-12-15 00:10:00	16	18.60465	20.00162
2018-12-15 00:20:00	17	19.76744	20.00162
2018-12-15 00:30:00	16	18.60465	20.00162
2018-12-15 00:40:00	16	18.60465	20.00162
2018-12-15 00:50:00	16	18.60465	20.00162
2018-12-15 01:00:00	16	18.60465	20.00162
2018-12-15 01:10:00	17	19.76744	20.00162
2018-12-15 01:20:00	16	18.60465	20.00162
2018-12-15 01:30:00	17	19.76744	20.00162
2018-12-15 01:40:00	16	18.60465	20.00162
2018-12-15 01:50:00	17	19.76744	20.00162
2018-12-15 02:00:00	17	19.76744	20.00162
2018-12-15 02:10:00	17	19.76744	20.00162
2018-12-15 02:20:00	16	18.60465	20.00162
2018-12-15 02:30:00	16	18.60465	20.00162
2018-12-15 02:40:00	17	19.76744	20.00162
2018-12-15 02:50:00	17	19.76744	20.00162
2018-12-15 03:00:00	16	18.60465	20.00162
2018-12-15 03:10:00	18	20.93023	20.00162
2018-12-15 03:20:00	18	20.93023	20.00162
2018-12-15 03:30:00	17	19.76744	20.00162
2018-12-15 03:40:00	17	19.76744	20.00162
2018-12-15 03:50:00	18	20.93023	20.00162
2018-12-15 04:00:00	17	19.76744	20.00162
2018-12-15 04:10:00	18	20.93023	20.00162
2018-12-15 04:20:00	16	18.60465	20.00162
2018-12-15 04:30:00	18	20.93023	20.00162
2018-12-15 04:40:00	18	20.93023	20.00162
2018-12-15 04:50:00	18	20.93023	20.00162
2018-12-15 05:00:00	17	19.76744	20.00162
2018-12-15 05:10:00	17	19.76744	20.00162
2018-12-15 05:20:00	18	20.93023	20.00162
2018-12-15 05:30:00	18	20.93023	20.00162
2018-12-15 05:40:00	18	20.93023	20.00162
2018-12-15 05:50:00	18	20.93023	20.00162
2018-12-15 06:00:00	17	19.76744	20.00162
2018-12-15 06:10:00	18	20.93023	20.00162
2018-12-15 06:20:00	18	20.93023	20.00162
2018-12-15 06:30:00	17	19.76744	20.00162
2018-12-15 06:40:00	18	20.93023	20.00162
2018-12-15 06:50:00	17	19.76744	20.00162
2018-12-15 07:00:00	18	20.93023	20.00162
2018-12-15 07:10:00	17	19.76744	20.00162
2018-12-15 07:20:00	18	20.93023	20.00162
2018-12-15 07:30:00	17	19.76744	20.00162
2018-12-15 07:40:00	17	19.76744	20.00162
2018-12-15 07:50:00	17	19.76744	20.00162
2018-12-15 08:00:00	17	19.76744	20.00162
2018-12-15 08:10:00	18	20.93023	20.00162
2018-12-15 08:20:00	17	19.76744	20.00162
2018-12-15 08:30:00	18	20.93023	20.00162
2018-12-15 08:40:00	18	20.93023	20.00162
2018-12-15 08:50:00	16	18.60465	20.00162
2018-12-15 09:00:00	16	18.60465	20.00162
2018-12-15 09:10:00	16	18.60465	20.00162
2018-12-15 09:20:00	16	18.60465	20.00162
2018-12-15 09:30:00	17	19.76744	20.00162
2018-12-15 09:40:00	16	18.60465	20.00162
2018-12-15 09:50:00	16	18.60465	20.00162
2018-12-15 10:00:00	16	18.60465	20.00162
2018-12-15 10:10:00	17	19.76744	20.00162
2018-12-15 10:20:00	16	18.60465	20.00162
2018-12-15 10:30:00	17	19.76744	20.00162
2018-12-15 10:40:00	16	18.60465	20.00162
2018-12-15 10:50:00	17	19.76744	20.00162
2018-12-15 11:00:00	16	18.60465	20.00162
2018-12-15 11:10:00	17	19.76744	20.00162
2018-12-15 11:20:00	16	18.60465	20.00162
2018-12-15 11:30:00	16	18.60465	20.00162
2018-12-15 11:40:00	18	20.93023	20.00162
2018-12-15 11:50:00	17	19.76744	20.00162
2018-12-15 12:00:00	18	20.93023	20.00162
2018-12-15 12:10:00	18	20.93023	20.00162
2018-12-15 12:20:00	18	20.93023	20.00162
2018-12-15 12:30:00	17	19.76744	20.00162
2018-12-15 12:40:00	17	19.76744	20.00162
2018-12-15 12:50:00	17	19.76744	20.00162
2018-12-15 13:00:00	17	19.76744	20.00162
2018-12-15 13:10:00	17	19.76744	20.00162
2018-12-15 13:20:00	17	19.76744	20.00162
2018-12-15 13:30:00	18	20.93023	20.00162
2018-12-15 13:40:00	17	19.76744	20.00162
2018-12-15 13:50:00	18	20.93023	20.00162
2018-12-15 14:00:00	18	20.93023	20.00162
2018-12-15 14:10:00	18	20.93023	20.00162
2018-12-15 14:20:00	18	20.93023	20.00162
2018-12-15 14:30:00	17	19.76744	20.00162
2018-12-15 14:40:00	17	19.76744	20.00162
2018-12-15 14:50:00	18	20.93023	20.00162
2018-12-15 15:00:00	17	19.76744	20.00162
2018-12-15 15:10:00	17	19.76744	20.00162
2018-12-15 15:20:00	17	19.76744	20.00162
2018-12-15 15:30:00	17	19.76744	20.00162
2018-12-15 15:40:00	16	18.60465	20.00162
2018-12-15 15:50:00	16	18.60465	20.00162
2018-12-15 16:00:00	16	18.60465	20.00162
2018-12-15 16:10:00	17	19.76744	20.00162
2018-12-15 16:20:00	17	19.76744	20.00162
2018-12-15 16:30:00	17	19.76744	20.00162
2018-12-15 16:40:00	17	19.76744	20.00162
2018-12-15 16:50:00	18	20.93023	20.00162
2018-12-15 17:00:00	18	20.93023	20.00162
2018-12-15 17:10:00	18	20.93023	20.00162
2018-12-15 17:20:00	18	20.93023	20.00162
2018-12-15 17:30:00	18	20.93023	20.00162
2018-12-15 17:40:00	18	20.93023	20.00162
2018-12-15 17:50:00	18	20.93023	20.00162
2018-12-15 18:00:00	18	20.93023	20.00162
2018-12-15 18:10:00	18	20.93023	20.00162
2018-12-15 18:20:00	18	20.93023	20.00162
2018-12-15 18:30:00	17	19.76744	20.00162
2018-12-15 18:40:00	18	20.93023	20.00162
2018-12-15 18:50:00	18	20.93023	20.00162
2018-12-15 19:00:00	18	20.93023	20.00162
2018-12-15 19:10:00	17	19.76744	20.00162
2018-12-15 19:20:00	18	20.93023	20.00162
2018-12-15 19:30:00	18	20.93023	20.00162
2018-12-15 19:40:00	17	19.76744	20.00162
2018-12-15 19:50:00	18	20.93023	20.00162
2018-12-15 20:00:00	18	20.93023	20.00162
2018-12-15 20:10:00	18	20.93023	20.00162
2018-12-15 20:20:00	17	19.76744	20.00162
2018-12-15 20:30:00	17	19.76744	20.00162
2018-12-15 20:40:00	18	20.93023	20.00162
2018-12-15 20:50:00	18	20.93023	20.00162
2018-12-15 21:00:00	18	20.93023	20.00162
2018-12-15 21:10:00	18	20.93023	20.00162
2018-12-15 21:20:00	18	20.93023	20.00162
2018-12-15 21:30:00	18	20.93023	20.00162
2018-12-15 21:40:00	17	19.76744	20.00162
2018-12-15 21:50:00	17	19.76744	20.00162
2018-12-15 22:00:00	17	19.76744	20.00162
2018-12-15 22:10:00	17	19.76744	20.00162
2018-12-15 22:20:00	17	19.76744	20.00162
2018-12-15 22:30:00	17	19.76744	20.00162
2018-12-15 22:40:00	17	19.76744	20.00162
2018-12-15 22:50:00	18	20.93023	20.00162
2018-12-15 23:00:00	17	19.76744	20.00162
2018-12-15 23:10:00	18	20.93023	20.00162
2018-12-15 23:20:00	18	20.93023	20.00162
2018-12-15 23:30:00	17	19.76744	20.00162
2018-12-15 23:40:00	17	19.76744	20.00162
2018-12-15 23:50:00	17	19.76744	20.00162

dfNO23%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 3`), size = 1)+
  geom_text(aes(x= `Score 3`, y=0), label='Score 3:\n Average Proportion of\nNetwork Active', size = 4, vjust= -2, hjust=1.4)+
  labs(x='Proportions of active nodes',
       y='Density',
       title='Distribution of Proportions of Active Nodes')+
  xlim(0, 100)+
  plotTheme()

Constructing Score 4

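To construct Score 4, we first list the location of every node collecting reliable NO2 concentration data in each 10-minute time interval. The table below shows the first 41 rows of this list: the latitude and longitude of the 16 nodes active at 12 midnight on 2018-12-15, followed by the nodes active in the intervals that come after it.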

dfNO2%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> dfNO24

dfNO24%>%
  arrange(by10)%>%
  head(41)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped'))%>%
  scroll_box(height='300px')

by10	node_id	lat	lon
2018-12-15 00:00:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:00:00	001e061144c0	41.76412	-87.72242
2018-12-15 00:00:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:00:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:00:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:00:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:00:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:00:00	001e06114503	41.66608	-87.53937
2018-12-15 00:00:00	001e06113107	41.75114	-87.71299
2018-12-15 00:00:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:00:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:00:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:00:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:00:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:00:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:00:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:10:00	001e06113107	41.75114	-87.71299
2018-12-15 00:10:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:10:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:10:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:10:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:10:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:10:00	001e061144c0	41.76412	-87.72242
2018-12-15 00:10:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:10:00	001e06114503	41.66608	-87.53937
2018-12-15 00:10:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:10:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:10:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:10:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:10:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:10:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:10:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:20:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:20:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:20:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:20:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:20:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:20:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:20:00	001e061144c0	41.76412	-87.72242
2018-12-15 00:20:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:20:00	001e06113107	41.75114	-87.71299

chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

dfNO24a<-NULL

for(i in unique(dfNO24$by10)){
  
subset <- 
      dfNO24%>%
      filter(by10==i)

if(nrow(subset)>1){
  subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
  subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
  P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))
    #clip using chicago
    Ps2<-gIntersection(Ps2, chig, byid=FALSE)
    Ps2AreaProp<-gArea(Ps2)/chigArea
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-Ps2AreaProp
    df1<-as.data.frame(df1)
    dfNO24a<-rbind(dfNO24a, df1)
}else{
  df1<-NULL
  df1$by10<-i
  df1$AreaProp<-0
  df1<-as.data.frame(df1)
  dfNO24a<-rbind(dfNO24a, df1)
}
  
}
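The loop above measures coverage as the bounding rectangle of the active nodes, clipped to the Chicago boundary and divided by the city’s area. As a minimal sketch of the unclipped part of that calculation, the base R snippet below computes the area of the bounding rectangle for a few made-up projected coordinates. The coordinates and the helper name bbox_area are hypothetical, and the clipping step (gIntersection) is deliberately left out.

```r
# Hypothetical projected coordinates, not taken from the AoT data.
nodes <- data.frame(
  x = c(1150000, 1175000, 1190000, 1160000),
  y = c(1830000, 1905000, 1860000, 1940000)
)

# Area of the axis-aligned rectangle that encloses all points.
bbox_area <- function(x, y) {
  diff(range(x)) * diff(range(y))
}

bbox_area(nodes$x, nodes$y)  # 40000 * 110000 = 4.4e9 square units
```

The rectangle depends only on the outermost nodes, so the covered area can stay the same while the number of active nodes changes. A single node yields an area of zero, which is consistent with the loop setting AreaProp to 0 when only one node is active.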

dfNO24a%>%
  mutate(`Score 4`= 100*mean(AreaProp))->dfNO24a
dfNO24a%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = 'striped')%>%
  scroll_box(height = "300px")

by10	AreaProp	Score 4
2018-12-15 00:00:00	0.6010409	59.82058
2018-12-15 00:10:00	0.6010409	59.82058
2018-12-15 00:20:00	0.6010409	59.82058
2018-12-15 00:30:00	0.6010409	59.82058
2018-12-15 00:40:00	0.6010409	59.82058
2018-12-15 00:50:00	0.6010409	59.82058
2018-12-15 01:00:00	0.6010409	59.82058
2018-12-15 01:10:00	0.6010409	59.82058
2018-12-15 01:20:00	0.5639259	59.82058
2018-12-15 01:30:00	0.5639259	59.82058
2018-12-15 01:40:00	0.5639259	59.82058
2018-12-15 01:50:00	0.5639259	59.82058
2018-12-15 02:00:00	0.5639259	59.82058
2018-12-15 02:10:00	0.5639259	59.82058
2018-12-15 02:20:00	0.5639259	59.82058
2018-12-15 02:30:00	0.5639259	59.82058
2018-12-15 02:40:00	0.5639259	59.82058
2018-12-15 02:50:00	0.5639259	59.82058
2018-12-15 03:00:00	0.5639259	59.82058
2018-12-15 03:10:00	0.6010409	59.82058
2018-12-15 03:20:00	0.6010409	59.82058
2018-12-15 03:30:00	0.6010409	59.82058
2018-12-15 03:40:00	0.6010409	59.82058
2018-12-15 03:50:00	0.6010409	59.82058
2018-12-15 04:00:00	0.6010409	59.82058
2018-12-15 04:10:00	0.6010409	59.82058
2018-12-15 04:20:00	0.6010409	59.82058
2018-12-15 04:30:00	0.6010409	59.82058
2018-12-15 04:40:00	0.6010409	59.82058
2018-12-15 04:50:00	0.6010409	59.82058
2018-12-15 05:00:00	0.6010409	59.82058
2018-12-15 05:10:00	0.6010409	59.82058
2018-12-15 05:20:00	0.6010409	59.82058
2018-12-15 05:30:00	0.6010409	59.82058
2018-12-15 05:40:00	0.6010409	59.82058
2018-12-15 05:50:00	0.6010409	59.82058
2018-12-15 06:00:00	0.6010409	59.82058
2018-12-15 06:10:00	0.6010409	59.82058
2018-12-15 06:20:00	0.6010409	59.82058
2018-12-15 06:30:00	0.6010409	59.82058
2018-12-15 06:40:00	0.6010409	59.82058
2018-12-15 06:50:00	0.6010409	59.82058
2018-12-15 07:00:00	0.6010409	59.82058
2018-12-15 07:10:00	0.6010409	59.82058
2018-12-15 07:20:00	0.6010409	59.82058
2018-12-15 07:30:00	0.6010409	59.82058
2018-12-15 07:40:00	0.6010409	59.82058
2018-12-15 07:50:00	0.6010409	59.82058
2018-12-15 08:00:00	0.6010409	59.82058
2018-12-15 08:10:00	0.6010409	59.82058
2018-12-15 08:20:00	0.6010409	59.82058
2018-12-15 08:30:00	0.6010409	59.82058
2018-12-15 08:40:00	0.6010409	59.82058
2018-12-15 08:50:00	0.6010409	59.82058
2018-12-15 09:00:00	0.6010409	59.82058
2018-12-15 09:10:00	0.6010409	59.82058
2018-12-15 09:20:00	0.6010409	59.82058
2018-12-15 09:30:00	0.6010409	59.82058
2018-12-15 09:40:00	0.6010409	59.82058
2018-12-15 09:50:00	0.6010409	59.82058
2018-12-15 10:00:00	0.6010409	59.82058
2018-12-15 10:10:00	0.6010409	59.82058
2018-12-15 10:20:00	0.6010409	59.82058
2018-12-15 10:30:00	0.6010409	59.82058
2018-12-15 10:40:00	0.6010409	59.82058
2018-12-15 10:50:00	0.6010409	59.82058
2018-12-15 11:00:00	0.6010409	59.82058
2018-12-15 11:10:00	0.6010409	59.82058
2018-12-15 11:20:00	0.6010409	59.82058
2018-12-15 11:30:00	0.6010409	59.82058
2018-12-15 11:40:00	0.6010409	59.82058
2018-12-15 11:50:00	0.6010409	59.82058
2018-12-15 12:00:00	0.6010409	59.82058
2018-12-15 12:10:00	0.6010409	59.82058
2018-12-15 12:20:00	0.6010409	59.82058
2018-12-15 12:30:00	0.6010409	59.82058
2018-12-15 12:40:00	0.6010409	59.82058
2018-12-15 12:50:00	0.6010409	59.82058
2018-12-15 13:00:00	0.6010409	59.82058
2018-12-15 13:10:00	0.6010409	59.82058
2018-12-15 13:20:00	0.6010409	59.82058
2018-12-15 13:30:00	0.6010409	59.82058
2018-12-15 13:40:00	0.6010409	59.82058
2018-12-15 13:50:00	0.6010409	59.82058
2018-12-15 14:00:00	0.6010409	59.82058
2018-12-15 14:10:00	0.6010409	59.82058
2018-12-15 14:20:00	0.6010409	59.82058
2018-12-15 14:30:00	0.6010409	59.82058
2018-12-15 14:40:00	0.6010409	59.82058
2018-12-15 14:50:00	0.6010409	59.82058
2018-12-15 15:00:00	0.6010409	59.82058
2018-12-15 15:10:00	0.6010409	59.82058
2018-12-15 15:20:00	0.6010409	59.82058
2018-12-15 15:30:00	0.6010409	59.82058
2018-12-15 15:40:00	0.6010409	59.82058
2018-12-15 15:50:00	0.6010409	59.82058
2018-12-15 16:00:00	0.6010409	59.82058
2018-12-15 16:10:00	0.6010409	59.82058
2018-12-15 16:20:00	0.6010409	59.82058
2018-12-15 16:30:00	0.6010409	59.82058
2018-12-15 16:40:00	0.6010409	59.82058
2018-12-15 16:50:00	0.6010409	59.82058
2018-12-15 17:00:00	0.6010409	59.82058
2018-12-15 17:10:00	0.6010409	59.82058
2018-12-15 17:20:00	0.6010409	59.82058
2018-12-15 17:30:00	0.6010409	59.82058
2018-12-15 17:40:00	0.6010409	59.82058
2018-12-15 17:50:00	0.6010409	59.82058
2018-12-15 18:00:00	0.6010409	59.82058
2018-12-15 18:10:00	0.6010409	59.82058
2018-12-15 18:20:00	0.6010409	59.82058
2018-12-15 18:30:00	0.6010409	59.82058
2018-12-15 18:40:00	0.6010409	59.82058
2018-12-15 18:50:00	0.6010409	59.82058
2018-12-15 19:00:00	0.6010409	59.82058
2018-12-15 19:10:00	0.6010409	59.82058
2018-12-15 19:20:00	0.6010409	59.82058
2018-12-15 19:30:00	0.6010409	59.82058
2018-12-15 19:40:00	0.6010409	59.82058
2018-12-15 19:50:00	0.6010409	59.82058
2018-12-15 20:00:00	0.6010409	59.82058
2018-12-15 20:10:00	0.6010409	59.82058
2018-12-15 20:20:00	0.6010409	59.82058
2018-12-15 20:30:00	0.6010409	59.82058
2018-12-15 20:40:00	0.6010409	59.82058
2018-12-15 20:50:00	0.6010409	59.82058
2018-12-15 21:00:00	0.6010409	59.82058
2018-12-15 21:10:00	0.6010409	59.82058
2018-12-15 21:20:00	0.6010409	59.82058
2018-12-15 21:30:00	0.6010409	59.82058
2018-12-15 21:40:00	0.6010409	59.82058
2018-12-15 21:50:00	0.6010409	59.82058
2018-12-15 22:00:00	0.6010409	59.82058
2018-12-15 22:10:00	0.6010409	59.82058
2018-12-15 22:20:00	0.6010409	59.82058
2018-12-15 22:30:00	0.6010409	59.82058
2018-12-15 22:40:00	0.6010409	59.82058
2018-12-15 22:50:00	0.6010409	59.82058
2018-12-15 23:00:00	0.6010409	59.82058
2018-12-15 23:10:00	0.6010409	59.82058
2018-12-15 23:20:00	0.6010409	59.82058
2018-12-15 23:30:00	0.6010409	59.82058
2018-12-15 23:40:00	0.6010409	59.82058
2018-12-15 23:50:00	0.6010409	59.82058
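
As a quick arithmetic check of the table above, AreaProp equals 0.5639259 in the 11 intervals from 01:20 to 03:00 and 0.6010409 in the remaining 133. The sketch below rebuilds that column by hand and averages it. Because the displayed values are rounded, the result should be expected to agree with the reported Score 4 only to about four decimal places.

```r
# AreaProp values as displayed in the table (rounded to 7 decimals).
area_prop <- c(rep(0.6010409, 8),    # 00:00 to 01:10
               rep(0.5639259, 11),   # 01:20 to 03:00
               rep(0.6010409, 125))  # 03:10 to 23:50

# Score 4 is the mean proportion expressed as a percentage.
score4 <- 100 * mean(area_prop)
round(score4, 2)  # 59.82
```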

We can also plot the area covered relative to the whole of Chicago for visual inspection. The four plots below are for 00:00 (top left), 07:00 (top right), 13:00 (bottom left), and 19:00 (bottom right). For the NO2 concentration network on this day, the area covered was the same at all four times (it dipped only between 01:20 and 03:00, from about 60% to 56% of the city), so the four plots are identical.

p<-list()

by10<-as.data.frame(unique(dfNO24$by10))
by10$no<-row_number(by10)
colnames(by10)<-c('by10', 'no')

dfNO24b<-merge(dfNO24, by10, by='by10', all.x=TRUE)

for(i in unique(dfNO24b$by10)){

subset <- 
      dfNO24b%>%
      filter(by10==i)
    
subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    
P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    
Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

#clip using chicago
Ps2<-gIntersection(Ps2, chig, byid=FALSE)

j<-unique(subset$no)

p[[j]]<-spplot(chig, colorkey=FALSE, col.regions='red', 
       sp.layout=list(list(Ps2, fill='blue', first=FALSE)))
}

library(gridExtra)
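# p holds one plot per 10-minute interval, so 00:00, 07:00, 13:00 and 19:00
# are the 1st, 43rd, 79th and 115th entries (assuming no interval is missing)
grid.arrange(p[[1]], p[[43]], p[[79]], p[[115]])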

In summary, for the AoT NO2 concentration network on 2018-12-15:

Score 3 = 20.0 : Averaged over the day’s 10-minute time intervals, 20.0% of the nodes in the network were collecting reliable data. This is a low score.
Score 4 = 59.8 : Averaged over the day’s 10-minute time intervals, the nodes collecting reliable data spanned 59.8% of Chicago’s area. Read together with Score 3, this indicates that the few active nodes are widely dispersed across Chicago.

O3 Concentration

Constructing Score 3

Before we calculate the relevant proportions, it is useful to observe how the absolute number of nodes collecting reliable data varies across the day’s time intervals. The figure below shows how the number of nodes collecting reliable O3 concentration data in the AoT network varies over the course of the day.

dfO3%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))->dfO33

ggplot(data=dfO33, aes(x=by10, y=count))+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  geom_col(fill='indianred', col=NA)+
  ylim(0, 86)+
  geom_hline(yintercept=86, col='black', size=1)+
  geom_hline(yintercept=c(1:85), col='white')+
  geom_vline(aes(xintercept=as.numeric(by10)), col='white', size=2)+
  geom_hline(aes(yintercept=mean(count)), col='black', size=1)+
  geom_text(aes(y=86, x=0), label='Full network size: 86 nodes', size=4, hjust=-1, vjust=-1)+
  geom_text(aes(y=mean(count), x=0), label=paste('Average network size:', round(mean(dfO33$count)), 'nodes'), size=4, hjust=-1, vjust=-1)+
  labs(x='Time', y='Number of Active Nodes', 
       title='Number of nodes collecting reliable O3 concentration data throughout the day',
       subtitle='Each x-axis tick represents a 10-minute time interval')+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

The table below shows the number of nodes collecting reliable O3 concentration data at each time interval during the day, the proportion of these active nodes in relation to the full network of 86 nodes, and the average proportion during the full day. This average proportion is Score 3.

dfO33%>%
  arrange(by10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

by10	count	propActive	Score 3
2018-12-15 00:00:00	16	18.60465	20.59109
2018-12-15 00:10:00	16	18.60465	20.59109
2018-12-15 00:20:00	17	19.76744	20.59109
2018-12-15 00:30:00	17	19.76744	20.59109
2018-12-15 00:40:00	16	18.60465	20.59109
2018-12-15 00:50:00	17	19.76744	20.59109
2018-12-15 01:00:00	17	19.76744	20.59109
2018-12-15 01:10:00	17	19.76744	20.59109
2018-12-15 01:20:00	17	19.76744	20.59109
2018-12-15 01:30:00	17	19.76744	20.59109
2018-12-15 01:40:00	17	19.76744	20.59109
2018-12-15 01:50:00	17	19.76744	20.59109
2018-12-15 02:00:00	17	19.76744	20.59109
2018-12-15 02:10:00	17	19.76744	20.59109
2018-12-15 02:20:00	17	19.76744	20.59109
2018-12-15 02:30:00	17	19.76744	20.59109
2018-12-15 02:40:00	17	19.76744	20.59109
2018-12-15 02:50:00	17	19.76744	20.59109
2018-12-15 03:00:00	17	19.76744	20.59109
2018-12-15 03:10:00	18	20.93023	20.59109
2018-12-15 03:20:00	18	20.93023	20.59109
2018-12-15 03:30:00	18	20.93023	20.59109
2018-12-15 03:40:00	18	20.93023	20.59109
2018-12-15 03:50:00	18	20.93023	20.59109
2018-12-15 04:00:00	18	20.93023	20.59109
2018-12-15 04:10:00	18	20.93023	20.59109
2018-12-15 04:20:00	18	20.93023	20.59109
2018-12-15 04:30:00	18	20.93023	20.59109
2018-12-15 04:40:00	18	20.93023	20.59109
2018-12-15 04:50:00	18	20.93023	20.59109
2018-12-15 05:00:00	18	20.93023	20.59109
2018-12-15 05:10:00	18	20.93023	20.59109
2018-12-15 05:20:00	18	20.93023	20.59109
2018-12-15 05:30:00	18	20.93023	20.59109
2018-12-15 05:40:00	18	20.93023	20.59109
2018-12-15 05:50:00	18	20.93023	20.59109
2018-12-15 06:00:00	18	20.93023	20.59109
2018-12-15 06:10:00	18	20.93023	20.59109
2018-12-15 06:20:00	18	20.93023	20.59109
2018-12-15 06:30:00	18	20.93023	20.59109
2018-12-15 06:40:00	18	20.93023	20.59109
2018-12-15 06:50:00	18	20.93023	20.59109
2018-12-15 07:00:00	18	20.93023	20.59109
2018-12-15 07:10:00	18	20.93023	20.59109
2018-12-15 07:20:00	18	20.93023	20.59109
2018-12-15 07:30:00	18	20.93023	20.59109
2018-12-15 07:40:00	18	20.93023	20.59109
2018-12-15 07:50:00	18	20.93023	20.59109
2018-12-15 08:00:00	18	20.93023	20.59109
2018-12-15 08:10:00	18	20.93023	20.59109
2018-12-15 08:20:00	18	20.93023	20.59109
2018-12-15 08:30:00	18	20.93023	20.59109
2018-12-15 08:40:00	18	20.93023	20.59109
2018-12-15 08:50:00	18	20.93023	20.59109
2018-12-15 09:00:00	18	20.93023	20.59109
2018-12-15 09:10:00	18	20.93023	20.59109
2018-12-15 09:20:00	18	20.93023	20.59109
2018-12-15 09:30:00	18	20.93023	20.59109
2018-12-15 09:40:00	18	20.93023	20.59109
2018-12-15 09:50:00	18	20.93023	20.59109
2018-12-15 10:00:00	18	20.93023	20.59109
2018-12-15 10:10:00	18	20.93023	20.59109
2018-12-15 10:20:00	17	19.76744	20.59109
2018-12-15 10:30:00	17	19.76744	20.59109
2018-12-15 10:40:00	16	18.60465	20.59109
2018-12-15 10:50:00	17	19.76744	20.59109
2018-12-15 11:00:00	17	19.76744	20.59109
2018-12-15 11:10:00	17	19.76744	20.59109
2018-12-15 11:20:00	18	20.93023	20.59109
2018-12-15 11:30:00	18	20.93023	20.59109
2018-12-15 11:40:00	18	20.93023	20.59109
2018-12-15 11:50:00	17	19.76744	20.59109
2018-12-15 12:00:00	18	20.93023	20.59109
2018-12-15 12:10:00	17	19.76744	20.59109
2018-12-15 12:20:00	18	20.93023	20.59109
2018-12-15 12:30:00	18	20.93023	20.59109
2018-12-15 12:40:00	18	20.93023	20.59109
2018-12-15 12:50:00	18	20.93023	20.59109
2018-12-15 13:00:00	18	20.93023	20.59109
2018-12-15 13:10:00	18	20.93023	20.59109
2018-12-15 13:20:00	18	20.93023	20.59109
2018-12-15 13:30:00	18	20.93023	20.59109
2018-12-15 13:40:00	18	20.93023	20.59109
2018-12-15 13:50:00	18	20.93023	20.59109
2018-12-15 14:00:00	18	20.93023	20.59109
2018-12-15 14:10:00	18	20.93023	20.59109
2018-12-15 14:20:00	18	20.93023	20.59109
2018-12-15 14:30:00	18	20.93023	20.59109
2018-12-15 14:40:00	18	20.93023	20.59109
2018-12-15 14:50:00	18	20.93023	20.59109
2018-12-15 15:00:00	17	19.76744	20.59109
2018-12-15 15:10:00	17	19.76744	20.59109
2018-12-15 15:20:00	17	19.76744	20.59109
2018-12-15 15:30:00	17	19.76744	20.59109
2018-12-15 15:40:00	17	19.76744	20.59109
2018-12-15 15:50:00	17	19.76744	20.59109
2018-12-15 16:00:00	17	19.76744	20.59109
2018-12-15 16:10:00	17	19.76744	20.59109
2018-12-15 16:20:00	17	19.76744	20.59109
2018-12-15 16:30:00	17	19.76744	20.59109
2018-12-15 16:40:00	17	19.76744	20.59109
2018-12-15 16:50:00	18	20.93023	20.59109
2018-12-15 17:00:00	18	20.93023	20.59109
2018-12-15 17:10:00	18	20.93023	20.59109
2018-12-15 17:20:00	18	20.93023	20.59109
2018-12-15 17:30:00	18	20.93023	20.59109
2018-12-15 17:40:00	18	20.93023	20.59109
2018-12-15 17:50:00	18	20.93023	20.59109
2018-12-15 18:00:00	18	20.93023	20.59109
2018-12-15 18:10:00	18	20.93023	20.59109
2018-12-15 18:20:00	18	20.93023	20.59109
2018-12-15 18:30:00	18	20.93023	20.59109
2018-12-15 18:40:00	18	20.93023	20.59109
2018-12-15 18:50:00	18	20.93023	20.59109
2018-12-15 19:00:00	18	20.93023	20.59109
2018-12-15 19:10:00	18	20.93023	20.59109
2018-12-15 19:20:00	18	20.93023	20.59109
2018-12-15 19:30:00	18	20.93023	20.59109
2018-12-15 19:40:00	18	20.93023	20.59109
2018-12-15 19:50:00	18	20.93023	20.59109
2018-12-15 20:00:00	18	20.93023	20.59109
2018-12-15 20:10:00	18	20.93023	20.59109
2018-12-15 20:20:00	18	20.93023	20.59109
2018-12-15 20:30:00	18	20.93023	20.59109
2018-12-15 20:40:00	18	20.93023	20.59109
2018-12-15 20:50:00	18	20.93023	20.59109
2018-12-15 21:00:00	18	20.93023	20.59109
2018-12-15 21:10:00	18	20.93023	20.59109
2018-12-15 21:20:00	18	20.93023	20.59109
2018-12-15 21:30:00	18	20.93023	20.59109
2018-12-15 21:40:00	18	20.93023	20.59109
2018-12-15 21:50:00	18	20.93023	20.59109
2018-12-15 22:00:00	18	20.93023	20.59109
2018-12-15 22:10:00	18	20.93023	20.59109
2018-12-15 22:20:00	18	20.93023	20.59109
2018-12-15 22:30:00	18	20.93023	20.59109
2018-12-15 22:40:00	18	20.93023	20.59109
2018-12-15 22:50:00	18	20.93023	20.59109
2018-12-15 23:00:00	18	20.93023	20.59109
2018-12-15 23:10:00	18	20.93023	20.59109
2018-12-15 23:20:00	18	20.93023	20.59109
2018-12-15 23:30:00	18	20.93023	20.59109
2018-12-15 23:40:00	18	20.93023	20.59109
2018-12-15 23:50:00	18	20.93023	20.59109
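
Score 3 can be verified by hand from the table above. Tallying the count column gives 4 intervals with 16 active nodes, 34 with 17, and 106 with 18. The sketch below averages the corresponding proportions; the tallies were counted manually from the table, and the variable names are illustrative only.

```r
network_size <- 86

# Number of 10-minute intervals observed at each node count.
tally <- c(`16` = 4, `17` = 34, `18` = 106)
counts <- rep(as.numeric(names(tally)), times = tally)

# Same arithmetic as propActive and Score 3 in the pipeline above.
prop_active <- counts * 100 / network_size
score3 <- mean(prop_active)
round(score3, 5)  # 20.59109
```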

dfO33%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 3`), size = 1)+
  geom_text(aes(x= `Score 3`, y=0), label='Score 3:\n Average Proportion of\nNetwork Active', size = 4, vjust= -2, hjust=1.4)+
  labs(x='Proportions of active nodes',
       y='Density',
       title='Distribution of Proportions of Active Nodes')+
  xlim(0, 100)+
  plotTheme()

Constructing Score 4

The table below shows this result. It lists the latitude and longitude of every node collecting reliable data in each 10-minute interval on 2018-12-15. Only the first 41 rows are shown, beginning with the nodes active at 12 midnight.

dfO3%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> dfO34

dfO34%>%
  arrange(by10)%>%
  head(41)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped'))%>%
  scroll_box(height='300px')

by10	node_id	lat	lon
2018-12-15 00:00:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:00:00	001e061144c0	41.76412	-87.72242
2018-12-15 00:00:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:00:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:00:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:00:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:00:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:00:00	001e06113107	41.75114	-87.71299
2018-12-15 00:00:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:00:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:00:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:00:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:00:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:00:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:00:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:00:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:10:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:10:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:10:00	001e06113107	41.75114	-87.71299
2018-12-15 00:10:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:10:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:10:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:10:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:10:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:10:00	001e061144c0	41.76412	-87.72242
2018-12-15 00:10:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:10:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:10:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:10:00	001e0610bc10	41.73631	-87.62418
2018-12-15 00:10:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:10:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:10:00	001e06113ace	41.83107	-87.61730
2018-12-15 00:20:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:20:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:20:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:20:00	001e0610eef2	41.96526	-87.66672
2018-12-15 00:20:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:20:00	001e061130f4	41.89616	-87.66239
2018-12-15 00:20:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:20:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:20:00	001e061144c0	41.76412	-87.72242

chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

dfO34a<-NULL

# For each 10-minute interval, compute the share of Chicago's area covered by
# the bounding box of the nodes collecting reliable data.
for(i in unique(dfO34$by10)){

  # nodes active in this interval
  subset <- 
    dfO34%>%
    filter(by10==i)

  if(nrow(subset)>1){
    # project the node locations from WGS84 (EPSG:4326) to Illinois State Plane East (EPSG:3435)
    subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
    subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    # trace the bounding box of the nodes as a closed rectangle
    P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))
    # clip the rectangle to the Chicago boundary
    Ps2<-gIntersection(Ps2, chig, byid=FALSE)
    # clipped area as a proportion of the city's area
    Ps2AreaProp<-gArea(Ps2)/chigArea
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-Ps2AreaProp
    df1<-as.data.frame(df1)
    dfO34a<-rbind(dfO34a, df1)
  }else{
    # a single node has no bounding box, so its coverage is recorded as 0
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-0
    df1<-as.data.frame(df1)
    dfO34a<-rbind(dfO34a, df1)
  }

}

dfO34a%>%
  mutate(`Score 4`= 100*mean(AreaProp))->dfO34a
dfO34a%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = 'striped')%>%
  scroll_box(height = "300px")

by10	AreaProp	Score 4
2018-12-15 00:00:00	0.4364638	59.04799
2018-12-15 00:10:00	0.4364638	59.04799
2018-12-15 00:20:00	0.6010409	59.04799
2018-12-15 00:30:00	0.6010409	59.04799
2018-12-15 00:40:00	0.4364638	59.04799
2018-12-15 00:50:00	0.6010409	59.04799
2018-12-15 01:00:00	0.6010409	59.04799
2018-12-15 01:10:00	0.6010409	59.04799
2018-12-15 01:20:00	0.5639259	59.04799
2018-12-15 01:30:00	0.5639259	59.04799
2018-12-15 01:40:00	0.5639259	59.04799
2018-12-15 01:50:00	0.5639259	59.04799
2018-12-15 02:00:00	0.5639259	59.04799
2018-12-15 02:10:00	0.5639259	59.04799
2018-12-15 02:20:00	0.5639259	59.04799
2018-12-15 02:30:00	0.5639259	59.04799
2018-12-15 02:40:00	0.5639259	59.04799
2018-12-15 02:50:00	0.5639259	59.04799
2018-12-15 03:00:00	0.5639259	59.04799
2018-12-15 03:10:00	0.6010409	59.04799
2018-12-15 03:20:00	0.6010409	59.04799
2018-12-15 03:30:00	0.6010409	59.04799
2018-12-15 03:40:00	0.6010409	59.04799
2018-12-15 03:50:00	0.6010409	59.04799
2018-12-15 04:00:00	0.6010409	59.04799
2018-12-15 04:10:00	0.6010409	59.04799
2018-12-15 04:20:00	0.6010409	59.04799
2018-12-15 04:30:00	0.6010409	59.04799
2018-12-15 04:40:00	0.6010409	59.04799
2018-12-15 04:50:00	0.6010409	59.04799
2018-12-15 05:00:00	0.6010409	59.04799
2018-12-15 05:10:00	0.6010409	59.04799
2018-12-15 05:20:00	0.6010409	59.04799
2018-12-15 05:30:00	0.6010409	59.04799
2018-12-15 05:40:00	0.6010409	59.04799
2018-12-15 05:50:00	0.6010409	59.04799
2018-12-15 06:00:00	0.6010409	59.04799
2018-12-15 06:10:00	0.6010409	59.04799
2018-12-15 06:20:00	0.6010409	59.04799
2018-12-15 06:30:00	0.6010409	59.04799
2018-12-15 06:40:00	0.6010409	59.04799
2018-12-15 06:50:00	0.6010409	59.04799
2018-12-15 07:00:00	0.6010409	59.04799
2018-12-15 07:10:00	0.6010409	59.04799
2018-12-15 07:20:00	0.6010409	59.04799
2018-12-15 07:30:00	0.6010409	59.04799
2018-12-15 07:40:00	0.6010409	59.04799
2018-12-15 07:50:00	0.6010409	59.04799
2018-12-15 08:00:00	0.6010409	59.04799
2018-12-15 08:10:00	0.6010409	59.04799
2018-12-15 08:20:00	0.6010409	59.04799
2018-12-15 08:30:00	0.6010409	59.04799
2018-12-15 08:40:00	0.6010409	59.04799
2018-12-15 08:50:00	0.6010409	59.04799
2018-12-15 09:00:00	0.6010409	59.04799
2018-12-15 09:10:00	0.6010409	59.04799
2018-12-15 09:20:00	0.6010409	59.04799
2018-12-15 09:30:00	0.6010409	59.04799
2018-12-15 09:40:00	0.6010409	59.04799
2018-12-15 09:50:00	0.6010409	59.04799
2018-12-15 10:00:00	0.6010409	59.04799
2018-12-15 10:10:00	0.6010409	59.04799
2018-12-15 10:20:00	0.6010409	59.04799
2018-12-15 10:30:00	0.6010409	59.04799
2018-12-15 10:40:00	0.4364638	59.04799
2018-12-15 10:50:00	0.4496355	59.04799
2018-12-15 11:00:00	0.4496355	59.04799
2018-12-15 11:10:00	0.4496355	59.04799
2018-12-15 11:20:00	0.6010409	59.04799
2018-12-15 11:30:00	0.6010409	59.04799
2018-12-15 11:40:00	0.6010409	59.04799
2018-12-15 11:50:00	0.6010409	59.04799
2018-12-15 12:00:00	0.6010409	59.04799
2018-12-15 12:10:00	0.6010409	59.04799
2018-12-15 12:20:00	0.6010409	59.04799
2018-12-15 12:30:00	0.6010409	59.04799
2018-12-15 12:40:00	0.6010409	59.04799
2018-12-15 12:50:00	0.6010409	59.04799
2018-12-15 13:00:00	0.6010409	59.04799
2018-12-15 13:10:00	0.6010409	59.04799
2018-12-15 13:20:00	0.6010409	59.04799
2018-12-15 13:30:00	0.6010409	59.04799
2018-12-15 13:40:00	0.6010409	59.04799
2018-12-15 13:50:00	0.6010409	59.04799
2018-12-15 14:00:00	0.6010409	59.04799
2018-12-15 14:10:00	0.6010409	59.04799
2018-12-15 14:20:00	0.6010409	59.04799
2018-12-15 14:30:00	0.6010409	59.04799
2018-12-15 14:40:00	0.6010409	59.04799
2018-12-15 14:50:00	0.6010409	59.04799
2018-12-15 15:00:00	0.6010409	59.04799
2018-12-15 15:10:00	0.6010409	59.04799
2018-12-15 15:20:00	0.6010409	59.04799
2018-12-15 15:30:00	0.6010409	59.04799
2018-12-15 15:40:00	0.6010409	59.04799
2018-12-15 15:50:00	0.6010409	59.04799
2018-12-15 16:00:00	0.6010409	59.04799
2018-12-15 16:10:00	0.6010409	59.04799
2018-12-15 16:20:00	0.6010409	59.04799
2018-12-15 16:30:00	0.6010409	59.04799
2018-12-15 16:40:00	0.6010409	59.04799
2018-12-15 16:50:00	0.6010409	59.04799
2018-12-15 17:00:00	0.6010409	59.04799
2018-12-15 17:10:00	0.6010409	59.04799
2018-12-15 17:20:00	0.6010409	59.04799
2018-12-15 17:30:00	0.6010409	59.04799
2018-12-15 17:40:00	0.6010409	59.04799
2018-12-15 17:50:00	0.6010409	59.04799
2018-12-15 18:00:00	0.6010409	59.04799
2018-12-15 18:10:00	0.6010409	59.04799
2018-12-15 18:20:00	0.6010409	59.04799
2018-12-15 18:30:00	0.6010409	59.04799
2018-12-15 18:40:00	0.6010409	59.04799
2018-12-15 18:50:00	0.6010409	59.04799
2018-12-15 19:00:00	0.6010409	59.04799
2018-12-15 19:10:00	0.6010409	59.04799
2018-12-15 19:20:00	0.6010409	59.04799
2018-12-15 19:30:00	0.6010409	59.04799
2018-12-15 19:40:00	0.6010409	59.04799
2018-12-15 19:50:00	0.6010409	59.04799
2018-12-15 20:00:00	0.6010409	59.04799
2018-12-15 20:10:00	0.6010409	59.04799
2018-12-15 20:20:00	0.6010409	59.04799
2018-12-15 20:30:00	0.6010409	59.04799
2018-12-15 20:40:00	0.6010409	59.04799
2018-12-15 20:50:00	0.6010409	59.04799
2018-12-15 21:00:00	0.6010409	59.04799
2018-12-15 21:10:00	0.6010409	59.04799
2018-12-15 21:20:00	0.6010409	59.04799
2018-12-15 21:30:00	0.6010409	59.04799
2018-12-15 21:40:00	0.6010409	59.04799
2018-12-15 21:50:00	0.6010409	59.04799
2018-12-15 22:00:00	0.6010409	59.04799
2018-12-15 22:10:00	0.6010409	59.04799
2018-12-15 22:20:00	0.6010409	59.04799
2018-12-15 22:30:00	0.6010409	59.04799
2018-12-15 22:40:00	0.6010409	59.04799
2018-12-15 22:50:00	0.6010409	59.04799
2018-12-15 23:00:00	0.6010409	59.04799
2018-12-15 23:10:00	0.6010409	59.04799
2018-12-15 23:20:00	0.6010409	59.04799
2018-12-15 23:30:00	0.6010409	59.04799
2018-12-15 23:40:00	0.6010409	59.04799
2018-12-15 23:50:00	0.6010409	59.04799
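
The bounding-box calculation behind AreaProp can be illustrated without any spatial packages. The following is a minimal sketch, not the code used above: it assumes planar coordinates and substitutes a hypothetical rectangular extent (`city`) for the Chicago boundary, so that the clipping step reduces to a rectangle intersection. The function name `bbox_coverage` and the toy coordinates are introduced here for illustration only.

```r
# Minimal sketch (assumptions: planar coordinates, rectangular city extent).
# Returns the share of the city rectangle covered by the bounding box of the
# active nodes, mirroring the AreaProp column.
bbox_coverage <- function(x, y, city) {
  # fewer than two nodes: no box can be drawn, so coverage is 0
  if (length(x) < 2) return(0)
  # bounding box of the active nodes, clipped to the city rectangle
  xmin <- max(min(x), city$xmin); xmax <- min(max(x), city$xmax)
  ymin <- max(min(y), city$ymin); ymax <- min(max(y), city$ymax)
  # a negative width or height means the box lies outside the city
  clipped_area <- max(0, xmax - xmin) * max(0, ymax - ymin)
  city_area <- (city$xmax - city$xmin) * (city$ymax - city$ymin)
  clipped_area / city_area
}

city <- list(xmin = 0, xmax = 10, ymin = 0, ymax = 20)  # hypothetical extent

# three widely spaced nodes
spread <- bbox_coverage(x = c(1, 9, 5), y = c(2, 18, 10), city)
# ten nodes clustered near the centre
cluster <- bbox_coverage(x = seq(4, 5, length.out = 10),
                         y = seq(9, 11, length.out = 10), city)
print(c(spread = spread, cluster = cluster))
```

In this toy case three dispersed nodes cover far more of the extent than ten clustered ones, which is why a network can have few active nodes and still cover a large share of the city.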

We can also plot the area covered relative to the whole of Chicago for visual inspection. The 4 plots below are for 00:00 (top left), 07:00 (top right), 13:00 (bottom left), and 19:00 (bottom right). For the O3 concentration network on this day, the area covered was not constant over the day: the table above gives about 44% of the city at 00:00 and about 60% at 07:00, 13:00, and 19:00.

p<-list()

# number the 10-minute intervals in chronological order
by10<-data.frame(by10=sort(unique(dfO34$by10)))
by10$no<-seq_len(nrow(by10))

dfO34b<-merge(dfO34, by10, by='by10', all.x=TRUE)

for(i in unique(dfO34b$by10)){

subset <- 
      dfO34b%>%
      filter(by10==i)
    
subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    
P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    
Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

#clip using chicago
Ps2<-gIntersection(Ps2, chig, byid=FALSE)

j<-unique(subset$no)

p[[j]]<-spplot(chig, colorkey=FALSE, col.regions='red', 
       sp.layout=list(list(Ps2, fill='blue', first=FALSE)))
}

library(gridExtra)
# with 6 intervals per hour, 00:00, 07:00, 13:00 and 19:00 are intervals 1, 43, 79 and 115
grid.arrange(p[[1]], p[[43]], p[[79]], p[[115]])
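
Because the day is divided into 144 ten-minute intervals, the position of a clock time in a chronologically ordered list of intervals is hour × 6 + minute / 10 + 1. The helper below, `interval_index` (a hypothetical name introduced here), makes that lookup explicit. It is a sketch that assumes the list of plots is ordered from 00:00 to 23:50 with no interval missing.

```r
# Position of a clock time within the 144 ten-minute intervals of a day.
# Assumes the intervals are ordered chronologically and none are missing.
interval_index <- function(hour, minute = 0) {
  stopifnot(hour >= 0, hour <= 23,
            minute >= 0, minute <= 50, minute %% 10 == 0)
  hour * 6 + minute / 10 + 1
}

idx <- interval_index(c(0, 7, 13, 19))
print(idx)  # 1 43 79 115
```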

In summary, for the AoT O3 concentration network on 2018-12-15,

Score 3 = 20.6 : In any given 10-minute time interval, an average of 20.6% of the nodes in the network are collecting reliable data. This is a low score.
Score 4 = 59.0 : In any given 10-minute time interval, the bounding box of the nodes collecting reliable data covers an average of 59.0% of Chicago’s area. Read together with Score 3, this indicates that the small number of active nodes are widely dispersed across Chicago.

SO2 Concentration

Constructing Score 3

Before we calculate the relevant proportions, it is useful to observe how the absolute number of nodes collecting reliable data in the network varies across the time intervals of the day. The figure below shows how the number of nodes collecting reliable SO2 concentration data in the AoT network varies over the course of the day.

dfSO2%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))->dfSO23

ggplot(data=dfSO23, aes(x=by10, y=count))+
  scale_x_discrete(breaks=c('2018-12-15 00:00:00', 
                            '2018-12-15 04:00:00',
                            '2018-12-15 08:00:00', 
                            '2018-12-15 12:00:00', 
                            '2018-12-15 16:00:00', 
                            '2018-12-15 20:00:00'), 
                   labels=c('00:00',
                            '04:00',
                            '08:00',
                            '12:00',
                            '16:00',
                            '20:00'
                            ))+
  geom_col(fill='indianred', col=NA)+
  ylim(0, 86)+
  geom_hline(yintercept=86, col='black', size=1)+
  geom_hline(yintercept=c(1:85), col='white')+
  geom_vline(aes(xintercept=as.numeric(by10)), col='white', size=2)+
  geom_hline(aes(yintercept=mean(count)), col='black', size=1)+
  geom_text(aes(y=86, x=0), label='Full network size: 86 nodes', size=4, hjust=-1, vjust=-1)+
  geom_text(aes(y=mean(count), x=0), label=paste('Average network size:', round(mean(dfSO23$count)), 'nodes'), size=4, hjust=-1, vjust=-1)+
  labs(x='Time', y='Number of Active Nodes', 
       title='Number of nodes collecting reliable SO2 concentration data throughout the day',
       subtitle='Each x-axis tick represents a 10-minute time interval')+
  theme(plot.title=element_text(face='bold', size=20),
        text=element_text(size=20),
        legend.position = 'bottom', 
        axis.text.x=element_text(angle=90, hjust=1))

The table below shows the number of nodes collecting reliable SO2 concentration data at each time interval during the day, the proportion of these active nodes in relation to the full network of 86 nodes, and the average proportion during the full day. This average proportion is Score 3.

dfSO23%>%
  arrange(by10)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

by10	count	propActive	Score 3
2018-12-15 00:00:00	12	13.95349	14.76906
2018-12-15 00:10:00	12	13.95349	14.76906
2018-12-15 00:20:00	13	15.11628	14.76906
2018-12-15 00:30:00	13	15.11628	14.76906
2018-12-15 00:40:00	13	15.11628	14.76906
2018-12-15 00:50:00	13	15.11628	14.76906
2018-12-15 01:00:00	12	13.95349	14.76906
2018-12-15 01:10:00	12	13.95349	14.76906
2018-12-15 01:20:00	12	13.95349	14.76906
2018-12-15 01:30:00	12	13.95349	14.76906
2018-12-15 01:40:00	12	13.95349	14.76906
2018-12-15 01:50:00	12	13.95349	14.76906
2018-12-15 02:00:00	12	13.95349	14.76906
2018-12-15 02:10:00	12	13.95349	14.76906
2018-12-15 02:20:00	12	13.95349	14.76906
2018-12-15 02:30:00	12	13.95349	14.76906
2018-12-15 02:40:00	12	13.95349	14.76906
2018-12-15 02:50:00	12	13.95349	14.76906
2018-12-15 03:00:00	12	13.95349	14.76906
2018-12-15 03:10:00	13	15.11628	14.76906
2018-12-15 03:20:00	12	13.95349	14.76906
2018-12-15 03:30:00	13	15.11628	14.76906
2018-12-15 03:40:00	13	15.11628	14.76906
2018-12-15 03:50:00	12	13.95349	14.76906
2018-12-15 04:00:00	12	13.95349	14.76906
2018-12-15 04:10:00	12	13.95349	14.76906
2018-12-15 04:20:00	13	15.11628	14.76906
2018-12-15 04:30:00	12	13.95349	14.76906
2018-12-15 04:40:00	12	13.95349	14.76906
2018-12-15 04:50:00	13	15.11628	14.76906
2018-12-15 05:00:00	12	13.95349	14.76906
2018-12-15 05:10:00	13	15.11628	14.76906
2018-12-15 05:20:00	12	13.95349	14.76906
2018-12-15 05:30:00	13	15.11628	14.76906
2018-12-15 05:40:00	13	15.11628	14.76906
2018-12-15 05:50:00	13	15.11628	14.76906
2018-12-15 06:00:00	13	15.11628	14.76906
2018-12-15 06:10:00	13	15.11628	14.76906
2018-12-15 06:20:00	13	15.11628	14.76906
2018-12-15 06:30:00	13	15.11628	14.76906
2018-12-15 06:40:00	13	15.11628	14.76906
2018-12-15 06:50:00	12	13.95349	14.76906
2018-12-15 07:00:00	13	15.11628	14.76906
2018-12-15 07:10:00	12	13.95349	14.76906
2018-12-15 07:20:00	13	15.11628	14.76906
2018-12-15 07:30:00	12	13.95349	14.76906
2018-12-15 07:40:00	12	13.95349	14.76906
2018-12-15 07:50:00	12	13.95349	14.76906
2018-12-15 08:00:00	12	13.95349	14.76906
2018-12-15 08:10:00	12	13.95349	14.76906
2018-12-15 08:20:00	12	13.95349	14.76906
2018-12-15 08:30:00	12	13.95349	14.76906
2018-12-15 08:40:00	12	13.95349	14.76906
2018-12-15 08:50:00	13	15.11628	14.76906
2018-12-15 09:00:00	12	13.95349	14.76906
2018-12-15 09:10:00	12	13.95349	14.76906
2018-12-15 09:20:00	11	12.79070	14.76906
2018-12-15 09:30:00	11	12.79070	14.76906
2018-12-15 09:40:00	11	12.79070	14.76906
2018-12-15 09:50:00	11	12.79070	14.76906
2018-12-15 10:00:00	11	12.79070	14.76906
2018-12-15 10:10:00	10	11.62791	14.76906
2018-12-15 10:20:00	12	13.95349	14.76906
2018-12-15 10:30:00	10	11.62791	14.76906
2018-12-15 10:40:00	11	12.79070	14.76906
2018-12-15 10:50:00	11	12.79070	14.76906
2018-12-15 11:00:00	11	12.79070	14.76906
2018-12-15 11:10:00	11	12.79070	14.76906
2018-12-15 11:20:00	11	12.79070	14.76906
2018-12-15 11:30:00	11	12.79070	14.76906
2018-12-15 11:40:00	12	13.95349	14.76906
2018-12-15 11:50:00	12	13.95349	14.76906
2018-12-15 12:00:00	12	13.95349	14.76906
2018-12-15 12:10:00	13	15.11628	14.76906
2018-12-15 12:20:00	12	13.95349	14.76906
2018-12-15 12:30:00	12	13.95349	14.76906
2018-12-15 12:40:00	11	12.79070	14.76906
2018-12-15 12:50:00	11	12.79070	14.76906
2018-12-15 13:00:00	11	12.79070	14.76906
2018-12-15 13:10:00	13	15.11628	14.76906
2018-12-15 13:20:00	12	13.95349	14.76906
2018-12-15 13:30:00	13	15.11628	14.76906
2018-12-15 13:40:00	13	15.11628	14.76906
2018-12-15 13:50:00	13	15.11628	14.76906
2018-12-15 14:00:00	13	15.11628	14.76906
2018-12-15 14:10:00	12	13.95349	14.76906
2018-12-15 14:20:00	14	16.27907	14.76906
2018-12-15 14:30:00	13	15.11628	14.76906
2018-12-15 14:40:00	13	15.11628	14.76906
2018-12-15 14:50:00	14	16.27907	14.76906
2018-12-15 15:00:00	11	12.79070	14.76906
2018-12-15 15:10:00	12	13.95349	14.76906
2018-12-15 15:20:00	13	15.11628	14.76906
2018-12-15 15:30:00	12	13.95349	14.76906
2018-12-15 15:40:00	13	15.11628	14.76906
2018-12-15 15:50:00	13	15.11628	14.76906
2018-12-15 16:00:00	13	15.11628	14.76906
2018-12-15 16:10:00	13	15.11628	14.76906
2018-12-15 16:20:00	13	15.11628	14.76906
2018-12-15 16:30:00	13	15.11628	14.76906
2018-12-15 16:40:00	13	15.11628	14.76906
2018-12-15 16:50:00	14	16.27907	14.76906
2018-12-15 17:00:00	14	16.27907	14.76906
2018-12-15 17:10:00	14	16.27907	14.76906
2018-12-15 17:20:00	14	16.27907	14.76906
2018-12-15 17:30:00	14	16.27907	14.76906
2018-12-15 17:40:00	14	16.27907	14.76906
2018-12-15 17:50:00	14	16.27907	14.76906
2018-12-15 18:00:00	14	16.27907	14.76906
2018-12-15 18:10:00	14	16.27907	14.76906
2018-12-15 18:20:00	14	16.27907	14.76906
2018-12-15 18:30:00	14	16.27907	14.76906
2018-12-15 18:40:00	14	16.27907	14.76906
2018-12-15 18:50:00	14	16.27907	14.76906
2018-12-15 19:00:00	14	16.27907	14.76906
2018-12-15 19:10:00	14	16.27907	14.76906
2018-12-15 19:20:00	14	16.27907	14.76906
2018-12-15 19:30:00	14	16.27907	14.76906
2018-12-15 19:40:00	13	15.11628	14.76906
2018-12-15 19:50:00	14	16.27907	14.76906
2018-12-15 20:00:00	14	16.27907	14.76906
2018-12-15 20:10:00	14	16.27907	14.76906
2018-12-15 20:20:00	14	16.27907	14.76906
2018-12-15 20:30:00	14	16.27907	14.76906
2018-12-15 20:40:00	14	16.27907	14.76906
2018-12-15 20:50:00	13	15.11628	14.76906
2018-12-15 21:00:00	14	16.27907	14.76906
2018-12-15 21:10:00	14	16.27907	14.76906
2018-12-15 21:20:00	15	17.44186	14.76906
2018-12-15 21:30:00	14	16.27907	14.76906
2018-12-15 21:40:00	14	16.27907	14.76906
2018-12-15 21:50:00	14	16.27907	14.76906
2018-12-15 22:00:00	15	17.44186	14.76906
2018-12-15 22:10:00	14	16.27907	14.76906
2018-12-15 22:20:00	15	17.44186	14.76906
2018-12-15 22:30:00	14	16.27907	14.76906
2018-12-15 22:40:00	13	15.11628	14.76906
2018-12-15 22:50:00	13	15.11628	14.76906
2018-12-15 23:00:00	14	16.27907	14.76906
2018-12-15 23:10:00	12	13.95349	14.76906
2018-12-15 23:20:00	12	13.95349	14.76906
2018-12-15 23:30:00	13	15.11628	14.76906
2018-12-15 23:40:00	13	15.11628	14.76906
2018-12-15 23:50:00	14	16.27907	14.76906
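
The Score 3 arithmetic can be reproduced on a small example. The sketch below uses base R and a toy data frame with the same `node_id`, `by10`, and `val_qual` columns. The function name `score3`, the toy values, and the network size of 4 are assumptions made for illustration and are not the AoT data.

```r
# Minimal sketch of Score 3 on toy data (not the AoT data).
score3 <- function(df, network_size) {
  # keep one row per node and interval with reliable data
  ok <- unique(df[df$val_qual == 1, c("node_id", "by10")])
  # number of distinct nodes with reliable data in each interval
  counts <- tapply(ok$node_id, ok$by10, length)
  # share of the full network that is active in each interval
  prop_active <- 100 * counts / network_size
  mean(prop_active)
}

toy <- data.frame(
  node_id  = c("a", "b", "c", "a", "b", "a"),
  by10     = c("00:00", "00:00", "00:00", "00:10", "00:10", "00:20"),
  val_qual = c(1, 1, 0, 1, 1, 1)
)
toy_score <- score3(toy, network_size = 4)
print(toy_score)
```

The toy intervals have 2, 2, and 1 active nodes out of 4, giving proportions of 50, 50, and 25 and a score of about 41.7. With the full network of 86 nodes, an interval with 12 active nodes gives 100 × 12 / 86 ≈ 13.95, which matches the first row of the table above. As in the pipeline above, an interval with no reliable readings at all would not appear in the counts and so would not be averaged in as a zero.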

dfSO23%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 3`), size = 1)+
  geom_text(aes(x= `Score 3`, y=0), label='Score 3:\n Average Proportion of\nNetwork Active', size = 4, vjust= -2, hjust=1.4)+
  labs(x='Proportions of active nodes',
       y='Density',
       title='Distribution of Proportions of Active Nodes')+
  xlim(0, 100)+
  plotTheme()

Constructing Score 4

The table below shows this result. It lists the latitude and longitude of every node collecting reliable data in each 10-minute interval on 2018-12-15. Only the first 41 rows are shown, beginning with the nodes active at 12 midnight.

dfSO2%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> dfSO24

dfSO24%>%
  arrange(by10)%>%
  head(41)%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped'))%>%
  scroll_box(height='300px')

by10	node_id	lat	lon
2018-12-15 00:00:00	001e061144c0	41.76412	-87.72242
2018-12-15 00:00:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:00:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:00:00	001e06113107	41.75114	-87.71299
2018-12-15 00:00:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:00:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:00:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:00:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:00:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:00:00	001e06114503	41.66608	-87.53937
2018-12-15 00:00:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:00:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:10:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:10:00	001e06113107	41.75114	-87.71299
2018-12-15 00:10:00	001e061144c0	41.76412	-87.72242
2018-12-15 00:10:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:10:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:10:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:10:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:10:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:10:00	001e06114503	41.66608	-87.53937
2018-12-15 00:10:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:10:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:10:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:20:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:20:00	001e0610ba15	41.72246	-87.57535
2018-12-15 00:20:00	001e0610e537	41.96162	-87.66595
2018-12-15 00:20:00	001e06113cf1	41.88469	-87.62786
2018-12-15 00:20:00	001e061144c0	41.76412	-87.72242
2018-12-15 00:20:00	001e061146bc	41.91873	-87.66826
2018-12-15 00:20:00	001e06113107	41.75114	-87.71299
2018-12-15 00:20:00	001e06114503	41.66608	-87.53937
2018-12-15 00:20:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:20:00	001e0610ee43	41.78861	-87.59871
2018-12-15 00:20:00	001e0610f6db	41.79133	-87.59868
2018-12-15 00:20:00	001e0610ba46	41.87838	-87.62768
2018-12-15 00:20:00	001e06114fd4	41.79448	-87.61596
2018-12-15 00:30:00	001e0610ba13	41.75124	-87.71299
2018-12-15 00:30:00	001e06113107	41.75114	-87.71299
2018-12-15 00:30:00	001e0610f05c	41.92490	-87.68770
2018-12-15 00:30:00	001e0610ba15	41.72246	-87.57535

chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

dfSO24a<-NULL

# For each 10-minute interval, compute the share of Chicago's area covered by
# the bounding box of the nodes collecting reliable data.
for(i in unique(dfSO24$by10)){

  # nodes active in this interval
  subset <- 
    dfSO24%>%
    filter(by10==i)

  if(nrow(subset)>1){
    # project the node locations from WGS84 (EPSG:4326) to Illinois State Plane East (EPSG:3435)
    subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
    subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    # trace the bounding box of the nodes as a closed rectangle
    P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))
    # clip the rectangle to the Chicago boundary
    Ps2<-gIntersection(Ps2, chig, byid=FALSE)
    # clipped area as a proportion of the city's area
    Ps2AreaProp<-gArea(Ps2)/chigArea
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-Ps2AreaProp
    df1<-as.data.frame(df1)
    dfSO24a<-rbind(dfSO24a, df1)
  }else{
    # a single node has no bounding box, so its coverage is recorded as 0
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-0
    df1<-as.data.frame(df1)
    dfSO24a<-rbind(dfSO24a, df1)
  }

}

dfSO24a%>%
  mutate(`Score 4`= 100*mean(AreaProp))->dfSO24a
dfSO24a%>%
  head(144)%>%
  kable()%>%
  kable_styling(bootstrap_options = 'striped')%>%
  scroll_box(height = "300px")

by10	AreaProp	Score 4
2018-12-15 00:00:00	0.5961605	57.4354
2018-12-15 00:10:00	0.5961605	57.4354
2018-12-15 00:20:00	0.5961605	57.4354
2018-12-15 00:30:00	0.5961605	57.4354
2018-12-15 00:40:00	0.5961605	57.4354
2018-12-15 00:50:00	0.5961605	57.4354
2018-12-15 01:00:00	0.5961605	57.4354
2018-12-15 01:10:00	0.5961605	57.4354
2018-12-15 01:20:00	0.5595557	57.4354
2018-12-15 01:30:00	0.5595557	57.4354
2018-12-15 01:40:00	0.5595557	57.4354
2018-12-15 01:50:00	0.5595557	57.4354
2018-12-15 02:00:00	0.5595557	57.4354
2018-12-15 02:10:00	0.5595557	57.4354
2018-12-15 02:20:00	0.5595557	57.4354
2018-12-15 02:30:00	0.5595557	57.4354
2018-12-15 02:40:00	0.5595557	57.4354
2018-12-15 02:50:00	0.5595557	57.4354
2018-12-15 03:00:00	0.5595557	57.4354
2018-12-15 03:10:00	0.5961605	57.4354
2018-12-15 03:20:00	0.5961605	57.4354
2018-12-15 03:30:00	0.5961605	57.4354
2018-12-15 03:40:00	0.5961605	57.4354
2018-12-15 03:50:00	0.5961605	57.4354
2018-12-15 04:00:00	0.5961605	57.4354
2018-12-15 04:10:00	0.5961605	57.4354
2018-12-15 04:20:00	0.5961605	57.4354
2018-12-15 04:30:00	0.5961605	57.4354
2018-12-15 04:40:00	0.5961605	57.4354
2018-12-15 04:50:00	0.5961605	57.4354
2018-12-15 05:00:00	0.5961605	57.4354
2018-12-15 05:10:00	0.5961605	57.4354
2018-12-15 05:20:00	0.5961605	57.4354
2018-12-15 05:30:00	0.5961605	57.4354
2018-12-15 05:40:00	0.5961605	57.4354
2018-12-15 05:50:00	0.5961605	57.4354
2018-12-15 06:00:00	0.5961605	57.4354
2018-12-15 06:10:00	0.5961605	57.4354
2018-12-15 06:20:00	0.5961605	57.4354
2018-12-15 06:30:00	0.5961605	57.4354
2018-12-15 06:40:00	0.5961605	57.4354
2018-12-15 06:50:00	0.5961605	57.4354
2018-12-15 07:00:00	0.5961605	57.4354
2018-12-15 07:10:00	0.5961605	57.4354
2018-12-15 07:20:00	0.5961605	57.4354
2018-12-15 07:30:00	0.5961605	57.4354
2018-12-15 07:40:00	0.5961605	57.4354
2018-12-15 07:50:00	0.5961605	57.4354
2018-12-15 08:00:00	0.5961605	57.4354
2018-12-15 08:10:00	0.5961605	57.4354
2018-12-15 08:20:00	0.5961605	57.4354
2018-12-15 08:30:00	0.5961605	57.4354
2018-12-15 08:40:00	0.5961605	57.4354
2018-12-15 08:50:00	0.5961605	57.4354
2018-12-15 09:00:00	0.5961605	57.4354
2018-12-15 09:10:00	0.5961605	57.4354
2018-12-15 09:20:00	0.5595557	57.4354
2018-12-15 09:30:00	0.5595557	57.4354
2018-12-15 09:40:00	0.5595557	57.4354
2018-12-15 09:50:00	0.5595557	57.4354
2018-12-15 10:00:00	0.5595557	57.4354
2018-12-15 10:10:00	0.3996714	57.4354
2018-12-15 10:20:00	0.4315833	57.4354
2018-12-15 10:30:00	0.3996714	57.4354
2018-12-15 10:40:00	0.5595557	57.4354
2018-12-15 10:50:00	0.5595557	57.4354
2018-12-15 11:00:00	0.5595557	57.4354
2018-12-15 11:10:00	0.5595557	57.4354
2018-12-15 11:20:00	0.5595557	57.4354
2018-12-15 11:30:00	0.5595557	57.4354
2018-12-15 11:40:00	0.5595557	57.4354
2018-12-15 11:50:00	0.5961605	57.4354
2018-12-15 12:00:00	0.5961605	57.4354
2018-12-15 12:10:00	0.5961605	57.4354
2018-12-15 12:20:00	0.5961605	57.4354
2018-12-15 12:30:00	0.5961605	57.4354
2018-12-15 12:40:00	0.4315833	57.4354
2018-12-15 12:50:00	0.4315833	57.4354
2018-12-15 13:00:00	0.4315833	57.4354
2018-12-15 13:10:00	0.5961605	57.4354
2018-12-15 13:20:00	0.4315833	57.4354
2018-12-15 13:30:00	0.5961605	57.4354
2018-12-15 13:40:00	0.5961605	57.4354
2018-12-15 13:50:00	0.5961605	57.4354
2018-12-15 14:00:00	0.5961605	57.4354
2018-12-15 14:10:00	0.4315833	57.4354
2018-12-15 14:20:00	0.6010409	57.4354
2018-12-15 14:30:00	0.5961605	57.4354
2018-12-15 14:40:00	0.5961605	57.4354
2018-12-15 14:50:00	0.6010409	57.4354
2018-12-15 15:00:00	0.4315833	57.4354
2018-12-15 15:10:00	0.5961605	57.4354
2018-12-15 15:20:00	0.6010409	57.4354
2018-12-15 15:30:00	0.5961605	57.4354
2018-12-15 15:40:00	0.6010409	57.4354
2018-12-15 15:50:00	0.6010409	57.4354
2018-12-15 16:00:00	0.6010409	57.4354
2018-12-15 16:10:00	0.6010409	57.4354
2018-12-15 16:20:00	0.6010409	57.4354
2018-12-15 16:30:00	0.6010409	57.4354
2018-12-15 16:40:00	0.6010409	57.4354
2018-12-15 16:50:00	0.6010409	57.4354
2018-12-15 17:00:00	0.6010409	57.4354
2018-12-15 17:10:00	0.6010409	57.4354
2018-12-15 17:20:00	0.6010409	57.4354
2018-12-15 17:30:00	0.6010409	57.4354
2018-12-15 17:40:00	0.6010409	57.4354
2018-12-15 17:50:00	0.6010409	57.4354
2018-12-15 18:00:00	0.6010409	57.4354
2018-12-15 18:10:00	0.6010409	57.4354
2018-12-15 18:20:00	0.6010409	57.4354
2018-12-15 18:30:00	0.6010409	57.4354
2018-12-15 18:40:00	0.6010409	57.4354
2018-12-15 18:50:00	0.6010409	57.4354
2018-12-15 19:00:00	0.6010409	57.4354
2018-12-15 19:10:00	0.6010409	57.4354
2018-12-15 19:20:00	0.6010409	57.4354
2018-12-15 19:30:00	0.6010409	57.4354
2018-12-15 19:40:00	0.5961605	57.4354
2018-12-15 19:50:00	0.6010409	57.4354
2018-12-15 20:00:00	0.6010409	57.4354
2018-12-15 20:10:00	0.6010409	57.4354
2018-12-15 20:20:00	0.6010409	57.4354
2018-12-15 20:30:00	0.6010409	57.4354
2018-12-15 20:40:00	0.6010409	57.4354
2018-12-15 20:50:00	0.4364638	57.4354
2018-12-15 21:00:00	0.6010409	57.4354
2018-12-15 21:10:00	0.6010409	57.4354
2018-12-15 21:20:00	0.6010409	57.4354
2018-12-15 21:30:00	0.6010409	57.4354
2018-12-15 21:40:00	0.6010409	57.4354
2018-12-15 21:50:00	0.6010409	57.4354
2018-12-15 22:00:00	0.6010409	57.4354
2018-12-15 22:10:00	0.6010409	57.4354
2018-12-15 22:20:00	0.6010409	57.4354
2018-12-15 22:30:00	0.6010409	57.4354
2018-12-15 22:40:00	0.4364638	57.4354
2018-12-15 22:50:00	0.4364638	57.4354
2018-12-15 23:00:00	0.6010409	57.4354
2018-12-15 23:10:00	0.4315833	57.4354
2018-12-15 23:20:00	0.4315833	57.4354
2018-12-15 23:30:00	0.5961605	57.4354
2018-12-15 23:40:00	0.4315833	57.4354
2018-12-15 23:50:00	0.6010409	57.4354

We can also plot the area covered relative to the whole of Chicago for visual inspection. The 4 plots below are for 00:00 (top left), 07:00 (top right), 13:00 (bottom left), and 19:00 (bottom right). For the SO2 concentration network on this day, the area covered was not constant over the day: the table above gives about 59.6% of the city at 00:00 and 07:00, 43.2% at 13:00, and 60.1% at 19:00.

p<-list()

# number the 10-minute intervals in chronological order
by10<-data.frame(by10=sort(unique(dfSO24$by10)))
by10$no<-seq_len(nrow(by10))

dfSO24b<-merge(dfSO24, by10, by='by10', all.x=TRUE)

for(i in unique(dfSO24b$by10)){

  # nodes reporting in this 10-minute interval
  subset <- 
    dfSO24b%>%
    filter(by10==i)

  # build spatial points from lon/lat and reproject them to EPSG:3435
  subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
  subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))

  # bounding box of the nodes, written out as a closed ring of corner coordinates
  P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                 subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                 subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                 subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                 subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
               ncol = 2, byrow = TRUE) %>%
    Polygon()

  Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))

  # clip the bounding box to the Chicago boundary
  Ps2<-gIntersection(Ps2, chig, byid=FALSE)

  # index of this interval, used as the position of its plot in the list
  j<-unique(subset$no)

  # Chicago in red, covered area overlaid in blue
  p[[j]]<-spplot(chig, colorkey=FALSE, col.regions='red', 
                 sp.layout=list(list(Ps2, fill='blue', first=FALSE)))
}

library(gridExtra)
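# p holds one plot per 10-minute interval (6 per hour), so 00:00, 07:00, 13:00 and 19:00
# are elements 1, 43, 79 and 115, assuming all 144 intervals of the day are present
grid.arrange(p[[1]], p[[43]], p[[79]], p[[115]])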
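The loop above relies on sp and rgeos, which have since been retired from CRAN. As a rough illustration of the same idea (the bounding box of the active nodes, clipped to the city and expressed as a share of the city's area), the sketch below uses the sf package with made-up planar coordinates. The rectangular "city" and the three node locations are assumptions for illustration only, not AoT data, and no coordinate reference system is set.

```r
library(sf)

# Hypothetical city boundary: a 5 x 10 rectangle in planar units
city <- st_sfc(st_polygon(list(rbind(
  c(0, 0), c(5, 0), c(5, 10), c(0, 10), c(0, 0)
))))

# Hypothetical locations of nodes collecting reliable data in one interval
nodes <- st_as_sf(
  data.frame(x = c(2, 6, 4), y = c(1, 3, 7)),
  coords = c("x", "y")
)

# Bounding box of the active nodes, converted to a polygon
bbox_poly <- st_as_sfc(st_bbox(nodes))

# Clip the box to the city and express it as a percentage of the city's area
covered <- st_intersection(bbox_poly, city)
pct_covered <- 100 * as.numeric(st_area(covered)) / as.numeric(st_area(city))
pct_covered
```

With real data, the node coordinates and the city boundary would need to share a projected coordinate reference system before areas are compared, as the loop above does with EPSG:3435.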

In summary, for the AoT SO2 Concentration Network on 2018-12-15,

Score 3 = 14.8 : At any given 10-minute time interval, an average of 14.8% of the nodes in the network are collecting reliable data. This is a low score.
Score 4 = 57.4 : At any given 10-minute time interval, the nodes collecting reliable data cover an average of 57.4% of Chicago’s area. Read together with Score 3, this indicates that the small number of active nodes are widely dispersed across Chicago.

3.5 Scoring Temporal Reliability

In this section, the method of scoring Temporal Reliability is presented for each data parameter type for the day of 2018-12-15. One score is obtained for this criterion. In summary, this section determines the number of 10-minute time intervals during which reliable data is collected by each node, and expresses it as a proportion of the whole day, which consists of 144 such intervals. The average proportion of the day for which a node is active (Score 5) is the mean of the proportions calculated for all the nodes in the network.
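As a minimal sketch of this calculation, the base R example below works through Score 5 for a toy network of two nodes. The column names by10 and val_qual follow the data frames used in this document, while the readings themselves are invented for illustration.

```r
# Toy readings: one row per reading, with a 10-minute interval label (by10)
# and a reliability flag (val_qual, 1 = reliable). All values are illustrative.
readings <- data.frame(
  node_id  = c("A", "A", "A", "A", "B", "B", "B"),
  by10     = c("00:00", "00:00", "00:10", "00:20", "00:00", "00:10", "00:20"),
  val_qual = c(1, 1, 1, 1, 1, 0, 1)
)

intervals_per_day <- 144  # 24 hours x 6 ten-minute intervals

# Keep reliable readings, then keep one row per node and interval
reliable <- unique(readings[readings$val_qual == 1, c("node_id", "by10")])

# Number of intervals with reliable data for each node
duration <- table(reliable$node_id)

# Proportion of the day each node is active, in percent
prop_active <- 100 * as.numeric(duration) / intervals_per_day
names(prop_active) <- names(duration)

# Score 5 is the mean of the per-node proportions
score5 <- mean(prop_active)
score5
```

Node A has reliable data in 3 intervals and node B in 2, so the toy Score 5 is the mean of 3/144 and 2/144, expressed as a percentage.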

The flowchart below illustrates the scoring process in this section:

The scores constructed for each data parameter are presented below.

Temperature

Constructing Score 5

The figure below presents the activity of each node during the day of 2018-12-15. Each tile represents a 10-minute time interval. It can be observed from the figure that most of the nodes collect reliable data in every interval of the day. However, some experience periods of inactivity, during which no reliable data is collected.

dfTemp%>%
  filter(val_qual==1)%>%
  group_by(node_id, by10)%>%
  mutate(Active=ifelse(sum(val_qual)>0, 1, 0))%>%
  ggplot()+
  geom_tile(aes(x=by10, y=node_id, fill=as.factor(Active)), col='grey90')+
  scale_fill_manual(values='cornflowerblue',
                      name="Collecting Reliable Data",
                      labels=c("Yes"))+
  labs(x = "Time", title= paste('2018-12-15 | Node activity by time')) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))+
  theme(
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank())

The code block below translates the figure above into numerical proportions (propActive). The table then presents the duration in terms of hours for which each node is collecting reliable data (durationActive), the duration in terms of proportion of the day (propActive), and the average of these proportions (Score 5).

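dfTemp%>%
  filter(val_qual==1)%>%                          # keep reliable readings only
  select(node_id, by10)%>%
  unique()%>%                                     # one row per node per 10-minute interval
  as.data.frame()%>%
  group_by(node_id)%>%
  summarise(duration=n())%>%                      # number of active intervals per node
  mutate(durationActive = (duration*10)/60)%>%    # active intervals converted to hours
  mutate(propActive=100*durationActive/24)%>%     # active hours as a percentage of the day
  mutate(`Score 5` = mean(propActive))->dfTemp5   # network mean, repeated on every row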

dfTemp5%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	duration	durationActive	propActive	Score 5
001e0610ba13	144	24.00000	100.00000	99.15312
001e0610ba15	144	24.00000	100.00000	99.15312
001e0610ba46	144	24.00000	100.00000	99.15312
001e0610bbf9	144	24.00000	100.00000	99.15312
001e0610bc10	144	24.00000	100.00000	99.15312
001e0610bc12	144	24.00000	100.00000	99.15312
001e0610e532	144	24.00000	100.00000	99.15312
001e0610e537	144	24.00000	100.00000	99.15312
001e0610e538	134	22.33333	93.05556	99.15312
001e0610e835	144	24.00000	100.00000	99.15312
001e0610ee33	144	24.00000	100.00000	99.15312
001e0610ee36	144	24.00000	100.00000	99.15312
001e0610ee43	144	24.00000	100.00000	99.15312
001e0610ee5d	144	24.00000	100.00000	99.15312
001e0610eef2	144	24.00000	100.00000	99.15312
001e0610eef4	144	24.00000	100.00000	99.15312
001e0610ef27	144	24.00000	100.00000	99.15312
001e0610f05c	144	24.00000	100.00000	99.15312
001e0610f6db	144	24.00000	100.00000	99.15312
001e0610f703	144	24.00000	100.00000	99.15312
001e0610f732	144	24.00000	100.00000	99.15312
001e0610f8f4	144	24.00000	100.00000	99.15312
001e0610fb4c	144	24.00000	100.00000	99.15312
001e061130f4	144	24.00000	100.00000	99.15312
001e06113107	133	22.16667	92.36111	99.15312
001e061135cb	144	24.00000	100.00000	99.15312
001e06113a48	144	24.00000	100.00000	99.15312
001e06113ace	144	24.00000	100.00000	99.15312
001e06113cf1	144	24.00000	100.00000	99.15312
001e06113d22	134	22.33333	93.05556	99.15312
001e06113dbc	144	24.00000	100.00000	99.15312
001e06113f54	144	24.00000	100.00000	99.15312
001e061144c0	133	22.16667	92.36111	99.15312
001e06114500	136	22.66667	94.44444	99.15312
001e06114503	144	24.00000	100.00000	99.15312
001e0611462f	144	24.00000	100.00000	99.15312
001e061146ba	144	24.00000	100.00000	99.15312
001e061146bc	144	24.00000	100.00000	99.15312
001e06114fd4	144	24.00000	100.00000	99.15312
001e0611536c	144	24.00000	100.00000	99.15312
001e0611537d	144	24.00000	100.00000	99.15312

The density plot below shows the distribution of propActive (the proportion of the day for which each node is active) relative to the network average. This average is taken as Score 5, the average proportion of the day for which a node in the AoT network collects reliable temperature data on 2018-12-15.

dfTemp5%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 5`), size = 1)+
  geom_text(aes(x= `Score 5`, y=0), label='Score 5:\nAverage Proportion of\nactive duration', size = 4, vjust= -2, hjust=1.1)+
  labs(x='Proportions of active duration', 
       y='Density',
       title='Distribution of proportions of active duration')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT Temperature Network on 2018-12-15,

Score 5 = 99.2 : On average, a node in the network collects reliable data for 99.2% of the day.
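As a quick check on the Temperature table above, the Score 5 column can be reproduced from the interval counts alone. The sketch below enters the counts by hand (36 nodes with all 144 intervals, and five nodes with 134, 133, 134, 133 and 136 intervals) and repeats the same arithmetic as the pipeline.

```r
# Interval counts read from the Temperature table: 36 nodes with all 144
# intervals, plus five nodes with 134, 133, 134, 133 and 136 intervals
duration <- c(rep(144, 36), 134, 133, 134, 133, 136)

durationActive <- duration * 10 / 60         # hours of reliable data
propActive     <- 100 * durationActive / 24  # percent of the day
score5         <- mean(propActive)

round(score5, 5)  # 99.15312, the value in the Score 5 column
```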

Humidity

Constructing Score 5

dfHumidity%>%
  group_by(node_id, by10)%>%
  mutate(Active=ifelse(sum(val_qual)>0, 1, 0))%>%
  ggplot()+
  geom_tile(aes(x=by10, y=node_id, fill=as.factor(Active)), col='grey90')+
  scale_fill_manual(values=c('indianred1','cornflowerblue'),
                      name="Collecting Reliable Data",
                      labels=c("No", "Yes"))+
  labs(x = "Time", title= paste('2018-12-15 | Node activity by time')) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))+
  theme(
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank())

dfHumidity%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))->dfHumidity5

## Adding missing grouping variables: `date`

dfHumidity5%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	duration	durationActive	propActive	Score 5
001e0610ba13	144	24.00000	100.00000	99.57562
001e0610ba46	144	24.00000	100.00000	99.57562
001e0610bbf9	144	24.00000	100.00000	99.57562
001e0610bc12	144	24.00000	100.00000	99.57562
001e0610e532	144	24.00000	100.00000	99.57562
001e0610e537	144	24.00000	100.00000	99.57562
001e0610ee33	144	24.00000	100.00000	99.57562
001e0610ee36	144	24.00000	100.00000	99.57562
001e0610ee43	144	24.00000	100.00000	99.57562
001e0610ee5d	144	24.00000	100.00000	99.57562
001e0610f6db	144	24.00000	100.00000	99.57562
001e0610f732	144	24.00000	100.00000	99.57562
001e061130f4	144	24.00000	100.00000	99.57562
001e06113107	133	22.16667	92.36111	99.57562
001e06113a48	144	24.00000	100.00000	99.57562
001e06113cf1	144	24.00000	100.00000	99.57562
001e06113dbc	144	24.00000	100.00000	99.57562
001e06113f54	144	24.00000	100.00000	99.57562

dfHumidity5%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 5`), size = 1)+
  geom_text(aes(x= `Score 5`, y=0), label='Score 5:\nAverage Proportion of\nactive duration', size = 4, vjust= -2, hjust=1.1)+
  labs(x='Proportions of active duration', 
       y='Density',
       title='Distribution of proportions of active duration')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT Humidity Network on 2018-12-15,

Score 5 = 99.6 : On average, a node in the network collects reliable data for 99.6% of the day.

Pressure

Constructing Score 5

dfPressure%>%
  group_by(node_id, by10)%>%
  mutate(Active=ifelse(sum(val_qual)>0, 1, 0))%>%
  ggplot()+
  geom_tile(aes(x=by10, y=node_id, fill=as.factor(Active)), col='grey90')+
  scale_fill_manual(values=c('indianred1','cornflowerblue'),
                      name="Collecting Reliable Data",
                      labels=c("No", "Yes"))+
  labs(x = "Time", title= paste('2018-12-15 | Node activity by time')) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))+
  theme(
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank())

dfPressure%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))->dfPressure5

## Adding missing grouping variables: `date`

dfPressure5%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	duration	durationActive	propActive	Score 5
001e0610ba13	144	24.00000	100.00000	99.07407
001e0610ba15	144	24.00000	100.00000	99.07407
001e0610ba46	144	24.00000	100.00000	99.07407
001e0610bbf9	144	24.00000	100.00000	99.07407
001e0610bc10	144	24.00000	100.00000	99.07407
001e0610e532	144	24.00000	100.00000	99.07407
001e0610e537	144	24.00000	100.00000	99.07407
001e0610e538	134	22.33333	93.05556	99.07407
001e0610ee33	144	24.00000	100.00000	99.07407
001e0610ee36	144	24.00000	100.00000	99.07407
001e0610ee43	144	24.00000	100.00000	99.07407
001e0610ee5d	144	24.00000	100.00000	99.07407
001e0610eef4	144	24.00000	100.00000	99.07407
001e0610f05c	144	24.00000	100.00000	99.07407
001e0610f6db	144	24.00000	100.00000	99.07407
001e0610f732	144	24.00000	100.00000	99.07407
001e061130f4	144	24.00000	100.00000	99.07407
001e06113107	133	22.16667	92.36111	99.07407
001e06113a48	144	24.00000	100.00000	99.07407
001e06113cf1	144	24.00000	100.00000	99.07407
001e06113dbc	144	24.00000	100.00000	99.07407
001e06113f54	144	24.00000	100.00000	99.07407
001e061144c0	133	22.16667	92.36111	99.07407
001e0611537d	144	24.00000	100.00000	99.07407

dfPressure5%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 5`), size = 1)+
  geom_text(aes(x= `Score 5`, y=0), label='Score 5:\nAverage Proportion of\nactive duration', size = 4, vjust= -2, hjust=1.1)+
  labs(x='Proportions of active duration', 
       y='Density',
       title='Distribution of proportions of active duration')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT Pressure Network on 2018-12-15,

Score 5 = 99.1 : On average, a node in the network collects reliable data for 99.1% of the day.

PM2.5 Concentration

Constructing Score 5

dfPM25%>%
  group_by(node_id, by10)%>%
  mutate(Active=ifelse(sum(val_qual)>0, 1, 0))%>%
  ggplot()+
  geom_tile(aes(x=by10, y=node_id, fill=as.factor(Active)), col='grey90')+
  scale_fill_manual(values=c('indianred1','cornflowerblue'),
                      name="Collecting Reliable Data",
                      labels=c("No", "Yes"))+
  labs(x = "Time", title= paste('2018-12-15 | Node activity by time')) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))+
  theme(
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank())

dfPM25%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))->dfPM255

dfPM255%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	duration	durationActive	propActive	Score 5
001e0610bc10	68	11.33333	47.22222	69.79167
001e06113107	133	22.16667	92.36111	69.79167

dfPM255%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 5`), size = 1)+
  geom_text(aes(x= `Score 5`, y=0), label='Score 5:\nAverage Proportion of\nactive duration', size = 4, vjust= -2, hjust=1.1)+
  labs(x='Proportions of active duration', 
       y='Density',
       title='Distribution of proportions of active duration')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT PM2.5 Concentration Network on 2018-12-15,

Score 5 = 69.8 : On average, a node in the network collects reliable data for 69.8% of the day.
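One property of the pipeline is worth illustrating. Because it filters on val_qual==1 before grouping by node, a node with no reliable readings on the day does not appear in the table and is not part of the Score 5 average. The sketch below uses made-up node names and interval counts to show how the score would differ if such nodes were instead counted as 0% active. Which treatment is preferable is a design choice; the scores reported in this section follow the first approach.

```r
# Hypothetical network of three nodes; node "C" has no reliable interval
all_nodes <- c("A", "B", "C")
duration  <- c(A = 72, B = 144)  # reliable 10-minute intervals per node

# Approach used in this section: average over nodes that remain after filtering
prop_active <- 100 * duration / 144
score5_active_only <- mean(prop_active)

# Alternative: give missing nodes a count of zero before averaging
duration_all <- setNames(rep(0, length(all_nodes)), all_nodes)
duration_all[names(duration)] <- duration
score5_all_nodes <- mean(100 * duration_all / 144)

c(active_only = score5_active_only, all_nodes = score5_all_nodes)
```

In this toy case the score falls from 75 to 50 once the silent node is included, which shows how sensitive the average can be when only a few nodes carry a given sensor.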

CO Concentration

Constructing Score 5

dfCO%>%
  group_by(node_id, by10)%>%
  mutate(Active=ifelse(sum(val_qual)>0, 1, 0))%>%
  ggplot()+
  geom_tile(aes(x=by10, y=node_id, fill=as.factor(Active)), col='grey90')+
  scale_fill_manual(values=c('indianred1','cornflowerblue'),
                      name="Collecting Reliable Data",
                      labels=c("No", "Yes"))+
  labs(x = "Time", title= paste('2018-12-15 | Node activity by time')) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))+
  theme(
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank())

dfCO%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))->dfCO5

dfCO5%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	duration	durationActive	propActive	Score 5
001e0610ba13	127	21.166667	88.19444	81.79012
001e0610ba15	58	9.666667	40.27778	81.79012
001e0610ba46	44	7.333333	30.55556	81.79012
001e0610bc10	134	22.333333	93.05556	81.79012
001e0610e537	134	22.333333	93.05556	81.79012
001e0610ee43	142	23.666667	98.61111	81.79012
001e0610eef2	144	24.000000	100.00000	81.79012
001e0610f05c	137	22.833333	95.13889	81.79012
001e0610f6db	94	15.666667	65.27778	81.79012
001e061130f4	135	22.500000	93.75000	81.79012
001e06113107	114	19.000000	79.16667	81.79012
001e06113ace	142	23.666667	98.61111	81.79012
001e06113cf1	140	23.333333	97.22222	81.79012
001e061144c0	56	9.333333	38.88889	81.79012
001e06114500	136	22.666667	94.44444	81.79012
001e06114503	105	17.500000	72.91667	81.79012
001e061146bc	141	23.500000	97.91667	81.79012
001e06114fd4	137	22.833333	95.13889	81.79012

dfCO5%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 5`), size = 1)+
  geom_text(aes(x= `Score 5`, y=0), label='Score 5:\nAverage Proportion of\nactive duration', size = 4, vjust= -2, hjust=1.1)+
  labs(x='Proportions of active duration', 
       y='Density',
       title='Distribution of proportions of active duration')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT CO Concentration Network on 2018-12-15,

Score 5 = 81.8 : On average, a node in the network collects reliable data for 81.8% of the day.

H2S Concentration

Constructing Score 5

dfH2S%>%
  group_by(node_id, by10)%>%
  mutate(Active=ifelse(sum(val_qual)>0, 1, 0))%>%
  ggplot()+
  geom_tile(aes(x=by10, y=node_id, fill=as.factor(Active)), col='grey90')+
  scale_fill_manual(values=c('indianred1','cornflowerblue'),
                      name="Collecting Reliable Data",
                      labels=c("No", "Yes"))+
  labs(x = "Time", title= paste('2018-12-15 | Node activity by time')) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))+
  theme(
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank())

dfH2S%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))->dfH2S5

dfH2S5%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	duration	durationActive	propActive	Score 5
001e0610ba13	144	24.000000	100.00000	94.63735
001e0610ba15	110	18.333333	76.38889	94.63735
001e0610ba46	144	24.000000	100.00000	94.63735
001e0610bc10	144	24.000000	100.00000	94.63735
001e0610e537	144	24.000000	100.00000	94.63735
001e0610ee43	144	24.000000	100.00000	94.63735
001e0610eef2	144	24.000000	100.00000	94.63735
001e0610f05c	144	24.000000	100.00000	94.63735
001e0610f6db	144	24.000000	100.00000	94.63735
001e061130f4	144	24.000000	100.00000	94.63735
001e06113107	132	22.000000	91.66667	94.63735
001e06113ace	144	24.000000	100.00000	94.63735
001e06113cf1	144	24.000000	100.00000	94.63735
001e061144c0	59	9.833333	40.97222	94.63735
001e06114500	136	22.666667	94.44444	94.63735
001e06114503	144	24.000000	100.00000	94.63735
001e061146bc	144	24.000000	100.00000	94.63735
001e06114fd4	144	24.000000	100.00000	94.63735

dfH2S5%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 5`), size = 1)+
  geom_text(aes(x= `Score 5`, y=0), label='Score 5:\nAverage Proportion of\nactive duration', size = 4, vjust= -2, hjust=1.1)+
  labs(x='Proportions of active duration', 
       y='Density',
       title='Distribution of proportions of active duration')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT H2S Concentration Network on 2018-12-15,

Score 5 = 94.6 : On average, a node in the network collects reliable data for 94.6% of the day.

NO2 Concentration

Constructing Score 5

dfNO2%>%
  group_by(node_id, by10)%>%
  mutate(Active=ifelse(sum(val_qual)>0, 1, 0))%>%
  ggplot()+
  geom_tile(aes(x=by10, y=node_id, fill=as.factor(Active)), col='grey90')+
  scale_fill_manual(values=c('indianred1','cornflowerblue'),
                      name="Collecting Reliable Data",
                      labels=c("No", "Yes"))+
  labs(x = "Time", title= paste('2018-12-15 | Node activity by time')) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))+
  theme(
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank())

dfNO2%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))->dfNO25

dfNO25%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	duration	durationActive	propActive	Score 5
001e0610ba13	142	23.66667	98.61111	95.56327
001e0610ba15	144	24.00000	100.00000	95.56327
001e0610ba46	144	24.00000	100.00000	95.56327
001e0610bc10	127	21.16667	88.19444	95.56327
001e0610e537	78	13.00000	54.16667	95.56327
001e0610ee43	144	24.00000	100.00000	95.56327
001e0610eef2	144	24.00000	100.00000	95.56327
001e0610f05c	144	24.00000	100.00000	95.56327
001e0610f6db	144	24.00000	100.00000	95.56327
001e061130f4	144	24.00000	100.00000	95.56327
001e06113107	133	22.16667	92.36111	95.56327
001e06113ace	144	24.00000	100.00000	95.56327
001e06113cf1	144	24.00000	100.00000	95.56327
001e061144c0	133	22.16667	92.36111	95.56327
001e06114500	136	22.66667	94.44444	95.56327
001e06114503	144	24.00000	100.00000	95.56327
001e061146bc	144	24.00000	100.00000	95.56327
001e06114fd4	144	24.00000	100.00000	95.56327

dfNO25%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 5`), size = 1)+
  geom_text(aes(x= `Score 5`, y=0), label='Score 5:\nAverage Proportion of\nactive duration', size = 4, vjust= -2, hjust=1.1)+
  labs(x='Proportions of active duration', 
       y='Density',
       title='Distribution of proportions of active duration')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT NO2 Concentration Network on 2018-12-15,

Score 5 = 95.6 : On average, a node in the network collects reliable data for 95.6% of the day.

O3 Concentration

Constructing Score 5

dfO3%>%
  group_by(node_id, by10)%>%
  mutate(Active=ifelse(sum(val_qual)>0, 1, 0))%>%
  ggplot()+
  geom_tile(aes(x=by10, y=node_id, fill=as.factor(Active)), col='grey90')+
  scale_fill_manual(values=c('indianred1','cornflowerblue'),
                      name="Collecting Reliable Data",
                      labels=c("No", "Yes"))+
  labs(x = "Time", title= paste('2018-12-15 | Node activity by time')) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))+
  theme(
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank())

dfO3%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))->dfO35

dfO35%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	duration	durationActive	propActive	Score 5
001e0610ba13	144	24.00000	100.00000	98.37963
001e0610ba15	144	24.00000	100.00000	98.37963
001e0610ba46	144	24.00000	100.00000	98.37963
001e0610bc10	144	24.00000	100.00000	98.37963
001e0610e537	144	24.00000	100.00000	98.37963
001e0610ee43	144	24.00000	100.00000	98.37963
001e0610eef2	144	24.00000	100.00000	98.37963
001e0610f05c	144	24.00000	100.00000	98.37963
001e0610f6db	144	24.00000	100.00000	98.37963
001e061130f4	144	24.00000	100.00000	98.37963
001e06113107	133	22.16667	92.36111	98.37963
001e06113ace	144	24.00000	100.00000	98.37963
001e06113cf1	144	24.00000	100.00000	98.37963
001e061144c0	133	22.16667	92.36111	98.37963
001e06114500	131	21.83333	90.97222	98.37963
001e06114503	137	22.83333	95.13889	98.37963
001e061146bc	144	24.00000	100.00000	98.37963
001e06114fd4	144	24.00000	100.00000	98.37963

dfO35%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 5`), size = 1)+
  geom_text(aes(x= `Score 5`, y=0), label='Score 5:\nAverage Proportion of\nactive duration', size = 4, vjust= -2, hjust=1.1)+
  labs(x='Proportions of active duration', 
       y='Density',
       title='Distribution of proportions of active duration')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT O3 Concentration Network on 2018-12-15,

Score 5 = 98.4 : On average, a node in the network collects reliable data for 98.4% of the day.

SO2 Concentration

Constructing Score 5

dfSO2%>%
  group_by(node_id, by10)%>%
  mutate(Active=ifelse(sum(val_qual)>0, 1, 0))%>%
  ggplot()+
  geom_tile(aes(x=by10, y=node_id, fill=as.factor(Active)), col='grey90')+
  scale_fill_manual(values=c('indianred1','cornflowerblue'),
                      name="Collecting Reliable Data",
                      labels=c("No", "Yes"))+
  labs(x = "Time", title= paste('2018-12-15 | Node activity by time')) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0))+
  theme(
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank())

dfSO2%>%
  filter(val_qual==1)%>%
  select(node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))->dfSO25

dfSO25%>%
  kable()%>%
  kable_styling(bootstrap_options = c('striped', 'hover'))%>%
  scroll_box(height = "300px")

node_id	duration	durationActive	propActive	Score 5
001e0610ba13	144	24.0000000	100.000000	84.67593
001e0610ba15	144	24.0000000	100.000000	84.67593
001e0610ba46	144	24.0000000	100.000000	84.67593
001e0610e537	144	24.0000000	100.000000	84.67593
001e0610ee43	144	24.0000000	100.000000	84.67593
001e0610eef2	48	8.0000000	33.333333	84.67593
001e0610f05c	144	24.0000000	100.000000	84.67593
001e0610f6db	144	24.0000000	100.000000	84.67593
001e061130f4	4	0.6666667	2.777778	84.67593
001e06113107	133	22.1666667	92.361111	84.67593
001e06113cf1	144	24.0000000	100.000000	84.67593
001e061144c0	119	19.8333333	82.638889	84.67593
001e06114503	129	21.5000000	89.583333	84.67593
001e061146bc	144	24.0000000	100.000000	84.67593
001e06114fd4	100	16.6666667	69.444444	84.67593

dfSO25%>%
  ggplot()+
  geom_density(aes(propActive), fill='indianred', col='indianred', alpha=0.1)+
  geom_vline(aes(xintercept = `Score 5`), size = 1)+
  geom_text(aes(x= `Score 5`, y=0), label='Score 5:\nAverage Proportion of\nactive duration', size = 4, vjust= -2, hjust=1.1)+
  labs(x='Proportions of active duration', 
       y='Density',
       title='Distribution of proportions of active duration')+
  xlim(0, 100)+
  plotTheme()

In summary, for the AoT SO2 Concentration Network on 2018-12-15,

Score 5 = 84.7 : On average, a node in the network collects reliable data for 84.7% of the day.

3.6 Scoring Overall Reliability

In this section, the 5 component scores evaluating the 3 different reliability criteria are averaged to score the Overall Reliability of the AoT network for the day of 2018-12-15, for each data parameter type.
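The averaging step can be sketched in a few lines of base R. The five values below are placeholders chosen for illustration and are not results from this project. Holding the components in a named vector keeps each label attached to its value, even if two components happen to be equal.

```r
# Component scores as a named vector. All five values are placeholders
# for illustration only; they are not results from this project.
component_scores <- c(
  "Score 1" = 90,
  "Score 2" = 80,
  "Score 3" = 70,
  "Score 4" = 60,
  "Score 5" = 50
)

# The overall score is the unweighted mean of the five components
overall <- mean(component_scores)

# One row per component plus the overall score, ready for plotting
score_table <- data.frame(
  Score = c(names(component_scores), "Overall"),
  `Component Scores` = c(unname(component_scores), overall),
  check.names = FALSE
)
score_table
```

An unweighted mean treats the three reliability criteria unevenly, since Sensor Value Reliability and Spatial Coverage each contribute two components while Temporal Coverage contributes one.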

The overall scores for each data parameter are presented below.

Temperature

tempScore<-as.data.frame(c(mean(dfTemp1$NodeMeanX), 
                           dfTemp2$`Score 2`, 
                           dfTemp3$`Score 3`, 
                           dfTemp4a$`Score 4`, 
                           dfTemp5$`Score 5`))%>%
  unique()%>%
  as.data.frame()

colnames(tempScore)<-'Component Scores'
tempScore<-rbind(tempScore, mean(tempScore$`Component Scores`))

tempScore$Score<-c('Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5', 'Overall')

caption<-"Indicator of Sensor Value Reliability - Score 1: Average proportion of reliable data collected by each node\n
Indicator of Sensor Value Reliability - Score 2: Overall consistency in sensor value reliability\n
Indicator of Spatial Coverage - Score 3: Average proportion of network 'active'\n
Indicator of Spatial Coverage - Score 4: Average proportion of Chicago area covered\n
Indicator of Temporal Coverage - Score 5: Average proportion of day-duration node is 'active'"
caption <- paste0(strwrap(caption, 50), sep="", collapse="\n")

tempScore%>%
  arrange(Score)%>%
  ggplot()+
  geom_col(aes(y=`Component Scores`, x=Score), 
           fill=c('indianred4', 
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred'))+
  geom_text(aes(y=`Component Scores`, x=Score, label=round(`Component Scores`, digits=1)), vjust=-0.5)+
  labs(x='Score type',
       y='Score value',
       title='Distribution of overall score and its 5 component scores - Temperature', 
       caption=caption)+
  geom_hline(yintercept=100, col='grey50', size=1)+
  theme_classic()

Humidity

HumidityScore<-as.data.frame(c(mean(dfHumidity1$NodeMeanX), 
                           dfHumidity2$`Score 2`, 
                           dfHumidity3$`Score 3`, 
                           dfHumidity4a$`Score 4`, 
                           dfHumidity5$`Score 5`))%>%
  unique()%>%
  as.data.frame()

colnames(HumidityScore)<-'Component Scores'
HumidityScore<-rbind(HumidityScore, mean(HumidityScore$`Component Scores`))

HumidityScore$Score<-c('Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5', 'Overall')

caption<-"Indicator of Sensor Value Reliability - Score 1: Average proportion of reliable data collected by each node\n
Indicator of Sensor Value Reliability - Score 2: Overall consistency in sensor value reliability\n
Indicator of Spatial Coverage - Score 3: Average proportion of network 'active'\n
Indicator of Spatial Coverage - Score 4: Average proportion of Chicago area covered\n
Indicator of Temporal Coverage - Score 5: Average proportion of day-duration node is 'active'"
caption <- paste0(strwrap(caption, 50), sep="", collapse="\n")

HumidityScore%>%
  arrange(Score)%>%
  ggplot()+
  geom_col(aes(y=`Component Scores`, x=Score), 
           fill=c('indianred4', 
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred'))+
  geom_text(aes(y=`Component Scores`, x=Score, label=round(`Component Scores`, digits=1)), vjust=-0.5)+
  labs(x='Score type',
       y='Score value',
       title='Distribution of overall score and its 5 component scores - Humidity', 
       caption=caption)+
  geom_hline(yintercept=100, col='grey50', size=1)+
  theme_classic()

Pressure

PressureScore<-as.data.frame(c(mean(dfPressure1$NodeMeanX), 
                           dfPressure2$`Score 2`, 
                           dfPressure3$`Score 3`, 
                           dfPressure4a$`Score 4`, 
                           dfPressure5$`Score 5`))%>%
  unique()%>%
  as.data.frame()

colnames(PressureScore)<-'Component Scores'
PressureScore<-rbind(PressureScore, mean(PressureScore$`Component Scores`))

PressureScore$Score<-c('Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5', 'Overall')

caption<-"Indicator of Sensor Value Reliability - Score 1: Average proportion of reliable data collected by each node\n
Indicator of Sensor Value Reliability - Score 2: Overall consistency in sensor value reliability\n
Indicator of Spatial Coverage - Score 3: Average proportion of network 'active'\n
Indicator of Spatial Coverage - Score 4: Average proportion of Chicago area covered\n
Indicator of Temporal Coverage - Score 5: Average proportion of day-duration node is 'active'"
caption <- paste0(strwrap(caption, 50), sep="", collapse="\n")


PressureScore%>%
  arrange(Score)%>%
  ggplot()+
  geom_col(aes(y=`Component Scores`, x=Score), 
           fill=c('indianred4', 
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred'))+
  geom_text(aes(y=`Component Scores`, x=Score, label=round(`Component Scores`, digits=1)), vjust=-0.5)+
  labs(x='Score type',
       y='Score value',
       title='Distribution of overall score and its 5 component scores - Pressure', 
       caption=caption)+
  geom_hline(yintercept=100, col='grey50', size=1)+
  theme_classic()

PM2.5 Concentration

PM25Score<-as.data.frame(c(mean(dfPM251$NodeMeanX), 
                           dfPM252$`Score 2`, 
                           dfPM253$`Score 3`, 
                           dfPM254a$`Score 4`, 
                           dfPM255$`Score 5`))%>%
  unique()%>%
  as.data.frame()

colnames(PM25Score)<-'Component Scores'
PM25Score<-rbind(PM25Score, mean(PM25Score$`Component Scores`))

PM25Score$Score<-c('Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5', 'Overall')

caption<-"Indicator of Sensor Value Reliability - Score 1: Average proportion of reliable data collected by each node\n
Indicator of Sensor Value Reliability - Score 2: Overall consistency in sensor value reliability\n
Indicator of Spatial Coverage - Score 3: Average proportion of network 'active'\n
Indicator of Spatial Coverage - Score 4: Average proportion of Chicago area covered\n
Indicator of Temporal Coverage - Score 5: Average proportion of day-duration node is 'active'"
caption <- paste0(strwrap(caption, 50), sep="", collapse="\n")

PM25Score%>%
  arrange(Score)%>%
  ggplot()+
  geom_col(aes(y=`Component Scores`, x=Score), 
           fill=c('indianred4', 
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred'))+
  geom_text(aes(y=`Component Scores`, x=Score, label=round(`Component Scores`, digits=1)), vjust=-0.5)+
  labs(x='Score type',
       y='Score value',
       title='Distribution of overall score and its 5 component scores - PM 2.5 Concentration', 
       caption=caption)+
  geom_hline(yintercept=100, col='grey50', size=1)+
  theme_classic()

CO Concentration

COScore<-as.data.frame(c(mean(dfCO1$NodeMeanX), 
                           dfCO2$`Score 2`, 
                           dfCO3$`Score 3`, 
                           dfCO4a$`Score 4`, 
                           dfCO5$`Score 5`))%>%
  unique()%>%
  as.data.frame()

colnames(COScore)<-'Component Scores'
COScore<-rbind(COScore, mean(COScore$`Component Scores`))

COScore$Score<-c('Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5', 'Overall')

caption<-"Indicator of Sensor Value Reliability - Score 1: Average proportion of reliable data collected by each node\n
Indicator of Sensor Value Reliability - Score 2: Overall consistency in sensor value reliability\n
Indicator of Spatial Coverage - Score 3: Average proportion of network 'active'\n
Indicator of Spatial Coverage - Score 4: Average proportion of Chicago area covered\n
Indicator of Temporal Coverage - Score 5: Average proportion of day-duration node is 'active'"
caption <- paste0(strwrap(caption, 50), sep="", collapse="\n")


COScore%>%
  arrange(Score)%>%
  ggplot()+
  geom_col(aes(y=`Component Scores`, x=Score), 
           fill=c('indianred4', 
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred'))+
  geom_text(aes(y=`Component Scores`, x=Score, label=round(`Component Scores`, digits=1)), vjust=-0.5)+
  labs(x='Score type',
       y='Score value',
       title='Distribution of overall score and its 5 component scores - CO concentration', 
       caption=caption)+
  geom_hline(yintercept=100, col='grey50', size=1)+
  theme_classic()

H2S Concentration

H2SScore<-as.data.frame(c(mean(dfH2S1$NodeMeanX), 
                           dfH2S2$`Score 2`, 
                           dfH2S3$`Score 3`, 
                           dfH2S4a$`Score 4`, 
                           dfH2S5$`Score 5`))%>%
  unique()%>%
  as.data.frame()

colnames(H2SScore)<-'Component Scores'
H2SScore<-rbind(H2SScore, mean(H2SScore$`Component Scores`))

H2SScore$Score<-c('Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5', 'Overall')

caption<-"Indicator of Sensor Value Reliability - Score 1: Average proportion of reliable data collected by each node\n
Indicator of Sensor Value Reliability - Score 2: Overall consistency in sensor value reliability\n
Indicator of Spatial Coverage - Score 3: Average proportion of network 'active'\n
Indicator of Spatial Coverage - Score 4: Average proportion of Chicago area covered\n
Indicator of Temporal Coverage - Score 5: Average proportion of day-duration node is 'active'"
caption <- paste0(strwrap(caption, 50), sep="", collapse="\n")


H2SScore%>%
  arrange(Score)%>%
  ggplot()+
  geom_col(aes(y=`Component Scores`, x=Score), 
           fill=c('indianred4', 
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred'))+
  geom_text(aes(y=`Component Scores`, x=Score, label=round(`Component Scores`, digits=1)), vjust=-0.5)+
  labs(x='Score type',
       y='Score value',
       title='Distribution of overall score and its 5 component scores - H2S Concentration', 
       caption=caption)+
  geom_hline(yintercept=100, col='grey50', size=1)+
  theme_classic()

NO2 Concentration

NO2Score<-as.data.frame(c(mean(dfNO21$NodeMeanX), 
                           dfNO22$`Score 2`, 
                           dfNO23$`Score 3`, 
                           dfNO24a$`Score 4`, 
                           dfNO25$`Score 5`))%>%
  unique()%>%
  as.data.frame()

colnames(NO2Score)<-'Component Scores'
NO2Score<-rbind(NO2Score, mean(NO2Score$`Component Scores`))

NO2Score$Score<-c('Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5', 'Overall')

caption<-"Indicator of Sensor Value Reliability - Score 1: Average proportion of reliable data collected by each node\n
Indicator of Sensor Value Reliability - Score 2: Overall consistency in sensor value reliability\n
Indicator of Spatial Coverage - Score 3: Average proportion of network 'active'\n
Indicator of Spatial Coverage - Score 4: Average proportion of Chicago area covered\n
Indicator of Temporal Coverage - Score 5: Average proportion of day-duration node is 'active'"
caption <- paste0(strwrap(caption, 50), sep="", collapse="\n")


NO2Score%>%
  arrange(Score)%>%
  ggplot()+
  geom_col(aes(y=`Component Scores`, x=Score), 
           fill=c('indianred4', 
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred'))+
  geom_text(aes(y=`Component Scores`, x=Score, label=round(`Component Scores`, digits=1)), vjust=-0.5)+
  labs(x='Score type',
       y='Score value',
       title='Distribution of overall score and its 5 component scores - NO2 Concentration', 
       caption=caption)+
  geom_hline(yintercept=100, col='grey50', size=1)+
  theme_classic()

O3 Concentration

O3Score<-as.data.frame(c(mean(dfO31$NodeMeanX), 
                           dfO32$`Score 2`, 
                           dfO33$`Score 3`, 
                           dfO34a$`Score 4`, 
                           dfO35$`Score 5`))%>%
  unique()%>%
  as.data.frame()

colnames(O3Score)<-'Component Scores'
O3Score<-rbind(O3Score, mean(O3Score$`Component Scores`))

O3Score$Score<-c('Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5', 'Overall')

caption<-"Indicator of Sensor Value Reliability - Score 1: Average proportion of reliable data collected by each node\n
Indicator of Sensor Value Reliability - Score 2: Overall consistency in sensor value reliability\n
Indicator of Spatial Coverage - Score 3: Average proportion of network 'active'\n
Indicator of Spatial Coverage - Score 4: Average proportion of Chicago area covered\n
Indicator of Temporal Coverage - Score 5: Average proportion of day-duration node is 'active'"
caption <- paste0(strwrap(caption, 50), sep="", collapse="\n")


O3Score%>%
  arrange(Score)%>%
  ggplot()+
  geom_col(aes(y=`Component Scores`, x=Score), 
           fill=c('indianred4', 
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred'))+
  geom_text(aes(y=`Component Scores`, x=Score, label=round(`Component Scores`, digits=1)), vjust=-0.5)+
  labs(x='Score type',
       y='Score value',
       title='Distribution of overall score and its 5 component scores - O3 Concentration', 
       caption=caption)+
  geom_hline(yintercept=100, col='grey50', size=1)+
  theme_classic()

SO2 Concentration

SO2Score<-as.data.frame(c(mean(dfSO21$NodeMeanX), 
                           dfSO22$`Score 2`, 
                           dfSO23$`Score 3`, 
                           dfSO24a$`Score 4`, 
                           dfSO25$`Score 5`))%>%
  unique()%>%
  as.data.frame()

colnames(SO2Score)<-'Component Scores'
SO2Score<-rbind(SO2Score, mean(SO2Score$`Component Scores`))

SO2Score$Score<-c('Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5', 'Overall')

caption<-"Indicator of Sensor Value Reliability - Score 1: Average proportion of reliable data collected by each node\n
Indicator of Sensor Value Reliability - Score 2: Overall consistency in sensor value reliability\n
Indicator of Spatial Coverage - Score 3: Average proportion of network 'active'\n
Indicator of Spatial Coverage - Score 4: Average proportion of Chicago area covered\n
Indicator of Temporal Coverage - Score 5: Average proportion of day-duration node is 'active'"
caption <- paste0(strwrap(caption, 50), sep="", collapse="\n")


SO2Score%>%
  arrange(Score)%>%
  ggplot()+
  geom_col(aes(y=`Component Scores`, x=Score), 
           fill=c('indianred4', 
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred',
                  'indianred'))+
  geom_text(aes(y=`Component Scores`, x=Score, label=round(`Component Scores`, digits=1)), vjust=-0.5)+
  labs(x='Score type',
       y='Score value',
       title='Distribution of overall score and its 5 component scores - SO2 Concentration', 
       caption=caption)+
  geom_hline(yintercept=100, col='grey50', size=1)+
  theme_classic()
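
The blocks in the tabs above differ only in their input objects and plot titles, so the table-building step could be wrapped in a small helper. The sketch below is one possible way to do this in base R. The name build_score_table and the input values are hypothetical and do not appear in the original code.

```r
# Hypothetical helper: assemble the five component scores and their mean
# into the two-column layout used by the plotting code above.
build_score_table <- function(components) {
  stopifnot(length(components) == 5)
  data.frame(
    `Component Scores` = c(components, mean(components)),
    Score = c(paste('Score', 1:5), 'Overall'),
    check.names = FALSE,
    stringsAsFactors = FALSE
  )
}

# Made-up component values, for illustration only
exampleScore <- build_score_table(c(80, 60, 70, 50, 90))
exampleScore
```

Because the helper takes exactly five values, it does not rely on unique() to collapse repeated values, so two component scores that happen to be equal would not shorten the table.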

3.6 The scores for December 2018

We apply the scoring method above to the data collected by the AoT network for the whole month of December 2018. The code block below provides the general code required to apply the scoring method to a month’s worth of data.

The scores were computed separately for each data type and saved to separate files. To obtain a full set of December scores for all the data types, we then combined these files into dfScoresAll.rds. The plots in the sections below use this compiled set of scores.

Here, df is the dataset retrieved using the retrieval function provided in Section 3.2.

1. Scoring

#score 1
df%>%
  group_by(date, node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(date, node_id, by10, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, node_id)%>%
  mutate(NodeMeanX = sum(X)/144)%>%
  select(date, node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date)%>%
  mutate(`Score 1`=mean(NodeMeanX))%>%
  select(date,`Score 1` )%>%
  unique()%>%
  as.data.frame()->score1


#score 2
df%>%
  group_by(date, node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(date, node_id, by10, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, node_id)%>%
  mutate(NodeMeanX = sum(X)/144)%>%
  select(date, node_id, by10, NodeMeanX, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, node_id)%>%
  mutate(NodeSD=ifelse(abs(sd(X))>NodeMeanX, 
                       NodeMeanX,
                       abs(sd(X))))%>%
  group_by(date, node_id)%>%
  mutate(NodeSDScore= ifelse(NodeMeanX==0, 
                             0, 
                             ifelse(NodeMeanX<50, 
                                    abs(100-abs(100-100*(NodeSD/NodeMeanX))),
                                    abs(100-100*(NodeSD/NodeMeanX)))))%>%
  select(date, node_id, NodeSDScore)%>%
  na.omit()%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date)%>%
  mutate(`Score 2`= mean(NodeSDScore))%>%
  select(date,`Score 2`)%>%
  unique()%>%
  as.data.frame()->score2
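
The nested ifelse() in the Score 2 step above can be read more easily once it is simplified. Because NodeSD is capped at NodeMeanX, the ratio NodeSD/NodeMeanX lies between 0 and 1, so the expression reduces to 100 times the ratio when the mean is below 50, and 100 times (1 minus the ratio) otherwise. The sketch below restates this for a single node and day. The function name node_sd_score and the inputs are hypothetical.

```r
# Consistency score for a single node-day, mirroring the nested ifelse above.
# X:         per-interval reliability percentages for the node
# NodeMeanX: the node's daily mean reliability (sum(X) / 144 in the pipeline)
node_sd_score <- function(X, NodeMeanX) {
  if (NodeMeanX == 0) return(0)
  NodeSD <- min(sd(X), NodeMeanX)  # cap the spread at the mean
  ratio  <- NodeSD / NodeMeanX     # now between 0 and 1
  if (NodeMeanX < 50) 100 * ratio else 100 * (1 - ratio)
}

node_sd_score(c(100, 100, 100), 100)  # no spread around a high mean
node_sd_score(c(0, 100), 50)          # spread capped at the mean
node_sd_score(c(10, 30), 20)          # mean below 50
```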

#Score 3

df%>%
  filter(val_qual==1)%>%
  select(date, node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))%>%
  select(date, `Score 3`)%>%
  unique()%>%
  as.data.frame()->score3
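
As a rough illustration of the Score 3 arithmetic above, the base R sketch below counts the distinct active nodes in each 10-minute interval and expresses the count as a percentage of the 86 nodes used in the pipeline. The toy data and object names are made up for illustration.

```r
# Toy data: valid readings (val_qual == 1) from three nodes over two intervals
valid <- data.frame(
  by10    = c('00:00', '00:00', '00:00', '00:10'),
  node_id = c('A', 'B', 'C', 'A'),
  stringsAsFactors = FALSE
)
n_nodes <- 86  # network size used in the pipeline above

# Number of distinct active nodes in each interval
active <- tapply(valid$node_id, valid$by10, function(x) length(unique(x)))

# Percentage of the network active per interval, then the daily average
propActive <- 100 * active / n_nodes
score3 <- mean(propActive)
score3
```

Note that, as in the pipeline, the average is taken only over intervals that contain at least one valid reading, since intervals with no active node do not appear after the filter.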

#Score 4
df%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> df4

chig<-readOGR('.', 'chigBound')
chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

df4a<-NULL
for(i in unique(df4$by10)){
  
  subset <- 
    df4%>%
    filter(by10==i)
  
  if(nrow(subset)>1){
    subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
    subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))
    #clip using chicago
    Ps2<-gIntersection(Ps2, chig, byid=FALSE)
    Ps2AreaProp<-gArea(Ps2)/chigArea
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-Ps2AreaProp
    df1<-as.data.frame(df1)
    df4a<-rbind(df4a, df1)
  }else{
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-0
    df1<-as.data.frame(df1)
    df4a<-rbind(df4a, df1)
  }
  
}

df4a%>%
  mutate(date=date(by10))%>%
  group_by(date)%>%
  mutate(`Score 4`= 100*mean(AreaProp))%>%
  select(date, `Score 4`)%>%
  unique()%>%
  as.data.frame()->score4
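
The loop above builds the bounding box of the active nodes for each interval, clips it with the Chicago boundary, and divides the clipped area by the city's area. The simplified sketch below shows the same idea in base R, using a rectangle as a stand-in for the city boundary. The function bbox_coverage, the rectangular study area, and the coordinates are all assumptions made for illustration, not part of the original workflow.

```r
# Area of overlap between the bounding box of a set of points and a
# rectangular study area, as a proportion of the study area.
bbox_coverage <- function(x, y, area_xmin, area_xmax, area_ymin, area_ymax) {
  if (length(x) < 2) return(0)  # a single node covers no area
  # bounding box of the points, clipped to the study area
  xmin <- max(min(x), area_xmin)
  xmax <- min(max(x), area_xmax)
  ymin <- max(min(y), area_ymin)
  ymax <- min(max(y), area_ymax)
  overlap <- max(0, xmax - xmin) * max(0, ymax - ymin)
  overlap / ((area_xmax - area_xmin) * (area_ymax - area_ymin))
}

# Three toy nodes inside a 10 x 10 study area
bbox_coverage(x = c(2, 6, 4), y = c(1, 5, 3), 0, 10, 0, 10)
```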



#Score 5
df%>%
  filter(val_qual==1)%>%
  select(date,node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))%>%
  select(date, `Score 5`)%>%
  unique()%>%
  as.data.frame()->score5
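
The Score 5 arithmetic above converts a count of active 10-minute intervals into hours and then into a percentage of the day. A minimal worked example with made-up counts for three hypothetical nodes:

```r
# Number of distinct 10-minute intervals in which each node reported
# at least one valid reading on a given day (toy values)
duration <- c(nodeA = 144, nodeB = 72, nodeC = 36)

durationActive <- duration * 10 / 60         # hours active
propActive     <- 100 * durationActive / 24  # percent of the day
score5 <- mean(propActive)

propActive
score5
```

A node that reports in all 144 intervals is active for 24 hours and scores 100, while a node that reports in 72 intervals scores 50.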

scoreDec<-merge(score1, score2, by='date')
scoreDec<-merge(scoreDec, score3, by='date')
scoreDec<-merge(scoreDec, score4, by='date')
scoreDec<-merge(scoreDec, score5, by='date')
colnames(scoreDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
scoreDec$Overall<-(scoreDec$`Score 1` + scoreDec$`Score 2`+ scoreDec$`Score 3` + scoreDec$`Score 4`+ scoreDec$`Score 5`)/5
saveRDS(scoreDec, 'temperatureScoresDecember.rds') #change name of file according to the data type

2. Compiling scores for different data types

Temp<-readRDS('temperatureScoresDecember.rds')
Temp$Data<-'Temperature'
Humidity<-readRDS('humidityScoresDecember.rds')
Humidity$Data<-'Humidity'
Pressure<-readRDS('pressureScoresDecember.rds')
Pressure$Data<-'Pressure'
PM25<-readRDS('pm25ScoresDecember.rds')
PM25$Data<-'PM25'
CO<-readRDS('coScoresDecember.rds')
CO$Data<-'CO'
H2S<-readRDS('h2sDecember.rds')
H2S$Data<-'H2S'
NO2<-readRDS('no2sDecember.rds')
NO2$Data<-'NO2'
O3<-readRDS('o3December.rds')
O3$Data<-'O3'
SO2<-readRDS('so2December.rds')
SO2$Data<-'SO2'

dfScoresAll<-rbind(Temp, Humidity, Pressure, PM25, CO, H2S, NO2, O3, SO2)
saveRDS(dfScoresAll, 'dfScoresAll.rds')
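
The read-and-label pattern above repeats for every data type, so it could also be expressed as a loop over a named vector of files. The sketch below is one possible base R version. The helper read_scores is hypothetical, and the two score files are toy files written to a temporary folder so that the example runs on its own.

```r
# Write two toy score files to a temporary folder so the sketch is runnable
tmp <- tempdir()
toy <- data.frame(date = as.Date('2018-12-01') + 0:1, Overall = c(55, 60))
saveRDS(toy, file.path(tmp, 'temperatureScoresDecember.rds'))
saveRDS(toy, file.path(tmp, 'humidityScoresDecember.rds'))

# Named vector: data-type label -> score file
scoreFiles <- c(
  Temperature = file.path(tmp, 'temperatureScoresDecember.rds'),
  Humidity    = file.path(tmp, 'humidityScoresDecember.rds')
)

# Hypothetical helper: read each file, tag it with its label, and stack
read_scores <- function(files) {
  parts <- lapply(names(files), function(label) {
    scores <- readRDS(files[[label]])
    scores$Data <- label
    scores
  })
  do.call(rbind, parts)
}

dfScoresToy <- read_scores(scoreFiles)
dfScoresToy
```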

For the sections below, we use the final compiled set of scores we obtained separately using the code blocks presented above in this section.

dfScoresAll<-readRDS('dfScoresAll.rds')

Below, we visualise how the scores for the different data types vary during the month of December. The calendar plots were produced with openair’s calendarPlot function, which was created specifically to plot variations in air pollutants in a calendar format. This is why the argument that names the variable to plot is called pollutant. In our case, since we are plotting the overall score for each day, pollutant is set to Overall, the variable that holds the overall score for the day.
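
The calendar plots in this section bin the overall score into five reliability categories using the breaks 0, 20, 40, 60, 80 and 100. The sketch below approximates that binning with base R's cut(), using made-up scores. How calendarPlot itself treats a score that falls exactly on a break is not covered here, so the right-closed intervals used by cut() should be read as an assumption.

```r
breaks <- c(0, 20, 40, 60, 80, 100)
labels <- c('Highly unreliable', 'Unreliable', 'Slightly unreliable',
            'Reliable', 'Highly reliable')

# Made-up overall scores, for illustration only
overall <- c(12, 38, 55, 61, 97)

# Assign each score to its reliability category
category <- cut(overall, breaks = breaks, labels = labels,
                include.lowest = TRUE)
data.frame(overall, category)
```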

Click on each tab to view how the scores vary for different days of the month for different data types.

Temperature

Based on the plot, it seems that temperature data collected by the AoT network is slightly unreliable for all the days of December 2018.

dfScoresAll%>%
  filter(Data=='Temperature')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='Temperature')

Looking at how the component score values vary throughout the month, the temporal reliability of temperature data is consistently high for the whole month. Spatial coverage is also reliable throughout the month. However, the overall score is visibly lowered by poor sensor value reliability (Score 1 and Score 2).

dfScoresAll%>%
  filter(Data=='Temperature')%>%
  select(-Overall, -Data)%>%
  gather(key='Score Component', value='value', -date)%>%
  ggplot()+
  geom_line(aes(x=date, y=value, col=`Score Component`), size=2, alpha=0.5)+
  geom_area(aes(x=date, y=value, fill=`Score Component`), col=NA,alpha=0.25)+
  geom_point(aes(x=date, y=value, col=`Score Component`), size=2)+
    facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  scale_color_brewer(palette = 'Dark2')+
  scale_fill_brewer(palette='Dark2')+
  labs(x='Date', y='Score value',
       title='Variations in component score values for December - Temperature')+
  theme_bw()

Humidity

Based on the plot, it seems that humidity data collected by the AoT network is slightly unreliable for all the days of December 2018.

dfScoresAll%>%
  filter(Data=='Humidity')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='Humidity')

Looking at how the component score values vary throughout the month, the temporal reliability of humidity data is consistently high for the whole month. However, the overall score is visibly lowered by poor sensor value reliability (Score 1 and Score 2) and poor spatial reliability (Score 3 and Score 4).

dfScoresAll%>%
  filter(Data=='Humidity')%>%
  select(-Overall, -Data)%>%
  gather(key='Score Component', value='value', -date)%>%
  ggplot()+
  geom_line(aes(x=date, y=value, col=`Score Component`), size=2, alpha=0.5)+
  geom_area(aes(x=date, y=value, fill=`Score Component`), col=NA,alpha=0.25)+
  geom_point(aes(x=date, y=value, col=`Score Component`), size=2)+
    facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  scale_color_brewer(palette = 'Dark2')+
  scale_fill_brewer(palette='Dark2')+
  labs(x='Date', y='Score value',
       title='Variations in component score values for December - Humidity')+
  theme_bw()

Pressure

Based on the plot, it seems that pressure data collected by the AoT network is reliable for most of the days of December 2018, with the exception of 4 days during the first 2 weeks when the data is slightly unreliable.

dfScoresAll%>%
  filter(Data=='Pressure')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='Pressure')

dfScoresAll%>%
  filter(Data=='Pressure')%>%
  select(-Overall, -Data)%>%
  gather(key='Score Component', value='value', -date)%>%
  ggplot()+
  geom_line(aes(x=date, y=value, col=`Score Component`), size=2, alpha=0.5)+
  geom_area(aes(x=date, y=value, fill=`Score Component`), col=NA,alpha=0.25)+
  geom_point(aes(x=date, y=value, col=`Score Component`), size=2)+
    facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  scale_color_brewer(palette = 'Dark2')+
  scale_fill_brewer(palette='Dark2')+
  labs(x='Date', y='Score value',
       title='Variations in component score values for December - Pressure')+
  theme_bw()

PM2.5 Concentration

Based on the plot, it seems that PM 2.5 Concentration data collected by the AoT network is unreliable or slightly unreliable for most of the days in December 2018. For 5 days of the month, the data is even worse and is classified as highly unreliable.

dfScoresAll%>%
  filter(Data=='PM25')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='PM 2.5')

Looking at how the component score values vary throughout the month, the spatial reliability of PM 2.5 concentration data is consistently low for the whole month. Sensor value reliability and temporal reliability vary more widely throughout the month.

dfScoresAll%>%
  filter(Data=='PM25')%>%
  select(-Overall, -Data)%>%
  gather(key='Score Component', value='value', -date)%>%
  ggplot()+
  geom_line(aes(x=date, y=value, col=`Score Component`), size=2, alpha=0.5)+
  geom_area(aes(x=date, y=value, fill=`Score Component`), col=NA,alpha=0.25)+
  geom_point(aes(x=date, y=value, col=`Score Component`), size=2)+
    facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  scale_color_brewer(palette = 'Dark2')+
  scale_fill_brewer(palette='Dark2')+
  labs(x='Date', y='Score value',
       title='Variations in component score values for December - PM2.5 Concentration')+
  theme_bw()

CO Concentration

Based on the plot, it seems that CO concentration data collected by the AoT network is slightly unreliable for most days of December 2018. However, there are 5 consecutive days of reliable data collected.

dfScoresAll%>%
  filter(Data=='CO')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='CO')

Looking at how the component score values vary throughout the month, sensor value, spatial, and temporal reliability all vary erratically across the whole month.

dfScoresAll%>%
  filter(Data=='CO')%>%
  select(-Overall, -Data)%>%
  gather(key='Score Component', value='value', -date)%>%
  ggplot()+
  geom_line(aes(x=date, y=value, col=`Score Component`), size=2, alpha=0.5)+
  geom_area(aes(x=date, y=value, fill=`Score Component`), col=NA,alpha=0.25)+
  geom_point(aes(x=date, y=value, col=`Score Component`), size=2)+
    facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  scale_color_brewer(palette = 'Dark2')+
  scale_fill_brewer(palette='Dark2')+
  labs(x='Date', y='Score value',
       title='Variations in component score values for December - CO Concentration')+
  theme_bw()

H2S Concentration

Based on the plot, it seems that H2S concentration data collected by the AoT network is slightly unreliable for most days of December 2018.

dfScoresAll%>%
  filter(Data=='H2S')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='H2S')

Looking at how the component score values vary throughout the month, the temporal reliability of H2S concentration data is consistently high for the whole month. However, the overall score is visibly lowered by poor spatial reliability (Score 3).

dfScoresAll%>%
  filter(Data=='H2S')%>%
  select(-Overall, -Data)%>%
  gather(key='Score Component', value='value', -date)%>%
  ggplot()+
  geom_line(aes(x=date, y=value, col=`Score Component`), size=2, alpha=0.5)+
  geom_area(aes(x=date, y=value, fill=`Score Component`), col=NA,alpha=0.25)+
  geom_point(aes(x=date, y=value, col=`Score Component`), size=2)+
    facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  scale_color_brewer(palette = 'Dark2')+
  scale_fill_brewer(palette='Dark2')+
  labs(x='Date', y='Score value',
       title='Variations in component score values for December - H2S Concentration')+
  theme_bw()

NO2 Concentration

Based on the plot, it seems that NO2 Concentration data collected by the AoT network is reliable for all the days of December 2018, except one.

dfScoresAll%>%
  filter(Data=='NO2')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='NO2')

Looking at how the component score values vary throughout the month, the temporal reliability of NO2 concentration data is consistently high for the whole month. However, the overall score is visibly lowered by poor spatial reliability (Score 3).

dfScoresAll%>%
  filter(Data=='NO2')%>%
  select(-Overall, -Data)%>%
  gather(key='Score Component', value='value', -date)%>%
  ggplot()+
  geom_line(aes(x=date, y=value, col=`Score Component`), size=2, alpha=0.5)+
  geom_area(aes(x=date, y=value, fill=`Score Component`), col=NA,alpha=0.25)+
  geom_point(aes(x=date, y=value, col=`Score Component`), size=2)+
    facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  scale_color_brewer(palette = 'Dark2')+
  scale_fill_brewer(palette='Dark2')+
  labs(x='Date', y='Score value',
       title='Variations in component score values for December - NO2 Concentration')+
  theme_bw()

O3 Concentration

Based on the plot, it seems that O3 Concentration data collected by the AoT network is reliable for all the days of December 2018.

dfScoresAll%>%
  filter(Data=='O3')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='O3')

Looking at how the component score values vary throughout the month, the temporal reliability of O3 concentration data is consistently high for the whole month. However, the overall score is visibly lowered by poor spatial reliability (Score 3).

dfScoresAll%>%
  filter(Data=='O3')%>%
  select(-Overall, -Data)%>%
  gather(key='Score Component', value='value', -date)%>%
  ggplot()+
  geom_line(aes(x=date, y=value, col=`Score Component`), size=2, alpha=0.5)+
  geom_area(aes(x=date, y=value, fill=`Score Component`), col=NA,alpha=0.25)+
  geom_point(aes(x=date, y=value, col=`Score Component`), size=2)+
    facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  scale_color_brewer(palette = 'Dark2')+
  scale_fill_brewer(palette='Dark2')+
  labs(x='Date', y='Score value',
       title='Variations in component score values for December - O3 Concentration')+
  theme_bw()

SO2 Concentration

Based on the plot, it seems that SO2 concentration data collected by the AoT network is slightly unreliable for all the days of December 2018.

dfScoresAll%>%
  filter(Data=='SO2')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='SO2')

Looking at how the component score values vary throughout the month, the temporal reliability of SO2 concentration data is consistently high for the whole month. However, the overall score is visibly lowered by poor sensor value reliability (Score 1 and Score 2) and poor spatial reliability (Score 3 and Score 4).

dfScoresAll%>%
  filter(Data=='SO2')%>%
  select(-Overall, -Data)%>%
  gather(key='Score Component', value='value', -date)%>%
  ggplot()+
  geom_line(aes(x=date, y=value, col=`Score Component`), size=2, alpha=0.5)+
  geom_area(aes(x=date, y=value, fill=`Score Component`), col=NA,alpha=0.25)+
  geom_point(aes(x=date, y=value, col=`Score Component`), size=2)+
    facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  scale_color_brewer(palette = 'Dark2')+
  scale_fill_brewer(palette='Dark2')+
  labs(x='Date', y='Score value',
       title='Variations in component score values for December - SO2 Concentration')+
  theme_bw()

4. Scoring Data Reliability After Imputation

To assess imputability, we compare the reliability scores of the data before and after we apply the imputation method described in Section 2.5.

The flow chart below illustrates the workflow we adopt for scoring data reliability for each day in this section - here, it can be observed that the imputation procedure is incorporated after data retrieval.

The codeblocks below contain the code required to carry out the workflow above.

1. Impute and update dataset with imputed values

#imputation

maindf<-dfSO2
maindf$value_hrf[maindf$val_qual==0]<-NA #convert invalid values to NA

maindf%>%
  as.data.frame()%>%
  select(by10, node_id, lat, lon, value_hrf)%>%
  group_by(by10, node_id, lat, lon)%>%
  mutate(temp=mean(value_hrf, na.rm = TRUE))%>%
  unique()%>%
  as.data.frame()->maindf #aggregate valid values for each 10-minute interval


maindf %>%
  mutate(time = ymd_hms(by10)) %>%
  select(-by10)->maindf


#create unique timestamps
timestamps <- unique(maindf$time) %>% as.data.frame() %>% rename('time' = '.')

#create empty data frame 
complete_temp_group <- c()

#fill in empty data frame
for (i in unique(maindf$node_id)) {
  #look up the node's coordinates in maindf, the data frame the row filter is built from
  longitude = maindf[maindf$node_id == i, ][1, 'lon'] %>% as.numeric()
  latitude = maindf[maindf$node_id == i, ][1, 'lat'] %>% as.numeric()
  
  sample_node <-
    maindf %>%
    filter(node_id == i) %>%
    right_join(., timestamps, by = 'time')%>%
    mutate(node_id = i,
           lon = longitude,
           lat = latitude)
  
  complete_temp_group <-
    rbind(complete_temp_group, sample_node)
}
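
The loop above gives every node a row for every timestamp, so that intervals without data appear as explicit NA values that can later be imputed. As an alternative illustration of the same idea, the base R sketch below builds the full node-by-timestamp grid with expand.grid() and left-joins the observations onto it. This is not the authors' code, and the toy data and object names are made up.

```r
# Toy observations: node B is missing the 00:10 interval
obs <- data.frame(
  node_id = c('A', 'A', 'B'),
  time    = as.POSIXct(c('2018-12-01 00:00:00', '2018-12-01 00:10:00',
                         '2018-12-01 00:00:00'), tz = 'UTC'),
  temp    = c(1.2, 1.4, 0.9),
  stringsAsFactors = FALSE
)

# Every node crossed with every observed timestamp
grid <- expand.grid(node_id = unique(obs$node_id),
                    time    = unique(obs$time),
                    stringsAsFactors = FALSE)

# Left join: node-time pairs with no observation get temp = NA
complete <- merge(grid, obs, by = c('node_id', 'time'), all.x = TRUE)
complete
```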

rm(timestamps)

#set up
all_temp_nodes <-
  complete_temp_group %>%
  group_by(node_id) %>%
  summarise(lat = first(lat),
            lon = first(lon)) %>%
  st_as_sf(coords = c('lon', 'lat'), crs = 4326, agr = 'constant')

all_temp_nodes_harn <-
  all_temp_nodes %>%
  st_transform(crs = 102641)

all_temp_nodes_harn_xy <-
  all_temp_nodes_harn %>%
  cbind(.,st_coordinates(all_temp_nodes_harn))  %>%
  st_set_geometry(NULL) %>%
  dplyr::select(X,Y) %>%
  as.matrix()


nn6 <-   
  get.knnx(all_temp_nodes_harn_xy, all_temp_nodes_harn_xy, 2)$nn.dist %>%
  as.data.frame() %>%
  rename(distance = V2) %>%
  select(distance)

buffer_dist <- max(nn6$distance) + 10 
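
The buffer distance above is the largest nearest-neighbour distance in the network plus a small margin. When a point set is queried against itself with k = 2, the first neighbour returned for each node is the node itself (distance 0) and the second is its nearest other node. Taking the maximum of those second distances should mean every node has at least one other node inside its buffer, assuming no two nodes share identical coordinates. The base R sketch below reproduces the calculation on toy coordinates with dist() in place of get.knnx. The coordinates are made up.

```r
# Toy projected coordinates (in the units of the CRS)
xy <- matrix(c(0, 0,
               3, 4,
               10, 0), ncol = 2, byrow = TRUE)

d <- as.matrix(dist(xy))     # pairwise Euclidean distances
diag(d) <- Inf               # ignore each node's distance to itself
nearest <- apply(d, 1, min)  # distance to each node's nearest neighbour

# Largest nearest-neighbour distance plus a margin of 10 units
buffer_dist <- max(nearest) + 10
buffer_dist
```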

# Know which nodes are inside buffer
all_temp_nodes_harn_buffer_intersect <-
  st_buffer(all_temp_nodes_harn, buffer_dist) %>%
  st_join(all_temp_nodes_harn, join = st_intersects) %>%
  st_set_geometry(NULL) %>%
  select(node_id.x, node_id.y) %>%
  rename(node_id = node_id.x, 
         inside_buffer = node_id.y) %>%
  filter(node_id != inside_buffer)

# Join temperature 10 minutes ago of nodes in buffer
df_buffer <-
  left_join(complete_temp_group, all_temp_nodes_harn_buffer_intersect, by = 'node_id') %>%
  left_join(complete_temp_group %>%
              mutate(time_lag = time + 600) %>%
              select(node_id, time_lag,temp),
            by = c('time' = 'time_lag',
                   'inside_buffer' = 'node_id')) %>%
  mutate(buffer_temp = temp.y,
         temp = temp.x) %>%
  select(node_id, inside_buffer, time, lon, lat, temp, buffer_temp)


df_buffer <-
  df_buffer %>%
  group_by(node_id, time) %>%
  summarise(lon = first(lon),
            lat = first(lat),
            temp = first(temp),
            avg_buffer_temp = mean(buffer_temp, na.rm = TRUE))
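
The joins above add 600 seconds to `time` before joining, which pairs every row with the reading taken 10 minutes earlier. A small base-R sketch on made-up readings illustrates this; the data frame `toy` and its values are hypothetical, and `merge()` stands in for `left_join()`.

```r
# Hypothetical readings for one node at 10-minute intervals
t0 <- as.POSIXct("2018-12-01 00:00:00", tz = "UTC")
toy <- data.frame(
  node_id = "A",
  time    = t0 + c(0, 600, 1200),
  temp    = c(1.0, 1.5, 2.5)
)

# Shift each reading 10 minutes forward so it lines up with the later row
lagged <- data.frame(
  node_id  = toy$node_id,
  time     = toy$time + 600,
  temp_10m = toy$temp
)

# Left join: every original row is kept; the first row has no predecessor
toy_lag <- merge(toy, lagged, by = c("node_id", "time"), all.x = TRUE)
toy_lag <- toy_lag[order(toy_lag$time), ]
print(toy_lag)
```

The first row gets `NA` for `temp_10m`, which is why the later averages use `na.rm = TRUE`.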



#identify 5 closest neighbours
# Get the node IDs of nearest 5
nn7 <-   
  get.knnx(all_temp_nodes_harn_xy, all_temp_nodes_harn_xy, 6)$nn.index %>%
  as.data.frame() %>%
  rename(N1 = V2, N2 = V3, N3 = V4, N4 = V5, N5 = V6) %>%
  left_join(all_temp_nodes_harn %>%
              mutate(index = as.numeric(row.names(.))), 
            by = c('V1' = 'index')) %>%
  select(node_id, V1, N1, N2, N3, N4, N5) 

nn8 <-
  left_join(nn7, nn7 %>%
              select(node_id, V1),
            by = c('N1' = 'V1')) %>%
  left_join(nn7 %>%
              select(node_id, V1),
            by = c('N2' = 'V1')) %>%
  left_join(nn7 %>%
              select(node_id, V1),
            by = c('N3' = 'V1')) %>%
  left_join(nn7 %>%
              select(node_id, V1),
            by = c('N4' = 'V1')) %>%
  left_join(nn7 %>%
              select(node_id, V1),
            by = c('N5' = 'V1')) %>%
  select(node_id.x, node_id.y, node_id.x.x, node_id.y.y, 
         node_id.x.x.x, node_id.y.y.y) %>%
  rename(node_id = node_id.x,
         nearest_1 = node_id.y,
         nearest_2 = node_id.x.x,
         nearest_3 = node_id.y.y,
         nearest_4 = node_id.x.x.x,
         nearest_5 = node_id.y.y.y)

# Get the average temperature 10 minutes ago of nearest five
dat_buffer_nearest5 <-
  left_join(df_buffer, nn8, by = 'node_id')%>%
  left_join(df_buffer %>%
              mutate(time_lag = time + 600) %>%
              select(node_id, time_lag, temp),
            by = c('time' = 'time_lag',
                   'nearest_1' = 'node_id')) %>%
  left_join(df_buffer %>%
              mutate(time_lag = time + 600) %>%
              select(node_id, time_lag, temp),
            by = c('time' = 'time_lag',
                   'nearest_2' = 'node_id')) %>%
  left_join(df_buffer %>%
              mutate(time_lag = time + 600) %>%
              select(node_id, time_lag, temp),
            by = c('time' = 'time_lag',
                   'nearest_3' = 'node_id')) %>%
  left_join(df_buffer %>%
              mutate(time_lag = time + 600) %>%
              select(node_id, time_lag, temp),
            by = c('time' = 'time_lag',
                   'nearest_4' = 'node_id')) %>%
  left_join(df_buffer %>%
              mutate(time_lag = time + 600) %>%
              select(node_id, time_lag, temp),
            by = c('time' = 'time_lag',
                   'nearest_5' = 'node_id')) %>%
  select(node_id, lon, lat, time, temp.x, avg_buffer_temp, temp.y, temp.x.x, 
         temp.y.y, temp.x.x.x, temp.y.y.y) %>%
  rename(temp = temp.x,
         nearest_1 = temp.y,
         nearest_2 = temp.x.x,
         nearest_3 = temp.y.y,
         nearest_4 = temp.x.x.x,
         nearest_5 = temp.y.y.y)

dat_buffer_nearest5 <-
  dat_buffer_nearest5 %>%
  gather(nearest, value, nearest_1:nearest_5) %>%
  group_by(node_id, time) %>%
  summarise(lon = first(lon),
            lat = first(lat), 
            temp = first(temp),
            avg_buffer_temp = first(avg_buffer_temp),
            avg_nearby_temp = mean(value, na.rm = TRUE))

# Get each node's own temperature from 10 minutes earlier
dat_whole <-
  left_join(dat_buffer_nearest5,
            dat_buffer_nearest5 %>%
              mutate(time_lag = time + 600) %>%
              select(node_id, time_lag, temp),
            by = c('time' = 'time_lag',
                   'node_id' = 'node_id')) %>%
  rename(temp = temp.x,
         temp_10m = temp.y)

rm(dat_buffer_nearest5, nn6, nn7, nn8, all_temp_nodes_harn_buffer_intersect, all_temp_nodes_harn_xy, all_temp_nodes_harn)

#fit models
dat_timemodel <- dat_whole[!(is.na(dat_whole$temp) | is.na(dat_whole$temp_10m)), ]
dat_nbormodel <- dat_whole[!(is.na(dat_whole$temp) | is.na(dat_whole$avg_nearby_temp)), ]

# Time model
mod7 <- lm(temp ~ temp_10m, data = dat_timemodel)

# Nearest 5 model
mod9 <- lm(temp ~ avg_nearby_temp, data = dat_nbormodel)


#metrics
mod7PredValues <-
  data.frame(node_id = dat_whole$node_id,
             lon = dat_whole$lon,
             lat = dat_whole$lat,
             time = dat_whole$time,
             observed = dat_whole$temp,
             predicted = predict(mod7, dat_whole)) 
mod9PredValues <-
  data.frame(node_id = dat_whole$node_id,
             lon = dat_whole$lon,
             lat = dat_whole$lat,
             time = dat_whole$time,
             observed = dat_whole$temp,
             predicted = predict(mod9, dat_whole))
#predict
## Create a complete dataset which includes all imputed values
# Prefer the time-model prediction (mod7); where it is NA, fall back to the
# nearest-5 model prediction (mod9). The result is NA only when both are NA.
predicted_set <- ifelse(!is.na(mod7PredValues$predicted),
                        mod7PredValues$predicted,
                        mod9PredValues$predicted)

complete_set <-
  data.frame(node_id = dat_whole$node_id,
             time = dat_whole$time,
             lon = dat_whole$lon,
             lat = dat_whole$lat,
             observed = dat_whole$temp,
             predicted = predicted_set)


#sub in the values
complete_set$final<-ifelse(is.na(complete_set$observed), 
                           complete_set$predicted, 
                           complete_set$observed)

#compute valqual
complete_set$val_qual<-ifelse(is.na(complete_set$final), 0,1)
complete_set$date<-date(complete_set$time)
complete_set$by10<-cut(complete_set$time, breaks='10 min')
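
The imputation rule used in this step can be illustrated on a few made-up values: take the time-model prediction where it exists, otherwise the neighbour-model prediction, and substitute a prediction only where the observation itself is missing. The vectors below are hypothetical stand-ins for `dat_whole$temp`, `mod7PredValues$predicted`, and `mod9PredValues$predicted`.

```r
observed  <- c(2.0, NA,  NA,  NA)   # sensor readings (NA = missing)
pred_time <- c(2.1, 1.8, NA,  NA)   # stand-in for the time model
pred_nbor <- c(1.9, 1.7, 1.6, NA)   # stand-in for the nearest-5 model

# Prefer the time-model prediction; fall back to the neighbour prediction
predicted <- ifelse(!is.na(pred_time), pred_time, pred_nbor)

# Keep real observations; impute only where the observation is missing
final <- ifelse(is.na(observed), predicted, observed)

# 1 = a value is available after imputation, 0 = still missing
val_qual <- ifelse(is.na(final), 0, 1)

print(data.frame(observed, predicted, final, val_qual))
```

Only the last row, where neither model can predict, remains missing and receives `val_qual` of 0.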

2. Apply scoring procedures on updated dataset

#define parameters


df<-complete_set

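#score 1: per-node share of valid readings in each 10-minute window, averaged over the 144 windows in a day, then over nodes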
df%>%
  group_by(date, node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(date, node_id, by10, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, node_id)%>%
  mutate(NodeMeanX = sum(X)/144)%>%
  select(date, node_id, NodeMeanX)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date)%>%
  mutate(`Score 1`=mean(NodeMeanX))%>%
  select(date,`Score 1` )%>%
  unique()%>%
  as.data.frame()->temp1
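
As a worked example of the Score 1 arithmetic, the base-R sketch below uses made-up per-window validity percentages for two nodes on one day. The data frame `x` is hypothetical; 144 is the number of 10-minute windows in a day, as in the code above.

```r
windows_per_day <- 144  # 24 hours x 6 ten-minute windows

# Hypothetical per-window validity percentages (X) for two nodes on one day:
# node A has rows for all 144 windows, node B for only 72
x <- data.frame(
  node_id = c(rep("A", 144), rep("B", 72)),
  X       = 100
)

# Dividing by 144 rather than by the number of rows means that
# node B's 72 absent windows count as zero
node_mean <- tapply(x$X, x$node_id, sum) / windows_per_day

# Score 1 for the day is the mean across nodes
score_1 <- mean(node_mean)
print(node_mean)
print(score_1)
```

Node A scores 100, node B scores 50, and the daily Score 1 is 75.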


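#score 2: consistency of each node's per-window validity (SD relative to the mean, with the SD capped at the mean), averaged over nodes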
df%>%
  group_by(date, node_id, by10)%>%
  mutate(X = 100*sum(val_qual)/n())%>%
  select(date, node_id, by10, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, node_id)%>%
  mutate(NodeMeanX = sum(X)/144)%>%
  select(date, node_id, by10, NodeMeanX, X)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, node_id)%>%
  mutate(NodeSD=ifelse(abs(sd(X))>NodeMeanX, 
                       NodeMeanX,
                       abs(sd(X))))%>%
  group_by(date, node_id)%>%
  mutate(NodeSDScore= ifelse(NodeMeanX==0, 
                             0, 
                             ifelse(NodeMeanX<50, 
                                    abs(100-abs(100-100*(NodeSD/NodeMeanX))),
                                    abs(100-100*(NodeSD/NodeMeanX)))))%>%
  select(date, node_id, NodeSDScore)%>%
  na.omit()%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date)%>%
  mutate(`Score 2`= mean(NodeSDScore))%>%
  select(date,`Score 2`)%>%
  unique()%>%
  as.data.frame()->temp2
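
Because the standard deviation is capped at the mean, the ratio `NodeSD/NodeMeanX` lies between 0 and 1, and the nested `ifelse()` above reduces to two simple cases. The function below is a sketch of that reduction, not the report's own code; the name `node_sd_score` is hypothetical.

```r
# X: per-window validity percentages for one node on one day
node_sd_score <- function(X, windows_per_day = 144) {
  node_mean <- sum(X) / windows_per_day
  if (node_mean == 0) return(0)

  node_sd <- min(sd(X), node_mean)      # cap the SD at the mean
  cv_pct  <- 100 * node_sd / node_mean  # relative spread, between 0 and 100

  # Below a mean of 50 the formula returns the relative spread itself;
  # otherwise it returns 100 minus the relative spread
  if (node_mean < 50) cv_pct else 100 - cv_pct
}

print(node_sd_score(rep(100, 144)))                 # fully valid all day
print(node_sd_score(c(rep(100, 72), rep(0, 72))))   # valid for half the windows
print(node_sd_score(rep(0, 144)))                   # never valid
```

A node that is valid all day has no spread and scores 100. A node that alternates between fully valid and fully invalid windows has an SD above its mean of 50, so the cap applies and it scores 0.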

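#Score 3: percentage of nodes (out of the 86 hard-coded in the denominator) with a valid value in each 10-minute window, averaged per day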

df%>%
  filter(val_qual==1)%>%
  select(date, node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, by10)%>%
  summarise(count=n())%>%
  mutate(propActive=(count*100)/86)%>%
  mutate(`Score 3`=mean(propActive))%>%
  select(date, `Score 3`)%>%
  unique()%>%
  as.data.frame()->temp3
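
The Score 3 calculation can be traced on a toy example. The window labels and node IDs below are made up; 86 is the denominator used in the code above.

```r
total_nodes <- 86  # denominator used in the scoring code

# Hypothetical (window, node) pairs that have at least one valid value
active <- unique(data.frame(
  by10    = c("00:00", "00:00", "00:10"),
  node_id = c("A", "B", "A")
))

# Percentage of the network with a valid value in each window
prop_active <- 100 * as.numeric(table(active$by10)) / total_nodes

# Windows in which no node has a valid value do not appear in 'active',
# so they are not counted as zeros in this mean
score_3 <- mean(prop_active)
print(prop_active)
print(score_3)
```

Rows are filtered to valid values before counting, so windows with no valid values drop out of the average rather than lowering it.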

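#Score 4: share of the city's area covered by the bounding box of nodes with valid values in each 10-minute window, averaged per day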
df%>%
  filter(val_qual==1)%>%
  select(by10, node_id, lat, lon)%>%
  mutate(lat=as.numeric(lat), 
         lon=as.numeric(lon))%>%
  unique()%>%
  as.data.frame()-> df4

chig<-readOGR('.', 'chigBound')
chig<-spTransform(chig, CRS('+init=EPSG:3435'))
chigArea<-gArea(chig)

df4a<-NULL
for(i in unique(df4$by10)){
  
  subset <- 
    df4%>%
    filter(by10==i)
  
  if(nrow(subset)>1){
    subset_nodes_sp <- SpatialPointsDataFrame(subset[, c('lon','lat')], subset, proj4string = CRS('+init=EPSG:4326'))
    subset_nodes_trnsfrmd <- spTransform(subset_nodes_sp, CRS('+init=EPSG:3435'))
    P2 <- matrix(c(subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,2],
                   subset_nodes_trnsfrmd@bbox[1,2], subset_nodes_trnsfrmd@bbox[2,1],
                   subset_nodes_trnsfrmd@bbox[1,1], subset_nodes_trnsfrmd@bbox[2,1]),
                 ncol = 2, byrow = TRUE) %>%
      Polygon()
    Ps2 <- SpatialPolygons(list(Polygons(list(P2), ID = "a")), proj4string=CRS('+init=EPSG:3435'))
    #clip using chicago
    Ps2<-gIntersection(Ps2, chig, byid=FALSE)
    Ps2AreaProp<-gArea(Ps2)/chigArea
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-Ps2AreaProp
    df1<-as.data.frame(df1)
    df4a<-rbind(df4a, df1)
  }else{
    df1<-NULL
    df1$by10<-i
    df1$AreaProp<-0
    df1<-as.data.frame(df1)
    df4a<-rbind(df4a, df1)
  }
  
}

df4a%>%
  mutate(date=date(by10))%>%
  group_by(date)%>%
  mutate(`Score 4`= 100*mean(AreaProp))%>%
  select(date, `Score 4`)%>%
  unique()%>%
  as.data.frame()->temp4
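
The Score 4 geometry amounts to the area of the active nodes' bounding box, clipped to the city, divided by the city's area. The base-R sketch below uses a rectangular "city" so that the clipping needs no spatial packages. All coordinates are hypothetical, and the rectangle stands in for the Chicago boundary used above.

```r
# Hypothetical projected coordinates (feet) of nodes active in one window
nodes <- data.frame(x = c(1000, 4000, 2500),
                    y = c(2000, 2500, 6000))

# Hypothetical rectangular city with corners (0, 0) and (10000, 8000)
city <- c(xmin = 0, ymin = 0, xmax = 10000, ymax = 8000)

# Bounding box of the active nodes, clipped to the city rectangle
bb <- c(xmin = max(min(nodes$x), city[["xmin"]]),
        ymin = max(min(nodes$y), city[["ymin"]]),
        xmax = min(max(nodes$x), city[["xmax"]]),
        ymax = min(max(nodes$y), city[["ymax"]]))

bbox_area <- (bb[["xmax"]] - bb[["xmin"]]) * (bb[["ymax"]] - bb[["ymin"]])
city_area <- (city[["xmax"]] - city[["xmin"]]) * (city[["ymax"]] - city[["ymin"]])

# Proportion of the city covered in this window
area_prop <- bbox_area / city_area
print(area_prop)
```

Here the box spans 3000 by 4000 feet inside a city of 10000 by 8000 feet, a proportion of 0.15.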

#Score 5: percentage of the 24-hour day in which each node has valid values, averaged over nodes
df%>%
  filter(val_qual==1)%>%
  select(date,node_id, by10)%>%
  unique()%>%
  as.data.frame()%>%
  group_by(date, node_id)%>%
  summarise(duration=n())%>%
  mutate(durationActive = (duration*10)/60)%>%
  mutate(propActive=100*durationActive/24)%>%
  mutate(`Score 5` = mean(propActive))%>%
  select(date, `Score 5`)%>%
  unique()%>%
  as.data.frame()->temp5
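
The unit conversions behind Score 5 are easy to check by hand. The sketch below uses made-up counts of active 10-minute windows for two nodes; the vector `windows_active` is hypothetical.

```r
# Hypothetical number of 10-minute windows with valid values, per node, on one day
windows_active <- c(A = 144, B = 36)

# Windows to hours: each window lasts 10 minutes
hours_active <- windows_active * 10 / 60

# Hours to percentage of the 24-hour day
prop_active <- 100 * hours_active / 24

# Score 5 for the day is the mean across nodes
score_5 <- mean(prop_active)
print(prop_active)
print(score_5)
```

Node A is active for the full 24 hours (100) and node B for 6 hours (25), giving a daily Score 5 of 62.5.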


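#temperature
# Note: each block below assumes temp1 to temp5 were just recomputed for that data type,
# i.e. the imputation and scoring code above was re-run once per data type.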
tempDec<-merge(temp1, temp2, by='date')
tempDec<-merge(tempDec, temp3, by='date')
tempDec<-merge(tempDec, temp4, by='date')
tempDec<-merge(tempDec, temp5, by='date')
colnames(tempDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
tempDec$Overall<-(tempDec$`Score 1` + tempDec$`Score 2`+ tempDec$`Score 3` + tempDec$`Score 4`+ tempDec$`Score 5`)/5
saveRDS(tempDec, 'imputedtemperatureScoresDecember.rds')

#humidity
tempDec<-merge(temp1, temp2, by='date')
tempDec<-merge(tempDec, temp3, by='date')
tempDec<-merge(tempDec, temp4, by='date')
tempDec<-merge(tempDec, temp5, by='date')
colnames(tempDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
tempDec$Overall<-(tempDec$`Score 1` + tempDec$`Score 2`+ tempDec$`Score 3` + tempDec$`Score 4`+ tempDec$`Score 5`)/5
saveRDS(tempDec, 'imputedhumidityScoresDecember.rds')

#pressure
tempDec<-merge(temp1, temp2, by='date')
tempDec<-merge(tempDec, temp3, by='date')
tempDec<-merge(tempDec, temp4, by='date')
tempDec<-merge(tempDec, temp5, by='date')
colnames(tempDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
tempDec$Overall<-(tempDec$`Score 1` + tempDec$`Score 2`+ tempDec$`Score 3` + tempDec$`Score 4`+ tempDec$`Score 5`)/5
saveRDS(tempDec, 'imputedpressureScoresDecember.rds')

#pm25
tempDec<-merge(temp1, temp2, by='date')
tempDec<-merge(tempDec, temp3, by='date')
tempDec<-merge(tempDec, temp4, by='date')
tempDec<-merge(tempDec, temp5, by='date')
colnames(tempDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
tempDec$Overall<-(tempDec$`Score 1` + tempDec$`Score 2`+ tempDec$`Score 3` + tempDec$`Score 4`+ tempDec$`Score 5`)/5
saveRDS(tempDec, 'imputedpm25ScoresDecember.rds')

#CO
tempDec<-merge(temp1, temp2, by='date')
tempDec<-merge(tempDec, temp3, by='date')
tempDec<-merge(tempDec, temp4, by='date')
tempDec<-merge(tempDec, temp5, by='date')
colnames(tempDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
tempDec$Overall<-(tempDec$`Score 1` + tempDec$`Score 2`+ tempDec$`Score 3` + tempDec$`Score 4`+ tempDec$`Score 5`)/5
saveRDS(tempDec, 'imputedcoDecember.rds')

#h2s
tempDec<-merge(temp1, temp2, by='date')
tempDec<-merge(tempDec, temp3, by='date')
tempDec<-merge(tempDec, temp4, by='date')
tempDec<-merge(tempDec, temp5, by='date')
colnames(tempDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
tempDec$Overall<-(tempDec$`Score 1` + tempDec$`Score 2`+ tempDec$`Score 3` + tempDec$`Score 4`+ tempDec$`Score 5`)/5
saveRDS(tempDec, 'imputedh2sDecember.rds')

#no2
tempDec<-merge(temp1, temp2, by='date')
tempDec<-merge(tempDec, temp3, by='date')
tempDec<-merge(tempDec, temp4, by='date')
tempDec<-merge(tempDec, temp5, by='date')
colnames(tempDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
tempDec$Overall<-(tempDec$`Score 1` + tempDec$`Score 2`+ tempDec$`Score 3` + tempDec$`Score 4`+ tempDec$`Score 5`)/5
saveRDS(tempDec, 'imputedno2December.rds')

#o3
tempDec<-merge(temp1, temp2, by='date')
tempDec<-merge(tempDec, temp3, by='date')
tempDec<-merge(tempDec, temp4, by='date')
tempDec<-merge(tempDec, temp5, by='date')
colnames(tempDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
tempDec$Overall<-(tempDec$`Score 1` + tempDec$`Score 2`+ tempDec$`Score 3` + tempDec$`Score 4`+ tempDec$`Score 5`)/5
saveRDS(tempDec, 'imputedo3December.rds')

#so2
tempDec<-merge(temp1, temp2, by='date')
tempDec<-merge(tempDec, temp3, by='date')
tempDec<-merge(tempDec, temp4, by='date')
tempDec<-merge(tempDec, temp5, by='date')
colnames(tempDec)<-c('date', 'Score 1', 'Score 2', 'Score 3', 'Score 4', 'Score 5')
tempDec$Overall<-(tempDec$`Score 1` + tempDec$`Score 2`+ tempDec$`Score 3` + tempDec$`Score 4`+ tempDec$`Score 5`)/5
saveRDS(tempDec, 'imputedso2December.rds')
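
The nine blocks above differ only in the output file name. The helper below is a sketch of how the shared steps could be wrapped once; the name `combine_scores` and the toy inputs are hypothetical and are not part of the original workflow.

```r
# Merge a list of per-date score tables and add the unweighted mean as 'Overall'
combine_scores <- function(score_tables) {
  out <- Reduce(function(a, b) merge(a, b, by = "date"), score_tables)
  score_cols <- setdiff(names(out), "date")
  out$Overall <- rowMeans(out[, score_cols, drop = FALSE])
  out
}

# Toy inputs standing in for temp1 ... temp5
d <- as.Date("2018-12-01") + 0:1
toy_scores <- lapply(1:5, function(k) {
  tab <- data.frame(date = d, s = c(10 * k, 20 * k))
  names(tab)[2] <- paste("Score", k)
  tab
})

toy_overall <- combine_scores(toy_scores)
print(toy_overall)
```

With such a helper, each data type would need a single call, for example `saveRDS(combine_scores(list(temp1, temp2, temp3, temp4, temp5)), 'imputedtemperatureScoresDecember.rds')`, run after the scores have been recomputed for that data type.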


Temp<-readRDS('imputedtemperatureScoresDecember.rds')
Temp$Data<-'Temperature'
Humidity<-readRDS('imputedhumidityScoresDecember.rds')
Humidity$Data<-'Humidity'
Pressure<-readRDS('imputedpressureScoresDecember.rds')
Pressure$Data<-'Pressure'
PM25<-readRDS('imputedpm25ScoresDecember.rds')
PM25$Data<-'PM25'
CO<-readRDS('imputedcoDecember.rds')
CO$Data<-'CO'
H2S<-readRDS('imputedh2sDecember.rds')
H2S$Data<-'H2S'
NO2<-readRDS('imputedno2December.rds')
NO2$Data<-'NO2'
O3<-readRDS('imputedo3December.rds')
O3$Data<-'O3'
SO2<-readRDS('imputedso2December.rds')
SO2$Data<-'SO2'

# Use a separate name so the pre-imputation scores held in dfScoresAll are not overwritten
dfScoresImputed<-rbind(Temp, Humidity, Pressure, PM25, CO, H2S, NO2, O3, SO2)
saveRDS(dfScoresImputed, 'imputeddfScoresAll.rds')

For the sections below, we use the final compiled set of imputed scores, which we obtained separately using the code blocks presented above in this section.

dfScoresImputed<-readRDS('imputeddfScoresAll.rds')
dfScoresImputed$Type<-'After imputation'

dfScoresAll$Type<-'Before imputation'
dfScoresImputed%>%
  bind_rows(dfScoresAll)%>%
  select(-Overall)%>%
  gather(key='Score Component', value='value', -date, -Data, -Type)%>%
  spread(key=Type, value=value)%>%
  mutate(segment=findInterval(`Before imputation`, c(!is.na(`After imputation`))))%>%
  mutate(change=ifelse(`After imputation`-`Before imputation`>0, 
                       'Improved', 
                       ifelse(`After imputation`-`Before imputation`==0, 
                              'No change','Worsened')))->dfScoresImputedAll
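
The `change` label depends only on the sign of the difference between the two scores. A small sketch on made-up values shows the three outcomes and what happens when one side is missing; the vectors `before` and `after` are hypothetical.

```r
before <- c(40, 55, 70, NA)   # hypothetical scores before imputation
after  <- c(60, 55, 65, 80)   # hypothetical scores after imputation

delta <- after - before

# Same nested rule as above: positive, zero, or negative difference
change <- ifelse(delta > 0, "Improved",
                 ifelse(delta == 0, "No change", "Worsened"))
print(change)
```

A date with a missing score on either side gets `NA` rather than one of the three labels.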

Below, we visualise how the scores for different data types vary during the month of December after imputation.

Click on each tab to view how the scores vary for different days of the month for different data types, and a discussion of how they compare to the scores obtained before imputation.

Temperature

Based on the plot, the imputed temperature data is reliable for all the days of December 2018. This is a marked improvement over the scores obtained before imputation, when the data was slightly unreliable on every day.

dfScoresImputed%>%
  filter(Data=='Temperature')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='Temperature')

The plots below show how imputation affected each reliability metric to produce this overall improvement. Imputation substantially improved the sensor value reliability scores (Score 1 and Score 2), but slightly worsened the other scores.

ggplot(data=subset(dfScoresImputedAll, Data=='Temperature'), 
       aes(x=date, ymin=`Before imputation`, ymax=`After imputation`))+
  geom_ribbon(aes(fill=factor(change)), alpha=0.25)+
  scale_fill_manual('Imputation effect',
                    values=c('Improved'='cornflowerblue', 'No change'='grey90', 'Worsened'='indianred1'))+
  geom_path(aes(y=`Before imputation`), colour='red', size=2, alpha=0.5)+
  geom_path(aes(y=`After imputation`), colour='blue', size=2, alpha=0.5)+
  geom_point(aes(y=`Before imputation`), colour='red', size=2)+
  geom_point(aes(y=`After imputation`), colour='blue', size=2)+
  facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  labs(x='Date', y='Score value',
       title='Effect of imputation on data reliability component - Temperature')+
  theme_bw()

Humidity

Based on the plot, the imputed humidity data is still slightly unreliable for all the days of December 2018. However, reliability improved on 3 separate days of the month. This is a slight improvement over the scores obtained before imputation, when the data was slightly unreliable on every day.

dfScoresImputed%>%
  filter(Data=='Humidity')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='Humidity')

The plots below show how imputation affected each reliability metric to produce this overall improvement. It slightly improved sensor value reliability (Score 1 and Score 2) and spatial reliability (Score 3 and Score 4), but slightly worsened temporal reliability (Score 5).

ggplot(data=subset(dfScoresImputedAll, Data=='Humidity'), 
       aes(x=date, ymin=`Before imputation`, ymax=`After imputation`))+
  geom_ribbon(aes(fill=factor(change)), alpha=0.25)+
  scale_fill_manual('Imputation effect',
                    values=c('Improved'='cornflowerblue', 'No change'='grey90', 'Worsened'='indianred1'))+
  geom_path(aes(y=`Before imputation`), colour='red', size=2, alpha=0.5)+
  geom_path(aes(y=`After imputation`), colour='blue', size=2, alpha=0.5)+
  geom_point(aes(y=`Before imputation`), colour='red', size=2)+
  geom_point(aes(y=`After imputation`), colour='blue', size=2)+
  facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  labs(x='Date', y='Score value',
       title='Effect of imputation on data reliability component - Humidity')+
  theme_bw()

Pressure

Based on the plot, the imputed pressure data is reliable for all the days of December 2018. This is a slight improvement over the scores obtained before imputation, when the data was slightly unreliable on 3 of the days.

dfScoresImputed%>%
  filter(Data=='Pressure')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='Pressure')

The plots below show how imputation affected each reliability metric. It slightly improved sensor value reliability (Score 1 and Score 2) and spatial reliability (Score 3).

ggplot(data=subset(dfScoresImputedAll, Data=='Pressure'), 
       aes(x=date, ymin=`Before imputation`, ymax=`After imputation`))+
  geom_ribbon(aes(fill=factor(change)), alpha=0.25)+
  scale_fill_manual('Imputation effect',
                    values=c('Improved'='cornflowerblue', 'No change'='grey90', 'Worsened'='indianred1'))+
  geom_path(aes(y=`Before imputation`), colour='red', size=2, alpha=0.5)+
  geom_path(aes(y=`After imputation`), colour='blue', size=2, alpha=0.5)+
  geom_point(aes(y=`Before imputation`), colour='red', size=2)+
  geom_point(aes(y=`After imputation`), colour='blue', size=2)+
  facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  labs(x='Date', y='Score value',
       title='Effect of imputation on data reliability component - Pressure')+
  theme_bw()

PM 2.5 Concentration

Based on the plot, the imputed PM 2.5 concentration data is unreliable or highly unreliable for all the days of December 2018. Reliability therefore worsened relative to the scores obtained before imputation, when the data was no worse than slightly unreliable on 8 of the days.

dfScoresImputed%>%
  filter(Data=='PM25')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='PM 2.5')

The plots below show how imputation affected each reliability metric to produce this overall worsening of reliability. It substantially worsened sensor value reliability (Score 1 and Score 2) in the second half of the month.

ggplot(data=subset(dfScoresImputedAll, Data=='PM25'), 
       aes(x=date, ymin=`Before imputation`, ymax=`After imputation`))+
  geom_ribbon(aes(fill=factor(change)), alpha=0.25)+
  scale_fill_manual('Imputation effect',
                    values=c('Improved'='cornflowerblue', 'No change'='grey90', 'Worsened'='indianred1'))+
  geom_path(aes(y=`Before imputation`), colour='red', size=2, alpha=0.5)+
  geom_path(aes(y=`After imputation`), colour='blue', size=2, alpha=0.5)+
  geom_point(aes(y=`Before imputation`), colour='red', size=2)+
  geom_point(aes(y=`After imputation`), colour='blue', size=2)+
  facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  labs(x='Date', y='Score value',
       title='Effect of imputation on data reliability component - PM 2.5 Concentration')+
  theme_bw()

CO Concentration

Based on the plot, the imputed CO concentration data is reliable for half of the days of December 2018. This is a marked improvement over the scores obtained before imputation, when the data was slightly unreliable on most of the days. However, reliability worsened on one of the days after imputation.

dfScoresImputed%>%
  filter(Data=='CO')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='CO')

The plots below show how imputation affected each reliability metric. It slightly improved sensor value reliability (Score 1), while its effect on Score 2 varies with the day of the month. Temporal reliability (Score 5) slightly worsened.

ggplot(data=subset(dfScoresImputedAll, Data=='CO'), 
       aes(x=date, ymin=`Before imputation`, ymax=`After imputation`))+
  geom_ribbon(aes(fill=factor(change)), alpha=0.25)+
  scale_fill_manual('Imputation effect',
                    values=c('Improved'='cornflowerblue', 'No change'='grey90', 'Worsened'='indianred1'))+
  geom_path(aes(y=`Before imputation`), colour='red', size=2, alpha=0.5)+
  geom_path(aes(y=`After imputation`), colour='blue', size=2, alpha=0.5)+
  geom_point(aes(y=`Before imputation`), colour='red', size=2)+
  geom_point(aes(y=`After imputation`), colour='blue', size=2)+
  facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  labs(x='Date', y='Score value',
       title='Effect of imputation on data reliability component - CO Concentration')+
  theme_bw()

H2S Concentration

Based on the plot, the imputed H2S concentration data is reliable for most days of December 2018. This is a marked improvement over the scores obtained before imputation, when the data was slightly unreliable on all but three days.

dfScoresImputed%>%
  filter(Data=='H2S')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='H2S')

The plots below show how imputation affected each reliability metric. It slightly improved sensor value reliability (Score 1 and Score 2).

ggplot(data=subset(dfScoresImputedAll, Data=='H2S'), 
       aes(x=date, ymin=`Before imputation`, ymax=`After imputation`))+
  geom_ribbon(aes(fill=factor(change)), alpha=0.25)+
  scale_fill_manual('Imputation effect',
                    values=c('Improved'='cornflowerblue', 'No change'='grey90', 'Worsened'='indianred1'))+
  geom_path(aes(y=`Before imputation`), colour='red', size=2, alpha=0.5)+
  geom_path(aes(y=`After imputation`), colour='blue', size=2, alpha=0.5)+
  geom_point(aes(y=`Before imputation`), colour='red', size=2)+
  geom_point(aes(y=`After imputation`), colour='blue', size=2)+
  facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  labs(x='Date', y='Score value',
       title='Effect of imputation on data reliability component - H2S Concentration')+
  theme_bw()

NO2 Concentration

Based on the plot, the imputed NO2 concentration data is still reliable for most days of December 2018.

dfScoresImputed%>%
  filter(Data=='NO2')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='NO2')

The plots below show how imputation affected each reliability metric. It slightly worsened sensor value reliability (Score 2).

ggplot(data=subset(dfScoresImputedAll, Data=='NO2'), 
       aes(x=date, ymin=`Before imputation`, ymax=`After imputation`))+
  geom_ribbon(aes(fill=factor(change)), alpha=0.25)+
  scale_fill_manual('Imputation effect',
                    values=c('Improved'='cornflowerblue', 'No change'='grey90', 'Worsened'='indianred1'))+
  geom_path(aes(y=`Before imputation`), colour='red', size=2, alpha=0.5)+
  geom_path(aes(y=`After imputation`), colour='blue', size=2, alpha=0.5)+
  geom_point(aes(y=`Before imputation`), colour='red', size=2)+
  geom_point(aes(y=`After imputation`), colour='blue', size=2)+
  facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  labs(x='Date', y='Score value',
       title='Effect of imputation on data reliability component - NO2 Concentration')+
  theme_bw()

O3 Concentration

Based on the plot, the imputed O3 concentration data is still reliable for all the days of December 2018.

dfScoresImputed%>%
  filter(Data=='O3')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='O3')

The plots below show how imputation affected each reliability metric. It slightly worsened sensor value reliability (Score 1 and Score 2).

ggplot(data=subset(dfScoresImputedAll, Data=='O3'), 
       aes(x=date, ymin=`Before imputation`, ymax=`After imputation`))+
  geom_ribbon(aes(fill=factor(change)), alpha=0.25)+
  scale_fill_manual('Imputation effect',
                    values=c('Improved'='cornflowerblue', 'No change'='grey90', 'Worsened'='indianred1'))+
  geom_path(aes(y=`Before imputation`), colour='red', size=2, alpha=0.5)+
  geom_path(aes(y=`After imputation`), colour='blue', size=2, alpha=0.5)+
  geom_point(aes(y=`Before imputation`), colour='red', size=2)+
  geom_point(aes(y=`After imputation`), colour='blue', size=2)+
  facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  labs(x='Date', y='Score value',
       title='Effect of imputation on data reliability component - O3 Concentration')+
  theme_bw()

SO2 Concentration

Based on the plot, the imputed SO2 concentration data is now reliable for half of the days of December 2018. This is an improvement over the scores obtained before imputation, when the data was slightly unreliable on every day.

dfScoresImputed%>%
  filter(Data=='SO2')%>%
  select(date, Overall)%>%
  unique()%>%
  as.data.frame()%>%
  calendarPlot(pollutant = 'Overall', year=2018, month=12,
             annotate = 'date', cols='RdYlBu', limits=c(0, 100), 
             labels=c('Highly unreliable', 'Unreliable', 'Slightly unreliable','Reliable', 'Highly reliable'),
             breaks = c(0,20,40,60,80,100),
             main='SO2')

The plots below show how imputation affected each reliability metric. It slightly improved sensor value reliability (Score 1 and Score 2).

ggplot(data=subset(dfScoresImputedAll, Data=='SO2'), 
       aes(x=date, ymin=`Before imputation`, ymax=`After imputation`))+
  geom_ribbon(aes(fill=factor(change)), alpha=0.25)+
  scale_fill_manual('Imputation effect',
                    values=c('Improved'='cornflowerblue', 'No change'='grey90', 'Worsened'='indianred1'))+
  geom_path(aes(y=`Before imputation`), colour='red', size=2, alpha=0.5)+
  geom_path(aes(y=`After imputation`), colour='blue', size=2, alpha=0.5)+
  geom_point(aes(y=`Before imputation`), colour='red', size=2)+
  geom_point(aes(y=`After imputation`), colour='blue', size=2)+
  facet_wrap(~`Score Component`, nrow=3)+
  ylim(0,100)+
  labs(x='Date', y='Score value',
       title='Effect of imputation on data reliability component - SO2 Concentration')+
  theme_bw()

Scoring AoT Data Reliability

Chin Yee Lee, Rongzhi Mai, Xiaoqi Tang

Spring Term 2019
