This project was conducted as part of the University of Pennsylvania’s Master of Urban Spatial Analytics/Smart Cities Practicum in Spring 2023, which was taught by Michael Fichman and Matt Harris. In addition to Michael and Matt, we would like to thank Ren Massari, Peter Casey, and Kevin Wilson of The Lab at DC for their support and guidance on this project.



Rats thrive in cities. However, the feeling is not mutual according to city residents. In 2021, DC’s 311 Service received over 11,000 requests for rodent inspection and abatement. As inspection requests continue to increase after a pandemic-induced spike in rat populations, inspection departments are inundated with requests for treatment [1]. Inspection and treatment require resources: personnel, time, and money. Because over half of the inspections requested in DC find no evidence of rats, the inspection and treatment process is costly to the city.

This project explores the spatial and temporal patterns related to vermin infestation in order to develop a predictive model for estimating the probability of rat detection in a given area of Washington, DC. The information provided by the tool will allow city health and vermin inspectors to prioritize exterior inspections of properties suspected of vermin infestation based on the actual likelihood of rodents being detected. We aim to create a proof-of-concept infestation forecast that will be used as the basis of an inspection optimization data system and web app, which will allow for more targeted and efficient inspections.

1. Introduction: Context & Use Case

Municipal rodent management is a critical part of local government operations. In many cities, rodent infestation is persistent and can have detrimental impacts on public infrastructure, the local economy, and the overall health and well-being of both residents and the environment [2]. Cities face the monumental challenge of conducting large-scale rodent management, responding to resident complaints, and educating the public on the causes of rodent infestations.

This project aims to develop a screening tool that prioritizes DC 311 requests for rodent inspection based on the likelihood of rat detection on a given block. The goal of this work is to more effectively distribute resources used for inspection and treatment with the hopes of freeing up resources for other rodent management needs. The ultimate goal is to aid in the abatement of DC’s rat infestation issues.

1.1: Understanding Urban Vermin Infestations

Norway rats, also known as brown rats, are the most common species found in U.S. cities. These rats are commonly associated with sanitation problems in cities. While this reputation is accurate, rats are also behind many other problems that occur in urban areas. Rodents can cause structural issues by burrowing in streets or buildings, causing property damage that can result in the loss of businesses and homes. Rats also cause power outages, internet blackouts, and fires by gnawing on gas lines or electrical wires. Finally, they also pose a risk to public health and well-being as they can contaminate food, carry diseases, and spread pathogens [1].

Cities, with their large and dense human populations, provide optimal habitats for rats. Colonies of rats can stretch across entire city blocks. Rats utilize human-made infrastructure by traveling via sewer systems or utility lines to reach neighboring buildings. Food is the most important resource that cities provide to rats. They often explore their territories for new food sources at night when there is less human activity. City residents know that residential and commercial trash cans and dumpsters are often not tightly secured, making them prime opportunities for rats to find food. Ultimately, human behaviors and food waste are a main driving force behind urban rat infestations [3].

1.2: DC’s Current Rat Infestation Abatement Approach

The District of Columbia Department of Health (DC Health) is responsible for the city’s rodent control program. Currently, DC Health aims to protect public health by reducing rodent activity through a combination of proactive surveys, inspections, baiting, enforcement, community outreach, and distribution of educational materials. This work is performed by the Rodent and Vector Control Division, but relies on interagency cooperation.

DC’s business-as-usual approach relies on professional knowledge and ad-hoc decision-making on a daily basis. Residents can request inspections from DC Health and the Rodent and Vector Control Division by submitting reports through 311. When a request is received, DC Health inspectors are sent to the location of the call to inspect and treat any infestations. However, there are more inspection requests than inspectors can handle on a daily basis, and there is currently no formal prioritization of inspection requests. This strategy allows for inefficiencies in terms of employee time and financial resources, limiting the benefits of the inspection services to the public.

1.3: Our Approach to Improving Rat Inspection Efficiency

Our project aims to disrupt the current inefficiencies in inspection services by developing an inspection optimization system, called RATScreener. The RATScreener system will be based on a predictive model that forecasts the likelihood of infestation based on a range of spatial, population, and built environment variables. This system will assign probabilities of infestation to specific blocks and provide an overview of hotspot areas in DC.

This approach will allow the inspection office to prioritize incoming requests and understand the probability of actual infestations. DC Health inspectors can then make more informed decisions, target inspections to areas of high infestation probability, and reduce strain on limited resources within the department.

1.4: RATScreener Overview

We present an overview of the modeling process behind RATScreener below. Our process begins by collecting data and assigning variables to each block in DC. We then run our model and calculate predictions for each block, which are categorized by priority based on the probability of rat detection. As new requests come in, RATScreener identifies the block in which the address is located and assigns a priority level based on the block’s risk of rat detection. Finally, a list of prioritized requests is presented to inspectors via the RATScreener app.

An overview of the process behind RATScreener
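
The prioritization step in this process can be illustrated with a minimal sketch in base R. The block IDs, probabilities, request IDs, and cut points below are hypothetical and chosen only for illustration; the thresholds actually used by RATScreener are not stated in this section, and the sketch assumes each incoming request has already been matched to a block.

```r
# Hypothetical block-level predictions (block_id and prob are illustrative).
block_predictions <- data.frame(
  block_id = 1:4,
  prob     = c(0.82, 0.35, 0.58, 0.10)
)

# Assumed cut points; these are NOT the thresholds used by RATScreener.
block_predictions$priority <- cut(
  block_predictions$prob,
  breaks = c(0, 0.33, 0.66, 1),
  labels = c("low", "medium", "high"),
  include.lowest = TRUE
)

# Hypothetical incoming 311 requests, assumed to be already matched to a block.
requests <- data.frame(
  request_id = c("A", "B", "C"),
  block_id   = c(2L, 1L, 4L)
)

# Attach each block's probability and priority, then sort highest risk first.
queue <- merge(requests, block_predictions, by = "block_id")
queue <- queue[order(-queue$prob), ]
print(queue)
```

In practice, matching a request to a block would be a spatial operation on the request address rather than a lookup on a precomputed ID.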

2. Exploratory Analysis

In the exploratory analysis phase of this project, we aim to identify the patterns of rat infestation across time and space. Our analysis attempts to understand the spatial process associated with infestation and the relationships between built environment, spatial, and population variables.

2.1: Processing Rat Inspection Outcomes as the Dependent Variable

The primary dependent variable that the RATScreener model is trying to predict is whether a given 311 request for a rat inspection will lead to the discovery and treatment of an actual infestation. DC Health provided a dataset of all rat inspection requests placed through DC’s 311 Service between 2015 and 2018. The dataset includes the address where the request was placed, administrative information, and field notes from the resulting inspection.

A text analysis was performed on the field notes to assign each 311 request to a “rats found/no rats found” binary variable. The analysis detected words such as “baited,” “treatment,” and “treated” to indicate that rat activity was identified, and phrases such as “no evidence,” “no rat burrows,” or “no rat activity” as indications that no evidence of rats was found. The resulting binary variable is used as the dependent variable for the remainder of the analysis.
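
A minimal sketch of this kind of keyword labeling, in base R, is shown here. The example notes are invented, the keyword lists contain only the phrases named above (the full lists used in the analysis are not given), and the rule that negative phrases take precedence over positive ones is an assumption, as is leaving unmatched notes unlabeled.

```r
# Hypothetical field notes; the real inspection notes are not reproduced here.
notes <- c("Baited burrows in rear alley",
           "No evidence of rat activity observed",
           "Property treated, follow-up scheduled",
           "No rat burrows found on site",
           "Spoke with resident")

# Phrases named in the text; the complete lists are not given in the source.
found_pattern     <- "baited|treatment|treated"
not_found_pattern <- "no evidence|no rat burrows|no rat activity"

label_note <- function(note) {
  note <- tolower(note)
  # Assumption: negative phrases take precedence when both kinds appear.
  if (grepl(not_found_pattern, note)) return(0L)
  if (grepl(found_pattern, note)) return(1L)
  NA_integer_  # notes matching neither list are left unlabeled
}

# 1 = rats found, 0 = no rats found, NA = could not be classified
activity <- vapply(notes, label_note, integer(1), USE.NAMES = FALSE)
print(data.frame(notes, activity))
```

Notes that match neither list would need manual review or additional keywords before they could be used in the model.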

As previously mentioned, rat colonies tend to be limited by barriers in the built environment such as roads and other impervious surfaces. Because of this, rat inspections typically cover an entire block, not just the address that the 311 request came from. In order to have the modeling process reflect this approach to treatment, the RATScreener tool predicts the likelihood of rat infestation at the city block level. After creating city block polygons using DC Open Data’s Street Centerlines shapefile, the binary variable described above was translated to a block-level binary variable of whether rats had been found anywhere on that block in the past. This process yielded 5,243 city blocks in DC which are used for the remainder of the analysis.

In addition to the 311 data shown above, DC Health provided us with a dataset of 100 inspections performed across the city at locations which were not connected to a 311 request. This secondary dataset is used later in the analysis for additional validation and to better understand how reliant the models are on patterns seen in 311 data specifically, rather than the underlying data of actual rat infestation.

2.2: Exploring Risk of Rat Infestation

# Load packages used in the code below
library(tidyverse)
library(lubridate)
library(sf)

# Load rat infestation dataset and spatialize
Rats <- read.csv("./data/rats_to_blocks.csv.gz", header = TRUE) %>%
  na.omit() %>%
  st_as_sf(.,coords=c("LONGITUDE","LATITUDE"),crs=4326) %>%
  st_transform('ESRI:102685') %>%
  mutate(month = month(ymd_hms(SERVICEORDERDATE)),
         year = year(ymd_hms(SERVICEORDERDATE)),
         serviceday = ymd(substr(SERVICEORDERDATE,1,10))) %>%
  dplyr::select(P0010001, index_right, SERVICEORDERDATE, INSPECTIONDATE, SERVICENOTES, serviceday, WARD, week, year, month, calls, activity, geometry)

# load street centerlines from DC open data
centerlines <- st_read("./data/Street_Centerlines_2013/Street_Centerlines_2013.geojson") %>% 
  st_transform("ESRI:102685") %>% 
  filter(ROADTYPE == "Street")

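# convert street centerlines to block polygons
# NOTE: the polygonizing step is missing from the source; the st_union() /
# st_polygonize() chain below is an assumed reconstruction, not the authors' code
blocks <- centerlines %>% 
  st_union() %>% 
  st_polygonize() %>% 
  st_collection_extract("POLYGON") %>% 
  st_sf(geometry = .) %>% 
  dplyr::mutate(block_id = row_number())
boundary <- st_union(blocks)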

# spatial join to assign each rat datapoint to a block polygon
rats_block_join <- st_join(Rats, blocks)

# count observations per block for mapping
block_counts <- rats_block_join %>% 
  st_drop_geometry() %>% 
  group_by(block_id) %>% 
  dplyr::summarize(inspection_count = n(),
                   rats_found_yn = ifelse(1 %in% activity, 1, 0),
                   rats_found_count = sum(activity))

# join counts back to the block polygons; blocks with no inspections get zeros
block_dat <- left_join(blocks, block_counts, by = "block_id") %>% 
  dplyr::mutate(inspection_count = replace_na(inspection_count, 0),
                rats_found_yn = replace_na(rats_found_yn, 0),
                rats_found_count = replace_na(rats_found_count, 0),
                area_acres = as.numeric(st_area(.)) / 43560, # square feet to acres
                found = case_when(rats_found_yn == 0 ~ "not_found",
                                  rats_found_yn == 1 ~ "found"))

# count of inspections m