This project was produced as part of the University of Pennsylvania’s Master of Urban Spatial Analytics Spring 2021 Practicum (MUSA 801) taught by Ken Steif, Michael Fichman, and Matt Harris. We would like to thank the Philadelphia Fire Department for providing useful information and data.
This document presents the methodology behind an analysis of property fire risk in Philadelphia and an API tool that aims to improve the fire department’s situational awareness. The API tool can provide real-time information and a risk assessment for every parcel in the city. The document also includes the detailed code used to build the risk assessment model and the situational awareness API.
Structure fires are among the most common yet destructive urban disasters. On Nov. 21, 2020, two children were killed in a rowhome fire in Philadelphia’s Grays Ferry neighborhood despite rescue efforts by Philadelphia firefighters and desperate neighbors. The fire was reported at about 1:15 a.m. Although firefighters responded within two minutes, the rowhome was already consumed by fire when they arrived.
It is not yet known whether the home had working smoke detectors, but fire risk is not the same for every building. According to a recent study conducted by American Survey CO covering 2005 to 2010, the causes of house fires across America were as follows:
Besides these immediate causes of fire, other factors are important for predicting fire risk and deserve firefighters’ attention. For instance, code violations and 311 requests can be vital indicators of fire risk, because fires are more likely to break out and spread in run-down neighborhoods.
Every day, the Philadelphia Fire Department responds to hundreds or even thousands of locations to deal with an array of emergencies. Currently, the Fire Department has little ‘situational awareness’ of fire risk at a given location when an emergency call comes in. In this project, we aim to improve this situation by creating a new API tool that returns two deliverables for each property in the city:
In addition, we want residents to be able to get a real-time update on the fire risk of their homes, so that situational awareness of risk is available for every property citywide.
Fire Response Situational Awareness API:
http://3.22.171.167:8000/parcel_info?addr=1200%20W%20VENANGO%20ST
The address “1200%20W%20VENANGO%20ST” can be replaced by any street address in Philadelphia, and it does not need to be URL-encoded.
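As a minimal sketch of how a client might build such a request, the Python snippet below constructs the URL for an arbitrary address. The helper name `build_parcel_url` is hypothetical and not part of the project. Although the text notes that encoding is not required, the sketch percent-encodes the address anyway, which reproduces the example URL above.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the example URL in this document.
BASE_URL = "http://3.22.171.167:8000/parcel_info"


def build_parcel_url(address: str) -> str:
    """Return the request URL for a street address (hypothetical helper).

    quote_via=quote encodes spaces as %20 rather than "+",
    matching the form of the example URL.
    """
    query = urlencode({"addr": address}, quote_via=quote)
    return f"{BASE_URL}?{query}"


url = build_parcel_url("1200 W VENANGO ST")
print(url)
```

The resulting string could then be passed to any HTTP client, for example `urllib.request.urlopen(url)`, assuming the server is reachable.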
API stands for Application Programming Interface: an interface that defines interactions between multiple software applications or mixed hardware-software intermediaries. It defines the kinds of calls or requests that can be made, how to make them, the data formats that should be used, the conventions to follow, etc. [Wikipedia-API] When someone uses an application on a device, the application connects to the Internet and sends data to a server. The server retrieves that data, interprets it, performs the necessary actions, and sends a response back to the device. The application then interprets the response and presents the information in a readable way. This is how APIs are used in daily life.
API- How does it work?
The diagram above shows how our API works. Once it receives an input, it automatically retrieves data from other sources, processes it, and returns a response. All the data we use comes from diffuse data sources belonging to the city. The challenge is figuring out how to get and combine all the data we need, structure it, make a relative fire risk assessment based on it, and present everything in a single API response.
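The flow described here (receive an address, query several sources, combine the results, attach a relative risk assessment, return one response) can be sketched in Python as follows. This is an illustration only: the project itself was built in R, and every function name, field name, toy value, and the scoring rule below are assumptions made for the example, not the project’s actual sources or model.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for the city data sources. A real implementation
# would query each source at request time; these return fixed toy values.
def fetch_property(address: str) -> dict:
    return {"category": "Residential"}


def fetch_violations(address: str) -> dict:
    return {"violation_count": 2}


def fetch_311_requests(address: str) -> dict:
    return {"request_count": 1}


SOURCES: Dict[str, Callable[[str], dict]] = {
    "property": fetch_property,
    "violations": fetch_violations,
    "requests_311": fetch_311_requests,
}


def relative_risk(record: dict) -> str:
    """Placeholder rule for illustration only, not the project's model."""
    signals = (
        record["violations"]["violation_count"]
        + record["requests_311"]["request_count"]
    )
    return "elevated" if signals >= 3 else "baseline"


def parcel_info(address: str) -> dict:
    """Query every source for one address and return a single response."""
    record = {name: fetch(address) for name, fetch in SOURCES.items()}
    return {
        "address": address,
        "data": record,
        "relative_risk": relative_risk(record),
    }


response = parcel_info("1200 W VENANGO ST")
```

Keeping the sources in a single mapping means a new data source can be added without touching the code that assembles the response.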
API- How was it built?
The API was built in R using Plumber [see Appendix]. Once built, it was served locally on our own machines. We then built a container holding the API and its environment and pushed it to the cloud. Finally, we pulled the container onto a virtual machine, ran it, and opened it to the public.
First, the API provides real-time data. It requests the latest data only when it receives an input. The next time it is called, it requests the data again and returns a fresh response rather than reusing the old one.
Second, the API integrates data from diffuse city data sources. This makes it convenient to use, because a single request returns all the information we want.
Third, the API provides clean data. It returns data without redundant or irrelevant information, and it structures the response so that it is highly readable.
Last but not least, the API can support the development of multiple apps without causing any conflict. Apps built for different purposes can use the same API to get data.
An app demo
We used data retrieved from different sources to build the dataset for modeling. It includes fire data from the Philadelphia Fire Department; Property Assessments, Licensing and Inspection Code Violations, and 311 Service and Information Requests from OpenDataPhilly; and socioeconomic and demographic data from the 5-year American Community Survey APIs.
The data was also “wrangled” before being explored in the following section, in order to optimize the predictive ability of each variable. For details on this procedure, please see Appendix: Data Wrangling. Since fire risk is considered at the parcel level, we joined the multi-source datasets using unique identifiers: OPA number, or parcel registry number and address (the relationships among the datasets are shown in the plot below). Once the wrangling and exploratory analysis were complete, we were left with a dataset of about 490,000 individual parcels, which includes a “universe” of fire, property, and risk factor information.
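To illustrate the parcel-level join, here is a minimal Python sketch that counts fire records per parcel and attaches the count to each parcel by a shared identifier. The field name `opa_number`, the toy identifiers, and the helper names are assumptions for the example; the project’s actual wrangling code is in the appendix and was written in R.

```python
from collections import Counter
from typing import Dict, List


def count_fires_by_parcel(fire_records: List[Dict], key: str = "opa_number") -> Counter:
    """Count how many fire records share each parcel identifier."""
    return Counter(record[key] for record in fire_records)


def attach_fire_counts(
    parcels: List[Dict], fire_records: List[Dict], key: str = "opa_number"
) -> List[Dict]:
    """Left-join fire counts onto parcels; parcels with no fires get 0."""
    counts = count_fires_by_parcel(fire_records, key)
    joined = []
    for parcel in parcels:
        n = counts.get(parcel[key], 0)
        joined.append({**parcel, "fire_count": n, "had_fire": n > 0})
    return joined


# Toy records with made-up identifiers, for illustration only.
parcels = [{"opa_number": "A001"}, {"opa_number": "A002"}]
fires = [{"opa_number": "A001"}, {"opa_number": "A001"}]
parcel_table = attach_fire_counts(parcels, fires)
```

A left join keeps every parcel in the table, including those with no fire record, so the binary outcome is defined for all of them.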
We organized the data into four categories of features that we anticipated would influence the risk of fire for each parcel citywide. This was done to ensure our model is sensitive to a diverse range of variables.
1. Previous fire
Hypothesis 1: Is previous fire occurrence spatially correlated with risk of fire?
We calculated the average distance from each parcel centroid to the 5 nearest fire incidents to see whether there are spatial patterns in fire occurrence. This variable can reveal clustering, the idea being that fire events may be more likely to happen in nearby parcels with certain features.
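The feature described above, the mean distance from a parcel centroid to its k nearest fire incidents, can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes coordinates are in a projected, planar system (so straight-line distance is meaningful) rather than latitude and longitude. It uses a brute-force search, which is fine for illustration but would be slow for roughly 490,000 parcels without a spatial index.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_distance_to_k_nearest(
    centroid: Point, events: Sequence[Point], k: int = 5
) -> float:
    """Average Euclidean distance from centroid to its k nearest events.

    If fewer than k events exist, all of them are used.
    """
    if not events:
        raise ValueError("at least one event is required")
    distances = sorted(math.dist(centroid, event) for event in events)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Toy coordinates in arbitrary planar units, for illustration only.
fires = [(1, 0), (0, 2), (3, 0), (0, 4), (5, 0), (100, 100)]
feature = mean_distance_to_k_nearest((0, 0), fires, k=5)
```

A small value means the parcel sits inside a cluster of earlier fires; a large value means earlier fires were far away.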
2. Demographics
Hypothesis 2: Are demographic characteristics correlated with fire accidents?
We wanted to see whether demographic characteristics were also linked to fire accidents in Philadelphia. For example, are areas with higher population more likely to experience fires, since more people introduce more human factors?
We also assigned each property to its neighborhood and census tract to see if, at a different scale, there is a spatial clustering of fire accidents that varies from neighborhood to neighborhood or tract to tract.
3. Property attributes
Hypothesis 3: Could property attributes relate to the risk of fire?
We wanted to see if the physical characteristics of a house would be correlated with fire accidents. We looked almost exclusively at the property dataset, and then focused on zoning type, category, and housing condition.
4. Risk factors
Hypothesis 4: Does the presence of risk factors affect risk of fire?
The risk factors here refer to unsafe conditions in the property itself and to neighborhood blight features. Violation and 311 request data were used to identify these risk factors.
As for risk factors in the property itself, we carefully analyzed unsafe structures and missing fire-related equipment, since such problems could add to the risk of fire.
In addition, we wanted to explore whether proximity to neighborhood blight or other risk factors matters. We analyzed features such as the average distance to the nearest broken street light requests, no-heat houses, and violations.
The following part presents our exploratory data analysis. The dependent variable is a binary outcome: either a parcel had a fire or it did not. The relevant question is therefore whether, for a given feature, there is a statistically significant difference between areas that caught fire and areas that did not. These differences are explored in a set of plots below.
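One way to quantify such a difference for a numeric feature is Welch’s t statistic, which compares the group means while allowing unequal variances. The source does not state which test, if any, was used, so the choice of Welch’s t here is an assumption, and the values below are toy numbers rather than project data.

```python
import math
from statistics import mean, variance
from typing import Sequence


def welch_t(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Welch's t statistic for the difference in means of two samples.

    Each group needs at least two observations so that the sample
    variance is defined.
    """
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Toy feature values for parcels with and without a fire (illustration only).
feature_fire = [1.0, 2.0, 3.0]
feature_no_fire = [4.0, 5.0, 6.0]
t_stat = welch_t(feature_fire, feature_no_fire)
```

A t statistic far from zero suggests the feature differs between the two groups; turning it into a p-value requires the t distribution, which a statistics library can supply.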
The number of recorded fire incidents increased sharply in 2018. This may be due to a change in the procedure for recording fires after 2017. In 2020, the number of fire events decreased, which seems reasonable given the outbreak of COVID-19. The plot on the right shows the most common types of building fire. Since we summed the fire event information up to the parcel level, we left out wildfires.