Case Study 3: Potholes in Baltimore
The city of Baltimore has a problem with potholes (holes in the street that are not fun to drive over). One of the city’s responsibilities is to fix these potholes to make roads safer. There has always been a system in place for people to report potholes, but the process was slow. The city invested in building a smartphone app that reports potholes automatically, with the goal of reducing the time it takes to fix them. The idea was to use the phone’s GPS and accelerometer to record a pothole’s location as someone drives over it. From there, it becomes a data science problem: take the incoming data and predict where the potholes are.
While this case study sounds less risky at first (maybe even like a useful application of data science), it demonstrates a very dangerous pitfall data scientists face. The technology only benefits people who have a smartphone. That means areas where residents are less likely to have smartphones are also less likely to have automatic reports sent in. There is a real risk that more impoverished communities will be left behind as more resources are sent to the more affluent neighborhoods with more reports, purely because more people there have smartphones.
In some sense, the city added a reporting bias to its system. A reporting bias exists when there is some reason the reported answers differ from the truth. An example of reporting bias is asking a married person, “Have you cheated on your spouse?” The answers people give are most likely biased towards “no” since answering truthfully carries a risk. Here, the reporting bias comes from differing levels of technological access.
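To make the effect concrete, here is a minimal sketch of how differing smartphone access could skew where repairs go. Every number and name in it is hypothetical: the two neighborhoods, their pothole counts, their smartphone ownership rates, and the rule that repair crews are split in proportion to reports are all assumptions for illustration, not data or policy from Baltimore. It also assumes, as a simplification, that each pothole is reported with probability equal to the neighborhood's ownership rate.

```python
# Hypothetical numbers for illustration only; none come from real city data.
neighborhoods = {
    "A": {"potholes": 100, "smartphone_rate": 0.9},  # more affluent (assumed)
    "B": {"potholes": 100, "smartphone_rate": 0.3},  # less affluent (assumed)
}


def expected_reports(potholes, smartphone_rate):
    """Expected number of reported potholes when only smartphone owners report.

    Simplifying assumption: each pothole is reported with probability
    equal to the neighborhood's smartphone ownership rate.
    """
    return potholes * smartphone_rate


def allocate_crews(reports, total_crews):
    """Split repair crews across neighborhoods in proportion to report counts."""
    total_reports = sum(reports.values())
    return {
        name: total_crews * count / total_reports
        for name, count in reports.items()
    }


reports = {
    name: expected_reports(info["potholes"], info["smartphone_rate"])
    for name, info in neighborhoods.items()
}
crews = allocate_crews(reports, total_crews=12)

for name in neighborhoods:
    print(
        f"Neighborhood {name}: {neighborhoods[name]['potholes']} potholes, "
        f"{reports[name]:.0f} expected reports, {crews[name]:.1f} crews"
    )
```

Under these assumptions, both neighborhoods have the same number of potholes, yet neighborhood A would receive about three times as many crews as neighborhood B. The data looks like evidence that A has worse roads when it really reflects who is able to report.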
When designing a data analysis, application, or model, you need to think carefully about how it impacts people of different races, genders, physical or mental abilities, socioeconomic status, etc. (and how it can affect people with intersecting identities). Thinking about diversity and inclusion is crucial for a data scientist since we want to make artifacts that benefit all people.
One approach you might consider to solve this problem of differing access or needs is to design for the “average person.” This isn’t always the best idea since your definition of “average” could put certain groups at a disadvantage. Additionally, it’s not clear that designing for the average helps anyone at all (example). Instead, you want to think about how you can create a modular or customizable system that works for each person’s individual needs.