A Multi-Source Machine Learning Study of Pedestrian and Bicycle Safety Using Google Street View and SHAP: San Diego County, CA
-
Abstract
As demand for walking and cycling grows, ensuring the safety of these road users is crucial. Recent data show that cyclist fatalities rose by 7% and pedestrian deaths by 13% in the U.S. from 2020 to 2021. This study evaluates pedestrian and bicycle (ped/bike) crash risk in San Diego County, California, addressing gaps in research on how the built environment, especially street design, affects safety. It uses Google Street View imagery and fuses multiple data sources, including socio-demographic, traffic, land-use, and street-level features, with traffic volume serving as a proxy for exposure.
Crash data from 2011 to 2022 and control points (locations with no recorded pedestrian or bicycle crashes) were analyzed using XGBoost and random forest (RF) models. XGBoost outperformed RF, achieving 90% accuracy in distinguishing crash points from control points.
SHAP feature importance revealed the key predictors of crash risk. Visible environmental characteristics strongly influenced crash probability: streets with higher visible building density and visible road area were associated with increased crash exposure. Roads with a 25-mph speed limit showed lower crash risk, while crash probability increased at 35–40 mph and declined beyond 45 mph. Low-income commercial areas were found to have higher ped/bike crash risk, suggesting a lack of infrastructure investment and safety measures in these neighborhoods. Race and crash risk showed non-linear associations. For the Hispanic population, crash risk fluctuated across the distribution, peaking in census block groups with 40% to 80% Hispanic residents and decreasing beyond that range. Areas with a lower proportion of Black residents were more prone to crashes, which may reflect underrepresentation in planning and safety investments in minority neighborhoods. This study provides new insights into the factors associated with high ped/bike crash risk and proposes phased interventions to improve overall safety outcomes.
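To illustrate how SHAP-style attributions assign a prediction to individual features, the following minimal sketch computes exact Shapley values for a toy risk score. It is not the study's model: the function `toy_risk`, the feature names, the coefficients, and the baseline values are all hypothetical, chosen only to mirror the kinds of predictors discussed above. In practice, the SHAP library approximates these values efficiently for tree ensembles such as XGBoost, whereas the brute-force enumeration here is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for instance x relative to a baseline instance.

    Features outside a coalition are set to their baseline value.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = {g: x[g] if (g in subset or g == f) else baseline[g]
                          for g in names}
                without_f = {g: x[g] if g in subset else baseline[g]
                             for g in names}
                total += weight * (predict(with_f) - predict(without_f))
        phi[f] = total
    return phi


def toy_risk(z):
    """Hypothetical risk score; coefficients are illustrative, not estimated."""
    score = 0.02 * z["building_density"] + 0.01 * z["road_area"]
    if 35 <= z["speed_limit"] <= 40:   # mid-range speed limits raise the score
        score += 0.3
    if z["low_income"] and z["commercial"]:  # interaction of two indicators
        score += 0.2
    return score


# Hypothetical location and reference (baseline) location
x = {"building_density": 10, "road_area": 20, "speed_limit": 35,
     "low_income": 1, "commercial": 1}
baseline = {"building_density": 0, "road_area": 0, "speed_limit": 25,
            "low_income": 0, "commercial": 0}

phi = shapley_values(toy_risk, x, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the score at `x` and the score at the baseline, and the interaction term is split evenly between its two features. This additivity is what allows per-feature contributions to be read off and compared across locations.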
-
-