Conclusion
Philadelphia’s Amenity Landscape
In Philadelphia’s urban landscape, the densest concentration of amenities is found in the center city, characterized predominantly by an array of restaurants, services, and retail outlets. This central zone stands out as a vibrant hub for everyday urban life and economic activity.
However, the amenity distribution across the city is not uniform. While essentials like restaurants, beauty services, and grocery stores are widespread, specific amenities such as green spaces, sports facilities, historic sites, and arts venues are more localized, creating unique clusters in certain districts. This pattern of specialized neighborhoods beyond the amenity-rich center city adds to Philadelphia’s appeal as a case study in urban amenity clustering, highlighting its diverse and distinct urban identity.
Neighborhood Amenity Clusters
In our analysis, we identified 7 distinct clusters of neighborhoods with similar amenity compositions. These clusters were determined using a k-means cluster analysis of the percentage share of each amenity type. Percentage share was used in lieu of raw counts or counts per area to counter the high variation in population density, and to emphasize qualitative differences rather than rank neighborhoods against each other. The clusters are as follows:
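For readers who want to reproduce the clustering step described above, the following is a minimal sketch, not the script used in this analysis. It converts raw amenity counts to percentage shares and then runs a plain k-means (Lloyd's algorithm) on those shares. The neighborhood names, amenity columns, counts, and the choice of k=2 are all hypothetical toy values chosen for illustration; the analysis itself used 7 clusters.

```python
import random

def percentage_shares(counts):
    """Convert raw amenity counts per neighborhood into percentage shares."""
    shares = {}
    for name, row in counts.items():
        total = sum(row)
        # A neighborhood with no amenities gets an all-zero profile.
        shares[name] = [100.0 * c / total if total else 0.0 for c in row]
    return shares

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical counts; columns are restaurants, retail, parks, arts venues.
counts = {
    "A": [80, 15, 3, 2],
    "B": [40, 8, 1, 1],   # same mix as A at half the volume
    "C": [5, 5, 30, 10],
    "D": [2, 3, 12, 3],
}
shares = percentage_shares(counts)
names = list(shares)
labels = kmeans([shares[n] for n in names], k=2)
clusters = dict(zip(names, labels))
print(clusters)
```

In this toy data, neighborhoods A and B differ in raw volume but have nearly the same amenity mix, so clustering on shares groups them together. This is the qualitative comparison that percentage share is meant to support.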
Applications for Future Research
The integration of detailed urban amenity data is crucial for urban planning, public policy, and public health research. Accurate and current data on neighborhood amenities is essential, yet often missing, in studies of neighborhood choice and housing opportunities. This gap highlights the need for a framework that includes such data for better decision-making.
Utilizing this data can significantly inform housing policies, particularly Housing Choice Vouchers, by centering on the lived experiences and preferences of residents. This approach promotes equity by considering the distribution and accessibility of amenities, which are vital for quality urban living. Moreover, it can guide city residents, especially those with housing vouchers, in making informed residential choices, ensuring access to their preferred amenities and services.
Additionally, this data can lead to more effective public health strategies by linking amenity distribution with health outcomes, allowing for targeted interventions. Overall, this advancement in urban studies, with a focus on detailed amenity data, can enhance urban living quality and foster more equitable urban environments.
Limitations
The analysis faces two primary limitations. First, there is a demographic mismatch between Yelp’s user base and the general population of Philadelphia. Yelp users are predominantly older (40% over 55 years), higher-income (80% earning above $60k annually), and more educated (77% with at least a college degree), whereas Philadelphia’s median age is 34.6 years, median income is $27,331, and only 33.6% hold a college degree. This discrepancy suggests that Yelp reviews may not fully represent Philadelphia’s diverse amenity landscape, potentially overlooking preferences and needs of younger, lower-income, and less-educated residents.
Second, Yelp's categorization of business listings is ambiguous. The alias field, which classifies businesses, has no clear hierarchy, so we had to develop a complex Python script. This script iterates through the alias terms to determine the most accurate business category, sometimes relying on specific keywords in business titles to resolve unclear cases. While this method improves data accuracy, it is labor-intensive and subject to human error and subjective judgment, which reduces its replicability and may bias how amenities are categorized.
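The general shape of such a categorization pass can be sketched as follows. This is an illustrative assumption rather than the script used in the analysis: the priority order, the alias terms, the title keywords, and the function name `categorize` are all hypothetical placeholders.

```python
# Hypothetical priority order: earlier categories win when a business's
# alias terms point to more than one category.
CATEGORY_RULES = [
    ("grocery", {"grocery", "convenience"}),
    ("arts", {"galleries", "museums", "theater"}),
    ("restaurant", {"restaurants", "pizza", "cafes"}),
]

# Hypothetical fallback keywords looked up in the business title.
TITLE_KEYWORDS = {"market": "grocery", "museum": "arts"}

def categorize(name, aliases):
    """Return one amenity category for a business listing."""
    alias_set = set(aliases)
    # First pass: walk the categories in priority order and stop at the
    # first one that shares a term with the listing's aliases.
    for category, terms in CATEGORY_RULES:
        if alias_set & terms:
            return category
    # Second pass: fall back to keywords in the business title.
    lowered = name.lower()
    for keyword, category in TITLE_KEYWORDS.items():
        if keyword in lowered:
            return category
    return "other"

print(categorize("Corner Market", ["pizza", "grocery"]))
print(categorize("City Museum Shop", ["giftshops"]))
```

The hand-written priority order and keyword table are exactly where the subjectivity noted above enters: a different ordering would assign the first example to a different category.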
The Zillow analysis inherits problems from the Yelp data in its current form, such as the lack of a clear ranking among the terms under a business's alias. It is also difficult to ensure that these terms do not overlap with information about the house itself in the listing description.
Furthermore, some records place the address within the description or have no address at all. Unless the neighborhood is named in the description, these records make it difficult to offer finer spatial granularity later.
Beyond these issues, the largest limitations of the Zillow analysis in our current implementation are the lack of geocoding and the lack of a weight or importance factor for ranking mentions, ideally one that gives strong preference to mentions of specific businesses.
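One way such a weighting could look is sketched below. It is not part of the current implementation. The weights, the lookup tables, the sample description, and the function name `score_mentions` are assumptions made for illustration only.

```python
import re
from collections import Counter

GENERIC_WEIGHT = 1.0   # assumed weight for a generic amenity word
SPECIFIC_WEIGHT = 3.0  # assumed higher weight for a named business

def count_phrase(phrase, text):
    """Count whole-word, case-insensitive occurrences of phrase in text."""
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return len(re.findall(pattern, text.lower()))

def score_mentions(description, generic_terms, business_names):
    """Score amenity categories mentioned in a listing description."""
    scores = Counter()
    for term, amenity in generic_terms.items():
        scores[amenity] += GENERIC_WEIGHT * count_phrase(term, description)
    for name, amenity in business_names.items():
        scores[amenity] += SPECIFIC_WEIGHT * count_phrase(name, description)
    return scores

# Hypothetical lookup tables and listing text.
generic_terms = {"restaurants": "restaurant", "park": "green space"}
business_names = {"Reading Terminal Market": "grocery"}
description = (
    "Steps from Reading Terminal Market, with restaurants and a park nearby."
)

scores = score_mentions(description, generic_terms, business_names)
print(scores.most_common())
```

Under these assumed weights, a single named business outranks any single generic mention, which reflects the preference described above. Appropriate weights would need to be calibrated against real listings.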