Methodology
This project maps the cultural identity embedded in the names of Buenos Aires venues — cafés, bars, restaurants, bookstores, cinemas, and theaters — using OpenStreetMap data and a hybrid classification approach.
Data collection
Venue data was queried from OpenStreetMap via the Overpass API, filtered to the administrative boundary of Ciudad Autónoma de Buenos Aires. Only venues with a name tag were retained. The query covered amenity values such as cafe, bar, restaurant, cinema, and theatre, and shop values such as books and bakery.
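A query of this kind could be sketched as follows. This is an illustration rather than the project's actual query: the endpoint URL, the exact area name used to resolve the boundary, and the helper names build_query and fetch_venues are assumptions, and the tag lists contain only the values named above.

```python
import json
import urllib.parse
import urllib.request

# Public Overpass endpoint (assumed; the project may have used a different mirror).
OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# Tag values named in the text; the project's full lists may be longer.
AMENITIES = ["cafe", "bar", "restaurant", "cinema", "theatre"]
SHOPS = ["books", "bakery"]


def build_query(area_name: str = "Ciudad Autónoma de Buenos Aires") -> str:
    """Build an Overpass QL query for named venues inside an administrative area."""
    amenity_re = "|".join(AMENITIES)
    shop_re = "|".join(SHOPS)
    # The ["name"] filter keeps only elements that carry a name tag.
    return f"""
[out:json][timeout:180];
area["name"="{area_name}"]["boundary"="administrative"]->.caba;
(
  nwr["amenity"~"^({amenity_re})$"]["name"](area.caba);
  nwr["shop"~"^({shop_re})$"]["name"](area.caba);
);
out center tags;
""".strip()


def fetch_venues(query: str) -> list:
    """POST the query to Overpass and return the raw elements (needs network access)."""
    data = urllib.parse.urlencode({"data": query}).encode("utf-8")
    with urllib.request.urlopen(OVERPASS_URL, data=data, timeout=300) as resp:
        return json.load(resp)["elements"]
```

Requesting `out center` gives ways and relations a single representative coordinate, so every venue can be handled as a point in later steps.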
Spatial enrichment
Each venue point was spatially joined with barrio (neighborhood) and comuna (district) boundaries from the Buenos Aires open data portal. Socioeconomic indicators — average household income and average age — were merged at the comuna level from the city’s Comunas en la Web statistics.
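The join logic can be illustrated with a minimal, dependency-free sketch. It assumes each barrio is a single simple ring of (lon, lat) pairs and uses invented toy boundaries and statistics. Real barrio boundaries may be multipolygons, which a GIS library would normally handle. The field names and the functions point_in_polygon and enrich are hypothetical.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the ring [(lon, lat), ...]?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def enrich(venue, barrios, comuna_stats):
    """Attach barrio, comuna, and comuna-level indicators to one venue."""
    for barrio in barrios:
        if point_in_polygon(venue["lon"], venue["lat"], barrio["ring"]):
            stats = comuna_stats.get(barrio["comuna"], {})
            return {**venue, "barrio": barrio["barrio"],
                    "comuna": barrio["comuna"], **stats}
    # Venue falls outside every known boundary.
    return {**venue, "barrio": None, "comuna": None}


# Toy boundaries and statistics, invented for illustration only.
barrios = [
    {"barrio": "A", "comuna": 1, "ring": [(0, 0), (2, 0), (2, 2), (0, 2)]},
    {"barrio": "B", "comuna": 2, "ring": [(2, 0), (4, 0), (4, 2), (2, 2)]},
]
comuna_stats = {
    1: {"income": 100.0, "avg_age": 35.0},
    2: {"income": 150.0, "avg_age": 40.0},
}

venue = enrich({"name": "X", "lon": 1.0, "lat": 1.0}, barrios, comuna_stats)
```

Merging the indicators through the comuna key mirrors the two-step structure described above: geometry decides the barrio and comuna, and the statistics follow from the comuna.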
Cultural motif classification
Venue names were classified into cultural motifs through a three-stage pipeline:
- Regex patterns: curated regular expressions matched names to motifs such as Italian heritage, French heritage, Argentine folklore, Religious / devotional, and others.
- LLM review: names the regex stage left unclassified were sent to a GPT model with structured prompts defining each cultural motif, and the resulting labels were flagged for manual verification.
- Manual labeling: the regex and LLM outputs were reviewed by hand, and any names still unassigned were classified directly.
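The first stage, and the hand-off to the later ones, might look like the sketch below. The patterns are hypothetical stand-ins, not the project's curated expressions, and the LLM and manual stages appear only as the list of leftover names that would be passed on.

```python
import re

# Hypothetical example patterns; the project's curated lists are not shown in the text.
MOTIF_PATTERNS = {
    "Italian heritage": re.compile(r"\b(trattoria|ristorante|nonna|napoli)\b", re.IGNORECASE),
    "French heritage": re.compile(r"\b(bistro|boulangerie|petit|paris)\b", re.IGNORECASE),
    "Argentine folklore": re.compile(r"\b(gaucho|pulper[ií]a|criollo)\b", re.IGNORECASE),
    "Religious / devotional": re.compile(r"\b(san|santa|virgen)\b", re.IGNORECASE),
}


def classify_by_regex(name):
    """Return the first motif whose pattern matches the name, or None."""
    for motif, pattern in MOTIF_PATTERNS.items():
        if pattern.search(name):
            return motif
    return None


def run_regex_stage(names):
    """Split names into regex-labeled ones and leftovers for the later stages."""
    labeled, leftover = {}, []
    for name in names:
        motif = classify_by_regex(name)
        if motif is None:
            leftover.append(name)
        else:
            labeled[name] = motif
    return labeled, leftover


labeled, leftover = run_regex_stage(["Trattoria Napoli", "Le Petit Paris", "Bar 878"])
```

A pattern such as `san` would also match place-based names like "San Telmo", which is one reason regex output benefits from the manual review step.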
Analysis
TF-IDF scores were computed for the terms in each motif's venue names to surface the words most distinctive of that motif.
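One way to compute such scores is sketched here, assuming that all names in a motif are pooled into a single document and that the plain, unsmoothed form tf × log(N / df) is used. The text does not specify the weighting variant or tokenizer, so both are assumptions, and the sample names are invented.

```python
import math
import re
from collections import Counter


def tokenize(name):
    """Lowercase a venue name and split it into word tokens."""
    return re.findall(r"\w+", name.lower())


def tfidf_by_motif(names_by_motif):
    """Score each term per motif, treating each motif as one pooled document."""
    counts = {
        motif: Counter(tok for name in names for tok in tokenize(name))
        for motif, names in names_by_motif.items()
    }
    n_docs = len(counts)
    # Document frequency: in how many motifs does each term appear?
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    scores = {}
    for motif, term_counts in counts.items():
        total = sum(term_counts.values())
        scores[motif] = {
            term: (freq / total) * math.log(n_docs / doc_freq[term])
            for term, freq in term_counts.items()
        }
    return scores


def top_terms(scores, motif, k=3):
    """Return the k highest-scoring terms for a motif (ties broken alphabetically)."""
    ranked = sorted(scores[motif], key=lambda t: (-scores[motif][t], t))
    return ranked[:k]


# Invented sample names, for illustration only.
sample = {
    "Italian heritage": ["La Nonna", "Trattoria La Nonna"],
    "French heritage": ["La Petite", "Le Petit Paris"],
}
scores = tfidf_by_motif(sample)
```

With this weighting, a word that appears in every motif (here "la") scores zero, so generic articles and venue-type words drop out of the top terms.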
Pearson correlations were calculated between per-comuna motif proportions and the comuna-level socioeconomic variables (average household income and average age) to quantify how naming patterns relate to neighborhood profiles.
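The calculation can be sketched in two steps: a motif's share of venues in each comuna, then the correlation of those shares with an indicator. The record layout and the function names are assumptions, and the toy values are invented.

```python
import math
from collections import Counter


def motif_proportions(venues, motif):
    """Share of each comuna's venues that carry the given motif."""
    totals = Counter(v["comuna"] for v in venues)
    hits = Counter(v["comuna"] for v in venues if v["motif"] == motif)
    return {comuna: hits[comuna] / total for comuna, total in totals.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Toy records, invented for illustration only.
venues = [
    {"comuna": 1, "motif": "A"},
    {"comuna": 1, "motif": "B"},
    {"comuna": 2, "motif": "A"},
]
shares = motif_proportions(venues, "A")
```

In use, the shares and the indicator would be aligned by comuna before calling `pearson_r`, for example by iterating over a sorted list of comuna identifiers.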
Chains (brands with 8 or more locations) were excluded from the correlation analysis so that franchise brands would not dominate the results.
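A filter along these lines could implement the exclusion. It assumes a chain is detected by counting venues that share the same normalized name. The text does not say how locations were counted, so this matching rule is an assumption, and the sample names are placeholders.

```python
from collections import Counter

CHAIN_THRESHOLD = 8  # threshold stated in the text


def normalize(name):
    """Lowercase and collapse whitespace so trivial variants count together."""
    return " ".join(name.lower().split())


def drop_chains(venues, threshold=CHAIN_THRESHOLD):
    """Keep only venues whose normalized name occurs fewer than `threshold` times."""
    counts = Counter(normalize(v["name"]) for v in venues)
    return [v for v in venues if counts[normalize(v["name"])] < threshold]


# Placeholder names: one brand with 8 locations, one with 2.
venues = [{"name": "Chain X"}] * 8 + [{"name": "Small  Place"}, {"name": "small place"}]
kept = drop_chains(venues)
```

Exact-name matching will miss chains whose branches carry location suffixes, so a real pipeline might also need brand tags or fuzzy matching.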