Chicago metropolitan area relation

Last month, @mtmail mapped a relation for the Chicago metropolitan area. This is currently tagged:

boundary=statistical
loc_name=Chicagoland

To my knowledge, this is the first time that someone has mapped a Metropolitan Statistical Area in OSM. I am posting this to gather feedback on whether MSAs should be mapped in OSM. There are 387 of these MSAs in the United States, so keeping this one would implicitly allow the others to be mapped also.

If we think these should be mapped, I would suggest also adding:

border_type=metropolitan_statistical_area

…to be consistent with CDP tagging.

One of the arguments against mapping MSAs is that they are simply a collection of counties, and would effectively be replicating data which is already in Wikidata relationships. In addition, it would add yet one more shared relation to deal with when editing boundaries.

Arguments in favor include “we have the tagging for it, so why not?” as well as the ability to use it in things like Overpass area queries.

There is also the question of whether “Chicagoland” is actually a local name for the MSA or if it’s more of a local name for Chicago and its suburbs more generally.

Discuss.

I am vehemently against such area inflation for the already-mentioned reasons: we’re making editing harder (someone splitting a way will “edit” and upload a ton of relations, and when later asked why they edited the Chicagoland boundary will probably reply with “huh?”), and we are duplicating information. If someone is desperate to map these, make them a relation where the individual counties are members, not the boundary lines of the counties. (This might make it more difficult to run an Overpass query against the whole area, but that is a limitation of Overpass that we should improve there and not circumvent by adding triplicate data to OSM.)


I’d say data consumers would be better off getting this freely available data straight from the Census Bureau rather than second hand through OSM.


There are also micropolitan statistical areas – the same thing, but for smaller population centers. Together, we’d have to maintain 935 core-based statistical areas. These boundaries are by definition aligned with county boundaries and only change at most annually (and at least every five years).

CBSAs are only one tier of the OMB’s geographies within the Census Bureau’s standard model of geography. They’re often subdivisions of combined statistical areas and are often further subdivided into metropolitan divisions. Wikidata has extensive coverage of CBSAs and CSAs in terms of the counties they consist of. (Metropolitan divisions need a lot of work there.)

In past discussions, the consensus has been to avoid mapping these areas, because of the slim benefit over simply enumerating the counties in a CBSA. (In fact, many MSAs and most μSAs consist of only one county.) For example, one of my saved queries in Overpass turbo begins by enumerating the counties that form the Cincinnati–Wilmington, OH–KY–IN Combined Statistical Area. Since my goal for this query is to track my mapping progress, I ignored the 2023 change that removed Mason County, Kentucky, from the CSA:

(
  {{geocodeArea:Dearborn County, Indiana}};
  {{geocodeArea:Franklin County, Indiana}};
  {{geocodeArea:Ohio County, Indiana}};
  {{geocodeArea:Union County, Indiana}};
  {{geocodeArea:Boone County, Kentucky}};
  {{geocodeArea:Bracken County, Kentucky}};
  {{geocodeArea:Campbell County, Kentucky}};
  {{geocodeArea:Gallatin County, Kentucky}};
  {{geocodeArea:Grant County, Kentucky}};
  {{geocodeArea:Kenton County, Kentucky}};
  {{geocodeArea:Pendleton County, Kentucky}};
  {{geocodeArea:Mason County, Kentucky}}; // Maysville, KY μSA detached in 2023
  {{geocodeArea:Brown County, Ohio}};
  {{geocodeArea:Clermont County, Ohio}};
  {{geocodeArea:Hamilton County, Ohio, United States}};
  {{geocodeArea:Warren County, Ohio}};
  {{geocodeArea:Butler County, Ohio}};
  {{geocodeArea:Clinton County, Ohio}};
);

It’s verbose boilerplate, but a small price to pay for some control over the precise definition and vintage. If I needed to query at a larger scale, let’s say within all the MSAs in the country, I’d turn to a tool that can perform a crosswalk with Wikidata, which knows which counties are part of the Chicago metropolitan area (Q1754965). For example, Sophox and QLever can automatically query for cities in a metropolitan area based on Wikidata.
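If the boilerplate gets tiresome, it could be generated instead of hand-edited. Here is a minimal sketch in Python, assuming a hypothetical helper named geocode_area_union (my own invention, not part of Overpass turbo or any library). It takes area names exactly as you would type them into a {{geocodeArea:}} shortcut and lets you drop counties to pin a particular vintage:

```python
# Hypothetical helper for illustration only; not part of Overpass turbo.
# It emits the same union block as the hand-written query above.

# County list copied from the saved query; names are passed through verbatim,
# including any disambiguation such as "Hamilton County, Ohio, United States".
CINCINNATI_CSA = [
    "Dearborn County, Indiana",
    "Franklin County, Indiana",
    "Ohio County, Indiana",
    "Union County, Indiana",
    "Boone County, Kentucky",
    "Bracken County, Kentucky",
    "Campbell County, Kentucky",
    "Gallatin County, Kentucky",
    "Grant County, Kentucky",
    "Kenton County, Kentucky",
    "Pendleton County, Kentucky",
    "Mason County, Kentucky",  # detached from the CSA in 2023
    "Brown County, Ohio",
    "Clermont County, Ohio",
    "Hamilton County, Ohio, United States",
    "Warren County, Ohio",
    "Butler County, Ohio",
    "Clinton County, Ohio",
]


def geocode_area_union(area_names, exclude=()):
    """Build an Overpass turbo union of {{geocodeArea:...}} shortcuts.

    area_names: iterable of names to geocode, kept in the given order.
    exclude: names to leave out, e.g. to follow a newer delineation.
    """
    skipped = set(exclude)
    body = [
        f"  {{{{geocodeArea:{name}}}}};"  # doubled braces render as literal {{ }}
        for name in area_names
        if name not in skipped
    ]
    return "\n".join(["(", *body, ");"])


if __name__ == "__main__":
    # Pre-2023 vintage, matching the query above.
    print(geocode_area_union(CINCINNATI_CSA))
    # Post-2023 vintage, without Mason County.
    print(geocode_area_union(CINCINNATI_CSA, exclude=["Mason County, Kentucky"]))
```

The output still has to be pasted into Overpass turbo, which expands the shortcuts; the script only saves typing and keeps the choice of vintage explicit in one place.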

The loc_name=Chicagoland on this boundary relation is misleading, placing too much emphasis on a statistical unit for a name that isn’t used in demography or economic statistics at all. My understanding is that Chicagoland refers to a broader area that reaches farther into Wisconsin and Illinois, but the extent depends on who you ask.

When laypeople want coverage of “metropolitan areas”, they aren’t necessarily thinking of the MSA boundaries specifically. Any given metro area will have a number of other definitions, especially outside of government. Most local print and broadcast media outlets have their own proprietary definitions based on their own service areas. This is a general issue with mapping “metropolitan areas”, and there are some far more problematic cases than Chicago, such as New York and San Francisco.

I think it would be worth understanding the motivation for adding this particular boundary: did a geocoder user complain about an unsuccessful query for Chicagoland or the Chicago metropolitan area per se? Did they need the full boundary or only an approximate point feature (which could be derived from the namesake city)? Did they expect systematic coverage beyond this one moniker?


Original mapper here. My colleague mentioned ‘Chicagoland’, not a geocoding user. Looking at the chat history, he also mentioned ‘research triangle’ (Node: Research Triangle (2379521221) | OpenStreetMap) and various tri-state areas (Tri-state area - Wikipedia) as examples of informal metro areas.

For Chicagoland, several websites give a broad and a narrow definition of the extent. Wikipedia has it as an alias for the statistical region. On Wikipedia’s talk page (Talk:Chicago metropolitan area - Wikipedia), somebody claims some areas shouldn’t be included and another user claims that they should, so it doesn’t seem clear-cut.

Removing it from OSM makes sense; I don’t mind. It’s not part of the (administrative) address hierarchy that geocoders use.

Similarly, I live in Silicon Valley and the San Francisco Bay Area. Both are represented as place=region nodes. The former is located in front of a historically significant building, and the latter is smack-dab in the middle of the bay for neutrality. A node doesn’t express the scale of the region, but it’s an admission that we aren’t in the business of defining arbitrary boundaries. We take the same approach for continents as well as many cities’ neighborhoods.

I’d support replacing the Chicago metropolitan area boundary relation with a place=region node named Chicagoland, representing only that informal place, nothing more scientific than that. It could be centered near the Loop right alongside the Chicago node, or anything else that comes to mind. I split out a Chicagoland (Q134718040) item on Wikidata that we can link to.


Sounds good. I’ll make the changes in a couple of days and will add a comment in this thread.

Now deleted, and Chicagoland created as a node. Changeset: 167320305 | OpenStreetMap. Thanks @Minh_Nguyen for creating the Wikidata entry.
