Improving OpenStreetMap shop coverage with AllThePlaces

Thanks!

The mismatch may be due to distance, and is definitely due to the tags:

shop=builders_merchant vs shop=doityourself

  1. Should we document shop=builders_merchant (see Taginfo), or retag shop=builders_merchant to one of the existing tags?

  2. Should I consider shop=builders_merchant and shop=doityourself as matching? If yes, what else may be worth matching?

shop=interior_decoration vs shop=houseware

  1. Is ATP classifying Dunelm badly? Should it be shop=houseware in ATP? Or maybe OSM is classifying it badly?

  2. shop=interior_decoration and shop=houseware should be considered as matching… What else belongs in this group?
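
If such pairs end up being treated as matching, one simple way to express it is a small table of equivalence groups. This is only a sketch: `EQUIVALENT_SHOP_VALUES` and `shop_values_match` are made-up names, and the groups hold just the two pairs mentioned above (whether they really should match is still the open question).

```python
# Hypothetical equivalence groups; contents are only the pairs discussed
# above and are an assumption, not an agreed tagging decision.
EQUIVALENT_SHOP_VALUES = [
    {"builders_merchant", "doityourself"},
    {"interior_decoration", "houseware"},
]


def shop_values_match(osm_value: str, atp_value: str) -> bool:
    """Return True if two shop=* values are equal or share a group."""
    if osm_value == atp_value:
        return True
    return any(
        osm_value in group and atp_value in group
        for group in EQUIVALENT_SHOP_VALUES
    )
```

Extending the list is then a matter of adding values to a group, which keeps the decision about "what else belongs in this group" in one reviewable place.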

Yes, based on distance alone it will be treated as separate and not matching.

One of the problems here is that the same brand:wikidata may be used for both, say, a supermarket and a fuel station (Tesco).

Or the same one for a fuel station, a convenience shop and a parcel locker (Orlen).

Also, I tried to use Wikidata codes once: you end up in a rabbit hole of some brands having separate Wikidata entries for sub-brands and some not. Sometimes brand:wikidata links to a dead company's entry, sometimes to a dedicated brand entry. And often to multiple at once.

More importantly, I have not yet found a case where brand:wikidata matching would improve matching.

For the UK, using postcodes is likely helpful, but it does not work well for global coverage, where postcodes are tagged fairly rarely or in various formats.

And for global processing, adding missing brand:wikidata as part of making this tool is just not feasible at all.

(one of the prices paid for having a global tool - dedicated localized ones are likely to remain better)

I have a major TODO for matching: also consider the website tag.
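
For comparing website tags, the values would probably need normalizing first, as the same page is often written with or without scheme, `www.` or a trailing slash. A rough sketch, assuming that ignoring scheme, `www.` prefix, query string and trailing slash is acceptable (`normalize_website` is a made-up helper, not something the tool has):

```python
from urllib.parse import urlsplit


def normalize_website(url: str) -> str:
    """Reduce a website value to 'host/path' for loose comparison.

    Assumption: scheme, 'www.' prefix, query string, fragment and
    trailing slash are irrelevant for deciding that two POIs match.
    """
    url = url.strip()
    if "://" not in url:
        # urlsplit only fills netloc when a scheme separator is present
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host + parts.path.rstrip("/")
```

For example, `normalize_website("https://www.Example.com/stores/leeds/")` and `normalize_website("example.com/stores/leeds")` both give `example.com/stores/leeds`.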

I will try to reduce this confusion, but I am not yet sure how.

But it will likely include customized distance thresholds for when a POI is considered as gone (for some spiders it should be 15 m, for some 1500 m).
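
Per-spider thresholds could look roughly like this. It is a sketch under assumptions: the spider names and the default value are invented for illustration (only the 15 m and 1500 m figures come from above), and the distance is a plain haversine great-circle distance.

```python
import math

# Hypothetical spider names and default; only 15 m and 1500 m are from
# the discussion above.
DEFAULT_GONE_THRESHOLD_M = 300.0
GONE_THRESHOLD_M = {
    "spider_with_precise_locations": 15.0,
    "spider_with_rough_locations": 1500.0,
}


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres on a sphere of radius 6371 km."""
    radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def is_considered_gone(spider: str, nearest_candidate_m: float) -> bool:
    """True if the nearest candidate is beyond this spider's threshold."""
    threshold = GONE_THRESHOLD_M.get(spider, DEFAULT_GONE_THRESHOLD_M)
    return nearest_candidate_m > threshold
```

With this, a candidate 100 m away would count as gone for a spider with a 15 m threshold but not for one with a 1500 m threshold.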

The next regeneration of data will be published with a lower distance threshold for when an object is considered as missing. If this goes well I will reduce it further, maybe eliminating the gray area completely.

Or maybe it should be listed as a separate category?


I just drafted something that should report it as a separate category, let's see how well it will work.
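
The idea of a separate category can be sketched as a three-way classification by distance. This is not the drafted code, just an illustration: the category names and both thresholds are assumptions.

```python
from typing import Optional


def classify_by_distance(
    nearest_candidate_m: Optional[float],
    match_threshold_m: float,
    missing_threshold_m: float,
) -> str:
    """Classify a POI by the distance to its nearest candidate object.

    Assumes match_threshold_m <= missing_threshold_m. Setting both to the
    same value removes the gray area entirely.
    """
    if nearest_candidate_m is None:
        return "missing"  # no candidate found at all
    if nearest_candidate_m <= match_threshold_m:
        return "matched"
    if nearest_candidate_m <= missing_threshold_m:
        return "gray_area"  # away, but not very far away
    return "missing"
```

Lowering `missing_threshold_m` towards `match_threshold_m` shrinks the gray area, which corresponds to reducing the threshold step by step.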

For now it is kind of blocked by my misuse of several tools in my tech stack :slight_smile:

One of the funniest parts is that committing the multitude of generated static files for publication takes multiple hours.

I will try a gray-area listing of objects that are away, but not very far away. If that does not explode processing, I will also try listing successes.

In such a case both will match the same object without the problem being spotted.

I considered doing this, but ATP data in general is not good enough for that on a global scale. Too many shops are on the wrong continent or misplaced by several kilometres.

It is still in my vague plans, but I am not planning to do it any time soon. Maybe for a few spiders with known excellent quality?