Forest import

smartsmartie · October 22, 2020, 12:43pm

Hi!

I’m quite new to OSM and read about this import yesterday. I’d like to quickly point out a way to use the data that the community might appreciate. Sadly, I won’t have time to do this myself, so this is just meant as a quick idea for anyone interested.

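Firstly, I agree that this whole import needs to be reverted, because what you did, posiki, amounts to overfitting your data (https://en.wikipedia.org/wiki/Overfitting): the imported shapes carry far more points than the accuracy of the underlying data justifies.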

But in my opinion, such a large amount of data is a huge opportunity to improve OSM if it is used in the right way. Such an import should not interfere harmfully with past or future edits by the community. After reverting the first import, you could follow these steps:

Vectorization of the pixel forest data: Initial forest patches just as you did, posiki.
Creation of new forest patches based only on new points at the center of each line segment of the initial forest patches. This way, you effectively average every two neighboring points. This eliminates most of the pixel structure without increasing the number of points. And you avoid overfitting: you use a minimum number of points to represent the forest shape at the accuracy of the underlying data set. This shows the accuracy of the data to future mappers and makes future improvements as easy as possible, because they need to move only a few points for corrections.
Only cut out highways of type tertiary and above. This way, you avoid pseudo-mapping small roads and paths as thin blank lines in the forest, as you did before. Thus, future mappers won’t accidentally merge your forest points with their new highway points.
Cut out all existing OSM areas except forest with a 3m buffer (buildings, landuse, lakes, sports grounds etc.). Cut out existing forest without a buffer. This way, you avoid sharing points with any existing OSM object except forest. Thus, future mappers can change these objects without first having to detach them from your forest.
Deletion of all forest patches below 2300m². The LUKE data set’s pixels are sized 16m x 16m. So you should at least delete all patches smaller than 3x3 pixels (48m x 48m, approx. 2300m²) to avoid messy small forest patches.
Test import and visual inspection at randomly chosen areas.
Community Discussion. Adding additional image processing steps if needed.
Full import.
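
To make the midpoint step and the minimum-area step more concrete, here is a rough Python sketch. It is only an illustration, not a tested import script: the function names are made up, each patch outline is assumed to be a plain list of (x, y) points in metres in a projected coordinate system, and the closing point is assumed not to be repeated. Whether the area is measured before or after the midpoint step is left open here; the sketch simply measures whatever outline it is given.

```python
# Hypothetical helpers; names and data layout are assumptions for illustration.

PIXEL_SIZE_M = 16.0                      # LUKE pixel edge length in metres
MIN_AREA_M2 = (3 * PIXEL_SIZE_M) ** 2    # 3x3 pixels = 48m x 48m = 2304 m²


def midpoint_ring(ring):
    """Return a new outline made of the midpoints of each edge of `ring`.

    The number of points stays the same; each new point is the average
    of two neighboring points of the original outline.
    """
    n = len(ring)
    return [
        ((ring[i][0] + ring[(i + 1) % n][0]) / 2.0,
         (ring[i][1] + ring[(i + 1) % n][1]) / 2.0)
        for i in range(n)
    ]


def ring_area(ring):
    """Area of a simple closed outline via the shoelace formula."""
    n = len(ring)
    total = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def keep_patch(ring, min_area=MIN_AREA_M2):
    """True if the patch is at least as large as the minimum area."""
    return ring_area(ring) >= min_area
```

For example, `keep_patch([(0, 0), (32, 0), (32, 32), (0, 32)])` rejects a 2x2 pixel patch, while a 3x3 pixel square passes. The buffer and highway cut-outs would need a real geometry library and are not covered by this sketch.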

In my opinion, OSM needs to welcome and professionally process external data to keep up with e.g. AI-based map services in the future. That’s why I want to thank you, posiki, for caring about importing this data. I hope I could help a little with this.

Cheers,
smartsmartie