ShapeFile indexing problems (esp LineString geoms)

Topics: Data Access, Algorithms, SharpMap v0.9 / v1.x, SharpMap v2.0
Oct 11, 2007 at 8:36 AM

Does anyone know of any known issues with SharpMap's indexing of ShapeFiles? We have been trialing an integration of SharpMap into our GIS editor application using some reasonably large (85MB) shapefiles. The results we get appear incorrect in many cases.

In each case we open a ShapeFile data provider and use the GetGeometriesInView method to return geometries within a BoundingBox.

We've used release version 0.9 and Alpha 2.0, but in both cases get strange results.

Version 0.9:
- Point geometries appear to work perfectly. When integrated with our application, we pass SharpMap a BoundingBox representing the viewable area of the screen, and SharpMap correctly passes back geometries within this area for us to render.
- LineString geometries seem broken. When our application loads a LineString shapefile we get around the right number of results for the size of BoundingBox supplied, but many of these geometries (and their associated BoundingBoxes) are completely outwith the BoundingBox on the original query, and others which should definitely be included in the results are missing. It's almost as if the BoundingBox provided in the query is being offset in some way before it's applied to the data.

Version 2.0 (Alpha):
- Fairly inconsistent results. Again point geometries appear more stable, but even these are prone to breaking when we request all geometries within a smaller BoundingBox: SharpMap returns no results when there are definitely some within the BoundingBox provided to it. I recognise that it's possible the ShapeFile indexing isn't completely stable yet in this branch of the code.
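
One way to tell an index fault from a data fault is a brute-force cross-check: compute every feature's bounding box directly and compare the set that intersects the query box with the set the indexed query returned. The sketch below is plain Python and does not call SharpMap; `BBox`, `line_bbox` and `compare_with_brute_force` are hypothetical names, and it assumes the feature coordinates and the ids returned by the indexed query have already been exported from the application.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def intersects(self, other: "BBox") -> bool:
        # Two axis-aligned boxes intersect when they overlap on both axes.
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)


def line_bbox(points):
    """Bounding box of a line given as a list of (x, y) tuples."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return BBox(min(xs), min(ys), max(xs), max(ys))


def compare_with_brute_force(features, query, index_result_ids):
    """Compare an indexed query result against a full scan.

    features: dict mapping feature id -> list of (x, y) vertices.
    index_result_ids: ids returned by the indexed query (assumed exported).
    Returns (unexpected, missing): ids the index returned that do not
    intersect the query box, and ids it should have returned but did not.
    """
    expected = {fid for fid, pts in features.items()
                if line_bbox(pts).intersects(query)}
    returned = set(index_result_ids)
    return sorted(returned - expected), sorted(expected - returned)
```

If the unexpected ids all lie a roughly constant distance from the query box, that would support the offset theory; if they are scattered, the index entries are more likely pointing at the wrong records.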

Any knowledge / suggestions welcome!
Oct 14, 2007 at 2:43 AM
The v2.0 alpha code is quite old and broken.

2.0 Beta 1 was just released and spatial indexing is much improved and even well covered by unit testing.
Oct 15, 2007 at 12:14 PM

Many thanks for the link. I've built against the Beta1 assemblies today and it's looking much better.

There are a couple of bugs I've noticed which may or may not be known issues:
- Some of our shapefiles will not load, as Open()ing the ShapeFileProvider throws a ShapeFileIsInvalidException "Polyline found with 0 parts." It does seem a bit strange that some of our (client's) data has polylines without any vertices, but these are normally treated as valid data by ESRI software afaik so I guess SharpMap should support this too?
- When we try to Open() a ShapeFileProvider object which already has a spatial index built and saved to disk, SharpMap throws a SerializationException "End of Stream encountered before parsing was completed."
- I can also see an intermittent problem with MultiPolygon geometries where ExecuteGeometryIntersectionQuery misses features which should be within the BoundingBox supplied, and any polygons which are returned seem to have corrupted geometries. We are using our own rendering toolkit so this seems to be something low down in the data provider which is getting a bit confused. If I can isolate the problem more specifically I'll certainly add the details to this discussion.
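
To find which records trigger the "Polyline found with 0 parts" error, the .shp file can be scanned directly. The sketch below assumes the standard ESRI shapefile layout (a 100-byte main header, big-endian record headers with lengths in 16-bit words, little-endian record contents) and uses only Python's `struct` module; `find_empty_polylines` is a hypothetical helper, not part of SharpMap.

```python
import struct

POLYLINE = 3  # shape type code for PolyLine records


def find_empty_polylines(shp_bytes: bytes) -> list[int]:
    """Return record numbers of PolyLine records whose NumParts is 0."""
    if struct.unpack_from(">i", shp_bytes, 0)[0] != 9994:
        raise ValueError("not a .shp file (bad file code)")
    # File length is big-endian, in 16-bit words, at byte offset 24.
    file_len = struct.unpack_from(">i", shp_bytes, 24)[0] * 2
    end = min(file_len, len(shp_bytes))

    empty = []
    offset = 100  # records start right after the fixed-size main header
    while offset + 8 <= end:
        rec_no, content_words = struct.unpack_from(">ii", shp_bytes, offset)
        if content_words < 0:
            raise ValueError(f"record {rec_no}: negative content length")
        content_start = offset + 8
        content_len = content_words * 2
        # A PolyLine needs at least: type (4) + box (32) + NumParts (4)
        # + NumPoints (4) = 44 bytes of content.
        if content_len >= 44 and content_start + 44 <= end:
            shape_type = struct.unpack_from("<i", shp_bytes, content_start)[0]
            if shape_type == POLYLINE:
                num_parts = struct.unpack_from(
                    "<i", shp_bytes, content_start + 36)[0]
                if num_parts == 0:
                    empty.append(rec_no)
        offset = content_start + content_len
    return empty
```

Usage would be along the lines of `find_empty_polylines(open(path, "rb").read())`, where `path` is the .shp file in question. The resulting record numbers can then be checked in other GIS software to see how those features are treated there.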
Oct 15, 2007 at 1:50 PM
... I now have a way of consistently reproducing the polygon data provider problem above.

If I load a MultiPolygon shapefile, and then load a Point shapefile afterwards, the MultiPolygon datasource becomes corrupted in the way described above: ExecuteGeometryIntersectionQuery misses out features, and those it does return have incorrect geometries.

I don't see the problem if all the Point shapefiles being used are opened before the MultiPolygon shapefile - weird!
Oct 15, 2007 at 2:31 PM
...and I've also just noticed that the same thing will happen to LineString files if a Point file is opened after them. So to summarise, I can only open multiple shapefiles without them corrupting one another by ensuring that I open them in a strict Point, Line, Polygon order.

It would be an interesting test to open two line or polygon shapefiles which have features intersecting each other and see if the data providers handle this too.
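
The load-order dependence described above could be checked systematically rather than by hand. The following is a minimal sketch, assuming a hypothetical callback `query_all(order)` that opens the named shapefiles in the given order, runs the same bounding-box query against each, and returns a dict of file name to feature ids; nothing here is SharpMap API. Each file's result when loaded alone serves as the baseline.

```python
from itertools import permutations


def find_order_dependent_files(names, query_all):
    """Report load orders in which any file's query result changes.

    names: list of shapefile names (labels only; opening is up to query_all).
    query_all: hypothetical callback, order -> {name: iterable of feature ids}.
    Returns a list of (order, changed_names) for every failing order.
    """
    # Baseline: what each file returns when it is the only one loaded.
    baseline = {name: frozenset(query_all((name,))[name]) for name in names}

    failures = []
    for order in permutations(names):
        results = query_all(order)
        changed = sorted(n for n in names
                         if frozenset(results[n]) != baseline[n])
        if changed:
            failures.append((order, changed))
    return failures
```

With three files this runs only six orders, so it is cheap enough to repeat against each build. Adding a second line or polygon file to `names` would also cover the intersecting-features case.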
Oct 15, 2007 at 4:06 PM
Thanks for the great feedback, Fkp.

I'll convert this to a work item and flag it for the next interim release of v2.0.
Oct 15, 2007 at 4:07 PM
This discussion has been copied to a work item. Click here to go to the work item and continue the discussion.