Large Shapefile

Sep 28, 2006 at 3:08 PM
I am trying to load a large shapefile (about 1.5 GB) into my SharpMap application, and the application throws an exception saying it has 'run out of memory'. Is there any way to stop the application loading the whole layer into memory at once? Is there a way of splitting the shapefile, along the lines of seamless tables in MapInfo?



Sep 28, 2006 at 8:21 PM
I think it's the spatial index, not the shapefile itself, that is fully loaded into memory, and in this case it must be too large. I suggest you try PostGIS and shp2pgsql. I wouldn't recommend MsSqlSpatial for this task (importing very large shapefiles) yet, because it uses the same SharpMap code for loading the shapefile's index into memory, but as soon as this changes I will let you know in this thread.
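For reference, a rough sketch of what that route could look like in your map setup code (the shp2pgsql arguments, connection string, table and column names below are placeholders, and the PostGIS provider's constructor may differ between SharpMap builds):

    // Import once from the command line, for example:
    //   shp2pgsql -s 4326 -I roads.shp public.roads | psql -d gisdb
    // Then point SharpMap at the PostGIS table instead of the .shp file:
    SharpMap.Layers.VectorLayer layer = new SharpMap.Layers.VectorLayer("Roads");
    layer.DataSource = new SharpMap.Data.Providers.PostGIS(
        "Server=localhost;Port=5432;User Id=gis;Password=gis;Database=gisdb;",
        "roads",     // table name
        "the_geom",  // geometry column
        "gid");      // object id column

That way the spatial index lives in the database, and only the features a query needs should be pulled into the client process.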
Sep 29, 2006 at 3:18 PM
Actually, it is building the spatial index that makes it run out of memory. Once the index is built, this shouldn't be an issue. Perhaps codekaizen's new spatial index doesn't have this problem?
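For completeness, this is the constructor flag that controls whether the provider persists its index next to the shapefile, so the expensive build at least only happens once (a sketch, assuming the fileBasedIndex flag still behaves the way I remember):

    // With fileBasedIndex = true the provider should write a .sidx file beside the
    // .shp and reuse it on later loads instead of rebuilding the index each time.
    SharpMap.Layers.VectorLayer layer = new SharpMap.Layers.VectorLayer("BigLayer");
    layer.DataSource = new SharpMap.Data.Providers.ShapeFile(@"C:\data\big.shp", true);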
Oct 11, 2006 at 8:47 PM
It's possible, but the index is still loaded entirely into memory. I rewrote the BoundingBox class to be a struct (a value type) so that memory pressure is reduced and speed increased, since the boxes are stored inline rather than as separate objects on the managed heap.

You could give it a shot... I'm interested to know how it goes. Since the index is dynamic, it is possible that it could be adapted to large shapefiles by indexing features in progressively refined nodes...
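
Roughly the layout difference I'm after, as a simplified sketch (the field names are illustrative, not the actual members of SharpMap's BoundingBox):

    // Value-type envelope: stored inline wherever it is used.
    public struct BoundingBox
    {
        public double MinX, MinY, MaxX, MaxY;   // 4 * 8 = 32 bytes per box
    }

    public static class LayoutDemo
    {
        public static void Main()
        {
            // An array of 100,000 value-type boxes is a single contiguous ~3.2 MB
            // block, with no per-object headers and no references to chase.
            BoundingBox[] boxes = new BoundingBox[100000];
            System.Console.WriteLine(boxes.Length);

            // If BoundingBox were a class, the same data would be 100,000 separate
            // heap objects (each with an object header) plus 100,000 references.
        }
    }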
Oct 11, 2006 at 8:50 PM
Ooh, or based on a filter (such as an extent filter or attribute filter).

Interesting stuff... I'd like to get my hands on a 1.5GB shapefile.
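
Something along these lines, purely hypothetical (FeatureFilter and FilteredIndexing are made-up names for illustration, not existing SharpMap API, and BoundingBox is the struct sketched above):

    // Hypothetical sketch: a caller-supplied filter decides which features even
    // enter the spatial index, e.g. an extent test or an attribute test.
    public delegate bool FeatureFilter(BoundingBox extent);

    public static class FilteredIndexing
    {
        // Returns only the extents that pass the filter; the index would then be
        // built from this subset instead of from every record in the shapefile.
        public static System.Collections.Generic.List<BoundingBox> Select(
            System.Collections.Generic.IEnumerable<BoundingBox> allExtents,
            FeatureFilter filter)
        {
            System.Collections.Generic.List<BoundingBox> kept =
                new System.Collections.Generic.List<BoundingBox>();
            foreach (BoundingBox extent in allExtents)
            {
                if (filter(extent))
                    kept.Add(extent);
            }
            return kept;
        }
    }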
Oct 19, 2006 at 5:12 AM

Are your changes checked in and downloadable?

I'd love to have a look at a faster Shapefile implementation.

Thanks :)
Oct 19, 2006 at 5:27 PM
They aren't checked in due to the need to restructure the project to allow different branches of development, but you can get the code in the 2.0 release section here:
Nov 10, 2006 at 3:21 AM
To codekaizen:

If you changed the BoundingBox class to a struct, have you evaluated the performance loss caused by boxing and unboxing?
Nov 13, 2006 at 6:54 PM
There really isn't much boxing/unboxing. Both 2.0 generics and careful marshalling see to that.
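
For instance, just a sketch of the difference, reusing the BoundingBox struct sketched earlier in the thread:

    public static class BoxingDemo
    {
        public static void Main()
        {
            // .NET 1.x style: adding a struct to an ArrayList boxes it onto the heap.
            System.Collections.ArrayList oldStyle = new System.Collections.ArrayList();
            oldStyle.Add(new BoundingBox());              // boxing allocation
            BoundingBox a = (BoundingBox)oldStyle[0];     // unboxing copy

            // .NET 2.0 generics: List<T> keeps the structs inline in its internal
            // array, so neither Add nor the indexer boxes anything.
            System.Collections.Generic.List<BoundingBox> newStyle =
                new System.Collections.Generic.List<BoundingBox>();
            newStyle.Add(new BoundingBox());              // no boxing
            BoundingBox b = newStyle[0];                  // plain value copy

            System.Console.WriteLine(a.MinX + b.MinX);
        }
    }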
Nov 14, 2006 at 2:41 AM
Oh, I didn't know much about the new generics features in 2.0. But, from a C++ point of view, the size of a class instance depends only on its member variables and the table that holds the addresses of its virtual functions.

In .NET, all value types are allocated on the stack except for arrays of value types, and all reference types are allocated on the heap. Actually, you can't determine the size of an instance of a reference type or a struct with the sizeof operator at all in .NET.

So, I am wondering how you can know that the value-type envelope saves more memory than the previous version?
Nov 14, 2006 at 3:10 PM
Because the memory problem is not with the envelope or the tree, but with how the tree is built (as soon as the tree is built, the memory problem is resolved). The new index uses a different approach to create the tree.
Nov 14, 2006 at 5:25 PM
That's spot on.

BTW, is there a size limit for shp files in ArcMap?
Nov 23, 2006 at 2:05 AM
There might be, but the specification doesn't say anything about this. I would recommend using the new file-based geodatabase that ArcMap 9.2 has instead, though. As far as I recall, the limit there is 2TB.