Benchmark NtsProvider vs Database provider

Topics: Algorithms, Data Access, General Topics, Italiano, SharpMap Project
Apr 4, 2014 at 10:22 AM
Edited Apr 4, 2014 at 10:34 AM
Hi all, with this post I just want to explain my method for handling data sources of up to a few hundred MB of data.
The image shows two forms; both are programs I wrote using this project.
The one on the left is a newer (beta) version, and the one on the right is an older version.

http://imageshack.com/a/img713/1043/8raf.png

The one on the right uses a database provider (SQL Server), while the one on the left does an in-memory load, reading the data from SQL Server and putting it into an NtsProvider.

In the map frame you can see the "commit time" in red. The NtsProvider took about 5 seconds to load the data from the data source, and of course RAM usage grew a little, but the refresh time shortened dramatically.
I think it's a great method for anyone who wants to speed up their application: the more geometries you load into the NtsProvider, the longer the "preload" takes and the more RAM is used, but the commit time barely changes; with a database provider, on the other hand, the commit time grows with the number of geometries in the data source.
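In code, the idea is roughly this (a minimal sketch only: the SqlServer2008 argument order and the NtsProvider(IProvider) constructor are written from memory, and the connection string, table and column names are placeholders):

using SharpMap.Data.Providers;
using SharpMap.Layers;

public static class MapSetup
{
    public static VectorLayer BuildLayer(string connStr)
    {
        // Database path: every map refresh queries SQL Server again.
        var dbProvider = new SqlServer2008(connStr, "Parcels", "Geom", "Id");

        // In-memory path: pay the one-off preload cost (the ~5 secs above),
        // then every refresh renders straight from RAM.
        var layer = new VectorLayer("Parcels");
        layer.DataSource = new NtsProvider(dbProvider);
        return layer;
    }
}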

I use this method with shapefiles too: with an in-memory load, I don't have to worry about file locks or anything else tied to the file itself.
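Continuing the sketch above (again assuming NtsProvider copies everything during construction):

// Wrap the shapefile once, then close it so nothing holds a lock
// on the .shp/.shx/.dbf files on disk (path is a placeholder).
var shp = new ShapeFile(@"C:\data\roads.shp");
var inMemory = new NtsProvider(shp); // copies all features into memory
shp.Close();                         // the file is no longer needed
layer.DataSource = inMemory;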

You can also take a look at my patch: 16065 - NtsProvider.cs
Editor
Apr 4, 2014 at 11:32 AM
Have you got any spatial indexes on the SQL table? Are they being used in the query?

I guess the main issue here is scalability. In theory, using a DB provider you get a slower render time, but it should scale well (the only variable being the render time).

Using an in-memory buffer, you get the improved render time (although this is still affected by the number of geometries rendered), but you will suffer degrading load times and will eventually run out of memory.

I think this method is good where you know there are going to be small numbers of geometries, but for unknown quantities it would be better to stick to more scalable solutions.
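To make that concrete (just a sketch, reusing the placeholder names from the sketch above; the cut-off is arbitrary), you could even choose the provider at runtime:

IProvider db = new SqlServer2008(connStr, "Parcels", "Geom", "Id");
db.Open();
int count = db.GetFeatureCount(); // typically a COUNT-style query on database providers
db.Close();
layer.DataSource = count < 500000
    ? (IProvider)new NtsProvider(db) // small table: buffer it in RAM
    : db;                            // large/unknown: query per refresh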
Apr 4, 2014 at 3:32 PM
Edited Apr 4, 2014 at 3:35 PM
Yes, I have a spatial index, but I used SQL Server just as an example (I work with MySQL, Oracle, SQL Server and PostgreSQL), and this is a whole table, not a query result.

I wanted to suggest a way to get small/medium maps faster. I also work with huge data sources of 50,000,000 records and more, but I don't use this method for those.

The example above loads about 2,500 geometries, but now I have run a better test.
I have an Oracle Spatial table with 72,858 records, a spatial index, column indexes, a primary key, a good server... all good conditions.

With an empty map, my program uses 32 MB of memory.

I loaded the data using the Oracle Spatial provider; it took 0 seconds of preload. With 0 geometries rendered, my program uses 70 MB of memory.
The GetMap function took 16 seconds to render ALL the geometries, and 13 seconds on each subsequent load (thanks to the Oracle cache). With all geometries rendered, my program uses 165 MB of memory.

I loaded the data using the method in my example; it took about 5 seconds of preload. With 0 geometries rendered, my program uses 235 MB of memory.
The GetMap function took 2 seconds. Memory usage is about 243 MB.

Now the results are more evident. I can push this up to 500,000 geometries, and I know this approach still works well.

I don't understand what you mean by "degrading load times and will eventually run out of memory": I load the whole table into the NtsProvider once. Obviously this can't cover every scenario; I expect to hit memory issues beyond 500,000 geometries. In fact, I use this method when I want to work in depth with a data source, so 70,000 geometries are already too many for me.

This example shows that interactions with the map (pan, zoom, select feature, highlight, get area, get scale, get height...) work far better with all geometries loaded in memory. I have also developed some drawing functions, and with the data source's latency they are not feasible: the database path is fast, but not fast enough.

Another example: on an intranet or internet connection, you will wait a long time every time you have to query the data source to redraw the map.
Editor
Apr 4, 2014 at 4:38 PM
I am basically agreeing with you! As you try to load more and more geometries into memory, your initial load time will increase, as will your memory usage, until you run out of memory!

In order to get the data out of your table you obviously need to query it! And if that query doesn't use the spatial index, it will not be as fast as it could be. Remember, with SQL Server, if you don't tell it to use the spatial index in the query, it won't use it by default.
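For illustration only (table, index and column names here are made up), an index hint looks like this in plain ADO.NET:

using System.Data.SqlClient;

static class SpatialQuery
{
    static void QueryWithIndexHint(string connStr)
    {
        // Without WITH (INDEX(...)) the optimizer will often choose a plan
        // that ignores the spatial index entirely.
        const string sql = @"
            SELECT Id, Geom.STAsBinary() AS Geom
            FROM dbo.Parcels WITH (INDEX(SIdx_Parcels_Geom))
            WHERE Geom.STIntersects(geometry::STGeomFromText(@bbox, 0)) = 1;";

        using (var conn = new SqlConnection(connStr))
        using (var cmd = new SqlCommand(sql, conn))
        {
            // Current view extent as WKT (placeholder coordinates).
            cmd.Parameters.AddWithValue("@bbox",
                "POLYGON((0 0, 0 1000, 1000 1000, 1000 0, 0 0))");
            conn.Open();
            using (var reader = cmd.ExecuteReader())
                while (reader.Read()) { /* materialize features here */ }
        }
    }
}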
Apr 5, 2014 at 11:28 AM
I know that, and I also know that if you don't use a spatial index you can't do spatial queries.
Editor
Apr 5, 2014 at 3:14 PM
You can do spatial queries without a spatial index; they just take longer...
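For example, dropping the hint from the query sketched earlier still executes; SQL Server just falls back to evaluating the predicate row by row:

// No index, no hint: still valid SQL, just a full scan under the hood.
const string sqlNoHint = @"
    SELECT Id FROM dbo.Parcels
    WHERE Geom.STIntersects(geometry::STGeomFromText(@bbox, 0)) = 1;";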