GeometryProvider.cs vs GeometryFeatureProvider.cs and a few other comments.

Topics: Data Access, SharpMap Project, SharpMap v0.9 / v1.x
Mar 23, 2008 at 11:55 PM
What is the reason of having both of these classes in 0.9? GeometryFeatureProvider.cs seems to implement access to add in memory access to attributes via FeatureDataTable and GeometryProvider is just incomplete version of it. It will be nice to have them merged into one class only, something like InMemoryFeatureProvider.cs with corresponding in-memory geometries.
Developer
Mar 24, 2008 at 7:55 AM
Good point. I am not sure though if we get to this kind of refactoring in v1 before we make the jump to v2.

Mar 24, 2008 at 6:30 PM
I looked on v2 sources - Data Access part seems to be much more complicated and unfinished compare to v0.9 and I'm not sure if all interfaces / classes there are really required. Would be nice to have some kind of design document / wiki page(s) describing and discussing all parts of the library:

1. Geometries
2. Features
3. Map
4. Rendering
5. Presentaiton
6. Data

Short look on Data:

v0.9:


IProvider.cs - provider of geometry and feature data row (why not IFeature?). Used by VectorLayer and LabelLayer AbcProvider.cs - implementation / factory of Geometries based on specific storage

Straightforward and intuitive, but incomplete - no support for rasters/grids, not sufficient support for features/attributes, etc.

v2.0:


30 classes and 7 interfaces in Data/:

... is it really necessary to increase complexity so much in v2.0? Below short look on interfaces only:

IFeatureDataReader.cs
  • why it is IFeatureDataRecord?
  • isn't provider sufficient to read features? FeatureProvider can be also perfectly implemented to provide fast sequential access to all Features as a list.
IFeatureDataRecord.cs
  • seems to be like IFeature.cs
  • GetOid, HasOid, OidType - why not just public int Id { get; } instead of all of them
  • IsFullyLoaded - IsItNeededAtAll?
IFeatureLayerProvider.cs
  • why Layer provider. Isn't it the same as FeatureProvider?
  • Layer looks more like a map-related entity why it should be provided by someone? Usually we have external storage containing Features, their Geometries and Attributes.
IFeatureLayerProvider`1.cs
  • why to worry about different types for Oids, probably having them as integers should be sufficient for the beginning.
*use .NET class library design guidelines for generic classes instead of `1, `2. I'd add additional namespace Generics/ to keep generic classes / interfaces like in .NET class library.
ILayerProvider.cs
  • properties / methods there have nothing to do with layer, by looking on the interface it does seems to provide any layer ;)
  • it is not nice to use interfaces to implement polymorphisms for different types of LayerProviders and I'm not sure if there is need to have different providers (currently there are 5 IAbcProvider interfaces).
  • provide Features (and their Geometries, Attributes) instead of Layers.
IRasterLayerProvider.cs
  • why it returns Stream and not something like Image?
  • can't Raster (or Grid) be just another type derived from Geometry (and GeometryCollection)? E.g. for Rectangular grid it may be Feature with grid properties as attributes and Image or cell polygons, boundaries as Geometry. In this case IFeatureProvider will be sufficient to read it. Concrete implementation of FeatureProvider will only need to create corresponding type of Image / Grid geometry.
IWritableFeatureLayerProvider.cs
  • why not to use only IFeatureProvider interface? Or IFeatureStore (e.g. Repository pattern then it can be used to provide and to store features, I guess provider name was chosen only because sharp map was originally designed as read-only map engine).

Many questions :). Please do not consider any of the above questions too harsh, I think that you're doing very good and important job with SharpMap and I'm happy that project keeps going.

--

I'd like to contribute to SharpMap development, is there any IRC meeting planned in the nearest feature? I'd start with contribution to the Data package.



Developer
Mar 24, 2008 at 9:38 PM
Thanks for your remarks. Feedback can only make it better, that is one of the advantages of open source.

We hang in IRC most of the time, so drop by. Meetings are at wednesday 20 UTC, but do not have a very formal nature.
Coordinator
Mar 25, 2008 at 7:01 AM
Edited Mar 25, 2008 at 7:02 AM
@gvd:

I'm glad you are getting engaged in looking at v2.0 sources. It's one of the strengths of open source, and the value of the software mostly collapses when peer review isn't present.

Probably the best way to address your points is one at a time. However, to give you a better handle on the context of the refactoring in the SharpMap.Data namespace consider that a few principles were followed. Some of these might appear to fly in the face of the current trend of agile methodology, but consider that one of the fundamental principles of agile - YAGNI (you ain't gonna need it), on which you appear to be relying to view the general direction of the v2.0 changes - applies weakly to libraries like SharpMap. Libraries are more general purpose, and so aspects which aren't needed by you, are needed by others. The principles are:

  1. Keep it simple, yet well-factored. Think MapObjects vs. ArcObjects.
  2. If a more complex implementation allows a simpler default usage pattern, choose it over a simpler implementation. Generics allow this, but designing a library around them is harder.
  3. Allow incremental support of new SharpMap features by data providers. Some providers will still work fine read-only.
  4. Be implementation agnostic. Don't rely on a specific implementation of a concept, such as System.Drawing primitives, ADO.Net, etc.
  5. Make sure there is a completely implemented provider to examine the design in real-world usage.

These are probably not all of them, but the ones which are most readily apparent. Also note that we haven't completely followed this yet (the reliance on System.Data.DataTable for example).

Now, on to the specifics. Please, bear in mind that while I've tried to carefully think through these matters as I developed v2.0, I am aware that they may not be the best way to do it. I really hope that you don't take these answers as the final statement, but rather as the best understanding of the subject, as far as current developers are aware. Where you disagree and can provide a well-reasoned and researched set of alternatives, they will be happily examined and considered, and adopted if there is generally positive support.

IFeatureDataReader.cs

why it is IFeatureDataRecord?

Because it makes looping through results in a while loop easier. ADO.Net does this, too. It supports an easy syntax for library users - one which is immediately familiar to those who have worked with ADO from 1999 on.

isn't provider sufficient to read features? FeatureProvider can be also perfectly implemented to provide fast sequential access to all Features as a list.

Not exactly sure what you mean here, but there is no way that you can read all data from any given data source into a list. Feature data being what it is, from simple file formats to databases to web feature services, allowing both fast read access and updatability, a list is not a data structure which allows the kind of flexibility you need.

IFeatureDataRecord.cs

seems to be like IFeature.cs

Where is this? I can't find this interface in SharpMap v2.0.

GetOid, HasOid, OidType - why not just public int Id { get; } instead of all of them?

What if Id isn't an int? What if it is a Guid or a String? What if a feature dataset doesn't have an id for a given feature (such as is the case for pure geometries).

IsFullyLoaded - IsItNeededAtAll?

Yes. If you look more closely at the in-memory model of FeatureDataTable, you'll see that there are several options you can use when issuing a query to a feature provider. These options control the degree of lazy loading from the datasource. When you are asynchronously loading hundreds of megabytes of feature data, not loading it all upfront can come in handy.

IFeatureLayerProvider.cs

why Layer provider. Isn't it the same as FeatureProvider?

No. Not all layers are feature layers. However, all layers have some properties and behaviors in common. Thus ILayerProvider.

Layer looks more like a map-related entity why it should be provided by someone? Usually we have external storage containing Features, their Geometries and Attributes.

Not sure what you mean here exactly, but a Layer can contain either feature data or raster data in SharpMap v2.0. We model this with a base Layer class, and specialize via FeatureLayer and RasterLayer, and further specialize FeatureLayer into LabelLayer and GeometryLayer. In SharpMap, a Layer object refers to a specific set of data, which is accessed via a data provider (an instance of a type which implements ILayerProvider). If you are objecting to the name - ILayerProvider - perhaps ILayerDataProvider would be better?

IFeatureLayerProvider`1.cs

why to worry about different types for Oids, probably having them as integers should be sufficient for the beginning.

Perhaps, but this is v2.0, not the beginning. I've got some features with String OIDs and some with Guid OIDs. I also didn't want to just make the Id column an Object due to performance (indexing costs were high with boxing and heap allocation).

use .NET class library design guidelines for generic classes instead of `1, `2. I'd add additional namespace Generics/ to keep generic classes / interfaces like in .NET class library.

Where is this guideline from? I've seen it both ways, even in the .Net Base Class Library. The tags `1 etc. apply only to the source files, so everyday users won't see it. I've personally always thought that having a separate "Generics" namespace was a bit cumbersome, but there are advantages (fewer name collisions, for example). I'm not sure I'd want to change namespaces for v2.0 this late in the game.

ILayerProvider.cs

properties / methods there have nothing to do with layer, by looking on the interface it does seems to provide any layer

it is not nice to use interfaces to implement polymorphisms for different types of LayerProviders and I'm not sure if there is need to have different providers (currently there are 5 IAbcProvider interfaces). provide Features (and their Geometries, Attributes) instead of Layers.

I think you misunderstand ILayerProvider. It doesn't provide Layers. It provides layer data. Yes, I think it is a bad name. No, I never had it reviewed - consider yours the first. Bad names happen when I refactor repeatedly without a partner. I'm hesitant to change it, but it's probably easy to show how it would be a good move in the long run. Care to weigh in further?

To address feature providers: each layer (or layer group) needs a data provider. The data will either be feature data or raster data, and the layer will use either an IFeatureLayerDataProvider or an IRasterLayerDataProvider instance, respectively.

IRasterLayerProvider.cs

why it returns Stream and not something like Image?

System.Drawing.Image or System.Windows.Media.Image? We want to make it implementation agnostic. The rendering layer takes care of this anyway, and users just need to pick a renderer, not deal with byte streams. SharpMap developers, however, get to deal with byte streams. I think they are capable of this.

can't Raster (or Grid) be just another type derived from Geometry (and GeometryCollection)?

No. These two are completely different types of data, and need to have special handling. We can't just abstract away the differences. There are numerous academic and industry papers and books dealing with the disparity between these two formats, and an extremely large body of work which examines the particular aspects of each of these two fundamental GIS data types. It would be cavalier indeed to ignore all of this learning, and a disservice to users trying to access GIS via SharpMap, since the rest of the GIS world sees vector and raster data as different.

IWritableFeatureLayerProvider.cs

why not to use only IFeatureProvider interface? Or IFeatureStore (e.g. Repository pattern then it can be used to provide and to store features, I guess provider name was chosen only because sharp map was originally designed as read-only map engine).

Because I don't want to force providers to implement write methods if they don't want to or cannot. It wouldn't make sense to have a method which indicates that you can update a data source when in fact you can't. Having this as a separate interface also makes sense when you are reasoning about the layer providers in a collection.

I chose a provider model pattern with an in-memory table pattern vs. a repository pattern because that is what most people who will use this library are used to. They either come from ADO.Net which implements the same patterns, or from VB and Map/Arc Objects, which had/has a similar interface. Plus, it works well. I think it's the right hammer for the job, currently.

I really do hope that you do become a developer in the SharpMap effort: your ability to question the design choices shows the knowledge and insight which would help this project. Along with your great participation in the forums, please consider contributing patches and/or demo apps so that we can make a fair decision.

(Update: fixing sentence construction)