Re: Opinion Wanted - How to Expose a Collection



On Sun, 24 May 2009 12:23:31 -0700, jehugaleahsa@xxxxxxxxx <jehugaleahsa@xxxxxxxxx> wrote:

Hello:

The "gurus" out there suggest being very careful about how you
expose collections in your interface. If you do not intend the users
of your class to alter a collection, you must make sure that they
can't.

Seems to me that the above statement says it all. It's all about the requirements. If you have a requirement that the users of your class not alter a collection, either by modifying the collection instance or by providing a different instance, then you need to enforce that somehow. But that's not always the requirement.

Here are some common implementations for returning a collection and
some of their pros and cons. Take a look and tell me what you
typically do and which one you think is the best practice.

There is no "best practice". It depends entirely on the context.

1) Return the collection directly - Directly expose the collection in
the interface. If the collection is created in the property or method,
altering the collection won't likely affect the class. If the
collection is a member of the class, changes to it could invalidate
the state of the class. Future versions of the interface will be
required to expose the functionality provided by the collection (to be
backward compatible).

This approach doesn't meet a requirement that the collection cannot be changed. (I'm not counting the comment above about "if the collection is created in the property or method"; that's clearly an entirely different situation and doesn't really belong in this paragraph.)

So obviously you'd only use it in a scenario where it makes sense for the collection to be modified by the client code.

Note that there are two variations on the theme: a read-only property, where the collection itself may be modified, but where the client cannot replace the collection instance itself; and a read/write property where the collection as well as the specific instance used may be modified.

You'll find various examples of each in .NET, all used in situations where the requirement isn't that the client may not change the collection.
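
For illustration, here is a minimal sketch of those two variations. The class and member names (Playlist, Settings, Tracks, SearchPaths) are hypothetical and not taken from any real API:

```csharp
using System.Collections.Generic;

// Hypothetical types, for illustration only.
public class Playlist
{
    private readonly List<string> _tracks = new List<string>();

    // Variation 1: read-only property. Callers can add or remove items,
    // but cannot swap in a different List<string> instance.
    public List<string> Tracks
    {
        get { return _tracks; }
    }
}

public class Settings
{
    private List<string> _searchPaths = new List<string>();

    // Variation 2: read/write property. Callers can modify the list
    // and can also replace the instance entirely.
    public List<string> SearchPaths
    {
        get { return _searchPaths; }
        set { _searchPaths = value; }
    }
}
```

Either form is only appropriate when the client is meant to change the collection.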

2) Return a copy of the collection - Expose a copy of the collection.
The copy would provide all of the same functionality as the
collection, but changes to it couldn't affect the state of the class.
This could lead to a lot of overhead if the collections are large or
complex. Your interface will be forced to expose the collection in
future versions.

This overhead is required if you need "absolute" safety. The only way to guarantee that the client code can't modify your collection is to not let it see it in the first place. Note that even in that case, nothing precludes the client code from using reflection to discover your collection and violate whatever contract you've stipulated. But at least if they do that, you know they didn't get it from the instance you returned from the property. :)
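
A rough sketch of the copy approach, assuming a hypothetical Roster class. It is written as a method rather than a property, on the assumption that an O(n) copy on every call shouldn't look like a cheap property read:

```csharp
using System.Collections.Generic;

// Hypothetical type, for illustration only.
public class Roster
{
    private readonly List<string> _names = new List<string> { "Ann", "Bob" };

    public int Count
    {
        get { return _names.Count; }
    }

    // Every call allocates and fills a new list. The caller can do
    // anything to the result without touching _names.
    // Note this is a shallow copy: if the elements were mutable
    // reference types, the element objects would still be shared.
    public List<string> GetNames()
    {
        return new List<string>(_names);
    }
}
```

Clearing or sorting the returned list leaves the Roster's own list untouched.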

3) Return a read-only wrapper around the collection - Expose a read-
only collection. This prevents modification. It has minimal overhead.
However, modifications to the collection may cause runtime errors.
You're still exposing it in your interface.

This is my preference when I want to return a read-only collection implementing the IList<T> interface. In particular, I create an instance of ReadOnlyCollection<T> and return that. You're right...errors caused by misuse are delayed until run-time, but when you're dealing with the built-in .NET interfaces, you don't have much choice about that. IList<T> offers indexed access to the collection, which is sometimes what you need, but there's no read-only version (i.e. one with a read-only indexer).

Even if you added a custom read-only IReadOnlyList<T> interface, you can't retroactively make existing .NET classes implement it. Wrapping the class is the only available option. Of course, you can always wrap an existing class in a custom read-only interface implementation, but then you run into potential design questions if you then need to pass that implementation back to external code somewhere (e.g. sure, IReadOnlyList<T> could implement IList<T> and then throw an exception if an attempt to mutate the instance occurred, but then how's that any better than just using ReadOnlyCollection<T>?).
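
To make the run-time-versus-compile-time point concrete, here is a small sketch using ReadOnlyCollection<T> behind an IList<T> property. Inventory and InventoryDemo are hypothetical names for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical type, for illustration only.
public class Inventory
{
    private readonly List<string> _items = new List<string>();
    private readonly ReadOnlyCollection<string> _readOnlyItems;

    public Inventory()
    {
        // The wrapper holds a reference to _items, not a copy, so it
        // always reflects the current contents at no extra cost.
        _readOnlyItems = new ReadOnlyCollection<string>(_items);
    }

    public void Add(string item)
    {
        _items.Add(item);
    }

    // Indexed access for callers, typed as IList<string>.
    public IList<string> Items
    {
        get { return _readOnlyItems; }
    }
}

public static class InventoryDemo
{
    // Returns true if the wrapper rejects a mutation attempt.
    public static bool AddIsRejected()
    {
        var inventory = new Inventory();
        inventory.Add("widget");
        try
        {
            inventory.Items.Add("gadget"); // compiles, because IList<T> declares Add
            return false;
        }
        catch (NotSupportedException)
        {
            return true; // the misuse only shows up at run time
        }
    }
}
```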

4) Return the collection through an interface - Expose the collection,
but through one of its interfaces. This would be like exposing List<T>
as IEnumerable<T>. This approach is vulnerable to users casting. Do
you expose List<T> as IList<T>, IEnumerable, etc?

I do this often, yes. I try to treat my types, including collections, as the lowest-common-denominator needed. And yes, this provides basic compile-time protections, but doesn't prevent clients from casting the type. Note, however, that casting the type is about the same design-wise as using reflection to get at the implementation details of any class. It's more efficient, but otherwise violates all the same encapsulation rules. Clients that do so, do so at their own risk.

Note also that violating that encapsulation rule isn't always bad. For example, the Enumerable.Count() extension method attempts to cast the target object to ICollection<T>, and gets the ICollection<T>.Count property instead of enumerating the entire collection when possible. One could argue that's a violation of the encapsulation, but personally I'm glad that performance optimization is there.
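
As a sketch of what "compile-time protection only" means here, the hypothetical Catalog class below exposes its List<int> as IEnumerable<int>, and CatalogDemo shows what a misbehaving client could still do:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type, for illustration only.
public class Catalog
{
    private readonly List<int> _ids = new List<int> { 1, 2, 3 };

    // The static type hides Add/Remove, but the object handed out
    // is still the underlying List<int>.
    public IEnumerable<int> Ids
    {
        get { return _ids; }
    }
}

public static class CatalogDemo
{
    // Down-casts the exposed sequence and mutates it, then reports
    // how many items the Catalog now exposes.
    public static int CountAfterDowncast()
    {
        var catalog = new Catalog();
        var list = catalog.Ids as List<int>; // succeeds: it really is a List<int>
        if (list != null)
        {
            list.Add(4); // changes the Catalog's private state
        }
        return catalog.Ids.Count();
    }
}
```

The compiler stops accidental modification. Only wrapping or copying stops a deliberate cast like this one.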

5) Implement a custom collection - Expose the collection, but through
a custom collection with a constrained interface. This can eliminate
generics. The collection could inherit Collection<T>, List<T>, etc.

I would implement a custom interface in this scenario. You don't want to inherit Collection<T> or List<T> because they aren't designed with that kind of inheritance in mind (e.g. the indexer for List<T> isn't virtual...you can't make a List<T> read-only by inheriting it).

After all, we've already got ReadOnlyCollection<T> if what you want is run-time protection. For compile-time protection, an interface accomplishes the same thing, but without requiring a specific inheritance hierarchy.

Obviously a custom interface does require a custom implementation. So, for example, you could define an IReadOnlyList<T> in which the indexer was read-only, and then implement the interface in your own collection types. But, I wouldn't bother with thinking of your custom implementation as anything other than a read-only wrapper with compile-time information.

Of course, as long as you're happy to pass around a reference actually _typed_ as ReadOnlyCollection<T>, then you get the same compile-time protection you'd get with an interface, without the bother of actually defining one. But obviously there may be cases where you'd prefer to deal with a custom interface instead.

If anything, maybe this option #5 is best implemented by sub-classing ReadOnlyCollection<T> (which obviously already is read-only), and then in that sub-class, implementing your custom IReadOnlyList<T> interface. But I would only bother with this if you anticipate other implementations of IReadOnlyList<T>. After all, otherwise why not just use ReadOnlyCollection<T> as the type in your code?
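
One possible shape for that sub-class, as a sketch only. The interface is named IReadOnlyIndexable<T> here (a made-up name) rather than IReadOnlyList<T>, because later .NET versions ship a System.Collections.Generic.IReadOnlyList<T> of their own and the example should not be confused with it:

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical read-only interface: the indexer has a getter only,
// so writes through this type fail at compile time.
public interface IReadOnlyIndexable<T> : IEnumerable<T>
{
    int Count { get; }
    T this[int index] { get; }
}

// ReadOnlyCollection<T> already supplies Count, a get-only indexer
// and GetEnumerator, so the sub-class needs no members of its own.
public class ReadOnlyIndexable<T> : ReadOnlyCollection<T>, IReadOnlyIndexable<T>
{
    public ReadOnlyIndexable(IList<T> list)
        : base(list)
    {
    }
}
```

With a variable typed as IReadOnlyIndexable<string>, reading view[0] works, while an assignment such as view[0] = "x" is rejected by the compiler because the interface indexer has no setter.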

This is under the assumption that I actually want to expose a
collection. I'm not too concerned with the logistics behind
determining when to expose a collection, I just want to know how.

My approach, that has been evolving for a while now, has been to
return the collection through a restricted interface. I just have a
policy that says I can't cast from IEnumerable<T> to List<T>. If I
want to work with a List<T>, I have to create a new one and add the
members via the ctor or the AddRange method.

I realize that this has costs associated with it. Generally, it has
not been an issue.

Nor should it. If it is an issue, you've got bigger fish to fry. :)

It's a bad idea to down-cast like that, and the people writing the code should know that even without an explicit policy. Making the policy explicit should make things even better. (Not that down-casting is always wrong...sometimes it's unavoidable. Just that it needs to be used very carefully, and in a situation where it's clearly documented to be safe and within the contract of the code involved).

Note that your hypothetical situation "If I want to work with a List<T>" sort of puts the cart before the horse. That is, sure...creating a new List<T> and copying the data from the IEnumerable<T> to it will work, and preserves the safety of the original collection. But it raises the question of why it's appropriate to take the data you got from the IEnumerable<T> and treat it as a List<T> in the first place. If you find you have to do that, it often suggests that you picked the wrong lowest-common-denominator to begin with, or that you're handling the data incorrectly.

Hard to say without specific examples, but any time you find yourself working around some design contract, the first question should be "is this really the right thing to do, rather than either following the design contract, or modifying the design contract to suit the need better?"
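
For reference, the copy-instead-of-cast policy under discussion might look something like this. Scores.SortedCopy is a hypothetical helper, chosen because sorting is one case where a private working list is a reasonable thing to want:

```csharp
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class Scores
{
    // Builds a private List<int> from whatever sequence was handed in,
    // rather than casting the sequence back to List<int>.
    public static List<int> SortedCopy(IEnumerable<int> source)
    {
        var working = new List<int>(source); // copy via the constructor
        // Equivalent alternative: new List<int>() followed by AddRange(source).
        working.Sort(); // sorts the copy only
        return working;
    }
}
```

The caller's sequence keeps its original order, since only the copy is sorted.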

There are times where I won't even follow this policy. Sometimes I
don't think exposing a List<T> is such a big deal. Sometimes when the
collection is part of the class's state, I will create a copy to
mitigate a cast being performed. It kind of bothers me that I switch
from time to time.

It should. Not that being inconsistent is bad. It's that you should at least be concerned when you're inconsistent, and take some time to understand why you're doing it. It should never be out of simple convenience. You should have some clear, direct goal that is achieved only through the inconsistency and which isn't in conflict with your other goals.

Pete