LINQ

Coordinator
May 26, 2009 at 5:52 PM

We have a few options for implementing LINQ support. The problem, of course, is that none of SharePoint's collection classes implement IEnumerable<T>. For use within actual LINQ syntax, this isn't a huge deal because the type can be specified within the query:

var items = from SPListItem i in list.Items select i;

This is compiled into a call to list.Items.Cast<SPListItem>() behind the scenes. However, for the other useful operators (All, Any, Contains, etc.) it's not exactly convenient to call Cast every time. So one of our goals could be to provide some or all of these operators on the SharePoint collections. This raises a few questions:

  1. Which operators do we implement?
    Many of the operators are quite simple. Others, like SelectMany, accept multiple IEnumerable parameters, so it is much less clear which overloads we would need to provide for consistent use. Also, the more operators we implement, the more we will need to maintain. However, if a user needs a method we don't implement, then they'll need to know how to Cast<> anyway, so what have we really saved?
  2. For which collections do we implement them?
    Some collections are obviously used more than others, but if we're going for consistency across the API we should at least consider whether there's a way to include the less-used collections. As most inherit from SPBaseCollection, we could use that as the connection point using generic methods, as I've done for Any and Contains. Essentially that would eliminate some explicit Cast<> calls. However, the type parameter will still need to be specified [i.e. lists.Any<SPList>(...)]. One option would be to also define SPListCollection.Any and have it return base.Any<SPList>. That would be the most usable with the least redundancy (of course it's a collection of SPList objects), but again we're back to either a lot of work or picking which collections are important enough.
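
To make the trade-off in point 2 concrete, here is a minimal sketch of the two layers. FakeBaseCollection, ListInfo and ListInfoCollection are hypothetical stand-ins (assumptions, so the snippet compiles without the SharePoint assemblies) for SPBaseCollection, SPList and SPListCollection; this is not the actual SPExLib code.

```csharp
using System;
using System.Collections;
using System.Linq;

// Hypothetical stand-in for SPBaseCollection: enumerable, but not IEnumerable<T>.
public class FakeBaseCollection : IEnumerable
{
    private readonly ArrayList items = new ArrayList();
    public void Add(object item) => items.Add(item);
    public IEnumerator GetEnumerator() => items.GetEnumerator();
}

// Hypothetical stand-ins for SPList and SPListCollection.
public class ListInfo { public string Title { get; set; } }
public class ListInfoCollection : FakeBaseCollection { }

public static class FakeBaseCollectionExtensions
{
    // Generic layer: hides the Cast<T>() call, but the caller must still name T,
    // because it cannot be inferred from the non-generic collection.
    public static bool Any<T>(this FakeBaseCollection source, Func<T, bool> predicate)
        => source.Cast<T>().Any(predicate);
}

public static class ListInfoCollectionExtensions
{
    // Typed layer: one thin wrapper per collection type removes the type argument.
    public static bool Any(this ListInfoCollection source, Func<ListInfo, bool> predicate)
        => FakeBaseCollectionExtensions.Any<ListInfo>(source, predicate);
}
```

With both layers in place, lists.Any<ListInfo>(l => l.Title == "Tasks") and lists.Any(l => l.Title == "Tasks") both compile for a ListInfoCollection; the second form is the "most usable" one, at the cost of one wrapper class per collection.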

Having used LINQ with SharePoint for a while, I'm almost content to just skip most of the LINQ helpers and just get the word out about Cast<> and AsSafeEnumerable, since they will probably be needed sooner or later despite our efforts. What do you think?
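
For anyone unfamiliar with the Cast<> approach mentioned above, a minimal sketch follows. A plain ArrayList stands in for a non-generic SharePoint collection (an assumption made so the snippet runs without SharePoint), and CastDemo and its method names are illustrative only.

```csharp
using System;
using System.Collections;
using System.Linq;

public static class CastDemo
{
    // ArrayList is a stand-in for a non-generic collection such as SPListItemCollection.
    public static ArrayList Titles() => new ArrayList { "Home", "Tasks", "Docs" };

    public static bool AnyStartsWith(IEnumerable source, string prefix)
    {
        // Any() is only defined for IEnumerable<T>, so Cast<string>() has to come first.
        return source.Cast<string>()
                     .Any(t => t.StartsWith(prefix, StringComparison.Ordinal));
    }

    public static int CountViaQuery(IEnumerable source)
    {
        // Giving the range variable an explicit type makes the compiler
        // insert the Cast<string>() call for us.
        var query = from string t in source select t;
        return query.Count();
    }
}
```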

Coordinator
May 26, 2009 at 6:20 PM

I agree with you here, and there are a couple of ways to go: a full-scale implementation, or implementing only the most-used parts by extending SPBaseCollection and, optionally, each of its subclasses for "better reading".

I vote for starting small and getting a stable release out; i.e. implement it for SPBaseCollection, then move on to SPListItemCollection and SPFieldCollection and implement the "interesting" parts. For instance, I've been using Where() and ForEach() quite often on fields, event receivers and content types, and I find it quite nice not to have to Cast<> this and Cast<> that.

Another point in favor of implementing the extensions directly on the collections is that we help people not to miss using AsSafeEnumerable (which is one awesome extension) if they use the source and make their own extensions.

Coordinator
May 26, 2009 at 7:38 PM

For this first pass, I'd say let's implement everything for SPBaseCollection, SPWebCollection and SPSiteCollection (to use AsSafeEnumerable). Most of the sub-collections beyond that could probably be done with code generation.

I'll take a look at how we can make things work well for the methods that take multiple enumerations (GroupJoin, SelectMany, etc). Particularly if we're trying to let people ignore the "implementation detail" of AsSafeEnumerable, we need to make sure they can't inadvertently leak the references by using a collection on the wrong side of a join.

Coordinator
May 26, 2009 at 9:38 PM

Sounds like a good plan for a release.

Coordinator
May 27, 2009 at 1:02 PM
Edited May 27, 2009 at 1:23 PM

I think I have base implementations done for SPWebCollection and SPSiteCollection. Many of the methods just throw a NotSupportedException for one of a few reasons:

  1. Methods returning an SPSite/SPWeb (ElementAt, First, etc.) would dispose the object before returning it. These are trivial to implement without AsSafeEnumerable, but the semantics would be completely different: the caller would have to dispose the results of these methods, but not the objects seen by the others. Another option would be to provide higher-order functions like SelectFromFirst, ProcessElementAt, etc. that call a delegate on the SPSite/SPWeb before disposing it.
  2. Contains accepts an SPSite/SPWeb object which will never match any of the elements in the collection because they will be allocated and disposed on demand. Could provide an overload (or a new method, maybe ContainsWhere?) that accepts a predicate, implemented as "return source.Count(predicate) > 0".
  3. Distinct doesn't make sense.
  4. Methods that place the enumerated items in an intermediate collection, either to return it (GroupBy, ToDictionary, ToList, etc.) or as an implementation detail (Except, Union, Join), would dispose each object after caching it. Some of these (particularly Join, GroupJoin) may be nice to have, but there will be memory implications for keeping all the references, and we will need to implement dispose-aware containers.
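
The higher-order alternative from point 1 and the predicate-based Contains from point 2 could look roughly like the sketch below. FakeWeb is a hypothetical stand-in for SPWeb, and DisposeEach only imitates the dispose-as-you-go behaviour described for AsSafeEnumerable; neither is the real SPExLib implementation, and ContainsWhere here short-circuits instead of using Count.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for SPWeb that records whether it was disposed.
public sealed class FakeWeb : IDisposable
{
    public string Title { get; }
    public bool IsDisposed { get; private set; }
    public FakeWeb(string title) { Title = title; }
    public void Dispose() { IsDisposed = true; }
}

public static class DisposingOperators
{
    // Opens each item on demand and disposes it once the consumer moves on
    // or stops enumerating (assumed behaviour, for illustration only).
    public static IEnumerable<FakeWeb> DisposeEach(Func<int, FakeWeb> open, int count)
    {
        for (int i = 0; i < count; i++)
        {
            using (FakeWeb web = open(i))
            {
                yield return web;
            }
        }
    }

    // Runs the selector while the first item is still alive; the item is
    // disposed when the enumerator is cleaned up on the way out.
    public static TResult SelectFromFirst<TResult>(
        this IEnumerable<FakeWeb> source, Func<FakeWeb, TResult> selector)
    {
        foreach (FakeWeb web in source)
        {
            return selector(web);
        }
        throw new InvalidOperationException("Sequence contains no elements.");
    }

    // Predicate-based replacement for Contains; stops at the first match.
    public static bool ContainsWhere(
        this IEnumerable<FakeWeb> source, Func<FakeWeb, bool> predicate)
    {
        foreach (FakeWeb web in source)
        {
            if (predicate(web)) return true;
        }
        return false;
    }
}
```

The caller only ever receives the projected value (a title, a URL, a count), never a reference to an object that has already been disposed.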

I think providing all methods, and documenting which methods throw and why, will be easier on users than thinking we just skipped some.

The other thing I noticed is that the non-generic methods I implemented do not hide the existing SPBaseCollection implementation. So on SPWebCollection we see Any (mine) and Any<T> (base). This seems like a pretty severe usability issue, particularly for new users. The only way to hide one set of methods would be to put it in a different namespace, which also doesn't seem like a good idea. What do you think?

Coordinator
May 27, 2009 at 4:06 PM

They are awesome! I actually started using SPExLib in a project today and have already had some great use of the extensions.

As long as it's documented, I think it's fine to throw NotSupportedException, and I agree on having them all there.

I can live with the generic methods appearing side by side with the specialized implementations, as long as we make sure that we don't help users leak memory. On the other hand, if we put the SPBaseCollection extensions into a separate namespace (SPExLib.SharePoint.Linq.Base?), it could perhaps be more useful. Tough question.

Also, the SPBaseCollection extensions should make sure that users can't accidentally call them on collections whose items need to be disposed. This can be done with some type checking in the extensions, like this:

if (source.GetType() == typeof(SPWebCollection)) {
   // throw an exception or use the specialized implementation
}

 

Coordinator
May 27, 2009 at 5:44 PM

I guess putting the Base extensions in a separate namespace would make the difference explicit so the user doesn't stumble on them by accident.

For guarding against using the base implementations for SPSite/SPWeb, I just added a new SPBaseCollection.Cast<T>() that will check for us. I also added an overload to let the user override the guard if they know what they're doing.
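
A rough sketch of what such a guarded Cast<T> with an override overload might look like is below. BaseCollectionStub and WebCollectionStub are hypothetical stand-ins for SPBaseCollection and SPWebCollection, and the allowDisposable parameter name is an assumption; the real signature in SPExLib may differ.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for SPBaseCollection and SPWebCollection.
public class BaseCollectionStub : IEnumerable
{
    private readonly ArrayList items = new ArrayList();
    public void Add(object item) => items.Add(item);
    public IEnumerator GetEnumerator() => items.GetEnumerator();
}

public class WebCollectionStub : BaseCollectionStub { }

public static class GuardedCastExtensions
{
    // Default overload: refuses collections whose items must be disposed.
    // It wins over Enumerable.Cast<T> because the receiver type matches exactly.
    public static IEnumerable<T> Cast<T>(this BaseCollectionStub source)
        => source.Cast<T>(false);

    // Override overload for callers who know they will dispose items themselves.
    public static IEnumerable<T> Cast<T>(this BaseCollectionStub source, bool allowDisposable)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (!allowDisposable && source is WebCollectionStub)
        {
            throw new NotSupportedException(
                "Use the dispose-aware operators for this collection, or pass true to override.");
        }
        return Enumerable.Cast<T>(source);
    }
}
```

Using "is" rather than comparing GetType() also catches subclasses of the guarded collection; whether that is desirable is a design choice.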

Coordinator
May 27, 2009 at 6:24 PM

Perfect. The namespaces look good now: not cluttered, and they feel safe too.

I noticed that you omitted Select<T> from SPBaseCollection. Any reason behind that?

Coordinator
May 27, 2009 at 9:23 PM

I didn't omit it, I just haven't gotten there yet. :)

Coordinator
May 28, 2009 at 7:08 AM

Haha, no problem. I needed it yesterday; that's how I discovered it was missing.

Coordinator
May 29, 2009 at 7:46 AM

I think SPBaseCollectionLinqExtensions is done. Found several bugs in the SPSite/SPWeb implementations, so I'm assuming there are glitches in SPBase as well that will expose themselves over time.

Coordinator
May 29, 2009 at 8:01 AM

Good job, I'm using it in a project right now, so let's see what I discover.

Jul 13, 2009 at 9:27 PM
Edited Jul 13, 2009 at 9:27 PM

Any updates, Wictor? Any feedback/bugs from the project you were/are using the LINQ code on?

Coordinator
Jul 14, 2009 at 9:44 AM
Howdy,
I'm currently on holiday, but I will return home in about a week and have a release planned then. The latest source works really well, and we have been using it in some projects without any hiccups.
/WW