Tuesday, May 15, 2007

The Europa Train - Blessing or Curse?

A bit of a controversial topic today...

I've always had mixed feelings about the simultaneous releases of Eclipse. It's definitely great to be able to go to one place and download a family pack of nuggets of Eclipse goodness (mmm.... nuggets...), and know that they all work together. Each year the release makes a big splash in the media too, which is great for Eclipse and everyone involved. That kind of publicity will likely make Eclipse.org the victim of its own self-generated denial-of-service attack as the download servers are brought to their knees by the throngs of downloads. That is definitely the kind of popularity and success that we all want.

One thing I've never liked about it, though, is all the peer pressure to make sure the train is on time. Don't get me wrong, I love having the train on time just as much as everyone else. It's nice to be able to know when the train is going to arrive so that you can plan your ride. Your boss is not going to be happy if you tell him you have absolutely no clue when you will be in to work because you don't know when the train is coming. But, if you knew that parts of the train were broken, wheels were missing, etc., wouldn't you want the mechanics to take the time to fix it, rather than have it break down mid-trip, forcing you to whip out your multi-tool and a pack of gum and MacGyver a solution? (BTW, I hereby declare MacGyvering to be a word.)

My point is, the train sometimes feels less like a train and more like a juggernaut. Last year for Callisto, we shipped a CDT 3.1.0 that had a lot of problems with it. CDT 3.1 was a big release that had a lot of new functionality, and it's inevitable that when you deliver a lot of new content there are going to be lots of problems, through absolutely no fault of the committers and contributors. Even good code inevitably has a certain number of bugs, and so it follows, at least to a certain extent, that the more new code you put in, the more bugs there will be. As some for-instances, there were a lot of problems with the indexer, and search was pretty much broken at the time. Yet, we shipped anyway, because the train had to be on time. It's often said to ISVs that consume CDT that they shouldn't take the dot-zero because it will be buggy, and should instead take the dot-one. I hate to say it, but really IMHO CDT 3.1.0 should have been baked for a while longer.

I don't think this is the greatest situation. Individual projects ought to be able to hold up the train if they need to. However, along the lines of a recent post by Doug Schaefer, I think Eclipse sometimes gets itself into a situation where it's a victim of its own hype. The date for the release is picked a year in advance. We spend so much time in that year hyping the release that by the time we start getting into the bug fixing cycle, there is already so much pressure to release on time that we couldn't hold things up if we wanted to. We shouldn't be releasing things that we know for a fact that people shouldn't be using. Speaking in practical terms, a dot-zero release is never going to be flawless, but if you're shipping with major functionality broken, or with crippling bugs that preclude widespread adoption, then I think we have somehow lost sight of the purpose of the release. A release that can't really be used is somewhat pointless.

Now, I'm sure someone is going to reply to this and say something along the lines of "the release is at the whims of the committers", and that we really have the power to hold things up. Technically that may be true by the letter of the process, but if you really believe it, I suggest you try it and see what happens. Short of the Platform or JDT being horribly busted, I'm pretty sure you will get voted down.

I think that what needs to happen is that the release needs to be bug count driven. This process is not flawless either, but the idea is that you do what we do on the CDT milestones: we don't do the build until all the bugs targeted for that milestone are fixed. When we reach Zarro Boogs, then it's Go Time. Sure, nefarious people can still play games by spuriously marking bugs RESOLVED - INVALID, or by fiddling with severities and target milestones or what have you, but I think that on the whole the idea works. This way, the date of the release is driven from the bottom up by the committers, and is not imposed on them from above for them to deal with after the fact. Sure, you still need to give a rough estimate to people as to when they can expect something (e.g. "Summer 2008"), and the committers shouldn't be given license to delay as long as they please without reason, but at least then there is some flexibility built into the plan.
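
To make the gate concrete, here is a minimal sketch in Python of what "don't build until the milestone's bugs are fixed" might look like. Everything in it is an assumption for illustration: the `Bug` record, the set of "closed" statuses, the milestone label and the function names are made up, and this is not how the CDT build or Bugzilla actually works.

```python
from dataclasses import dataclass

# Assumed set of statuses that count as "done"; a real tracker may differ.
CLOSED_STATES = {"RESOLVED", "VERIFIED", "CLOSED"}


@dataclass
class Bug:
    bug_id: int             # hypothetical identifier
    target_milestone: str   # hypothetical milestone label, e.g. "4.0 M7"
    status: str             # e.g. "NEW", "ASSIGNED", "RESOLVED"


def open_bugs_for(bugs, milestone):
    """Return the bugs targeted at `milestone` that are not yet closed."""
    return [
        b for b in bugs
        if b.target_milestone == milestone and b.status not in CLOSED_STATES
    ]


def ready_to_build(bugs, milestone):
    """The gate: build only when no open bugs remain for the milestone."""
    return len(open_bugs_for(bugs, milestone)) == 0


# Made-up sample data for illustration only.
sample = [
    Bug(1, "4.0 M7", "RESOLVED"),
    Bug(2, "4.0 M7", "ASSIGNED"),
    Bug(3, "4.0 RC1", "NEW"),
]

if ready_to_build(sample, "4.0 M7"):
    print("Zarro Boogs: Go Time")
else:
    remaining = [b.bug_id for b in open_bugs_for(sample, "4.0 M7")]
    print("Holding the build; open bugs:", remaining)
```

Note that the gate only looks at status and target milestone, which is exactly why the games mentioned above (retargeting a bug or resolving it as INVALID) would get a build past it.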

Don't misread what I am saying here either. CDT 4.0, which is coming out on the Europa train, is shaping up to be both the most feature-rich and the most robust version of CDT yet. I'm not currently anticipating a recurrence of what happened on 3.1.0, and I would definitely recommend to users of previous versions of CDT that they move up to 4.0 and take advantage of the scores of bug fixes and new features it includes. And, I also think having a release train is on the whole a good thing. But, I think there are some things we can do to make sure everyone gets a say in how the train operates.

I'm curious to see how next year's release will unfold...


Wassim Melhem said...

Shipping a quality component with fewer features on time is much better than delaying a release train because a particular project wanted to keep pumping in new features until the last minute.

If a particular project does not think it's stable enough for the release train, I think it should gracefully drop out, rather than delay the train.

Unless it's the Platform, JDT or PDE, life will go on of course.

Knowing the release date in advance is a good thing. A team has a year to make realistic plans that they think they will be able to contain within that period of time. These plans may change over the course of the year if some items take longer than expected. That's fine of course and that's why milestones are good indicators of how realistic/feasible a plan is.

Focusing on quality and meeting the deadline is far more important than quantity of features.

Chris Recoskie said...

I agree with all you say, but at the same time, sometimes it's a tradeoff.

Not all features are ones you can live without, especially if they are features that used to work. Take my Search example. I'd argue that you wouldn't want to ship with a Search that just plain doesn't work. You can cut the feature out (and at the time we perhaps should have if we'd known how broken it was), but your users will scream bloody murder. In this case I think that if you want to do the right thing for quality you should take the time to fix the feature, not rip it out.

And I agree that having a rough date is a good thing. You need a target to aim for. But I think setting it in stone that far ahead is a recipe for failure. There is no harm in giving it some wiggle room if it should be needed.

Pascal Rapicault said...

I think the real problem comes from having a train at all! The idea of a train is completely opposite to what we are trying to build: components.

It also has the following bad points: everyone has to ship on a given date (with the problems you mentioned), the train is seen as a 'product' whereas it is just a bunch of components put together but never tested (neither for functionality nor for usability), and it seems to preclude other releases. On the good side, users are happy because it is really easy to try out things.

I think all components should be free to release as often as they want and on their own schedule. Then they can 'tag' something to work with a given train (basically be added to the train update site) to facilitate user consumption.

Conclusion: no trains, many buses.