What Works and What Doesn’t
Jun 28, 2009
After SSC11 I got a note from my colleague Paman Gujral at the Automatic Control Laboratory at EPFL summarizing some of the talks. He wrote: “Rasmus Bro gave an excellent talk too about the pitfalls in using chemometric methods. Kowalski commented that software firms are a lot to blame for advocating methods that don’t work.” I was a little alarmed by this, so I asked Rasmus about it, and he wrote: “The comment of Bruce is maybe correct but it wasn’t meant as he states it. … as part of a discussion Bruce and others mentioned that also software companies have responsibilities in helping people take proper decisions. This was added to a more general agreement that education should be improved. So there was nothing dramatic or controversial in that.”
In any case, I’ve thought a great deal about that comment since Paman’s note. In fact, as a software company, it’s sometimes hard to know, when a new method comes out, whether it will live up to its initial hype. This is largely because nobody publishes negative results. If you lived on a steady diet of J. Chemo and ChemoLab, you’d think that EVERYTHING works, at least initially. So yes, it’s often up to us to sort out what works and what doesn’t. That often means coding it up ourselves and trying it out. And sometimes even then the results are ambiguous: we don’t find a case where it definitely fails, and we feel that, given what was in the original journal article, it’s worth putting into the software.
We reason that if more people gain experience with a new method, eventually we’ll all figure out whether it is generally useful. That happens faster when there is code available that implements the method. So yes, I’ll admit that we’ve put things in PLS_Toolbox that were “unproven.” I have stated this publicly, more than once. That said, I don’t recall ever promising these new techniques would work. Caveat emptor. Only in this case, it’s “let the user beware.”
I’ve put a lot of effort into trying to make some methods live up to their initial claims. I could give you a list of my all-time greatest “wastes of time,” but at this point that would only serve to upset the originators of the methods. But I think we’ve been of more than a little service in helping sort some of these things out.
I certainly agree that “education should be improved,” and we strive to do that. One of the things we tell our students is not to believe everything they read or hear, and we try to give them the tools to dig past the hype. We also teach proper model validation. If you do a proper validation, you’ll at least know if one of these new techniques doesn’t work on your particular data.
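To make that concrete, here is a minimal sketch of the idea, written in Python with scikit-learn rather than our own tools: a candidate method (PLS regression stands in here) and a familiar baseline are both scored by cross-validation, so any claimed advantage has to show up on samples the model never saw during fitting. The synthetic data and model settings are purely illustrative, not a recipe.

```python
# Minimal sketch of "proper validation": compare a candidate method against a
# familiar baseline using cross-validation, so performance is always judged on
# data held out from fitting. PLSRegression and the synthetic data below are
# illustrative stand-ins only.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
n_samples, n_vars = 80, 50
X = rng.normal(size=(n_samples, n_vars))
# Response depends on only a few variables, plus noise.
y = X[:, :5] @ rng.normal(size=5) + 0.5 * rng.normal(size=n_samples)

cv = KFold(n_splits=5, shuffle=True, random_state=0)

for name, model in [("PLS, 4 latent variables", PLSRegression(n_components=4)),
                    ("Ridge baseline", Ridge(alpha=1.0))]:
    # scikit-learn reports negated MSE; flip the sign for readability.
    scores = -cross_val_score(model, X, y, cv=cv,
                              scoring="neg_mean_squared_error")
    print(f"{name}: cross-validated MSE = {scores.mean():.3f} (+/- {scores.std():.3f})")
```

The point is not the particular models, it’s the procedure: if the shiny new method can’t beat a mundane baseline on held-out data, no amount of hype changes that.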
But I don’t know how many times I’ve had to answer the question, “Why doesn’t your software do ______? The author/speaker says that it’s the best thing since Gauss.” And I have to answer, “Well, we haven’t tried it. But our experience suggests there are no silver bullets.”
BMW