Science Blog

Science news straight from the source

25 years of conventional evaluation of data analysis proves worthless in practice

So-called 'intelligent' computer-based methods for classifying patient samples, for example, have been evaluated with the help of two methods that have completely dominated research for 25 years. Now Swedish researchers at Uppsala University are revealing that this methodology is worthless when it comes to practical problems. The article is published in the journal Pattern Recognition Letters.

Today there is rapidly growing interest in 'intelligent' computer-based methods that use various classes of measurement signals, from different patient samples, for instance, to create a model for classifying new observations. This type of method is the basis for many technical applications, such as recognition of human speech, images, and fingerprints, and is now also beginning to attract new fields such as health care.

"Especially in applications in which faulty classification decisions can lead to catastrophic consequences, such as choosing the wrong form of therapy for treating cancer, it is extremely important to be able to make a reliable estimate of the performance of the classification model," explains Mats Gustafsson, Professor of signal processing and medical bioinformatics at Uppsala University, who co-directed the new study together with Associate Professor Anders Isaksson.

To evaluate the performance of a classification model, one normally tests it on a number of trial examples that have never been involved in the design of the model. Unfortunately there are seldom tens of thousands of test examples available for this type of evaluation. In biomedicine, for instance, it is often expensive and difficult to collect the patient samples needed, especially if one wishes to analyze a rare disease. To solve this problem, many different methods have been proposed. Since the 1980s two methods have completely dominated research, namely, cross validation and resampling/bootstrapping.
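To make the two dominant methods concrete, here is a minimal Python sketch of k-fold cross validation and an out-of-bag bootstrap estimate. Everything in it is an assumption made for illustration and is not taken from the Uppsala study: the toy one-dimensional data, the nearest-centroid classifier, and the function names `make_data`, `train`, `error_rate`, `cross_validation_error` and `bootstrap_error` are all hypothetical.

```python
import random
from statistics import mean


def make_data(n, rng, shift=1.0):
    """Toy data (assumed): class 0 ~ N(0, 1), class 1 ~ N(shift, 1)."""
    return [(rng.gauss((i % 2) * shift, 1.0), i % 2) for i in range(n)]


def train(samples):
    """Nearest-centroid 'model': the mean feature value of each class."""
    m0 = mean(x for x, y in samples if y == 0)
    m1 = mean(x for x, y in samples if y == 1)
    return m0, m1


def error_rate(model, samples):
    """Fraction of samples assigned to the wrong class."""
    m0, m1 = model
    wrong = 0
    for x, y in samples:
        predicted = 1 if abs(x - m1) < abs(x - m0) else 0
        wrong += predicted != y
    return wrong / len(samples)


def cross_validation_error(data, k, rng):
    """k-fold cross validation: each fold is held out once for testing."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    errors = []
    for i, test_fold in enumerate(folds):
        train_set = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.append(error_rate(train(train_set), test_fold))
    return mean(errors)


def bootstrap_error(data, rounds, rng):
    """Out-of-bag bootstrap: train on a resample, test on the points left out."""
    n = len(data)
    errors = []
    for _ in range(rounds):
        picked = [rng.randrange(n) for _ in range(n)]
        chosen = set(picked)
        train_set = [data[i] for i in picked]
        held_out = [data[i] for i in range(n) if i not in chosen]
        if not held_out or len({y for _, y in train_set}) < 2:
            continue  # skip degenerate resamples (no test points or one class)
        errors.append(error_rate(train(train_set), held_out))
    return mean(errors)


if __name__ == "__main__":
    rng = random.Random(0)
    data = make_data(30, rng)
    print("5-fold CV error:", cross_validation_error(data, 5, rng))
    print("bootstrap error:", bootstrap_error(data, 200, rng))
```

Both functions reuse the same small pool of examples for training and testing, which is exactly why they became popular when fresh test examples are scarce.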

"This has entailed that the performance assessment of virtually all new methods and applications reported in the scientific literature in the last 25 years has been carried out using one of these two methods," says Mats Gustafsson.

In the new study, the Uppsala researchers use both theory and convincing computer simulations to show that this methodology is worthless in practice when the total number of examples is small in relation to the natural variation that exists among different observations. What is considered a small number depends in turn on what problem is being studied; in other words, it is impossible to determine whether the number of examples is sufficient.
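The kind of simulation described above can be sketched roughly as follows. This is not the researchers' experiment; it is a small illustration under assumed conditions (two overlapping one-dimensional Gaussian classes, a midpoint-threshold classifier, leave-one-out cross validation), and all function names are hypothetical. It compares the cross-validation estimate from each small data set with the error the same model makes on a large fresh test set, and reports how widely the gap between the two scatters.

```python
import random
from statistics import mean, pstdev


def draw(n, rng, shift=1.0):
    """Toy data (assumed): class 0 ~ N(0, 1), class 1 ~ N(shift, 1)."""
    return [(rng.gauss((i % 2) * shift, 1.0), i % 2) for i in range(n)]


def fit_threshold(samples):
    """Model: midpoint between the class means, plus which side is class 1."""
    m0 = mean(x for x, y in samples if y == 0)
    m1 = mean(x for x, y in samples if y == 1)
    return (m0 + m1) / 2, m1 >= m0


def error(model, samples):
    """Fraction of samples the threshold model misclassifies."""
    threshold, one_is_above = model
    wrong = 0
    for x, y in samples:
        predicted = int((x > threshold) == one_is_above)
        wrong += predicted != y
    return wrong / len(samples)


def loo_cv_error(samples):
    """Leave-one-out cross validation on the available samples only."""
    wrong = 0.0
    for i in range(len(samples)):
        rest = samples[:i] + samples[i + 1:]
        wrong += error(fit_threshold(rest), [samples[i]])
    return wrong / len(samples)


def experiment(n, trials, test_size=2000, seed=0):
    """Mean and spread of (CV estimate - error on fresh data) over many data sets."""
    rng = random.Random(seed)
    fresh = draw(test_size, rng)  # stands in for "the real world"
    gaps = []
    for _ in range(trials):
        small = draw(n, rng)
        gaps.append(loo_cv_error(small) - error(fit_threshold(small), fresh))
    return mean(gaps), pstdev(gaps)


if __name__ == "__main__":
    for n in (10, 100):
        bias, spread = experiment(n, trials=60)
        print(f"n={n}: mean gap {bias:+.3f}, spread of gap {spread:.3f}")
```

In this toy setting the spread of the gap is much larger for the small data sets, so a single cross-validation figure from one small sample says little about real-world accuracy. The catch the article points to is that, outside a simulation, there is no large fresh test set to reveal this.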

"Our main conclusion is that this methodology cannot be depended on at all, and that it therefore needs to be immediately replaces by Bayesian methods, for example, which can deliver reliable measures of the uncertainty that exists. Only then will multivariate analyses be in any position to be adopted in such critical applications as health care," says Mats Gustafsson.

Submitted by BJS on Wed, 2008-09-03 07:07.

  • Bio and Medicine
  • Computers and Electronics
 



Comments

Submitted by Anonymous on Thu, 2008-09-04 22:01.

The previous comments have shown critical thinking and familiarity with the subject at hand. This is surprisingly refreshing.

I particularly agree with the response stating the conclusion has been stretched to the point of sensationalism, but it is good that the models and methods of implementing systems are being reviewed. Finding problems and patterns to avoid is one of the first steps to improving system design. Also, it certainly should not be news to any programmer (I will admit to being one) that without adequate design, concept validation, code testing, maintenance, and disclosure of the limitations of the system, the end product should not be expected to produce reliable results.

Did anyone bother to read the article?

Submitted by Anonymous on Thu, 2008-09-04 13:01.

I'll try to spell it out for you.

The article is not saying that classification methods don't work, and isn't disparaging any particular method or combination of methods. It is not even stating that you cannot achieve arbitrarily small error. The article is ONLY saying that you cannot reliably estimate the resulting classifier accuracy (out in the real world) using cross validation or bootstrapping on your dataset.

If your data sample is large and diverse enough, you could. But you just won't ever know if that is the case or not.

Personally, I think their 'main conclusion' stretches the point a bit, probably for sensationalism's sake.

Close to the Mark

Submitted by Anonymous on Thu, 2008-09-04 12:50.

I think the real problem is the blind application of various models without knowing when the model is applicable or not and other constraints of the model.

"All models are wrong, some are useful." --George Box

What?

Submitted by Anonymous on Thu, 2008-09-04 12:49.

Conventional evaluation went out the window years ago dude.

Jiff
www.anonymize.us.tc

Stating the fine print

Submitted by Anonymous on Thu, 2008-09-04 10:00.

Like the last reviewer said, everybody involved in computer science research knows this already. Any model that we come up with includes fine print of positive/negative constraints and an error variable.

It's up to the application to decide what error is acceptable. For example, if research in the next 10 or 20 years comes up with a model that has a smaller error than human error (essentially increasing the survival probability of patients over the entire world, which includes advanced diagnostics available to first world countries and simple diagnostics for the third world), then you might as well use it for third world countries.

I don't agree...

Submitted by Anonymous on Thu, 2008-09-04 09:59.

Pattern Recognition is viable only for particular cases. However, for the cases where test sets cannot be generated (or there's no benefit to generating them), wouldn't it be better to use CART, C4.5, C5.0, or other similar decision-tree algorithms? These algorithms are widely used in both academic and commercial sectors, and I would even venture that Pattern Recognition is lower on the totem pole than these.

Yes, we know!

Submitted by Anonymous on Thu, 2008-09-04 08:33.

A pattern recognition program is only as good as your test data set. Good data sets are hard to come by, which is why there are always errors in computer recognition. It is also why a small sample cannot describe the entire population. These professors are just stating the obvious. Anyone in the pattern recognition field knows better methods for recognition, like multiple Gaussians or a combination of methods. To say that the entire methodology is worthless is a bold statement, as each method is tailored to a specific task. Prof. Gustafsson's issue is with the data set(s); there is nothing wrong with the methods. Can I have my 5 minutes back now?
