Tuesday, December 1, 2009

SVM Results

I've been experimenting with a support vector machine (SVM) for quasar target selection. Adam suggested this because it seems that the likelihood method is very similar to this already-solved computer science problem. He helped me get SVM-light working from R.

The method works like this: you supply the measured variables for a set of training objects, some representing what you are looking for (in this case quasars) and some representing what you are not looking for (everything else).

I trained on the u-g-r-i-z fluxes (the same fluxes used in the likelihood method) of the objects in the QSO template and in the everything-else template (referred to as "everything" from now on). I tell the SVM which objects are quasars and which are everything. Training the SVM takes a long time, so I've been taking subsets of the quasar/everything catalogs to speed things up (the likelihood uses ~1,000,000 objects).
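As a rough illustration of the subsetting step, here is a minimal Python sketch that draws an equal-sized random subset from each catalog and labels the objects. The function name `balanced_subsample`, the fixed seed, and the random stand-in catalogs are all assumptions made for the example, not the code used for this post. The +1/-1 labels follow the convention SVM-light uses for its two classes.

```python
import random

def balanced_subsample(quasars, everything, n_per_class, seed=0):
    """Draw n_per_class objects from each catalog, without replacement."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    qso_subset = rng.sample(quasars, n_per_class)
    other_subset = rng.sample(everything, n_per_class)
    # Quasars are labeled +1, everything else -1.
    features = qso_subset + other_subset
    labels = [1] * n_per_class + [-1] * n_per_class
    return features, labels

# Hypothetical stand-in catalogs: each object is a list of five
# numbers playing the role of the u, g, r, i, z fluxes.
rng = random.Random(1)
quasars = [[rng.random() for _ in range(5)] for _ in range(1000)]
everything = [[rng.random() for _ in range(5)] for _ in range(1000)]

features, labels = balanced_subsample(quasars, everything, 300)
```

Keeping the two classes the same size matches the runs below, which use equal numbers of training objects from each catalog.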

I then take the human-confirmed truth table objects and run them through the trained SVM, which classifies each one as quasar or not. Below are the results:
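To make the train-then-classify loop concrete, here is a toy Python sketch. It is not SVM-light: it swaps in a plain linear SVM without an offset term, trained by Pegasos-style stochastic subgradient descent on the hinge loss. The names `train_linear_svm`, `classify` and `toy_object`, the parameter values, and the made-up data are all assumptions for illustration. The toy features stand in for five centred, rescaled fluxes.

```python
import random

def train_linear_svm(features, labels, lam=0.01, epochs=20, seed=0):
    """Linear SVM (no offset term) via Pegasos-style subgradient descent."""
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    order = list(range(len(features)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # step size decreases with each update
            x, y = features[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Shrink the weights (contribution of the L2 regulariser).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # If the object violates the margin, move the weights towards it.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w

def classify(w, x):
    """Return +1 (target as quasar) or -1 (do not target)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1

def toy_object(label, rng):
    # Hypothetical stand-in for five centred, rescaled fluxes (u, g, r, i, z).
    return [label * 1.0 + rng.uniform(-0.5, 0.5) for _ in range(5)]

rng = random.Random(42)
train_labels = [1] * 200 + [-1] * 200
train_features = [toy_object(y, rng) for y in train_labels]
w = train_linear_svm(train_features, train_labels)

# Stand-in for the truth table: objects with known classes that the
# SVM did not see during training.
test_labels = [1] * 50 + [-1] * 50
test_features = [toy_object(y, rng) for y in test_labels]
predictions = [classify(w, x) for x in test_features]
```

The toy classes are cleanly separated, so this only shows the mechanics. Real quasar and star fluxes overlap, which is where the misclassifications in the tables below come from.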

Using 30,000 training qsos and 30,000 training stars
#quasars targeted
[1] 700
#not quasars not targeted
[1] 466
#quasars not targeted
[1] 255
#not quasars targeted
[1] 926
#Accuracy of targeting (qsos targeted / total targeted)
[1] 0.4305043


~~~

Using 100,000 training qsos and 100,000 training stars
#quasars targeted
[1] 709
#not quasars not targeted
[1] 480
#quasars not targeted
[1] 246
#not quasars targeted
[1] 912
#Accuracy of targeting (qsos targeted / total targeted)
[1] 0.4373843


~~~

Compared with the likelihood method:
#quasars targeted
[1] 793
#not quasars not targeted
[1] 908
#quasars not targeted
[1] 162
#not quasars targeted
[1] 484
#Accuracy of targeting (qsos targeted / total targeted)
[1] 0.620987
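For reference, the numbers above can be recomputed from the four counts in each run. The Python sketch below uses a hypothetical helper, `targeting_stats`, and also derives two quantities not quoted above: the completeness (quasars targeted / all quasars) and the overall fraction of objects classified correctly. The "accuracy of targeting" used in this post is what is usually called precision or purity.

```python
def targeting_stats(qso_targeted, other_not_targeted, qso_missed, other_targeted):
    """Summary fractions from the four counts reported for each run."""
    total_targeted = qso_targeted + other_targeted
    total_qsos = qso_targeted + qso_missed
    total_objects = total_targeted + qso_missed + other_not_targeted
    return {
        # The post's "accuracy of targeting": qsos targeted / total targeted.
        "targeted_fraction": qso_targeted / total_targeted,
        # Fraction of all true quasars that were targeted.
        "completeness": qso_targeted / total_qsos,
        # Fraction of all objects that were classified correctly.
        "overall_correct": (qso_targeted + other_not_targeted) / total_objects,
    }

# Counts copied from the three result blocks above, in the order:
# quasars targeted, not quasars not targeted,
# quasars not targeted, not quasars targeted.
runs = {
    "SVM, 30,000 per class": (700, 466, 255, 926),
    "SVM, 100,000 per class": (709, 480, 246, 912),
    "likelihood": (793, 908, 162, 484),
}

for name, counts in runs.items():
    stats = targeting_stats(*counts)
    print(
        f"{name}: targeted fraction {stats['targeted_fraction']:.4f}, "
        f"completeness {stats['completeness']:.4f}, "
        f"overall correct {stats['overall_correct']:.4f}"
    )
```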


I can try adding more information, such as the errors or the likelihood values, as additional features in this analysis. So far, though, the SVM does not seem to work as well as the likelihood method, and adding more training objects does not improve the accuracy much: using ~3 times as many objects raised it by less than one percentage point (from 0.431 to 0.437). I could also run on larger training sets and see if that makes a difference.
