Monday, September 28, 2009

Pretty Plots

I got Joe's plotting code to work. Below are color-color plots of quasars where the "temperature" of the data points is the redshift of the quasars. They are plotted on top of a black/gray stellar locus. Now to plot these same objects where the temperature is a likelihood instead of redshift:

Pretty colors!

Friday, September 25, 2009

Plotting with Joe

Joe Hennawi and I worked on figuring out where exactly we are going wrong with the likelihood method. He suggested making a two-dimensional histogram of the likelihoods, binned in color-color space. This would allow us to see where in color-color space we are getting objects with high likelihoods, and would help us determine whether there is a population of objects being falsely selected.

It is quite easy to make the 2D histogram, but plotting it has proved to be an issue. I've spent most of the day trying to do this. Here is how I create the histogram:
qsos = testqsos

; SDSS asinh-magnitude softening parameters (b), in units of 10^-10
b_u = 1.4
b_g = 0.9
b_r = 1.2
b_i = 1.8
b_z = 7.4

; Redshifts and PSF fluxes in each band
z_temp = qsos.Z_TOT
fu1 = qsos.PSFFLUX[0]
fg1 = qsos.PSFFLUX[1]
fr1 = qsos.PSFFLUX[2]
fi1 = qsos.PSFFLUX[3]
fz1 = qsos.PSFFLUX[4]

; Inverse variances of the PSF fluxes
ivar_u1 = qsos.PSFFLUX_IVAR[0]
ivar_g1 = qsos.PSFFLUX_IVAR[1]
ivar_r1 = qsos.PSFFLUX_IVAR[2]
ivar_i1 = qsos.PSFFLUX_IVAR[3]
ivar_z1 = qsos.PSFFLUX_IVAR[4]

; Convert fluxes to asinh magnitudes
u1 = sdss_flux2mags(fu1, b_u)
g1 = sdss_flux2mags(fg1, b_g)
r1 = sdss_flux2mags(fr1, b_r)
i1 = sdss_flux2mags(fi1, b_i)
z1 = sdss_flux2mags(fz1, b_z)

; Magnitude errors from the inverse variances
sigu = sdss_ivar2magerr(ivar_u1, fu1, b_u)
sigg = sdss_ivar2magerr(ivar_g1, fg1, b_g)
sigr = sdss_ivar2magerr(ivar_r1, fr1, b_r)
sigi = sdss_ivar2magerr(ivar_i1, fi1, b_i)
sigz = sdss_ivar2magerr(ivar_z1, fz1, b_z)

; Colors
ug = u1-g1
gr = g1-r1
ri = r1-i1
iz = i1-z1

; Magnitude cut in the i band
icut = where(i1 GT 19.5)

; 50x50 images (2D histograms) in u-g vs. g-r color-color space
nx = 50
ny = 50
imageQSO = fltarr(nx,ny)
imageAll = fltarr(nx,ny)
imageRatio = fltarr(nx,ny)

; Likelihoods for the objects passing the cut
likeQSO = qsoqlike[icut]
likeAll = qsoslike[icut]
likeRatio = likeQSO/likeAll

qsoug = ug[icut]
qsogr = gr[icut]

; Bin the objects in (u-g, g-r), weighting each object by its likelihood
populate_image, imageQSO, qsoug, qsogr, weight = likeQSO
populate_image, imageAll, qsoug, qsogr, weight = likeAll
populate_image, imageRatio, qsoug, qsogr, weight = likeRatio
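For debugging, the same weighted binning can be sketched in numpy. This assumes populate_image accumulates each object's weight into the bin containing its (x, y) position, which I'm inferring from the call signature; the color ranges below are placeholders:

```python
import numpy as np

def populate_image(x, y, weights, nx=50, ny=50,
                   xrange=(-0.5, 3.0), yrange=(-0.5, 1.5)):
    """Accumulate `weights` into an (nx, ny) grid over (x, y).

    A guess at what populate_image does: each object's weight is added
    to the bin containing its (x, y) position. Points outside the
    ranges are dropped.
    """
    img, _, _ = np.histogram2d(x, y, bins=[nx, ny],
                               range=[xrange, yrange], weights=weights)
    return img

# Toy data standing in for the u-g, g-r colors and likelihoods
rng = np.random.default_rng(0)
ug = rng.normal(0.5, 0.3, 1000)
gr = rng.normal(0.2, 0.2, 1000)
like = rng.random(1000)

image = populate_image(ug, gr, like)
print(image.shape)  # (50, 50)
```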

Now I need code to plot this histogram in color-color space, with hotter colors in bins with higher likelihoods and cooler colors in bins with lower likelihoods. Anyone have code that does this in IDL or Python? Joe has some, but I'm having a hard time getting it to work.
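In the meantime, here is a minimal matplotlib sketch of the kind of plot I'm after, with the "hot" colormap mapping high-likelihood bins to hotter colors (the color-color extent values are placeholders):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

# Stand-in for the likelihood histogram built above
rng = np.random.default_rng(1)
image = rng.random((50, 50))

fig, ax = plt.subplots()
# histogram2d-style images have x along the first axis, so transpose;
# origin="lower" makes g-r increase upward. The extent values are
# made-up color ranges.
im = ax.imshow(image.T, origin="lower", cmap="hot",
               extent=[-0.5, 3.0, -0.5, 1.5], aspect="auto")
ax.set_xlabel("u - g")
ax.set_ylabel("g - r")
fig.colorbar(im, ax=ax, label="likelihood")
fig.savefig("likelihood_colorcolor.png")
```

The transpose plus origin="lower" puts u-g on the horizontal axis with g-r increasing upward, matching the usual color-color plot orientation.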

Thursday, September 24, 2009

Back to the Likelihoods

It's been a while since I have worked on the likelihood QSO selection method. With the next deadline for target selection coming up, it's time to go "Back to the Likelihoods." Note to self: it is more time-efficient to keep working on something continuously than to set it aside for weeks and then waste a day trying to remember what I was doing.

Where we left off...
Below is a color-color (ug - gr) plot of the final likelihood selection objects for the commissioning data.

The white data points are a random sampling of 20,000 possible objects to target. The red data points are objects whose likelihood ratio is greater than 0.1, where the likelihood ratio is defined as:

L_ratio = L_QSO / L_everything

with L_QSO and L_everything defined as described in "A Likely Result".

The green data points are objects whose likelihood ratio is greater than 0.1 and whose L_everything is greater than 10^-6. This eliminates "fringe" objects that are not close to any template objects (and therefore have a small everything-likelihood).

The likelihood was then run on the co-added Stripe 82 data and all of the above green objects were submitted as targets for the commissioning data.

What we need to work out...
  • Why are the likelihoods so small/large? The likelihoods should be probabilities, but we have likelihoods spanning from 0 to 11.
  • Why is our completeness and efficiency on the MMT data so poor?
  • How does the likelihood selection compare to selecting QSOs based on variability?
  • How well does this method work on single epoch Stripe 82 data versus the co-added images?

Friday, September 18, 2009


I'm traveling from 9/18 - 9/24 and will not be posting.

Thursday, September 17, 2009

Bad Blogger

I've been a bad blogger this past week. Apologies for not posting. I could give excuses, but that would break two of my blog rules, so I'll quit while I am ahead (or behind as it were).

Here is a summary of where I am with this darn 3D correlation function. The Sloan data is in ra/dec/redshift coordinates. However, it is easier to calculate the 3d correlation function in x-y-z comoving coordinates. So I convert the data into comoving coordinates to calculate the correlation function. However the data lives in a ra/dec/redshift space (mask) and this corresponds to a non-boxlike x-y-z space. Therefore I need to apply the mask in ra/dec/redshift for both the randoms and the data, and then convert both to x-y-z to calculate the correlation function.
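As a sanity check on the conversion step, here is a rough Python sketch of going from (ra, dec, redshift) to comoving x-y-z, assuming a flat LCDM cosmology (the H0 and Omega_m values are placeholders, not necessarily what the code actually uses):

```python
import numpy as np
from scipy.integrate import quad

C_KM_S = 299792.458  # speed of light, km/s
H0 = 70.0            # Hubble constant, km/s/Mpc (placeholder value)
OMEGA_M = 0.3        # matter density (placeholder value)

def comoving_distance(z):
    """Line-of-sight comoving distance in Mpc, flat LCDM."""
    integrand = lambda zp: 1.0 / np.sqrt(OMEGA_M * (1 + zp) ** 3
                                         + (1 - OMEGA_M))
    integral, _ = quad(integrand, 0.0, z)
    return (C_KM_S / H0) * integral

def radecz_to_xyz(ra_deg, dec_deg, z):
    """Convert (ra, dec, redshift) to comoving Cartesian coordinates."""
    r = comoving_distance(z)
    ra, dec = np.radians(ra_deg), np.radians(dec_deg)
    x = r * np.cos(dec) * np.cos(ra)
    y = r * np.cos(dec) * np.sin(ra)
    zc = r * np.sin(dec)
    return x, y, zc

x, y, zc = radecz_to_xyz(150.0, 2.0, 0.5)
```

The radial distance of the converted point should recover comoving_distance(z) exactly, which makes a handy unit test for the conversion.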

But the mock data I am currently testing my 3D correlation function with is actually in x-y-z coordinates to begin with. This is causing an issue because I am converting it to ra/dec/redshift, which results in a non-uniform data distribution in ra/dec/redshift space (because the mock data, unlike the Sloan data, is a contiguous box in x-y-z space). However, the randoms are generated uniformly in ra/dec/redshift space and converted to x-y-z space (to match the Sloan mask). When I compare the mock data with the randoms, they don't have the same masks because of this issue:

I think what I need to do is take the mock data in x-y-z, convert it to ra/dec/redshift, and apply a mask in that coordinate system. Then I can convert it back to x-y-z and use those points as my data, applying the same mask to the randoms. This is more similar to what I will be doing with the Sloan data, and should get my data and randoms to fall in the same region of space.
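The plan might look something like this in Python, with a toy rectangular footprint standing in for the real Sloan mask (all of the bounds here are made up):

```python
import numpy as np

def apply_radecz_mask(ra, dec, z,
                      ra_range=(100.0, 200.0),
                      dec_range=(-5.0, 5.0),
                      z_range=(0.2, 0.6)):
    """Boolean mask selecting points inside simple ra/dec/z bounds.

    A toy rectangular footprint; the real Sloan mask is far more
    complicated, but the logic of applying one mask to both data
    and randoms is the same.
    """
    keep = ((ra >= ra_range[0]) & (ra < ra_range[1]) &
            (dec >= dec_range[0]) & (dec < dec_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    return keep

# Toy points standing in for mock data converted to ra/dec/redshift
rng = np.random.default_rng(2)
ra = rng.uniform(0.0, 360.0, 10000)
dec = rng.uniform(-10.0, 10.0, 10000)
z = rng.uniform(0.0, 1.0, 10000)

keep = apply_radecz_mask(ra, dec, z)
```

The point is that the same apply_radecz_mask call is made on both the mock data (after converting it to ra/dec/redshift) and the randoms, before converting the survivors back to x-y-z for the correlation calculation.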

Tuesday, September 8, 2009


Dearest C++,
What is the point of making me allocate memory for my structures and arrays if you won't tell me when I am trying to write outside the bounds of those objects? I hate you sometimes, C++. You are so fast at doing my calculations, but you make it so difficult to find mistakes.

Your frustrated friend,

P.S. Program received signal: “EXC_BAD_ACCESS”?

Monday, September 7, 2009

Labor Day

I choose to celebrate Labor Day by not posting on my blog.

Friday, September 4, 2009


My undergraduate research adviser Dan Snowden-Ifft always told me that the best and worst thing about computers is that they always do exactly what you tell them to do. If what they are doing isn't what you expect, then it is because you told them to do something wrong. This applies to my current situation with the randoms not matching: I was in fact asking the computer to print out the wrong numbers, so of course they didn't match. I am an idiot sometimes. Here are some nice plots of it working now:

Thursday, September 3, 2009

Random Problems

My 3D correlation function matches Alexia's (to within 10^-9, which I assume is due to rounding differences). However, when I print out the random numbers used to calculate the correlation function, they stop matching halfway through the calculation (I am using the same seeds in both runs):

What is even more mysterious is that the number of random values generated differs by 10. This makes no sense because the input files for both functions are the same, and both have the same number of mock data points. Do I spend time tracking down this issue, or let it go, since it doesn't really affect the end result?

Wednesday, September 2, 2009

Precision Comparisons

I am still meticulously implementing the changes to my 3D correlation function (Xi) and constantly comparing it to the working version (Alexia's code). I haven't found where the breakdown is occurring. However, I do have some pretty plots which show how precisely these correlation functions match:

My correlation function lies exactly underneath Alexia's, so you can't see any distinction. I did this by seeding the random number generators identically. Below is a plot of the difference between our 3D correlation functions. Notice the 10^-9 at the top: this is due to rounding errors in the floating-point numbers.

I just need to keep them working this well, while continuing to add the new geometry. Wish me luck!

Tuesday, September 1, 2009

Masking Difficulties

The first change I implemented in the 3D correlation function was setting up the masks in two coordinate systems. In the 2D code the mask is simply in ra and dec (because we are taking an angular correlation function in those dimensions). In the 3D code the correlation calculation is done in comoving coordinates; however, the data mask is still in ra and dec because this is how we scan the sky. Therefore the continuous space the data lives in is ra, dec, redshift, but the space in which we do the correlation calculation is x, y, z. Because we need to apply the same mask to the randoms in our correlation function as we do to the data, I need to apply a mask in ra, dec, redshift space... but then convert to x, y, z space for the calculation. I had been thinking that my problems were somewhere in this conversion. However, in the first set of changes I made, I just added two masks (in the two coordinate systems) instead of one, and I got the following result:

I am really confused how my correlation function could be off by over 10 orders of magnitude from simply changing the number of input masks. I am not actually changing the values of the masks between this version of the code and the last version I plotted. Both take a data set which is contiguous in x, y, z, and therefore use a mask in x, y, z for the randoms, without yet changing to ra, dec, redshift space. There should not be any difference in the actual calculation. Time to revert to the "working" version and implement the masks more slowly, I guess. I hate this!