Monday, October 12, 2009

Everything Working (finally)

Princeton (and Alexia) make everything better! It seems I finally have a working 3D correlation function. I don't know why it took me so long to get this thing to work; it seems like it should be simple enough to do. Anyway, here is a summary of what I've done...

The inputs to the function are a set of mock "data" points in Cartesian coordinates. For Sloan data, these will be converted from ra/dec (spherical coordinates) to Cartesian in Python. There are also mask inputs, both in spherical and Cartesian coordinates.
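As a rough illustration of the ra/dec to Cartesian step, here is a minimal sketch. The function name `radec_to_cartesian` is made up for this example, and it assumes that each point already has a radial distance `r` (for example a comoving distance computed from the redshift with whatever cosmology you prefer); that redshift-to-distance step is not shown.

```python
import numpy as np

def radec_to_cartesian(ra_deg, dec_deg, r):
    """Convert ra/dec (in degrees) plus a radial distance r to x/y/z.

    Assumption: r is a distance already derived from redshift;
    the cosmology used for that is outside this sketch.
    """
    ra = np.radians(ra_deg)
    dec = np.radians(dec_deg)
    x = r * np.cos(dec) * np.cos(ra)
    y = r * np.cos(dec) * np.sin(ra)
    z = r * np.sin(dec)
    return x, y, z
```

With this convention, ra = 0, dec = 0 points along +x, and dec = 90 points along +z.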

For the mock data I simply applied a mask cut on ra/dec/redshift and then converted those points back to x/y/z:


The mask in ra/dec/redshift is a contiguous box.


Converted to x/y/z

The mask holds the minimum and maximum values that the data can take in each coordinate. For example, in Cartesian coordinates the mask contains:
In [541]: minX
Out[541]: 173.568011004

In [542]: maxX
Out[542]: 449.984440618

In [543]: minY
Out[543]: -289.251614958

In [544]: maxY
Out[544]: 289.251614958

In [545]: minZ
Out[545]: -190.178217783

In [546]: maxZ
Out[546]: 190.178217783

The data is then scaled down to a 1x1x1 box (this is what the Alexia/Martin correlation function calculation code is expecting). This is done by doing the following to each dimension of each data point (i):

posX[i] = (dataX[i] - minX + padding*rmax/2)/maxBoxside

where dataX[i] is the x value of the ith data point, minX is the minimum value that x can be (from the mask), padding is how much padding you want around the edge of your data (this prevents power from being "wrapped" around as the correlation calculation uses periodic boundary conditions), rmax is the maximum distance you are calculating the correlation function out to, and maxBoxside is the length of the longest side of the databox [i.e. max(maxX - minX, maxY - minY, maxZ - minZ)].
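The scaling above can be sketched in vectorized form as follows. This is only an illustration of the formula as written, not the actual code; the function name `scale_to_unit_box` and the array layout (one row per point, columns x/y/z) are assumptions.

```python
import numpy as np

def scale_to_unit_box(data, mins, maxs, padding, rmax):
    """Apply pos = (data - min + padding*rmax/2) / maxBoxside per dimension.

    data : array of shape (N, 3) with x/y/z columns (assumed layout)
    mins, maxs : length-3 mask limits (minX, minY, minZ) and (maxX, maxY, maxZ)
    """
    data = np.asarray(data, dtype=float)
    mins = np.asarray(mins, dtype=float)
    maxs = np.asarray(maxs, dtype=float)
    # Longest side of the data box, used for all three dimensions
    # so that the axes keep the same units after scaling.
    max_boxside = np.max(maxs - mins)
    return (data - mins + padding * rmax / 2.0) / max_boxside
```

Dividing every dimension by the same `max_boxside` keeps distances isotropic. Note that with `max_boxside` taken as the unpadded longest side, a point at the upper edge of the longest dimension scales to slightly more than 1 whenever `padding*rmax` is nonzero, so the divisor may need to include the padding as well; that is a guess about the real code, not something stated here.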

Because the data falls within a contiguous ra/dec/redshift box, I populate the randoms in the ra/dec/redshift mask and then convert them to x/y/z using the same conversion as I use on the data. I then apply the same scaling (as described in the previous paragraph) to the x/y/z randoms. The result is data which falls on top of the randoms and is contained in a padded 1x1x1 box:


As you can see all the data falls between 0 and 1 and the data falls on top of the randoms.

It is hard to see from the above plots but I would also like the redshift distribution of the randoms to follow that of the data. This is done by binning the data into redshift bins (20 in the example I am plotting here) and then for every data point in a particular bin, I generate 10 random points in the same bin.



Histogram of the number of points in each redshift bin.
I multiplied the data by 10 so that the scale is the same as the randoms.
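The redshift matching described above could look something like this sketch. The function name `random_redshifts` is hypothetical, and drawing the random redshifts uniformly within each bin is an assumption; only the binning and the 10-randoms-per-data-point ratio come from the description.

```python
import numpy as np

def random_redshifts(data_z, z_min, z_max, n_bins=20, n_per_data=10, seed=None):
    """Draw random redshifts that follow the binned distribution of data_z.

    For every data point in a redshift bin, n_per_data random redshifts
    are drawn in that same bin (uniformly within the bin: an assumption).
    """
    rng = np.random.default_rng(seed)
    edges = np.linspace(z_min, z_max, n_bins + 1)
    counts, _ = np.histogram(data_z, bins=edges)
    randoms = [
        rng.uniform(edges[i], edges[i + 1], size=n_per_data * counts[i])
        for i in range(n_bins)
    ]
    return np.concatenate(randoms), edges
```

Histogramming the returned redshifts with the same `edges` should give 10 times the data counts in each bin, which is the comparison shown in the histogram.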

Once I have both the data and the randoms in a padded 1x1x1 box (making sure the randoms follow the same redshift distribution as the data), then the 3D correlation function can be calculated. This calculation is done in Cartesian coordinates, but this shouldn't matter because we are just looking at distances of points from each other and so as long as each side of the box is in the same units we are good.
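To make the "distances between points" idea concrete, here is a brute-force sketch of a correlation function estimate from data and randoms. It is not the Alexia/Martin code: the estimator used there isn't described here, so the common Landy-Szalay form is swapped in, and this version ignores periodic boundaries and is O(N^2) in memory, so it is only suitable for small test sets. The names `pair_counts` and `landy_szalay` are made up for the example.

```python
import numpy as np

def pair_counts(a, b, edges, auto=False):
    """Histogram of pair separations between point sets a and b, each (N, 3)."""
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1))
    if auto:
        # Same set on both sides: keep each unordered pair once, drop self-pairs.
        d = d[np.triu_indices(len(a), k=1)]
    counts, _ = np.histogram(d.ravel(), bins=edges)
    return counts.astype(float)

def landy_szalay(data, randoms, edges):
    """Landy-Szalay estimate (DD - 2DR + RR) / RR with normalized pair counts."""
    nd, nr = len(data), len(randoms)
    dd = pair_counts(data, data, edges, auto=True) / (nd * (nd - 1) / 2.0)
    rr = pair_counts(randoms, randoms, edges, auto=True) / (nr * (nr - 1) / 2.0)
    dr = pair_counts(data, randoms, edges) / (nd * nr)
    # Bins with no random pairs give nan/inf; choose edges to avoid that.
    return (dd - 2.0 * dr + rr) / rr
```

Here `data` and `randoms` would be the scaled positions in the padded box and `edges` the separation bins out to rmax, all in the same box units.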

Here is a comparison of my 3D correlation function with Alexia's working 3D function. The reason they don't fall exactly on top of each other is that Alexia's is calculated on different data points (in the same mock catalog), because hers requires a Cartesian mask for the data:


My working 3D correlation function!

Now I get to run it on the Sloan data and see if the reconstruction still fails or if that fixes my problem. If it works, I am done with my PhD thesis (well not really, but it would be huge progress). Let's keep our fingers crossed!
