Friday, May 20, 2011

New Project

Because things aren't yet working with my reconstruction project, David has suggested I work on something else in parallel so that if I end up not being able to get a result with the reconstruction, I still have another option for my thesis.

Here is the new project description:

SDSS-III Project 127: Cross-correlation of BOSS spectroscopic quasars on Stripe 82 with photometric CFH galaxies.

Jessica Kirkpatrick
Martin White, David Schlegel, Nic Ross, Alexie Leauthaud, Jean-Paul Kneib

Categories: BOSS

Project Description:
We plan to measure the intermediate scale clustering of low redshift BOSS quasars along Stripe 82 by cross-correlation against the photometric galaxy catalog from the CFH i-band imaging on Stripe 82. There are approximately 1,000 BOSS quasars with 0.5<z<1 and just under 6 million galaxies in the -43<RA<43 and -1<DEC<1 region brighter than i=23.5, which should lead to a strong detection of clustering over approximately 2 orders of magnitude in length scale. The geometry of the stripe suggests errors on the cross-correlation can be efficiently obtained by jackknife or bootstrap sampling the ~50 2x2 degree blocks.

We intend to split the QSO sample in luminosity and black hole mass. We plan to estimate the BH mass using the fits from Vestergaard and Peterson, which require the Hbeta line width and the continuum luminosity at 5100A. The pipeline measures the former; we plan to measure the latter from the photometry calibrated with Ian McGreer's mocks.

If we use the QG/QR-1 estimator we do not need the quasar mask, only that of the galaxies which will be provided by the CS82 team in the form of a pixelized mask from visual inspection. The dN/dz of the galaxies is known from photometric redshifts plus spectroscopic training sets. While the galaxies could be split in photometric redshift bins, the gains from doing so are not expected to be large, so our initial investigations will simply cross-correlate the quasars with the magnitude limited galaxy catalog.

Along with this project we will submit a request for EC status for:

Ludo van Waerbeke
Hendrik Hildebrandt
David Woods
Thomas Erben

who were instrumental in obtaining and reducing the CS82 data and producing the required galaxy catalog and mask but are not members of the BOSS collaboration.

Details are available at

The first thing Martin wanted me to do was to approximate the error bars for the correlation function based on the density of galaxies and quasars in my sample.

Starting with the luminosity function in this paper, I am doing the following to estimate the density.

According to Table 1 in Ilbert et al., the following are the Schechter parameters for the galaxy luminosity function at redshift 0.6-0.8:

ϕ* = 5.01e-3 h^3 Mpc^(-3) mag^(-1)
M* (i-band) = -22.17
α = -1.41

Inputting these into IDL's lf_schechter function, we get the following:

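; magnitude grid from -23 to -18 in steps of 0.05 mag (101 points)
mags = findgen(5*20+1)/20. - 23
; phi* is passed in units of 10^-3 h^3 Mpc^-3 mag^-1 and rescaled afterwards
vals = lf_schechter(mags, 5.01, -22.17, -1.41)
plot, mags, vals*10D^(-3), /ylog, XTITLE = 'Magnitude (iband)', YTITLE = 'log phi', TITLE = 'Luminosity Function', charsize = 2, charthick = 1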

To get the density we integrate:
density = 0.05*vals*10D^(-3) ; bin size is 0.05 mags
print, total(density)
= 4.18 * 10^-2 h^3 Mpc^-3
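
The same sum can be cross-checked without idlutils. The following is a minimal Python sketch, not the IDL used above. It assumes that lf_schechter evaluates the standard magnitude form of the Schechter function, including the 0.4 ln(10) factor; the function name schechter_mag is made up for this illustration. Under that assumption the total should land close to the 4.18 * 10^-2 quoted above.

```python
import math

# Schechter parameters quoted above (Ilbert et al., Table 1, z = 0.6-0.8)
PHI_STAR = 5.01e-3   # h^3 Mpc^-3 mag^-1
M_STAR = -22.17      # i-band
ALPHA = -1.41


def schechter_mag(abs_mag, phi_star, m_star, alpha):
    """Schechter function per unit magnitude (assumed form, see lead-in)."""
    x = 10.0 ** (-0.4 * (abs_mag - m_star))  # luminosity in units of L*
    return 0.4 * math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)


STEP = 0.05  # mag, same bin size as the IDL snippet
mags = [-23.0 + STEP * i for i in range(101)]  # -23 to -18 inclusive

# Simple Riemann sum, mirroring total(0.05 * vals * 1e-3)
number_density = STEP * sum(
    schechter_mag(m, PHI_STAR, M_STAR, ALPHA) for m in mags
)
print(f"number density = {number_density:.3e} h^3 Mpc^-3")
```

Because the faint-end slope is steeper than -1, the result depends strongly on the faint limit of the sum (here M = -18), so it is best read as the density of galaxies brighter than that limit.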

This is similar to what we get from this paper by Faber et al.
According to the Faber paper, the B-band luminosity density at redshift 0.7 is
log10(j_B) = 8.5, with j_B in solar luminosities per Mpc^3. Taking a typical galaxy luminosity of 10^10 solar luminosities / galaxy:

j_B = 10^8.5 solar lum / Mpc^3 = 10^(-1.5) galaxies / Mpc^3 (h = 0.7)
= 3.16*10^-2 galaxies / Mpc^3 (h = 0.7)
= 9.21*10^-2 galaxies h^3 / Mpc^3
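
The unit conversion in the last step is easy to get backwards, so here is a small Python sketch of the arithmetic (Python rather than IDL, purely for illustration). It uses only the numbers quoted above; the 10^10 solar luminosities per galaxy is the same rough assumption made in the text, and the variable names are invented for this example.

```python
# Luminosity density from the Faber paper (h = 0.7), as quoted above
LOG_J_B = 8.5            # log10 of j_B in solar luminosities per Mpc^3
LUM_PER_GALAXY = 1.0e10  # assumed typical galaxy luminosity, solar units
H = 0.7

# Galaxies per physical-unit Mpc^3
n_per_mpc3 = 10.0 ** LOG_J_B / LUM_PER_GALAXY

# One (Mpc/h)^3 is a larger volume than one Mpc^3 by a factor 1/h^3,
# so it contains more galaxies: divide the density by h^3.
n_per_mpc3_h3 = n_per_mpc3 / H ** 3

print(f"{n_per_mpc3:.3e} galaxies / Mpc^3 (h = 0.7)")
print(f"{n_per_mpc3_h3:.3e} galaxies h^3 / Mpc^3")
```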

So we are looking at a density of something in the ballpark of 0.05 galaxies per (Mpc/h)^3.

To get it directly from the data:

5.5254 million galaxies (once mask/cuts applied)
sky area = 166 deg^2
redshift range = 0 - 1
redshift 1 = 2312.67 Mpc / h (in comoving distance)
volume = 166 sq-deg / (3282.90 sq-deg) * 4/3 pi r^3 = 2,619,871,820 (Mpc / h)^3

density = 2.00 *10^-3 galaxies * h^3 * Mpc^(-3)

This is an order of magnitude smaller. Not sure why... Am I perhaps doing the volume calculation incorrectly?

4257 quasars (out to a redshift of 1, in the CFHT footprint)
density = 1.66e-06 qsos * h^3 * Mpc^(-3)
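
On the volume question: 3282.8 sq-deg is the number of square degrees in one steradian, while the full sky covers about 41,253 sq-deg. Scaling a full sphere by 166/3282.90 therefore looks like it overstates the volume by a factor of 4 pi. The following is a minimal Python sketch of that check (not IDL), using only the numbers quoted above; the variable names are illustrative. The comparison with the luminosity-function estimates stays rough in any case, because the catalog and the Schechter fit cover different magnitude limits and redshift ranges.

```python
import math

# Numbers quoted in the post
AREA_SQDEG = 166.0   # survey area
R_MAX = 2312.67      # comoving distance to z = 1, Mpc/h
N_GAL = 5.5254e6     # galaxies after mask/cuts
N_QSO = 4257         # quasars out to z = 1 in the footprint

SQDEG_PER_STERADIAN = (180.0 / math.pi) ** 2           # about 3282.8
FULL_SKY_SQDEG = 4.0 * math.pi * SQDEG_PER_STERADIAN   # about 41253

sphere_volume = 4.0 / 3.0 * math.pi * R_MAX ** 3

# Normalisation used in the post: area divided by one steradian
volume_as_posted = AREA_SQDEG / 3282.90 * sphere_volume
# Fraction of the full sky times the volume of the full sphere
volume_sky_fraction = AREA_SQDEG / FULL_SKY_SQDEG * sphere_volume

print(f"volume as posted      : {volume_as_posted:.4e} (Mpc/h)^3")
print(f"volume, sky fraction  : {volume_sky_fraction:.4e} (Mpc/h)^3")
print(f"ratio                 : {volume_as_posted / volume_sky_fraction:.3f}")
print(f"galaxy density        : {N_GAL / volume_sky_fraction:.3e} h^3 Mpc^-3")
print(f"quasar density        : {N_QSO / volume_sky_fraction:.3e} h^3 Mpc^-3")
```

With the sky-fraction normalisation the galaxy density comes out at a few times 10^-2 h^3 Mpc^-3, much closer to the luminosity-function estimates above.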

Exchange with Martin White, RE: Estimating Errors

I've figured out the density of the galaxies/qsos from both the LFs and the catalogs, and would like to estimate the errors in the correlation function. I understand that the errors go as 1 / sqrt(pair counts) in each bin. But going from galaxy/qso density to pair counts in a bin is where I am a bit lost. I asked Alexie, and she thought that I just take the density and multiply it by the volume of each bin in the correlation function, and then use that number to compute the pair counts. I don't see how that is the same as the pair counts for the correlation function. The correlation function is measuring separation, so how is that the same as the number of pairs in a volume with a side of the separation?

I went ahead and calculated the correlation function with the following bins (degrees):
theta = (1.0000000e-05, 3.1622777e-05, 0.00010000000, 0.00031622777, 0.0010000000, 0.0031622777, 0.010000000, 0.031622777, 0.10000000, 0.31622777, 1.0000000)

I get the following number of dd pairs in each bin:
dd = (589.000, 561.000, 49.0000, 386.000, 4675.00, 41535.0, 394556, 3.81242e+06, 3.60765e+07, 2.94166e+08)

1/sqrt(dd) = (0.0412043, 0.0422200, 0.142857, 0.0508987, 0.0146254, 0.00490674, 0.00159201, 0.000512153, 0.000166490, 5.83048e-05)

The catalogs are not properly masked, so we might get a reduction in dd values by perhaps 30% once we mask the data.

This would result in the following:

dd = (412.300, 392.700, 34.3000, 270.200, 3272.50, 29074.5, 276189., 2.66869e+06, 2.52536e+07, 2.05916e+08)

1/sqrt(dd) = (0.0492485, 0.0504626, 0.170747, 0.0608355, 0.0174808, 0.00586467, 0.00190282, 0.000612140, 0.000198993, 6.96875e-05)
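
The two sets of numbers above can be reproduced with a few lines. This is a minimal Python sketch (not IDL) that takes the dd counts from the post and treats the 30% masking loss as a uniform scaling of every bin, which is only the rough guess stated above. The names poisson_fractional_error and MASK_FRACTION_KEPT are invented for this example.

```python
import math

# dd pair counts per angular bin, copied from the post
dd = [589.0, 561.0, 49.0, 386.0, 4675.0, 41535.0,
      394556.0, 3.81242e6, 3.60765e7, 2.94166e8]

MASK_FRACTION_KEPT = 0.7  # assumed: masking removes ~30% of pairs in every bin


def poisson_fractional_error(counts):
    """Fractional Poisson error, 1/sqrt(N), for each pair count."""
    return [1.0 / math.sqrt(c) for c in counts]


err = poisson_fractional_error(dd)
dd_masked = [MASK_FRACTION_KEPT * c for c in dd]
err_masked = poisson_fractional_error(dd_masked)

for n, e, n_m, e_m in zip(dd, err, dd_masked, err_masked):
    print(f"dd={n:12.1f}  err={e:.6g}   masked dd={n_m:12.1f}  err={e_m:.6g}")
```

A uniform 30% loss inflates every error by the same factor, 1/sqrt(0.7), or roughly 20%, so it does not change which bins are well measured.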

I'm currently running the rr, and will have a "correlation function" this afternoon, although it will of course be wrong because I'm not masking properly yet. I should get the masks from Alexie today or tomorrow.




Martin's reply, quoting my question ("I understand that the errors go as 1 / sqrt(pair counts) in each bin. But going from galaxy/qso density to pair counts in a bin is where I am a bit lost."):

If you think of the very simplest correlation function estimator that you can write down, xi = DD/RR - 1, and imagine that you have so many randoms that the fluctuations in RR are negligible, then you see that the errors in 1+xi are given by the fluctuations in the counts of DD in a bin. Assuming Poisson statistics, the fractional error in 1+xi goes as 1/sqrt(Npair), where Npair is the number of quasar-galaxy pairs.

For a 3D correlation function the number of data pairs goes as Nqso times Nbar-galaxy times 1+xi times the volume of the bin (in 3D, e.g. 4 pi s^2 ds for a spherical shell). Just think of what the code does: sit on each quasar and count all the galaxies in the bin. To go from a 3D correlation function to a 2D correlation function you need to integrate in the Z direction. But remember that the sum of independent Poisson distributions is also a Poisson with a mean equal to the sum of the means of the contributing parts. So this allows you to figure out what the error on wp is. You should see that as you integrate to very large line-of-sight distance things become noisier. So choose something like +/-50 Mpc/h for the width in line-of-sight distance to integrate over in defining wp.

It's a little easier to understand if you write the defining equations out for yourself on a piece of paper.
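
The pair-count argument above can be written out as a short calculation. What follows is a minimal Python sketch under stated assumptions, not a definitive error model. It uses thin shells and annuli, and the projected case ignores clustering (xi = 0), so it gives only the unclustered baseline. The quasar count and the ballpark galaxy density are taken from earlier in the post and applied to the whole sample at once. The function names pairs_in_shell and pairs_in_annulus, and the bin choices rp = 1 Mpc/h and drp = 0.1 Mpc/h, are invented for illustration.

```python
import math


def pairs_in_shell(n_qso, nbar_gal, s, ds, xi=0.0):
    """Expected quasar-galaxy pairs in a thin spherical shell (radius s, width ds)."""
    shell_volume = 4.0 * math.pi * s ** 2 * ds
    return n_qso * nbar_gal * (1.0 + xi) * shell_volume


def pairs_in_annulus(n_qso, nbar_gal, rp, drp, pi_max):
    """Expected unclustered pairs in a thin annulus (rp, drp), |line of sight| < pi_max."""
    cylinder_volume = 2.0 * math.pi * rp * drp * 2.0 * pi_max
    return n_qso * nbar_gal * cylinder_volume


N_QSO = 4257      # quasars in the footprint, from the post
NBAR_GAL = 0.05   # ballpark galaxy density from the post, h^3 Mpc^-3

n_shell = pairs_in_shell(N_QSO, NBAR_GAL, s=1.0, ds=0.1)
print(f"3D shell: Npair={n_shell:.1f}  1/sqrt(Npair)={1.0 / math.sqrt(n_shell):.4f}")

# Sum of independent Poisson counts along the line of sight is Poisson,
# so the projected count and its scatter follow from the summed mean.
for pi_max in (25.0, 50.0, 100.0):
    n_pairs = pairs_in_annulus(N_QSO, NBAR_GAL, rp=1.0, drp=0.1, pi_max=pi_max)
    print(f"pi_max={pi_max:5.0f}  Npair={n_pairs:9.1f}  sqrt(Npair)={math.sqrt(n_pairs):6.1f}")
```

In this baseline the Poisson scatter in the pair count grows as sqrt(pi_max), while the clustered excess is concentrated at small line-of-sight separation. That is one way to see why integrating to very large line-of-sight distance makes the measurement noisier.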


Project Reading list
