Showing posts with label cross-correlation. Show all posts

Thursday, June 9, 2011

More QSO-Galaxy Cross Correlations

Here are the QSO-galaxy cross-correlations comparing high- and low-redshift QSOs. I divide the QSO sample in half, (0.5 < z < 0.77) and (0.77 < z < 1.0):

(plots on wiki)

Here are the QSO-galaxy cross-correlations comparing bright and dim QSOs. I divide the QSO sample into the 1/3 brightest (16.17 < g-mag < 20.20) and 1/3 dimmest (25.87

(plots on wiki)
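A minimal sketch of how the two splits above could be made, assuming the halves are defined by the median redshift and the thirds by magnitude percentiles. The arrays `z` and `gmag` are synthetic stand-ins, not the real QSO catalog.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for the QSO catalog columns (NOT the real data)
z = rng.uniform(0.5, 1.0, size=3000)
gmag = rng.normal(21.0, 1.0, size=3000)

# Redshift halves: split at the median redshift
z_split = np.median(z)
low_z = z <= z_split
high_z = ~low_z

# Magnitude thirds: a smaller magnitude means a brighter object
bright_cut, dim_cut = np.percentile(gmag, [100.0 / 3.0, 200.0 / 3.0])
bright = gmag < bright_cut
dim = gmag > dim_cut

print(low_z.sum(), high_z.sum(), bright.sum(), dim.sum())
```

The middle third is simply left out of the bright/dim comparison.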

It's pretty interesting that there seems to be a difference in the correlation functions for bright/dim objects. I am not sure if we should expect this. The redshift distributions of the two samples are similar, which suggests that the bright objects are intrinsically brighter, not just at a different distance. I'm hoping to redo this split in terms of high/low luminosity, but I need to figure out the absolute magnitudes of these objects first. I have pinged Ross about this. More to come...

Monday, June 6, 2011

Testing QSO-Galaxy Cross CF

I spent the last couple days making sure that I trusted the correlation function code I had developed for this project. I compared it to the results from a (slow) sphere-match code on a small data set.

The following are the testing codes (also here ../logs/cross_corr_check.idl). In IDL using spherematch:

; Read in data/random files
readcol,'cfht_data_tiny.dat',rad1,decd1, format = ('D,D')
readcol, 'qso_data_tiny.dat',rad2,decd2,cd, format = ('D,D,D') ;cd is comoving distance of qsos
readcol,'cfht_randoms_tiny.dat',rar1,decr1, format = ('D,D')

; Find number of objects in files
nr1 = n_elements(rar1)
nd1 = n_elements(rad1)
nd2 = n_elements(rad2)

; Correlate out to a large theta to get all pairs
thetamax = 50.0

print,'Starting cross-correlation:'
print,'Estimating DD...'
spherematch,rad1,decd1,rad2,decd2,thetamax,ind1a,ind1b,dist_dd,maxmatch=0

; Convert from angular separation to comoving distance separation
this_dd = 2*sin(dist_dd*!pi/360)*cd[ind1b]

;Bins go from 0.1 to 10 with 15 bins.
corrmin = 0.1D
corrmax = 10.0D
nbins = 15.0D

; Find bins lower, upper and centers
bins_lower = (corrmax-corrmin)/(nbins)*findgen(nbins)+corrmin
bins_upper = (corrmax-corrmin)/(nbins)*(findgen(nbins)+1)+corrmin
rmean = fltarr(nbins)
for i = 0,(nbins-1) do rmean[i] = (bins_lower[i]+bins_upper[i])/2.

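; Bin the DD separation distances. total() of the boolean mask gives 0 for an
; empty bin, whereas n_elements(where(...)) gives 1 because where returns -1.
dd = fltarr(nbins)
for i = 0,(nbins-1) do dd[i] = total(this_dd gt bins_lower[i] AND this_dd le bins_upper[i])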

print,'Estimating DR...'
spherematch,rar1,decr1,rad2,decd2,thetamax,ind1,ind2,dist_dr1,maxmatch=0

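this_dr = 2*sin(dist_dr1*!pi/360)*cd[ind2]
; Bin the DR separations with the same (lower, upper] edges as DD
dr = fltarr(nbins)
for i = 0,(nbins-1) do dr[i] = total(this_dr gt bins_lower[i] AND this_dr le bins_upper[i])

; Estimator: (DD/DR)*(N_random/N_galaxy) - 1 (the nd2 factors cancel)
corr1 = (dd/dr)*(double(nr1)/nd1) - 1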

for i = 0,(nbins-1) do print, rmean[i], corr1[i]

Separation omega
0.430000 -0.115686
1.09000 -0.104478
1.75000 -0.120804
2.41000 -0.0914845
3.07000 -0.0393971
3.73000 -0.0268416
4.39000 0.0134841
5.05000 0.0596094
5.71000 0.0227162
6.37000 0.102554
7.03000 0.0929233
7.69000 0.0900670
8.35000 0.0591398
9.01000 0.0284724
9.67000 0.0598689

(Note that these "tiny" files only have 6 qsos and 9000 galaxies, so the correlation function values are very noisy. This was just a test, and I used small files because the spherematch code is slow.)
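For reference, here is a brute-force Python sketch of the same DD/DR estimator. It is not the code used for the results in this post: the catalogs are synthetic uniform points, the function names are made up for illustration, and the fixed comoving distance of 1500 Mpc/h is an arbitrary assumption.

```python
import numpy as np

def pair_separations(ra_gal, dec_gal, ra_qso, dec_qso, chi_qso):
    """Chord separation 2*chi*sin(theta/2) for every galaxy-QSO pair (angles in degrees)."""
    ra_g, dec_g = np.radians(ra_gal)[:, None], np.radians(dec_gal)[:, None]
    ra_q, dec_q = np.radians(ra_qso)[None, :], np.radians(dec_qso)[None, :]
    # Haversine: hav = sin^2(theta/2)
    hav = (np.sin((dec_g - dec_q) / 2.0) ** 2
           + np.cos(dec_g) * np.cos(dec_q) * np.sin((ra_g - ra_q) / 2.0) ** 2)
    return (2.0 * np.sqrt(hav) * np.asarray(chi_qso)[None, :]).ravel()

def cross_corr(gal, rand, qso, edges):
    """(DD/DR) * (N_random/N_galaxy) - 1 in bins of transverse separation."""
    dd, _ = np.histogram(pair_separations(*gal, *qso), bins=edges)
    dr, _ = np.histogram(pair_separations(*rand, *qso), bins=edges)
    return dd / dr * (len(rand[0]) / len(gal[0])) - 1.0

rng = np.random.default_rng(42)

def uniform_patch(n):
    # Synthetic points on a 4 x 2 degree patch near the equator (NOT real data)
    return rng.uniform(0.0, 4.0, n), rng.uniform(-1.0, 1.0, n)

gal = uniform_patch(9000)
rand = uniform_patch(45000)  # randoms oversampled by a factor of 5
qso = (rng.uniform(1.0, 3.0, 20), rng.uniform(-0.5, 0.5, 20), np.full(20, 1500.0))

edges = np.linspace(0.1, 10.0, 16)  # 15 linear bins from 0.1 to 10 Mpc/h
w = cross_corr(gal, rand, qso, edges)
```

Because the "galaxies" here are themselves random points, `w` should scatter around zero. Note that `np.histogram` bins are half-open `[lower, upper)` except for the last one, which differs slightly from the edge handling in the IDL listing.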

By comparison I also have the python/C code which runs much faster (../Jessica/qsobias/Correlate/runCorrelation.py):

import numpy as N
from pylab import *
from correlationFunctions import *

#------------------------------------------------------------------------
# Create file names (tiny catalogs)
#------------------------------------------------------------------------
workingDir = 'tinyrun'
makeworkingdir(workingDir)
galaxyDataFile, qsoDataFile, randomDataFile, corr2dCodefile, argumentFile, runConstantsFile = makeFileNamesTiny(workingDir)

oversample = 5. # Amount that randoms should be oversampled
corrBins = 25.0 # Number of correlation bins (+1)
mincorr = 0.1 # (Mpc/h comoving distance separation) Must be greater than zero if log-binning
maxcorr = 10.0 # (Mpc/h comoving distance separation)
convo = 180./pi # conversion factor from radians to degrees
tlogbin = 1 # = 0 for uniform spacing, = 1 for log spacing in theta

#------------------------------------------------------------------------
# Write run constants to a file
#------------------------------------------------------------------------
writeRunConstantsToFile(runConstantsFile, galaxyDataFile, qsoDataFile, \
    randomDataFile, corr2dCodefile, argumentFile, oversample, corrBins, \
    mincorr, maxcorr, tlogbin)

#------------------------------------------------------------------------
# Compute the Angular Correlation Function
#------------------------------------------------------------------------
runcrossCorrelation(workingDir, argumentFile, corr2dCodefile, galaxyDataFile,\
    qsoDataFile, randomDataFile, mincorr, maxcorr, corrBins, tlogbin)


# separation (Mpc/h) crossw (Mpc/h)
0.4300000000 -0.1156862745
1.0900000000 -0.1044776119
1.7500000000 -0.1208039566
2.4100000000 -0.0914845135
3.0700000000 -0.0393970538
3.7300000000 -0.0268417043
4.3900000000 0.0134841235
5.0500000000 0.0596093513
5.7100000000 0.0227161938
6.3700000000 0.1025539385
7.0300000000 0.0929232804
7.6900000000 0.0900670231
8.3500000000 0.0591397849
9.0100000000 0.0284723490
9.6700000000 0.0598689436


As you can see the correlation functions match!
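The match can also be checked numerically rather than by eye. A small sketch, using the two output columns printed above and a tolerance of 1e-6, which I picked because the IDL output is printed to about six significant figures:

```python
import numpy as np

# Correlation values printed by the IDL spherematch test
idl = np.array([-0.115686, -0.104478, -0.120804, -0.0914845, -0.0393971,
                -0.0268416, 0.0134841, 0.0596094, 0.0227162, 0.102554,
                0.0929233, 0.0900670, 0.0591398, 0.0284724, 0.0598689])

# Correlation values printed by the python/C code
py = np.array([-0.1156862745, -0.1044776119, -0.1208039566, -0.0914845135,
               -0.0393970538, -0.0268417043, 0.0134841235, 0.0596093513,
               0.0227161938, 0.1025539385, 0.0929232804, 0.0900670231,
               0.0591397849, 0.0284723490, 0.0598689436])

max_diff = np.max(np.abs(idl - py))
match = np.allclose(idl, py, rtol=0.0, atol=1e-6)
print(max_diff, match)
```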

Wednesday, June 1, 2011

Data in Order

I've finished creating the randoms for the CFHT data. They are here:

../data/jessica/alexieData/Catalogs/cfht_random_catalog.dat

So now I have the masked galaxy data:
../data/jessica/alexieData/Catalogs/cfht_data.dat

The qso data:
../data/jessica/alexieData/Catalogs/qso_data.dat

The code to make these catalogs is here:
randoms: ../Jessica/qsobias/Analysis/make_random_catalog.pro
galaxy data: ../Jessica/qsobias/Analysis/doall_cfht_se.pro
qso data: ../Jessica/qsobias/Analysis/make_qso_catalog.py

In my meeting last week, I realized that I had been calculating the correlation function incorrectly. I need to multiply the angle by the comoving distance to get a physical separation rather than an angular separation. This requires tweaking the C code a bit.
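A rough sketch of the conversion, assuming a flat LCDM cosmology. The value Omega_m = 0.3 is my assumption, not something fixed in this post, although it should land close to the z = 1 comoving distance of 2312.67 Mpc/h quoted in the May 20 post. The function names are illustrative and are not the ones in the C code.

```python
import numpy as np

C_OVER_H0 = 2997.92458  # c / (100 km/s/Mpc), in Mpc/h

def comoving_distance(z, omega_m=0.3, n_steps=2000):
    """Comoving distance in Mpc/h for flat LCDM (omega_m is an assumed value)."""
    zs = np.linspace(0.0, z, n_steps + 1)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    # Trapezoid rule for the integral of dz / E(z)
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zs))
    return C_OVER_H0 * integral

def transverse_separation(theta_deg, chi):
    """Chord length 2*chi*sin(theta/2) for an angular separation in degrees."""
    return 2.0 * chi * np.sin(np.radians(theta_deg) / 2.0)

chi_z1 = comoving_distance(1.0)
r_p = transverse_separation(0.1, chi_z1)  # separation for 0.1 degrees at z = 1
print(chi_z1, r_p)
```

For the small angles used here the chord length is essentially the same as the small-angle product chi * theta.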

Friday, May 20, 2011

New Project

Because things aren't yet working with my reconstruction project, David has suggested I work on something else in parallel so that if I end up not being able to get a result with the reconstruction, I still have another option for my thesis.

Here is the new project description:
~~~~~~~~~~~~

SDSS-III Project 127: Cross-correlation of BOSS spectroscopic quasars on Stripe 82 with photometric CFH galaxies.

Participants:
Jessica Kirkpatrick
Martin White, David Schlegel, Nic Ross, Alexie Leauthaud, Jean-Paul Kneib

Categories: BOSS

Project Description:
We plan to measure the intermediate scale clustering of low redshift BOSS quasars along Stripe 82 by cross-correlation against the photometric galaxy catalog from the CFH i-band imaging on Stripe 82. There are approximately 1,000 BOSS quasars with 0.5<z<1 and just under 6 million galaxies in the -43<RA<43 and -1<DEC<1 region brighter than i=23.5, which should lead to a strong detection of clustering over approximately 2 orders of magnitude in length scale. The geometry of the stripe suggests errors on the cross-correlation can be efficiently obtained by jackknife or bootstrap sampling the ~50 2x2 degree blocks.

We intend to split the QSO sample in luminosity and black hole mass. We plan to estimate the BH mass using the fits from Vestergaard and Peterson, knowing the Hbeta line width and the continuum luminosity at 5100A. The pipeline measures the former, we plan to measure the latter from the photometry calibrated with Ian McGreer's mocks.

If we use the QG/QR-1 estimator we do not need the quasar mask, only that of the galaxies which will be provided by the CS82 team in the form of a pixelized mask from visual inspection. The dN/dz of the galaxies is known from photometric redshifts plus spectroscopic training sets. While the galaxies could be split in photometric redshift bins, the gains from doing so are not expected to be large, so our initial investigations will simply cross-correlate the quasars with the magnitude limited galaxy catalog.

Along with this project we will submit a request for EC status for:

Ludo van Waerbeke
Hendrik Hildebrandt
David Woods
Thomas Erben

who were instrumental in obtaining and reducing the CS82 data and producing the required galaxy catalog and mask but are not members of the BOSS collaboration.

Details are available at https://www.sdss3.org/internal/publications/cgi-bin/projects.pl/display_project/127

~~~~~~~~~~~~~
The first thing Martin wanted me to do was to approximate the error bars for the correlation function based on the density of galaxies and quasars in my sample.

Starting with the luminosity function in this paper (Ilbert et al.), I am doing the following to estimate the density.

According to Table 1 in Ilbert et al., the following are the Schechter parameters for the galaxy luminosity function at redshift 0.6-0.8:

ϕ* = 5.01e-3 h^3 Mpc^(-3) mag^(-1)
M* (i-band) = -22.17
α = -1.41

Inputting these into IDL's lf_schechter function we get the following:

mags = findgen(5*20+1)/20. - 23
vals= lf_schechter(mags, 5.01, -22.17, -1.41)
plot, mags, vals*10D^(-3), /ylog, XTITLE = 'Magnitude (iband)', YTITLE = 'log phi', TITLE = 'Luminosity Function', charsize = 2, charthick = 1


To get the density we integrate:
density = 0.05*vals*10D^(-3) ; bin size is 0.05 mags
print, total(density)
0.041830996
= 4.18 * 10^-2 h^3 Mpc^-3
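The same sum can be sketched in Python, assuming lf_schechter evaluates the standard magnitude-form Schechter function. That is my assumption about the IDL routine, and `schechter_mag` below is an illustrative name rather than a library function.

```python
import numpy as np

def schechter_mag(M, phi_star, M_star, alpha):
    """Schechter luminosity function per unit magnitude."""
    x = 10.0 ** (-0.4 * (M - M_star))  # L / L*
    return 0.4 * np.log(10.0) * phi_star * x ** (alpha + 1.0) * np.exp(-x)

# Same grid as the IDL call: -23 to -18 in steps of 0.05 mag
mags = np.arange(101) / 20.0 - 23.0
phi = schechter_mag(mags, 5.01e-3, -22.17, -1.41)

# Simple Riemann sum with a bin size of 0.05 mag
density = np.sum(0.05 * phi)
print(density)  # in h^3 Mpc^-3
```

With alpha = -1.41 the integral keeps growing toward fainter magnitudes, so the answer depends on the faint limit of the grid (-18 here).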

This is similar to what they get in this paper by Faber et al. According to the Faber paper, the B-band luminosity density at redshift 0.7 is:
log10(j_B) = 8.5, i.e. j_B = 10^8.5 solar luminosities / Mpc^3 (h = 0.7)

Assuming a typical galaxy luminosity of 10^10 solar luminosities, the number density is:
n = 10^8.5 / 10^10 = 10^(-1.5) galaxies / Mpc^3 (h = 0.7)
= 3.16*10^-2 galaxies / Mpc^3 (h = 0.7)
= 9.21*10^-2 galaxies h^3 / Mpc^3

So we are looking at a density of something in the ballpark of 0.05 galaxies per (Mpc/h)^3.

To get it directly from the data:

5.5254 million galaxies (once mask/cuts applied)
sky area = 166 deg^2
redshift range = 0 - 1
redshift 1 = 2312.67 Mpc / h (in comoving distance)
volume = 166 sq-deg / (3282.90 sq-deg) * 4/3 pi r^3 = 2,619,871,820 (Mpc / h)^3

density = 2.00 *10^-3 galaxies * h^3 * Mpc^(-3)

This is an order of magnitude smaller. Not sure why.... am I perhaps doing the volume calculation incorrectly?

4257 quasars (out to redshift of 1, in the cfht footprint)
density = 1.66e-06 qsos * h^3 * Mpc^(-3)
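One possible culprit for the low catalog density, sketched below: 3282.8 is the number of square degrees in one steradian, whereas the full sky is 4*pi steradians, or about 41,253 square degrees. Dividing the area by 3282.90 and then multiplying by the full-sphere volume 4/3 pi r^3 would overestimate the volume by a factor of 4*pi. The inputs are the numbers quoted above, and `survey_volume` is an illustrative helper.

```python
import math

DEG2_PER_STERADIAN = (180.0 / math.pi) ** 2          # about 3282.8 deg^2
FULL_SKY_DEG2 = 4.0 * math.pi * DEG2_PER_STERADIAN   # about 41253 deg^2

def survey_volume(area_deg2, r_max):
    """Comoving volume of a cone covering area_deg2 out to distance r_max."""
    return (area_deg2 / FULL_SKY_DEG2) * (4.0 / 3.0) * math.pi * r_max ** 3

area = 166.0      # deg^2
r_max = 2312.67   # Mpc/h, comoving distance to z = 1
n_gal = 5.5254e6
n_qso = 4257

volume = survey_volume(area, r_max)
gal_density = n_gal / volume
qso_density = n_qso / volume

# The volume as computed in the post, dividing by deg^2 per steradian
volume_in_post = (area / 3282.90) * (4.0 / 3.0) * math.pi * r_max ** 3
print(volume, gal_density, qso_density, volume_in_post / volume)
```

If this is the issue, both densities go up by a factor of about 12.6, which would bring the galaxy density to a few times 10^-2 h^3 Mpc^-3, in line with the luminosity function estimates.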

Exchange with Martin White, RE: Estimating Errors

Martin,
I've figured out the density of the galaxies/qsos from both the LFs and the catalogs, and would like to estimate the errors in the correlation function. I understand that the errors go as 1 / sqrt(pair counts) in each bin. But going from galaxy/qso density to pair counts in a bin is where I am a bit lost. I asked Alexie, and she thought that I just take the density and multiply it by the volume of each bin in the correlation function, and then use that number to compute the pair counts. I don't see how that is the same as the pair counts for the correlation function. The correlation function is measuring separation, so how is that the same as the number of pairs in a volume with a side of the separation?


I went ahead and calculated the correlation function with the following bins (degrees):
theta = (1.0000000e-05, 3.1622777e-05, 0.00010000000, 0.00031622777, 0.0010000000, 0.0031622777, 0.010000000, 0.031622777, 0.10000000, 0.31622777, 1.0000000)

I get the following number of dd pairs in each bin:
dd = (589.000, 561.000, 49.0000, 386.000, 4675.00, 41535.0, 394556, 3.81242e+06, 3.60765e+07, 2.94166e+08)

1/sqrt(dd) = (0.0412043, 0.0422200, 0.142857, 0.0508987, 0.0146254, 0.00490674, 0.00159201, 0.000512153, 0.000166490, 5.83048e-05)

The catalogs are not properly masked, so we might get a reduction in dd values by perhaps 30% once we mask the data.

This would result in the following:

dd = (412.300, 392.700, 34.3000, 270.200, 3272.50, 29074.5, 276189., 2.66869e+06, 2.52536e+07, 2.05916e+08)

1/sqrt(dd) = (0.0492485, 0.0504626, 0.170747, 0.0608355, 0.0174808, 0.00586467, 0.00190282, 0.000612140, 0.000198993, 6.96875e-05)

I'm currently running the rr, and will have a "correlation function" this afternoon. Although this will of course be wrong, because I'm not masking properly yet. I should get the masks from Alexie today or tomorrow.

Jessica
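The error estimates in the message above are just Poisson arithmetic on the pair counts. A short sketch that reproduces them, where the 30% reduction is the same rough guess for the effect of masking as in the message:

```python
import numpy as np

# DD pair counts per angular bin, as listed in the message
dd = np.array([589.0, 561.0, 49.0, 386.0, 4675.0, 41535.0, 394556.0,
               3.81242e6, 3.60765e7, 2.94166e8])

# Poisson fractional error per bin
frac_err = 1.0 / np.sqrt(dd)

# Guess: masking removes about 30% of the pairs
dd_masked = 0.7 * dd
frac_err_masked = 1.0 / np.sqrt(dd_masked)

for n, e, em in zip(dd, frac_err, frac_err_masked):
    print(f"{n:12.1f} {e:12.6g} {em:12.6g}")
```

Losing 30% of the pairs only inflates each error by 1/sqrt(0.7), about 20%.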

~~~~~~

Jessica,

I understand that the errors go as 1 / sqrt(pair counts) in each bin. But going from galaxy/qso density to pair counts in a bin is where I am a bit lost.

If you think of the very simplest correlation function estimator that you can write down, xi=DD/RR-1, and imagine that you have so many randoms the fluctuations in RR are negligible then you see that the errors in 1+xi are given by the fluctuations in the counts of DD in a bin. Assuming Poisson statistics, the fractional error in 1+xi goes as 1/sqrt{Npair} where Npair is the number of quasar-galaxy pairs.

For a 3D correlation function the number of data pairs goes as Nqso times Nbar-galaxy times 1+xi times the volume of the bin (in 3D, e.g. 4\pi s^2 ds for a spherical shell). Just think of what the code does: sit on each quasar and count all the galaxies in the bin. To go from a 3D correlation function to a 2D correlation function you need to integrate in the Z direction. But remember that the sum of independent Poisson distributions is also a Poisson with a mean equal to the sum of the means of the contributing parts. So this allows you to figure out what the error on wp is. You should see that as you integrate to very large line-of-sight distance things become noisier. So choose something like +/-50Mpc/h for the width in line-of-sight distance to integrate over in defining wp.

It's a little easier to understand if you write the defining equations out for yourself on a piece of paper.

Martin
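A sketch of the pair-count estimate Martin describes, under my own simplifying assumptions: xi = 0, cylindrical annulus bins in transverse separation, and a line-of-sight integration of +/-50 Mpc/h. The inputs (1,000 quasars from the project description and a galaxy density of 0.05 h^3 Mpc^-3 from the luminosity function estimate) are illustrative, and `expected_pairs` is a made-up helper name.

```python
import numpy as np

def expected_pairs(n_qso, nbar_gal, r_edges, pi_max, xi=0.0):
    """N_qso * nbar_gal * (1 + xi) * V_bin for annuli of half-depth pi_max."""
    r_edges = np.asarray(r_edges, dtype=float)
    annulus_area = np.pi * (r_edges[1:] ** 2 - r_edges[:-1] ** 2)
    bin_volume = annulus_area * 2.0 * pi_max  # integrate over -pi_max..+pi_max
    return n_qso * nbar_gal * (1.0 + xi) * bin_volume

# 10 log-spaced bins from 0.1 to 10 Mpc/h (an arbitrary choice for illustration)
edges = np.logspace(-1.0, 1.0, 11)
n_pairs = expected_pairs(n_qso=1000, nbar_gal=0.05, r_edges=edges, pi_max=50.0)

# Poisson fractional error on 1 + xi in each bin
frac_err = 1.0 / np.sqrt(n_pairs)
print(n_pairs)
print(frac_err)
```

Increasing `pi_max` adds more uncorrelated pairs to every bin, which is the sense in which integrating to very large line-of-sight distance makes the measurement noisier.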

~~~~~~~
Project Reading list
http://arxiv.org/abs/0802.2105
https://trac.sdss3.org/wiki/BOSS/quasars/black_hole_masses
http://iopscience.iop.org/0004-637X/665/1/265/pdf/62903.web.pdf
http://articles.adsabs.harvard.edu//full/1989ApJ...343....1D/0000011.000.html
http://adsabs.harvard.edu/abs/2005A%26A...439..863I